Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Lu P, Nakorchevskiy A, Marcotte EM. Expression deconvolution: a reinterpretation of DNA microarray data reveals dynamic changes in cell populations. Proc Natl Acad Sci U S A 2003;100:10370-5. [PMID: 12934019 PMCID: PMC193568 DOI: 10.1073/pnas.1832361100] [Citation(s) in RCA: 109] [Impact Index Per Article: 5.2] [Indexed: 11/18/2022]

Cited by Other Article(s)

Zhong Y, Wan YW, Pang K, Chow LML, Liu Z. Digital sorting of complex tissues for cell type-specific gene expression profiles. BMC Bioinformatics 2013;14:89. [PMID: 23497278 PMCID: PMC3626856 DOI: 10.1186/1471-2105-14-89] [Citation(s) in RCA: 139] [Impact Index Per Article: 12.6] [Received: 09/13/2012] [Accepted: 02/14/2013] [Indexed: 11/29/2022]

PERT: a method for expression deconvolution of human blood samples from varied microenvironmental and developmental conditions. PLoS Comput Biol 2012;8:e1002838. [PMID: 23284283 PMCID: PMC3527275 DOI: 10.1371/journal.pcbi.1002838] [Citation(s) in RCA: 82] [Impact Index Per Article: 6.8] [Received: 06/05/2012] [Accepted: 10/26/2012] [Indexed: 12/30/2022]

Kuhn A, Kumar A, Beilina A, Dillman A, Cookson MR, Singleton AB. Cell population-specific expression analysis of human cerebellum. BMC Genomics 2012;13:610. [PMID: 23145530 PMCID: PMC3561119 DOI: 10.1186/1471-2164-13-610] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 06/07/2012] [Accepted: 10/09/2012] [Indexed: 11/10/2022]

Jia Z, Wang Y, Hu Y, McLaren C, Yu Y, Ye K, Xia XQ, Koziol JA, Lernhardt W, McClelland M, Mercola D. A sample selection strategy to boost the statistical power of signature detection in cancer expression profile studies. Anticancer Agents Med Chem 2012;13:203-11. [PMID: 22934703 DOI: 10.2174/1871520611313020004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/30/2012] [Revised: 05/01/2012] [Accepted: 05/05/2012] [Indexed: 11/22/2022]

Shannon CP, Hollander Z, Wilson-McManus J, Balshaw R, Ng RT, McMaster R, McManus BM, Keown PA, Tebbutt SJ. White blood cell differentials enrich whole blood expression data in the context of acute cardiac allograft rejection. Bioinform Biol Insights 2012;6:49-61. [PMID: 22550401 PMCID: PMC3329187 DOI: 10.4137/bbi.s9197] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/20/2022]

Optimal deconvolution of transcriptional profiling data using quadratic programming with application to complex clinical blood samples. PLoS One 2011;6:e27156. [PMID: 22110609 PMCID: PMC3217948 DOI: 10.1371/journal.pone.0027156] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.2] [Received: 04/12/2011] [Accepted: 10/11/2011] [Indexed: 11/19/2022]

Abstract

Large-scale molecular profiling technologies have assisted the identification of disease biomarkers and facilitated the basic understanding of cellular processes. However, samples collected from human subjects in clinical trials possess a level of complexity, arising from multiple cell types, that can obfuscate the analysis of data derived from them. Failure to identify, quantify, and incorporate sources of heterogeneity into an analysis can have widespread and detrimental effects on subsequent statistical studies. We describe an approach that builds upon a linear latent variable model, in which expression levels from mixed cell populations are modeled as the weighted average of expression from different cell types. We solve these equations using quadratic programming, which efficiently identifies the globally optimal solution while preserving non-negativity of the fraction of the cells. We applied our method to various existing platforms to estimate proportions of different pure cell or tissue types and gene expression profiles of distinct phenotypes, with a focus on complex samples collected in clinical trials. We tested our methods on several well controlled benchmark data sets with known mixing fractions of pure cell or tissue types and mRNA expression profiling data from samples collected in a clinical trial. Accurate agreement between predicted and actual mixing fractions was observed. In addition, our method was able to predict mixing fractions for more than ten species of circulating cells and to provide accurate estimates for relatively rare cell types (<10% total population). Furthermore, accurate changes in leukocyte trafficking associated with fingolimod (FTY720) treatment were identified that were consistent with previous results generated by both cell counts and flow cytometry.
These data suggest that our method can solve one of the open questions regarding the analysis of complex transcriptional data: namely, how to identify the optimal mixing fractions in a given experiment.
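
The linear mixing model in this abstract can be illustrated with a small sketch. It is not the authors' implementation: instead of a quadratic programming solver it uses projected gradient descent, a simpler technique that solves the same non-negative least-squares problem on small inputs. The function name `estimate_fractions` and the toy signature matrix are assumptions made for illustration.

```python
import numpy as np

def estimate_fractions(signatures, mixture, n_iter=5000):
    """Estimate non-negative mixing fractions f minimising ||S f - m||^2.

    signatures: (genes x cell_types) matrix S of pure cell-type profiles.
    mixture:    (genes,) expression vector m of the mixed sample.
    Note: projected gradient descent stands in for a QP solver here.
    """
    S = np.asarray(signatures, dtype=float)
    m = np.asarray(mixture, dtype=float)
    # Step size 1/L, with L the largest eigenvalue of S^T S.
    step = 1.0 / np.linalg.norm(S, 2) ** 2
    f = np.full(S.shape[1], 1.0 / S.shape[1])
    for _ in range(n_iter):
        grad = S.T @ (S @ f - m)
        f = np.maximum(f - step * grad, 0.0)  # project onto f >= 0
    return f / f.sum()  # rescale so the fractions sum to one

# Toy data (hypothetical): 4 genes, 3 cell types, known fractions.
S_demo = np.array([[10.0, 1.0, 0.0],
                   [0.0, 8.0, 2.0],
                   [1.0, 0.0, 9.0],
                   [5.0, 5.0, 5.0]])
true_f = np.array([0.6, 0.3, 0.1])
estimated = estimate_fractions(S_demo, S_demo @ true_f)
```

On noise-free toy data the estimate should match the known fractions closely. With real arrays, the result depends on how well the signature matrix describes the cell types in the sample.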

Investigation of variation in gene expression profiling of human blood by extended principle component analysis. PLoS One 2011;6:e26905. [PMID: 22046403 PMCID: PMC3203156 DOI: 10.1371/journal.pone.0026905] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 05/22/2011] [Accepted: 10/06/2011] [Indexed: 01/08/2023]

Population-specific expression analysis (PSEA) reveals molecular changes in diseased brain. Nat Methods 2011;8:945-7. [PMID: 21983921 DOI: 10.1038/nmeth.1710] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.8] [Received: 05/23/2011] [Accepted: 08/04/2011] [Indexed: 11/08/2022]

Gaujoux R, Seoighe C. Semi-supervised Nonnegative Matrix Factorization for gene expression deconvolution: a case study. Infect Genet Evol 2011;12:913-21. [PMID: 21930246 DOI: 10.1016/j.meegid.2011.08.014] [Citation(s) in RCA: 81] [Impact Index Per Article: 6.2] [Received: 06/03/2011] [Revised: 08/10/2011] [Accepted: 08/11/2011] [Indexed: 10/17/2022]

Abstract

Heterogeneity in sample composition is an inherent issue in many gene expression studies and, in many cases, should be taken into account in the downstream analysis to enable correct interpretation of the underlying biological processes. Typical examples are infectious diseases or immunology-related studies using blood samples, where, for example, the proportions of lymphocyte sub-populations are expected to vary between cases and controls. Nonnegative Matrix Factorization (NMF) is an unsupervised learning technique that has been applied successfully in several fields, notably in bioinformatics where its ability to extract meaningful information from high-dimensional data such as gene expression microarrays has been demonstrated. Very recently, it has been applied to biomarker discovery and gene expression deconvolution in heterogeneous tissue samples. Being essentially unsupervised, standard NMF methods are not guaranteed to find components corresponding to the cell types of interest in the sample, which may jeopardize the correct estimation of cell proportions. We have investigated the use of prior knowledge, in the form of a set of marker genes, to improve gene expression deconvolution with NMF algorithms. We found that this improves the consistency with which both cell type proportions and cell type gene expression signatures are estimated. The proposed method was tested on a microarray dataset consisting of pure cell types mixed in known proportions. Pearson correlation coefficients between true and estimated cell type proportions improved substantially (typically from about 0.5 to approximately 0.8) with the semi-supervised (marker-guided) versions of commonly used NMF algorithms. Furthermore known marker genes associated with each cell type were assigned to the correct cell type more frequently for the guided versions. 
We conclude that the use of marker genes improves the accuracy of gene expression deconvolution using NMF and suggest modifications to how the marker gene information is used that may lead to further improvements.
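
One possible way to feed marker genes into an NMF is sketched below. It is an assumption-laden illustration rather than the scheme evaluated in the paper: standard multiplicative updates are used, and each marker gene's signature is zeroed in every cell type except its own before iterating. The function name `marker_guided_nmf` and the `markers` input format are hypothetical.

```python
import numpy as np

def marker_guided_nmf(X, n_types, markers, n_iter=500, seed=0, eps=1e-9):
    """Factor X (genes x samples) ~ W (genes x types) @ H (types x samples).

    markers: dict mapping a cell-type index to a list of gene (row) indices
             assumed to be expressed only in that cell type (hypothetical format).
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    W = rng.random((X.shape[0], n_types)) + eps
    H = rng.random((n_types, X.shape[1])) + eps
    # Zero each marker gene's signature in all cell types except its own.
    # Multiplicative updates keep exact zeros at zero, so the constraint persists.
    for k, genes in markers.items():
        for g in genes:
            keep = W[g, k]
            W[g, :] = 0.0
            W[g, k] = keep
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    # Column-normalise H; these values are proportions only up to the scaling of W.
    proportions = H / H.sum(axis=0, keepdims=True)
    return W, H, proportions

# Toy data (hypothetical): gene 0 marks type 0, gene 1 marks type 1.
W_true = np.array([[5.0, 0.0], [0.0, 4.0], [2.0, 3.0], [1.0, 1.0]])
H_true = np.array([[0.8, 0.5, 0.2], [0.2, 0.5, 0.8]])
W_est, H_est, P_est = marker_guided_nmf(W_true @ H_true, 2, {0: [0], 1: [1]})
```

Because the markers pin each factor to a named cell type, the rows of `P_est` can be read as type 0 and type 1 without a separate matching step, which is the practical gain over an unsupervised factorisation.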

Miller JA, Cai C, Langfelder P, Geschwind DH, Kurian SM, Salomon DR, Horvath S. Strategies for aggregating gene expression data: the collapseRows R function. BMC Bioinformatics 2011;12:322. [PMID: 21816037 PMCID: PMC3166942 DOI: 10.1186/1471-2105-12-322] [Citation(s) in RCA: 229] [Impact Index Per Article: 17.6] [Received: 05/23/2011] [Accepted: 08/04/2011] [Indexed: 12/19/2022]

Abstract

Background

Genomic and other high dimensional analyses often require one to summarize multiple related variables by a single representative. This task is also variously referred to as collapsing, combining, reducing, or aggregating variables. Examples include summarizing several probe measurements corresponding to a single gene, representing the expression profiles of a co-expression module by a single expression profile, and aggregating cell-type marker information to de-convolute expression data. Several standard statistical summary techniques can be used, but network methods also provide useful alternative methods to find representatives. Currently few collapsing functions are developed and widely applied.

Results

We introduce the R function collapseRows that implements several collapsing methods and evaluate its performance in three applications. First, we study a crucial step of the meta-analysis of microarray data: the merging of independent gene expression data sets, which may have been measured on different platforms. Toward this end, we collapse multiple microarray probes for a single gene and then merge the data by gene identifier. We find that choosing the probe with the highest average expression leads to best between-study consistency. Second, we study methods for summarizing the gene expression profiles of a co-expression module. Several gene co-expression network analysis applications show that the optimal collapsing strategy depends on the analysis goal. Third, we study aggregating the information of cell type marker genes when the aim is to predict the abundance of cell types in a tissue sample based on gene expression data ("expression deconvolution"). We apply different collapsing methods to predict cell type abundances in peripheral human blood and in mixtures of blood cell lines. Interestingly, the most accurate prediction method involves choosing the most highly connected "hub" marker gene. Finally, to facilitate biological interpretation of collapsed gene lists, we introduce the function userListEnrichment, which assesses the enrichment of gene lists for known brain and blood cell type markers, and for other published biological pathways.

Conclusions

The R function collapseRows implements several standard and network-based collapsing methods. In various genomic applications we provide evidence that both types of methods are robust and biologically relevant tools.
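
The probe-collapsing strategy reported to give the best between-study consistency, keeping the probe with the highest average expression for each gene, can be sketched in a few lines. This is a plain Python illustration and not the collapseRows R function or its interface; the function name `collapse_by_max_mean` and the dictionary inputs are assumptions.

```python
from statistics import mean

def collapse_by_max_mean(expr, probe_to_gene):
    """Keep, for each gene, the probe with the highest mean expression.

    expr:          dict probe_id -> list of expression values across samples.
    probe_to_gene: dict probe_id -> gene identifier.
    Returns a dict gene -> expression values of the selected probe.
    """
    best = {}  # gene -> (probe_id, mean expression)
    for probe, values in expr.items():
        gene = probe_to_gene[probe]
        avg = mean(values)
        if gene not in best or avg > best[gene][1]:
            best[gene] = (probe, avg)
    return {gene: expr[probe] for gene, (probe, _) in best.items()}

# Toy data (hypothetical probe and gene names).
expr_demo = {"p1": [1, 2, 3], "p2": [4, 5, 6], "p3": [0, 1, 0]}
mapping_demo = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_B"}
collapsed = collapse_by_max_mean(expr_demo, mapping_demo)
```

After collapsing, two data sets measured on different platforms can be merged on the gene identifier, as described in the first application above.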

Elloumi F, Hu Z, Li Y, Parker JS, Gulley ML, Amos KD, Troester MA. Systematic bias in genomic classification due to contaminating non-neoplastic tissue in breast tumor samples. BMC Med Genomics 2011;4:54. [PMID: 21718502 PMCID: PMC3151208 DOI: 10.1186/1755-8794-4-54] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.4] [Received: 02/22/2011] [Accepted: 06/30/2011] [Indexed: 12/15/2022]

Abstract

Background

Genomic tests are available to predict breast cancer recurrence and to guide clinical decision making. These predictors provide recurrence risk scores along with a measure of uncertainty, usually a confidence interval. The confidence interval conveys random error and not systematic bias. Standard tumor sampling methods make this problematic, as it is common to have a substantial proportion (typically 30-50%) of a tumor sample comprised of histologically benign tissue. This "normal" tissue could represent a source of non-random error or systematic bias in genomic classification.

Methods

To assess the performance characteristics of genomic classification to systematic error from normal contamination, we collected 55 tumor samples and paired tumor-adjacent normal tissue. Using genomic signatures from the tumor and paired normal, we evaluated how increasing normal contamination altered recurrence risk scores for various genomic predictors.

Results

Simulations of normal tissue contamination caused misclassification of tumors in all predictors evaluated, but different breast cancer predictors showed different types of vulnerability to normal tissue bias. While two predictors had unpredictable direction of bias (either higher or lower risk of relapse resulted from normal contamination), one signature showed predictable direction of normal tissue effects. Due to this predictable direction of effect, this signature (the PAM50) was adjusted for normal tissue contamination and these corrections improved sensitivity and negative predictive value. For all three assays quality control standards and/or appropriate bias adjustment strategies can be used to improve assay reliability.

Conclusions

Normal tissue sampled concurrently with tumor is an important source of bias in breast genomic predictors. All genomic predictors show some sensitivity to normal tissue contamination and ideal strategies for mitigating this bias vary depending upon the particular genes and computational methods used in the predictor.
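
The contamination simulation described in the Methods can be pictured as a convex mix of a tumor profile and its paired normal profile, rescored at increasing normal fractions. The sketch below is a toy illustration under that assumption. The linear `risk_score` and its weights are hypothetical and do not represent any of the predictors evaluated in the paper.

```python
def contaminate(tumor, normal, normal_fraction):
    """Mix a tumor profile with a normal profile at the given normal fraction."""
    f = normal_fraction
    return [(1.0 - f) * t + f * n for t, n in zip(tumor, normal)]

def risk_score(profile, weights):
    """Hypothetical linear signature: weighted sum of gene expression values."""
    return sum(w * x for w, x in zip(weights, profile))

# Toy data (hypothetical): two genes, one up and one down in tumor.
tumor_demo = [10.0, 2.0]
normal_demo = [2.0, 6.0]
weights_demo = [1.0, -1.0]
scores = [risk_score(contaminate(tumor_demo, normal_demo, f), weights_demo)
          for f in (0.0, 0.25, 0.5, 1.0)]
```

For a linear score like this one, contamination shifts the result in a single predictable direction, which is the property the abstract says made one signature correctable.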

Rivas LA, Aguirre J, Blanco Y, González-Toril E, Parro V. Graph-based deconvolution analysis of multiplex sandwich microarray immunoassays: applications for environmental monitoring. Environ Microbiol 2011;13:1421-32. [PMID: 21401847 DOI: 10.1111/j.1462-2920.2011.02442.x] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 11/30/2022]

A Simple Approximation for Fast Nonlinear Deconvolution. 2011. [DOI: 10.1007/978-3-642-25020-0_8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1]

Zhao Y, Simon R. Gene expression deconvolution in clinical samples. Genome Med 2010;2:93. [PMID: 21211069 PMCID: PMC3025435 DOI: 10.1186/gm214] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 11/10/2022]

Camp JT, Elloumi F, Roman-Perez E, Rein J, Stewart DA, Harrell JC, Perou CM, Troester MA. Interactions with fibroblasts are distinct in Basal-like and luminal breast cancers. Mol Cancer Res 2010;9:3-13. [PMID: 21131600 DOI: 10.1158/1541-7786.mcr-10-0372] [Citation(s) in RCA: 85] [Impact Index Per Article: 6.1] [Indexed: 01/08/2023]

Chaussabel D, Pascual V, Banchereau J. Assessing the human immune system through blood transcriptomics. BMC Biol 2010;8:84. [PMID: 20619006 PMCID: PMC2895587 DOI: 10.1186/1741-7007-8-84] [Citation(s) in RCA: 174] [Impact Index Per Article: 12.4] [Received: 05/13/2010] [Accepted: 06/15/2010] [Indexed: 02/07/2023]

Shen-Orr SS, Tibshirani R, Khatri P, Bodian DL, Staedtler F, Perry NM, Hastie T, Sarwal MM, Davis MM, Butte AJ. Cell type-specific gene expression differences in complex tissues. Nat Methods 2010;7:287-9. [PMID: 20208531 DOI: 10.1038/nmeth.1439] [Citation(s) in RCA: 350] [Impact Index Per Article: 25.0] [Received: 12/02/2009] [Accepted: 01/09/2010] [Indexed: 12/13/2022]

Clarke J, Seo P, Clarke B. Statistical expression deconvolution from mixed tissue samples. Bioinformatics 2010;26:1043-9. [PMID: 20202973 PMCID: PMC2853690 DOI: 10.1093/bioinformatics/btq097] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 11/14/2022]

Repsilber D, Kern S, Telaar A, Walzl G, Black GF, Selbig J, Parida SK, Kaufmann SHE, Jacobsen M. Biomarker discovery in heterogeneous tissue samples - taking the in-silico deconfounding approach. BMC Bioinformatics 2010;11:27. [PMID: 20070912 PMCID: PMC3098067 DOI: 10.1186/1471-2105-11-27] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 09/03/2009] [Accepted: 01/14/2010] [Indexed: 11/24/2022]

Abstract

Background

For heterogeneous tissues, such as blood, measurements of gene expression are confounded by relative proportions of cell types involved. Conclusions have to rely on estimation of gene expression signals for homogeneous cell populations, e.g. by applying micro-dissection, fluorescence activated cell sorting, or in-silico deconfounding. We studied feasibility and validity of a non-negative matrix decomposition algorithm using experimental gene expression data for blood and sorted cells from the same donor samples. Our objective was to optimize the algorithm regarding detection of differentially expressed genes and to enable its use for classification in the difficult scenario of reversely regulated genes. This would be of importance for the identification of candidate biomarkers in heterogeneous tissues.

Results

Experimental data and simulation studies involving noise parameters estimated from these data revealed that for valid detection of differential gene expression, quantile normalization and use of non-log data are optimal. We demonstrate the feasibility of predicting proportions of constituting cell types from gene expression data of single samples, as a prerequisite for a deconfounding-based classification approach.

Classification cross-validation errors with and without using deconfounding results are reported as well as sample-size dependencies. Implementation of the algorithm, simulation and analysis scripts are available.

Conclusions

The deconfounding algorithm without decorrelation using quantile normalization on non-log data is proposed for biomarkers that are difficult to detect, and for cases where confounding by varying proportions of cell types is the suspected reason. In this case, a deconfounding ranking approach can be used as a powerful alternative to, or complement of, other statistical learning approaches to define candidate biomarkers for molecular diagnosis and prediction in biomedicine, in realistically noisy conditions and with moderate sample sizes.
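
The preprocessing step recommended here, quantile normalization applied to non-log intensities, can be sketched as follows. This is a generic textbook version written for illustration, not the authors' scripts. The function name `quantile_normalize` is an assumption, and ties are broken arbitrarily rather than averaged.

```python
import numpy as np

def quantile_normalize(X):
    """Quantile-normalise a genes x samples matrix of raw (non-log) intensities.

    Every column is forced onto the same reference distribution: the mean of
    the sorted columns. Ties within a column are broken by position.
    """
    X = np.asarray(X, dtype=float)
    order = np.argsort(X, axis=0)       # row indices that sort each column
    ranks = np.argsort(order, axis=0)   # rank of each entry within its column
    reference = np.sort(X, axis=0).mean(axis=1)
    return reference[ranks]

# Toy data (hypothetical): 4 genes measured in 3 samples.
X_demo = np.array([[5.0, 4.0, 3.0],
                   [2.0, 1.0, 4.0],
                   [3.0, 6.0, 6.0],
                   [4.0, 2.0, 8.0]])
Q_demo = quantile_normalize(X_demo)
```

After normalization every sample has an identical set of values and only the assignment of values to genes differs, so the linear mixing assumption is applied on a common intensity scale.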

Siegal-Gaskins D, Ash JN, Crosson S. Model-based deconvolution of cell cycle time-series data reveals gene expression details at high resolution. PLoS Comput Biol 2009;5:e1000460. [PMID: 19680537 PMCID: PMC2718844 DOI: 10.1371/journal.pcbi.1000460] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 03/25/2009] [Accepted: 07/08/2009] [Indexed: 11/23/2022]

Abstract

In both prokaryotic and eukaryotic cells, gene expression is regulated across the cell cycle to ensure “just-in-time” assembly of select cellular structures and molecular machines. However, present in all time-series gene expression measurements is variability that arises from both systematic error in the cell synchrony process and variance in the timing of cell division at the level of the single cell. Thus, gene or protein expression data collected from a population of synchronized cells is an inaccurate measure of what occurs in the average single-cell across a cell cycle. Here, we present a general computational method to extract “single-cell”-like information from population-level time-series expression data. This method removes the effects of 1) variance in growth rate and 2) variance in the physiological and developmental state of the cell. Moreover, this method represents an advance in the deconvolution of molecular expression data in its flexibility, minimal assumptions, and the use of a cross-validation analysis to determine the appropriate level of regularization. Applying our deconvolution algorithm to cell cycle gene expression data from the dimorphic bacterium Caulobacter crescentus, we recovered critical features of cell cycle regulation in essential genes, including ctrA and ftsZ, that were obscured in population-based measurements. In doing so, we highlight the problem with using population data alone to decipher cellular regulatory mechanisms and demonstrate how our deconvolution algorithm can be applied to produce a more realistic picture of temporal regulation in a cell.

Time-series analyses of cellular regulatory processes have successfully drawn attention to the importance of temporal regulation in biological systems. A number of model systems can be synchronized such that data collected on cell populations better reflect the dynamic properties of the individual cell. However, experimental synchronization is never perfect, and the degree of synchrony that does exist at the outset of an experiment is quickly lost over time as cells grow at different rates and enter different developmental or physiological states on cell division. Thus, data collected from a population of synchronized cells can lead to incorrect models of temporal regulation. Here we demonstrate that the problem of relating population data to the individual cell can be resolved with a computational method that effectively removes the effects of both imperfect synchrony and time-dependent loss of synchrony. Application of this deconvolution algorithm to a cell cycle time-series data set from the model bacterium Caulobacter crescentus uncovers critical temporal details in the expression of essential genes that are not evident in the raw population-based data. The deconvolution routine presented here is a robust and general tool for extracting biochemical parameters of the average single cell from population time-series data.

Abbas AR, Wolslegel K, Seshasayee D, Modrusan Z, Clark HF. Deconvolution of blood microarray data identifies cellular activation patterns in systemic lupus erythematosus. PLoS One 2009;4:e6098. [PMID: 19568420 PMCID: PMC2699551 DOI: 10.1371/journal.pone.0006098] [Citation(s) in RCA: 293] [Impact Index Per Article: 19.5] [Received: 02/27/2009] [Accepted: 06/02/2009] [Indexed: 02/04/2023]

Mizuno H, Nakanishi Y, Ishii N, Sarai A, Kitada K. A signature-based method for indexing cell cycle phase distribution from microarray profiles. BMC Genomics 2009;10:137. [PMID: 19331659 PMCID: PMC2676301 DOI: 10.1186/1471-2164-10-137] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 10/21/2008] [Accepted: 03/30/2009] [Indexed: 12/31/2022]

Papanikolaou NA, Papavassiliou AG. Protein complex, gene, and regulatory modules in cancer heterogeneity. Mol Med 2008;14:543-5. [PMID: 18654660 DOI: 10.2119/2008-00083.papanikolaou] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/18/2008] [Accepted: 07/18/2008] [Indexed: 11/06/2022]

Buess M, Nuyten DSA, Hastie T, Nielsen T, Pesich R, Brown PO. Characterization of heterotypic interaction effects in vitro to deconvolute global gene expression profiles in cancer. Genome Biol 2008;8:R191. [PMID: 17868458 PMCID: PMC2375029 DOI: 10.1186/gb-2007-8-9-r191] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.1] [Received: 03/26/2007] [Revised: 06/14/2007] [Accepted: 09/14/2007] [Indexed: 01/10/2023]

Abstract

In an effort to deconvolute global gene-expression profiles, an interaction between some breast cancer cells and stromal fibroblasts was found to induce an interferon response, which may be associated with a greater propensity for tumor progression.

Background

Perturbations in cell-cell interactions are a key feature of cancer. However, little is known about the systematic effects of cell-cell interaction on global gene expression in cancer.

Results

We used an ex vivo model to simulate tumor-stroma interaction by systematically co-cultivating breast cancer cells with stromal fibroblasts and determined associated gene expression changes with cDNA microarrays. In the complex picture of epithelial-mesenchymal interaction effects, a prominent characteristic was an induction of interferon-response genes (IRGs) in a subset of cancer cells. In close proximity to these cancer cells, the fibroblasts secreted type I interferons, which, in turn, induced expression of the IRGs in the tumor cells. Paralleling this model, immunohistochemical analysis of human breast cancer tissues showed that STAT1, the key transcriptional activator of the IRGs, and itself an IRG, was expressed in a subset of the cancers, with a striking pattern of elevated expression in the cancer cells in close proximity to the stroma. In vivo, expression of the IRGs was remarkably coherent, providing a basis for segregation of 295 early-stage breast cancers into two groups. Tumors with high compared to low expression levels of IRGs were associated with significantly shorter overall survival; 59% versus 80% at 10 years (log-rank p = 0.001).

Conclusion

In an effort to deconvolute global gene expression profiles of breast cancer by systematic characterization of heterotypic interaction effects in vitro, we found that an interaction between some breast cancer cells and stromal fibroblasts can induce an interferon-response, and that this response may be associated with a greater propensity for tumor progression.

Jacobsen M, Mattow J, Repsilber D, Kaufmann SH. Novel strategies to identify biomarkers in tuberculosis. Biol Chem 2008;389:487-95. [DOI: 10.1515/bc.2008.053] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Indexed: 11/15/2022]

Gosink MM, Petrie HT, Tsinoremas NF. Electronically subtracting expression patterns from a mixed cell population. Bioinformatics 2007;23:3328-34. [PMID: 17956877 DOI: 10.1093/bioinformatics/btm508] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Indexed: 12/17/2022]

Li JZ, Meng F, Tsavaler L, Evans SJ, Choudary PV, Tomita H, Vawter MP, Walsh D, Shokoohi V, Chung T, Bunney WE, Jones EG, Akil H, Watson SJ, Myers RM. Sample matching by inferred agonal stress in gene expression analyses of the brain. BMC Genomics 2007;8:336. [PMID: 17892578 PMCID: PMC2213675 DOI: 10.1186/1471-2164-8-336] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 03/05/2007] [Accepted: 09/24/2007] [Indexed: 12/26/2022]

Alter O. Genomic signal processing: from matrix algebra to genetic networks. Methods Mol Biol 2007;377:17-60. [PMID: 17634608 DOI: 10.1007/978-1-59745-390-5_2] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 01/25/2023]

Abstract

DNA microarrays make it possible, for the first time, to record the complete genomic signals that guide the progression of cellular processes. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment, and drug development. This chapter reviews the first data-driven models that were created from these genome-scale data, through adaptations and generalizations of mathematical frameworks from matrix algebra that have proven successful in describing the physical world, in such diverse areas as mechanics and perception: the singular value decomposition model, the generalized singular value decomposition comparative model, and the pseudoinverse projection integrative model. These models provide mathematical descriptions of the genetic networks that generate and sense the measured data, where the mathematical variables and operations represent biological reality. The variables, patterns uncovered in the data, correlate with activities of cellular elements such as regulators or transcription factors that drive the measured signals and cellular states where these elements are active. The operations, such as data reconstruction, rotation, and classification in subspaces of selected patterns, simulate experimental observation of only the cellular programs that these patterns represent. These models are illustrated in the analyses of RNA expression data from yeast and human during their cell cycle programs and DNA-binding data from yeast cell cycle transcription factors and replication initiation proteins.
Two alternative pictures of RNA expression oscillations during the cell cycle that emerge from these analyses, which parallel well-known designs of physical oscillators, convey the capacity of the models to elucidate the design principles of cellular systems, as well as guide the design of synthetic ones. In these analyses, the power of the models to predict previously unknown biological principles is demonstrated with a prediction of a novel mechanism of regulation that correlates DNA replication initiation with cell cycle-regulated RNA transcription in yeast. These models may become the foundation of a future in which biological systems are modeled as physical systems are today.
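
Two of the matrix operations named in this abstract, the singular value decomposition of an expression matrix and a pseudoinverse projection onto a set of basis profiles, can be sketched with NumPy. This is a generic linear algebra illustration, not the chapter's own code. The function names and the toy matrix are assumptions.

```python
import numpy as np

def svd_patterns(expression):
    """SVD of a genes x arrays matrix; returns U, s, Vt and the fraction of
    overall signal captured by each pattern (s_k^2 / sum of s^2)."""
    E = np.asarray(expression, dtype=float)
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    fractions = s ** 2 / np.sum(s ** 2)
    return U, s, Vt, fractions

def remove_pattern(expression, index):
    """Reconstruct the data with one pattern filtered out."""
    U, s, Vt, _ = svd_patterns(expression)
    s = s.copy()
    s[index] = 0.0
    return (U * s) @ Vt  # equals U @ diag(s) @ Vt

def pseudoinverse_projection(basis, data):
    """Express data (genes x arrays) in terms of basis profiles (genes x k)."""
    return np.linalg.pinv(basis) @ data

# Toy data (hypothetical): 4 genes measured on 3 arrays.
E_demo = np.array([[2.0, 0.0, 1.0],
                   [0.0, 1.0, 3.0],
                   [4.0, 1.0, 0.0],
                   [1.0, 2.0, 2.0]])
U_demo, s_demo, Vt_demo, frac_demo = svd_patterns(E_demo)
filtered_demo = remove_pattern(E_demo, 0)
```

Filtering a pattern in this way corresponds to the "data reconstruction in subspaces of selected patterns" operation mentioned above. Which pattern reflects a biological program rather than an artifact has to be judged from the data.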

Hoffmann M, Pohlers D, Koczan D, Thiesen HJ, Wölfl S, Kinne RW. Robust computational reconstitution - a new method for the comparative analysis of gene expression in tissues and isolated cell fractions. BMC Bioinformatics 2006;7:369. [PMID: 16889662 PMCID: PMC1574358 DOI: 10.1186/1471-2105-7-369] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2006] [Accepted: 08/04/2006] [Indexed: 11/29/2022] Open

Abstract

Background

Biological tissues consist of various cell types that differentially contribute to physiological and pathophysiological processes. Determining and analyzing cell type-specific gene expression under diverse conditions is therefore a central aim of biomedical research. The present study compares gene expression profiles in whole tissues and isolated cell fractions purified from these tissues in patients with rheumatoid arthritis and osteoarthritis.

Results

The expression profiles of the whole tissues were compared to computationally reconstituted expression profiles that combine the expression profiles of the isolated cell fractions (macrophages, fibroblasts, and non-adherent cells) according to their relative mRNA proportions in the tissue. The mRNA proportions were determined by trimmed robust regression using only the most robustly-expressed genes (1/3 to 1/2 of all measured genes), i.e. those showing the most similar expression in tissue and isolated cell fractions. The relative mRNA proportions were determined using several different chip evaluation methods, among which the MAS 5.0 signal algorithm appeared to be most robust. The computed mRNA proportions agreed well with the cell proportions determined by immunohistochemistry except for a minor number of outliers. Genes that were either regulated (i.e. differentially-expressed in tissue and isolated cell fractions) or robustly-expressed in all patients were identified using different test statistics.

Conclusion

Robust Computational Reconstitution uses an intermediate number of robustly-expressed genes to estimate the relative mRNA proportions. This avoids both the exclusive dependence on the robust expression of individual, highly cell type-specific marker genes and the bias towards an equal distribution upon inclusion of all genes for computation.
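
The idea of estimating proportions from only the best-fitting subset of genes can be sketched as an iteratively trimmed least-squares fit. This is a minimal illustration under stated assumptions, not the authors' published procedure: the function names, the `keep` fraction, the normalization of coefficients to sum to 1, and the toy numbers are all hypothetical.

```python
# Illustrative sketch only: iteratively trimmed least squares.
# Not the authors' exact method; all names and numbers are hypothetical.

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("system is singular or ill-conditioned")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(design, y):
    """Least-squares solution of design @ x ~= y via the normal equations."""
    k = len(design[0])
    ata = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(ata, aty)

def trimmed_proportions(tissue, fractions, keep=0.5, rounds=10):
    """tissue: per-gene tissue expression; fractions: genes x cell-fraction expression.
    Repeatedly refits using only the `keep` share of genes with the smallest residuals."""
    n = len(tissue)
    n_keep = max(len(fractions[0]), int(keep * n))
    kept = list(range(n))
    for _ in range(rounds):
        coef = least_squares([fractions[g] for g in kept], [tissue[g] for g in kept])
        resid = [abs(tissue[g] - sum(c * f for c, f in zip(coef, fractions[g])))
                 for g in range(n)]
        new_kept = sorted(sorted(range(n), key=lambda g: resid[g])[:n_keep])
        if new_kept == kept:
            break
        kept = new_kept
    # Final fit on the retained genes, normalized to relative proportions (assumption).
    coef = least_squares([fractions[g] for g in kept], [tissue[g] for g in kept])
    total = sum(coef)
    return [c / total for c in coef], kept

# Toy data: two cell fractions mixed 0.7 / 0.3; genes 6 and 7 are "regulated" outliers.
fractions = [[10, 1], [2, 8], [5, 5], [7, 3], [1, 9], [6, 2], [4, 4], [3, 6]]
tissue = [7.3, 3.8, 5.0, 5.8, 3.4, 4.8, 12.0, 0.5]
proportions, kept = trimmed_proportions(tissue, fractions)
print(proportions, kept)
```

In this toy case the two outlier genes are dropped after the first round, so the refit sees only genes that follow the mixing model. Real data would call for a more careful robust estimator and for constraints such as non-negativity.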

Wentzell PD, Karakach TK, Roy S, Martinez MJ, Allen CP, Werner-Washburne M. Multivariate curve resolution of time course microarray data. BMC Bioinformatics 2006;7:343. [PMID: 16839419 PMCID: PMC1539028 DOI: 10.1186/1471-2105-7-343] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2006] [Accepted: 07/13/2006] [Indexed: 11/17/2022] Open

Wang M, Master SR, Chodosh LA. Computational expression deconvolution in a complex mammalian organ. BMC Bioinformatics 2006;7:328. [PMID: 16817968 PMCID: PMC1559723 DOI: 10.1186/1471-2105-7-328] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2006] [Accepted: 07/03/2006] [Indexed: 11/28/2022] Open

Abstract

Background

Microarray expression profiling has been widely used to identify differentially expressed genes in complex cellular systems. However, while such methods can be used to directly infer intracellular regulation within homogeneous cell populations, interpretation of in vivo gene expression data derived from complex organs composed of multiple cell types is more problematic. Specifically, observed changes in gene expression may be due either to changes in gene regulation within a given cell type or to changes in the relative abundance of expressing cell types. Consequently, bona fide changes in intrinsic gene regulation may be either mimicked or masked by changes in the relative proportion of different cell types. To date, few analytical approaches have addressed this problem.

Results

We have chosen to apply a computational method for deconvoluting gene expression profiles derived from intact tissues by using reference expression data for purified populations of the constituent cell types of the mammary gland. These data were used to estimate changes in the relative proportions of different cell types during murine mammary gland development and Ras-induced mammary tumorigenesis. These computational estimates of changing compartment sizes were then used to enrich lists of differentially expressed genes for transcripts that change as a function of intrinsic intracellular regulation rather than shifts in the relative abundance of expressing cell types. Using this approach, we have demonstrated that adjusting mammary gene expression profiles for changes in three principal compartments – epithelium, white adipose tissue, and brown adipose tissue – is sufficient both to reduce false-positive changes in gene expression due solely to changes in compartment sizes and to reduce false-negative changes by unmasking genuine alterations in gene expression that were otherwise obscured by changes in compartment sizes.

Conclusion

By adjusting gene expression values for changes in the sizes of cell type-specific compartments, this computational deconvolution method has the potential to increase both the sensitivity and specificity of differential gene expression experiments performed on complex tissues. Given the necessity for understanding complex biological processes such as development and carcinogenesis within the context of intact tissues, this approach offers substantial utility and should be broadly applicable to identifying gene expression changes in tissues composed of multiple cell types.
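
One way to make the adjustment concrete is a simple additive mixing model. The abstract does not state the exact formulation, so the model and the symbols below are assumptions made for illustration: $p_k(t)$ is the estimated proportion of compartment $k$ at time $t$, $r_{gk}$ is the reference expression of gene $g$ in purified compartment $k$, and $y_g(t)$ is the expression of gene $g$ measured in the intact tissue.

```latex
% Assumed linear mixing model (illustrative, not taken from the paper)
\hat{y}_g(t) = \sum_{k} p_k(t)\, r_{gk}, \qquad \sum_{k} p_k(t) = 1

% Observed change, the part explained by composition alone, and the adjusted change
\Delta_g^{\mathrm{obs}} = y_g(t_2) - y_g(t_1)
\Delta_g^{\mathrm{comp}} = \sum_{k} \bigl[ p_k(t_2) - p_k(t_1) \bigr]\, r_{gk}
\Delta_g^{\mathrm{adj}} = \Delta_g^{\mathrm{obs}} - \Delta_g^{\mathrm{comp}}
```

Under this assumed model, a large $\Delta_g^{\mathrm{obs}}$ with $\Delta_g^{\mathrm{adj}} \approx 0$ would suggest a false positive driven by compartment size, while a small $\Delta_g^{\mathrm{obs}}$ with a large $\Delta_g^{\mathrm{adj}}$ would suggest a masked change in intrinsic regulation.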

Lu P, Rangan A, Chan SY, Appling DR, Hoffman DW, Marcotte EM. Global metabolic changes following loss of a feedback loop reveal dynamic steady states of the yeast metabolome. Metab Eng 2006;9:8-20. [PMID: 17049899 DOI: 10.1016/j.ymben.2006.06.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2005] [Revised: 05/27/2006] [Accepted: 06/20/2006] [Indexed: 11/16/2022]

Fannin RD, Auman JT, Bruno ME, Sieber SO, Ward SM, Tucker CJ, Merrick BA, Paules RS. Differential gene expression profiling in whole blood during acute systemic inflammation in lipopolysaccharide-treated rats. Physiol Genomics 2006;21:92-104. [PMID: 15781589 DOI: 10.1152/physiolgenomics.00190.2004] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/10/2023] Open

Bowers PM, O'Connor BD, Cokus SJ, Sprinzak E, Yeates TO, Eisenberg D. Utilizing logical relationships in genomic data to decipher cellular processes. FEBS J 2005;272:5110-8. [PMID: 16218945 DOI: 10.1111/j.1742-4658.2005.04946.x] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

Segal E, Friedman N, Kaminski N, Regev A, Koller D. From signatures to models: understanding cancer using microarrays. Nat Genet 2005;37 Suppl:S38-45. [PMID: 15920529 DOI: 10.1038/ng1561] [Citation(s) in RCA: 283] [Impact Index Per Article: 14.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/11/2023]

de Ridder D, van der Linden CE, Schonewille T, Dik WA, Reinders MJT, van Dongen JJM, Staal FJT. Purity for clarity: the need for purification of tumor cells in DNA microarray studies. Leukemia 2005;19:618-27. [PMID: 15744349 DOI: 10.1038/sj.leu.2403685] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]

Lähdesmäki H, Shmulevich I, Dunmire V, Yli-Harja O, Zhang W. In silico microdissection of microarray data from heterogeneous cell populations. BMC Bioinformatics 2005;6:54. [PMID: 15766384 PMCID: PMC1274251 DOI: 10.1186/1471-2105-6-54] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2004] [Accepted: 03/14/2005] [Indexed: 11/10/2022] Open

Abstract

Background

Very few analytical approaches have been reported to resolve the variability in microarray measurements stemming from sample heterogeneity. For example, tissue samples used in cancer studies are usually contaminated with the surrounding or infiltrating cell types. This heterogeneity in the sample preparation hinders further statistical analysis, significantly so if different samples contain different proportions of these cell types. Thus, sample heterogeneity can result in the identification of differentially expressed genes that may be unrelated to the biological question being studied. Similarly, irrelevant gene combinations can be discovered in the case of gene expression based classification.

Results

We propose a computational framework for removing the effects of sample heterogeneity by "microdissecting" microarray data in silico. The computational method provides estimates of the expression values of the pure (non-heterogeneous) cell samples. The inversion of the sample heterogeneity can be facilitated by providing accurate estimates of the mixing percentages of different cell types in each measurement. For those cases where no such information is available, we develop an optimization-based method for joint estimation of the mixing percentages and the expression values of the pure cell samples. We also consider the problem of selecting the correct number of cell types.

Conclusion

The efficiency of the proposed methods is illustrated by applying them to carefully controlled cDNA microarray data obtained from heterogeneous samples. The results demonstrate that the methods are capable of reconstructing both the sample and cell type specific expression values from heterogeneous mixtures and that the mixing percentages of different cell types can also be estimated. Furthermore, a general purpose model selection method can be used to select the correct number of cell types.
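
For the case where the mixing percentages are known, the inversion can be sketched as a per-gene least-squares problem. This is a minimal illustration assuming a linear mixing model; the function names and the toy data are hypothetical, and it does not cover the paper's joint estimation of mixing percentages or its model selection step.

```python
# Illustrative sketch only: unmixing with known mixing fractions under an
# assumed linear model. All names and numbers are hypothetical.

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("mixing matrix is singular or ill-conditioned")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def least_squares(design, y):
    """Least-squares solution of design @ x ~= y via the normal equations."""
    k = len(design[0])
    ata = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    aty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    return solve_linear(ata, aty)

def unmix(mix_fractions, mixed):
    """mix_fractions: samples x cell types; mixed: samples x genes.
    Returns a cell types x genes table of estimated pure expression values."""
    n_genes = len(mixed[0])
    n_types = len(mix_fractions[0])
    per_gene = [least_squares(mix_fractions, [row[g] for row in mixed])
                for g in range(n_genes)]
    return [[per_gene[g][k] for g in range(n_genes)] for k in range(n_types)]

# Toy data: three mixed samples, two cell types, three genes, no noise.
mix_fractions = [[0.8, 0.2], [0.5, 0.5], [0.3, 0.7]]
pure_true = [[10.0, 2.0, 5.0], [1.0, 8.0, 5.0]]
mixed = [[sum(f[k] * pure_true[k][g] for k in range(2)) for g in range(3)]
         for f in mix_fractions]
pure_est = unmix(mix_fractions, mixed)
print(pure_est)
```

With noise-free toy data the pure profiles are recovered up to floating-point error. The sketch needs at least as many samples as cell types, and the samples must differ in their mixing fractions for the system to be solvable.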

Han CB, Mao XY, Xin Y, Wang SC, Ma JM, Zhao YJ. Quantitative analysis of tumor mitochondrial RNA using microarray. World J Gastroenterol 2005;11:36-40. [PMID: 15609393 PMCID: PMC4205380 DOI: 10.3748/wjg.v11.i1.36] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open

Alter O, Golub GH. Integrative analysis of genome-scale data by using pseudoinverse projection predicts novel correlation between DNA replication and RNA transcription. Proc Natl Acad Sci U S A 2004;101:16577-82. [PMID: 15545604 PMCID: PMC534520 DOI: 10.1073/pnas.0406767101] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Fraser AG, Marcotte EM. Development through the eyes of functional genomics. Curr Opin Genet Dev 2004;14:336-42. [PMID: 15261648 DOI: 10.1016/j.gde.2004.06.015] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]

Prince JT, Carlson MW, Wang R, Lu P, Marcotte EM. The need for a public proteomics repository. Nat Biotechnol 2004;22:471-2. [PMID: 15085804 DOI: 10.1038/nbt0404-471] [Citation(s) in RCA: 128] [Impact Index Per Article: 6.4] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

Stuart RO, Wachsman W, Berry CC, Wang-Rodriguez J, Wasserman L, Klacansky I, Masys D, Arden K, Goodison S, McClelland M, Wang Y, Sawyers A, Kalcheva I, Tarin D, Mercola D. In silico dissection of cell-type-associated patterns of gene expression in prostate cancer. Proc Natl Acad Sci U S A 2004;101:615-20. [PMID: 14722351 PMCID: PMC327196 DOI: 10.1073/pnas.2536479100] [Citation(s) in RCA: 158] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

Kawahara N, Wang Y, Mukasa A, Furuya K, Shimizu T, Hamakubo T, Aburatani H, Kodama T, Kirino T. Genome-wide gene expression analysis for induced ischemic tolerance and delayed neuronal death following transient global ischemia in rats. J Cereb Blood Flow Metab 2004;24:212-23. [PMID: 14747748 DOI: 10.1097/01.wcb.0000106012.33322.a2] [Citation(s) in RCA: 88] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]