Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Qiu P, Gentles AJ, Plevritis SK. Discovering biological progression underlying microarray samples. PLoS Comput Biol 2011;7:e1001123. [PMID: 21533210 PMCID: PMC3077357 DOI: 10.1371/journal.pcbi.1001123] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2010] [Accepted: 03/16/2011] [Indexed: 11/27/2022] Open

Cited by Other Article(s)

Laber S, Strobel S, Mercader JM, Dashti H, dos Santos FR, Kubitz P, Jackson M, Ainbinder A, Honecker J, Agrawal S, Garborcauskas G, Stirling DR, Leong A, Figueroa K, Sinnott-Armstrong N, Kost-Alimova M, Deodato G, Harney A, Way GP, Saadat A, Harken S, Reibe-Pal S, Ebert H, Zhang Y, Calabuig-Navarro V, McGonagle E, Stefek A, Dupuis J, Cimini BA, Hauner H, Udler MS, Carpenter AE, Florez JC, Lindgren C, Jacobs SB, Claussnitzer M. Discovering cellular programs of intrinsic and extrinsic drivers of metabolic traits using LipocyteProfiler. CELL GENOMICS 2023;3:100346. [PMID: 37492099 PMCID: PMC10363917 DOI: 10.1016/j.xgen.2023.100346] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/21/2021] [Revised: 08/22/2022] [Accepted: 05/26/2023] [Indexed: 07/27/2023]

Affiliation(s)

Samantha Laber Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
Sophie Strobel Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Institute of Nutritional Medicine, School of Medicine, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
Josep M. Mercader Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Hesam Dashti Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Felipe R.C. dos Santos Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Phil Kubitz Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Maya Jackson Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Alina Ainbinder Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Julius Honecker Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany
Saaket Agrawal Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Garrett Garborcauskas Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
David R. Stirling Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Aaron Leong Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Katherine Figueroa Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
Nasa Sinnott-Armstrong Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Department of Genetics, Stanford University, San Francisco, CA, USA
Maria Kost-Alimova Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Giacomo Deodato Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Alycen Harney Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Gregory P. Way Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Alham Saadat Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Sierra Harken Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Saskia Reibe-Pal Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK
Hannah Ebert Institute of Nutritional Science, University Hohenheim, 70599 Stuttgart, Germany
Yixin Zhang Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA
Virtu Calabuig-Navarro Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Institute of Nutritional Science, University Hohenheim, 70599 Stuttgart, Germany
Elizabeth McGonagle Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Adam Stefek Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Josée Dupuis Department of Biostatistics, Boston University School of Public Health, Boston, MA 02118, USA Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montreal, QC H3A 1G1, Canada
Beth A. Cimini Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Hans Hauner Institute of Nutritional Medicine, School of Medicine, Technical University of Munich, 85354 Freising-Weihenstephan, Germany Else Kröner-Fresenius-Centre for Nutritional Medicine, School of Life Sciences, Technical University of Munich, 85354 Freising-Weihenstephan, Germany German Center for Diabetes Research (DZD), 85764 Neuherberg, Germany
Miriam S. Udler Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Anne E. Carpenter Imaging Platform, Center for the Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
Jose C. Florez Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA
Cecilia Lindgren Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford OX3 7FZ, UK Wellcome Centre for Human Genetics, University of Oxford, Oxford OX3 7BN, UK
Suzanne B.R. Jacobs Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA
Melina Claussnitzer Programs in Metabolism and Medical and Population Genetics, Type 2 Diabetes Systems Genomics Initiative, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA Diabetes Unit and Center for Genomic Medicine, Massachusetts General Hospital, Boston, MA 02114, USA Department of Medicine, Harvard Medical School, Boston, MA 02114, USA The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA

Rams M, Conrad TOF. Dictionary learning allows model-free pseudotime estimation of transcriptomic data. BMC Genomics 2022;23:56. [PMID: 35033004 PMCID: PMC8760643 DOI: 10.1186/s12864-021-08276-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/04/2020] [Accepted: 12/22/2021] [Indexed: 11/10/2022] Open

Abstract

Background

Pseudotime estimation from dynamic single-cell transcriptomic data enables characterisation and understanding of the underlying processes, for example developmental processes. Various pseudotime estimation methods have been proposed in recent years. Typically, these methods start with a dimension reduction step because the low-dimensional representation is usually easier to analyse. Approaches such as PCA, ICA or t-SNE are among the most widely used methods for dimension reduction in pseudotime estimation. However, these methods usually make assumptions about the derived dimensions, which can result in important dataset properties being missed. In this paper, we suggest a new dictionary learning based approach, dynDLT, for dimension reduction and pseudotime estimation of dynamic transcriptomic data. Dictionary learning is a matrix factorisation approach that does not restrict the dependence of the derived dimensions. To evaluate the performance, we conduct a large simulation study and analyse 8 real-world datasets.

Results

The simulation studies reveal that firstly, dynDLT preserves the simulated patterns in low-dimension and the pseudotimes can be derived from the low-dimensional representation. Secondly, the results show that dynDLT is suitable for the detection of genes exhibiting the simulated dynamic patterns, thereby facilitating the interpretation of the compressed representation and thus the dynamic processes. For the real-world data analysis, we select datasets with samples that are taken at different time points throughout an experiment. The pseudotimes found by dynDLT have high correlations with the experimental times. We compare the results to other approaches used in pseudotime estimation, or those that are method-wise closely connected to dictionary learning: ICA, NMF, PCA, t-SNE, and UMAP. DynDLT has the best overall performance for the simulated and real-world datasets.
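As a rough illustration of the evaluation described above (comparing estimated pseudotimes with experimental sampling times), the following sketch computes a rank correlation in plain Python. The abstract does not say which correlation measure the authors used; Spearman correlation is an assumption here, and the function names and toy values are hypothetical.

```python
from math import sqrt


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data: sampling times (hours) and estimated pseudotimes.
experimental_times = [0, 0, 6, 6, 12, 12, 24, 24]
pseudotimes = [0.05, 0.10, 0.30, 0.35, 0.55, 0.50, 0.90, 0.95]

# The direction of a pseudotime axis is arbitrary, so the absolute value
# of the correlation is the quantity of interest.
print(abs(spearman(experimental_times, pseudotimes)))
```

A rank-based measure is used in this sketch because pseudotime is only expected to preserve the ordering of samples, not the spacing between time points.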

Conclusions

We introduce dynDLT, a method that is suitable for pseudotime estimation. Its main advantages are: (1) It presents a model-free approach, meaning that it does not restrict the dependence of the derived dimensions; (2) Genes that are relevant in the detected dynamic processes can be identified from the dictionary matrix; (3) By a restriction of the dictionary entries to positive values, the dictionary atoms are highly interpretable.

Supplementary Information

The online version contains supplementary material available at (10.1186/s12864-021-08276-9).

Jamalkandi SA, Kouhsar M, Salimian J, Ahmadi A. The identification of co-expressed gene modules in Streptococcus pneumonia from colonization to infection to predict novel potential virulence genes. BMC Microbiol 2020;20:376. [PMID: 33334315 PMCID: PMC7745498 DOI: 10.1186/s12866-020-02059-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2020] [Accepted: 12/02/2020] [Indexed: 11/14/2022] Open

Deng AC, Sun XQ. Dynamic gene regulatory network reconstruction and analysis based on clinical transcriptomic data of colorectal cancer. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2020;17:3224-3239. [PMID: 32987526 DOI: 10.3934/mbe.2020183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/11/2023]

Liang S, Wang F, Han J, Chen K. Latent periodic process inference from single-cell RNA-seq data. Nat Commun 2020;11:1441. [PMID: 32188848 PMCID: PMC7080821 DOI: 10.1038/s41467-020-15295-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2019] [Accepted: 03/03/2020] [Indexed: 11/15/2022] Open

Pierson E, Koh PW, Hashimoto T, Koller D, Leskovec J, Eriksson N, Liang P. Inferring Multidimensional Rates of Aging from Cross-Sectional Data. PROCEEDINGS OF MACHINE LEARNING RESEARCH 2019;89:97-107. [PMID: 31538144 PMCID: PMC6752884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]

Identification of Novel Genes in Human Airway Epithelial Cells associated with Chronic Obstructive Pulmonary Disease (COPD) using Machine-Based Learning Algorithms. Sci Rep 2018;8:15775. [PMID: 30361509 PMCID: PMC6202402 DOI: 10.1038/s41598-018-33986-8] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2018] [Accepted: 10/07/2018] [Indexed: 01/26/2023] Open

Cook D, Achanta S, Hoek JB, Ogunnaike BA, Vadigepalli R. Cellular network modeling and single cell gene expression analysis reveals novel hepatic stellate cell phenotypes controlling liver regeneration dynamics. BMC SYSTEMS BIOLOGY 2018;12:86. [PMID: 30285726 PMCID: PMC6171157 DOI: 10.1186/s12918-018-0605-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/17/2018] [Accepted: 08/21/2018] [Indexed: 12/26/2022]

Abstract

Background

Recent results from single cell gene and protein regulation studies are starting to uncover the previously underappreciated fact that individual cells within a population exhibit high variability in the expression of mRNA and proteins (i.e., molecular variability). By combining cellular network modeling and high-throughput gene expression measurements in single cells, we seek to reconcile the high molecular variability in single cells with the relatively low variability in tissue-scale gene and protein expression and the highly coordinated functional responses of tissues to physiological challenges. In this study, we focus on relating the dynamic changes in distributions of hepatic stellate cell (HSC) functional phenotypes to the tightly regulated physiological response of liver regeneration.

Results

We develop a mathematical model describing contributions of HSC functional phenotype populations to liver regeneration and test model predictions through isolation and transcriptional characterization of single HSCs. We identify and characterize four HSC transcriptional states contributing to liver regeneration, two of which are described for the first time in this work. We show that HSC state populations change in vivo in response to acute challenges (in this case, 70% partial hepatectomy) and chronic challenges (chronic ethanol consumption). Our results indicate that HSCs influence the dynamics of liver regeneration through steady-state tissue preconditioning prior to an acute insult and through dynamic control of cell state balances. Furthermore, our modeling approach provides a framework to understand how balances among cell states influence tissue dynamics.

Conclusions

Taken together, our combined modeling and experimental studies reveal novel HSC transcriptional states and indicate that baseline differences in HSC phenotypes as well as a dynamic balance of transitions between these phenotypes control liver regeneration responses.

Electronic supplementary material

The online version of this article (10.1186/s12918-018-0605-7) contains supplementary material, which is available to authorized users.

Uncovering pseudotemporal trajectories with covariates from single cell and bulk expression data. Nat Commun 2018;9:2442. [PMID: 29934517 PMCID: PMC6015076 DOI: 10.1038/s41467-018-04696-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2017] [Accepted: 05/17/2018] [Indexed: 12/29/2022] Open

Hong CF, Chen YC, Chen WC, Tu KC, Tsai MH, Chan YK, Yu SS. Construction of diagnosis system and gene regulatory networks based on microarray analysis. J Biomed Inform 2018;81:61-73. [PMID: 29550394 DOI: 10.1016/j.jbi.2018.03.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/02/2017] [Revised: 01/30/2018] [Accepted: 03/12/2018] [Indexed: 01/02/2023]

Abstract

A microarray analysis generally contains expression data for thousands of genes, but most of them are irrelevant to the disease of interest, which complicates the analysis of genes related to a specific disease. Therefore, filtering out a few essential genes as well as their regulatory networks is critical, and a disease can be easily diagnosed just depending on the expression profiles of a few critical genes. In this study, a target gene screening (TGS) system, which is a microarray-based information system that integrates F-statistics, pattern recognition matching, a two-layer K-means classifier, a Parameter Detection Genetic Algorithm (PDGA), a genetic-based gene selector (GBG selector) and the association rule, was developed to screen out a small subset of genes that can discriminate malignant stages of cancers. During the first stage, F-statistic, pattern recognition matching, and a two-layer K-means classifier were applied in the system to filter out the 20 critical genes most relevant to ovarian cancer from 9600 genes, and the PDGA was used to decide the fittest values of the parameters for these critical genes. Among the 20 critical genes, 15 are associated with cancer progression. In the second stage, we further employed a GBG selector and the association rule to screen out seven target gene sets, each with only four to six genes, and each of which can precisely identify the malignancy stage of ovarian cancer based on their expression profiles. We further deduced the gene regulatory networks of the 20 critical genes by applying the Pearson correlation coefficient to evaluate the correlation between the expression of each gene at the same stages and at different stages. Correlations between gene pairs were calculated, and then, three regulatory networks were deduced. These correlations were further confirmed by the Ingenuity pathway analysis. The prognostic significance of the genes identified via regulatory networks was examined using online tools, and most represented biomarker candidates. In summary, our proposed system provides a new strategy to identify critical genes or biomarkers, as well as their regulatory networks, from microarray data.

Data-analysis strategies for image-based cell profiling. Nat Methods 2017;14:849-863. [PMID: 28858338 PMCID: PMC6871000 DOI: 10.1038/nmeth.4397] [Citation(s) in RCA: 402] [Impact Index Per Article: 57.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2016] [Accepted: 07/28/2017] [Indexed: 12/16/2022]

Sun Y, Yao J, Yang L, Chen R, Nowak NJ, Goodison S. Computational approach for deriving cancer progression roadmaps from static sample data. Nucleic Acids Res 2017;45:e69. [PMID: 28108658 PMCID: PMC5436003 DOI: 10.1093/nar/gkx003] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/14/2016] [Accepted: 01/07/2017] [Indexed: 12/26/2022] Open

Zhang X, Yosef N. A new way to build cell lineages. eLife 2017;6. [PMID: 28332977 PMCID: PMC5364025 DOI: 10.7554/elife.25654] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/20/2017] [Accepted: 03/20/2017] [Indexed: 02/06/2023] Open

Eshleman R, Singh R. Reconstructing the Temporal Progression of Biological Data Using Cluster Spanning Trees. IEEE Trans Nanobioscience 2017;16:140-147. [PMID: 28207402 DOI: 10.1109/tnb.2017.2667402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Campbell KR, Yau C. Order Under Uncertainty: Robust Differential Expression Analysis Using Probabilistic Models for Pseudotime Inference. PLoS Comput Biol 2016;12:e1005212. [PMID: 27870852 PMCID: PMC5117567 DOI: 10.1371/journal.pcbi.1005212] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/21/2016] [Accepted: 10/13/2016] [Indexed: 11/18/2022] Open

Abstract

Single cell gene expression profiling can be used to quantify transcriptional dynamics in temporal processes, such as cell differentiation, using computational methods to label each cell with a ‘pseudotime’ where true time series experimentation is too difficult to perform. However, owing to the high variability in gene expression between individual cells, there is an inherent uncertainty in the precise temporal ordering of the cells. Pre-existing methods for pseudotime estimation have predominantly given point estimates precluding a rigorous analysis of the implications of uncertainty. We use probabilistic modelling techniques to quantify pseudotime uncertainty and propagate this into downstream differential expression analysis. We demonstrate that reliance on a point estimate of pseudotime can lead to inflated false discovery rates and that probabilistic approaches provide greater robustness and measures of the temporal resolution that can be obtained from pseudotime inference.

Understanding the “cellular programming” that controls fundamental, dynamic biological processes is important for determining normal cellular function and potential perturbations that might give rise to physiological disorders. Ideally, investigations would employ time series experiments to periodically measure the properties of each cell. This would allow us to understand the sequence of gene (in)activations that constitute the program being followed. In practice, such experiments can be difficult to perform as cellular activity may be asynchronous with each cell occupying a different phase of the process of interest. Furthermore, the unbiased measurement of all transcripts or proteins requires the cells to be captured and lysed, precluding the continued monitoring of that cell. In the absence of the ability to conduct true time series experiments, pseudotime algorithms exploit the asynchronous cellular nature of these systems to mathematically assign a “pseudotime” to each cell based on its molecular profile, allowing the cells to be aligned and the sequence of gene activation events retrospectively inferred. Existing approaches predominantly use deterministic methods that ignore the statistical uncertainties associated with the problem. This paper demonstrates that this statistical uncertainty limits the temporal resolution that can be extracted from static snapshots of cell expression profiles and can also detrimentally affect downstream analysis.

Trapnell C. Defining cell types and states with single-cell genomics. Genome Res 2016;25:1491-8. [PMID: 26430159 PMCID: PMC4579334 DOI: 10.1101/gr.190595.115] [Citation(s) in RCA: 449] [Impact Index Per Article: 56.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023]

Visualization and cellular hierarchy inference of single-cell data using SPADE. Nat Protoc 2016;11:1264-79. [PMID: 27310265 DOI: 10.1038/nprot.2016.066] [Citation(s) in RCA: 80] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Xu Y, Qiu P, Roysam B. Unsupervised Discovery of Subspace Trends. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2015;37:2131-2145. [PMID: 26353189 DOI: 10.1109/tpami.2015.2394475] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Oscope identifies oscillatory genes in unsynchronized single-cell RNA-seq experiments. Nat Methods 2015;12:947-950. [PMID: 26301841 PMCID: PMC4589503 DOI: 10.1038/nmeth.3549] [Citation(s) in RCA: 110] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2014] [Accepted: 06/05/2015] [Indexed: 01/24/2023]

Yuan K, Sakoparnig T, Markowetz F, Beerenwinkel N. BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies. Genome Biol 2015;16:36. [PMID: 25786108 PMCID: PMC4359483 DOI: 10.1186/s13059-015-0592-6] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/12/2014] [Accepted: 01/21/2015] [Indexed: 11/28/2022] Open

Francesconi M, Lehner B. Reconstructing and analysing cellular states, space and time from gene expression profiles of many cells and single cells. MOLECULAR BIOSYSTEMS 2015;11:2690-8. [DOI: 10.1039/c5mb00339c] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]

Xie R, Huang H, Li W, Chen B, Jiang J, He Y, Lv J, Ma B, Zhou Y, Feng C, Chen L, He W. Identifying progression related disease risk modules based on the human subcellular signaling networks. MOLECULAR BIOSYSTEMS 2014;10:3298-309. [PMID: 25315201 DOI: 10.1039/c4mb00482e] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]

Sun Y, Yao J, Nowak NJ, Goodison S. Cancer progression modeling using static sample data. Genome Biol 2014;15:440. [PMID: 25155694 PMCID: PMC4196119 DOI: 10.1186/s13059-014-0440-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2014] [Accepted: 08/14/2014] [Indexed: 12/20/2022] Open

Wang Z, San Lucas FA, Qiu P, Liu Y. Improving the sensitivity of sample clustering by leveraging gene co-expression networks in variable selection. BMC Bioinformatics 2014;15:153. [PMID: 24885641 PMCID: PMC4035826 DOI: 10.1186/1471-2105-15-153] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/16/2013] [Accepted: 05/14/2014] [Indexed: 11/10/2022] Open

Abstract

Background

Many variable selection techniques have been proposed for the clustering of gene expression data. While these methods tend to filter out irrelevant genes and identify informative genes that contribute to a clustering solution, they are based on criteria that do not consider the potential interactive influence among individual genes. Motivated by ensemble clustering, there is a strong interest in leveraging the structure of gene networks for gene selection, so that the relationship information between genes can be effectively utilized, while the selected genes are expected to preserve all the possible clustering structures in the data.

Results

We present a new filter method that uses the gene connectivity in the gene co-expression network as the evaluation criterion for variable selection. The gene connectivity measures the importance of the genes in terms of their expression similarity with others in the co-expression network. The hard threshold and soft threshold transformations are employed to construct the gene co-expression networks. Both simulation studies and real data analysis have shown that the network based on soft thresholding is more effective in selecting relevant variables and provides better clustering results compared to the hard thresholding transformation and two other canonical filter methods for variable selection. Furthermore, a new module analysis approach is proposed to reveal the higher order organization of the gene space, where the genes of a module share significant topological similarity and are associated with a consensus partition of the sample space. We demonstrate that the identified modules can lead to biologically meaningful sample partitions that might be missed by other methods.
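The connectivity-based filter described above can be sketched as follows. The abstract gives no formulas, so this uses the forms common in weighted co-expression network analysis as an assumption: adjacency is 1 when |r| >= tau (hard threshold) or |r|^beta (soft threshold), where r is the Pearson correlation between two genes, and a gene's connectivity is the sum of its adjacencies to all other genes. The parameter names `tau` and `beta`, the function names, and the toy data are hypothetical and are not taken from the authors' netVar program.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either sequence is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


def hard_threshold(tau):
    """Adjacency is 1 if |r| >= tau, else 0 (unweighted network)."""
    return lambda r: 1.0 if r >= tau else 0.0


def soft_threshold(beta):
    """Adjacency is |r| ** beta (weighted network)."""
    return lambda r: r ** beta


def connectivity(expr, transform):
    """Sum each gene's adjacency to every other gene.

    expr maps a gene name to its expression values across samples;
    transform maps |r| to an adjacency value.
    """
    genes = list(expr)
    k = {g: 0.0 for g in genes}
    for i, gi in enumerate(genes):
        for gj in genes[i + 1:]:
            a = transform(abs(pearson(expr[gi], expr[gj])))
            k[gi] += a
            k[gj] += a
    return k


def top_genes(k, n):
    """Select the n genes with the highest connectivity."""
    return sorted(k, key=k.get, reverse=True)[:n]


# Hypothetical toy data: g1 and g2 are co-expressed, g3 is unrelated.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.0, -1.0, -1.0, 1.0],
}
k_hard = connectivity(expr, hard_threshold(tau=0.8))
k_soft = connectivity(expr, soft_threshold(beta=6))
print(top_genes(k_soft, 2))
```

The soft transformation keeps graded information about weak correlations instead of discarding everything below a cutoff, which is one plausible reading of why the abstract reports it as the more effective of the two.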

Conclusions

By leveraging the structure of the gene co-expression network, we first propose a variable selection method that selects the individual genes with the highest connectivity. Both simulation studies and real data application have demonstrated that our method performs better in terms of the reliability of the selected genes and the sample clustering results. In addition, we propose a module recovery method that can help discover novel sample partitions that might be hidden when performing clustering analyses using all available genes. The source code of our program is available at http://nba.uth.tmc.edu/homepage/liu/netVar/.

The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 2014. [PMID: 24658644 DOI: 10.1038/nbt.2859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 2014;32:381-386. [PMID: 24658644 PMCID: PMC4122333 DOI: 10.1038/nbt.2859] [Citation(s) in RCA: 3646] [Impact Index Per Article: 364.6] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2013] [Accepted: 02/25/2014] [Indexed: 11/20/2022]

Martinez E, Trevino V. Modelling gene expression profiles related to prostate tumor progression using binary states. Theor Biol Med Model 2013;10:37. [PMID: 23721350 PMCID: PMC3691825 DOI: 10.1186/1742-4682-10-37] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2012] [Accepted: 05/21/2013] [Indexed: 01/27/2023] Open

Sánchez-Alvarez R, Gayen S, Vadigepalli R, Anni H. Ethanol diverts early neuronal differentiation trajectory of embryonic stem cells by disrupting the balance of lineage specifiers. PLoS One 2013;8:e63794. [PMID: 23724002 PMCID: PMC3665827 DOI: 10.1371/journal.pone.0063794] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2013] [Accepted: 04/04/2013] [Indexed: 02/07/2023] Open

Qiu P, Plevritis SK. TreeVis: a MATLAB-based tool for tree visualization. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2013;109:74-6. [PMID: 23036855 PMCID: PMC3508366 DOI: 10.1016/j.cmpb.2012.08.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/17/2011] [Revised: 06/02/2012] [Accepted: 08/15/2012] [Indexed: 05/25/2023]

Qiu P, Zhang L. Identification of markers associated with global changes in DNA methylation regulation in cancers. BMC Bioinformatics 2012;13 Suppl 13:S7. [PMID: 23320390 PMCID: PMC3426805 DOI: 10.1186/1471-2105-13-s13-s7] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Ng JWY, Barrett LM, Wong A, Kuh D, Smith GD, Relton CL. The role of longitudinal cohort studies in epigenetic epidemiology: challenges and opportunities. Genome Biol 2012;13:246. [PMID: 22747597 PMCID: PMC3446311 DOI: 10.1186/gb-2012-13-6-246] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Inferring phenotypic properties from single-cell characteristics. PLoS One 2012;7:e37038. [PMID: 22662133 PMCID: PMC3360688 DOI: 10.1371/journal.pone.0037038] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2012] [Accepted: 04/11/2012] [Indexed: 11/19/2022] Open

Cornero A, Acquaviva M, Fardin P, Versteeg R, Schramm A, Eva A, Bosco MC, Blengio F, Barzaghi S, Varesio L. Design of a multi-signature ensemble classifier predicting neuroblastoma patients' outcome. BMC Bioinformatics 2012;13 Suppl 4:S13. [PMID: 22536959 PMCID: PMC3314564 DOI: 10.1186/1471-2105-13-s4-s13] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Abstract

Background

Neuroblastoma is the most common pediatric solid tumor of the sympathetic nervous system. Development of improved predictive tools for patient stratification is a crucial requirement for neuroblastoma therapy. Several studies utilized gene expression-based signatures to stratify neuroblastoma patients and demonstrated a clear advantage of adding genomic analysis to risk assessment. There is little overlap among signatures, and merging their prognostic potential would be advantageous. Here, we describe a new strategy to merge published neuroblastoma-related gene signatures into a single, highly accurate Multi-Signature Ensemble (MuSE) classifier of neuroblastoma (NB) patient outcome.

Methods

Gene expression profiles of 182 neuroblastoma tumors, subdivided into three independent datasets, were used in the various phases of development and validation of the NB-MuSE-classifier. Thirty-three signatures were evaluated for patient outcome prediction using 22 classification algorithms each, generating 726 classifiers and prediction results. The best-performing algorithm for each signature was selected and validated on an independent dataset, and the 20 signatures performing with an accuracy >= 80% were retained.

Results

We combined the 20 predictions associated with the corresponding signatures into a single outcome predictor by selecting the best-performing algorithm. The best performance was obtained by the Decision Table algorithm, which produced the NB-MuSE-classifier with an external validation accuracy of 94%. Kaplan-Meier curves and the log-rank test demonstrated that patients with good and poor outcome predictions by the NB-MuSE-classifier have significantly different survival (p < 0.0001). Survival curves constructed on subgroups of patients divided on the basis of known prognostic markers suggested an excellent stratification of localized and stage 4s tumors, but more data are needed to prove this point.

Conclusions

The NB-MuSE-classifier is based on an ensemble approach that merges twenty heterogeneous, neuroblastoma-related gene signatures to blend their discriminating power, rather than numeric values, into a single, highly accurate patient outcome predictor. The novelty of our approach derives from the way it integrates the gene expression signatures, by optimally associating them with a single paradigm ultimately integrated into a single classifier. This model can be exported to other types of cancer and to diseases for which dedicated databases exist.

Ng JWY, Barrett LM, Wong A, Kuh D, Smith G, Relton CL. The role of longitudinal cohort studies in epigenetic epidemiology: challenges and opportunities. Genome Biol 2012. [DOI: 10.1186/gb4029] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open

Chen EY, Xu H, Gordonov S, Lim MP, Perkins MH, Ma'ayan A. Expression2Kinases: mRNA profiling linked to multiple upstream regulatory layers. Bioinformatics 2011;28:105-11. [PMID: 22080467 DOI: 10.1093/bioinformatics/btr625] [Citation(s) in RCA: 113] [Impact Index Per Article: 8.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Extracting a cellular hierarchy from high-dimensional cytometry data with SPADE. Nat Biotechnol 2011;29:886-91. [PMID: 21964415 PMCID: PMC3196363 DOI: 10.1038/nbt.1991] [Citation(s) in RCA: 702] [Impact Index Per Article: 54.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/10/2011] [Accepted: 08/31/2011] [Indexed: 01/17/2023]