151
Noise distorts the epigenetic landscape and shapes cell-fate decisions. Cell Syst 2021; 13:83-102.e6. [PMID: 34626539 DOI: 10.1016/j.cels.2021.09.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 01/20/2021] [Revised: 06/21/2021] [Accepted: 09/02/2021] [Indexed: 12/24/2022]
Abstract
The Waddington epigenetic landscape has become an iconic representation of the cellular differentiation process. Recent single-cell transcriptomic data provide new opportunities for quantifying this originally conceptual tool, offering insight into the gene regulatory networks underlying cellular development. While many methods for constructing the landscape have been proposed, by far the most commonly employed approach is based on computing the landscape as the negative logarithm of the steady-state probability distribution. Here, we use simple models to highlight the complexities and limitations that arise when reconstructing the potential landscape in the presence of stochastic fluctuations. We consider how the landscape changes in accordance with different stochastic systems and show that it is the subtle interplay between the deterministic and stochastic components of the system that ultimately shapes the landscape. We further discuss how the presence of noise has important implications for the identifiability of the regulatory dynamics from experimental data. A record of this paper's transparent peer review process is included in the supplemental information.
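The approach named in the abstract, computing the landscape as the negative logarithm of the steady-state probability distribution, U(x) = -ln P_ss(x), can be illustrated with a minimal sketch. This is not the authors' code: the function name `quasi_potential`, the histogram binning, and the one-dimensional setting are assumptions made for illustration, and the input samples are simply assumed to come from a system at steady state.

```python
import math


def quasi_potential(samples, lo, hi, n_bins):
    """Estimate U(x) = -ln P(x) on a regular 1-D grid from samples.

    Hypothetical helper: P is approximated by a normalized histogram
    over [lo, hi); empty bins get an infinite potential.
    """
    if n_bins <= 0 or hi <= lo:
        raise ValueError("need n_bins > 0 and hi > lo")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in samples:
        if lo <= x < hi:
            # Guard against floating-point rounding at the upper edge.
            index = min(int((x - lo) / width), n_bins - 1)
            counts[index] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no samples fall inside [lo, hi)")
    centers = [lo + (i + 0.5) * width for i in range(n_bins)]
    potential = [
        -math.log(c / (total * width)) if c > 0 else math.inf
        for c in counts
    ]
    return centers, potential
```

Well-populated bins map to low values of the estimated potential and rarely visited bins to high values. As the abstract cautions, a landscape obtained this way reflects both the deterministic and the stochastic components of the system, so it should not be read as a picture of the deterministic dynamics alone.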
152
Ma X, Somasundaram A, Qi Z, Hartman D, Singh H, Osmanbeyoglu H. SPaRTAN, a computational framework for linking cell-surface receptors to transcriptional regulators. Nucleic Acids Res 2021; 49:9633-9647. [PMID: 34500467 PMCID: PMC8464045 DOI: 10.1093/nar/gkab745] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/17/2021] [Revised: 08/09/2021] [Accepted: 09/06/2021] [Indexed: 12/22/2022] Open
Abstract
The identity and functions of specialized cell types depend on the complex interplay between signaling and transcriptional networks. Recently, single-cell technologies have been developed that enable simultaneous quantitative analysis of cell-surface receptor expression and transcriptional states. To date, these datasets have not been used to systematically develop cell-context-specific maps of the interface between signaling and transcriptional regulators orchestrating cellular identity and function. We present SPaRTAN (Single-cell Proteomic and RNA based Transcription factor Activity Network), a computational method to link cell-surface receptors to transcription factors (TFs) by exploiting cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) datasets with cis-regulatory information. SPaRTAN is applied to immune cell types in the blood to predict the coupling of signaling receptors with cell-context-specific TFs. Selected predictions are validated by prior knowledge and flow cytometry analyses. SPaRTAN is then used to predict the signaling-coupled TF states of tumor-infiltrating CD8+ T cells in malignant peritoneal and pleural mesotheliomas. SPaRTAN enhances the utility of CITE-seq datasets to uncover TF and cell-surface receptor relationships in diverse cellular states.
Affiliation(s)
- Xiaojun Ma
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Ashwin Somasundaram
- Department of Medicine, Division of Hematology/Oncology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Zengbiao Qi
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Douglas J Hartman
- Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, PA 15213, USA
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
- Harinder Singh
- Center for Systems Immunology and Department of Immunology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Hatice Ulku Osmanbeyoglu
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Department of Bioengineering, University of Pittsburgh, Pittsburgh, PA 15261, USA
- UPMC Hillman Cancer Center, Pittsburgh, PA 15213, USA
153
Saint-André V. Computational biology approaches for mapping transcriptional regulatory networks. Comput Struct Biotechnol J 2021; 19:4884-4895. [PMID: 34522292 PMCID: PMC8426465 DOI: 10.1016/j.csbj.2021.08.028] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/24/2021] [Revised: 08/16/2021] [Accepted: 08/16/2021] [Indexed: 12/13/2022] Open
Abstract
Transcriptional Regulatory Networks (TRNs) are mainly responsible for the cell-type- or cell-state-specific expression of gene sets from the same DNA sequence. However, so far there are no precise maps of TRNs available for each cell type or cell state, and no ideal tool to map those networks clearly and in full from biological samples. In this review, major approaches and tools to map TRNs from high-throughput data are presented, grouped by the type of methods or data used to infer them, and their advantages and limitations are discussed. After a summary of the main principles defining the topology and structure–function relationships in TRNs, an overview of the extensive work done to map TRNs from bulk transcriptomic data is presented by type of methodological approach. The most recent efforts to model TRNs using other types of molecular data or integrating different data types, including single-cell RNA-sequencing and chromatin information, are then discussed, before a brief conclusion on the improvements expected in the field.
Affiliation(s)
- Violaine Saint-André
- Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
154
Raharinirina NA, Peppert F, von Kleist M, Schütte C, Sunkara V. Inferring gene regulatory networks from single-cell RNA-seq temporal snapshot data requires higher-order moments. Patterns 2021; 2:100332. [PMID: 34553172 PMCID: PMC8441581 DOI: 10.1016/j.patter.2021.100332] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/03/2021] [Revised: 02/23/2021] [Accepted: 07/22/2021] [Indexed: 11/30/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. To date, this aspiration remains unrealized due to technical and computational challenges. In this work we focus on the latter, which is under-represented in the literature. We took a systemic approach by subdividing the GRN inference into three fundamental components: data pre-processing, feature extraction, and inference. We observed that the regulatory signature is captured in the statistical moments of scRNA-seq data and requires computationally intensive minimization solvers to extract it. Furthermore, current data pre-processing might not conserve these statistical moments. Although our moment-based approach is a didactic tool for understanding the different compartments of GRN inference, this line of thinking, finding computationally feasible multi-dimensional statistics of data, is imperative for designing GRN inference methods.
Highlights
- Single-cell RNA-seq temporal snapshot data for detecting regulation
- Challenges in data pre-processing, feature extraction, and network inference for GRNs
- Encoding of regulatory information in higher-order raw moments
- Non-linear least-squares inference for temporal scRNA-seq snapshot data
Single-cell RNA sequencing (scRNA-seq) has become ubiquitous in biology. Recently, there has been a push for using scRNA-seq snapshot data to infer the underlying gene regulatory networks (GRNs) steering cellular function. A recent benchmark of 12 GRN methods demonstrated that the algorithms struggled to predict the ground-truth GRNs and speculated that the low performance was due to the insufficient resolution in the scRNA-seq data. Rather than proposing another method, this paper focuses on how to decompose a GRN problem into three subproblems (pre-processing, feature extraction, and inference), so that the gene regulatory information is preserved in each step. Subsequently, we discuss how to best approach each of the three subproblems.
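To make the notion of "higher-order raw moments" of a snapshot concrete, the following minimal sketch computes a raw (non-central) moment across the cells of one time point. It only illustrates the feature-extraction idea mentioned in the abstract; it is not the authors' inference procedure (the non-linear least-squares step is not shown), and the function name `raw_moment` and the data layout (one tuple of per-gene counts per cell) are assumptions.

```python
def raw_moment(cells, exponents):
    """Raw moment E[prod_g x_g ** exponents[g]] over the cells of a snapshot.

    cells     : list of per-cell count tuples, one entry per gene
    exponents : one non-negative integer power per gene
    Hypothetical helper for illustration only.
    """
    if not cells:
        raise ValueError("snapshot contains no cells")
    total = 0.0
    for counts in cells:
        if len(counts) != len(exponents):
            raise ValueError("each cell needs one count per exponent")
        term = 1.0
        for value, power in zip(counts, exponents):
            term *= value ** power
        total += term
    return total / len(cells)


# Toy snapshot with two genes measured in two cells (made-up numbers).
snapshot = [(1, 2), (3, 4)]
mean_gene_0 = raw_moment(snapshot, (1, 0))    # first-order moment of gene 0
second_gene_0 = raw_moment(snapshot, (2, 0))  # second-order raw moment of gene 0
cross_moment = raw_moment(snapshot, (1, 1))   # mixed moment of genes 0 and 1
```

Mixed moments such as `cross_moment` carry information about how genes vary together, which a per-gene mean cannot capture. A pre-processing step that rescales counts changes these values, which is the kind of concern the abstract raises.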
Affiliation(s)
- Felix Peppert
- Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
- Max von Kleist
- MF1 Bioinformatics, Methods Development and Research Infrastructure, Robert Koch Institute, 13353 Berlin, Germany
- Christof Schütte
- Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
- Vikram Sunkara
- Mathematics of Complex Systems, Zuse Institute Berlin, 14195 Berlin, Germany; Explainable A.I. for Biology, Zuse Institute Berlin, 14195 Berlin, Germany
155
Moutsopoulos I, Maischak L, Lauzikaite E, Vasquez Urbina S, Williams E, Drost HG, Mohorianu I. noisyR: enhancing biological signal in sequencing datasets by characterizing random technical noise. Nucleic Acids Res 2021; 49:e83. [PMID: 34076236 PMCID: PMC8373073 DOI: 10.1093/nar/gkab433] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/18/2021] [Revised: 04/16/2021] [Accepted: 05/06/2021] [Indexed: 01/22/2023] Open
Abstract
High-throughput sequencing enables an unprecedented resolution in transcript quantification, at the cost of magnifying the impact of technical noise. The consistent reduction of random background noise to capture functionally meaningful biological signals is still challenging. Intrinsic sequencing variability introducing low-level expression variations can obscure patterns in downstream analyses. We introduce noisyR, a comprehensive noise filter to assess the variation in signal distribution and achieve an optimal information-consistency across replicates and samples; this selection also facilitates meaningful pattern recognition outside the background-noise range. noisyR is applicable to count matrices and sequencing data; it outputs sample-specific signal/noise thresholds and filtered expression matrices. We exemplify the effects of minimizing technical noise on several datasets, across various sequencing assays: coding, non-coding RNAs and interactions, at bulk and single-cell level. An immediate consequence of filtering out noise is the convergence of predictions (differential-expression calls, enrichment analyses and inference of gene regulatory networks) across different approaches.
Affiliation(s)
- Ilias Moutsopoulos
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Lukas Maischak
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
- Elze Lauzikaite
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Sergio A Vasquez Urbina
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
- Eleanor C Williams
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
- Hajk-Georg Drost
- Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Max-Planck Ring 1, 72076 Tübingen, Germany
- Irina I Mohorianu
- Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK
156
Dorantes-Gilardi R, García-Cortés D, Hernández-Lemus E, Espinal-Enríquez J. k-core genes underpin structural features of breast cancer. Sci Rep 2021; 11:16284. [PMID: 34381069 PMCID: PMC8358063 DOI: 10.1038/s41598-021-95313-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 02/19/2021] [Accepted: 07/12/2021] [Indexed: 02/07/2023] Open
Abstract
Gene co-expression networks (GCNs) have been developed as relevant analytical tools for the study of the gene expression patterns behind complex phenotypes. Determining the association between structure and function in GCNs is a current challenge in biomedical research. Several structural differences between GCNs of breast cancer and healthy phenotypes have been reported. In a previous study, using co-expression multilayer networks, we have shown that there are abrupt differences in the connectivity patterns of the GCN of basal-like breast cancer between top co-expressed gene-pairs and the remaining gene-pairs. Here, we compared the top-100,000 interactions networks for the four breast cancer phenotypes (Luminal-A, Luminal-B, Her2+ and Basal), in terms of structural properties. For this purpose, we used the graph-theoretical k-core of a network (maximal sub-network with nodes of degree at least k). We developed a comprehensive analysis of the network k-core ([Formula: see text]) structures in cancer, and their relationship with biological functions. We found that in the Top-100,000-edges networks, the majority of interactions in breast cancer networks are intra-chromosome, while inter-chromosome interactions serve as connecting bridges between clusters. Moreover, core genes in the healthy network are strongly associated with processes such as metabolism and cell cycle. In breast cancer, only the core of Luminal A is related to those processes, and genes in its core are over-expressed. The intersection of the core nodes in all subtypes of cancer is composed only of genes in the chr8q24.3 region. This region has previously been observed to be highly amplified in several cancers, and its appearance in the intersection of the four breast cancer k-cores may suggest that local co-expression is a conserved phenomenon in cancer.
Considering the many intricacies associated with these phenomena and the vast amount of research on epigenomic regulation currently under way, there is a need for further research on the epigenomic effects on the structure and function of gene co-expression networks in cancer.
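The k-core used in this abstract (the maximal sub-network in which every node has degree at least k) can be computed by repeatedly peeling away nodes of too-low degree. The sketch below shows this standard peeling procedure under stated assumptions: it is not the authors' pipeline, the function name `k_core` is hypothetical, and the graph is assumed to be undirected and given as a symmetric dictionary of neighbor sets.

```python
def k_core(adjacency, k):
    """Return the node set of the k-core of an undirected graph.

    adjacency : dict mapping each node to the set of its neighbors
                (assumed symmetric: b in adjacency[a] iff a in adjacency[b])
    """
    degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
    alive = set(adjacency)
    # Start with every node whose degree is already below k.
    stack = [node for node in alive if degree[node] < k]
    while stack:
        node = stack.pop()
        if node not in alive:
            continue  # already removed via an earlier stack entry
        alive.remove(node)
        # Removing a node lowers the degree of its surviving neighbors.
        for other in adjacency[node]:
            if other in alive:
                degree[other] -= 1
                if degree[other] < k:
                    stack.append(other)
    return alive


# Toy network with made-up labels: a triangle (a, b, c) plus a pendant node d.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
core_2 = k_core(toy, 2)  # the pendant node d is peeled away
```

In a co-expression setting, the nodes would be genes and the edges the retained top co-expressed gene pairs; intersecting the cores obtained for several networks is then a plain set intersection.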
Affiliation(s)
- Rodrigo Dorantes-Gilardi
- Network Science Institute and Department of Physics, Northeastern University, Boston, MA 02115, USA; El Colegio de México, Tlalpan, Mexico City, 14110, Mexico; Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610, Mexico
- Diana García-Cortés
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610, Mexico
- Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610, Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
- Jesús Espinal-Enríquez
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, 14610, Mexico; Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México (UNAM), Mexico City, 04510, Mexico
157
Davis-Marcisak EF, Deshpande A, Stein-O'Brien GL, Ho WJ, Laheru D, Jaffee EM, Fertig EJ, Kagohara LT. From bench to bedside: Single-cell analysis for cancer immunotherapy. Cancer Cell 2021; 39:1062-1080. [PMID: 34329587 PMCID: PMC8406623 DOI: 10.1016/j.ccell.2021.07.004] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Received: 03/02/2021] [Revised: 06/16/2021] [Accepted: 07/02/2021] [Indexed: 01/04/2023]
Abstract
Single-cell technologies are emerging as powerful tools for cancer research. These technologies characterize the molecular state of each cell within a tumor, enabling new exploration of tumor heterogeneity, microenvironment cell-type composition, and cell state transitions that affect therapeutic response, particularly in the context of immunotherapy. Analyzing clinical samples has great promise for precision medicine but is technically challenging. Successfully identifying predictors of response requires well-coordinated, multi-disciplinary teams to ensure adequate sample processing for high-quality data generation and computational analysis for data interpretation. Here, we review current approaches to sample processing and computational analysis regarding their application to translational cancer immunotherapy research.
Affiliation(s)
- Emily F Davis-Marcisak
- McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Atul Deshpande
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Genevieve L Stein-O'Brien
- McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Won J Ho
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Daniel Laheru
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Elizabeth M Jaffee
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Elana J Fertig
- McKusick-Nathans Institute of the Department of Genetic Medicine, Johns Hopkins School of Medicine, 550 N Broadway, Suite 1101E, Baltimore, MD 21205, USA; Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA; Department of Applied Mathematics and Statistics, Johns Hopkins University Whiting School of Engineering, Baltimore, MD, USA; Department of Biomedical Engineering, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Luciane T Kagohara
- Department of Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 1650 Orleans Street, Room 485, Baltimore, MD 21287, USA; Convergence Institute, Johns Hopkins University, Baltimore, MD, USA; Bloomberg-Kimmel Immunotherapy Institute, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
158
Sc-compReg enables the comparison of gene regulatory networks between conditions using single-cell data. Nat Commun 2021; 12:4763. [PMID: 34362918 PMCID: PMC8346476 DOI: 10.1038/s41467-021-25089-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 10/30/2020] [Accepted: 07/20/2021] [Indexed: 12/22/2022] Open
Abstract
The comparison of gene regulatory networks between diseased versus healthy individuals or between two different treatments is an important scientific problem. Here, we propose sc-compReg as a method for the comparative analysis of gene expression regulatory networks between two conditions using single cell gene expression (scRNA-seq) and single cell chromatin accessibility data (scATAC-seq). Our software, sc-compReg, can be used as a stand-alone package that provides joint clustering and embedding of the cells from both scRNA-seq and scATAC-seq, and the construction of differential regulatory networks across two conditions. We apply the method to compare the gene regulatory networks of an individual with chronic lymphocytic leukemia (CLL) versus a healthy control. The analysis reveals a tumor-specific B cell subpopulation in the CLL patient and identifies TOX2 as a potential regulator of this subpopulation. Changes in cell state underlie the difference between health and disease. Here, the authors propose a computational framework for the integration of gene expression and chromatin-accessibility data from single cells to identify differences in gene regulation in cell types across two conditions.
159
Yu CY, Mitrofanova A. Mechanism-Centric Approaches for Biomarker Detection and Precision Therapeutics in Cancer. Front Genet 2021; 12:687813. [PMID: 34408770 PMCID: PMC8365516 DOI: 10.3389/fgene.2021.687813] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/30/2021] [Accepted: 06/28/2021] [Indexed: 12/18/2022] Open
Abstract
Biomarker discovery is at the heart of personalized treatment planning and cancer precision therapeutics, encompassing disease classification and prognosis, prediction of treatment response, and therapeutic targeting. However, many biomarkers represent passenger rather than driver alterations, limiting their utilization as functional units for therapeutic targeting. We suggest that identification of driver biomarkers through mechanism-centric approaches, which take into account upstream and downstream regulatory mechanisms, is fundamental to the discovery of functionally meaningful markers. Here, we examine computational approaches that identify mechanism-centric biomarkers elucidated from gene co-expression networks, regulatory networks (e.g., transcriptional regulation), protein-protein interaction (PPI) networks, and molecular pathways. We discuss their objectives, advantages over gene-centric approaches, and known limitations. Future directions highlight the importance of input and model interpretability, method and data integration, and the role of recently introduced technological advantages, such as single-cell sequencing, which are central for effective biomarker discovery and time-cautious precision therapeutics.
Affiliation(s)
- Christina Y. Yu
- Department of Biomedical and Health Informatics, School of Health Professions, Rutgers, The State University of New Jersey, Newark, NJ, United States
- Antonina Mitrofanova
- Department of Biomedical and Health Informatics, School of Health Professions, Rutgers, The State University of New Jersey, Newark, NJ, United States
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, United States
160
Weidemüller P, Kholmatov M, Petsalaki E, Zaugg JB. Transcription factors: Bridge between cell signaling and gene regulation. Proteomics 2021; 21:e2000034. [PMID: 34314098 DOI: 10.1002/pmic.202000034] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 03/19/2021] [Revised: 07/05/2021] [Accepted: 07/16/2021] [Indexed: 01/17/2023]
Abstract
Transcription factors (TFs) are key regulators of intrinsic cellular processes, such as differentiation and development, and of the cellular response to external perturbation through signaling pathways. In this review we focus on the role of TFs as a link between signaling pathways and gene regulation. Cell signaling tends to result in the modulation of a set of TFs that then lead to changes in the cell's transcriptional program. We highlight the molecular layers at which TF activity can be measured and the associated technical and conceptual challenges. These layers include post-translational modifications (PTMs) of the TF, regulation of TF binding to DNA through chromatin accessibility and epigenetics, and expression of target genes. We highlight that a large number of TFs are understudied in both signaling and gene regulation studies, and that our knowledge about known TF targets has a strong literature bias. We argue that TFs serve as a perfect bridge between the fields of gene regulation and signaling, and that separating these fields hinders our understanding of cell functions. Multi-omics approaches that measure multiple dimensions of TF activity are ideally suited to study the interplay of cell signaling and gene regulation using TFs as the anchor to link the two fields.
Affiliation(s)
- Paula Weidemüller
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Maksim Kholmatov
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, Heidelberg, 69117, Germany
- Evangelia Petsalaki
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, CB10 1SD, UK
- Judith B Zaugg
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Meyerhofstraße 1, Heidelberg, 69117, Germany
161
Mishra S, Srivastava D, Kumar V. Improving gene network inference with graph wavelets and making insights about ageing-associated regulatory changes in lungs. Brief Bioinform 2021; 22:bbaa360. [PMID: 33381809 PMCID: PMC7799288 DOI: 10.1093/bib/bbaa360] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/04/2020] [Revised: 10/12/2020] [Accepted: 11/10/2020] [Indexed: 01/20/2023] Open
Abstract
Using a gene-regulatory-network-based approach for single-cell expression profiles can reveal unprecedented details about the effects of external and internal factors. However, noise and batch effects in sparse single-cell expression profiles can hamper correct estimation of dependencies among genes and of regulatory changes. Here, we devise a conceptually different method using graph-wavelet filters for improving gene network (GWNet)-based analysis of the transcriptome. Our approach improved the performance of several gene network-inference methods. Most importantly, GWNet improved consistency in the prediction of gene regulatory networks from single-cell transcriptomes even in the presence of batch effects. The consistency of the predicted gene networks enabled reliable estimates of changes in the influence of genes not highlighted by differential-expression analysis. Applying GWNet to the single-cell transcriptome profile of lung cells revealed biologically relevant changes in the influence of pathways and master regulators due to ageing. Surprisingly, the regulatory influence of ageing on type II pneumocytes showed noticeable similarity with patterns due to the effect of novel coronavirus infection in human lung.
162
Martinelli J, Dulong S, Li XM, Teboul M, Soliman S, Lévi F, Fages F, Ballesta A. Model learning to identify systemic regulators of the peripheral circadian clock. Bioinformatics 2021; 37:i401-i409. [PMID: 34252929 PMCID: PMC8557835 DOI: 10.1093/bioinformatics/btab297] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Personalized medicine aims at providing patient-tailored therapeutics based on multi-type data toward improved treatment outcomes. Chronotherapy, which consists in adapting drug administration to the patient's circadian rhythms, may be improved by such an approach. Recent clinical studies demonstrated large variability in patients' circadian coordination and optimal drug timing. Consequently, new eHealth platforms allow the monitoring of circadian biomarkers in individual patients through wearable technologies (rest-activity, body temperature), blood or salivary samples (melatonin, cortisol) and daily questionnaires (food intake, symptoms). A current clinical challenge involves designing a methodology for predicting, from circadian biomarkers, the patient's peripheral circadian clocks and the associated optimal drug timing. Because the mammalian circadian timing system is largely conserved between mice and humans, albeit in phase opposition, the study was developed using available mouse datasets.
RESULTS: We investigated at the molecular scale the influence of systemic regulators (e.g. temperature, hormones) on peripheral clocks, through a model learning approach involving systems biology models based on ordinary differential equations. Using our existing circadian clock model as prior knowledge, we derived an approximation for the action of systemic regulators on the expression of three core-clock genes: Bmal1, Per2 and Rev-Erbα. These time profiles were then fitted with a population of models, based on linear regression. The best models involved a modulation of either Bmal1 or Per2 transcription, most likely by temperature or nutrient exposure cycles. This agreed with biological knowledge on temperature-dependent control of Per2 transcription. The strengths of systemic regulations were found to be significantly different according to mouse sex and genetic background.
AVAILABILITY AND IMPLEMENTATION: https://gitlab.inria.fr/julmarti/model-learning-mb21eccb.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Julien Martinelli
  - INSERM UMR-S 900, Institut Curie, MINES ParisTech CBIO, PSL Research University, 92210 Saint-Cloud, France
  - Lifeware Group, Inria Saclay Ile-de-France, Palaiseau 91120, France
- Sandrine Dulong
  - UPR "Chronotherapy, Cancers and Transplantation", Paris-Saclay University, Faculty of Medicine Kremlin Bicêtre, Le Kremlin Bicêtre, 94270, France
- Xiao-Mei Li
  - UPR "Chronotherapy, Cancers and Transplantation", Paris-Saclay University, Faculty of Medicine Kremlin Bicêtre, Le Kremlin Bicêtre, 94270, France
- Michèle Teboul
  - Côte d'Azur University, CNRS, INSERM, iBV, Nice 06000, France
- Sylvain Soliman
  - Lifeware Group, Inria Saclay Ile-de-France, Palaiseau 91120, France
- Francis Lévi
  - UPR "Chronotherapy, Cancers and Transplantation", Paris-Saclay University, Faculty of Medicine Kremlin Bicêtre, Le Kremlin Bicêtre, 94270, France
  - Hepato-Biliary Center, Paul-Brousse Hospital, Assistance Publique-Hôpitaux de Paris, Villejuif 94800, France
- François Fages
  - Lifeware Group, Inria Saclay Ile-de-France, Palaiseau 91120, France
- Annabelle Ballesta
  - INSERM UMR-S 900, Institut Curie, MINES ParisTech CBIO, PSL Research University, 92210 Saint-Cloud, France
|
163
|
Wu AP, Peng J, Berger B, Cho H. Bayesian information sharing enhances detection of regulatory associations in rare cell types. Bioinformatics 2021; 37:i349-i357. [PMID: 34252956 PMCID: PMC8275330 DOI: 10.1093/bioinformatics/btab269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION Recent advances in single-cell RNA-sequencing (scRNA-seq) technologies promise to enable the study of gene regulatory associations at unprecedented resolution in diverse cellular contexts. However, identifying unique regulatory associations observed only in specific cell types or conditions remains a key challenge; this is particularly so for rare transcriptional states whose sample sizes are too small for existing gene regulatory network inference methods to be effective.
RESULTS We present ShareNet, a Bayesian framework for boosting the accuracy of cell type-specific gene regulatory networks by propagating information across related cell types via an information sharing structure that is adaptively optimized for a given single-cell dataset. The techniques we introduce can be used with a range of general network inference algorithms to enhance the output for each cell type. We demonstrate the enhanced accuracy of our approach on three benchmark scRNA-seq datasets. We find that our inferred cell type-specific networks also uncover key changes in gene associations that underpin the complex rewiring of regulatory networks across cell types, tissues and dynamic biological processes. Our work presents a path toward extracting deeper insights about cell type-specific gene regulation in the rapidly growing compendium of scRNA-seq datasets.
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
AVAILABILITY AND IMPLEMENTATION The code for ShareNet is available at http://sharenet.csail.mit.edu and https://github.com/alexw16/sharenet.
Affiliation(s)
- Alexander P Wu
  - Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA
- Jian Peng
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Champaign, IL 61801, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA 02139, USA
  - Department of Mathematics, MIT, Cambridge, MA 02139, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
- Hyunghoon Cho
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
|
164
|
scLink: Inferring Sparse Gene Co-expression Networks from Single-cell Expression Data. Genomics Proteomics & Bioinformatics 2021; 19:475-492. [PMID: 34252628 PMCID: PMC8896229 DOI: 10.1016/j.gpb.2020.11.006] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/28/2020] [Revised: 10/23/2020] [Accepted: 12/26/2020] [Indexed: 11/23/2022]
Abstract
A system-level understanding of the regulation and coordination mechanisms of gene expression is essential for studying the complexity of biological processes in health and disease. With the rapid development of single-cell RNA sequencing technologies, it is now possible to investigate gene interactions in a cell type-specific manner. Here we propose the scLink method, which uses statistical network modeling to understand the co-expression relationships among genes and construct sparse gene co-expression networks from single-cell gene expression data. We use both simulation and real data studies to demonstrate the advantages of scLink and its ability to improve single-cell gene network analysis. The scLink R package is available at https://github.com/Vivianstats/scLink.
|
165
|
Shu H, Zhou J, Lian Q, Li H, Zhao D, Zeng J, Ma J. Modeling gene regulatory networks using neural network architectures. Nature Computational Science 2021; 1:491-501. [PMID: 38217125 DOI: 10.1038/s43588-021-00099-8] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 03/25/2021] [Accepted: 06/15/2021] [Indexed: 01/15/2024]
Abstract
Gene regulatory networks (GRNs) encode the complex molecular interactions that govern cell identity. Here we propose DeepSEM, a deep generative model that can jointly infer GRNs and a biologically meaningful representation of single-cell RNA sequencing (scRNA-seq) data. In particular, we developed a neural network version of the structural equation model (SEM) to explicitly model the regulatory relationships among genes. Benchmark results show that DeepSEM achieves comparable or better performance than state-of-the-art methods on a variety of single-cell computational tasks, such as GRN inference, scRNA-seq data visualization, clustering and simulation. In addition, the gene regulations predicted by DeepSEM on cell-type marker genes in the mouse cortex can be validated by epigenetic data, which further demonstrates the accuracy and efficiency of our method. DeepSEM can provide a useful and powerful tool to analyze scRNA-seq data and infer a GRN.
Affiliation(s)
- Hantao Shu
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jingtian Zhou
  - Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
  - Bioinformatics and Systems Biology Program, University of California, San Diego, La Jolla, CA, USA
- Qiuyu Lian
  - UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China
  - Department of Automation, Shanghai Jiao Tong University, Shanghai, China
- Han Li
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Dan Zhao
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jianyang Zeng
  - Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China
- Jianzhu Ma
  - Institute for Artificial Intelligence, Peking University, Beijing, China
|
166
|
Gao S, Dai Y, Rehman J. A Bayesian inference transcription factor activity model for the analysis of single-cell transcriptomes. Genome Res 2021; 31:1296-1311. [PMID: 34193535 PMCID: PMC8256867 DOI: 10.1101/gr.265595.120] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/04/2020] [Accepted: 05/26/2021] [Indexed: 01/06/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful experimental approach to study cellular heterogeneity. One of the challenges in scRNA-seq data analysis is integrating different types of biological data to consistently recognize discrete biological functions and regulatory mechanisms of cells, such as transcription factor activities and gene regulatory networks in distinct cell populations. We have developed an approach to infer transcription factor activities from scRNA-seq data that leverages existing biological data on transcription factor binding sites. The Bayesian inference transcription factor activity model (BITFAM) integrates ChIP-seq transcription factor binding information into scRNA-seq data analysis. We show that the inferred transcription factor activities for key cell types identify regulatory transcription factors that are known to mechanistically control cell function and cell fate. The BITFAM approach not only identifies biologically meaningful transcription factor activities, but also provides valuable insights into underlying transcription factor regulatory mechanisms.
Affiliation(s)
- Shang Gao
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60612, USA
  - Department of Medicine, Division of Cardiology, University of Illinois at Chicago, Chicago, Illinois 60612, USA
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, Illinois 60612, USA
- Yang Dai
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60612, USA
- Jalees Rehman
  - Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois 60612, USA
  - Department of Medicine, Division of Cardiology, University of Illinois at Chicago, Chicago, Illinois 60612, USA
  - Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, Illinois 60612, USA
  - University of Illinois Cancer Center, Chicago, Illinois 60612, USA
|
167
|
Tripathi RK, Wilkins O. Single cell gene regulatory networks in plants: Opportunities for enhancing climate change stress resilience. Plant, Cell & Environment 2021; 44:2006-2017. [PMID: 33522607 PMCID: PMC8359182 DOI: 10.1111/pce.14012] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/01/2020] [Revised: 01/21/2021] [Accepted: 01/22/2021] [Indexed: 05/05/2023]
Abstract
Global warming poses major challenges for plant survival and agricultural productivity. Thus, efforts to enhance stress resilience in plants are key strategies for protecting food security. Gene regulatory networks (GRNs) are a critical mechanism conferring stress resilience. Until recently, predicting GRNs of the individual cells that make up plants and other multicellular organisms was impeded by aggregate population scale measurements of transcriptome and other genome-scale features. With the advancement of high-throughput single cell RNA-seq and other single cell assays, learning GRNs for individual cells is now possible, in principle. In this article, we report on recent advances in experimental and analytical methodologies for single cell sequencing assays especially as they have been applied to the study of plants. We highlight recent advances and ongoing challenges for scGRN prediction, and finally, we highlight the opportunity to use scGRN discovery for studying and ultimately enhancing abiotic stress resilience in plants.
Affiliation(s)
- Rajiv K. Tripathi
  - Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
- Olivia Wilkins
  - Department of Biological Sciences, University of Manitoba, Winnipeg, Manitoba, Canada
|
168
|
Integrated Inference of Asymmetric Protein Interaction Networks Using Dynamic Model and Individual Patient Proteomics Data. Symmetry (Basel) 2021. [DOI: 10.3390/sym13061097] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/23/2023] Open
Abstract
Recent advances in experimental biology have produced large amounts of molecular activity data. In particular, individual patient data provide non-time-series information on molecular activities in disease conditions. The challenge is to design effective algorithms that infer regulatory networks from individual patient datasets and, in doing so, address the issue of network symmetry. This work aims to develop an efficient pipeline for reverse-engineering regulatory networks from individual patient proteomic data. The first step uses the SCOUT algorithm to infer the pseudo-time trajectory of individual patients. Then the path-consistent method with part mutual information is used to construct a static network that contains the potential protein interactions. To address the issue of network symmetry in the undirected network, a dynamic model based on ordinary differential equations is then used to remove false interactions and derive asymmetric networks. In this work, a dataset from triple-negative breast cancer patients is used to develop a protein-protein interaction network with 15 proteins.
|
169
|
Davies J, Vallejo AF, Sirvent S, Porter G, Clayton K, Qumbelo Y, Stumpf P, West J, Gray CM, Chigorimbo-Murefu NTL, MacArthur B, Polak ME. An IRF1-IRF4 Toggle-Switch Controls Tolerogenic and Immunogenic Transcriptional Programming in Human Langerhans Cells. Front Immunol 2021; 12:665312. [PMID: 34211464 PMCID: PMC8239435 DOI: 10.3389/fimmu.2021.665312] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/08/2021] [Accepted: 05/25/2021] [Indexed: 12/27/2022] Open
Abstract
Langerhans cells (LCs) reside in the epidermis as a dense network of immune system sentinels, coordinating both immunogenic and tolerogenic immune responses. To determine the molecular switches directing induction of LC immune activation, we performed mathematical modelling of gene regulatory networks identified by single cell RNA sequencing of LCs exposed to TNF-alpha, a key pro-inflammatory signal produced by the skin. Our approach delineated three programmes of LC phenotypic activation (immunogenic, tolerogenic or ambivalent), and confirmed that TNF-alpha enhanced LC immunogenic programming. Through regulon analysis followed by mutual information modelling, we identified IRF1 as the key transcription factor for the regulation of immunogenicity in LCs. Application of a mathematical toggle switch model, coupling IRF1 with tolerance-inducing transcription factors, determined the key set of transcription factors regulating the switch between tolerance and immunogenicity, and correctly predicted LC behaviour in LCs derived from different body sites. Our findings provide a mechanistic explanation of how combinatorial interactions between different transcription factors can coordinate specific transcriptional programmes in human LCs, interpreting the context of the local tissue microenvironment.
Affiliation(s)
- James Davies
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Andres F Vallejo
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Sofia Sirvent
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Gemma Porter
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Kalum Clayton
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
- Yamkela Qumbelo
  - Division of Immunology, Institute of Infectious Disease and Molecular Medicine, Department of Pathology, University of Cape Town, Cape Town, South Africa
- Patrick Stumpf
  - Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
- Jonathan West
  - Cancer Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
- Clive M Gray
  - Division of Immunology, Institute of Infectious Disease and Molecular Medicine, Department of Pathology, University of Cape Town, Cape Town, South Africa
- Nyaradzo T L Chigorimbo-Murefu
  - Division of Immunology, Institute of Infectious Disease and Molecular Medicine, Department of Pathology, University of Cape Town, Cape Town, South Africa
- Ben MacArthur
  - Cancer Sciences, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
- Marta E Polak
  - Clinical and Experimental Sciences, Sir Henry Wellcome Laboratories, Faculty of Medicine, University of Southampton, Southampton, United Kingdom
  - Institute for Life Sciences, University of Southampton, Southampton, United Kingdom
|
170
|
de Anda-Jáuregui G, Espinal-Enríquez J, Hernández-Lemus E. Highly connected, non-redundant microRNA functional control in breast cancer molecular subtypes. Interface Focus 2021; 11:20200073. [PMID: 34123357 PMCID: PMC8193465 DOI: 10.1098/rsfs.2020.0073] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 04/21/2021] [Indexed: 12/18/2022] Open
Abstract
Breast cancer is a complex, heterogeneous disease at the phenotypic and molecular level. In particular, the transcriptional regulatory programs are known to be significantly affected, and such transcriptional alterations capture some of the heterogeneity of the disease, leading to the emergence of breast cancer molecular subtypes. Recently, network biology approaches to deciphering such abnormal gene regulation programs, for instance by means of gene co-expression networks, have been able to recapitulate the differences between breast cancer subtypes, providing elements to further understand their functional origins and consequences. Network biology approaches may be extended to include other co-expression patterns, like those found between genes and non-coding transcripts such as microRNAs (miRs). miRs are known to play relevant roles in the establishment of normal and anomalous transcription processes. Commodore miRs (cdre-miRs) have been defined as miRs that, based on their connectivity and redundancy in co-expression networks, are potential control elements of biological functions. In this work, we reconstructed miR–gene co-expression networks for each breast cancer molecular subtype, from high throughput data in 424 samples from the Cancer Genome Atlas consortium. We identified cdre-miRs in three out of four molecular subtypes. We found that in each subtype, each cdre-miR was linked to a different set of associated genes, as well as a different set of associated biological functions. Using a systematic literature validation strategy, we found that the biological functions associated with these cdre-miRs are hallmarks of cancer, such as angiogenesis, cell adhesion, cell cycle and regulation of apoptosis. The relevance of such cdre-miRs as actionable molecular targets in breast cancer is still to be determined from functional studies.
Affiliation(s)
- Guillermo de Anda-Jáuregui
  - Computational Genomics, Instituto Nacional de Medicina Genómica, Mexico City, Mexico
  - Cátedras CONACYT for Young Researchers, Consejo Nacional de Ciencia y Tecnología, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Jesús Espinal-Enríquez
  - Computational Genomics, Instituto Nacional de Medicina Genómica, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics, Instituto Nacional de Medicina Genómica, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
171
|
Katebi A, Ramirez D, Lu M. Computational systems-biology approaches for modeling gene networks driving epithelial-mesenchymal transitions. Computational and Systems Oncology 2021; 1:e1021. [PMID: 34164628 PMCID: PMC8219219 DOI: 10.1002/cso2.1021] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/06/2023] Open
Abstract
Epithelial-mesenchymal transition (EMT) is an important biological process through which epithelial cells undergo phenotypic transitions to mesenchymal cells by losing cell-cell adhesion and gaining migratory properties that cells use in embryogenesis, wound healing, and cancer metastasis. An important research topic is to identify the underlying gene regulatory networks (GRNs) governing the decision making of EMT and develop predictive models based on the GRNs. The advent of recent genomic technology, such as single-cell RNA sequencing, has opened new opportunities to improve our understanding about the dynamical controls of EMT. In this article, we review three major types of computational and mathematical approaches and methods for inferring and modeling GRNs driving EMT. We emphasize (1) the bottom-up approaches, where GRNs are constructed through literature search; (2) the top-down approaches, where GRNs are derived from genome-wide sequencing data; (3) the combined top-down and bottom-up approaches, where EMT GRNs are constructed and simulated by integrating bioinformatics and mathematical modeling. We discuss the methodologies and applications of each approach and the available resources for these studies.
Affiliation(s)
- Ataur Katebi
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
- Daniel Ramirez
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
  - College of Health Solutions, Arizona State University, Tempe, Arizona, USA
- Mingyang Lu
  - Department of Bioengineering, Northeastern University, Boston, Massachusetts, USA
  - Center for Theoretical Biological Physics, Northeastern University, Boston, Massachusetts, USA
|
172
|
Stein-O'Brien GL, Ainsile MC, Fertig EJ. Forecasting cellular states: from descriptive to predictive biology via single-cell multiomics. Current Opinion in Systems Biology 2021; 26:24-32. [PMID: 34660940 PMCID: PMC8516130 DOI: 10.1016/j.coisb.2021.03.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/17/2022]
Abstract
As the single cell field races to characterize each cell type, state, and behavior, the complexity of the computational analysis approaches the complexity of the biological systems. Single cell and imaging technologies now enable unprecedented measurements of state transitions in biological systems, providing high-throughput data that capture tens of thousands of measurements on hundreds of thousands of samples. Thus, the definition of cell type and state is evolving to encompass the broad range of biological questions now attainable. Answering these questions requires the development of computational tools for integrated multi-omics analysis. Merged with mathematical models, these algorithms will be able to forecast future states of biological systems, going from statistical inferences of phenotypes to time course predictions of the biological systems with dynamic maps analogous to weather systems. Thus, systems biology for forecasting biological system dynamics from multi-omic data represents the future of cell biology, empowering a new generation of technology-driven predictive medicine.
Affiliation(s)
- Genevieve L Stein-O'Brien
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Department of Neuroscience, Johns Hopkins School of Medicine, Baltimore, MD
  - McKusick-Nathans Department of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD
  - Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
- Michaela C Ainsile
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
- Elana J Fertig
  - Department of Oncology, Division of Biostatistics and Bioinformatics, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins School of Medicine, Baltimore, MD
  - Convergence Institute, Johns Hopkins University, Baltimore, MD
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD
  - Department of Applied Mathematics & Statistics, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD
|
173
|
Jin T, Rehani P, Ying M, Huang J, Liu S, Roussos P, Wang D. scGRNom: a computational pipeline of integrative multi-omics analyses for predicting cell-type disease genes and regulatory networks. Genome Med 2021; 13:95. [PMID: 34044854 PMCID: PMC8161957 DOI: 10.1186/s13073-021-00908-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/04/2020] [Accepted: 05/13/2021] [Indexed: 02/06/2023] Open
Abstract
Understanding cell-type-specific gene regulatory mechanisms from genetic variants to diseases remains challenging. To address this, we developed a computational pipeline, scGRNom (single-cell Gene Regulatory Network prediction from multi-omics), to predict cell-type disease genes and regulatory networks including transcription factors and regulatory elements. With applications to schizophrenia and Alzheimer's disease, we predicted disease genes and regulatory networks for excitatory and inhibitory neurons, microglia, and oligodendrocytes. Further enrichment analyses revealed cross-disease and disease-specific functions and pathways at the cell-type level. Our machine learning analysis also found that cell-type disease genes improved clinical phenotype predictions. scGRNom is a general-purpose tool available at https://github.com/daifengwanglab/scGRNom.
Affiliation(s)
- Ting Jin
  - Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, 53706, USA
  - Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
- Peter Rehani
  - Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
  - Department of Integrative Biology, University of Wisconsin - Madison, Madison, WI, 53706, USA
  - Present address: Morgridge Institute for Research, Madison, WI, 53715, USA
- Mufang Ying
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
  - Present address: Department of Statistics, Rutgers University, Piscataway, NJ, 08854, USA
- Jiawei Huang
  - Department of Statistics, University of Wisconsin - Madison, Madison, WI, 53706, USA
- Shuang Liu
  - Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
- Panagiotis Roussos
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, 10029, USA
- Daifeng Wang
  - Department of Biostatistics and Medical Informatics, University of Wisconsin - Madison, Madison, WI, 53706, USA
  - Waisman Center, University of Wisconsin - Madison, Madison, WI, 53705, USA
  - Department of Computer Sciences, University of Wisconsin - Madison, Madison, WI, 53706, USA
|
174
|
Bobrovskikh A, Doroshkov A, Mazzoleni S, Cartenì F, Giannino F, Zubairova U. A Sight on Single-Cell Transcriptomics in Plants Through the Prism of Cell-Based Computational Modeling Approaches: Benefits and Challenges for Data Analysis. Front Genet 2021; 12:652974. [PMID: 34093652 PMCID: PMC8176226 DOI: 10.3389/fgene.2021.652974] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/13/2021] [Accepted: 04/20/2021] [Indexed: 01/09/2023] Open
Abstract
Single-cell technology is a relatively new and promising way to obtain high-resolution transcriptomic data, and over the last decade it has mostly been used for animals. However, several scientific groups have developed and applied protocols for some plant tissues. Together with well-developed cell-resolution imaging techniques, this achievement opens up new horizons for studying the complex mechanisms of plant tissue architecture formation. Integrating data from the transcriptomic to the morphogenetic level in a unified system still presents several difficulties, and plant tissues have some additional peculiarities. One such feature is that the topology of cell-to-cell communication through plasmodesmata forms during tissue growth and morphogenesis and results in mutual regulation of expression between neighboring cells, affecting internal processes and cell domain development. This fact must be taken into account when analyzing single-cell transcriptomic data. Cell-based computational modeling approaches, which have been used successfully in plant morphogenesis studies, promise to be an efficient way to summarize such novel multiscale data. Solutions of the inverse problem for these models, computed on real tissue templates, can shed light on restoring the spatial localization of individual cells in the initial plant organ, one of the most ambiguous and challenging stages in single-cell transcriptomic data analysis. This review summarizes new opportunities for advanced plant morphogenesis models that single-cell transcriptome data make possible. In addition, we show the prospects of microscopy and cell-resolution imaging techniques for solving several spatial problems in single-cell transcriptomic data analysis and for enhancing the opportunities of the hybrid modeling framework.
Affiliation(s)
- Aleksandr Bobrovskikh
- Laboratory of Plant Growth Biomechanics, Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia.,Department of Agricultural Sciences, University of Naples Federico II, Naples, Italy
| | - Alexey Doroshkov
- Laboratory of Plant Growth Biomechanics, Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia.,Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
| | - Stefano Mazzoleni
- Department of Agricultural Sciences, University of Naples Federico II, Naples, Italy
| | - Fabrizio Cartenì
- Department of Agricultural Sciences, University of Naples Federico II, Naples, Italy
| | - Francesco Giannino
- Department of Agricultural Sciences, University of Naples Federico II, Naples, Italy
| | - Ulyana Zubairova
- Laboratory of Plant Growth Biomechanics, Institute of Cytology and Genetics Siberian Branch of Russian Academy of Sciences (SB RAS), Novosibirsk, Russia; Department of Natural Sciences, Novosibirsk State University, Novosibirsk, Russia
|
175
|
Nguyen H, Tran D, Tran B, Pehlivan B, Nguyen T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021; 22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022] Open
Abstract
A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which pushed the limit of transcriptomic profiling to the individual cell level, has opened up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data in single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in future development.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bang Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bahadir Pehlivan
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
|
176
|
Gan Y, Xin Y, Hu X, Zou G. Inferring gene regulatory network from single-cell transcriptomic data by integrating multiple prior networks. Comput Biol Chem 2021; 93:107512. [PMID: 34044202 DOI: 10.1016/j.compbiolchem.2021.107512] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/02/2021] [Accepted: 05/12/2021] [Indexed: 11/29/2022]
Abstract
A gene regulatory network models the interactions between transcription factors and target genes. Reconstructing gene regulatory networks is critically important to understand gene function in a particular cellular context, providing key insights into complex biological systems. We develop a new computational method, named iMPRN, which integrates multiple prior networks to infer a regulatory network. Based on the network component analysis model, iMPRN adopts linear regression, graph embedding, and elastic networks to optimize each prior network in line with specific biological context. For each rewired prior network, iMPRN evaluates the confidence of the regulatory edges based on B scores and finally integrates these optimized networks. We validate the effectiveness of iMPRN by comparing it with four widely-used gene regulatory network reconstruction algorithms on a simulation data set. The results show that iMPRN can infer the gene regulatory network more accurately. Further, on a real scRNA-seq dataset, iMPRN is applied to reconstruct gene regulatory networks for malignant and nonmalignant head and neck tumor cells, respectively, demonstrating distinctive differences in their corresponding regulatory networks.
Affiliation(s)
- Yanglan Gan
- School of Computer Science and Technology, Donghua University, Shanghai, China
| | - Yongchang Xin
- School of Computer Science and Technology, Donghua University, Shanghai, China
| | - Xin Hu
- School of Computer Science and Technology, Donghua University, Shanghai, China
| | - Guobing Zou
- School of Computer Engineering and Science, Shanghai University, Shanghai, China
|
177
|
Wu W, Liu Y, Dai Q, Yan X, Wang Z. G2S3: A gene graph-based imputation method for single-cell RNA sequencing data. PLoS Comput Biol 2021; 17:e1009029. [PMID: 34003861 PMCID: PMC8189489 DOI: 10.1371/journal.pcbi.1009029] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/17/2020] [Revised: 06/09/2021] [Accepted: 04/29/2021] [Indexed: 12/20/2022] Open
Abstract
Single-cell RNA sequencing technology provides an opportunity to study gene expression at single-cell resolution. However, prevalent dropout events result in high data sparsity and noise that may obscure downstream analyses in single-cell transcriptomic studies. We propose a new method, G2S3, that imputes dropouts by borrowing information from adjacent genes in a sparse gene graph learned from gene expression profiles across cells. We applied G2S3 and ten existing imputation methods to eight single-cell transcriptomic datasets and compared their performance. Our results demonstrated that G2S3 has superior overall performance in recovering gene expression, identifying cell subtypes, reconstructing cell trajectories, identifying differentially expressed genes, and recovering gene regulatory and correlation relationships. Moreover, G2S3 is computationally efficient for imputation in large-scale single-cell transcriptomic datasets.
Affiliation(s)
- Weimiao Wu
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
| | - Yunqing Liu
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
| | - Qile Dai
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
| | - Xiting Yan
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
- Section of Pulmonary, Critical Care and Sleep Medicine, Yale School of Medicine, New Haven, Connecticut, United States of America
| | - Zuoheng Wang
- Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, United States of America
|
178
|
Kuksin M, Morel D, Aglave M, Danlos FX, Marabelle A, Zinovyev A, Gautheret D, Verlingue L. Applications of single-cell and bulk RNA sequencing in onco-immunology. Eur J Cancer 2021; 149:193-210. [PMID: 33866228 DOI: 10.1016/j.ejca.2021.03.005] [Citation(s) in RCA: 56] [Impact Index Per Article: 18.7] [Received: 12/22/2020] [Revised: 02/26/2021] [Accepted: 03/04/2021] [Indexed: 02/08/2023]
Abstract
The rising interest in precise characterization of the tumour immune contexture has recently brought forward the high potential of RNA sequencing (RNA-seq) in identifying molecular mechanisms engaged in the response to immunotherapy. In this review, we provide an overview of the major principles of single-cell and conventional (bulk) RNA-seq applied to onco-immunology. We describe standard preprocessing and statistical analyses of data obtained from such techniques and highlight some computational challenges related to the sequencing of individual cells. We notably provide examples of gene expression analyses such as differential expression analysis, dimensionality reduction, clustering and enrichment analysis. Additionally, we used public data sets to exemplify how deconvolution algorithms can identify and quantify multiple immune subpopulations from either bulk or single-cell RNA-seq. We give examples of machine and deep learning models used to predict patient outcomes and treatment effect from high-dimensional data. Finally, we balance the strengths and weaknesses of single-cell and bulk RNA-seq regarding their applications in the clinic.
Affiliation(s)
- Maria Kuksin
- ENS de Lyon, 15 Parvis René Descartes, 69007, Lyon, France; Département d'Innovations Thérapeutiques et Essais Précoces (DITEP), Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, 94800, Villejuif, France
| | - Daphné Morel
- Département d'Innovations Thérapeutiques et Essais Précoces (DITEP), Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, 94800, Villejuif, France; Département de Radiothérapie, Gustave Roussy Cancer Campus, Gustave Roussy, 114 rue Edouard Vaillant, 94800, Villejuif, France; INSERM UMR1030, Molecular Radiotherapy and Therapeutic Innovations, Gustave Roussy, 114 rue Edouard Vaillant, 94800, Villejuif, France
| | - Marine Aglave
- INSERM US23, CNRS UMS 3655, Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, 94800, Villejuif, France
| | | | - Aurélien Marabelle
- Département d'Innovations Thérapeutiques et Essais Précoces (DITEP), Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, 94800, Villejuif, France; INSERM U1015, Gustave Roussy, Université Paris Saclay, France
| | - Andrei Zinovyev
- Institut Curie, PSL Research University, F-75005, Paris, France; INSERM, U900, F-75005, Paris, France; MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, F-75006, Paris, France; Laboratory of Advanced Methods for High-dimensional Data Analysis, Lobachevsky University, 603000, Nizhny Novgorod, Russia
| | - Daniel Gautheret
- Institute for Integrative Biology of the Cell, UMR 9198, CEA, CNRS, Université Paris-Saclay, Gif-Sur-Yvette, France; IHU PRISM, Gustave Roussy Cancer Campus, Gustave Roussy, 114 Rue Edouard Vaillant, 94800, Villejuif, France; Université Paris-Saclay, France
| | - Loïc Verlingue
- Département d'Innovations Thérapeutiques et Essais Précoces (DITEP), Gustave Roussy Cancer Campus, 114 rue Edouard Vaillant, 94800, Villejuif, France; INSERM UMR1030, Molecular Radiotherapy and Therapeutic Innovations, Gustave Roussy, 114 rue Edouard Vaillant, 94800, Villejuif, France; Institut Curie, PSL Research University, F-75005, Paris, France; Université Paris-Saclay, France.
|
179
|
Koldobskiy MA, Jenkinson G, Abante J, Rodriguez DiBlasi VA, Zhou W, Pujadas E, Idrizi A, Tryggvadottir R, Callahan C, Bonifant CL, Rabin KR, Brown PA, Ji H, Goutsias J, Feinberg AP. Converging genetic and epigenetic drivers of paediatric acute lymphoblastic leukaemia identified by an information-theoretic analysis. Nat Biomed Eng 2021; 5:360-376. [PMID: 33859388 PMCID: PMC8370714 DOI: 10.1038/s41551-021-00703-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/31/2019] [Accepted: 02/18/2021] [Indexed: 02/02/2023]
Abstract
In cancer, linking epigenetic alterations to drivers of transformation has been difficult, in part because DNA methylation analyses must capture epigenetic variability, which is central to tumour heterogeneity and tumour plasticity. Here, by conducting a comprehensive analysis, based on information theory, of differences in methylation stochasticity in samples from patients with paediatric acute lymphoblastic leukaemia (ALL), we show that ALL epigenomes are stochastic and marked by increased methylation entropy at specific regulatory regions and genes. By integrating DNA methylation and single-cell gene-expression data, we arrived at a relationship between methylation entropy and gene-expression variability, and found that epigenetic changes in ALL converge on a shared set of genes that overlap with genetic drivers involved in chromosomal translocations across the disease spectrum. Our findings suggest that an epigenetically driven gene-regulation network, with UHRF1 (ubiquitin-like with PHD and RING finger domains 1) as a central node, links genetic drivers and epigenetic mediators in ALL.
Affiliation(s)
- Michael A Koldobskiy
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Garrett Jenkinson
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
- Department of Health Science Research, Mayo Clinic, Rochester, MN, USA
| | - Jordi Abante
- Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA
| | - Varenka A Rodriguez DiBlasi
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Cancer Immunology and Immune Modulation, Boehringer Ingelheim, Ridgefield, CT, USA
| | - Weiqiang Zhou
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - Elisabet Pujadas
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Adrian Idrizi
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Rakel Tryggvadottir
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Colin Callahan
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Challice L Bonifant
- Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Karen R Rabin
- Department of Pediatrics, Section of Hematology-Oncology, Baylor College of Medicine, Houston, TX, USA
| | - Patrick A Brown
- Pediatric Oncology, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
| | - Hongkai Ji
- Department of Biostatistics, Johns Hopkins University Bloomberg School of Public Health, Baltimore, MD, USA
| | - John Goutsias
- Whitaker Biomedical Engineering Institute, Johns Hopkins University, Baltimore, MD, USA.
| | - Andrew P Feinberg
- Center for Epigenetics, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA.
- Department of Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA.
|
180
|
Li X, Zhang W, Zhang J, Li G. ModularBoost: an efficient network inference algorithm based on module decomposition. BMC Bioinformatics 2021; 22:153. [PMID: 33761871 PMCID: PMC7992795 DOI: 10.1186/s12859-021-04074-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2020] [Accepted: 03/11/2021] [Indexed: 11/15/2022] Open
Abstract
Background: Given expression data, gene regulatory network (GRN) inference approaches try to determine regulatory relations. However, current inference methods ignore the inherent topological characteristics of GRNs to some extent, leading to structures that lack clear biological explanation. To increase the biophysical meaning of inferred networks, this study performed data-driven module detection before network inference. Gene modules were identified by decomposition-based methods. Results: ICA-decomposition based module detection methods have been used to detect functional modules directly from transcriptomic data. Experiments on time-series expression, curated and scRNA-seq datasets suggested advantages of the proposed ModularBoost method over established methods, especially in efficiency and accuracy. For scRNA-seq datasets, the ModularBoost method outperformed other candidate inference algorithms. Conclusions: As a complicated task, GRN inference can be decomposed into several tasks of reduced complexity. Using identified gene modules as topological constraints, the initial inference problem can be accomplished by inferring intra-modular and inter-modular interactions respectively. Experimental outcomes suggest that the proposed ModularBoost method can improve the accuracy and efficiency of inference algorithms by introducing topological constraints.
Affiliation(s)
- Xinyu Li
- State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
| | - Wei Zhang
- State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China.
| | - Jianming Zhang
- State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China.
| | - Guang Li
- State Key Laboratory of Industrial Control Technology, Institute of Cyber-Systems and Control, Zhejiang University, Zheda Road, 310027, Hangzhou, China
|
181
|
Kang Y, Thieffry D, Cantini L. Evaluating the Reproducibility of Single-Cell Gene Regulatory Network Inference Algorithms. Front Genet 2021; 12:617282. [PMID: 33828580 PMCID: PMC8019823 DOI: 10.3389/fgene.2021.617282] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/14/2020] [Accepted: 02/24/2021] [Indexed: 12/13/2022] Open
Abstract
Networks are powerful tools to represent and investigate biological systems. The development of algorithms inferring regulatory interactions from functional genomics data has been an active area of research. With the advent of single-cell RNA-seq data (scRNA-seq), numerous methods specifically designed to take advantage of single-cell datasets have been proposed. However, published benchmarks on single-cell network inference are mostly based on simulated data. When applied to real data, these benchmarks take into account only a small set of genes and only compare the inferred networks with an imposed ground-truth. Here, we benchmark six single-cell network inference methods based on their reproducibility, i.e., their ability to infer similar networks when applied to two independent datasets for the same biological condition. We tested each of these methods on real data from three biological conditions: human retina, T-cells in colorectal cancer, and human hematopoiesis. When networks with up to 100,000 links are taken into account, GENIE3 proves to be the most reproducible algorithm and, together with GRNBoost2, shows higher intersection with ground-truth biological interactions. These results are independent of the single-cell sequencing platform, the cell type annotation system and the number of cells constituting the dataset. Finally, GRNBoost2 and CLR show more reproducible performance once a more stringent thresholding is applied to the networks (1,000–100 links). In order to ensure the reproducibility and ease extensions of this benchmark study, we implemented all the analyses in scNET, a Jupyter notebook available at https://github.com/ComputationalSystemsBiology/scNET.
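Illustrative note (not part of the abstract): the notion of reproducibility described above, inferring similar networks from two independent datasets, can be sketched as an overlap score between the top-ranked links of two inferred networks. The abstract does not state which similarity measure the authors used; the Jaccard index below is an assumption, and the function names (`top_edges`, `jaccard`, `reproducibility`) and the toy edge scores are hypothetical.

```python
def top_edges(scores, k):
    """Return the k highest-scoring links as a set of (regulator, target) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return {edge for edge, _ in ranked[:k]}


def jaccard(a, b):
    """Jaccard index of two sets; defined as 1.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def reproducibility(scores_a, scores_b, k):
    """Overlap of the top-k links inferred from two independent datasets.

    Assumption: Jaccard overlap stands in for whatever measure the
    benchmark actually used; k plays the role of the link threshold.
    """
    return jaccard(top_edges(scores_a, k), top_edges(scores_b, k))


# Toy edge scores for the same condition measured in two datasets (made up).
net_a = {("TF1", "G1"): 0.9, ("TF1", "G2"): 0.8, ("TF2", "G3"): 0.4, ("TF2", "G1"): 0.1}
net_b = {("TF1", "G1"): 0.7, ("TF2", "G3"): 0.6, ("TF1", "G2"): 0.2, ("TF2", "G1"): 0.5}

score = reproducibility(net_a, net_b, k=2)
print(f"top-2 reproducibility: {score:.3f}")
```

Varying `k` corresponds to the thresholding mentioned in the abstract: a score can differ markedly between a permissive and a stringent cut-off.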
Affiliation(s)
- Yoonjee Kang
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
| | - Denis Thieffry
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
| | - Laura Cantini
- Computational Systems Biology Team, Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR 8197, INSERM U1024, Ecole Normale Supérieure, Paris Sciences et Lettres Research University, Paris, France
|
182
|
Phillips NE, Hugues A, Yeung J, Durandau E, Nicolas D, Naef F. The circadian oscillator analysed at the single-transcript level. Mol Syst Biol 2021; 17:e10135. [PMID: 33719202 PMCID: PMC7957410 DOI: 10.15252/msb.202010135] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/20/2020] [Revised: 01/05/2021] [Accepted: 01/19/2021] [Indexed: 12/31/2022] Open
Abstract
The circadian clock is an endogenous and self-sustained oscillator that anticipates daily environmental cycles. While rhythmic gene expression of circadian genes is well-described in populations of cells, the single-cell mRNA dynamics of multiple core clock genes remain largely unknown. Here we use single-molecule fluorescence in situ hybridisation (smFISH) at multiple time points to measure pairs of core clock transcripts, Rev-erbα (Nr1d1), Cry1 and Bmal1, in mouse fibroblasts. The mean mRNA level oscillates over 24 h for all three genes, but mRNA numbers show considerable spread between cells. We develop a probabilistic model for multivariate mRNA counts using mixtures of negative binomials, which accounts for transcriptional bursting, circadian time and cell-to-cell heterogeneity, notably in cell size. Decomposing the mRNA variability into distinct noise sources shows that clock time contributes a small fraction of the total variability in mRNA number between cells. Thus, our results highlight the intrinsic biological challenges in estimating circadian phase from single-cell mRNA counts and suggest that circadian phase in single cells is encoded post-transcriptionally.
Affiliation(s)
- Nicholas E Phillips
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Alice Hugues
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
- Master de Biologie, École Normale Supérieure de Lyon, Université Claude Bernard Lyon I, Université de Lyon, Lyon, France
| | - Jake Yeung
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Eric Durandau
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Damien Nicolas
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Felix Naef
- Institute of Bioengineering, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
|
183
|
Sun X, Zhang J, Nie Q. Inferring latent temporal progression and regulatory networks from cross-sectional transcriptomic data of cancer samples. PLoS Comput Biol 2021; 17:e1008379. [PMID: 33667222 PMCID: PMC7968745 DOI: 10.1371/journal.pcbi.1008379] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/03/2020] [Revised: 03/17/2021] [Accepted: 02/15/2021] [Indexed: 12/19/2022] Open
Abstract
Unraveling molecular regulatory networks underlying disease progression is critically important for understanding disease mechanisms and identifying drug targets. The existing methods for inferring gene regulatory networks (GRNs) rely mainly on time-course gene expression data. However, most available omics data from cross-sectional studies of cancer patients lack sufficient temporal information, leading to a key challenge for GRN inference. Through quantifying the latent progression using random walks-based manifold distance, we propose a latent-temporal progression-based Bayesian method, PROB, for inferring GRNs from the cross-sectional transcriptomic data of tumor samples. The robustness of PROB to the measurement variabilities in the data is mathematically proved and numerically verified. Performance evaluation on real data indicates that PROB outperforms other methods in both pseudotime inference and GRN inference. Applications to bladder cancer and breast cancer demonstrate that our method is effective in identifying key regulators of cancer progression or drug targets. The identified ACSS1 is experimentally validated to promote epithelial-to-mesenchymal transition of bladder cancer cells, and the predicted FOXM1-target interactions are verified and are predictive of relapse in breast cancer. Our study suggests new effective ways of modeling clinical transcriptomic data for characterizing cancer progression and facilitates the translation of regulatory network-based approaches into precision medicine.
Affiliation(s)
- Xiaoqiang Sun
- Key Laboratory of Tropical Disease Control, Chinese Ministry of Education; Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, China
- School of Mathematics, Sun Yat-sen University, Guangzhou, China
| | - Ji Zhang
- State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Sun Yat-sen University Cancer Center, Guangzhou, Guangdong, China
| | - Qing Nie
- Department of Mathematics and Department of Developmental & Cell Biology, NSF-Simons Center for Multiscale Cell Fate Research, University of California Irvine, Irvine, California, United States of America
|
184
|
Stadler T, Pybus OG, Stumpf MPH. Phylodynamics for cell biologists. Science 2021; 371:eaah6266. [PMID: 33446527 DOI: 10.1126/science.aah6266] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 04/12/2019] [Accepted: 08/13/2020] [Indexed: 12/12/2022]
Abstract
Multicellular organisms are composed of cells connected by ancestry and descent from progenitor cells. The dynamics of cell birth, death, and inheritance within an organism give rise to the fundamental processes of development, differentiation, and cancer. Technical advances in molecular biology now allow us to study cellular composition, ancestry, and evolution at the resolution of individual cells within an organism or tissue. Here, we take a phylogenetic and phylodynamic approach to single-cell biology. We explain how "tree thinking" is important to the interpretation of the growing body of cell-level data and how ecological null models can benefit statistical hypothesis testing. Experimental progress in cell biology should be accompanied by theoretical developments if we are to exploit fully the dynamical information in single-cell data.
Affiliation(s)
- T Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - O G Pybus
- Department of Zoology, University of Oxford, Oxford, UK.
| | - M P H Stumpf
- Melbourne Integrative Genomics, School of BioSciences and School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia.
|
185
|
Zanin M, Aitya NA, Basilio J, Baumbach J, Benis A, Behera CK, Bucholc M, Castiglione F, Chouvarda I, Comte B, Dao TT, Ding X, Pujos-Guillot E, Filipovic N, Finn DP, Glass DH, Harel N, Iesmantas T, Ivanoska I, Joshi A, Boudjeltia KZ, Kaoui B, Kaur D, Maguire LP, McClean PL, McCombe N, de Miranda JL, Moisescu MA, Pappalardo F, Polster A, Prasad G, Rozman D, Sacala I, Sanchez-Bornot JM, Schmid JA, Sharp T, Solé-Casals J, Spiwok V, Spyrou GM, Stalidzans E, Stres B, Sustersic T, Symeonidis I, Tieri P, Todd S, Van Steen K, Veneva M, Wang DH, Wang H, Wang H, Watterson S, Wong-Lin K, Yang S, Zou X, Schmidt HH. An Early Stage Researcher's Primer on Systems Medicine Terminology. NETWORK AND SYSTEMS MEDICINE 2021; 4:2-50. [PMID: 33659919 PMCID: PMC7919422 DOI: 10.1089/nsm.2020.0003] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 10/27/2020] [Indexed: 12/19/2022] Open
Abstract
Background: Systems Medicine is a novel approach to medicine, that is, an interdisciplinary field that considers the human body as a system, composed of multiple parts and of complex relationships at multiple levels, and further integrated into an environment. Exploring Systems Medicine implies understanding and combining concepts coming from diametrically different fields, including medicine, biology, statistics, modeling and simulation, and data science. Such heterogeneity leads to semantic issues, which may slow down implementation and fruitful interaction between these highly diverse fields. Methods: In this review, we collect and explain more than 100 terms related to Systems Medicine. These include both modeling and data science terms and basic systems medicine terms, along with some synthetic definitions, examples of applications, and lists of relevant references. Results: This glossary aims at being a first aid kit for the Systems Medicine researcher facing an unfamiliar term, where he/she can get a first understanding of it, and, more importantly, examples and references for digging into the topic.
Affiliation(s)
- Massimiliano Zanin
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Madrid, Spain
| | - Nadim A.A. Aitya
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - José Basilio
- Center for Physiology and Pharmacology, Institute of Vascular Biology and Thrombosis Research, Medical University of Vienna, Vienna, Austria
| | - Jan Baumbach
- TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Arriel Benis
- Faculty of Technology Management, Holon Institute of Technology (HIT), Holon, Israel
| | - Chandan K. Behera
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Magda Bucholc
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Filippo Castiglione
- CNR National Research Council, IAC Institute for Applied Computing, Rome, Italy
| | - Ioanna Chouvarda
- Lab of Computing, Medical Informatics, and Biomedical Imaging Technologies, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece
| | - Blandine Comte
- Université Clermont Auvergne, INRAE, UNH, Plateforme d'Exploration du Métabolisme, MetaboHUB Clermont, Clermont-Ferrand, France
| | - Tien-Tuan Dao
- Biomechanics and Bioengineering Laboratory (UMR CNRS 7338), Université de Technologie de Compiègne, Compiègne, France
- Labex MS2T “Control of Technological Systems-of-Systems,” CNRS and Université de Technologie de Compiègne, Compiègne, France
| | - Xuemei Ding
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Estelle Pujos-Guillot
- Université Clermont Auvergne, INRAE, UNH, Plateforme d'Exploration du Métabolisme, MetaboHUB Clermont, Clermont-Ferrand, France
| | - Nenad Filipovic
- Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia
- Bioengineering Research and Development Center (BioIRC), Kragujevac, Serbia
- Steinbeis Advanced Risk Technologies Institute doo Kragujevac, Kragujevac, Serbia
| | - David P. Finn
- Pharmacology and Therapeutics, School of Medicine, Galway Neuroscience Centre, National University of Ireland, Galway, Republic of Ireland
| | - David H. Glass
- School of Computing, Ulster University, Ulster, United Kingdom
| | - Nissim Harel
- Faculty of Sciences, Holon Institute of Technology (HIT), Holon, Israel
| | - Tomas Iesmantas
- Department of Mathematics and Natural Sciences, Kaunas University of Technology, Kaunas, Lithuania
| | - Ilinka Ivanoska
- Faculty of Computer Science and Engineering, Ss. Cyril and Methodius University, Skopje, Macedonia
| | - Alok Joshi
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Karim Zouaoui Boudjeltia
- Laboratory of Experimental Medicine (ULB 222), Medicine Faculty, Université libre de Bruxelles, CHU de Charleroi, Charleroi, Belgium
| | - Badr Kaoui
- Biomechanics and Bioengineering Laboratory (UMR CNRS 7338), Université de Technologie de Compiègne, Compiègne, France
- Labex MS2T “Control of Technological Systems-of-Systems,” CNRS and Université de Technologie de Compiègne, Compiègne, France
| | - Daman Kaur
- Northern Ireland Centre for Stratified Medicine, Biomedical Sciences Research Institute, Ulster University, Ulster, United Kingdom
| | - Liam P. Maguire
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Paula L. McClean
- Northern Ireland Centre for Stratified Medicine, Biomedical Sciences Research Institute, Ulster University, Ulster, United Kingdom
| | - Niamh McCombe
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - João Luís de Miranda
- Escola Superior de Tecnologia e Gestão, Instituto Politécnico de Portalegre, Portalegre, Portugal
- Centro de Recursos Naturais e Ambiente (CERENA), Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
| | - Annikka Polster
- Centre for Molecular Medicine Norway (NCMM), Forskningparken, Oslo, Norway
| | - Girijesh Prasad
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Damjana Rozman
- Centre for Functional Genomics and Bio-Chips, Institute of Biochemistry, Faculty of Medicine, University of Ljubljana, Ljubljana, Slovenia
| | - Ioan Sacala
- Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
| | - Jose M. Sanchez-Bornot
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Johannes A. Schmid
- Center for Physiology and Pharmacology, Institute of Vascular Biology and Thrombosis Research, Medical University of Vienna, Vienna, Austria
| | - Trevor Sharp
- Department of Pharmacology, University of Oxford, Oxford, United Kingdom
| | - Jordi Solé-Casals
- Data and Signal Processing Research Group, University of Vic–Central University of Catalonia, Vic, Spain
- Department of Psychiatry, University of Cambridge, Cambridge, United Kingdom
- College of Artificial Intelligence, Nankai University, Tianjin, China
| | - Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Czech Republic
| | - George M. Spyrou
- The Cyprus School of Molecular Medicine, The Cyprus Institute of Neurology and Genetics, Nicosia, Cyprus
| | - Egils Stalidzans
- Computational Systems Biology Group, Institute of Microbiology and Biotechnology, University of Latvia, Riga, Latvia
| | - Blaž Stres
- Department of Animal Science, Biotechnical Faculty, University of Ljubljana, Ljubljana, Slovenia
- Faculty of Civil and Geodetic Engineering, University of Ljubljana, Ljubljana, Slovenia
- Department of Automation, Biocybernetics and Robotics, Jozef Stefan Institute, Ljubljana, Slovenia
| | - Tijana Sustersic
- Faculty of Engineering, University of Kragujevac, Kragujevac, Serbia
- Bioengineering Research and Development Center (BioIRC), Kragujevac, Serbia
- Steinbeis Advanced Risk Technologies Institute doo Kragujevac, Kragujevac, Serbia
| | - Ioannis Symeonidis
- Center for Research and Technology Hellas, Hellenic Institute of Transport, Thessaloniki, Greece
| | - Paolo Tieri
- CNR National Research Council, IAC Institute for Applied Computing, Rome, Italy
| | - Stephen Todd
- Altnagelvin Area Hospital, Western Health and Social Care Trust, Altnagelvin, United Kingdom
| | - Kristel Van Steen
- BIO3-Systems Genetics, GIGA-R, University of Liege, Liege, Belgium
- BIO3-Systems Medicine, Department of Human Genetics, KU Leuven, Leuven, Belgium
| | - Da-Hui Wang
- State Key Laboratory of Cognitive Neuroscience and Learning, and School of Systems Science, Beijing Normal University, Beijing, China
| | - Haiying Wang
- School of Computing, Ulster University, Ulster, United Kingdom
| | - Hui Wang
- School of Computing, Ulster University, Ulster, United Kingdom
| | - Steven Watterson
- Northern Ireland Centre for Stratified Medicine, Ulster University, Londonderry, United Kingdom
| | - KongFatt Wong-Lin
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Su Yang
- Intelligent Systems Research Centre, School of Computing, Engineering and Intelligent Systems, Ulster University, Ulster, United Kingdom
| | - Xin Zou
- Shanghai Centre for Systems Biomedicine, Key Laboratory of Systems Biomedicine (Ministry of Education), Shanghai Jiao Tong University, Shanghai, China
| | - Harald H.H.W. Schmidt
- Faculty of Health, Medicine & Life Science, Maastricht University, Maastricht, The Netherlands
|
186
|
Grønning AGB, Oubounyt M, Kanev K, Lund J, Kacprowski T, Zehn D, Röttger R, Baumbach J. Enabling single-cell trajectory network enrichment. Nature Computational Science 2021; 1:153-163. [PMID: 38217228 DOI: 10.1038/s43588-021-00025-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/31/2020] [Accepted: 01/15/2021] [Indexed: 01/15/2024]
Abstract
Single-cell sequencing (scRNA-seq) technologies allow the investigation of cellular differentiation processes with unprecedented resolution. Although powerful software packages for scRNA-seq data analysis exist, systems biology-based tools for trajectory analysis are rare and typically difficult to handle. This hampers biological exploration and prevents researchers from gaining deeper insights into the molecular control of developmental processes. Here, to address this, we have developed Scellnetor, a network-constraint time-series clustering algorithm. It allows extraction of temporal differential gene expression network patterns (modules) that explain the difference in regulation of two developmental trajectories. Using well-characterized experimental model systems, we demonstrate the capacity of Scellnetor as a hypothesis generator to identify putative mechanisms driving haematopoiesis or mechanistically interpretable subnetworks driving dysfunctional CD8 T-cell development in chronic infections. Altogether, Scellnetor allows for single-cell trajectory network enrichment, which effectively lifts scRNA-seq data analysis to a systems biology level.
Affiliation(s)
- Alexander G B Grønning
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark.
| | - Mhaned Oubounyt
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Kristiyan Kanev
- Division of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Jesper Lund
- Department of Biostatistics and Epidemiology, University of Southern Denmark, Odense, Denmark
| | - Tim Kacprowski
- Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, Brunswick, Germany
| | - Dietmar Zehn
- Division of Animal Physiology and Immunology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Richard Röttger
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| | - Jan Baumbach
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark.
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
- Chair of Computational Systems Biology, University of Hamburg, Hamburg, Germany.
|
187
|
Wang YXR, Li L, Li JJ, Huang H. Network Modeling in Biology: Statistical Methods for Gene and Brain Networks. Stat Sci 2021; 36:89-108. [PMID: 34305304 DOI: 10.1214/20-sts792] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
The rise of network data in many different domains has offered researchers new insight into the problem of modeling complex systems and propelled the development of numerous innovative statistical methodologies and computational tools. In this paper, we primarily focus on two types of biological networks, gene networks and brain networks, where statistical network modeling has found both fruitful and challenging applications. Unlike other network examples such as social networks where network edges can be directly observed, both gene and brain networks require careful estimation of edges using covariates as a first step. We provide a discussion on existing statistical and computational methods for edge estimation and subsequent statistical inference problems in these two types of biological networks.
Affiliation(s)
- Y X Rachel Wang
- School of Mathematics and Statistics, University of Sydney, Australia
| | - Lexin Li
- Department of Biostatistics and Epidemiology, School of Public Health, University of California, Berkeley
| | - Haiyan Huang
- Department of Statistics, University of California, Berkeley
|
188
|
AlMusawi S, Ahmed M, Nateri AS. Understanding cell-cell communication and signaling in the colorectal cancer microenvironment. Clin Transl Med 2021; 11:e308. [PMID: 33635003 PMCID: PMC7868082 DOI: 10.1002/ctm2.308] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 10/10/2020] [Revised: 12/31/2020] [Accepted: 01/19/2021] [Indexed: 12/12/2022] Open
Abstract
Carcinomas are complex heterocellular systems containing epithelial cancer cells, stromal fibroblasts, and multiple immune cell-types. Cell-cell communication between these tumor microenvironments (TME) and cells drives cancer progression and influences response to existing therapies. In order to provide better treatments for patients, we must understand how various cell-types collaborate within the TME to drive cancer and consider the multiple signals present between and within different cancer types. To investigate how tissues function, we need a model to measure both how signals are transferred between cells and how that information is processed within cells. The interplay of collaboration between different cell-types requires cell-cell communication. This article aims to review the current in vitro and in vivo mono-cellular and multi-cellular culture models of colorectal cancer (CRC), and to explore how they can be used for single-cell multi-omics approaches for isolating multiple types of molecules from a single cell required for cell-cell communication to distinguish cancer cells from normal cells. Integrating the existing single-cell signaling measurements and models, and through understanding the cell identity and how different cell types communicate, will help predict drug sensitivities in tumor cells and between- and within-patient responses.
Affiliation(s)
- Shaikha AlMusawi
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
| | - Mehreen Ahmed
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
- Department of Laboratory Medicine, Division of Translational Cancer Research, Lund University, Lund, Sweden
| | - Abdolrahman S. Nateri
- Cancer Genetics & Stem Cell Group, BioDiscovery Institute, Division of Cancer & Stem Cells, School of Medicine, University of Nottingham, Nottingham, UK
|
189
|
Musilova J, Sedlar K. Tools for time-course simulation in systems biology: a brief overview. Brief Bioinform 2021; 22:6076933. [PMID: 33423059 DOI: 10.1093/bib/bbaa392] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/28/2020] [Revised: 11/27/2020] [Accepted: 11/30/2020] [Indexed: 11/13/2022] Open
Abstract
Dynamic modeling of biological systems is essential for understanding all properties of a given organism as it allows us to look not only at the static picture of an organism but also at its behavior under various conditions. With the increasing amount of experimental data, the number of tools that enable dynamic analysis also grows. However, various tools are based on different approaches, use different types of data and offer different functions for analyses; so it can be difficult to choose the most suitable tool for a selected type of model. Here, we bring a brief overview containing descriptions of 50 tools for the reconstruction of biological models, their time-course simulation and dynamic analysis. We examined each tool using test data and divided them based on the qualitative and quantitative nature of the mathematical apparatus they use.
Affiliation(s)
- Jana Musilova
- Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czechia
| | - Karel Sedlar
- Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Brno, Czechia
|
190
|
Kim J, Jakobsen ST, Natarajan KN, Won KJ. TENET: gene network reconstruction using transfer entropy reveals key regulatory factors from single cell transcriptomic data. Nucleic Acids Res 2021; 49:e1. [PMID: 33170214 PMCID: PMC7797076 DOI: 10.1093/nar/gkaa1014] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/03/2020] [Revised: 10/05/2020] [Accepted: 10/14/2020] [Indexed: 12/22/2022] Open
Abstract
Accurate prediction of gene regulatory rules is important towards understanding of cellular processes. Existing computational algorithms devised for bulk transcriptomics typically require a large number of time points to infer gene regulatory networks (GRNs), are applicable for a small number of genes and fail to detect potential causal relationships effectively. Here, we propose a novel approach 'TENET' to reconstruct GRNs from single cell RNA sequencing (scRNAseq) datasets. Employing transfer entropy (TE) to measure the amount of causal relationships between genes, TENET predicts large-scale gene regulatory cascades/relationships from scRNAseq data. TENET showed better performance than other GRN reconstructors, in identifying key regulators from public datasets. Specifically from scRNAseq, TENET identified key transcriptional factors in embryonic stem cells (ESCs) and during direct cardiomyocytes reprogramming, where other predictors failed. We further demonstrate that known target genes have significantly higher TE values, and TENET predicted higher TE genes were more influenced by the perturbation of their regulator. Using TENET, we identified and validated that Nme2 is a culture condition specific stem cell factor. These results indicate that TENET is uniquely capable of identifying key regulators from scRNAseq data.
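As a rough illustration of the quantity named in this abstract, the sketch below computes a plug-in estimate of transfer entropy between two discrete series with a history length of one. It is a generic textbook estimator, not the authors' TENET implementation; the function name `transfer_entropy`, the history length, and the toy data are assumptions made for this example only.

```python
from collections import Counter
from math import log2


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    Hypothetical helper for illustration; inputs are equal-length sequences
    of discrete (hashable) symbols.
    """
    if len(source) != len(target):
        raise ValueError("source and target must have the same length")
    n = len(target) - 1  # number of observed transitions
    if n < 1:
        raise ValueError("need at least two time points")

    triples = Counter()   # counts of (y_next, y_prev, x_prev)
    pairs_yx = Counter()  # counts of (y_prev, x_prev)
    pairs_yy = Counter()  # counts of (y_next, y_prev)
    singles = Counter()   # counts of y_prev
    for t in range(n):
        y_next, y_prev, x_prev = target[t + 1], target[t], source[t]
        triples[(y_next, y_prev, x_prev)] += 1
        pairs_yx[(y_prev, x_prev)] += 1
        pairs_yy[(y_next, y_prev)] += 1
        singles[y_prev] += 1

    te = 0.0
    for (y_next, y_prev, x_prev), count in triples.items():
        p_joint = count / n
        # p(y_next | y_prev, x_prev): prediction using both histories
        cond_full = count / pairs_yx[(y_prev, x_prev)]
        # p(y_next | y_prev): prediction using the target's own history
        cond_self = pairs_yy[(y_next, y_prev)] / singles[y_prev]
        te += p_joint * log2(cond_full / cond_self)
    return te


# Toy data: y copies x with a lag of one step, so x should inform y.
x = [0, 0, 1, 1] * 5
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)
```

A constant source carries no information about the target, so the estimate is zero in that case, whereas the lagged copy above yields a value close to one bit.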
Affiliation(s)
- Junil Kim
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
| | - Simon T. Jakobsen
- Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
| | - Kedar N Natarajan
- Functional Genomics and Metabolism Unit, Department of Biochemistry and Molecular Biology, University of Southern Denmark, Denmark
- Danish Institute of Advanced Study (D-IAS), University of Southern Denmark, Denmark
| | - Kyoung-Jae Won
- Biotech Research and Innovation Centre (BRIC), University of Copenhagen, 2200 Copenhagen N, Denmark
- Novo Nordisk Foundation Center for Stem Cell Biology, DanStem, Faculty of Health and Medical Sciences, University of Copenhagen, Ole Maaløes Vej 5, 2200 Copenhagen N, Denmark
|
191
|
Sha Y, Wang S, Bocci F, Zhou P, Nie Q. Inference of Intercellular Communications and Multilayer Gene-Regulations of Epithelial-Mesenchymal Transition From Single-Cell Transcriptomic Data. Front Genet 2021; 11:604585. [PMID: 33488673 PMCID: PMC7820899 DOI: 10.3389/fgene.2020.604585] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/09/2020] [Accepted: 12/02/2020] [Indexed: 01/31/2023] Open
Abstract
Epithelial-to-mesenchymal transition (EMT) plays an important role in many biological processes during development and cancer. The advent of single-cell transcriptome sequencing techniques allows the dissection of dynamical details underlying EMT with unprecedented resolution. Despite several single-cell data analyses of EMT, how cells communicate and regulate dynamics along the EMT trajectory remains elusive. Using single-cell transcriptomic datasets, here we infer the cell-cell communications and the multilayer gene-gene regulation networks to analyze and visualize the complex cellular crosstalk and the underlying gene regulatory dynamics along EMT. Combined with trajectory analysis, our approach reveals the existence of multiple intermediate cell states (ICSs) with hybrid epithelial and mesenchymal features. Analyses on the time-series datasets from cancer cell lines with different inducing factors show that the induced EMTs are context-specific: the EMT induced by transforming growth factor B1 (TGFB1) is synchronous, whereas the EMTs induced by epidermal growth factor and tumor necrosis factor are asynchronous, and the responses of the TGF-β pathway in terms of gene expression regulations are heterogeneous under different treatments or among various cell states. Meanwhile, network topology analysis suggests that the ICSs during EMT serve as the signaling in cellular communication under different conditions. Interestingly, our analysis of a mouse skin squamous cell carcinoma dataset also suggests that, regardless of the significant discrepancy in concrete genes between in vitro and in vivo EMT systems, the ICSs play a dominant role in the TGF-β signaling crosstalk. Overall, our approach reveals the multiscale mechanisms coupling cell-cell communications and gene-gene regulations responsible for complex cell-state transitions.
Affiliation(s)
- Yutong Sha
- Department of Mathematics, University of California, Irvine, Irvine, CA, United States
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA, United States
| | - Shuxiong Wang
- Department of Mathematics, University of California, Irvine, Irvine, CA, United States
| | - Federico Bocci
- Department of Mathematics, University of California, Irvine, Irvine, CA, United States
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA, United States
| | - Peijie Zhou
- Department of Mathematics, University of California, Irvine, Irvine, CA, United States
| | - Qing Nie
- Department of Mathematics, University of California, Irvine, Irvine, CA, United States
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, Irvine, CA, United States
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, United States
|
192
|
Cahan P, Cacchiarelli D, Dunn SJ, Hemberg M, de Sousa Lopes SMC, Morris SA, Rackham OJL, Del Sol A, Wells CA. Computational Stem Cell Biology: Open Questions and Guiding Principles. Cell Stem Cell 2021; 28:20-32. [PMID: 33417869 PMCID: PMC7799393 DOI: 10.1016/j.stem.2020.12.012] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022]
Abstract
Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more to join this exciting area.
Affiliation(s)
- Patrick Cahan
- Institute for Cell Engineering, Department of Biomedical Engineering, Department of Molecular Biology and Genetics, Johns Hopkins School of Medicine, Baltimore, MD 21205, USA.
| | - Davide Cacchiarelli
- Telethon Institute of Genetics and Medicine (TIGEM), Armenise/Harvard Laboratory of Integrative Genomics, Pozzuoli, Italy; Department of Translational Medicine, University of Naples "Federico II," Naples, Italy
| | - Sara-Jane Dunn
- DeepMind, 14-18 Handyside Street, London N1C 4DN, UK; Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Jeffrey Cheah Biomedical Centre, Puddicombe Way, Cambridge Biomedical Campus, Cambridge CB2 0AW, UK
| | - Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
| | - Samantha A Morris
- Department of Developmental Biology, Department of Genetics, Center of Regenerative Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
| | - Owen J L Rackham
- Centre for Computational Biology and The Program for Cardiovascular and Metabolic Disorders, Duke-NUS Medical School, Singapore, Singapore
| | - Antonio Del Sol
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, Belvaux 4366, Luxembourg; CIC bioGUNE, Bizkaia Technology Park, 801 Building, 48160 Derio, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao 48013, Spain
| | - Christine A Wells
- Centre for Stem Cell Systems, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Melbourne, VIC 3010, Australia
|
193
|
Guillemin A, Stumpf MPH. Noise and the molecular processes underlying cell fate decision-making. Phys Biol 2021; 18:011002. [PMID: 33181489 DOI: 10.1088/1478-3975/abc9d1] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 02/08/2023]
Abstract
Cell fate decision-making events involve the interplay of many molecular processes, ranging from signal transduction to genetic regulation, as well as a set of molecular and physiological feedback loops. Each aspect offers a rich field of investigation in its own right, but to understand the whole process, even in simple terms, we need to consider them together. Here we attempt to characterise this process by focussing on the roles of noise during cell fate decisions. We use a range of recent results to develop a view of the sequence of events by which a cell progresses from a pluripotent or multipotent to a differentiated state: chromatin organisation, transcription factor stoichiometry, and cellular signalling all change during this progression, and all shape cellular variability, which becomes maximal at the transition state.
Affiliation(s)
- Anissa Guillemin
- School of BioSciences, University of Melbourne, Parkville, Australia
|
194
|
Discovering Higher-Order Interactions Through Neural Information Decomposition. Entropy 2021; 23:e23010079. [PMID: 33430463 PMCID: PMC7827712 DOI: 10.3390/e23010079] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/03/2020] [Revised: 12/21/2020] [Accepted: 12/25/2020] [Indexed: 11/25/2022]
Abstract
If regularity in data takes the form of higher-order functions among groups of variables, models which are biased towards lower-order functions may easily mistake the data for noise. To distinguish whether this is the case, one must be able to quantify the contribution of different orders of dependence to the total information. Recent work in information theory attempts to do this through measures of multivariate mutual information (MMI) and information decomposition (ID). Despite substantial theoretical progress, practical issues related to tractability and learnability of higher-order functions are still largely unaddressed. In this work, we introduce a new approach to information decomposition—termed Neural Information Decomposition (NID)—which is both theoretically grounded, and can be efficiently estimated in practice using neural networks. We show on synthetic data that NID can learn to distinguish higher-order functions from noise, while many unsupervised probability models cannot. Additionally, we demonstrate the usefulness of this framework as a tool for exploring biological and artificial neural networks.
|
195
|
Kumar N, Mishra B, Athar M, Mukhtar S. Inference of Gene Regulatory Network from Single-Cell Transcriptomic Data Using pySCENIC. Methods Mol Biol 2021; 2328:171-182. [PMID: 34251625 DOI: 10.1007/978-1-0716-1534-8_10] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 05/06/2023]
Abstract
With the advent of recent next-generation sequencing (NGS) technologies in genomics, transcriptomics, and epigenomics, profiling single-cell sequencing became possible. The single-cell RNA sequencing (scRNA-seq) is widely used to characterize diverse cell populations and ascertain cell type-specific regulatory mechanisms. The gene regulatory network (GRN) mainly consists of genes and their regulators, transcription factors (TFs). Here, we describe the lightning-fast Python implementation of the SCENIC (Single-Cell rEgulatory Network Inference and Clustering) pipeline called pySCENIC. Using single-cell RNA-seq data, it maps TFs onto gene regulatory networks and integrates various cell types to infer cell-specific GRNs. There are two fast and efficient GRN inference algorithms, GRNBoost2 and GENIE3, optionally available with pySCENIC. The pipeline has three steps: (1) identification of potential TF targets based on co-expression; (2) TF-motif enrichment analysis to identify the direct targets (regulons); and (3) scoring the activity of regulons (or other gene sets) on single cell types.
Affiliation(s)
- Nilesh Kumar
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Bharat Mishra
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Mohammad Athar
- Department of Dermatology, School of Medicine, University of Alabama at Birmingham, Birmingham, AL, USA
| | - Shahid Mukhtar
- Department of Biology, University of Alabama at Birmingham, Birmingham, AL, USA.
|
196
|
|
197
|
Augugliaro L, Abbruzzo A, Vinciotti V. ℓ1-Penalized censored Gaussian graphical model. Biostatistics 2020; 21:e1-e16. [PMID: 30203001 DOI: 10.1093/biostatistics/kxy043] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 01/09/2018] [Revised: 07/02/2018] [Accepted: 07/15/2018] [Indexed: 12/30/2022] Open
Abstract
Graphical lasso is one of the most used estimators for inferring genetic networks. Despite its diffusion, there are several fields in applied research where the limits of detection of modern measurement technologies make the use of this estimator theoretically unfounded, even when the assumption of a multivariate Gaussian distribution is satisfied. Typical examples are data generated by polymerase chain reactions and flow cytometer. The combination of censoring and high-dimensionality makes inference of the underlying genetic networks from these data very challenging. In this article, we propose an $\ell_1$-penalized Gaussian graphical model for censored data and derive two EM-like algorithms for inference. We evaluate the computational efficiency of the proposed algorithms by an extensive simulation study and show that, when censored data are available, our proposal is superior to existing competitors both in terms of network recovery and parameter estimation. We apply the proposed method to gene expression data generated by microfluidic Reverse Transcription quantitative Polymerase Chain Reaction technology in order to make inference on the regulatory mechanisms of blood development. A software implementation of our method is available on github (https://github.com/LuigiAugugliaro/cglasso).
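For orientation, the standard (uncensored) graphical lasso that this abstract takes as its starting point is usually written as the penalized likelihood problem below. The symbols $S$ (sample covariance matrix), $\Theta$ (precision matrix), and $\rho$ (tuning parameter) are notation assumed for this note rather than taken from the paper, and the censored-data variant proposed by the authors is not reproduced here.

$$\hat{\Theta} = \arg\max_{\Theta \succ 0} \; \log\det\Theta - \operatorname{tr}(S\Theta) - \rho \lVert \Theta \rVert_1$$

Nonzero off-diagonal entries of $\hat{\Theta}$ correspond to edges of the estimated network, and larger values of $\rho$ yield sparser networks.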
Affiliation(s)
- Luigi Augugliaro: Department of Economics, Business and Statistics, University of Palermo, Building 13, Viale delle Scienze, Palermo, Italy
- Antonino Abbruzzo: Department of Economics, Business and Statistics, University of Palermo, Building 13, Viale delle Scienze, Palermo, Italy
- Veronica Vinciotti: Department of Mathematics, Brunel University London, Kingston Lane, Uxbridge UB8 3PH, UK
|
198
|
Yuan B, Shen C, Luna A, Korkut A, Marks DS, Ingraham J, Sander C. CellBox: Interpretable Machine Learning for Perturbation Biology with Application to the Design of Cancer Combination Therapy. Cell Syst 2020; 12:128-140.e4. [PMID: 33373583 DOI: 10.1016/j.cels.2020.11.013] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 03/17/2020] [Revised: 07/13/2020] [Accepted: 11/25/2020] [Indexed: 01/13/2023]
Abstract
Systematic perturbation of cells followed by comprehensive measurements of molecular and phenotypic responses provides informative data resources for constructing computational models of cell biology. Models that generalize well beyond training data can be used to identify combinatorial perturbations of potential therapeutic interest. Major challenges for machine learning on large biological datasets are to find global optima in a complex multidimensional space and mechanistically interpret the solutions. To address these challenges, we introduce a hybrid approach that combines explicit mathematical models of cell dynamics with a machine-learning framework, implemented in TensorFlow. We tested the modeling framework on a perturbation-response dataset of a melanoma cell line after drug treatments. The models can be efficiently trained to describe cellular behavior accurately. Even though completely data driven and independent of prior knowledge, the resulting de novo network models recapitulate some known interactions. The approach is readily applicable to various kinetic models of cell biology. A record of this paper's Transparent Peer Review process is included in the Supplemental Information.
Affiliation(s)
- Bo Yuan: Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA
- Ciyue Shen: Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA
- Augustin Luna: Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA
- Anil Korkut: Department of Bioinformatics & Computational Biology, the University of Texas M D Anderson Cancer Center, Houston, TX, USA
- Debora S Marks: Broad Institute, Cambridge, MA, USA; Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- John Ingraham: MIT Computer Science & Artificial Intelligence Laboratory, Boston, MA, USA
- Chris Sander: Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA; Broad Institute, Cambridge, MA, USA
|
199
|
Fidanza A, Stumpf PS, Ramachandran P, Tamagno S, Babtie A, Lopez-Yrigoyen M, Taylor AH, Easterbrook J, Henderson BEP, Axton R, Henderson NC, Medvinsky A, Ottersbach K, Romanò N, Forrester LM. Single-cell analyses and machine learning define hematopoietic progenitor and HSC-like cells derived from human PSCs. Blood 2020; 136:2893-2904. [PMID: 32614947 PMCID: PMC7862875 DOI: 10.1182/blood.2020006229] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/13/2020] [Accepted: 06/20/2020] [Indexed: 01/19/2023] Open
Abstract
Hematopoietic stem and progenitor cells (HSPCs) develop in distinct waves at various anatomical sites during embryonic development. The in vitro differentiation of human pluripotent stem cells (hPSCs) recapitulates some of these processes; however, it has proven difficult to generate functional hematopoietic stem cells (HSCs). To define the dynamics and heterogeneity of HSPCs that can be generated in vitro from hPSCs, we explored single-cell RNA sequencing (scRNAseq) in combination with single-cell protein expression analysis. Bioinformatics analyses and functional validation defined the transcriptomes of naïve progenitors and erythroid-, megakaryocyte-, and leukocyte-committed progenitors, and we identified CD44, CD326, ICAM2/CD9, and CD18, respectively, as markers of these progenitors. Using an artificial neural network that we trained on scRNAseq derived from human fetal liver, we identified a wide range of hPSC-derived HSPC phenotypes, including a small group classified as HSCs. This transient HSC-like population decreased as differentiation proceeded, and was completely missing in the data set that had been generated using cells selected on the basis of CD43 expression. By comparing the single-cell transcriptome of in vitro-generated HSC-like cells with those generated within the fetal liver, we identified transcription factors and molecular pathways that can be explored in the future to improve the in vitro production of HSCs.
Affiliation(s)
- Antonella Fidanza: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Patrick S Stumpf: Joint Research Center for Computational Biomedicine, Uniklinik Rheinisch-Westfälische Technische Hochschule (RWTH) Aachen, Aachen, Germany
- Prakash Ramachandran: Centre for Inflammation Research, University of Edinburgh, Edinburgh, United Kingdom
- Sara Tamagno: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Ann Babtie: Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, United Kingdom
- Martha Lopez-Yrigoyen: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- A Helen Taylor: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Jennifer Easterbrook: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Beth E P Henderson: Centre for Inflammation Research, University of Edinburgh, Edinburgh, United Kingdom
- Richard Axton: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Neil C Henderson: Centre for Inflammation Research, University of Edinburgh, Edinburgh, United Kingdom
- Alexander Medvinsky: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Katrin Ottersbach: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
- Nicola Romanò: Centre for Discovery Brain Sciences, University of Edinburgh, Edinburgh, United Kingdom
- Lesley M Forrester: Centre for Regenerative Medicine, University of Edinburgh, Edinburgh, United Kingdom
|
200
|
Osorio D, Zhong Y, Li G, Huang JZ, Cai JJ. scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data. Patterns (N Y) 2020; 1:100139. [PMID: 33336197 PMCID: PMC7733883 DOI: 10.1016/j.patter.2020.100139] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 06/15/2020] [Revised: 09/29/2020] [Accepted: 10/12/2020] [Indexed: 02/02/2023]
Abstract
We present scTenifoldNet, a machine learning workflow built upon principal-component regression, low-rank tensor approximation, and manifold alignment, for constructing and comparing single-cell gene regulatory networks (scGRNs) using data from single-cell RNA sequencing. scTenifoldNet reveals regulatory changes in gene expression between samples by comparing the constructed scGRNs. With real data, scTenifoldNet identifies specific gene expression programs associated with different biological processes, providing critical insights into the underlying mechanism of regulatory networks governing cellular transcriptional activities.
Affiliation(s)
- Daniel Osorio: Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA
- Yan Zhong: Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Guanxun Li: Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- Jianhua Z. Huang: Department of Statistics, Texas A&M University, College Station, TX 77843, USA
- James J. Cai: Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX 77843, USA; Department of Electrical and Computer Engineering, Texas A&M University, College Station, TX 77843, USA; Interdisciplinary Program of Genetics, Texas A&M University, College Station, TX 77843, USA
|