1
Karamveer, Uzun Y. Approaches for Benchmarking Single-Cell Gene Regulatory Network Methods. Bioinform Biol Insights 2024; 18:11779322241287120. [PMID: 39502448 PMCID: PMC11536393 DOI: 10.1177/11779322241287120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 09/10/2024] [Indexed: 11/08/2024] Open
Abstract
Gene regulatory networks are powerful tools for modeling genetic interactions that control the expression of genes driving cell differentiation, and single-cell sequencing offers a unique opportunity to build these networks with high-resolution genomic data. Many computational methods have been proposed to build these networks from single-cell data, and different approaches are used to benchmark these methods. However, a comprehensive discussion specifically focusing on benchmarking approaches is missing. In this article, we lay out the GRN terminology, present an overview of common gold-standard studies and data sets, and define the performance metrics for benchmarking network construction methodologies. We also point out the advantages and limitations of different benchmarking approaches, suggest alternative ground truth data sets that can be used for benchmarking, and specify additional considerations in this context.
Affiliation(s)
- Karamveer
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
- Yasin Uzun
  - Department of Pediatrics, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Penn State Cancer Institute, The Pennsylvania State University College of Medicine, Hershey, PA, USA
2
Wang Y, Zheng P, Cheng YC, Wang Z, Aravkin A. WENDY: Covariance dynamics based gene regulatory network inference. Math Biosci 2024; 377:109284. [PMID: 39168402 DOI: 10.1016/j.mbs.2024.109284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 06/25/2024] [Accepted: 08/16/2024] [Indexed: 08/23/2024]
Abstract
Determining gene regulatory network (GRN) structure is a central problem in biology, with a variety of inference methods available for different types of data. For a widely prevalent and challenging use case, namely single-cell gene expression data measured after intervention at multiple time points with unknown joint distributions, there is only one known specifically developed method, which does not fully utilize the rich information contained in this data type. We develop an inference method for the GRN in this case, netWork infErence by covariaNce DYnamics, dubbed WENDY. The core idea of WENDY is to model the dynamics of the covariance matrix and to solve these dynamics as an optimization problem to determine the regulatory relationships. To evaluate its effectiveness, we compare WENDY with other inference methods using synthetic data and experimental data. Our results demonstrate that WENDY performs well across different data sets.
Affiliation(s)
- Yue Wang
  - Irving Institute for Cancer Dynamics and Department of Statistics, Columbia University, New York, 10027, NY, USA
- Peng Zheng
  - Institute for Health Metrics and Evaluation, Seattle, 98195, WA, USA; Department of Health Metrics Sciences, University of Washington, Seattle, 98195, WA, USA
- Yu-Chen Cheng
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, 02215, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, 02115, MA, USA; Center for Cancer Evolution, Dana-Farber Cancer Institute, Boston, 02215, MA, USA; Department of Stem Cell and Regenerative Biology, Harvard University, Cambridge, 02138, MA, USA
- Zikun Wang
  - Laboratory of Genetics, The Rockefeller University, New York, 10065, NY, USA
- Aleksandr Aravkin
  - Department of Applied Mathematics, University of Washington, Seattle, 98195, WA, USA
3
Wei PJ, Bao JJ, Gao Z, Tan JY, Cao RF, Su Y, Zheng CH, Deng L. MEFFGRN: Matrix enhancement and feature fusion-based method for reconstructing the gene regulatory network of epithelioma papulosum cyprini cells by spring viremia of carp virus infection. Comput Biol Med 2024; 179:108835. [PMID: 38996550 DOI: 10.1016/j.compbiomed.2024.108835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2024] [Revised: 06/05/2024] [Accepted: 06/29/2024] [Indexed: 07/14/2024]
Abstract
Gene regulatory networks (GRNs) are crucial for understanding organismal molecular mechanisms and processes. Constructing the GRN of epithelioma papulosum cyprini (EPC) cells of cyprinid fish following spring viremia of carp virus (SVCV) infection helps to understand the immune regulatory mechanisms that enhance the survival capabilities of cyprinid fish. Although many computational methods have been used to infer GRNs, specialized approaches for predicting the GRN of EPC cells following SVCV infection are lacking. In addition, most existing methods focus primarily on gene expression features, neglecting the valuable network structural information in known GRNs. In this study, we propose a novel supervised deep neural network, named MEFFGRN (Matrix Enhancement- and Feature Fusion-based method for Gene Regulatory Network inference), to accurately predict the GRN of EPC cells following SVCV infection. MEFFGRN considers both gene expression data and the network structure information of known GRNs, and introduces a matrix enhancement method to address the sparsity of known GRNs, extracting richer network structure information. To exploit the strengths of convolutional neural networks (CNNs) in image processing, gene expression and enhanced GRN data were each transformed into histogram images for every gene pair. Subsequently, these histograms were separately fed into CNNs for training to obtain the corresponding gene expression and network structural features. Furthermore, a feature fusion mechanism was introduced to comprehensively integrate the gene expression and network structural features. This integration considers the specificity of each feature and their interactive information, resulting in a more comprehensive and precise feature representation during the fusion process. Experimental results from both real-world and benchmark datasets demonstrate that MEFFGRN achieves competitive performance compared with state-of-the-art computational methods. Furthermore, study findings from SVCV-infected EPC cells suggest that MEFFGRN can predict novel gene regulatory relationships.
Affiliation(s)
- Pi-Jing Wei
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, Institutes of Physical Science and Information Technology, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Jin-Jin Bao
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Zhen Gao
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Jing-Yun Tan
  - Shenzhen Key Laboratory of Microbial Genetic Engineering, College of Life Sciences and Oceanology, Shenzhen University, Shenzhen, 518055, Guangdong, China
- Rui-Fen Cao
  - Key Laboratory of Intelligent Computing & Signal Processing of Ministry of Education, School of Computer Science and Technology, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Yansen Su
  - School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Chun-Hou Zheng
  - School of Artificial Intelligence, Anhui University, 111 Jiulong Road, Hefei, 230601, Anhui, China
- Li Deng
  - Shenzhen Key Laboratory of Microbial Genetic Engineering, College of Life Sciences and Oceanology, Shenzhen University, Shenzhen, 518055, Guangdong, China
4
Lee J, Kim N, Cho KH. Decoding the principle of cell-fate determination for its reverse control. NPJ Syst Biol Appl 2024; 10:47. [PMID: 38710700 PMCID: PMC11074314 DOI: 10.1038/s41540-024-00372-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2023] [Accepted: 04/16/2024] [Indexed: 05/08/2024] Open
Abstract
Understanding and manipulating cell fate determination is pivotal in biology. Cell fate is determined by intricate and nonlinear interactions among molecules, making mathematical model-based quantitative analysis indispensable for its elucidation. Nevertheless, obtaining the essential dynamic experimental data for model development has been a significant obstacle. However, recent advancements in large-scale omics data technology are providing the necessary foundation for developing such models. Based on accumulated experimental evidence, we can postulate that cell fate is governed by a limited number of core regulatory circuits. Following this concept, we present a conceptual control framework that leverages single-cell RNA-seq data for dynamic molecular regulatory network modeling, aiming to identify and manipulate core regulatory circuits and their master regulators to drive desired cellular state transitions. We illustrate the proposed framework by applying it to the reversion of lung cancer cell states, although it is more broadly applicable to understanding and controlling a wide range of cell-fate determination processes.
Affiliation(s)
- Jonghoon Lee
  - Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
- Namhee Kim
  - Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
  - biorevert, Inc., Daejeon, Republic of Korea
- Kwang-Hyun Cho
  - Laboratory for Systems Biology and Bio-inspired Engineering, Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea
5
Pan X, Zhang X. Studying temporal dynamics of single cells: expression, lineage and regulatory networks. Biophys Rev 2024; 16:57-67. [PMID: 38495440 PMCID: PMC10937865 DOI: 10.1007/s12551-023-01090-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/17/2023] [Accepted: 06/27/2023] [Indexed: 03/19/2024] Open
Abstract
Learning how multicellular organs develop from single cells into different cell types is a fundamental problem in biology. With high-throughput scRNA-seq technology, computational methods have been developed to reveal the temporal dynamics of single cells from transcriptomic data, from phenomena on cell trajectories to the underlying mechanisms that formed the trajectory. Several distinct families of computational methods are involved in such studies, including Trajectory Inference (TI), Lineage Tracing (LT), and Gene Regulatory Network (GRN) Inference. This review summarizes these computational approaches, which use scRNA-seq data to study cell differentiation and cell fate specification, as well as the advantages and limitations of different methods. We further discuss how GRNs can potentially affect cell fate decisions and trajectory structures. Supplementary Information: The online version contains supplementary material available at 10.1007/s12551-023-01090-5.
Affiliation(s)
- Xinhai Pan
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA
- Xiuwei Zhang
  - School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA
6
Zeng Y, He Y, Zheng R, Li M. Inferring single-cell gene regulatory network by non-redundant mutual information. Brief Bioinform 2023; 24:bbad326. [PMID: 37715282 DOI: 10.1093/bib/bbad326] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/19/2023] [Revised: 07/12/2023] [Accepted: 08/08/2023] [Indexed: 09/17/2023] Open
Abstract
Gene regulatory networks play a crucial role in controlling the biological processes of living creatures. Deciphering complex gene regulatory networks from experimental data remains a major challenge in systems biology. Recent advances in single-cell RNA sequencing technology bring massive high-resolution data, enabling computational inference of cell-specific gene regulatory networks (GRNs). Many relevant algorithms have been developed to achieve this goal in the past years. However, GRN inference is still far from ideal because of the extra noise introduced by pseudo-time information and the large number of dropouts in datasets. Here, we present a novel GRN inference method named Normi, which is based on non-redundant mutual information. Normi addresses these problems by applying a fixed-size sliding window over the entire trajectory and averaging the gene expression of the cells in each window to obtain representative cells. To further alleviate the impact of dropouts, we utilize the mixed KSG estimator to quantify the high-order time-delayed mutual information among genes, then filter out the redundant edges by adopting the Max-Relevance and Min-Redundancy algorithm. Moreover, we determine the optimal time delay for each gene pair by distance correlation. Normi outperforms other state-of-the-art GRN inference methods on both simulated data and single-cell RNA sequencing (scRNA-seq) datasets, demonstrating its superior robustness. The performance of Normi on real scRNA-seq data further reveals its ability to identify key regulators and crucial biological processes.
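The windowing step described in this abstract can be illustrated with a minimal sketch. It assumes that each cell already has a pseudotime value and that a plain mean over each fixed-size window is used; the function name `sliding_window_average` and the `window` and `step` parameters are hypothetical and are not taken from the Normi software, whose actual implementation may differ.

```python
import numpy as np

def sliding_window_average(expr, pseudotime, window=3, step=1):
    """Average expression over fixed-size windows of pseudotime-ordered cells.

    expr: array of shape (n_cells, n_genes)
    pseudotime: array of shape (n_cells,)
    Returns an array of shape (n_windows, n_genes), one "representative
    cell" per window. Illustrative only; not the Normi implementation.
    """
    expr = np.asarray(expr, dtype=float)
    order = np.argsort(pseudotime, kind="stable")  # order cells along the trajectory
    ordered = expr[order]
    n_cells = ordered.shape[0]
    if window > n_cells:
        raise ValueError("window is larger than the number of cells")
    starts = range(0, n_cells - window + 1, step)
    # Each representative cell is the mean profile of the cells in one window.
    return np.vstack([ordered[s:s + window].mean(axis=0) for s in starts])

# Toy data: 4 cells, 2 genes, cells given out of pseudotime order.
toy_expr = [[1, 10], [3, 30], [2, 20], [4, 40]]
toy_time = [0.1, 0.3, 0.2, 0.4]
representatives = sliding_window_average(toy_expr, toy_time, window=2, step=1)
print(representatives)
```

Averaging over neighbouring cells trades temporal resolution for lower noise, so the window size is a tuning choice rather than a fixed constant.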
Affiliation(s)
- Yanping Zeng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yongxin He
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Ruiqing Zheng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Min Li
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
7
Daniels RR, Taylor RS, Robledo D, Macqueen DJ. Single cell genomics as a transformative approach for aquaculture research and innovation. Reviews in Aquaculture 2023; 15:1618-1637. [PMID: 38505116 PMCID: PMC10946576 DOI: 10.1111/raq.12806] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/26/2022] [Revised: 02/16/2023] [Accepted: 02/16/2023] [Indexed: 03/21/2024]
Abstract
Single cell genomics encompasses a suite of rapidly maturing technologies that measure the molecular profiles of individual cells within target samples. These approaches provide a large step up in biological information compared to long-established 'bulk' methods that profile the average molecular profiles of all cells in a sample, and have led to transformative advances in understanding of cellular biology, particularly in humans and model organisms. The application of single cell genomics is fast expanding to non-model taxa, including aquaculture species, where numerous research applications are underway with many more envisaged. In this review, we highlight the potential transformative applications of single cell genomics in aquaculture research, considering barriers and potential solutions to the broad uptake of these technologies. Focusing on single cell transcriptomics, we outline considerations for experimental design, including the essential requirement to obtain high quality cells/nuclei for sequencing in ectothermic aquatic species. We further outline data analysis and bioinformatics considerations, tailored to studies with the under-characterized genomes of aquaculture species, where our knowledge of cellular heterogeneity and cell marker genes is immature. Overall, this review offers a useful source of knowledge for researchers aiming to apply single cell genomics to address biological challenges faced by the global aquaculture sector through an improved understanding of cell biology.
Affiliation(s)
- Rose Ruiz Daniels
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
- Richard S. Taylor
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
- Diego Robledo
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
- Daniel J. Macqueen
  - The Roslin Institute and Royal (Dick) School of Veterinary Studies, The University of Edinburgh, Midlothian, UK
8
Vanheer L, Fantuzzi F, To SK, Schiavo A, Van Haele M, Ostyn T, Haesen T, Yi X, Janiszewski A, Chappell J, Rihoux A, Sawatani T, Roskams T, Pattou F, Kerr-Conte J, Cnop M, Pasque V. Inferring regulators of cell identity in the human adult pancreas. NAR Genom Bioinform 2023; 5:lqad068. [PMID: 37435358 PMCID: PMC10331937 DOI: 10.1093/nargab/lqad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 06/17/2023] [Accepted: 06/28/2023] [Indexed: 07/13/2023] Open
Abstract
Cellular identity during development is under the control of transcription factors that form gene regulatory networks. However, the transcription factors and gene regulatory networks underlying cellular identity in the human adult pancreas remain largely unexplored. Here, we integrate multiple single-cell RNA-sequencing datasets of the human adult pancreas, totaling 7393 cells, and comprehensively reconstruct gene regulatory networks. We show that a network of 142 transcription factors forms distinct regulatory modules that characterize pancreatic cell types. We present evidence that our approach identifies regulators of cell identity and cell states in the human adult pancreas. We predict that HEYL, BHLHE41 and JUND are active in acinar, beta and alpha cells, respectively, and show that these proteins are present in the human adult pancreas as well as in human induced pluripotent stem cell (hiPSC)-derived islet cells. Using single-cell transcriptomics, we found that JUND represses beta cell genes in hiPSC-alpha cells. BHLHE41 depletion induced apoptosis in primary pancreatic islets. The comprehensive gene regulatory network atlas can be explored interactively online. We anticipate our analysis to be the starting point for a more sophisticated dissection of how transcription factors regulate cell identity and cell states in the human adult pancreas.
Affiliation(s)
- Lotte Vanheer
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Federica Fantuzzi
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- San Kit To
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Andrea Schiavo
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Matthias Van Haele
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Tessa Ostyn
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Tine Haesen
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Xiaoyan Yi
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Adrian Janiszewski
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Joel Chappell
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Adrien Rihoux
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
- Toshiaki Sawatani
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Tania Roskams
  - Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
- Francois Pattou
  - University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
  - European Genomic Institute for Diabetes, F-59000 Lille, France
  - University of Lille, F-59000 Lille, France
- Julie Kerr-Conte
  - University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
  - European Genomic Institute for Diabetes, F-59000 Lille, France
  - University of Lille, F-59000 Lille, France
- Miriam Cnop
  - ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
  - Division of Endocrinology; Erasmus Hospital, Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Vincent Pasque
  - Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
9
Glazer BJ, Lifferth JT, Lopez CF. Automatic mechanistic inference from large families of Boolean models generated by Monte Carlo tree search. Front Cell Dev Biol 2023; 11:1198359. [PMID: 37691824 PMCID: PMC10485623 DOI: 10.3389/fcell.2023.1198359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2023] [Accepted: 08/07/2023] [Indexed: 09/12/2023] Open
Abstract
Many important processes in biology, such as signaling and gene regulation, can be described using logic models. These logic models are typically built to behaviorally emulate experimentally observed phenotypes, which are assumed to be steady states of a biological system. Most models are built by hand, and therefore researchers are only able to consider one or perhaps a few potential mechanisms. We present a method to automatically synthesize Boolean logic models with a specified set of steady states. Our method, called MC-Boomer, is based on Monte Carlo Tree Search, an efficient, parallel search method using reinforcement learning. Our approach enables users to constrain the model search space using prior knowledge or biochemical interaction databases, thus leading to generation of biologically plausible mechanistic hypotheses. Our approach can generate very large numbers of data-consistent models. To help develop mechanistic insight from these models, we developed analytical tools for multi-model inference and model selection. These tools reveal the key sets of interactions that govern the behavior of the models. We demonstrate that MC-Boomer works well at reconstructing randomly generated models. Then, using single time point measurements and reasonable biological constraints, our method generates hundreds of thousands of candidate models that match experimentally validated in vivo behaviors of the Drosophila segment polarity network. Finally, we outline how our multi-model analysis procedures elucidate potentially novel biological mechanisms and provide opportunities for model-driven experimental validation.
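To make the notion of a Boolean model with a specified set of steady states concrete, the following minimal sketch enumerates the steady states of a small hand-written model by brute force. It is not MC-Boomer and does not use Monte Carlo Tree Search; the function `steady_states` and the two-gene `toy_rules` model are hypothetical, and the sketch assumes a state is steady when every update rule returns the node's current value.

```python
from itertools import product

def steady_states(rules):
    """Return all states in which every node's rule reproduces its value.

    rules: dict mapping node name -> function(state dict) -> bool
    Brute force over 2**n states, so only suitable for small toy models.
    """
    names = sorted(rules)
    found = []
    for values in product([False, True], repeat=len(names)):
        state = dict(zip(names, values))
        # A steady state is a fixed point of all update rules at once.
        if all(rules[name](state) == state[name] for name in names):
            found.append(state)
    return found

# Hypothetical toy model: two mutually repressing genes (a toggle switch).
toy_rules = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
}

for state in steady_states(toy_rules):
    print(state)
```

A candidate model would be consistent with observed phenotypes, in the sense used above, if each observed phenotype appears among the states this check returns.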
Affiliation(s)
- Bryan J. Glazer
  - Department of Biomedical Informatics, Vanderbilt University, Nashville, TN, United States
- Jonathan T. Lifferth
  - Department of Human Genetics, Vanderbilt University, Nashville, TN, United States
- Carlos F. Lopez
  - Department of Biochemistry, Vanderbilt University, Nashville, TN, United States
  - Altos Labs, Redwood City, CA, United States
10
Zhang S, Pyne S, Pietrzak S, Halberg S, McCalla SG, Siahpirani AF, Sridharan R, Roy S. Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets. Nat Commun 2023; 14:3064. [PMID: 37244909 PMCID: PMC10224950 DOI: 10.1038/s41467-023-38637-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/25/2022] [Accepted: 05/10/2023] [Indexed: 05/29/2023] Open
Abstract
Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that connect transcription factors and signaling proteins to target genes. Single-cell technologies such as single cell RNA-sequencing (scRNA-seq) and single cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq) can examine cell type-specific gene regulation in unprecedented detail. However, current approaches to infer cell type-specific GRNs are limited in their ability to integrate scRNA-seq and scATAC-seq measurements and to model network dynamics on a cell lineage. To address this challenge, we have developed single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer the GRN for each cell type on a lineage from scRNA-seq and scATAC-seq data. Using simulated and real datasets, we show that scMTNI is a broadly applicable framework for linear and branching lineages that accurately infers GRN dynamics and identifies key regulators of fate transitions for diverse processes such as cellular reprogramming and differentiation.
Affiliation(s)
- Shilu Zhang
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Saptarshi Pyne
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Stefan Pietrzak
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, WI, USA
- Spencer Halberg
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
- Sunnie Grace McCalla
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
- Alireza Fotuhi Siahpirani
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Rupa Sridharan
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, WI, USA
- Sushmita Roy
  - Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
11
Beneš N, Brim L, Huvar O, Pastva S, Šafránek D. Boolean network sketches: a unifying framework for logical model inference. Bioinformatics 2023; 39:btad158. [PMID: 37004199 PMCID: PMC10122605 DOI: 10.1093/bioinformatics/btad158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 03/02/2023] [Accepted: 03/20/2023] [Indexed: 04/03/2023] Open
Abstract
MOTIVATION: The problem of model inference is of fundamental importance to systems biology. Logical models (e.g. Boolean networks; BNs) represent a computationally attractive approach capable of handling large biological networks. The models are typically inferred from experimental data. However, even with a substantial amount of experimental data supported by some prior knowledge, existing inference methods often focus on a small sample of admissible candidate models only. RESULTS: We propose Boolean network sketches as a new formal instrument for the inference of Boolean networks. A sketch integrates (typically partial) knowledge about the network's topology and the update logic (obtained through, e.g. a biological knowledge base or a literature search), as well as further assumptions about the properties of the network's transitions (e.g. the form of its attractor landscape), and additional restrictions on the model dynamics given by the measured experimental data. Our new BN inference algorithm starts with an 'initial' sketch, which is extended by adding restrictions representing experimental data to a 'data-informed' sketch and subsequently computes all BNs consistent with the data-informed sketch. Our algorithm is based on a symbolic representation and coloured model-checking. Our approach is unique in its ability to cover a broad spectrum of knowledge and efficiently produce a compact representation of all inferred BNs. We evaluate the method on a non-trivial collection of real-world and simulated data. AVAILABILITY AND IMPLEMENTATION: All software and data are freely available as a reproducible artefact at https://doi.org/10.5281/zenodo.7688740.
Affiliation(s)
- Nikola Beneš
  - Faculty of Informatics, Masaryk University, Brno 602 00, Czech Republic
- Luboš Brim
  - Faculty of Informatics, Masaryk University, Brno 602 00, Czech Republic
- Ondřej Huvar
  - Faculty of Informatics, Masaryk University, Brno 602 00, Czech Republic
- Samuel Pastva
  - Institute of Science and Technology Austria, Klosterneuburg 3400, Austria
- David Šafránek
  - Faculty of Informatics, Masaryk University, Brno 602 00, Czech Republic
12
McCalla SG, Fotuhi Siahpirani A, Li J, Pyne S, Stone M, Periyasamy V, Shin J, Roy S. Identifying strengths and weaknesses of methods for computational network inference from single-cell RNA-seq data. G3 (Bethesda, Md.) 2023; 13:jkad004. [PMID: 36626328 PMCID: PMC9997554 DOI: 10.1093/g3journal/jkad004] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 11/09/2022] [Revised: 11/09/2022] [Accepted: 12/16/2022] [Indexed: 01/11/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) offers unparalleled insight into the transcriptional programs of different cellular states by measuring the transcriptome of thousands of individual cells. An emerging problem in the analysis of scRNA-seq is the inference of transcriptional gene regulatory networks, and a number of methods with different learning frameworks have been developed to address this problem. Here, we present an expanded benchmarking study of eleven recent network inference methods on seven published scRNA-seq datasets in human, mouse, and yeast considering different types of gold standard networks and evaluation metrics. We evaluate methods based on their computing requirements as well as on their ability to recover the network structure. We find that, while most methods have a modest recovery of experimentally derived interactions based on global metrics such as Area Under the Precision Recall curve, methods are able to capture targets of regulators that are relevant to the system under study. The top-performing methods that use only expression included SCENIC, PIDC, MERLIN, and Correlation. Addition of prior biological knowledge and the estimation of transcription factor activities resulted in the best overall performance, with the Inferelator and MERLIN methods that use prior knowledge outperforming methods that use expression alone. We found that imputation did not improve network inference accuracy and could be detrimental. Comparisons of inferred networks for comparable bulk conditions showed that the networks inferred from scRNA-seq datasets are often better than or on par with the networks inferred from bulk datasets. Our analysis should be beneficial in selecting methods for network inference. At the same time, this highlights the need for improved methods and better gold standards for regulatory network inference from scRNA-seq datasets.
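To make the global metric mentioned above concrete, the following sketch computes average precision, one common step-wise estimate of the area under the precision-recall curve, for a ranked list of predicted regulator-target edges against a gold-standard edge set. The abstract does not say how the study computed its curves, so the estimator, the function name `average_precision`, and the toy edges are all assumptions.

```python
def average_precision(ranked_edges, gold_edges):
    """Average precision of a ranked edge list against a gold-standard set.

    ranked_edges: list of (regulator, target) tuples, most confident first.
    gold_edges: iterable of (regulator, target) tuples treated as true edges.
    """
    gold = set(gold_edges)
    if not gold:
        return 0.0
    hits = 0
    total = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in gold:
            hits += 1
            # Precision at this rank, accumulated only where recall increases.
            total += hits / rank
    return total / len(gold)


# Hypothetical ranking of four predicted edges; two of them are in the gold standard.
ranked = [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g1"), ("TF2", "g3")]
gold = {("TF1", "g1"), ("TF2", "g1")}
print(average_precision(ranked, gold))  # (1/1 + 2/3) / 2
```

Because true regulatory edges are rare among all possible gene pairs, precision-recall summaries of this kind are usually more informative than accuracy for sparse networks.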
Affiliation(s)
- Sunnie Grace McCalla
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | | | - Jiaxin Li
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Saptarshi Pyne
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
| | - Matthew Stone
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
| | - Viswesh Periyasamy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Junha Shin
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
| | - Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| |
|
13
|
Palshikar MG, Palli R, Tyrell A, Maggirwar S, Schifitto G, Singh MV, Thakar J. Executable models of immune signaling pathways in HIV-associated atherosclerosis. NPJ Syst Biol Appl 2022; 8:35. [PMID: 36131068 PMCID: PMC9492768 DOI: 10.1038/s41540-022-00246-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2022] [Accepted: 09/01/2022] [Indexed: 11/09/2022] Open
Abstract
Atherosclerosis (AS)-associated cardiovascular disease is an important cause of mortality in an aging population of people living with HIV (PLWH). This elevated risk has been attributed to viral infection, anti-retroviral therapy, chronic inflammation, and lifestyle factors. However, the rates at which PLWH develop AS vary even after controlling for length of infection, treatment duration, and for lifestyle factors. To investigate the molecular signaling underlying this variation, we sequenced 9368 peripheral blood mononuclear cells (PBMCs) from eight PLWH, four of whom have atherosclerosis (AS+). Additionally, a publicly available dataset of PBMCs from persons before and after HIV infection was used to investigate the effect of acute HIV infection. To characterize dysregulation of pathways rather than just measuring enrichment, we developed the single-cell Boolean Omics Network Invariant Time Analysis (scBONITA) algorithm. scBONITA infers executable dynamic pathway models and performs a perturbation analysis to identify high impact genes. These dynamic models are used for pathway analysis and to map sequenced cells to characteristic signaling states (attractor analysis). scBONITA revealed that lipid signaling regulates cell migration into the vascular endothelium in AS+ PLWH. Pathways implicated included AGE-RAGE and PI3K-AKT signaling in CD8+ T cells, and glucagon and cAMP signaling pathways in monocytes. Attractor analysis with scBONITA facilitated the pathway-based characterization of cellular states in CD8+ T cells and monocytes. In this manner, we identify critical cell-type specific molecular mechanisms underlying HIV-associated atherosclerosis using a novel computational method.
Affiliation(s)
- Mukta G Palshikar
- Biophysics, Structural, and Computational Biology Program, University of Rochester School of Medicine and Dentistry, Rochester, USA
| | - Rohith Palli
- Medical Scientist Training Program, University of Rochester School of Medicine and Dentistry, Rochester, USA
| | - Alicia Tyrell
- University of Rochester Clinical & Translational Science Institute, Rochester, USA
| | - Sanjay Maggirwar
- Department of Microbiology, Immunology and Tropical Medicine, George Washington University School of Medicine and Health Sciences, Washington, DC, USA
| | - Giovanni Schifitto
- Department of Neurology, University of Rochester School of Medicine and Dentistry, Rochester, USA
- Department of Imaging Sciences, University of Rochester School of Medicine and Dentistry, Rochester, USA
| | - Meera V Singh
- Department of Neurology, University of Rochester School of Medicine and Dentistry, Rochester, USA
- Department of Microbiology and Immunology, University of Rochester School of Medicine and Dentistry, Rochester, USA
| | - Juilee Thakar
- Biophysics, Structural, and Computational Biology Program, University of Rochester School of Medicine and Dentistry, Rochester, USA
- Department of Microbiology and Immunology, University of Rochester School of Medicine and Dentistry, Rochester, USA
- Department of Biostatistics and Computational Biology, University of Rochester School of Medicine and Dentistry, Rochester, USA
- Department of Biomedical Genetics, University of Rochester School of Medicine and Dentistry, Rochester, USA
| |
|
14
|
A probabilistic Boolean model on hair follicle cell fate regulation by TGF-β. Biophys J 2022; 121:2638-2652. [PMID: 35714600 DOI: 10.1016/j.bpj.2022.05.035] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/03/2021] [Revised: 05/20/2022] [Accepted: 05/23/2022] [Indexed: 11/24/2022] Open
Abstract
Hair follicles (HFs) are mini skin organs that undergo cyclic growth. Various signals regulate HF cell fate decisions jointly. Recent experimental results suggest that transforming growth factor beta (TGF-β) exhibits a dual role in HF cell fate regulation that can be either anti- or pro-apoptosis. To understand the underlying mechanisms of HF cell fate control, we develop a novel probabilistic Boolean network (pBN) model on the HF epithelial cell gene regulation dynamics. First, the model is derived from literature, then refined using single-cell RNA sequencing data. Using the model, we both explore the mechanisms underlying HF cell fate decisions and make predictions that could potentially guide future experiments: 1) we propose that a threshold-like switch in the TGF-β strength may necessitate the dual roles of TGF-β in either activating apoptosis or cell proliferation, in cooperation with Bmp and tumor necrosis factor (TNF) and at different stages of a follicle growth cycle; 2) our model shows concordance with the high-activator-low-inhibitor theory of anagen initiation; 3) we predict that TNF may be more effective in catagen initiation than TGF-β, and they may cooperate in a two-step fashion; 4) finally, predictions of gene knockout and overexpression reveal the roles in HF cell fate regulations of each gene. Attractor and motif analysis from the associated Boolean networks reveal the relations between the topological structure of the gene regulation network and the cell fate regulation mechanism. A discrete spatial model equipped with the pBN illustrates how TGF-β and TNF cooperate in initiating and driving the apoptosis wave during catagen.
|
15
|
Liu W, Sun X, Yang L, Li K, Yang Y, Fu X. NSCGRN: a network structure control method for gene regulatory network inference. Brief Bioinform 2022; 23:6585392. [PMID: 35554485 DOI: 10.1093/bib/bbac156] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/31/2022] [Revised: 03/27/2022] [Accepted: 04/06/2022] [Indexed: 01/18/2023] Open
Abstract
Accurate inference of gene regulatory networks (GRNs) is an essential premise for understanding pathogenesis and curing diseases. Various computational methods have been developed for GRN inference, but the identification of redundant regulation remains a challenge faced by researchers. Although combining global and local topology can identify and reduce redundant regulations, the topologies' specific forms and cooperation modes are unclear and real regulations may be sacrificed. Here, we propose a network structure control method [network-structure-controlling-based GRN inference method (NSCGRN)] that stipulates the global and local topology's specific forms and cooperation mode. The method is carried out in a cooperative mode of 'global topology dominates and local topology refines'. Global topology requires layering and sparseness of the network, and local topology requires consistency of the subgraph association pattern with the network motifs (fan-in, fan-out, cascade and feedforward loop). Specifically, an ordered gene list is obtained by network topology centrality sorting. A Bernaola-Galvan mutation detection algorithm applied to the list gives the hierarchy of GRNs to control the upstream and downstream regulations within the global scope. Finally, four network motifs are integrated into the hierarchy to optimize local complex regulations and form a cooperative mode where global and local topologies play the dominant and refined roles, respectively. NSCGRN is compared with state-of-the-art methods on three different datasets (six networks in total), and it achieves the highest F1 and Matthews correlation coefficient. Experimental results show its unique advantages in GRN inference.
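The two summary scores named above can be sketched as follows for a predicted edge set compared with a gold-standard network. The helper names, the choice to score all ordered gene pairs without self-loops, and the convention of returning 0 when a denominator vanishes are assumptions for illustration, not details taken from the NSCGRN paper.

```python
import math
from itertools import permutations


def confusion_counts(predicted, gold, candidates):
    """Count TP, FP, FN, TN over an explicit list of candidate edges."""
    predicted, gold = set(predicted), set(gold)
    tp = sum(1 for e in candidates if e in predicted and e in gold)
    fp = sum(1 for e in candidates if e in predicted and e not in gold)
    fn = sum(1 for e in candidates if e not in predicted and e in gold)
    tn = sum(1 for e in candidates if e not in predicted and e not in gold)
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 if nothing is predicted or true."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy directed network on three genes; candidates are all ordered pairs.
genes = ["A", "B", "C"]
candidates = list(permutations(genes, 2))
gold = {("A", "B"), ("B", "C")}
predicted = {("A", "B"), ("A", "C")}

tp, fp, fn, tn = confusion_counts(predicted, gold, candidates)
print(f1_score(tp, fp, fn))  # 0.5
print(mcc(tp, fp, fn, tn))   # 0.25
```

Unlike F1, the Matthews correlation coefficient also uses the true negatives, so it penalizes methods that predict many edges in a sparse network.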
Affiliation(s)
- Wei Liu
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
| | - Xingen Sun
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
| | - Li Yang
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
| | - Kaiwen Li
- Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
| | - Yu Yang
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
| | - Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410000, China
| |
|
16
|
Gao S, Sun C, Xiang C, Qin K, Lee TH. Learning Asynchronous Boolean Networks From Single-Cell Data Using Multiobjective Cooperative Genetic Programming. IEEE Transactions on Cybernetics 2022; 52:2916-2930. [PMID: 33027020 DOI: 10.1109/tcyb.2020.3022430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Recent advances in high-throughput single-cell technologies provide new opportunities for computational modeling of gene regulatory networks (GRNs) with an unprecedented amount of gene expression data. Current studies on the Boolean network (BN) modeling of GRNs mostly depend on bulk time-series data and focus on the synchronous update scheme due to its computational simplicity and tractability. However, such synchrony is a strong and rarely biologically realistic assumption. In this study, we adopt the asynchronous update scheme instead and propose a novel framework called SgpNet to infer asynchronous BNs from single-cell data by formulating it into a multiobjective optimization problem. SgpNet aims to find BNs that can match the asynchronous state transition graph (STG) extracted from single-cell data and retain the sparsity of GRNs. To search the huge solution space efficiently, we encode each Boolean function as a tree in genetic programming and evolve all functions of a network simultaneously via cooperative coevolution. Besides, we develop a regulator preselection strategy in view of GRN sparsity to further enhance learning efficiency. An error threshold estimation heuristic is also proposed to ease tedious parameter tuning. SgpNet is compared with the state-of-the-art method on both synthetic data and experimental single-cell data. Results show that SgpNet achieves comparable inference accuracy, while it has far fewer parameters and eliminates artificial restrictions on the Boolean function structures. Furthermore, SgpNet can potentially scale to large networks via straightforward parallelization on multiple cores.
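For readers unfamiliar with the asynchronous update scheme, the sketch below builds the asynchronous state transition graph of a small Boolean network, where each transition updates exactly one gene. It only illustrates the general concept; it is not SgpNet, and the toy mutual-inhibition network and the helper names are assumptions. Self-loops are omitted, so stable states simply have no outgoing edges.

```python
from itertools import product


def async_successors(state, update_functions):
    """States reachable from `state` by updating exactly one gene."""
    successors = set()
    for i, update in enumerate(update_functions):
        new_value = update(state)
        if new_value != state[i]:
            successors.add(state[:i] + (new_value,) + state[i + 1:])
    return successors


def async_stg(update_functions):
    """Asynchronous state transition graph as {state: set of successor states}."""
    n = len(update_functions)
    return {
        state: async_successors(state, update_functions)
        for state in product((0, 1), repeat=n)
    }


# Toy network (hypothetical): two genes that repress each other.
toggle = [
    lambda s: 1 - s[1],  # gene 0 is on when gene 1 is off
    lambda s: 1 - s[0],  # gene 1 is on when gene 0 is off
]
stg = async_stg(toggle)
for state, nexts in sorted(stg.items()):
    print(state, "->", sorted(nexts))
```

In this toy network the states (0, 1) and (1, 0) have no successors, while (0, 0) and (1, 1) can each move to either stable state depending on which gene updates first; this nondeterminism is what distinguishes the asynchronous scheme from the synchronous one.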
|
17
|
Wang M, Song WM, Ming C, Wang Q, Zhou X, Xu P, Krek A, Yoon Y, Ho L, Orr ME, Yuan GC, Zhang B. Guidelines for bioinformatics of single-cell sequencing data analysis in Alzheimer's disease: review, recommendation, implementation and application. Mol Neurodegener 2022; 17:17. [PMID: 35236372 PMCID: PMC8889402 DOI: 10.1186/s13024-022-00517-z] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 07/04/2021] [Accepted: 01/18/2022] [Indexed: 12/13/2022] Open
Abstract
Alzheimer's disease (AD) is the most common form of dementia, characterized by progressive cognitive impairment and neurodegeneration. Extensive clinical and genomic studies have revealed biomarkers, risk factors, pathways, and targets of AD in the past decade. However, the exact molecular basis of AD development and progression remains elusive. The emerging single-cell sequencing technology can potentially provide cell-level insights into the disease. Here we systematically review the state-of-the-art bioinformatics approaches to analyze single-cell sequencing data and their applications to AD in 14 major directions, including 1) quality control and normalization, 2) dimension reduction and feature extraction, 3) cell clustering analysis, 4) cell type inference and annotation, 5) differential expression, 6) trajectory inference, 7) copy number variation analysis, 8) integration of single-cell multi-omics, 9) epigenomic analysis, 10) gene network inference, 11) prioritization of cell subpopulations, 12) integrative analysis of human and mouse sc-RNA-seq data, 13) spatial transcriptomics, and 14) comparison of single cell AD mouse model studies and single cell human AD studies. We also address challenges in using human postmortem and mouse tissues and outline future developments in single cell sequencing data analysis. Importantly, we have implemented our recommended workflow for each major analytic direction and applied them to a large single nucleus RNA-sequencing (snRNA-seq) dataset in AD. Key analytic results are reported while the scripts and the data are shared with the research community through GitHub. In summary, this comprehensive review provides insights into various approaches to analyze single cell sequencing data and offers specific guidelines for study design and a variety of analytic directions. 
The review and the accompanying software tools will serve as a valuable resource for studying cellular and molecular mechanisms of AD, other diseases, or biological systems at the single-cell level.
Affiliation(s)
- Minghui Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Won-min Song
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Chen Ming
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Qian Wang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Xianxiao Zhou
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Peng Xu
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Azra Krek
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Yonejung Yoon
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Lap Ho
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| | - Miranda E. Orr
- Department of Internal Medicine, Section of Gerontology and Geriatric Medicine, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
- Sticht Center for Healthy Aging and Alzheimer’s Prevention, Wake Forest School of Medicine, Winston-Salem, North Carolina USA
| | - Guo-Cheng Yuan
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, New York, NY 10029 USA
| | - Bin Zhang
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Mount Sinai Center for Transformative Disease Modeling, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Icahn Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, 1470 Madison Avenue, Room S8-111, New York, NY 10029 USA
| |
|
18
|
Jiang R, Sun T, Song D, Li JJ. Statistics or biology: the zero-inflation controversy about scRNA-seq data. Genome Biol 2022; 23:31. [PMID: 35063006 PMCID: PMC8783472 DOI: 10.1186/s13059-022-02601-5] [Citation(s) in RCA: 130] [Impact Index Per Article: 65.0] [Received: 09/27/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022] Open
Abstract
Researchers view vast zeros in single-cell RNA-seq data differently: some regard zeros as biological signals representing no or low gene expression, while others regard zeros as missing data to be corrected. To help address the controversy, here we discuss the sources of biological and non-biological zeros; introduce five mechanisms of adding non-biological zeros in computational benchmarking; evaluate the impacts of non-biological zeros on data analysis; benchmark three input data types: observed counts, imputed counts, and binarized counts; discuss the open questions regarding non-biological zeros; and advocate the importance of transparent analysis.
Affiliation(s)
- Ruochen Jiang
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
| | - Tianyi Sun
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
| | - Dongyuan Song
- Bioinformatics Interdepartmental Ph.D. Program, University of California, Los Angeles, 90095-7246, CA, USA
| | - Jingyi Jessica Li
- Department of Statistics, University of California, Los Angeles, 90095-1554, CA, USA
- Department of Human Genetics, University of California, Los Angeles, 90095-7088, CA, USA
- Department of Computational Medicine, University of California, Los Angeles, 90095-1766, CA, USA
- Department of Biostatistics, University of California, Los Angeles, 90095-1772, CA, USA
| |
|
19
|
Shrivastava H, Zhang X, Song L, Aluru S. GRNUlar: A Deep Learning Framework for Recovering Single-Cell Gene Regulatory Networks. J Comput Biol 2022; 29:27-44. [PMID: 35050715 DOI: 10.1089/cmb.2021.0437] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/20/2022] Open
Abstract
We propose GRNUlar, a novel deep learning framework for supervised learning of gene regulatory networks (GRNs) from single-cell RNA-Sequencing (scRNA-Seq) data. Our framework incorporates two intertwined models. First, we leverage the expressive ability of neural networks to capture complex dependencies between transcription factors and the corresponding genes they regulate, by developing a multitask learning framework. Second, to capture sparsity of GRNs observed in the real world, we design an unrolled algorithm technique for our framework. Our deep architecture requires supervision for training, for which we repurpose existing synthetic data simulators that generate scRNA-Seq data guided by an underlying GRN. Experimental results demonstrate that GRNUlar outperforms state-of-the-art methods on both synthetic and real data sets. Our study also demonstrates the novel and successful use of expression data simulators for supervised learning of GRN inference.
Affiliation(s)
- Harsh Shrivastava
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Xiuwei Zhang
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Le Song
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
| | - Srinivas Aluru
- Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia, USA
| |
|
20
|
Constructing local cell-specific networks from single-cell data. Proc Natl Acad Sci U S A 2021; 118:2113178118. [PMID: 34903665 PMCID: PMC8713783 DOI: 10.1073/pnas.2113178118] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Accepted: 11/09/2021] [Indexed: 11/18/2022] Open
Abstract
Understanding gene regulatory networks is a topic of great interest because it can provide insights into cellular development, and identify factors that differ between normal and abnormal cells and phenotypes. Single-cell RNA sequencing provides a unique opportunity to gain understanding at the cellular level, but the technical features of the data create severe challenges when constructing gene networks. We develop a method that successfully skirts these challenges to estimate a cell-specific network for each single cell and cell type. Application of our algorithm to two brain cell samples furthers our understanding of autism spectrum disorder by examining the evolution of gene networks in fetal brain cells and comparing the networks of cells sampled from case and control subjects. Gene coexpression networks yield critical insights into biological processes, and single-cell RNA sequencing provides an opportunity to target inquiries at the cellular level. However, due to the sparsity and heterogeneity of transcript counts, it is challenging to construct accurate gene networks. We develop an approach, locCSN, that estimates cell-specific networks (CSNs) for each cell, preserving information about cellular heterogeneity that is lost with other approaches. LocCSN is based on a nonparametric investigation of the joint distribution of gene expression; hence it can readily detect nonlinear correlations, and it is more robust to distributional challenges. Although individual CSNs are estimated with considerable noise, average CSNs provide stable estimates of networks, which reveal gene communities better than traditional measures. Additionally, we propose downstream analysis methods using CSNs to utilize more fully the information contained within them. Repeated estimates of gene networks facilitate testing for differences in network structure between cell groups. 
Notably, with this approach, we can identify differential network genes, which typically do not differ in gene expression, but do differ in terms of the coexpression networks. These genes might help explain the etiology of disease. Finally, to further our understanding of autism spectrum disorder, we examine the evolution of gene networks in fetal brain cells and compare the CSNs of cells sampled from case and control subjects to reveal intriguing patterns in gene coexpression.
|
21
|
Liu W, Jiang Y, Peng L, Sun X, Gan W, Zhao Q, Tang H. Inferring Gene Regulatory Networks Using the Improved Markov Blanket Discovery Algorithm. Interdiscip Sci 2021; 14:168-181. [PMID: 34495484 DOI: 10.1007/s12539-021-00478-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 05/07/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 11/26/2022]
Abstract
Inferring gene regulatory networks (GRNs) from microarray data can help us understand the mechanisms of life and eventually develop effective therapies. Currently, many computational methods have been used in inferring GRNs. However, owing to high-dimensional data and small samples, these methods often tend to introduce redundant regulatory relationships. Therefore, a novel network inference method based on the improved Markov blanket discovery algorithm, IMBDANET, is proposed to infer GRNs. Specifically, for each target gene, data processing inequality was applied to the Markov blanket discovery algorithm for the accurate differentiation of direct regulatory genes from indirect regulatory genes. Finally, direct regulatory genes were used in constructing GRNs, and the network structure was optimized according to the importance degree score. Experimental results on six public network datasets show that the proposed method can be effectively used to infer GRNs.
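The data processing inequality mentioned above says that if gene A influences gene C only through gene B, the mutual information between A and C cannot exceed that of either direct link. A minimal sketch of the resulting pruning rule, applied to a table of pairwise mutual-information values, is given below. This is a generic triplet-based filter, not the IMBDANET algorithm itself (which combines the inequality with Markov blanket discovery); the function name `dpi_prune`, the `tolerance` parameter, and the toy values are assumptions.

```python
from itertools import combinations


def dpi_prune(mi, tolerance=0.0):
    """Drop the weakest edge of every fully connected gene triplet.

    mi maps frozenset({gene_a, gene_b}) to a mutual-information estimate.
    An edge is removed when it is strictly weaker than both other edges of
    some triplet, after shrinking those by the optional tolerance fraction.
    """
    genes = sorted({g for pair in mi for g in pair})
    removed = set()
    for a, b, c in combinations(genes, 3):
        edges = [frozenset((a, b)), frozenset((a, c)), frozenset((b, c))]
        if not all(e in mi for e in edges):
            continue  # the rule only applies to complete triangles
        weakest = min(edges, key=lambda e: mi[e])
        strongest_two = [e for e in edges if e != weakest]
        if mi[weakest] < (1 - tolerance) * min(mi[e] for e in strongest_two):
            removed.add(weakest)
    # Decisions are made on the original table, then applied all at once.
    return set(mi) - removed


# Toy values (hypothetical) mimicking a chain A -> B -> C.
mi = {
    frozenset(("A", "B")): 0.8,
    frozenset(("B", "C")): 0.7,
    frozenset(("A", "C")): 0.3,
}
survivors = dpi_prune(mi)
print(sorted(tuple(sorted(e)) for e in survivors))  # [('A', 'B'), ('B', 'C')]
```

In the toy example the A-C association is treated as indirect and removed, leaving the two direct links.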
Affiliation(s)
- Wei Liu
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
- Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
| | - Yi Jiang
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
| | - Li Peng
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, China
| | - Xingen Sun
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
| | - Wenqing Gan
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
| | - Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
| | - Huanrong Tang
- School of Computer Science, Xiangtan University, Xiangtan, 411105, China
| |
|
22
|
Trinh HC, Kwon YK. A novel constrained genetic algorithm-based Boolean network inference method from steady-state gene expression data. Bioinformatics 2021; 37:i383-i391. [PMID: 34252959 PMCID: PMC8275338 DOI: 10.1093/bioinformatics/btab295] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 04/24/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION It is a challenging problem in systems biology to infer both the network structure and dynamics of a gene regulatory network from steady-state gene expression data. Some methods based on Boolean or differential equation models have been proposed, but they were not efficient in inferring large-scale networks. Therefore, it is necessary to develop a method to infer the network structure and dynamics accurately on large-scale networks using steady-state expression. RESULTS In this study, we propose a novel constrained genetic algorithm-based Boolean network inference (CGA-BNI) method where a Boolean canalyzing update rule scheme was employed to capture coarse-grained dynamics. Given steady-state gene expression data as an input, CGA-BNI identifies a set of path consistency-based constraints by comparing the gene expression level between the wild-type and the mutant experiments. It then searches for Boolean networks that satisfy the constraints and induce attractors most similar to steady-state expressions. We devised a heuristic mutation operation for faster convergence and implemented a parallel evaluation routine for execution time reduction. Through extensive simulations on the artificial and the real gene expression datasets, CGA-BNI showed better performance than four other existing methods in terms of both structural and dynamics prediction accuracies. Taken together, CGA-BNI is a promising tool to predict both the structure and the dynamics of a gene regulatory network when the highest accuracy is needed at the cost of execution time. AVAILABILITY AND IMPLEMENTATION Source code and data are freely available at https://github.com/csclab/CGA-BNI. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
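Comparing a candidate network's attractors with steady-state expression first requires computing those attractors. The sketch below does this exhaustively for a small Boolean network. It assumes synchronous updating, which the abstract does not state, and it is not the CGA-BNI implementation; the helper names and the two-gene toy network are illustrative only.

```python
from itertools import product


def synchronous_step(state, update_functions):
    """Apply every update function at once to obtain the next state."""
    return tuple(update(state) for update in update_functions)


def find_attractors(update_functions):
    """Return all attractors (fixed points and cycles) as frozensets of states."""
    n = len(update_functions)
    attractors = set()
    for start in product((0, 1), repeat=n):
        visit_order = {}
        state = start
        # Follow the trajectory until a state repeats.
        while state not in visit_order:
            visit_order[state] = len(visit_order)
            state = synchronous_step(state, update_functions)
        # Every state visited at or after the repeated one lies on the cycle.
        first_on_cycle = visit_order[state]
        cycle = frozenset(s for s, i in visit_order.items() if i >= first_on_cycle)
        attractors.add(cycle)
    return attractors


# Toy network (hypothetical): two mutually repressing genes.
toggle = [
    lambda s: 1 - s[1],
    lambda s: 1 - s[0],
]
attractors = find_attractors(toggle)
for attractor in sorted(attractors, key=lambda a: (len(a), sorted(a))):
    print(sorted(attractor))
```

For this toy network the search finds two fixed points, (0, 1) and (1, 0), plus a two-state cycle between (0, 0) and (1, 1) that arises only because both genes update simultaneously. Exhaustive enumeration is feasible only for small networks, since the state space doubles with every added gene.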
Affiliation(s)
- Hung-Cuong Trinh
- Faculty of Information Technology, Ton Duc Thang University, Ho Chi Minh 758307, Vietnam
| | - Yung-Keun Kwon
- Department of IT Convergence, University of Ulsan, Ulsan 680-749, Korea
| |
|
23
|
Liu J, Fan Z, Zhao W, Zhou X. Machine Intelligence in Single-Cell Data Analysis: Advances and New Challenges. Front Genet 2021; 12:655536. [PMID: 34135939 PMCID: PMC8203333 DOI: 10.3389/fgene.2021.655536] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/19/2021] [Accepted: 04/26/2021] [Indexed: 12/18/2022] Open
Abstract
The rapid development of single-cell technologies allows for dissecting cellular heterogeneity at different omics layers with an unprecedented resolution. In-depth analysis of cellular heterogeneity will boost our understanding of complex biological systems or processes, including cancer, immune system and chronic diseases, thereby providing valuable insights for clinical and translational research. In this review, we will focus on the application of machine learning methods in single-cell multi-omics data analysis. We will start with the pre-processing of single-cell RNA sequencing (scRNA-seq) data, including data imputation, cross-platform batch effect removal, and cell cycle and cell-type identification. Next, we will introduce advanced data analysis tools and methods used for copy number variance estimate, single-cell pseudo-time trajectory analysis, phylogenetic tree inference, cell-cell interaction, regulatory network inference, and integrated analysis of scRNA-seq and spatial transcriptome data. Finally, we will present the latest analytical challenges, such as multi-omics integration and integrated analysis of scRNA-seq data.
Affiliation(s)
- Jiajia Liu
- College of Electronic and Information Engineering, Tongji University, Shanghai, China
- School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
| | - Zhiwei Fan
- School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
- West China School of Public Health, West China Fourth Hospital, Sichuan University, Chengdu, China
| | - Weiling Zhao
- School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
| | - Xiaobo Zhou
- School of Biomedical Informatics, The University of Texas Health Science Centre at Houston, Houston, TX, United States
| |
|
24
|
Nguyen H, Tran D, Tran B, Pehlivan B, Nguyen T. A comprehensive survey of regulatory network inference methods using single cell RNA sequencing data. Brief Bioinform 2021; 22:bbaa190. [PMID: 34020546 PMCID: PMC8138892 DOI: 10.1093/bib/bbaa190] [Citation(s) in RCA: 60] [Impact Index Per Article: 20.0] [Received: 10/07/2019] [Revised: 06/19/2020] [Accepted: 07/24/2020] [Indexed: 12/13/2022] Open
Abstract
A gene regulatory network is a complicated set of interactions between genetic materials, which dictates how cells develop in living organisms and react to their surrounding environment. Robust comprehension of these interactions would help explain how cells function as well as predict their reactions to external factors. This knowledge can benefit both developmental biology and clinical research such as drug development or epidemiology research. Recently, the rapid advance of single-cell sequencing technologies, which pushed the limit of transcriptomic profiling to the individual cell level, has opened up an entirely new area for regulatory network research. To exploit this new abundant source of data and take advantage of data in single-cell resolution, a number of computational methods have been proposed to uncover the interactions hidden by the averaging process in standard bulk sequencing. In this article, we review 15 such network inference methods developed for single-cell data. We discuss their underlying assumptions, inference techniques, usability, and pros and cons. In an extensive analysis using simulation, we also assess the methods' performance, sensitivity to dropout and time complexity. The main objective of this survey is to assist not only life scientists in selecting suitable methods for their data and analysis purposes but also computational scientists in developing new methods by highlighting outstanding challenges in the field that remain to be addressed in future development.
Affiliation(s)
- Hung Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Duc Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bang Tran
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Bahadir Pehlivan
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| | - Tin Nguyen
- Department of Computer Science and Engineering, University of Nevada, Reno, NV 89557
| |
|
25
|
Li L, Xiong F, Wang Y, Zhang S, Gong Z, Li X, He Y, Shi L, Wang F, Liao Q, Xiang B, Zhou M, Li X, Li Y, Li G, Zeng Z, Xiong W, Guo C. What are the applications of single-cell RNA sequencing in cancer research: a systematic review. J Exp Clin Cancer Res 2021; 40:163. [PMID: 33975628 PMCID: PMC8111731 DOI: 10.1186/s13046-021-01955-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/26/2021] [Accepted: 04/20/2021] [Indexed: 12/18/2022]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a tool for studying gene expression at the single-cell level that has been widely used due to its unprecedented high resolution. In the present review, we outline the preparation process and sequencing platforms for the scRNA-seq analysis of solid tumor specimens and discuss the main steps and methods used during data analysis, including quality control, batch-effect correction, normalization, cell cycle phase assignment, clustering, cell trajectory and pseudo-time reconstruction, differential expression analysis and gene set enrichment analysis, as well as gene regulatory network inference. Traditional bulk RNA sequencing does not address the heterogeneity within and between tumors, and since the development of the first scRNA-seq technique, this approach has been widely used in cancer research to better understand cancer cell biology and pathogenetic mechanisms. ScRNA-seq has been of great significance for the development of targeted therapy and immunotherapy. In the second part of this review, we focus on the application of scRNA-seq in solid tumors, and summarize the findings and achievements in tumor research afforded by its use. ScRNA-seq holds promise for improving our understanding of the molecular characteristics of cancer, and potentially contributing to improved diagnosis, prognosis, and therapeutics.
Affiliation(s)
- Lvyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Fang Xiong
- Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Yumin Wang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China; Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Shanshan Zhang
- Department of Stomatology, Xiangya Hospital, Central South University, Changsha, China
| | - Zhaojian Gong
- Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Xiayu Li
- Hunan Key Laboratory of Nonresolving Inflammation and Cancer, Disease Genome Research Center, The Third Xiangya Hospital, Central South University, Changsha, China
| | - Yi He
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
| | - Lei Shi
- Department of Oral and Maxillofacial Surgery, The Second Xiangya Hospital, Central South University, Changsha, China
| | - Fuyan Wang
- Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Qianjin Liao
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China
| | - Bo Xiang
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Ming Zhou
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Xiaoling Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Yong Li
- Department of Medicine, Dan L Duncan Comprehensive Cancer Center, Baylor College of Medicine, Houston, TX, USA
| | - Guiyuan Li
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Zhaoyang Zeng
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Wei Xiong
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| | - Can Guo
- NHC Key Laboratory of Carcinogenesis and Hunan Key Laboratory of Cancer Metabolism, Hunan Cancer Hospital and the Affiliated Cancer Hospital of Xiangya School of Medicine, Central South University, Changsha, China; Key Laboratory of Carcinogenesis and Cancer Invasion of the Chinese Ministry of Education, Cancer Research Institute, Central South University, Changsha, China
| |
|
26
|
Zhang Y, Chang X, Liu X. Inference of gene regulatory networks using pseudo-time series data. Bioinformatics 2021; 37:2423-2431. [PMID: 33576787 DOI: 10.1093/bioinformatics/btab099] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/04/2020] [Revised: 01/18/2021] [Accepted: 02/10/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Inferring gene regulatory networks (GRNs) from high-throughput data is an important and challenging problem in systems biology. Although numerous GRN methods have been developed, most have focused on verification with a specific data set. However, it is difficult to establish directed topological networks that are suitable for both time-series and non-time-series datasets due to the complexity and diversity of biological networks. RESULTS: Here, we proposed a novel method, GNIPLR (Gene networks inference based on projection and lagged regression), to infer GRNs from time-series or non-time-series gene expression data. GNIPLR projected gene data twice using the LASSO projection (LSP) algorithm and the linear projection (LP) approximation to produce a linear and monotonous pseudo-time series, and then determined the direction of regulation in combination with lagged regression analyses. The proposed algorithm was validated using simulated and real biological data. Moreover, we also applied the GNIPLR algorithm to the liver hepatocellular carcinoma (LIHC) and bladder urothelial carcinoma (BLCA) cancer expression datasets. These analyses revealed significantly higher accuracy and AUC values than other popular methods. AVAILABILITY: The GNIPLR tool is freely available at https://github.com/zyllluck/GNIPLR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuelei Zhang
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China; Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China; School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
| | - Xiao Chang
- Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China
| | - Xiaoping Liu
- Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China; School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
| |
|
27
|
Béal J, Pantolini L, Noël V, Barillot E, Calzone L. Personalized logical models to investigate cancer response to BRAF treatments in melanomas and colorectal cancers. PLoS Comput Biol 2021; 17:e1007900. [PMID: 33507915 PMCID: PMC7872233 DOI: 10.1371/journal.pcbi.1007900] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 04/16/2020] [Revised: 02/09/2021] [Accepted: 12/21/2020] [Indexed: 11/19/2022] Open
Abstract
The study of response to cancer treatments has benefited greatly from the contribution of different omics data but their interpretation is sometimes difficult. Some mathematical models based on prior biological knowledge of signaling pathways facilitate this interpretation but often require fitting of their parameters using perturbation data. We propose a more qualitative mechanistic approach, based on logical formalism and on the sole mapping and interpretation of omics data, and able to recover differences in sensitivity to gene inhibition without model training. This approach is showcased by the study of BRAF inhibition in patients with melanomas and colorectal cancers who experience significant differences in sensitivity despite similar omics profiles. We first gather information from literature and build a logical model summarizing the regulatory network of the mitogen-activated protein kinase (MAPK) pathway surrounding BRAF, with factors involved in the BRAF inhibition resistance mechanisms. The relevance of this model is verified by automatically assessing that it qualitatively reproduces response or resistance behaviors identified in the literature. Data from over 100 melanoma and colorectal cancer cell lines are then used to validate the model's ability to explain differences in sensitivity. This generic model is transformed into personalized cell line-specific logical models by integrating the omics information of the cell lines as constraints of the model. The use of mutations alone allows personalized models to correlate significantly with experimental sensitivities to BRAF inhibition, both from drug and CRISPR targeting, and even better with the joint use of mutations and RNA, supporting multi-omics mechanistic models. A comparison of these untrained models with learning approaches highlights similarities in interpretation and complementarity depending on the size of the datasets. This parsimonious pipeline, which can easily be extended to other biological questions, makes it possible to explore the mechanistic causes of the response to treatment, on an individualized basis.
Affiliation(s)
- Jonas Béal
- Institut Curie, PSL Research University, Paris, France
- INSERM, U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| | - Lorenzo Pantolini
- Institut Curie, PSL Research University, Paris, France
- INSERM, U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| | - Vincent Noël
- Institut Curie, PSL Research University, Paris, France
- INSERM, U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| | - Emmanuel Barillot
- Institut Curie, PSL Research University, Paris, France
- INSERM, U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| | - Laurence Calzone
- Institut Curie, PSL Research University, Paris, France
- INSERM, U900, Paris, France
- MINES ParisTech, PSL Research University, CBIO-Centre for Computational Biology, Paris, France
| |
|
28
|
Cha J, Lee I. Single-cell network biology for resolving cellular heterogeneity in human diseases. Exp Mol Med 2020; 52:1798-1808. [PMID: 33244151 PMCID: PMC8080824 DOI: 10.1038/s12276-020-00528-0] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 07/30/2020] [Revised: 08/26/2020] [Accepted: 08/31/2020] [Indexed: 01/10/2023] Open
Abstract
Understanding cellular heterogeneity is the holy grail of biology and medicine. Cells harboring identical genomes show a wide variety of behaviors in multicellular organisms. Genetic circuits underlying cell-type identities will facilitate the understanding of the regulatory programs for differentiation and maintenance of distinct cellular states. Such a cell-type-specific gene network can be inferred from coregulatory patterns across individual cells. Conventional methods of transcriptome profiling using tissue samples provide only average signals of diverse cell types. Therefore, reconstructing gene regulatory networks for a particular cell type is not feasible with tissue-based transcriptome data. Recently, single-cell omics technology has emerged and enabled the capture of the transcriptomic landscape of every individual cell. Although single-cell gene expression studies have already opened up new avenues, network biology using single-cell transcriptome data will further accelerate our understanding of cellular heterogeneity. In this review, we provide an overview of single-cell network biology and summarize recent progress in method development for network inference from single-cell RNA sequencing (scRNA-seq) data. Then, we describe how cell-type-specific gene networks can be utilized to study regulatory programs specific to disease-associated cell types and cellular states. Moreover, with scRNA data, modeling personal or patient-specific gene networks is feasible. Therefore, we also introduce potential applications of single-cell network biology for precision medicine. We envision a rapid paradigm shift toward single-cell network analysis for systems biology in the near future.
Gene regulatory networks reconstructed from single-cell RNA sequencing datasets are allowing researchers to better understand the molecular circuits and cell states that contribute to complex human disease. Junha Cha and Insuk Lee from Yonsei University in Seoul, South Korea, review the concept of ‘single-cell network biology’, which involves using computational algorithms on genetic expression data from thousands of cells to infer functional interactions in various biological contexts. This systems biology approach to analyzing the profiles of messenger RNA in single cells is helping researchers discover new signaling pathways that could serve as disease biomarkers or therapeutic targets. In the future, patient-specific models of personal gene networks could explain why certain genetic variants affect disease risk. This research could also eventually lead to new types of individualized medical treatments.
|
29
|
Dai H, Jin QQ, Li L, Chen LN. Reconstructing gene regulatory networks in single-cell transcriptomic data analysis. Zool Res 2020; 41:599-604. [PMID: 33124218 PMCID: PMC7671911 DOI: 10.24272/j.issn.2095-8137.2020.215] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/03/2020] [Accepted: 10/20/2020] [Indexed: 11/07/2022] Open
Abstract
Gene regulatory networks play pivotal roles in our understanding of biological processes/mechanisms at the molecular level. Many studies have developed sample-specific or cell-type-specific gene regulatory networks from single-cell transcriptomic data based on a large amount of cell samples. Here, we review the state-of-the-art computational algorithms and describe various applications of gene regulatory networks in biological studies.
Affiliation(s)
- Hao Dai
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Institute of Brain-Intelligence Technology, Zhangjiang Laboratory, Shanghai 201210, China
| | - Qi-Qi Jin
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
| | - Lin Li
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Luo-Nan Chen
- Key Laboratory of Systems Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, Zhejiang 310024, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, Yunnan 650223, China
| |
|
30
|
Liu W, Sun X, Peng L, Zhou L, Lin H, Jiang Y. RWRNET: A Gene Regulatory Network Inference Algorithm Using Random Walk With Restart. Front Genet 2020; 11:591461. [PMID: 33101398 PMCID: PMC7545090 DOI: 10.3389/fgene.2020.591461] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 08/04/2020] [Accepted: 09/02/2020] [Indexed: 11/30/2022] Open
Abstract
Inferring gene regulatory networks from expression data is essential in identifying complex regulatory relationships among genes and revealing the mechanism of certain diseases. Various computation methods have been developed for inferring gene regulatory networks. However, these methods focus on the local topology of the network rather than on the global topology. From a network optimisation standpoint, emphasising the global topology of the network also reduces redundant regulatory relationships. In this study, we propose a novel network inference algorithm using Random Walk with Restart (RWRNET) that combines local and global topology relationships. The method first captures the local topology through three elements of random walk and then combines the local topology with the global topology by Random Walk with Restart. The Markov Blanket discovery algorithm is then used to deal with isolated genes. The proposed method is compared with several state-of-the-art methods on the basis of six benchmark datasets. Experimental results demonstrated the effectiveness of the proposed method.
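For readers unfamiliar with the Random Walk with Restart step named in the abstract above, the sketch below shows the generic iteration p ← (1 − r)·W·p + r·p0 on a small weighted graph. It is a textbook formulation under stated assumptions, not the RWRNET code: the function name, the restart probability of 0.3 and the toy adjacency matrix are all hypothetical.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarting at `seed`.

    `adj[i][j]` is the (non-negative) weight of the edge between nodes i and j.
    """
    n = len(adj)
    # Column-normalise so that trans[i][j] is the probability of moving j -> i.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        # One step: follow an edge with prob. (1 - restart), else jump to seed.
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 (hypothetical example data).
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = random_walk_with_restart(path_graph, seed=0)
```

The resulting scores decay with distance from the seed node, which is the sense in which the restart term mixes local neighbourhood information with the global structure of the graph.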
Affiliation(s)
- Wei Liu
- School of Computer Science, Xiangtan University, Xiangtan, China; Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, China
| | - Xingen Sun
- School of Computer Science, Xiangtan University, Xiangtan, China
| | - Li Peng
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
| | - Lili Zhou
- School of Computer Science, Xiangtan University, Xiangtan, China
| | - Hui Lin
- School of Computer Science, Xiangtan University, Xiangtan, China
| | - Yi Jiang
- School of Computer Science, Xiangtan University, Xiangtan, China
| |
|
31
|
Shi N, Zhu Z, Tang K, Parker D, He S. ATEN: And/Or tree ensemble for inferring accurate Boolean network topology and dynamics. Bioinformatics 2020; 36:578-585. [PMID: 31368481 DOI: 10.1093/bioinformatics/btz563] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/28/2018] [Revised: 07/02/2019] [Accepted: 07/24/2019] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Inferring gene regulatory networks from gene expression time series data is important for gaining insights into the complex processes of cell life. A popular approach is to infer Boolean networks. However, it is still a pressing open problem to infer accurate Boolean networks from experimental data that are typically short and noisy. RESULTS: To address the problem, we propose a Boolean network inference algorithm which is able to infer accurate Boolean network topology and dynamics from short and noisy time series data. The main idea is that, for each target gene, we use an And/Or tree ensemble algorithm to select prime implicants of which each is a conjunction of a set of input genes. The selected prime implicants are important features for predicting the states of the target gene. Using these important features we then infer the Boolean function of the target gene. Finally, the Boolean functions of all target genes are combined as a Boolean network. Using the data generated from artificial and real-world gene regulatory networks, we show that our algorithm can infer more accurate Boolean network topology and dynamics from short and noisy time series data than other algorithms. Our algorithm enables us to gain better insights into complex regulatory mechanisms of cell life. AVAILABILITY AND IMPLEMENTATION: Package ATEN is freely available at https://github.com/ningshi/ATEN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ning Shi
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK
| | - Zexuan Zhu
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China
| | - Ke Tang
- Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - David Parker
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK
| | - Shan He
- School of Computer Science, University of Birmingham, Birmingham B15 2TT, UK; Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| |
|
32
|
Pratapa A, Jalihal AP, Law JN, Bharadwaj A, Murali TM. Benchmarking algorithms for gene regulatory network inference from single-cell transcriptomic data. Nat Methods 2020; 17:147-154. [PMID: 31907445 PMCID: PMC7098173 DOI: 10.1038/s41592-019-0690-6] [Citation(s) in RCA: 331] [Impact Index Per Article: 82.8] [Received: 06/04/2019] [Accepted: 11/22/2019] [Indexed: 01/10/2023]
Abstract
We present a systematic evaluation of state-of-the-art algorithms for inferring gene regulatory networks from single-cell transcriptional data. As the ground truth for assessing accuracy, we use synthetic networks with predictable trajectories, literature-curated Boolean models and diverse transcriptional regulatory networks. We develop a strategy to simulate single-cell transcriptional data from synthetic and Boolean networks that avoids pitfalls of previously used methods. Furthermore, we collect networks from multiple experimental single-cell RNA-seq datasets. We develop an evaluation framework called BEELINE. We find that the area under the precision-recall curve and early precision of the algorithms are moderate. The methods are better in recovering interactions in synthetic networks than Boolean models. The algorithms with the best early precision values for Boolean models also perform well on experimental datasets. Techniques that do not require pseudotime-ordered cells are generally more accurate. Based on these results, we present recommendations to end users. BEELINE will aid the development of gene regulatory network inference algorithms.
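The two accuracy measures named in the abstract above can be illustrated on a toy edge ranking. This sketch assumes the common definitions: early precision as the fraction of true edges among the top-k predictions, with k equal to the number of edges in the reference network, and average precision as a step-wise approximation of the area under the precision-recall curve. The function names and the example edges are hypothetical and do not reproduce the BEELINE code.

```python
def early_precision(ranked_edges, true_edges):
    """Fraction of true edges among the top-k predictions, k = len(true_edges)."""
    k = len(true_edges)
    if k == 0:
        raise ValueError("reference network has no edges")
    return sum(1 for edge in ranked_edges[:k] if edge in true_edges) / k


def average_precision(ranked_edges, true_edges):
    """Mean of the precision values at each rank where a true edge is recovered."""
    if not true_edges:
        raise ValueError("reference network has no edges")
    hits = 0
    precision_sum = 0.0
    for rank, edge in enumerate(ranked_edges, start=1):
        if edge in true_edges:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / len(true_edges)


# Hypothetical reference network and a method's edges sorted by confidence.
reference = {("A", "B"), ("B", "C")}
predicted = [("A", "B"), ("C", "A"), ("B", "C"), ("A", "C")]

ep = early_precision(predicted, reference)    # 1 of the top 2 is correct
ap = average_precision(predicted, reference)  # (1/1 + 2/3) / 2
```

Both functions take edges as (regulator, target) tuples, so direction matters: ("C", "A") does not count as a hit for a reference edge ("A", "C").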
Affiliation(s)
- Aditya Pratapa
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
| | - Amogh P Jalihal
- Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
| | - Jeffrey N Law
- Genetics, Bioinformatics, and Computational Biology Ph.D. Program, Virginia Tech, Blacksburg, VA, USA
| | - Aditya Bharadwaj
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
| | - T M Murali
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
| |
|
33
|
Mercatelli D, Scalambra L, Triboli L, Ray F, Giorgi FM. Gene regulatory network inference resources: A practical overview. Biochim Biophys Acta Gene Regul Mech 2019; 1863:194430. [PMID: 31678629 DOI: 10.1016/j.bbagrm.2019.194430] [Citation(s) in RCA: 58] [Impact Index Per Article: 11.6] [Received: 05/31/2019] [Revised: 09/06/2019] [Accepted: 09/09/2019] [Indexed: 02/08/2023]
Abstract
Transcriptional regulation is a fundamental molecular mechanism involved in almost every aspect of life, from homeostasis to development, from metabolism to behavior, from reaction to stimuli to disease progression. In recent years, the concept of Gene Regulatory Networks (GRNs) has grown popular as an effective applied biology approach for describing the complex and highly dynamic set of transcriptional interactions, due to its easy-to-interpret features. Since cataloguing, predicting and understanding every GRN connection in all species and cellular contexts remains a great challenge for biology, researchers have developed numerous tools and methods to infer regulatory processes. In this review, we catalogue these methods in six major areas, based on the dominant underlying information leveraged to infer GRNs: Coexpression, Sequence Motifs, Chromatin Immunoprecipitation (ChIP), Orthology, Literature and Protein-Protein Interaction (PPI) specifically focused on transcriptional complexes. The methods described here cover a wide range of user-friendliness: from web tools that require no prior computational expertise to command line programs and algorithms for large scale GRN inferences. Each method for GRN inference described herein effectively illustrates a type of transcriptional relationship, with many methods being complementary to others. While a truly holistic approach for inferring and displaying GRNs remains one of the greatest challenges in the field of systems biology, we believe that the integration of multiple methods described herein provides an effective means with which experimental and computational biologists alike may obtain the most complete pictures of transcriptional relationships. This article is part of a Special Issue entitled: Transcriptional Profiles and Regulatory Gene Networks edited by Dr. Federico Manuel Giorgi and Dr. Shaun Mahony.
Affiliation(s)
- Daniele Mercatelli
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Laura Scalambra
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy
| | - Luca Triboli
- Centre for Integrative Biology (CIBIO), University of Trento, Italy
| | - Forest Ray
- Department of Systems Biology, Columbia University Medical Center, New York, NY, United States
| | - Federico M Giorgi
- Department of Pharmacy and Biotechnology, University of Bologna, Bologna, Italy.
| |
|
34
|
Blencowe M, Arneson D, Ding J, Chen YW, Saleem Z, Yang X. Network modeling of single-cell omics data: challenges, opportunities, and progresses. Emerg Top Life Sci 2019; 3:379-398. [PMID: 32270049 PMCID: PMC7141415 DOI: 10.1042/etls20180176] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 04/16/2019] [Revised: 06/07/2019] [Accepted: 06/24/2019] [Indexed: 01/07/2023]
Abstract
Single-cell multi-omics technologies are rapidly evolving, prompting both methodological advances and biological discoveries at an unprecedented speed. Gene regulatory network modeling has been used as a powerful approach to elucidate the complex molecular interactions underlying biological processes and systems, yet its application in single-cell omics data modeling has been met with unique challenges and opportunities. In this review, we discuss these challenges and opportunities, and offer an overview of the recent development of network modeling approaches designed to capture dynamic networks, within-cell networks, and cell-cell interaction or communication networks. Finally, we outline the remaining gaps in single-cell gene network modeling and the outlooks of the field moving forward.
Affiliation(s)
- Montgomery Blencowe
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| | - Douglas Arneson
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| | - Jessica Ding
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| | - Yen-Wei Chen
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Molecular Toxicology Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| | - Zara Saleem
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| | - Xia Yang
- Department of Integrative Biology and Physiology, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Molecular Toxicology Program, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, 610 Charles E. Young Drive East, Los Angeles, CA 90095, U.S.A
| |
|
35
|
Iacono G, Massoni-Badosa R, Heyn H. Single-cell transcriptomics unveils gene regulatory network plasticity. Genome Biol 2019; 20:110. [PMID: 31159854 PMCID: PMC6547541 DOI: 10.1186/s13059-019-1713-4] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Received: 11/14/2018] [Accepted: 05/08/2019] [Indexed: 12/20/2022] Open
Abstract
BACKGROUND Single-cell RNA sequencing (scRNA-seq) plays a pivotal role in our understanding of cellular heterogeneity. Current analytical workflows are driven by categorizing principles that consider cells as individual entities and classify them into complex taxonomies. RESULTS We devise a conceptually different computational framework based on a holistic view, where single-cell datasets are used to infer global, large-scale regulatory networks. We develop correlation metrics that are specifically tailored to single-cell data, and then generate, validate, and interpret single-cell-derived regulatory networks from organs and perturbed systems, such as diabetes and Alzheimer's disease. Using tools from graph theory, we compute an unbiased quantification of a gene's biological relevance and accurately pinpoint key players in organ function and drivers of diseases. CONCLUSIONS Our approach detects multiple latent regulatory changes that are invisible to single-cell workflows based on clustering or differential expression analysis, significantly broadening the biological insights that can be obtained with this leading technology.
Affiliation(s)
- Giovanni Iacono
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain.
| | - Ramon Massoni-Badosa
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain
| | - Holger Heyn
- CNAG-CRG, Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Baldiri Reixac 4, 08028, Barcelona, Spain.
- Universitat Pompeu Fabra (UPF), Barcelona, Spain.
| |
|
36
|
Bonnaffoux A, Herbach U, Richard A, Guillemin A, Gonin-Giraud S, Gros PA, Gandrillon O. WASABI: a dynamic iterative framework for gene regulatory network inference. BMC Bioinformatics 2019; 20:220. [PMID: 31046682 PMCID: PMC6498543 DOI: 10.1186/s12859-019-2798-1] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/15/2019] [Accepted: 04/09/2019] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Inference of gene regulatory networks from gene expression data has been a long-standing and notoriously difficult task in systems biology. Recently, single-cell transcriptomic data have been massively used for gene regulatory network inference, with both successes and limitations. RESULTS In the present work we propose an iterative algorithm called WASABI, dedicated to inferring a causal dynamical network from time-stamped single-cell data, which tackles some of the limitations associated with current approaches. We first introduce the concept of waves, which posits that the information provided by an external stimulus will affect genes one-by-one through a cascade, like waves spreading through a network. This concept allows us to infer the network one gene at a time, after genes have been ordered regarding their time of regulation. We then demonstrate the ability of WASABI to correctly infer small networks, which have been simulated in silico using a mechanistic model consisting of coupled piecewise-deterministic Markov processes for the proper description of gene expression at the single-cell level. We finally apply WASABI on in vitro generated data on an avian model of erythroid differentiation. The structure of the resulting gene regulatory network sheds a new light on the molecular mechanisms controlling this process. In particular, we find no evidence for hub genes and a much more distributed network structure than expected. Interestingly, we find that a majority of genes are under the direct control of the differentiation-inducing stimulus. CONCLUSIONS Together, these results demonstrate WASABI versatility and ability to tackle some general gene regulatory networks inference issues. It is our hope that WASABI will prove useful in helping biologists to fully exploit the power of time-stamped single-cell data.
Affiliation(s)
- Arnaud Bonnaffoux
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
- Cosmotech, Lyon, France
| | - Ulysse Herbach
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
- Univ Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5208, Institut Camille Jordan, Villeurbanne, France
| | - Angélique Richard
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | - Anissa Guillemin
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | - Sandrine Gonin-Giraud
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
| | | | - Olivier Gandrillon
- University Lyon, ENS de Lyon, University Claude Bernard, CNRS UMR 5239, INSERM U1210, Laboratory of Biology and Modelling of the Cell, Lyon, France
- Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Lyon, France
| |
|
37
|
Massaia A, Chaves P, Samari S, Miragaia RJ, Meyer K, Teichmann SA, Noseda M. Single Cell Gene Expression to Understand the Dynamic Architecture of the Heart. Front Cardiovasc Med 2018; 5:167. [PMID: 30525044 PMCID: PMC6258739 DOI: 10.3389/fcvm.2018.00167] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 09/04/2018] [Accepted: 10/29/2018] [Indexed: 12/21/2022] Open
Abstract
The recent development of single cell gene expression technologies, and especially single cell transcriptomics, has revolutionized the way biologists and clinicians investigate organs and organisms, allowing an unprecedented level of resolution in the description of cell demographics in both healthy and diseased states. Single cell transcriptomics provide information on prevalence, heterogeneity, and gene co-expression at the individual cell level. This enables a cell-centric outlook to define intracellular gene regulatory networks and to bridge toward the definition of intercellular pathways otherwise masked in bulk analysis. The technologies have developed at a fast pace, producing a multitude of different approaches, with several alternatives to choose from at any step, including single cell isolation and capturing, lysis, RNA reverse transcription and cDNA amplification, library preparation, sequencing, and computational analyses. Here, we provide guidelines for the experimental design of single cell RNA sequencing experiments, exploring the current options for the crucial steps. Furthermore, we provide a complete overview of the typical data analysis workflow, from handling the raw sequencing data to making biological inferences. Significantly, advancements in single cell transcriptomics have already contributed to outstanding exploratory and functional studies of cardiac development and disease models, as summarized in this review. In conclusion, we discuss achievable outcomes of single cell transcriptomics' applications in addressing unanswered questions and influencing future cardiac clinical applications.
Affiliation(s)
- Andrea Massaia
- British Heart Foundation Centre of Research Excellence and British Heart Foundation Centre for Regenerative Medicine, National Heart and Lung Institute, Imperial College London, London, United Kingdom
| | - Patricia Chaves
- British Heart Foundation Centre of Research Excellence and British Heart Foundation Centre for Regenerative Medicine, National Heart and Lung Institute, Imperial College London, London, United Kingdom
| | - Sara Samari
- British Heart Foundation Centre of Research Excellence and British Heart Foundation Centre for Regenerative Medicine, National Heart and Lung Institute, Imperial College London, London, United Kingdom
| | | | - Kerstin Meyer
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Sarah Amalia Teichmann
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
| | - Michela Noseda
- British Heart Foundation Centre of Research Excellence and British Heart Foundation Centre for Regenerative Medicine, National Heart and Lung Institute, Imperial College London, London, United Kingdom
| |
|
38
|
Hon CC, Shin JW, Carninci P, Stubbington MJT. The Human Cell Atlas: Technical approaches and challenges. Brief Funct Genomics 2018; 17:283-294. [PMID: 29092000 PMCID: PMC6063304 DOI: 10.1093/bfgp/elx029] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/23/2022] Open
Abstract
The Human Cell Atlas is a large, international consortium that aims to identify and describe every cell type in the human body. The comprehensive cellular maps that arise from this ambitious effort have the potential to transform many aspects of fundamental biology and clinical practice. Here, we discuss the technical approaches that could be used today to generate such a resource and also the technical challenges that will be encountered.
Affiliation(s)
- Chung-Chau Hon
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | - Jay W Shin
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | - Piero Carninci
- RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama, Kanagawa, Japan
| | | |
|
39
|
Fiers MWEJ, Minnoye L, Aibar S, Bravo González-Blas C, Kalender Atak Z, Aerts S. Mapping gene regulatory networks from single-cell omics data. Brief Funct Genomics 2018; 17:246-254. [PMID: 29342231 PMCID: PMC6063279 DOI: 10.1093/bfgp/elx046] [Citation(s) in RCA: 141] [Impact Index Per Article: 23.5] [Indexed: 12/24/2022] Open
Abstract
Single-cell techniques are advancing rapidly and are yielding unprecedented insight into cellular heterogeneity. Mapping the gene regulatory networks (GRNs) underlying cell states provides attractive opportunities to mechanistically understand this heterogeneity. In this review, we discuss recently emerging methods to map GRNs from single-cell transcriptomics data, tackling the challenge of increased noise levels and data sparsity compared with bulk data, alongside increasing data volumes. Next, we discuss how new techniques for single-cell epigenomics, such as single-cell ATAC-seq and single-cell DNA methylation profiling, can be used to decipher gene regulatory programmes. We finally look forward to the application of single-cell multi-omics and perturbation techniques that will likely play important roles for GRN inference in the future.
Affiliation(s)
- Mark W E J Fiers
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
| | - Liesbeth Minnoye
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Sara Aibar
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Carmen Bravo González-Blas
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Zeynep Kalender Atak
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| | - Stein Aerts
- VIB Center for Brain & Disease Research, Laboratory of Computational Biology, Leuven, Belgium
- KU Leuven, Department of Human Genetics, Leuven, Belgium
| |
|
40
|
Chen S, Mar JC. Evaluating methods of inferring gene regulatory networks highlights their lack of performance for single cell gene expression data. BMC Bioinformatics 2018; 19:232. [PMID: 29914350 PMCID: PMC6006753 DOI: 10.1186/s12859-018-2217-z] [Citation(s) in RCA: 127] [Impact Index Per Article: 21.2] [Received: 10/28/2017] [Accepted: 05/24/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A fundamental fact in biology states that genes do not operate in isolation, and yet, methods that infer regulatory networks for single cell gene expression data have been slow to emerge. With single cell sequencing methods now becoming accessible, general network inference algorithms that were initially developed for data collected from bulk samples may not be suitable for single cells. Meanwhile, although methods that are specific for single cell data are now emerging, whether they have improved performance over general methods is unknown. In this study, we evaluate the applicability of five general methods and three single cell methods for inferring gene regulatory networks from both experimental single cell gene expression data and in silico simulated data. RESULTS Standard evaluation metrics using ROC curves and Precision-Recall curves against reference sets sourced from the literature demonstrated that most of the methods performed poorly when they were applied to either experimental single cell data, or simulated single cell data, which demonstrates their lack of performance for this task. Using default settings, network methods were applied to the same datasets. Comparisons of the learned networks highlighted the uniqueness of some predicted edges for each method. The fact that different methods infer networks that vary substantially reflects the underlying mathematical rationale and assumptions that distinguish network methods from each other. CONCLUSIONS This study provides a comprehensive evaluation of network modeling algorithms applied to experimental single cell gene expression data and in silico simulated datasets where the network structure is known. Comparisons demonstrate that most of these assessed network methods are not able to predict network structures from single cell expression data accurately, even if they are specifically developed for single cell methods. 
Also, single cell methods, which usually depend on more elaborate algorithms, generally show less similarity to each other in the sets of edges detected. The results from this study emphasize the importance of developing more accurate, optimized network modeling methods that are compatible with single cell data. Newly developed single cell methods may uniquely capture particular features of potential gene-gene relationships, and caution should be taken when interpreting these results.
Affiliation(s)
- Shuonan Chen
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
| | - Jessica C Mar
- Department of Systems and Computational Biology, Albert Einstein College of Medicine, Bronx, New York, USA
- Department of Epidemiology and Population Health, Albert Einstein College of Medicine, Bronx, New York, USA
- Australian Institute for Bioengineering and Nanotechnology, University of Queensland, Brisbane, QLD, Australia
| |
|
41
|
Woodhouse S, Piterman N, Wintersteiger CM, Göttgens B, Fisher J. SCNS: a graphical tool for reconstructing executable regulatory networks from single-cell genomic data. BMC Syst Biol 2018; 12:59. [PMID: 29801503 PMCID: PMC5970485 DOI: 10.1186/s12918-018-0581-y] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 02/18/2018] [Accepted: 04/10/2018] [Indexed: 11/25/2022]
Abstract
Background Reconstruction of executable mechanistic models from single-cell gene expression data represents a powerful approach to understanding developmental and disease processes. New ambitious efforts like the Human Cell Atlas will soon lead to an explosion of data with potential for uncovering and understanding the regulatory networks which underlie the behaviour of all human cells. In order to take advantage of this data, however, there is a need for general-purpose, user-friendly and efficient computational tools that can be readily used by biologists who do not have specialist computer science knowledge. Results The Single Cell Network Synthesis toolkit (SCNS) is a general-purpose computational tool for the reconstruction and analysis of executable models from single-cell gene expression data. Through a graphical user interface, SCNS takes single-cell qPCR or RNA-sequencing data taken across a time course, and searches for logical rules that drive transitions from early cell states towards late cell states. Because the resulting reconstructed models are executable, they can be used to make predictions about the effect of specific gene perturbations on the generation of specific lineages. Conclusions SCNS should be of broad interest to the growing number of researchers working in single-cell genomics and will help further facilitate the generation of valuable mechanistic insights into developmental, homeostatic and disease processes. Electronic supplementary material The online version of this article (10.1186/s12918-018-0581-y) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Steven Woodhouse
- Department of Hematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK
- Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
- Microsoft Research Cambridge, 21 Station Road, Cambridge, CB1 2FB, UK
| | - Nir Piterman
- Department of Informatics, University of Leicester, University Road, Leicester, LE1 7RH, UK
| | | | - Berthold Göttgens
- Department of Hematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, CB2 0XY, UK
- Wellcome Trust - Medical Research Council Cambridge Stem Cell Institute, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Jasmin Fisher
- Microsoft Research Cambridge, 21 Station Road, Cambridge, CB1 2FB, UK
- Department of Biochemistry, University of Cambridge, Cambridge, CB2 1QW, UK
| |
|
42
|
Mohammadi S, Ravindra V, Gleich DF, Grama A. A geometric approach to characterize the functional identity of single cells. Nat Commun 2018; 9:1516. [PMID: 29666373 PMCID: PMC5904143 DOI: 10.1038/s41467-018-03933-2] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 02/17/2017] [Accepted: 03/20/2018] [Indexed: 02/07/2023] Open
Abstract
Single-cell transcriptomic data has the potential to radically redefine our view of cell-type identity. Cells that were previously believed to be homogeneous are now clearly distinguishable in terms of their expression phenotype. Methods for automatically characterizing the functional identity of cells, and their associated properties, can be used to uncover processes involved in lineage differentiation as well as sub-typing cancer cells. They can also be used to suggest personalized therapies based on molecular signatures associated with pathology. We develop a new method, called ACTION, to infer the functional identity of cells from their transcriptional profile, classify them based on their dominant function, and reconstruct regulatory networks that are responsible for mediating their identity. Using ACTION, we identify novel Melanoma subtypes with differential survival rates and therapeutic responses, for which we provide biomarkers along with their underlying regulatory networks.
Affiliation(s)
- Shahin Mohammadi
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Vikram Ravindra
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| | - David F Gleich
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
| | - Ananth Grama
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA.
| |
|
43
|
Matsumoto H, Kiryu H, Furusawa C, Ko MSH, Ko SBH, Gouda N, Hayashi T, Nikaido I. SCODE: an efficient regulatory network inference algorithm from single-cell RNA-Seq during differentiation. Bioinformatics 2018; 33:2314-2321. [PMID: 28379368 PMCID: PMC5860123 DOI: 10.1093/bioinformatics/btx194] [Citation(s) in RCA: 220] [Impact Index Per Article: 36.7] [Received: 11/18/2016] [Accepted: 04/02/2017] [Indexed: 01/17/2023] Open
Abstract
Motivation The analysis of RNA-Seq data from individual differentiating cells enables us to reconstruct the differentiation process and the degree of differentiation (in pseudo-time) of each cell. Such analyses can reveal detailed expression dynamics and functional relationships for differentiation. To further elucidate differentiation processes, more insight into gene regulatory networks is required. The pseudo-time can be regarded as time information and, therefore, single-cell RNA-Seq data are time-course data with high time resolution. Although time-course data are useful for inferring networks, conventional inference algorithms for such data suffer from high time complexity when the number of samples and genes is large. Therefore, a novel algorithm is necessary to infer networks from single-cell RNA-Seq during differentiation. Results In this study, we developed the novel and efficient algorithm SCODE to infer regulatory networks, based on ordinary differential equations. We applied SCODE to three single-cell RNA-Seq datasets and confirmed that SCODE can reconstruct observed expression dynamics. We evaluated SCODE by comparing its inferred networks with use of a DNaseI-footprint based network. The performance of SCODE was best for two of the datasets and nearly best for the remaining dataset. We also compared the runtimes and showed that the runtimes for SCODE are significantly shorter than for alternatives. Thus, our algorithm provides a promising approach for further single-cell differentiation analyses. Availability and Implementation The R source code of SCODE is available at https://github.com/hmatsu1226/SCODE Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hirotaka Matsumoto
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
| | - Hisanori Kiryu
- Department of Computational Biology and Medical Sciences, Faculty of Frontier Sciences, The University of Tokyo, Kashiwa, Chiba 277-8561, Japan
| | - Chikara Furusawa
- Quantitative Biology Center (QBiC), RIKEN, Suita, Osaka 565-0874, Japan
- Universal Biology Institute, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
| | - Minoru S H Ko
- Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Shigeru B H Ko
- Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Norio Gouda
- Department of Systems Medicine, Keio University School of Medicine, Tokyo 160-8582, Japan
| | - Tetsutaro Hayashi
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
| | - Itoshi Nikaido
- Bioinformatics Research Unit, Advanced Center for Computing and Communication, RIKEN, Wako, Saitama 351-0198, Japan
| |
|
44
|
Papili Gao N, Ud-Dean SMM, Gandrillon O, Gunawan R. SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles. Bioinformatics 2017; 34:258-266. [PMID: 28968704 PMCID: PMC5860204 DOI: 10.1093/bioinformatics/btx575] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Received: 12/06/2016] [Revised: 06/12/2017] [Accepted: 09/13/2017] [Indexed: 11/13/2022] Open
Abstract
Motivation Single cell transcriptional profiling opens up a new avenue in studying the functional role of cell-to-cell variability in physiological processes. The analysis of single cell expression profiles creates new challenges due to the distributive nature of the data and the stochastic dynamics of gene transcription process. The reconstruction of gene regulatory networks (GRNs) using single cell transcriptional profiles is particularly challenging, especially when directed gene-gene relationships are desired. Results We developed SINCERITIES (SINgle CEll Regularized Inference using TIme-stamped Expression profileS) for the inference of GRNs from single cell transcriptional profiles. We focused on time-stamped cross-sectional expression data, commonly generated from transcriptional profiling of single cells collected at multiple time points after cell stimulation. SINCERITIES recovers directed regulatory relationships among genes by employing regularized linear regression (ridge regression), using temporal changes in the distributions of gene expressions. Meanwhile, the modes of the gene regulations (activation and repression) come from partial correlation analyses between pairs of genes. We demonstrated the efficacy of SINCERITIES in inferring GRNs using in silico time-stamped single cell expression data and single cell transcriptional profiles of THP-1 monocytic human leukemia cells. The case studies showed that SINCERITIES could provide accurate GRN predictions, significantly better than other GRN inference algorithms such as TSNI, GENIE3 and JUMP3. Moreover, SINCERITIES has a low computational complexity and is amenable to problems of extremely large dimensionality. Finally, an application of SINCERITIES to single cell expression data of T2EC chicken erythrocytes pointed to BATF as a candidate novel regulator of erythroid development. 
Availability and implementation: MATLAB and R versions of SINCERITIES are freely available from the following websites: http://www.cabsel.ethz.ch/tools/sincerities.html and https://github.com/CABSEL/SINCERITIES. The single cell THP-1 and T2EC transcriptional profiles are available from the original publications (Kouno et al., 2013; Richard et al., 2016). The in silico single cell data are available on the SINCERITIES websites. Supplementary information: Supplementary data are available at Bioinformatics online.
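The regression step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' MATLAB/R implementation. It assumes the temporal change in each gene's expression distribution is summarized by a two-sample Kolmogorov-Smirnov distance between consecutive time points divided by the time gap, and that each gene's change in the next window is regressed on all genes' changes in the current window with plain closed-form ridge regression. The function names (`ks_distance`, `distribution_distances`, `ridge`, `infer_edges`) and the fixed penalty `lam` are hypothetical; penalty selection and the partial-correlation step that assigns activation or repression are omitted.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in sa + sb
    )


def distribution_distances(snapshots, times):
    """snapshots[t][g] holds the expression values of gene g across cells at time index t.
    Returns dd[l][g]: KS distance between time points l and l+1, divided by the time gap."""
    dd = []
    for l in range(len(times) - 1):
        dt = times[l + 1] - times[l]
        dd.append([ks_distance(x0, x1) / dt
                   for x0, x1 in zip(snapshots[l], snapshots[l + 1])])
    return dd


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ridge(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def infer_edges(snapshots, times, lam=1.0):
    """Returns weights[target][regulator], an unsigned edge-strength estimate.
    Needs at least three time points so that one window can predict the next."""
    dd = distribution_distances(snapshots, times)
    X = dd[:-1]  # all genes' distribution changes in window l
    n_genes = len(dd[0])
    # Response for each target gene: its distribution change in window l + 1.
    return [ridge(X, [row[g] for row in dd[1:]], lam) for g in range(n_genes)]
```

Calling `infer_edges(snapshots, times)` on data with at least three time points returns one weight vector per target gene; larger absolute weights would be read as stronger candidate regulator-to-target edges under the assumptions above.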
Affiliation(s)
- Nan Papili Gao
- Institute for Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - S M Minhaz Ud-Dean
- Department of Environmental Health Sciences, Mailman School of Public Health, Columbia University, New York, NY, USA
| | - Olivier Gandrillon
- Laboratory of Biology and Modelling of the Cell, Univ Lyon, ENS de Lyon, Univ Claude Bernard, CNRS UMR, INSERM Lyon, France; Inria Team Dracula, Inria Center Grenoble Rhône-Alpes, Rhône-Alpes, France
| | - Rudiyanto Gunawan
- Institute for Chemical and Bioengineering, ETH Zurich, Zurich, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
45
|
Basilico S, Göttgens B. Dysregulation of haematopoietic stem cell regulatory programs in acute myeloid leukaemia. J Mol Med (Berl) 2017; 95:719-727. [PMID: 28429049 PMCID: PMC5487585 DOI: 10.1007/s00109-017-1535-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 12/12/2016] [Revised: 03/29/2017] [Accepted: 04/11/2017] [Indexed: 12/28/2022]
Abstract
Haematopoietic stem cells (HSC) are situated at the apex of the haematopoietic differentiation hierarchy, ensuring the life-long supply of mature haematopoietic cells and forming a reservoir to replenish the haematopoietic system in case of emergency such as acute blood loss. To maintain a balanced production of all mature lineages and at the same time secure a stem cell reservoir, intricate regulatory programs have evolved to control multi-lineage differentiation and self-renewal in haematopoietic stem and progenitor cells (HSPCs). Leukaemogenic mutations commonly disrupt these regulatory programs causing a block in differentiation with simultaneous enhancement of proliferation. Here, we briefly summarize key aspects of HSPC regulatory programs, and then focus on their disruption by leukaemogenic fusion genes containing the mixed lineage leukaemia (MLL) gene. Using MLL as an example, we explore important questions of wider significance that are still under debate, including the importance of cell of origin, to what extent leukaemia oncogenes impose specific regulatory programs and the relevance of leukaemia stem cells for disease development and prognosis. Finally, we suggest that disruption of stem cell regulatory programs is likely to play an important role in many other pathologies including ageing-associated regenerative failure.
Affiliation(s)
- Silvia Basilico
- Department of Haematology, Cambridge Institute for Medical Research and Wellcome Trust and MRC Cambridge Stem Cell Institute, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK
| | - Berthold Göttgens
- Department of Haematology, Cambridge Institute for Medical Research and Wellcome Trust and MRC Cambridge Stem Cell Institute, University of Cambridge, Hills Road, Cambridge, CB2 0XY, UK.
| |
|