51
|
Xue L, Wu Y, Lin Y. Dissecting and improving gene regulatory network inference using single-cell transcriptome data. Genome Res 2023; 33:1609-1621. [PMID: 37580132 PMCID: PMC10620053 DOI: 10.1101/gr.277488.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/09/2022] [Accepted: 08/07/2023] [Indexed: 08/16/2023]
Abstract
Single-cell transcriptome data has been widely used to reconstruct gene regulatory networks (GRNs) controlling critical biological processes such as development and differentiation. Although a growing list of algorithms has been developed to infer GRNs using such data, achieving an inference accuracy consistently higher than random guessing has remained challenging. To address this, it is essential to delineate how the accuracy of regulatory inference is limited. Here, we systematically characterized factors limiting the accuracy of inferred GRNs and demonstrated that using pre-mRNA information can help improve regulatory inference compared to the typically used information (i.e., mature mRNA). Using kinetic modeling and simulated single-cell data sets, we showed that target genes' mature mRNA levels often fail to accurately report upstream regulatory activities because of gene-level and network-level factors, which can be improved by using pre-mRNA levels. We tested this finding on public single-cell RNA-seq data sets using intronic reads as proxies of pre-mRNA levels and can indeed achieve a higher inference accuracy compared to using exonic reads (corresponding to mature mRNAs). Using experimental data sets, we further validated findings from the simulated data sets and identified factors such as transcription factor activity dynamics influencing the accuracy of pre-mRNA-based inference. This work delineates the fundamental limitations of gene regulatory inference and helps improve GRN inference using single-cell RNA-seq data.
Collapse
Affiliation(s)
- Lingfeng Xue
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China, 100871
| | - Yan Wu
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China, 100871
- The MOE Key Laboratory of Cell Proliferation and Differentiation, School of Life Sciences, Peking University, Beijing, China, 100871
| | - Yihan Lin
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China, 100871;
- The MOE Key Laboratory of Cell Proliferation and Differentiation, School of Life Sciences, Peking University, Beijing, China, 100871
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China, 100871
| |
Collapse
|
52
|
Wang J, Chen Y, Zou Q. Inferring gene regulatory network from single-cell transcriptomes with graph autoencoder model. PLoS Genet 2023; 19:e1010942. [PMID: 37703293 PMCID: PMC10519590 DOI: 10.1371/journal.pgen.1010942] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Revised: 09/25/2023] [Accepted: 08/29/2023] [Indexed: 09/15/2023] Open
Abstract
The gene regulatory structure of cells involves not only the regulatory relationship between two genes, but also the cooperative associations of multiple genes. However, most gene regulatory network inference methods for single cell only focus on and infer the regulatory relationships of pairs of genes, ignoring the global regulatory structure which is crucial to identify the regulations in the complex biological systems. Here, we proposed a graph-based Deep learning model for Regulatory networks Inference among Genes (DeepRIG) from single-cell RNA-seq data. To learn the global regulatory structure, DeepRIG builds a prior regulatory graph by transforming the gene expression of data into the co-expression mode. Then it utilizes a graph autoencoder model to embed the global regulatory information contained in the graph into gene latent embeddings and to reconstruct the gene regulatory network. Extensive benchmarking results demonstrate that DeepRIG can accurately reconstruct the gene regulatory networks and outperform existing methods on multiple simulated networks and real-cell regulatory networks. Additionally, we applied DeepRIG to the samples of human peripheral blood mononuclear cells and triple-negative breast cancer, and presented that DeepRIG can provide accurate cell-type-specific gene regulatory networks inference and identify novel regulators of progression and inhibition.
Collapse
Affiliation(s)
- Jiacheng Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| | - Yaojia Chen
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China
| |
Collapse
|
53
|
Vanheer L, Fantuzzi F, To SK, Schiavo A, Van Haele M, Ostyn T, Haesen T, Yi X, Janiszewski A, Chappell J, Rihoux A, Sawatani T, Roskams T, Pattou F, Kerr-Conte J, Cnop M, Pasque V. Inferring regulators of cell identity in the human adult pancreas. NAR Genom Bioinform 2023; 5:lqad068. [PMID: 37435358 PMCID: PMC10331937 DOI: 10.1093/nargab/lqad068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/20/2022] [Revised: 06/17/2023] [Accepted: 06/28/2023] [Indexed: 07/13/2023] Open
Abstract
Cellular identity during development is under the control of transcription factors that form gene regulatory networks. However, the transcription factors and gene regulatory networks underlying cellular identity in the human adult pancreas remain largely unexplored. Here, we integrate multiple single-cell RNA-sequencing datasets of the human adult pancreas, totaling 7393 cells, and comprehensively reconstruct gene regulatory networks. We show that a network of 142 transcription factors forms distinct regulatory modules that characterize pancreatic cell types. We present evidence that our approach identifies regulators of cell identity and cell states in the human adult pancreas. We predict that HEYL, BHLHE41 and JUND are active in acinar, beta and alpha cells, respectively, and show that these proteins are present in the human adult pancreas as well as in human induced pluripotent stem cell (hiPSC)-derived islet cells. Using single-cell transcriptomics, we found that JUND represses beta cell genes in hiPSC-alpha cells. BHLHE41 depletion induced apoptosis in primary pancreatic islets. The comprehensive gene regulatory network atlas can be explored interactively online. We anticipate our analysis to be the starting point for a more sophisticated dissection of how transcription factors regulate cell identity and cell states in the human adult pancreas.
Collapse
Affiliation(s)
- Lotte Vanheer
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Federica Fantuzzi
- ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
| | - San Kit To
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Andrea Schiavo
- ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
| | - Matthias Van Haele
- Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
| | - Tessa Ostyn
- Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
| | - Tine Haesen
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Xiaoyan Yi
- ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
| | - Adrian Janiszewski
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Joel Chappell
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Adrien Rihoux
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| | - Toshiaki Sawatani
- ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
| | - Tania Roskams
- Department of Imaging and Pathology; Translational Cell and Tissue Research, KU Leuven and University Hospitals Leuven; Herestraat 49, B-3000 Leuven, Belgium
| | - Francois Pattou
- University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
- European Genomic Institute for Diabetes, F-59000 Lille, France
- University of Lille, F-59000 Lille, France
| | - Julie Kerr-Conte
- University of Lille, Inserm, CHU Lille, Institute Pasteur Lille, U1190-EGID, F-59000 Lille, France
- European Genomic Institute for Diabetes, F-59000 Lille, France
- University of Lille, F-59000 Lille, France
| | - Miriam Cnop
- ULB Center for Diabetes Research; Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
- Division of Endocrinology; Erasmus Hospital, Université Libre de Bruxelles; Route de Lennik 808, B-1070 Brussels, Belgium
| | - Vincent Pasque
- Department of Development and Regeneration; KU Leuven - University of Leuven; Single-cell Omics Institute and Leuven Stem Cell Institute, Herestraat 49, B-3000 Leuven, Belgium
| |
Collapse
|
54
|
Dautle M, Zhang S, Chen Y. scTIGER: A Deep-Learning Method for Inferring Gene Regulatory Networks from Case versus Control scRNA-seq Datasets. Int J Mol Sci 2023; 24:13339. [PMID: 37686146 PMCID: PMC10488287 DOI: 10.3390/ijms241713339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2023] [Revised: 08/06/2023] [Accepted: 08/23/2023] [Indexed: 09/10/2023] Open
Abstract
Inferring gene regulatory networks (GRNs) from single-cell RNA-seq (scRNA-seq) data is an important computational question to find regulatory mechanisms involved in fundamental cellular processes. Although many computational methods have been designed to predict GRNs from scRNA-seq data, they usually have high false positive rates and none infer GRNs by directly using the paired datasets of case-versus-control experiments. Here we present a novel deep-learning-based method, named scTIGER, for GRN detection by using the co-differential relationships of gene expression profiles in paired scRNA-seq datasets. scTIGER employs cell-type-based pseudotiming, an attention-based convolutional neural network method and permutation-based significance testing for inferring GRNs among gene modules. As state-of-the-art applications, we first applied scTIGER to scRNA-seq datasets of prostate cancer cells, and successfully identified the dynamic regulatory networks of AR, ERG, PTEN and ATF3 for same-cell type between prostatic cancerous and normal conditions, and two-cell types within the prostatic cancerous environment. We then applied scTIGER to scRNA-seq data from neurons with and without fear memory and detected specific regulatory networks for BDNF, CREB1 and MAPK4. Additionally, scTIGER demonstrates robustness against high levels of dropout noise in scRNA-seq data.
Collapse
Affiliation(s)
- Madison Dautle
- Department of Biological and Biomedical Sciences, Rowan University, Glassboro, NJ 08028, USA;
| | - Shaoqiang Zhang
- College of Computer and Information Engineering, Tianjin Normal University, Tianjin 300387, China
| | - Yong Chen
- Department of Biological and Biomedical Sciences, Rowan University, Glassboro, NJ 08028, USA;
| |
Collapse
|
55
|
Bocci F, Jia D, Nie Q, Jolly MK, Onuchic J. Theoretical and computational tools to model multistable gene regulatory networks. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2023; 86:10.1088/1361-6633/acec88. [PMID: 37531952 PMCID: PMC10521208 DOI: 10.1088/1361-6633/acec88] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/13/2023] [Accepted: 08/02/2023] [Indexed: 08/04/2023]
Abstract
The last decade has witnessed a surge of theoretical and computational models to describe the dynamics of complex gene regulatory networks, and how these interactions can give rise to multistable and heterogeneous cell populations. As the use of theoretical modeling to describe genetic and biochemical circuits becomes more widespread, theoreticians with mathematical and physical backgrounds routinely apply concepts from statistical physics, non-linear dynamics, and network theory to biological systems. This review aims at providing a clear overview of the most important methodologies applied in the field while highlighting current and future challenges. It also includes hands-on tutorials to solve and simulate some of the archetypical biological system models used in the field. Furthermore, we provide concrete examples from the existing literature for theoreticians that wish to explore this fast-developing field. Whenever possible, we highlight the similarities and differences between biochemical and regulatory networks and 'classical' systems typically studied in non-equilibrium statistical and quantum mechanics.
Collapse
Affiliation(s)
- Federico Bocci
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697, USA
- Department of Mathematics, University of California, Irvine, CA 92697, USA
| | - Dongya Jia
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
| | - Qing Nie
- The NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697, USA
- Department of Mathematics, University of California, Irvine, CA 92697, USA
| | - Mohit Kumar Jolly
- Centre for BioSystems Science and Engineering, Indian Institute of Science, Bangalore 560012, India
| | - José Onuchic
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005, USA
- Department of Physics and Astronomy, Rice University, Houston, TX 77005, USA
- Department of Chemistry, Rice University, Houston, TX 77005, USA
- Department of Biosciences, Rice University, Houston, TX 77005, USA
| |
Collapse
|
56
|
Yuan Q, Duren Z. Continuous lifelong learning for modeling of gene regulation from single cell multiome data by leveraging atlas-scale external data. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.08.01.551575. [PMID: 37577525 PMCID: PMC10418251 DOI: 10.1101/2023.08.01.551575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/15/2023]
Abstract
Accurate context-specific Gene Regulatory Networks (GRNs) inference from genomics data is a crucial task in computational biology. However, existing methods face limitations, such as reliance on gene expression data alone, lower resolution from bulk data, and data scarcity for specific cellular systems. Despite recent technological advancements, including single-cell sequencing and the integration of ATAC-seq and RNA-seq data, learning such complex mechanisms from limited independent data points still presents a daunting challenge, impeding GRN inference accuracy. To overcome this challenge, we present LINGER (LIfelong neural Network for GEne Regulation), a novel deep learning-based method to infer GRNs from single-cell multiome data with paired gene expression and chromatin accessibility data from the same cell. LINGER incorporates both 1) atlas-scale external bulk data across diverse cellular contexts and 2) the knowledge of transcription factor (TF) motif matching to cis-regulatory elements as a manifold regularization to address the challenge of limited data and extensive parameter space in GRN inference. Our results demonstrate that LINGER achieves 2-3 fold higher accuracy over existing methods. LINGER reveals a complex regulatory landscape of genome-wide association studies, enabling enhanced interpretation of disease-associated variants and genes. Additionally, following the GRN inference from a reference sc-multiome data, LINGER allows for the estimation of TF activity solely from bulk or single-cell gene expression data, leveraging the abundance of available gene expression data to identify driver regulators from case-control studies. Overall, LINGER provides a comprehensive tool for robust gene regulation inference from genomics data, empowering deeper insights into cellular mechanisms.
Collapse
Affiliation(s)
- Qiuyue Yuan
- Center for Human Genetics, Department of Genetics and Biochemistry, Clemson University, Greenwood, SC 29646, USA
| | - Zhana Duren
- Center for Human Genetics, Department of Genetics and Biochemistry, Clemson University, Greenwood, SC 29646, USA
| |
Collapse
|
57
|
Marku M, Pancaldi V. From time-series transcriptomics to gene regulatory networks: A review on inference methods. PLoS Comput Biol 2023; 19:e1011254. [PMID: 37561790 PMCID: PMC10414591 DOI: 10.1371/journal.pcbi.1011254] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/12/2023] Open
Abstract
Inference of gene regulatory networks has been an active area of research for around 20 years, leading to the development of sophisticated inference algorithms based on a variety of assumptions and approaches. With the ever increasing demand for more accurate and powerful models, the inference problem remains of broad scientific interest. The abstract representation of biological systems through gene regulatory networks represents a powerful method to study such systems, encoding different amounts and types of information. In this review, we summarize the different types of inference algorithms specifically based on time-series transcriptomics, giving an overview of the main applications of gene regulatory networks in computational biology. This review is intended to give an updated reference of regulatory networks inference tools to biologists and researchers new to the topic and guide them in selecting the appropriate inference method that best fits their questions, aims, and experimental data.
Collapse
Affiliation(s)
- Malvina Marku
- CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
| | - Vera Pancaldi
- CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Barcelona Supercomputing Center, Barcelona, Spain
| |
Collapse
|
58
|
Littman R, Cheng M, Wang N, Peng C, Yang X. SCING: Inference of robust, interpretable gene regulatory networks from single cell and spatial transcriptomics. iScience 2023; 26:107124. [PMID: 37434694 PMCID: PMC10331489 DOI: 10.1016/j.isci.2023.107124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/17/2022] [Revised: 03/31/2023] [Accepted: 06/09/2023] [Indexed: 07/13/2023] Open
Abstract
Gene regulatory network (GRN) inference is an integral part of understanding physiology and disease. Single cell/nuclei RNA-seq (scRNA-seq/snRNA-seq) data has been used to elucidate cell-type GRNs; however, the accuracy and speed of current scRNAseq-based GRN approaches are suboptimal. Here, we present Single Cell INtegrative Gene regulatory network inference (SCING), a gradient boosting and mutual information-based approach for identifying robust GRNs from scRNA-seq, snRNA-seq, and spatial transcriptomics data. Performance evaluation using Perturb-seq datasets, held-out data, and the mouse cell atlas combined with the DisGeNET database demonstrates the improved accuracy and biological interpretability of SCING compared to existing methods. We applied SCING to the entire mouse single cell atlas, human Alzheimer's disease (AD), and mouse AD spatial transcriptomics. SCING GRNs reveal unique disease subnetwork modeling capabilities, have intrinsic capacity to correct for batch effects, retrieve disease relevant genes and pathways, and are informative on spatial specificity of disease pathogenesis.
Collapse
Affiliation(s)
- Russell Littman
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
| | - Michael Cheng
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
| | - Ning Wang
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
| | - Chao Peng
- Department of Neurology, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
| | - Xia Yang
- Department of Integrative Biology & Physiology, UCLA, Los Angeles, CA, USA
- Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Institute for Quantitative and Computational Biosciences (QCBio), Los Angeles, CA, USA
- Molecular Biology Institute (MBI), Los Angeles, CA, USA
- Brain Research Institute (BRI), Los Angeles, CA, USA
| |
Collapse
|
59
|
Bocci F, Jia D, Nie Q, Jolly MK, Onuchic J. Theoretical and computational tools to model multistable gene regulatory networks. ARXIV 2023:arXiv:2302.07401v2. [PMID: 36824430 PMCID: PMC9949162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Download PDF] [Subscribe] [Scholar Register] [Indexed: 02/25/2023]
Abstract
The last decade has witnessed a surge of theoretical and computational models to describe the dynamics of complex gene regulatory networks, and how these interactions can give rise to multistable and heterogeneous cell populations. As the use of theoretical modeling to describe genetic and biochemical circuits becomes more widespread, theoreticians with mathematical and physical backgrounds routinely apply concepts from statistical physics, non-linear dynamics, and network theory to biological systems. This review aims at providing a clear overview of the most important methodologies applied in the field while highlighting current and future challenges. It also includes hands-on tutorials to solve and simulate some of the archetypical biological system models used in the field. Furthermore, we provide concrete examples from the existing literature for theoreticians that wish to explore this fast-developing field. Whenever possible, we highlight the similarities and differences between biochemical and regulatory networks and 'classical' systems typically studied in non-equilibrium statistical and quantum mechanics.
Collapse
|
60
|
Schiffthaler B, van Zalen E, Serrano AR, Street NR, Delhomme N. Seiðr: Efficient calculation of robust ensemble gene networks. Heliyon 2023; 9:e16811. [PMID: 37313140 PMCID: PMC10258422 DOI: 10.1016/j.heliyon.2023.e16811] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/13/2022] [Revised: 05/22/2023] [Accepted: 05/29/2023] [Indexed: 06/15/2023] Open
Abstract
Gene regulatory and gene co-expression networks are powerful research tools for identifying biological signal within high-dimensional gene expression data. In recent years, research has focused on addressing shortcomings of these techniques with regard to the low signal-to-noise ratio, non-linear interactions and dataset dependent biases of published methods. Furthermore, it has been shown that aggregating networks from multiple methods provides improved results. Despite this, few useable and scalable software tools have been implemented to perform such best-practice analyses. Here, we present Seidr (stylized Seiðr), a software toolkit designed to assist scientists in gene regulatory and gene co-expression network inference. Seidr creates community networks to reduce algorithmic bias and utilizes noise corrected network backboning to prune noisy edges in the networks. Using benchmarks in real-world conditions across three eukaryotic model organisms, Saccharomyces cerevisiae, Drosophila melanogaster, and Arabidopsis thaliana, we show that individual algorithms are biased toward functional evidence for certain gene-gene interactions. We further demonstrate that the community network is less biased, providing robust performance across different standards and comparisons for the model organisms. Finally, we apply Seidr to a network of drought stress in Norway spruce (Picea abies (L.) H. Krast) as an example application in a non-model species. We demonstrate the use of a network inferred using Seidr for identifying key components, communities and suggesting gene function for non-annotated genes.
Collapse
Affiliation(s)
- Bastian Schiffthaler
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
| | - Elena van Zalen
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
| | - Alonso R. Serrano
- Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
| | - Nathaniel R. Street
- Department of Plant Physiology, Umea Plant Science Center, Umea University, Umea, Sweden
| | - Nicolas Delhomme
- Department of Plant Physiology, Umea Plant Science Center, Swedish University of Agricultural Sciences, Umea, Sweden
| |
Collapse
|
61
|
Zhang S, Pyne S, Pietrzak S, Halberg S, McCalla SG, Siahpirani AF, Sridharan R, Roy S. Inference of cell type-specific gene regulatory networks on cell lineages from single cell omic datasets. Nat Commun 2023; 14:3064. [PMID: 37244909 PMCID: PMC10224950 DOI: 10.1038/s41467-023-38637-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2022] [Accepted: 05/10/2023] [Indexed: 05/29/2023] Open
Abstract
Cell type-specific gene expression patterns are outputs of transcriptional gene regulatory networks (GRNs) that connect transcription factors and signaling proteins to target genes. Single-cell technologies such as single cell RNA-sequencing (scRNA-seq) and single cell Assay for Transposase-Accessible Chromatin using sequencing (scATAC-seq), can examine cell-type specific gene regulation at unprecedented detail. However, current approaches to infer cell type-specific GRNs are limited in their ability to integrate scRNA-seq and scATAC-seq measurements and to model network dynamics on a cell lineage. To address this challenge, we have developed single-cell Multi-Task Network Inference (scMTNI), a multi-task learning framework to infer the GRN for each cell type on a lineage from scRNA-seq and scATAC-seq data. Using simulated and real datasets, we show that scMTNI is a broadly applicable framework for linear and branching lineages that accurately infers GRN dynamics and identifies key regulators of fate transitions for diverse processes such as cellular reprogramming and differentiation.
Collapse
Affiliation(s)
- Shilu Zhang
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
| | - Saptarshi Pyne
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
| | - Stefan Pietrzak
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, WI, USA
| | - Spencer Halberg
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA
| | - Sunnie Grace McCalla
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI, 53706, USA
| | - Alireza Fotuhi Siahpirani
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
| | - Rupa Sridharan
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA
- Department of Cell and Regenerative Biology, University of Wisconsin-Madison, Madison, WI, USA
| | - Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, USA.
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI, USA.
| |
Collapse
|
62
|
Li L, Sun L, Chen G, Wong CW, Ching WK, Liu ZP. LogBTF: gene regulatory network inference using Boolean threshold network model from single-cell gene expression data. Bioinformatics 2023; 39:btad256. [PMID: 37079737 PMCID: PMC10172039 DOI: 10.1093/bioinformatics/btad256] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2022] [Revised: 02/25/2023] [Accepted: 04/13/2023] [Indexed: 04/22/2023] Open
Abstract
MOTIVATION From a systematic perspective, it is crucial to infer and analyze gene regulatory network (GRN) from high-throughput single-cell RNA sequencing data. However, most existing GRN inference methods mainly focus on the network topology, only few of them consider how to explicitly describe the updated logic rules of regulation in GRNs to obtain their dynamics. Moreover, some inference methods also fail to deal with the over-fitting problem caused by the noise in time series data. RESULTS In this article, we propose a novel embedded Boolean threshold network method called LogBTF, which effectively infers GRN by integrating regularized logistic regression and Boolean threshold function. First, the continuous gene expression values are converted into Boolean values and the elastic net regression model is adopted to fit the binarized time series data. Then, the estimated regression coefficients are applied to represent the unknown Boolean threshold function of the candidate Boolean threshold network as the dynamical equations. To overcome the multi-collinearity and over-fitting problems, a new and effective approach is designed to optimize the network topology by adding a perturbation design matrix to the input data and thereafter setting sufficiently small elements of the output coefficient vector to zeros. In addition, the cross-validation procedure is implemented into the Boolean threshold network model framework to strengthen the inference capability. Finally, extensive experiments on one simulated Boolean value dataset, dozens of simulation datasets, and three real single-cell RNA sequencing datasets demonstrate that the LogBTF method can infer GRNs from time series data more accurately than some other alternative methods for GRN inference. AVAILABILITY AND IMPLEMENTATION The source data and code are available at https://github.com/zpliulab/LogBTF.
Collapse
Affiliation(s)
- Lingyu Li
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
| | - Liangjie Sun
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
| | - Guangyi Chen
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
| | - Chi-Wing Wong
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
| | - Wai-Ki Ching
- Advanced Modeling and Applied Computing Laboratory, Department of Mathematics, The University of Hong Kong, Hong Kong, China
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan 250061, China
| |
Collapse
|
63
|
Su X, Wang L, Ma N, Yang X, Liu C, Yang F, Li J, Yi X, Xing Y. Immune heterogeneity in cardiovascular diseases from a single-cell perspective. Front Cardiovasc Med 2023; 10:1057870. [PMID: 37180791 PMCID: PMC10167030 DOI: 10.3389/fcvm.2023.1057870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/03/2022] [Accepted: 04/10/2023] [Indexed: 05/16/2023] Open
Abstract
A variety of immune cell subsets occupy different niches in the cardiovascular system, causing changes in the structure and function of the heart and vascular system, and driving the progress of cardiovascular diseases (CVDs). The immune cells infiltrating the injury site are highly diverse and integrate into a broad dynamic immune network that controls the dynamic changes of CVDs. Due to technical limitations, the effects and molecular mechanisms of these dynamic immune networks on CVDs have not been fully revealed. With recent advances in single-cell technologies such as single-cell RNA sequencing, systematic interrogation of the immune cell subsets is feasible and will provide insights into the way we understand the integrative behavior of immune populations. We no longer lightly ignore the role of individual cells, especially certain highly heterogeneous or rare subpopulations. We summarize the phenotypic diversity of immune cell subsets and their significance in three CVDs of atherosclerosis, myocardial ischemia and heart failure. We believe that such a review could enhance our understanding of how immune heterogeneity drives the progression of CVDs, help to elucidate the regulatory roles of immune cell subsets in disease, and thus guide the development of new immunotherapies.
Collapse
Affiliation(s)
- Xin Su
- China Academy of Chinese Medical Sciences, Guang’anmen Hospital, Beijing, China
| | - Li Wang
- Department of Breast Surgery, Xingtai People’s Hospital, Xingtai, China
| | - Ning Ma
- Department of Breast Surgery, Dezhou Second People’s Hospital, Dezhou, China
| | - Xinyu Yang
- Fangshan Hospital Beijing University of Chinese Medicine, Beijing, China
| | - Can Liu
- China Academy of Chinese Medical Sciences, Guang’anmen Hospital, Beijing, China
| | - Fan Yang
- China Academy of Chinese Medical Sciences, Guang’anmen Hospital, Beijing, China
| | - Jun Li
- China Academy of Chinese Medical Sciences, Guang’anmen Hospital, Beijing, China
| | - Xin Yi
- Department of Cardiology, Beijing Huimin Hospital, Beijing, China
| | - Yanwei Xing
- China Academy of Chinese Medical Sciences, Guang’anmen Hospital, Beijing, China
| |
Collapse
|
64
|
Wang Q, Guo M, Chen J, Duan R. A gene regulatory network inference model based on pseudo-siamese network. BMC Bioinformatics 2023; 24:163. [PMID: 37085776 PMCID: PMC10122305 DOI: 10.1186/s12859-023-05253-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2022] [Accepted: 03/24/2023] [Indexed: 04/23/2023] Open
Abstract
MOTIVATION Gene regulatory networks (GRNs) arise from the intricate interactions between transcription factors (TFs) and their target genes during the growth and development of organisms. The inference of GRNs can unveil the underlying gene interactions in living systems and facilitate the investigation of the relationship between gene expression patterns and phenotypic traits. Although several machine-learning models have been proposed for inferring GRNs from single-cell RNA sequencing (scRNA-seq) data, some of these models, such as Boolean and tree-based networks, suffer from sensitivity to noise and may encounter difficulties in handling the high noise and dimensionality of actual scRNA-seq data, as well as the sparse nature of gene regulation relationships. Thus, inferring large-scale information from GRNs remains a formidable challenge. RESULTS This study proposes a multilevel, multi-structure framework called a pseudo-Siamese GRN (PSGRN) for inferring large-scale GRNs from time-series expression datasets. Based on the pseudo-Siamese network, we applied a gated recurrent unit to capture the time features of each TF and target matrix and learn the spatial features of the matrices after merging by applying the DenseNet framework. Finally, we applied a sigmoid function to evaluate interactions. We constructed two maize sub-datasets, including gene expression levels and GRNs, using existing open-source maize multi-omics data and compared them to other GRN inference methods, including GENIE3, GRNBoost2, nonlinear ordinary differential equations, CNNC, and DGRNS. Our results show that PSGRN outperforms state-of-the-art methods. This study proposed a new framework: a PSGRN that allows GRNs to be inferred from scRNA-seq data, elucidating the temporal and spatial features of TFs and their target genes. The results show the model's robustness and generalization, laying a theoretical foundation for maize genotype-phenotype associations with implications for breeding work.
Collapse
Affiliation(s)
- Qian Wang
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
| | - Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China.
| | - Jian Chen
- College of Agronomy and Biotechnology, China Agricultural University, Beijing, China
| | - Ran Duan
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing, China
| |
Collapse
|
65
|
Rommelfanger MK, Behrends M, Chen Y, Martinez J, Bens M, Xiong L, Rudolph KL, MacLean AL. Gene regulatory network inference with popInfer reveals dynamic regulation of hematopoietic stem cell quiescence upon diet restriction and aging. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.18.537360. [PMID: 37131596 PMCID: PMC10153203 DOI: 10.1101/2023.04.18.537360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/04/2023]
Abstract
Inference of gene regulatory networks (GRNs) can reveal cell state transitions from single-cell genomics data. However, obstacles to temporal inference from snapshot data are difficult to overcome. Single-nuclei multiomics data offer means to bridge this gap and derive temporal information from snapshot data using joint measurements of gene expression and chromatin accessibility in the same single cells. We developed popInfer to infer networks that characterize lineage-specific dynamic cell state transitions from joint gene expression and chromatin accessibility data. Benchmarking against alternative methods for GRN inference, we showed that popInfer achieves higher accuracy in the GRNs inferred. popInfer was applied to study single-cell multiomics data characterizing hematopoietic stem cells (HSCs) and the transition from HSC to a multipotent progenitor cell state during murine hematopoiesis across age and dietary conditions. From networks predicted by popInfer, we discovered gene interactions controlling entry to/exit from HSC quiescence that are perturbed in response to diet or aging.
Collapse
Affiliation(s)
- Megan K. Rommelfanger
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Marthe Behrends
- Research Group on Stem Cell and Metabolism Aging, Leibniz Institute on Aging, Fritz Lipmann Institute (FLI), Jena, Germany
| | - Yulin Chen
- Research Group on Stem Cell and Metabolism Aging, Leibniz Institute on Aging, Fritz Lipmann Institute (FLI), Jena, Germany
| | - Jonathan Martinez
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| | - Martin Bens
- Core Facility Next Generation Sequencing, Leibniz Institute on Aging, Fritz Lipmann Institute (FLI), Jena, Germany
| | - Lingyun Xiong
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
- Department of Stem Cell Biology and Regenerative Medicine, Broad-CIRM Center, Keck School of Medicine, University of Southern California, Los Angeles, CA 90089, USA
| | - K. Lenhard Rudolph
- Research Group on Stem Cell and Metabolism Aging, Leibniz Institute on Aging, Fritz Lipmann Institute (FLI), Jena, Germany
- Medical Faculty, Jena University Hospital, Friedrich Schiller University, Jena, Germany
| | - Adam L. MacLean
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA 90089, USA
| |
Collapse
|
66
|
Karaaslanli A, Saha S, Maiti T, Aviyente S. Kernelized multiview signed graph learning for single-cell RNA sequencing data. BMC Bioinformatics 2023; 24:127. [PMID: 37016281 PMCID: PMC10071725 DOI: 10.1186/s12859-023-05250-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/24/2022] [Accepted: 03/22/2023] [Indexed: 04/06/2023] Open
Abstract
BACKGROUND Characterizing the topology of gene regulatory networks (GRNs) is a fundamental problem in systems biology. The advent of single cell technologies has made it possible to construct GRNs at finer resolutions than bulk and microarray datasets. However, cellular heterogeneity and sparsity of the single cell datasets render void the application of regular Gaussian assumptions for constructing GRNs. Additionally, most GRN reconstruction approaches estimate a single network for the entire data. This could cause potential loss of information when single cell datasets are generated from multiple treatment conditions/disease states. RESULTS To better characterize single cell GRNs under different but related conditions, we propose the joint estimation of multiple networks using multiple signed graph learning (scMSGL). The proposed method is based on recently developed graph signal processing (GSP) based graph learning, where GRNs and gene expressions are modeled as signed graphs and graph signals, respectively. scMSGL learns multiple GRNs by optimizing the total variation of gene expressions with respect to GRNs while ensuring that the learned GRNs are similar to each other through regularization with respect to a learned signed consensus graph. We further kernelize scMSGL with the kernel selected to suit the structure of single cell data. CONCLUSIONS scMSGL is shown to have superior performance over existing state of the art methods in GRN recovery on simulated datasets. Furthermore, scMSGL successfully identifies well-established regulators in a mouse embryonic stem cell differentiation study and a cancer clinical study of medulloblastoma.
Collapse
Affiliation(s)
- Abdullah Karaaslanli
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA.
| | - Satabdi Saha
- Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
| | - Tapabrata Maiti
- Department of Statistics and Probability, Michigan State University, East Lansing, MI, USA
| | - Selin Aviyente
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, MI, USA
| |
Collapse
|
67
|
Reagor CC, Velez-Angel N, Hudspeth AJ. Depicting pseudotime-lagged causality across single-cell trajectories for accurate gene-regulatory inference. PNAS NEXUS 2023; 2:pgad113. [PMID: 37113980 PMCID: PMC10129065 DOI: 10.1093/pnasnexus/pgad113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 03/21/2023] [Accepted: 03/23/2023] [Indexed: 04/29/2023]
Abstract
Identifying the causal interactions in gene-regulatory networks requires an accurate understanding of the time-lagged relationships between transcription factors and their target genes. Here we describe DELAY (short for Depicting Lagged Causality), a convolutional neural network for the inference of gene-regulatory relationships across pseudotime-ordered single-cell trajectories. We show that combining supervised deep learning with joint probability matrices of pseudotime-lagged trajectories allows the network to overcome important limitations of ordinary Granger causality-based methods, for example, the inability to infer cyclic relationships such as feedback loops. Our network outperforms several common methods for inferring gene regulation and, when given partial ground-truth labels, predicts novel regulatory networks from single-cell RNA sequencing (scRNA-seq) and single-cell ATAC sequencing (scATAC-seq) data sets. To validate this approach, we used DELAY to identify important genes and modules in the regulatory network of auditory hair cells, as well as likely DNA-binding partners for two hair cell cofactors (Hist1h1c and Ccnd1) and a novel binding sequence for the hair cell-specific transcription factor Fiz1. We provide an easy-to-use implementation of DELAY under an open-source license at https://github.com/calebclayreagor/DELAY.
Collapse
Affiliation(s)
| | - Nicolas Velez-Angel
- Howard Hughes Medical Institute and Laboratory of Sensory Neuroscience, The Rockefeller University, New York, NY 10065, USA
| | | |
Collapse
|
68
|
Shen B, Coruzzi G, Shasha D. EnsInfer: a simple ensemble approach to network inference outperforms any single method. BMC Bioinformatics 2023; 24:114. [PMID: 36964499 PMCID: PMC10037858 DOI: 10.1186/s12859-023-05231-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2022] [Accepted: 03/15/2023] [Indexed: 03/26/2023] Open
Abstract
This study evaluates both a variety of existing base causal inference methods and a variety of ensemble methods. We show that: (i) base network inference methods vary in their performance across different datasets, so a method that works poorly on one dataset may work well on another; (ii) a non-homogeneous ensemble method in the form of a Naive Bayes classifier leads overall to as good or better results than using the best single base method or any other ensemble method; (iii) for the best results, the ensemble method should integrate all methods that satisfy a statistical test of normality on training data. The resulting ensemble model EnsInfer easily integrates all kinds of RNA-seq data as well as new and existing inference methods. The paper categorizes and reviews state-of-the-art underlying methods, describes the EnsInfer ensemble approach in detail, and presents experimental results. The source code and data used will be made available to the community upon publication.
Collapse
Affiliation(s)
- Bingran Shen
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, New York, 10012 USA
| | - Gloria Coruzzi
- Department of Biology, Center for Genomics and Systems Biology, New York University, 12 Waverly Pl, New York, 10003 USA
| | - Dennis Shasha
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 251 Mercer St, New York, 10012 USA
| |
Collapse
|
69
|
Tu YH, Juan HF, Huang HC. Context-dependent gene regulatory network reveals regulation dynamics and cell trajectories using unspliced transcripts. Brief Bioinform 2023; 24:6991202. [PMID: 36653899 DOI: 10.1093/bib/bbac633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2022] [Revised: 12/06/2022] [Accepted: 12/29/2022] [Indexed: 01/20/2023] Open
Abstract
Gene regulatory networks govern complex gene expression programs in various biological phenomena, including embryonic development, cell fate decisions and oncogenesis. Single-cell techniques are increasingly being used to study gene expression, providing higher resolution than traditional approaches. However, inferring a comprehensive gene regulatory network across different cell types remains a challenge. Here, we propose to construct context-dependent gene regulatory networks (CDGRNs) from single-cell RNA sequencing data utilizing both spliced and unspliced transcript expression levels. A gene regulatory network is decomposed into subnetworks corresponding to different transcriptomic contexts. Each subnetwork comprises the consensus active regulation pairs of transcription factors and their target genes shared by a group of cells, inferred by a Gaussian mixture model. We find that the union of gene regulation pairs in all contexts is sufficient to reconstruct differentiation trajectories. Functions specific to the cell cycle, cell differentiation or tissue-specific functions are enriched throughout the developmental process in each context. Surprisingly, we also observe that the network entropy of CDGRNs decreases along differentiation trajectories, indicating directionality in differentiation. Overall, CDGRN allows us to establish the connection between gene regulation at the molecular level and cell differentiation at the macroscopic level.
Collapse
Affiliation(s)
- Yueh-Hua Tu
- Bioinformatics Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 115, Taiwan
- Taiwan International Graduate Program on Bioinformatics, National Taiwan University, Taipei, 106, Taiwan
| | - Hsueh-Fen Juan
- Taiwan International Graduate Program on Bioinformatics, National Taiwan University, Taipei, 106, Taiwan
- Department of Life Science, National Taiwan University, Taipei, 106, Taiwan
| | - Hsuan-Cheng Huang
- Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, 112, Taiwan
| |
Collapse
|
70
|
McCalla SG, Fotuhi Siahpirani A, Li J, Pyne S, Stone M, Periyasamy V, Shin J, Roy S. Identifying strengths and weaknesses of methods for computational network inference from single-cell RNA-seq data. G3 (BETHESDA, MD.) 2023; 13:jkad004. [PMID: 36626328 PMCID: PMC9997554 DOI: 10.1093/g3journal/jkad004] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/09/2022] [Revised: 11/09/2022] [Accepted: 12/16/2022] [Indexed: 01/11/2023]
Abstract
Single-cell RNA-sequencing (scRNA-seq) offers unparalleled insight into the transcriptional programs of different cellular states by measuring the transcriptome of thousands of individual cells. An emerging problem in the analysis of scRNA-seq is the inference of transcriptional gene regulatory networks and a number of methods with different learning frameworks have been developed to address this problem. Here, we present an expanded benchmarking study of eleven recent network inference methods on seven published scRNA-seq datasets in human, mouse, and yeast considering different types of gold standard networks and evaluation metrics. We evaluate methods based on their computing requirements as well as on their ability to recover the network structure. We find that, while most methods have a modest recovery of experimentally derived interactions based on global metrics such as Area Under the Precision Recall curve, methods are able to capture targets of regulators that are relevant to the system under study. Among the top performing methods that use only expression were SCENIC, PIDC, MERLIN or Correlation. Addition of prior biological knowledge and the estimation of transcription factor activities resulted in the best overall performance with the Inferelator and MERLIN methods that use prior knowledge outperforming methods that use expression alone. We found that imputation for network inference did not improve network inference accuracy and could be detrimental. Comparisons of inferred networks for comparable bulk conditions showed that the networks inferred from scRNA-seq datasets are often better or at par with the networks inferred from bulk datasets. Our analysis should be beneficial in selecting methods for network inference. At the same time, this highlights the need for improved methods and better gold standards for regulatory network inference from scRNAseq datasets.
Collapse
Affiliation(s)
- Sunnie Grace McCalla
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | | | - Jiaxin Li
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Saptarshi Pyne
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
| | - Matthew Stone
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
| | - Viswesh Periyasamy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Junha Shin
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
| | - Sushmita Roy
- Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI 53715, USA
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53792, USA
- Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, USA
| |
Collapse
|
71
|
Franchini M, Pellecchia S, Viscido G, Gambardella G. Single-cell gene set enrichment analysis and transfer learning for functional annotation of scRNA-seq data. NAR Genom Bioinform 2023; 5:lqad024. [PMID: 36879897 PMCID: PMC9985338 DOI: 10.1093/nargab/lqad024] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/19/2022] [Revised: 01/16/2023] [Accepted: 02/20/2023] [Indexed: 03/07/2023] Open
Abstract
Although an essential step, cell functional annotation often proves particularly challenging from single-cell transcriptional data. Several methods have been developed to accomplish this task. However, in most cases, these rely on techniques initially developed for bulk RNA sequencing or simply make use of marker genes identified from cell clustering followed by supervised annotation. To overcome these limitations and automatize the process, we have developed two novel methods, the single-cell gene set enrichment analysis (scGSEA) and the single-cell mapper (scMAP). scGSEA combines latent data representations and gene set enrichment scores to detect coordinated gene activity at single-cell resolution. scMAP uses transfer learning techniques to re-purpose and contextualize new cells into a reference cell atlas. Using both simulated and real datasets, we show that scGSEA effectively recapitulates recurrent patterns of pathways' activity shared by cells from different experimental conditions. At the same time, we show that scMAP can reliably map and contextualize new single-cell profiles on a breast cancer atlas we recently released. Both tools are provided in an effective and straightforward workflow providing a framework to determine cell function and significantly improve annotation and interpretation of scRNA-seq data.
Collapse
Affiliation(s)
- Melania Franchini
- Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy.,Department of Electrical Engineering and Information Technologies, University of Naples Federico II, 80125 Naples, Italy
| | - Simona Pellecchia
- Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
| | - Gaetano Viscido
- Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy
| | - Gennaro Gambardella
- Telethon Institute of Genetics and Medicine, Pozzuoli 80078 Naples, Italy.,Department of Chemical Materials and Industrial Engineering, University of Naples Federico II, 80125 Naples, Italy
| |
Collapse
|
72
|
Ventre E, Herbach U, Espinasse T, Benoit G, Gandrillon O. One model fits all: Combining inference and simulation of gene regulatory networks. PLoS Comput Biol 2023; 19:e1010962. [PMID: 36972296 PMCID: PMC10079230 DOI: 10.1371/journal.pcbi.1010962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2022] [Revised: 04/06/2023] [Accepted: 02/17/2023] [Indexed: 03/29/2023] Open
Abstract
The rise of single-cell data highlights the need for a nondeterministic view of gene expression, while offering new opportunities regarding gene regulatory network inference. We recently introduced two strategies that specifically exploit time-course data, where single-cell profiling is performed after a stimulus: HARISSA, a mechanistic network model with a highly efficient simulation procedure, and CARDAMOM, a scalable inference method seen as model calibration. Here, we combine the two approaches and show that the same model driven by transcriptional bursting can be used simultaneously as an inference tool, to reconstruct biologically relevant networks, and as a simulation tool, to generate realistic transcriptional profiles emerging from gene interactions. We verify that CARDAMOM quantitatively reconstructs causal links when the data is simulated from HARISSA, and demonstrate its performance on experimental data collected on in vitro differentiating mouse embryonic stem cells. Overall, this integrated strategy largely overcomes the limitations of disconnected inference and simulation.
Collapse
Affiliation(s)
- Elias Ventre
- Laboratoire de Biologie et Modélisation de la Cellule, École Normale Supérieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Université Claude Bernard Lyon 1, Lyon, France
- Inria Center Grenoble Rhône-Alpes, Équipe Dracula, Villeurbanne, France
- Univ Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5208, Institut Camille Jordan, Villeurbanne, France
| | - Ulysse Herbach
- Université de Lorraine, CNRS, Inria, IECL, Nancy, France
| | - Thibault Espinasse
- Inria Center Grenoble Rhône-Alpes, Équipe Dracula, Villeurbanne, France
- Univ Lyon, Université Claude Bernard Lyon 1, CNRS UMR 5208, Institut Camille Jordan, Villeurbanne, France
| | - Gérard Benoit
- Laboratoire de Biologie et Modélisation de la Cellule, École Normale Supérieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Université Claude Bernard Lyon 1, Lyon, France
| | - Olivier Gandrillon
- Laboratoire de Biologie et Modélisation de la Cellule, École Normale Supérieure de Lyon, CNRS, UMR 5239, Inserm, U1293, Université Claude Bernard Lyon 1, Lyon, France
- Inria Center Grenoble Rhône-Alpes, Équipe Dracula, Villeurbanne, France
| |
Collapse
|
73
|
Oubounyt M, Elkjaer ML, Laske T, Grønning AB, Moeller M, Baumbach J. De-novo reconstruction and identification of transcriptional gene regulatory network modules differentiating single-cell clusters. NAR Genom Bioinform 2023; 5:lqad018. [PMID: 36879901 PMCID: PMC9985332 DOI: 10.1093/nargab/lqad018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2022] [Revised: 01/16/2023] [Accepted: 02/09/2023] [Indexed: 03/07/2023] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides an unprecedented opportunity to understand gene functions and interactions at single-cell resolution. While computational tools for scRNA-seq data analysis to decipher differential gene expression profiles and differential pathway expression exist, we still lack methods to learn differential regulatory disease mechanisms directly from the single-cell data. Here, we provide a new methodology, named DiNiro, to unravel such mechanisms de novo and report them as small, easily interpretable transcriptional regulatory network modules. We demonstrate that DiNiro is able to uncover novel, relevant, and deep mechanistic models that not just predict but explain differential cellular gene expression programs. DiNiro is available at https://exbio.wzw.tum.de/diniro/.
Collapse
Affiliation(s)
- Mhaned Oubounyt
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Chair of Experimental Bioinformatics, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
| | - Maria L Elkjaer
- Department of Neurology, Odense University Hospital, Odense, Denmark
- Institute of Clinical Research, University of Southern Denmark, Odense, Denmark
- Institute of Molecular Medicine, University of Southern Denmark, Odense, Denmark
| | - Tanja Laske
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
| | - Alexander G B Grønning
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Marcus J Moeller
- Heisenberg Chair of Preventive and Translational Nephrology, Department of Nephrology, Rheumatology and Clinical Immunology, RWTH Aachen University, Aachen, Germany
| | - Jan Baumbach
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Department of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
| |
Collapse
|
74
|
van der Sande M, Frölich S, van Heeringen SJ. Computational approaches to understand transcription regulation in development. Biochem Soc Trans 2023; 51:1-12. [PMID: 36695505 PMCID: PMC9988001 DOI: 10.1042/bst20210145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2022] [Revised: 01/07/2023] [Accepted: 01/13/2023] [Indexed: 01/26/2023]
Abstract
Gene regulatory networks (GRNs) serve as useful abstractions to understand transcriptional dynamics in developmental systems. Computational prediction of GRNs has been successfully applied to genome-wide gene expression measurements with the advent of microarrays and RNA-sequencing. However, these inferred networks are inaccurate and mostly based on correlative rather than causative interactions. In this review, we highlight three approaches that significantly impact GRN inference: (1) moving from one genome-wide functional modality, gene expression, to multi-omics, (2) single cell sequencing, to measure cell type-specific signals and predict context-specific GRNs, and (3) neural networks as flexible models. Together, these experimental and computational developments have the potential to significantly impact the quality of inferred GRNs. Ultimately, accurately modeling the regulatory interactions between transcription factors and their target genes will be essential to understand the role of transcription factors in driving developmental gene expression programs and to derive testable hypotheses for validation.
Collapse
Affiliation(s)
| | | | - Simon J. van Heeringen
- Radboud University, Department of Molecular Developmental Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, 6525GA Nijmegen, The Netherlands
| |
Collapse
|
75
|
Song Q, Ruffalo M, Bar-Joseph Z. Using single cell atlas data to reconstruct regulatory networks. Nucleic Acids Res 2023; 51:e38. [PMID: 36762475 PMCID: PMC10123116 DOI: 10.1093/nar/gkad053] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/10/2022] [Revised: 12/16/2022] [Accepted: 01/19/2023] [Indexed: 02/11/2023] Open
Abstract
Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods leading to more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas scale single cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.
Collapse
Affiliation(s)
- Qi Song
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Matthew Ruffalo
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.,Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
Collapse
|
76
|
Khozyainova AA, Valyaeva AA, Arbatsky MS, Isaev SV, Iamshchikov PS, Volchkov EV, Sabirov MS, Zainullina VR, Chechekhin VI, Vorobev RS, Menyailo ME, Tyurin-Kuzmin PA, Denisov EV. Complex Analysis of Single-Cell RNA Sequencing Data. BIOCHEMISTRY. BIOKHIMIIA 2023; 88:231-252. [PMID: 37072324 PMCID: PMC10000364 DOI: 10.1134/s0006297923020074] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/23/2022] [Revised: 12/13/2022] [Accepted: 12/13/2022] [Indexed: 03/12/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) is a revolutionary tool for studying the physiology of normal and pathologically altered tissues. This approach provides information about molecular features (gene expression, mutations, chromatin accessibility, etc.) of cells, opens up the possibility to analyze the trajectories/phylogeny of cell differentiation and cell-cell interactions, and helps in discovery of new cell types and previously unexplored processes. From a clinical point of view, scRNA-seq facilitates deeper and more detailed analysis of molecular mechanisms of diseases and serves as a basis for the development of new preventive, diagnostic, and therapeutic strategies. The review describes different approaches to the analysis of scRNA-seq data, discusses the advantages and disadvantages of bioinformatics tools, provides recommendations and examples of their successful use, and suggests potential directions for improvement. We also emphasize the need for creating new protocols, including multiomics ones, for the preparation of DNA/RNA libraries of single cells with the purpose of more complete understanding of individual cells.
Collapse
Affiliation(s)
- Anna A Khozyainova
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia.
| | - Anna A Valyaeva
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, 119991, Russia
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, 119991, Russia
| | - Mikhail S Arbatsky
- Laboratory of Artificial Intelligence and Bioinformatics, The Russian Clinical Research Center for Gerontology, Pirogov Russian National Medical University, Moscow, 129226, Russia
- School of Public Administration, Lomonosov Moscow State University, Moscow, 119991, Russia
- Faculty of Fundamental Medicine, Lomonosov Moscow State University, Moscow, 119991, Russia
| | - Sergey V Isaev
- Research Institute of Personalized Medicine, National Center for Personalized Medicine of Endocrine Diseases, National Medical Research Center for Endocrinology, Moscow, 117036, Russia
| | - Pavel S Iamshchikov
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia
- Laboratory of Complex Analysis of Big Bioimage Data, National Research Tomsk State University, Tomsk, 634050, Russia
| | - Egor V Volchkov
- Department of Oncohematology, Dmitry Rogachev National Research Center of Pediatric Hematology, Oncology and Immunology, Moscow, 117198, Russia
| | - Marat S Sabirov
- Laboratory of Bioinformatics and Molecular Genetics, Koltzov Institute of Developmental Biology of the Russian Academy of Sciences, Moscow, 119334, Russia
| | - Viktoria R Zainullina
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia
| | - Vadim I Chechekhin
- Faculty of Fundamental Medicine, Lomonosov Moscow State University, Moscow, 119991, Russia
| | - Rostislav S Vorobev
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia
| | - Maxim E Menyailo
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia
| | - Pyotr A Tyurin-Kuzmin
- Faculty of Fundamental Medicine, Lomonosov Moscow State University, Moscow, 119991, Russia
| | - Evgeny V Denisov
- Laboratory of Cancer Progression Biology, Cancer Research Institute, Tomsk National Research Medical Center, Russian Academy of Sciences, Tomsk, 634050, Russia
| |
Collapse
|
77
|
Juan H, Huang H. Quantitative analysis of high‐throughput biological data. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2023. [DOI: 10.1002/wcms.1658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]
Affiliation(s)
- Hsueh‐Fen Juan
- Department of Life Science, Institute of Biomedical Electronics and Bioinformatics, and Center for Systems Biology National Taiwan University Taipei Taiwan
- Taiwan AI Labs Taipei Taiwan
| | - Hsuan‐Cheng Huang
- Institute of Biomedical Informatics National Yang Ming Chiao Tung University Taipei Taiwan
| |
Collapse
|
78
|
Zhang Y, Wang M, Wang Z, Liu Y, Xiong S, Zou Q. MetaSEM: Gene Regulatory Network Inference from Single-Cell RNA Data by Meta-Learning. Int J Mol Sci 2023; 24:2595. [PMID: 36768917 PMCID: PMC9916710 DOI: 10.3390/ijms24032595] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/11/2022] [Revised: 01/23/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open
Abstract
Regulators in gene regulatory networks (GRNs) are crucial for identifying cell states. However, GRN inference based on scRNA-seq data has several problems, including high dimensionality and sparsity, and requires more label data. Therefore, we propose a meta-learning GRN inference framework to identify regulatory factors. Specifically, meta-learning solves the parameter optimization problem caused by high-dimensional sparse data features. In addition, a few-shot solution was used to solve the problem of lack of label data. A structural equation model (SEM) was embedded in the model to identify important regulators. We integrated the parameter optimization strategy into the bi-level optimization to extract the feature consistent with GRN reasoning. This unique design makes our model robust to small-scale data. By studying the GRN inference task, we confirmed that the selected regulators were closely related to gene expression specificity. We further analyzed the GRN inferred to find the important regulators in cell type identification. Extensive experimental results showed that our model effectively captured the regulator in single-cell GRN inference. Finally, the visualization results verified the importance of the selected regulators for cell type recognition.
Collapse
Affiliation(s)
- Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Maocheng Wang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Zixuan Wang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Yuhang Liu
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Shuwen Xiong
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610051, China
| |
Collapse
|
79
|
Lin Z, Ou-Yang L. Inferring gene regulatory networks from single-cell gene expression data via deep multi-view contrastive learning. Brief Bioinform 2023; 24:6965907. [PMID: 36585783 DOI: 10.1093/bib/bbac586] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2022] [Revised: 11/28/2022] [Accepted: 11/29/2022] [Indexed: 01/01/2023] Open
Abstract
The inference of gene regulatory networks (GRNs) is of great importance for understanding the complex regulatory mechanisms within cells. The emergence of single-cell RNA-sequencing (scRNA-seq) technologies enables the measure of gene expression levels for individual cells, which promotes the reconstruction of GRNs at single-cell resolution. However, existing network inference methods are mainly designed for data collected from a single data source, which ignores the information provided by multiple related data sources. In this paper, we propose a multi-view contrastive learning (DeepMCL) model to infer GRNs from scRNA-seq data collected from multiple data sources or time points. We first represent each gene pair as a set of histogram images, and then introduce a deep Siamese convolutional neural network with contrastive loss to learn the low-dimensional embedding for each gene pair. Moreover, an attention mechanism is introduced to integrate the embeddings extracted from different data sources and different neighbor gene pairs. Experimental results on synthetic and real-world datasets validate the effectiveness of our contrastive learning and attention mechanisms, demonstrating the effectiveness of our model in integrating multiple data sources for GRN inference.
Collapse
Affiliation(s)
- Zerun Lin
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
| | - Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy(SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
| |
Collapse
|
80
|
Caranica C, Lu M. A data-driven optimization method for coarse-graining gene regulatory networks. iScience 2023; 26:105927. [PMID: 36698721 PMCID: PMC9868542 DOI: 10.1016/j.isci.2023.105927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2022] [Revised: 12/19/2022] [Accepted: 01/03/2023] [Indexed: 01/06/2023] Open
Abstract
One major challenge in systems biology is to understand how various genes in a gene regulatory network (GRN) collectively perform their functions and control network dynamics. This task becomes extremely hard to tackle in the case of large networks with hundreds of genes and edges, many of which have redundant regulatory roles and functions. The existing methods for model reduction usually require the detailed mathematical description of dynamical systems and their corresponding kinetic parameters, which are often not available. Here, we present a data-driven method for coarse-graining large GRNs, named SacoGraci, using ensemble-based mathematical modeling, dimensionality reduction, and gene circuit optimization by Markov Chain Monte Carlo methods. SacoGraci requires network topology as the only input and is robust against errors in GRNs. We benchmark and demonstrate its usage with synthetic, literature-based, and bioinformatics-derived GRNs. We hope SacoGraci will enhance our ability to model the gene regulation of complex biological systems.
Collapse
Affiliation(s)
- Cristian Caranica
- Department of Bioengineering, Northeastern University, Boston, MA 02115, USA,Center for Theoretical Biological Physics, Northeastern University, Boston, MA 02115, USA
| | - Mingyang Lu
- Department of Bioengineering, Northeastern University, Boston, MA 02115, USA,Center for Theoretical Biological Physics, Northeastern University, Boston, MA 02115, USA,The Jackson Laboratory, Bar Harbor, ME 04609, USA,Corresponding author
| |
Collapse
|
81
|
Koumadorakis DE, Krokidis MG, Dimitrakopoulos GN, Vrahatis AG. A Consensus Gene Regulatory Network for Neurodegenerative Diseases Using Single-Cell RNA-Seq Data. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2023; 1423:215-224. [PMID: 37525047 DOI: 10.1007/978-3-031-31978-5_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 08/02/2023]
Abstract
Gene regulatory network (GRN) inference from gene expression data is a highly complex and challenging task in systems biology. Despite the challenges, GRNs have emerged, and for complex diseases such as neurodegenerative diseases, they have the potential to provide vital information and identify key regulators. However, every GRN method produced predicts results based on its assumptions, providing limited biological insights. For that reason, the current work focused on the development of an ensemble method from individual GRN methods to address this issue. Four state-of-the-art GRN algorithms were selected to form a consensus GRN from their common gene interactions. Each algorithm uses a different construction method, and for a more robust behavior, both static and dynamic methods were selected as well. The algorithms were applied to a scRNA-seq dataset from the CK-p25 mus musculus model during neurodegeneration. The top subnetworks were constructed from the consensus network, and potential key regulators were identified. The results also demonstrated the overlap between the algorithms for the current dataset and the necessity for an ensemble approach. This work aims to demonstrate the creation of an ensemble network and provide insights into whether a combination of different GRN methods can produce valuable results.
Collapse
Affiliation(s)
- Dimitrios E Koumadorakis
- Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
| | - Marios G Krokidis
- Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
| | - Georgios N Dimitrakopoulos
- Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
| | - Aristidis G Vrahatis
- Bioinformatics and Human Electrophysiology Lab (BiHELab), Department of Informatics, Ionian University, Corfu, Greece
| |
Collapse
|
82
|
Huang L, Clauss B, Lu M. What Makes a Functional Gene Regulatory Network? A Circuit Motif Analysis. J Phys Chem B 2022; 126:10374-10383. [PMID: 36471236 PMCID: PMC9896654 DOI: 10.1021/acs.jpcb.2c05412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
One of the key questions in systems biology is to understand the roles of gene regulatory circuits in determining cellular states and their functions. In previous studies, some researchers have inferred large gene networks from genome wide genomics/transcriptomics data using the top-down approach, while others have modeled core gene circuits of small sizes using the bottom-up approach. Despite many existing systems biology studies, there is still no general rule on what sizes of gene networks and what types of circuit motifs a system would need to achieve robust biological functions. Here, we adopt a gene circuit motif analysis to discover four-node circuits responsible for multiplicity (rich in dynamical behavior), flexibility (versatile to alter gene expression), or both. We identify the most reoccurring two-node circuit motifs and the co-occurring motif pairs. Furthermore, we investigate the contributing factors of multiplicity and flexibility for large gene networks of different types and sizes. We find that gene networks of intermediate sizes tend to have combined high levels of multiplicity and flexibility. Our study will contribute to a better understanding of the dynamical mechanisms of gene regulatory circuits and provide insights into rational designs of robust gene circuits in synthetic and systems biology.
Collapse
Affiliation(s)
- Lijia Huang
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA 02115, USA
- Department of Bioengineering, Northeastern University, Boston, MA 02115, USA
| | - Benjamin Clauss
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA 02115, USA
- Genetics Program, Graduate School of Biomedical Sciences, Tufts University, Boston, MA 02111, USA
| | - Mingyang Lu
- Center for Theoretical Biological Physics, Northeastern University, Boston, MA 02115, USA
- Department of Bioengineering, Northeastern University, Boston, MA 02115, USA
- Genetics Program, Graduate School of Biomedical Sciences, Tufts University, Boston, MA 02111, USA
- The Jackson Laboratory, Bar Harbor, ME 04609, USA
| |
Collapse
|
83
|
Coulier A, Singh P, Sturrock M, Hellander A. Systematic comparison of modeling fidelity levels and parameter inference settings applied to negative feedback gene regulation. PLoS Comput Biol 2022; 18:e1010683. [PMID: 36520957 PMCID: PMC9799300 DOI: 10.1371/journal.pcbi.1010683] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/17/2022] [Revised: 12/29/2022] [Accepted: 10/25/2022] [Indexed: 12/23/2022] Open
Abstract
Quantitative stochastic models of gene regulatory networks are important tools for studying cellular regulation. Such models can be formulated at many different levels of fidelity. A practical challenge is to determine what model fidelity to use in order to get accurate and representative results. The choice is important, because models of successively higher fidelity come at a rapidly increasing computational cost. In some situations, the level of detail is clearly motivated by the question under study. In many situations however, many model options could qualitatively agree with available data, depending on the amount of data and the nature of the observations. Here, an important distinction is whether we are interested in inferring the true (but unknown) physical parameters of the model or if it is sufficient to be able to capture and explain available data. The situation becomes complicated from a computational perspective because inference needs to be approximate. Most often it is based on likelihood-free Approximate Bayesian Computation (ABC) and here determining which summary statistics to use, as well as how much data is needed to reach the desired level of accuracy, are difficult tasks. Ultimately, all of these aspects-the model fidelity, the available data, and the numerical choices for inference-interplay in a complex manner. In this paper we develop a computational pipeline designed to systematically evaluate inference accuracy for a wide range of true known parameters. We then use it to explore inference settings for negative feedback gene regulation. In particular, we compare a detailed spatial stochastic model, a coarse-grained compartment-based multiscale model, and the standard well-mixed model, across several data-scenarios and for multiple numerical options for parameter inference. Practically speaking, this pipeline can be used as a preliminary step to guide modelers prior to gathering experimental data. By training Gaussian processes to approximate the distance function values, we are able to substantially reduce the computational cost of running the pipeline.
Collapse
Affiliation(s)
- Adrien Coulier
- Department of Information Technology, Uppsala University, Uppsala, Sweden
| | - Prashant Singh
- Science for Life Laboratory, Department of Information Technology, Uppsala University, Uppsala, Sweden
| | - Marc Sturrock
- Department of Physiology, Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Andreas Hellander
- Department of Information Technology, Uppsala University, Uppsala, Sweden
- * E-mail:
| |
Collapse
|
84
|
Chuwdhury GS, Ng IOL, Ho DWH. scAnalyzeR: A Comprehensive Software Package With Graphical User Interface for Single-Cell RNA Sequencing Analysis and its Application on Liver Cancer. Technol Cancer Res Treat 2022; 21:15330338221142729. [PMID: 36476060 PMCID: PMC9742707 DOI: 10.1177/15330338221142729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022] Open
Abstract
Introduction: The application of single-cell RNA sequencing to delineate tissue heterogeneity and complexity has become increasingly popular. Given its tremendous resolution and high-dimensional capacity for in-depth investigation, single-cell RNA sequencing offers an unprecedented research power. Although some popular software packages are available for single-cell RNA sequencing data analysis and visualization, it is still a big challenge for their usage, as they provide only a command-line interface and require significant level of bioinformatics skills. Methods: We have developed scAnalyzeR, which is a single-cell RNA sequencing analysis pipeline with an interactive and user-friendly graphical interface for analyzing and visualizing single-cell RNA sequencing data. It accepts single-cell RNA sequencing data from various technology platforms and different model organisms (human and mouse) and allows flexibility in input file format. It provides functionalities for data preprocessing, quality control, basic summary statistics, dimension reduction, unsupervised clustering, differential gene expression, gene set enrichment analysis, correlation analysis, pseudotime cell trajectory inference, and various visualization plots. It also provides default parameters for easy usage and allows a wide range of flexibility and optimization by accepting user-defined options. It has been developed as a docker image that can be run in any docker-supported environment including Linux, Mac, and Windows, without installing any dependencies. Results: We compared the performance of scAnalyzeR with 2 other graphical tools that are popular for analyzing single-cell RNA sequencing data. The comparison was based on the comprehensiveness of functionalities, ease of usage and flexibility, and execution time. In general, scAnalyzeR outperformed the other tested counterparts in various aspects, demonstrating its superior overall performance. To illustrate the usefulness of scAnalyzeR in cancer research, we have analyzed the in-house liver cancer single-cell RNA sequencing dataset. Liver cancer tumor cells were revealed to have multiple subpopulations with distinctive gene expression signatures. Conclusion: scAnalyzeR has comprehensive functionalities and demonstrated usability. We anticipate more functionalities to be adopted in the future development.
Collapse
Affiliation(s)
- GS Chuwdhury
- Department of Pathology and State Key Laboratory of Liver Research, The University of Hong Kong, Hong Kong
| | - Irene Oi-Lin Ng
- Department of Pathology and State Key Laboratory of Liver Research, The University of Hong Kong, Hong Kong
| | - Daniel Wai-Hung Ho
- Department of Pathology and State Key Laboratory of Liver Research, The University of Hong Kong, Hong Kong,Daniel Ho, Department of Pathology and State Key Laboratory of Liver Research, The University of Hong Kong, Hong Kong.
| |
Collapse
|
85
|
Su M, Pan T, Chen QZ, Zhou WW, Gong Y, Xu G, Yan HY, Li S, Shi QZ, Zhang Y, He X, Jiang CJ, Fan SC, Li X, Cairns MJ, Wang X, Li YS. Data analysis guidelines for single-cell RNA-seq in biomedical studies and clinical applications. Mil Med Res 2022; 9:68. [PMID: 36461064 PMCID: PMC9716519 DOI: 10.1186/s40779-022-00434-8] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/27/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
The application of single-cell RNA sequencing (scRNA-seq) in biomedical research has advanced our understanding of the pathogenesis of disease and provided valuable insights into new diagnostic and therapeutic strategies. With the expansion of capacity for high-throughput scRNA-seq, including clinical samples, the analysis of these huge volumes of data has become a daunting prospect for researchers entering this field. Here, we review the workflow for typical scRNA-seq data analysis, covering raw data processing and quality control, basic data analysis applicable for almost all scRNA-seq data sets, and advanced data analysis that should be tailored to specific scientific questions. While summarizing the current methods for each analysis step, we also provide an online repository of software and wrapped-up scripts to support the implementation. Recommendations and caveats are pointed out for some specific analysis tasks and approaches. We hope this resource will be helpful to researchers engaging with scRNA-seq, in particular for emerging clinical applications.
Collapse
Affiliation(s)
- Min Su
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Tao Pan
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Qiu-Zhen Chen
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Wei-Wei Zhou
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
| | - Yi Gong
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
- Department of Immunology, Nanjing Medical University, Nanjing, 211166 China
| | - Gang Xu
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Huan-Yu Yan
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Si Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Qiao-Zhen Shi
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Ya Zhang
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| | - Xiao He
- Department of Laboratory Medicine, Women and Children’s Hospital of Chongqing Medical University, Chongqing, 401174 China
| | | | - Shi-Cai Fan
- Shenzhen Institute for Advanced Study, University of Electronic Science and Technology of China, Shenzhen, 518110 Guangdong China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150081 Heilongjiang China
| | - Murray J. Cairns
- School of Biomedical Sciences and Pharmacy, Faculty of Health and Medicine, the University of Newcastle, University Drive, Callaghan, NSW 2308 Australia
- Precision Medicine Research Program, Hunter Medical Research Institute, New Lambton Heights, NSW 2305 Australia
| | - Xi Wang
- State Key Laboratory of Reproductive Medicine, Nanjing Medical University, Nanjing, 211166 China
| | - Yong-Sheng Li
- College of Biomedical Information and Engineering, the First Affiliated Hospital of Hainan Medical University, Hainan Medical University, Haikou, 571199 Hainan China
| |
Collapse
|
86
|
Chen Z, King WC, Hwang A, Gerstein M, Zhang J. DeepVelo: Single-cell transcriptomic deep velocity field learning with neural ordinary differential equations. SCIENCE ADVANCES 2022; 8:eabq3745. [PMID: 36449617 PMCID: PMC9710871 DOI: 10.1126/sciadv.abq3745] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 05/14/2023]
Abstract
Recent advances in single-cell sequencing technologies have provided unprecedented opportunities to measure the gene expression profile and RNA velocity of individual cells. However, modeling transcriptional dynamics is computationally challenging because of the high-dimensional, sparse nature of the single-cell gene expression measurements and the nonlinear regulatory relationships. Here, we present DeepVelo, a neural network-based ordinary differential equation that can model complex transcriptome dynamics by describing continuous-time gene expression changes within individual cells. We apply DeepVelo to public datasets from different sequencing platforms to (i) formulate transcriptome dynamics on different time scales, (ii) measure the instability of cell states, and (iii) identify developmental driver genes via perturbation analysis. Benchmarking against the state-of-the-art methods shows that DeepVelo can learn a more accurate representation of the velocity field. Furthermore, our perturbation studies reveal that single-cell dynamical systems could exhibit chaotic properties. In summary, DeepVelo allows data-driven discoveries of differential equations that delineate single-cell transcriptome dynamics.
Collapse
Affiliation(s)
- Zhanlin Chen
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
| | - William C. King
- Healthcare and Life Sciences, Microsoft, Redmond, WA 98052, USA
| | - Aheyon Hwang
- Mathematical, Computational, and Systems Biology, University of California, Irvine, Irvine, CA 92697, USA
| | - Mark Gerstein
- Department of Statistics and Data Science, Yale University, New Haven, CT 06520, USA
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06520, USA
- Department of Computer Science, Yale University, New Haven, CT 06520, USA
- Corresponding author. (M.G.); (J.Z.)
| | - Jing Zhang
- Department of Computer Science, University of California, Irvine, Irvine, CA 92697, USA
- Corresponding author. (M.G.); (J.Z.)
| |
Collapse
|
87
|
Guan J, Wang Y, Wang Y, Zhuang Y, Ji G. SRGS: sparse partial least squares-based recursive gene selection for gene regulatory network inference. BMC Genomics 2022; 23:782. [PMID: 36451086 PMCID: PMC9710113 DOI: 10.1186/s12864-022-09020-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Accepted: 11/16/2022] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND The identification of gene regulatory networks (GRNs) facilitates the understanding of the underlying molecular mechanism of various biological processes and complex diseases. With the availability of single-cell RNA sequencing data, it is essential to infer GRNs from single-cell expression. Although some GRN methods originally developed for bulk expression data can be applicable to single-cell data and several single-cell specific GRN algorithms were developed, recent benchmarking studies have emphasized the need of developing more accurate and robust GRN modeling methods that are compatible for single-cell expression data. RESULTS We present SRGS, SPLS (sparse partial least squares)-based recursive gene selection, to infer GRNs from bulk or single-cell expression data. SRGS recursively selects and scores the genes which may have regulations on the considered target gene based on SPLS. When dealing with gene expression data with dropouts, we randomly scramble samples, set some values in the expression matrix to zeroes, and generate multiple copies of data through multiple iterations to make SRGS more robust. We test SRGS on different kinds of expression data, including simulated bulk data, simulated single-cell data without and with dropouts, and experimental single-cell data, and also compared with the existing GRN methods, including the ones originally developed for bulk data, the ones developed specifically for single-cell data, and even the ones recommended by recent benchmarking studies. CONCLUSIONS It has been shown that SRGS is competitive with the existing GRN methods and effective in the gene regulatory network inference from bulk or single-cell gene expression data. SRGS is available at: https://github.com/JGuan-lab/SRGS .
Collapse
Affiliation(s)
- Jinting Guan
- grid.12955.3a0000 0001 2264 7233Department of Automation, Xiamen University, Xiamen, Fujian China ,grid.12955.3a0000 0001 2264 7233National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian China
| | - Yang Wang
- grid.12955.3a0000 0001 2264 7233Department of Automation, Xiamen University, Xiamen, Fujian China
| | - Yongjie Wang
- grid.12955.3a0000 0001 2264 7233Department of Automation, Xiamen University, Xiamen, Fujian China
| | - Yan Zhuang
- grid.12955.3a0000 0001 2264 7233Department of Automation, Xiamen University, Xiamen, Fujian China
| | - Guoli Ji
- grid.12955.3a0000 0001 2264 7233Department of Automation, Xiamen University, Xiamen, Fujian China ,grid.12955.3a0000 0001 2264 7233National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, Fujian China
| |
Collapse
|
88
|
Mao G, Zeng R, Peng J, Zuo K, Pang Z, Liu J. Reconstructing gene regulatory networks of biological function using differential equations of multilayer perceptrons. BMC Bioinformatics 2022; 23:503. [PMID: 36434499 PMCID: PMC9700916 DOI: 10.1186/s12859-022-05055-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2022] [Accepted: 11/14/2022] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Building biological networks with a certain function is a challenge in systems biology. For the functionality of small (less than ten nodes) biological networks, most methods are implemented by exhausting all possible network topological spaces. This exhaustive approach is difficult to scale to large-scale biological networks. And regulatory relationships are complex and often nonlinear or non-monotonic, which makes inference using linear models challenging. RESULTS In this paper, we propose a multi-layer perceptron-based differential equation method, which operates by training a fully connected neural network (NN) to simulate the transcription rate of genes in traditional differential equations. We verify whether the regulatory network constructed by the NN method can continue to achieve the expected biological function by verifying the degree of overlap between the regulatory network discovered by NN and the regulatory network constructed by the Hill function. And we validate our approach by adapting to noise signals, regulator knockout, and constructing large-scale gene regulatory networks using link-knockout techniques. We apply a real dataset (the mesoderm inducer Xenopus Brachyury expression) to construct the core topology of the gene regulatory network and find that Xbra is only strongly expressed at moderate levels of activin signaling. CONCLUSION We have demonstrated from the results that this method has the ability to identify the underlying network topology and functional mechanisms, and can also be applied to larger and more complex gene network topologies.
Collapse
Affiliation(s)
- Guo Mao
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
| | - Ruigeng Zeng
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
| | - Jintao Peng
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
| | - Ke Zuo
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
| | - Zhengbin Pang
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China
| | - Jie Liu
- grid.412110.70000 0000 9548 2110Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Deya Road, Changsha, 410073 China ,grid.412110.70000 0000 9548 2110Laboratory of Software Engineering for Complex System, National University of Defense Technology, Deya Road, Changsha, 410073 China
| |
Collapse
|
89
|
Dindhoria K, Monga I, Thind AS. Computational approaches and challenges for identification and annotation of non-coding RNAs using RNA-Seq. Funct Integr Genomics 2022; 22:1105-1112. [DOI: 10.1007/s10142-022-00915-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/22/2022]
|
90
|
Xu Y, Chen J, Lyu A, Cheung WK, Zhang L. dynDeepDRIM: a dynamic deep learning model to infer direct regulatory interactions using time-course single-cell gene expression data. Brief Bioinform 2022; 23:6720420. [PMID: 36168811 DOI: 10.1093/bib/bbac424] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2022] [Revised: 08/02/2022] [Accepted: 09/01/2022] [Indexed: 12/14/2022] Open
Abstract
Time-course single-cell RNA sequencing (scRNA-seq) data have been widely used to explore dynamic changes in gene expression of transcription factors (TFs) and their target genes. This information is useful to reconstruct cell-type-specific gene regulatory networks (GRNs). However, the existing tools are commonly designed to analyze either time-course bulk gene expression data or static scRNA-seq data via pseudo-time cell ordering. A few methods successfully utilize the information from multiple time points while also considering the characteristics of scRNA-seq data. We proposed dynDeepDRIM, a novel deep learning model to reconstruct GRNs using time-course scRNA-seq data. It represents the joint expression of a gene pair as an image and utilizes the image of the target TF-gene pair and the ones of the potential neighbors to reconstruct GRNs from time-course scRNA-seq data. dynDeepDRIM can effectively remove the transitive TF-gene interactions by considering neighborhood context and model the gene expression dynamics using high-dimensional tensors. We compared dynDeepDRIM with six GRN reconstruction methods on both simulation and four real time-course scRNA-seq data. dynDeepDRIM achieved substantially better performance than the other methods in inferring TF-gene interactions and eliminated the false positives effectively. We also applied dynDeepDRIM to annotate gene functions and found it achieved evidently better performance than the other tools due to considering the neighbor genes.
Collapse
Affiliation(s)
- Yu Xu
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| | - Jiaxing Chen
- Computer Science and Technology, Division of Science and Technology, BNU-HKBU United International College, Jintong Road, 519087, Zhuhai, China
| | - Aiping Lyu
- School of Chinese Medicine, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| | - William K Cheung
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| | - Lu Zhang
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
| |
Collapse
|
91
|
Ferrari C, Manosalva Pérez N, Vandepoele K. MINI-EX: Integrative inference of single-cell gene regulatory networks in plants. MOLECULAR PLANT 2022; 15:1807-1824. [PMID: 36307979 DOI: 10.1016/j.molp.2022.10.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/01/2022] [Revised: 09/30/2022] [Accepted: 10/21/2022] [Indexed: 05/26/2023]
Abstract
Multicellular organisms, such as plants, are characterized by highly specialized and tightly regulated cell populations, establishing specific morphological structures and executing distinct functions. Gene regulatory networks (GRNs) describe condition-specific interactions of transcription factors (TFs) regulating the expression of target genes, underpinning these specific functions. As efficient and validated methods to identify cell-type-specific GRNs from single-cell data in plants are lacking, limiting our understanding of the organization of specific cell types in both model species and crops, we developed MINI-EX (Motif-Informed Network Inference based on single-cell EXpression data), an integrative approach to infer cell-type-specific networks in plants. MINI-EX uses single-cell transcriptomic data to define expression-based networks and integrates TF motif information to filter the inferred regulons, resulting in networks with increased accuracy. Next, regulons are assigned to different cell types, leveraging cell-specific expression, and candidate regulators are prioritized using network centrality measures, functional annotations, and expression specificity. This embedded prioritization strategy offers a unique and efficient means to unravel signaling cascades in specific cell types controlling a biological process of interest. We demonstrate the stability of MINI-EX toward input data sets with low number of cells and its robustness toward missing data, and show that it infers state-of-the-art networks with a better performance compared with other related single-cell network tools. MINI-EX successfully identifies key regulators controlling root development in Arabidopsis and rice, leaf development in Arabidopsis, and ear development in maize, enhancing our understanding of cell-type-specific regulation and unraveling the roles of different regulators controlling the development of specific cell types in plants.
Collapse
Affiliation(s)
- Camilla Ferrari
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Center for Plant Systems Biology, VIB, 9052 Ghent, Belgium
| | - Nicolás Manosalva Pérez
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Center for Plant Systems Biology, VIB, 9052 Ghent, Belgium
| | - Klaas Vandepoele
- Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium; Center for Plant Systems Biology, VIB, 9052 Ghent, Belgium; Bioinformatics Institute Ghent, Ghent University, 9052 Ghent, Belgium.
| |
Collapse
|
92
|
Majumder S, Thakran Y, Pal V, Singh K. Fuzzy and Rough Set Theory Based Computational Framework for Mining Genetic Interaction Triplets From Gene Expression Profiles for Lung Adenocarcinoma. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3469-3481. [PMID: 34665736 DOI: 10.1109/tcbb.2021.3120844] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/13/2023]
Abstract
Genetic interactions are very helpful in understanding different disease and discovering drugs for it. Compared to the gene pairs that represent the genetic interactions between two genes, the gene triplets are more informative and useful. However, existing works on genetic interactions among gene triplets have primarily focused on detecting gene triplets from time series gene expression profiles. Generating the time series gene expression profiles for humans is quite impracticable but the labeled gene expression profiles are available for different diseases in case of humans. In this paper, a computational framework has been proposed to detect gene triplets from labeled gene expression profiles. First, it employs Rough Set Theory for extracting the key genes and then designs a fuzzy inference system for generating possible gene triplets. Further, Root Mean Squared Error measure has been used to prune out the irrelevant gene triplets. In the present work, the proposed computational framework has been applied to labeled lung adenocarcinoma dataset and can be applied to any other labeled gene expression dataset. The extracted gene triplets and their functionalities have been verified with existing biological literature and benchmark databases and the results of verification signify that the proposed framework is promising in terms of finding useful genetic triplets. Further, the proposed framework has been found more efficient as compared to an existing mutual information-based technique in terms of detecting known genetic interactions.
Collapse
|
93
|
Jiang J, Lyu P, Li J, Huang S, Tao J, Blackshaw S, Qian J, Wang J. IReNA: Integrated regulatory network analysis of single-cell transcriptomes and chromatin accessibility profiles. iScience 2022; 25:105359. [PMID: 36325073 PMCID: PMC9619378 DOI: 10.1016/j.isci.2022.105359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2022] [Revised: 09/19/2022] [Accepted: 10/12/2022] [Indexed: 11/16/2022] Open
Abstract
Recently, single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have been developed to separately measure transcriptomes and chromatin accessibility profiles at the single-cell resolution. However, few methods can reliably integrate these data to perform regulatory network analysis. Here, we developed integrated regulatory network analysis (IReNA) for network inference through the integrated analysis of scRNA-seq and scATAC-seq data, network modularization, transcription factor enrichment, and construction of simplified intermodular regulatory networks. Using public datasets, we showed that integrated network analysis of scRNA-seq data with scATAC-seq data is more precise to identify known regulators than scRNA-seq data analysis alone. Moreover, IReNA outperformed currently available methods in identifying known regulators. IReNA facilitates the systems-level understanding of biological regulatory mechanisms and is available at https://github.com/jiang-junyao/IReNA.
Collapse
Affiliation(s)
- Junyao Jiang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Pin Lyu
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jinlian Li
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Sunan Huang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Jiawang Tao
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
| | - Seth Blackshaw
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jiang Qian
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
| | - Jie Wang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- China-New Zealand Joint Laboratory on Biomedicine and Health, Guangzhou 510530, China
- Corresponding author
| |
Collapse
|
94
|
Bocci F, Zhou P, Nie Q. spliceJAC: transition genes and state-specific gene regulation from single-cell transcriptome data. Mol Syst Biol 2022; 18:e11176. [PMID: 36321549 PMCID: PMC9627675 DOI: 10.15252/msb.202211176] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2022] [Revised: 10/07/2022] [Accepted: 10/10/2022] [Indexed: 11/25/2022] Open
Abstract
Extracting dynamical information from single-cell transcriptomics is a novel task with the promise to advance our understanding of cell state transition and interactions between genes. Yet, theory-oriented, bottom-up approaches that consider differences among cell states are largely lacking. Here, we present spliceJAC, a method to quantify the multivariate mRNA splicing from single-cell RNA sequencing (scRNA-seq). spliceJAC utilizes the unspliced and spliced mRNA count matrices to constructs cell state-specific gene-gene regulatory interactions and applies stability analysis to predict putative driver genes critical to the transitions between cell states. By applying spliceJAC to biological systems including pancreas endothelium development and epithelial-mesenchymal transition (EMT) in A549 lung cancer cells, we predict genes that serve specific signaling roles in different cell states, recover important differentially expressed genes in agreement with pre-existing analysis, and predict new transition genes that are either exclusive or shared between different cell state transitions.
Collapse
Affiliation(s)
- Federico Bocci
- Department of MathematicsUniversity of CaliforniaIrvineCAUSA
- NSF‐Simons Center for Multiscale Cell Fate ResearchUniversity of CaliforniaIrvineCAUSA
| | - Peijie Zhou
- Department of MathematicsUniversity of CaliforniaIrvineCAUSA
| | - Qing Nie
- Department of MathematicsUniversity of CaliforniaIrvineCAUSA
- NSF‐Simons Center for Multiscale Cell Fate ResearchUniversity of CaliforniaIrvineCAUSA
- Department of Developmental and Cell BiologyUniversity of CaliforniaIrvineCAUSA
| |
Collapse
|
95
|
Chen G, Liu ZP. Graph attention network for link prediction of gene regulations from single-cell RNA-sequencing data. Bioinformatics 2022; 38:4522-4529. [PMID: 35961023 DOI: 10.1093/bioinformatics/btac559] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2022] [Revised: 07/18/2022] [Accepted: 08/10/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Single-cell RNA sequencing (scRNA-seq) data provides unprecedented opportunities to reconstruct gene regulatory networks (GRNs) at fine-grained resolution. Numerous unsupervised or self-supervised models have been proposed to infer GRN from bulk RNA-seq data, but few of them are appropriate for scRNA-seq data under the circumstance of low signal-to-noise ratio and dropout. Fortunately, the surging of TF-DNA binding data (e.g. ChIP-seq) makes supervised GRN inference possible. We regard supervised GRN inference as a graph-based link prediction problem that expects to learn gene low-dimensional vectorized representations to predict potential regulatory interactions. RESULTS In this paper, we present GENELink to infer latent interactions between transcription factors (TFs) and target genes in GRN using graph attention network. GENELink projects the single-cell gene expression with observed TF-gene pairs to a low-dimensional space. Then, the specific gene representations are learned to serve for downstream similarity measurement or causal inference of pairwise genes by optimizing the embedding space. Compared to eight existing GRN reconstruction methods, GENELink achieves comparable or better performance on seven scRNA-seq datasets with four types of ground-truth networks. We further apply GENELink on scRNA-seq of human breast cancer metastasis and reveal regulatory heterogeneity of Notch and Wnt signalling pathways between primary tumour and lung metastasis. Moreover, the ontology enrichment results of unique lung metastasis GRN indicate that mitochondrial oxidative phosphorylation (OXPHOS) is functionally important during the seeding step of the cancer metastatic cascade, which is validated by pharmacological assays. AVAILABILITY AND IMPLEMENTATION The code and data are available at https://github.com/zpliulab/GENELink. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Collapse
Affiliation(s)
- Guangyi Chen
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| | - Zhi-Ping Liu
- Department of Biomedical Engineering, School of Control Science and Engineering, Shandong University, Jinan, Shandong 250061, China
| |
Collapse
|
96
|
Subramanian A, Zakeri P, Mousa M, Alnaqbi H, Alshamsi FY, Bettoni L, Damiani E, Alsafar H, Saeys Y, Carmeliet P. Angiogenesis goes computational - The future way forward to discover new angiogenic targets? Comput Struct Biotechnol J 2022; 20:5235-5255. [PMID: 36187917 PMCID: PMC9508490 DOI: 10.1016/j.csbj.2022.09.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/08/2022] [Revised: 09/09/2022] [Accepted: 09/09/2022] [Indexed: 11/26/2022] Open
Abstract
Multi-omics technologies are being increasingly utilized in angiogenesis research. Yet, computational methods have not been widely used for angiogenic target discovery and prioritization in this field, partly because (wet-lab) vascular biologists are insufficiently familiar with computational biology tools and the opportunities they may offer. With this review, written for vascular biologists who lack expertise in computational methods, we aspire to break boundaries between both fields and to illustrate the potential of these tools for future angiogenic target discovery. We provide a comprehensive survey of currently available computational approaches that may be useful in prioritizing candidate genes, predicting associated mechanisms, and identifying their specificity to endothelial cell subtypes. We specifically highlight tools that use flexible, machine learning frameworks for large-scale data integration and gene prioritization. For each purpose-oriented category of tools, we describe underlying conceptual principles, highlight interesting applications and discuss limitations. Finally, we will discuss challenges and recommend some guidelines which can help to optimize the process of accurate target discovery.
Collapse
Affiliation(s)
- Abhishek Subramanian
- Laboratory of Angiogenesis & Vascular Metabolism, Center for Cancer Biology, VIB, Leuven, Belgium
- Laboratory of Angiogenesis & Vascular Metabolism, Department of Oncology, KU Leuven, Leuven, Belgium
| | - Pooya Zakeri
- Laboratory of Angiogenesis & Vascular Heterogeneity, Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Centre for Brain and Disease Research, Flanders Institute for Biotechnology (VIB), Leuven, Belgium
- Department of Neurosciences and Leuven Brain Institute, KU Leuven, Leuven, Belgium
| | - Mira Mousa
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| | - Halima Alnaqbi
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| | - Fatima Yousif Alshamsi
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| | - Leo Bettoni
- Laboratory of Angiogenesis & Vascular Metabolism, Center for Cancer Biology, VIB, Leuven, Belgium
- Laboratory of Angiogenesis & Vascular Metabolism, Department of Oncology, KU Leuven, Leuven, Belgium
| | - Ernesto Damiani
- Robotics and Intelligent Systems Institute, Khalifa University, Abu Dhabi, United Arab Emirates
| | - Habiba Alsafar
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| | - Yvan Saeys
- Data Mining and Modelling for Biomedicine Group, VIB Center for Inflammation Research, Ghent, Belgium
- Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Ghent, Belgium
| | - Peter Carmeliet
- Laboratory of Angiogenesis & Vascular Metabolism, Center for Cancer Biology, VIB, Leuven, Belgium
- Laboratory of Angiogenesis & Vascular Metabolism, Department of Oncology, KU Leuven, Leuven, Belgium
- Laboratory of Angiogenesis & Vascular Heterogeneity, Department of Biomedicine, Aarhus University, Aarhus, Denmark
- Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
| |
Collapse
|
97
|
Shu H, Ding F, Zhou J, Xue Y, Zhao D, Zeng J, Ma J. Boosting single-cell gene regulatory network reconstruction via bulk-cell transcriptomic data. Brief Bioinform 2022; 23:6693602. [PMID: 36070863 DOI: 10.1093/bib/bbac389] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Revised: 08/09/2022] [Accepted: 08/11/2022] [Indexed: 11/12/2022] Open
Abstract
Computational recovery of gene regulatory network (GRN) has recently undergone a great shift from bulk-cell towards designing algorithms targeting single-cell data. In this work, we investigate whether the widely available bulk-cell data could be leveraged to assist the GRN predictions for single cells. We infer cell-type-specific GRNs from both the single-cell RNA sequencing data and the generic GRN derived from the bulk cells by constructing a weakly supervised learning framework based on the axial transformer. We verify our assumption that the bulk-cell transcriptomic data are a valuable resource, which could improve the prediction of single-cell GRN by conducting extensive experiments. Our GRN-transformer achieves the state-of-the-art prediction accuracy in comparison to existing supervised and unsupervised approaches. In addition, we show that our method can identify important transcription factors and potential regulations for Alzheimer's disease risk genes by using the predicted GRN. Availability: The implementation of GRN-transformer is available at https://github.com/HantaoShu/GRN-Transformer.
Collapse
Affiliation(s)
- Hantao Shu
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Fan Ding
- Department of Computer Science, Purdue University, IN 47907, United States
| | - Jingtian Zhou
- Genomic Analysis Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, United States.,Bioinformatics Program, University of California, San Diego, La Jolla, CA 92093, United States
| | - Yexiang Xue
- Department of Computer Science, Purdue University, IN 47907, United States
| | - Dan Zhao
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Jianyang Zeng
- Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, China
| | - Jianzhu Ma
- Institute for Artificial Intelligence, Peking University, Beijing 100091, China
| |
Collapse
|
98
|
Zheng H, Wang S, Li X, Hu H. INSISTC: Incorporating network structure information for single-cell type classification. Genomics 2022; 114:110480. [PMID: 36075505 DOI: 10.1016/j.ygeno.2022.110480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2022] [Revised: 08/30/2022] [Accepted: 09/04/2022] [Indexed: 11/27/2022]
Abstract
Uncovering gene regulatory mechanisms in individual cells can provide insight into cell heterogeneity and function. Recent accumulated Single-Cell RNA-Seq data have made it possible to analyze gene regulation at single-cell resolution. Understanding cell-type-specific gene regulation can assist in more accurate cell type and state identification. Computational approaches utilizing such relationships are under development. Methods pioneering in integrating gene regulatory mechanism discovery with cell-type classification encounter challenges such as determine gene regulatory relationships and incorporate gene regulatory network structure. To fill this gap, we developed INSISTC, a computational method to incorporate gene regulatory network structure information for single-cell type classification. INSISTC is capable of identifying cell-type-specific gene regulatory mechanisms while performing single-cell type classification. INSISTC demonstrated its accuracy in cell type classification and its potential for providing insight into molecular mechanisms specific to individual cells. In comparison with the alternative methods, INSISTC demonstrated its complementary performance for gene regulation interpretation.
Collapse
Affiliation(s)
- Hansi Zheng
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
| | - Saidi Wang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
| | - Xiaoman Li
- Burnett School of Biomedical Science, College of Medicine, University of Central Florida, Orlando, FL 32816, USA.
| | - Haiyan Hu
- Department of Computer Science, Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA.
| |
Collapse
|
99
|
Zhao X, Lan Y, Chen D. Exploring long non-coding RNA networks from single cell omics data. Comput Struct Biotechnol J 2022; 20:4381-4389. [PMID: 36051880 PMCID: PMC9403499 DOI: 10.1016/j.csbj.2022.08.003] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2022] [Revised: 08/01/2022] [Accepted: 08/01/2022] [Indexed: 11/03/2022] Open
|
100
|
Bulbul Ahmed M, Humayan Kabir A. Understanding of the various aspects of gene regulatory networks related to crop improvement. Gene 2022; 833:146556. [PMID: 35609798 DOI: 10.1016/j.gene.2022.146556] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2021] [Revised: 03/14/2022] [Accepted: 05/06/2022] [Indexed: 12/30/2022]
Abstract
The hierarchical relationship between transcription factors, associated proteins, and their target genes is defined by a gene regulatory network (GRN). GRNs allow us to understand how the genotype and environment of a plant are incorporated to control the downstream physiological responses. During plant growth or environmental acclimatization, GRNs are diverse and can be differently regulated across tissue types and organs. An overview of recent advances in the development of GRN that speed up basic and applied plant research is given here. Furthermore, the overview of genome and transcriptome involving GRN research along with the exciting advancement and application are discussed. In addition, different approaches to GRN predictions were elucidated. In this review, we also describe the role of GRN in crop improvement, crop plant manipulation, stress responses, speed breeding and identifying genetic variations/locus. Finally, the challenges and prospects of GRN in plant biology are discussed.
Collapse
Affiliation(s)
- Md Bulbul Ahmed
- Plant Science Department, McGill University, 21111 lakeshore Road, Ste. Anne de Bellevue H9X3V9, Quebec, Canada; Institut de Recherche en Biologie Végétale (IRBV), University of Montreal, Montréal, Québec H1X 2B2, Canada.
| | | |
Collapse
|