1
Cassan O, Lecellier CH, Martin A, Bréhélin L, Lèbre S. Optimizing data integration improves gene regulatory network inference in Arabidopsis thaliana. Bioinformatics 2024; 40:btae415. [PMID: 38913855 PMCID: PMC11227367 DOI: 10.1093/bioinformatics/btae415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 06/12/2024] [Accepted: 06/21/2024] [Indexed: 06/26/2024] Open
Abstract
MOTIVATIONS: Gene regulatory networks (GRNs) are traditionally inferred from gene expression profiles monitoring a specific condition or treatment. In the last decade, integrative strategies have successfully emerged to guide GRN inference from gene expression with complementary prior data. However, datasets used as prior information and validation gold standards are often related and limited to a subset of genes. This lack of complete and independent evaluation calls for new criteria to robustly estimate the optimal intensity of prior data integration in the inference process. RESULTS: We address this issue for two regression-based GRN inference models, a weighted random forest (weightedRF) and a generalized linear model estimated under a weighted LASSO penalty with stability selection (weightedLASSO). These approaches are applied to data from the root response to nitrate induction in Arabidopsis thaliana. For each gene, we measure how the integration of transcription factor binding motifs influences model prediction. We propose a new approach, DIOgene, that uses model prediction error and a simulated null hypothesis in order to optimize data integration strength in a hypothesis-driven, gene-specific manner. This integration scheme reveals a strong diversity of optimal integration intensities between genes, and offers good performance in minimizing prediction error as well as retrieving experimental interactions. Experimental results show that DIOgene compares favorably against state-of-the-art approaches and enables the recovery of master regulators of nitrate induction. AVAILABILITY AND IMPLEMENTATION: The R code and notebooks demonstrating the use of the proposed approaches are available in the repository https://github.com/OceaneCsn/integrative_GRN_N_induction.
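As a rough illustration of how a weighted LASSO penalty can favor regulators supported by prior data, the sketch below implements a small coordinate-descent solver in plain Python. It is not the authors' R implementation and omits stability selection and the DIOgene optimization of integration strength. The function names, the toy data, and in particular the mapping `1 - alpha * prior` from prior support to penalty weight are assumptions made for this example only.

```python
def soft_threshold(value, threshold):
    """Soft-thresholding operator used by LASSO coordinate descent."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, lam, weights, n_sweeps=100):
    """Minimize (1/2n)||y - Xb||^2 + lam * sum_j weights[j] * |b_j|."""
    n, p = len(X), len(X[0])
    coefs = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            rho = 0.0   # covariance of predictor j with the partial residual
            norm = 0.0  # mean squared value of predictor j
            for i in range(n):
                # Fit from all predictors except j (partial residual).
                fitted_without_j = sum(
                    X[i][k] * coefs[k] for k in range(p) if k != j
                )
                rho += X[i][j] * (y[i] - fitted_without_j)
                norm += X[i][j] ** 2
            rho /= n
            norm /= n
            if norm > 0:
                coefs[j] = soft_threshold(rho, lam * weights[j]) / norm
            else:
                coefs[j] = 0.0
    return coefs


def penalty_weights(prior, alpha):
    """Hypothetical mapping (an assumption, not from the paper):
    prior support in [0, 1] lowers the penalty, scaled by strength alpha."""
    return [1.0 - alpha * p for p in prior]


# Toy data: two candidate regulators that explain the target equally well.
X = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]
y = [1.0, 1.0, -1.0, -1.0]
prior = [1.0, 0.0]  # only the first regulator has motif support (assumed)

with_prior = weighted_lasso(X, y, lam=0.6, weights=penalty_weights(prior, 0.5))
without_prior = weighted_lasso(X, y, lam=0.6, weights=[1.0, 1.0])
print(with_prior, without_prior)
```

In this toy setting the uniform penalty removes both regulators, while the reduced penalty on the motif-supported regulator keeps it in the model, which is the qualitative effect that an integration strength parameter controls.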
Affiliation(s)
- Océane Cassan
  - LIRMM, Univ Montpellier, CNRS, Montpellier, 34095, France
- Charles-Henri Lecellier
  - LIRMM, Univ Montpellier, CNRS, Montpellier, 34095, France
  - IGMM, Univ Montpellier, CNRS, Montpellier, 34090, France
- Antoine Martin
  - IPSIM, CNRS, INRAE, Institut Agro, Univ Montpellier, 34060, Montpellier, France
- Sophie Lèbre
  - LIRMM, Univ Montpellier, CNRS, Montpellier, 34095, France
  - IMAG, Univ Montpellier, CNRS, Montpellier, 34090, France
  - Université Paul-Valéry-Montpellier 3, Montpellier, 34090, France
2
Roy S, Sheikh SZ, Furey TS. CoVar: A generalizable machine learning approach to identify the coordinated regulators driving variational gene expression. PLoS Comput Biol 2024; 20:e1012016. [PMID: 38630807 PMCID: PMC11057768 DOI: 10.1371/journal.pcbi.1012016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2023] [Revised: 04/29/2024] [Accepted: 03/22/2024] [Indexed: 04/19/2024] Open
Abstract
Network inference is used to model transcriptional, signaling, and metabolic interactions among genes, proteins, and metabolites that identify biological pathways influencing disease pathogenesis. Advances in machine learning (ML)-based inference models exhibit the predictive capabilities of capturing latent patterns in genomic data. Such models are emerging as an alternative to the statistical models identifying causative factors driving complex diseases. We present CoVar, an ML-based framework that builds upon the properties of existing inference models, to find the central genes driving perturbed gene expression across biological states. Unlike differentially expressed genes (DEGs) that capture changes in individual gene expression across conditions, CoVar focuses on identifying variational genes that undergo changes in their expression network interaction profiles, providing insights into changes in the regulatory dynamics, such as in disease pathogenesis. Subsequently, it finds core genes from among the nearest neighbors of these variational genes, which are central to the variational activity and influence the coordinated regulatory processes underlying the observed changes in gene expression. Through the analysis of simulated as well as yeast expression data perturbed by the deletion of the mitochondrial genome, we show that CoVar captures the intrinsic variationality and modularity in the expression data, identifying key driver genes not found through existing differential analysis methodologies.
Affiliation(s)
- Satyaki Roy
  - Department of Genetics, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Shehzad Z. Sheikh
  - Departments of Medicine and Genetics, Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, North Carolina, United States of America
- Terrence S. Furey
  - Departments of Genetics and Biology, Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, North Carolina, United States of America
3
Roy S, Sheikh SZ, Furey TS. CoVar: A generalizable machine learning approach to identify the coordinated regulators driving variational gene expression. bioRxiv: the preprint server for biology 2023:2023.01.12.523808. [PMID: 36712050 PMCID: PMC9882103 DOI: 10.1101/2023.01.12.523808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Abstract
Network inference is used to model transcriptional, signaling, and metabolic interactions among genes, proteins, and metabolites that identify biological pathways influencing disease pathogenesis. Advances in machine learning (ML)-based inference models exhibit the predictive capabilities of capturing latent patterns in genomic data. Such models are emerging as an alternative to the statistical models identifying causative factors driving complex diseases. We present CoVar, an inference framework that builds upon the properties of existing inference models, to find the central genes driving perturbed gene expression across biological states. We leverage ML-based network inference to find networks that capture the strength of regulatory interactions. Our model first pinpoints a subset of genes, termed variational, whose expression variabilities typify the differences in network connectivity between the control and perturbed data. Variational genes, by being differentially expressed themselves or possessing differentially expressed neighbor genes, capture gene expression variability. CoVar then creates subnetworks comprising variational genes and their strongly connected neighbor genes and identifies core genes central to these subnetworks that influence the bulk of the variational activity. Through the analysis of yeast expression data perturbed by the deletion of the mitochondrial genome, we show that CoVar identifies key genes not found through independent differential expression analysis.
4
Hawe JS, Saha A, Waldenberger M, Kunze S, Wahl S, Müller-Nurasyid M, Prokisch H, Grallert H, Herder C, Peters A, Strauch K, Theis FJ, Gieger C, Chambers J, Battle A, Heinig M. Network reconstruction for trans acting genetic loci using multi-omics data and prior information. Genome Med 2022; 14:125. [PMID: 36344995 PMCID: PMC9641770 DOI: 10.1186/s13073-022-01124-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 10/11/2022] [Indexed: 11/09/2022] Open
Abstract
BACKGROUND: Molecular measurements of the genome, the transcriptome, and the epigenome, often termed multi-omics data, provide an in-depth view of biological systems, and their integration is crucial for gaining insights into complex regulatory processes. These data can be used to explain disease-related genetic variants by linking them to intermediate molecular traits (quantitative trait loci, QTL). Molecular networks regulating cellular processes leave footprints in QTL results as so-called trans-QTL hotspots. Reconstructing these networks is a complex endeavor and use of biological prior information can improve network inference. However, previous efforts were limited in the types of priors used or have only been applied to model systems. In this study, we reconstruct the regulatory networks underlying trans-QTL hotspots using human cohort data and data-driven prior information. METHODS: We devised a new strategy to integrate QTL with human population-scale multi-omics data. State-of-the-art network inference methods including BDgraph and glasso were applied to these data. Comprehensive prior information to guide network inference was manually curated from large-scale biological databases. The inference approach was extensively benchmarked using simulated data and cross-cohort replication analyses. Best performing methods were subsequently applied to real-world human cohort data. RESULTS: Our benchmarks showed that prior-based strategies outperform methods without prior information in simulated data and show better replication across datasets. Application of our approach to human cohort data highlighted two novel regulatory networks related to schizophrenia and lean body mass for which we generated novel functional hypotheses. CONCLUSIONS: We demonstrate that existing biological knowledge can improve the integrative analysis of networks underlying trans associations and generate novel hypotheses about regulatory mechanisms.
Affiliation(s)
- Johann S Hawe
  - Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - German Heart Centre Munich, Department of Cardiology, Technical University Munich, Munich, Germany
  - Department of Informatics, Technical University of Munich, Garching, Germany
- Ashis Saha
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
- Melanie Waldenberger
  - Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Sonja Kunze
  - Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Simone Wahl
  - Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Martina Müller-Nurasyid
  - Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - IBE, Faculty of Medicine, LMU Munich, 81377, Munich, Germany
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
  - Department of Internal Medicine I (Cardiology), Hospital of the Ludwig-Maximilians-University (LMU) Munich, Munich, Germany
- Holger Prokisch
  - Institute of Human Genetics, School of Medicine, Technische Universität München, Munich, Germany
- Harald Grallert
  - Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
- Christian Herder
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
  - Institute for Clinical Diabetology, German Diabetes Center, Leibniz Center for Diabetes Research at Heinrich Heine University, Düsseldorf, Germany
  - Division of Endocrinology and Diabetology, Medical Faculty, Heinrich Heine University, Düsseldorf, Germany
- Annette Peters
  - Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
- Konstantin Strauch
  - Institute of Genetic Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center, Johannes Gutenberg University, Mainz, Germany
  - Chair of Genetic Epidemiology, IBE, Faculty of Medicine, LMU Munich, Munich, Germany
- Fabian J Theis
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Department of Mathematics, Technical University of Munich, Garching, Germany
- Christian Gieger
  - Research Unit of Molecular Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - Institute of Epidemiology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - German Center for Diabetes Research (DZD), Neuherberg, Germany
- John Chambers
  - Department of Epidemiology and Biostatistics, MRC-PHE Centre for Environment and Health, School of Public Health, Imperial College London, London, UK
  - Lee Kong Chian School of Medicine, Nanyang Technological University, 308232, Singapore, Singapore
- Alexis Battle
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, USA
- Matthias Heinig
  - Institute of Computational Biology, German Research Center for Environmental Health, HelmholtzZentrum München, Neuherberg, Germany
  - Department of Informatics, Technical University of Munich, Garching, Germany
  - Munich Heart Association, Partner Site Munich, DZHK (German Centre for Cardiovascular Research), 10785, Berlin, Germany
5
Jiang X, Zhang X. RSNET: inferring gene regulatory networks by a redundancy silencing and network enhancement technique. BMC Bioinformatics 2022; 23:165. [PMID: 35524190 PMCID: PMC9074326 DOI: 10.1186/s12859-022-04696-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2021] [Accepted: 04/25/2022] [Indexed: 11/29/2022] Open
Abstract
Background: Current gene regulatory network (GRN) inference methods are notorious for a great number of indirect interactions hidden in the predictions. Filtering out the indirect interactions from direct ones remains an important challenge in the reconstruction of GRNs. To address this issue, we developed a redundancy silencing and network enhancement technique (RSNET) for inferring GRNs. Results: To assess the performance of the RSNET method, we ran experiments on several gold-standard networks using a simulation study, the DREAM challenge dataset and the Escherichia coli network. The results show that the RSNET method performed better than the compared methods in sensitivity and accuracy. As a case study, we used RSNET to construct a functional GRN for apple fruit ripening from gene expression data. Conclusions: In the proposed method, the redundant interactions including weak and indirect connections are silenced by recursive optimization adaptively, and the highly dependent nodes are constrained in the model to keep the real interactions. This study provides a useful tool for inferring clean networks. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04696-w.
Affiliation(s)
- Xiaohan Jiang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan, 430074, China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan, 430074, China
6
Chen H, Maduranga DAK, Mundra PA, Zheng J. Bayesian Data Fusion of Gene Expression and Histone Modification Profiles for Inference of Gene Regulatory Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:516-525. [PMID: 30207963 DOI: 10.1109/tcbb.2018.2869590] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 06/08/2023]
Abstract
Accurately reconstructing gene regulatory networks (GRNs) from high-throughput gene expression data has been a major challenge in systems biology for decades. Many approaches have been proposed to solve this problem. However, there is still much room for the improvement of GRN inference. Integrating data from different sources is a promising strategy. Epigenetic modifications have a close relationship with gene regulation. Hence, epigenetic data such as histone modification profiles can provide useful information for uncovering regulatory interactions between genes. In this paper, we propose a method to integrate epigenetic data into the inference of GRNs. In particular, a dynamic Bayesian network (DBN) is employed to infer gene regulations from time-series gene expression data. Epigenetic data (histone modification profiles here) are integrated into the prior probability distribution of the Bayesian model. Our method has been validated on both synthetic and real datasets. Experimental results show that the integration of epigenetic data can significantly improve the performance of GRN inference. As more epigenetic datasets become available, our method would be useful for elucidating the gene regulatory mechanisms driving various cellular activities. The source code and testing datasets are available at https://github.com/Zheng-Lab/MetaGRN/tree/master/histonePrior.
7
Zhang J, Zhu W, Wang Q, Gu J, Huang LF, Sun X. Differential regulatory network-based quantification and prioritization of key genes underlying cancer drug resistance based on time-course RNA-seq data. PLoS Comput Biol 2019; 15:e1007435. [PMID: 31682596 PMCID: PMC6827891 DOI: 10.1371/journal.pcbi.1007435] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/13/2019] [Accepted: 09/24/2019] [Indexed: 12/22/2022] Open
Abstract
Drug resistance is a major cause for the failure of cancer chemotherapy or targeted therapy. However, the molecular regulatory mechanisms controlling the dynamic evolvement of drug resistance remain poorly understood. Thus, it is important to develop methods for identifying key gene regulatory mechanisms of the resistance to specific drugs. In this study, we developed a data-driven computational framework, DryNetMC, using a differential regulatory network-based modeling and characterization strategy to quantify and prioritize key genes underlying cancer drug resistance. The DryNetMC does not only infer gene regulatory networks (GRNs) via an integrated approach, but also characterizes and quantifies dynamical network properties for measuring node importance. We used time-course RNA-seq data from glioma cells treated with dbcAMP (a cAMP activator) as a realistic case to reconstruct the GRNs for sensitive and resistant cells. Based on a novel node importance index that comprehensively quantifies network topology, network entropy and expression dynamics, the top ranked genes were verified to be predictive of the drug sensitivities of different glioma cell lines, in comparison with other existing methods. The proposed method provides a quantitative approach to gain insights into the dynamic adaptation and regulatory mechanisms of cancer drug resistance and sheds light on the design of novel biomarkers or targets for predicting or overcoming drug resistance.
Affiliation(s)
- Jiajun Zhang
  - School of Mathematics, Sun Yat-Sen University, Guangzhou, China
- Wenbo Zhu
  - Department of Pharmacology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- Qianliang Wang
  - School of Mathematics, Sun Yat-Sen University, Guangzhou, China
- Jiayu Gu
  - Department of Pharmacology, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
- L. Frank Huang
  - Brain Tumor Center, Division of Experimental Hematology and Cancer Biology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, United States of America
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH, United States of America
- Xiaoqiang Sun
  - Department of Medical Informatics, Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
  - Key Laboratory of Tropical Disease Control (Sun Yat-Sen University), Chinese Ministry of Education, Guangzhou, Guangdong, China
8
Gabrb2-knockout mice displayed schizophrenia-like and comorbid phenotypes with interneuron-astrocyte-microglia dysregulation. Transl Psychiatry 2018; 8:128. [PMID: 30013074 PMCID: PMC6048160 DOI: 10.1038/s41398-018-0176-9] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 01/16/2018] [Revised: 04/30/2018] [Accepted: 06/04/2018] [Indexed: 12/05/2022] Open
Abstract
Intronic polymorphisms of the GABAA receptor β2 subunit gene (GABRB2) under adaptive evolution were associated with schizophrenia and reduced expression, especially of the long isoform which differs in electrophysiological properties from the short isoform. The present study was directed to examining the gene dosage effects of Gabrb2 in knockout mice of both heterozygous (HT) and homozygous (KO) genotypes with respect to possible schizophrenia-like and comorbid phenotypes. The KO mice, and HT mice to a lesser extent, were found to display prepulse inhibition (PPI) deficit, locomotor hyperactivity, stereotypy, sociability impairments, spatial-working and spatial-reference memory deficits, reduced depression and anxiety, and accelerated pentylenetetrazol (PTZ)-induced seizure. In addition, the KO mice were highly susceptible to audiogenic epilepsy. Some of the behavioral phenotypes showed evidence of imprinting, gender effect and amelioration by the antipsychotic risperidone, and the audiogenic epilepsy was inhibited by the antiepileptic diazepam. GABAergic parvalbumin (PV)-positive interneuron dystrophy, astrocyte dystrophy, and extensive microglia activation were observed in the frontotemporal corticolimbic regions, and reduction of newborn neurons was observed in the hippocampus by immunohistochemical staining. The neuroinflammation indicated by microglial activation was accompanied by elevated brain levels of oxidative stress marker malondialdehyde (MDA) and the pro-inflammatory cytokines tumor necrosis factor-alpha (TNF-α) and interleukin-6 (IL-6). These extensive schizophrenia-like and comorbid phenotypes brought about by Gabrb2 knockout, in conjunction with our previous findings on GABRB2 association with schizophrenia, support a pivotal role of GABRB2 in schizophrenia etiology.
9
Young WC, Raftery AE, Yeung KY. A posterior probability approach for gene regulatory network inference in genetic perturbation data. Mathematical Biosciences and Engineering: MBE 2016; 13:1241-1251. [PMID: 27775378 DOI: 10.3934/mbe.2016041] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/06/2023]
Abstract
Inferring gene regulatory networks is an important problem in systems biology. However, these networks can be hard to infer from experimental data because of the inherent variability in biological data as well as the large number of genes involved. We propose a fast, simple method for inferring regulatory relationships between genes from knockdown experiments in the NIH LINCS dataset by calculating posterior probabilities, incorporating prior information. We show that the method is able to find previously identified edges from TRANSFAC and JASPAR and discuss the merits and limitations of this approach.
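The abstract does not spell out how the posterior probabilities are computed, so the following is only a generic Bayes-rule sketch of how a prior edge probability can be combined with evidence from a knockdown experiment. The function name and the use of a Bayes factor as the summary of the data are assumptions for illustration, not the authors' model.

```python
def posterior_edge_probability(prior_prob, bayes_factor):
    """Combine a prior edge probability with a Bayes factor.

    prior_prob: prior probability that the regulator-target edge exists
        (0 < prior_prob < 1), for example derived from a curated database.
    bayes_factor: ratio P(data | edge) / P(data | no edge); how the data
        are summarized this way is an assumption of this sketch.
    """
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * bayes_factor
    return posterior_odds / (1.0 + posterior_odds)


# An uninformative prior with data favoring the edge 3:1.
print(posterior_edge_probability(0.5, 3.0))
# A skeptical prior needs stronger evidence to reach the same posterior.
print(posterior_edge_probability(0.1, 3.0))
```

Working on the odds scale keeps the prior and the data contributions separate, so the same evidence can be re-scored under different prior sources.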
Affiliation(s)
- William Chad Young
  - University of Washington, Department of Statistics, Box 354322, Seattle, WA 98195-4322, United States
10
Xu X, Olivas ND, Ikrar T, Peng T, Holmes TC, Nie Q, Shi Y. Primary visual cortex shows laminar-specific and balanced circuit organization of excitatory and inhibitory synaptic connectivity. J Physiol 2016; 594:1891-910. [PMID: 26844927 DOI: 10.1113/jp271891] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 11/11/2015] [Accepted: 01/27/2016] [Indexed: 11/08/2022] Open
Abstract
KEY POINTS: Using functional mapping assays, we conducted a quantitative assessment of both excitatory and inhibitory synaptic laminar connections to excitatory neurons in layers 2/3-6 of the mouse visual cortex (V1). Laminar-specific synaptic wiring diagrams of excitatory neurons were constructed on the basis of circuit mapping. The present study reveals that excitatory and inhibitory synaptic connectivity is spatially balanced across excitatory neuronal networks in V1. ABSTRACT: In the mammalian neocortex, excitatory neurons provide excitation in both columnar and laminar dimensions, which is modulated further by inhibitory neurons. However, our understanding of intracortical excitatory and inhibitory synaptic inputs in relation to principal excitatory neurons remains incomplete, and it is unclear how local excitatory and inhibitory synaptic connections to excitatory neurons are spatially organized on a layer-by-layer basis. In the present study, we combined whole cell recordings with laser scanning photostimulation via glutamate uncaging to map excitatory and inhibitory synaptic inputs to single excitatory neurons throughout cortical layers 2/3-6 in the mouse primary visual cortex (V1). We find that synaptic input sources of excitatory neurons span the radial columns of laminar microcircuits, and excitatory neurons in different V1 laminae exhibit distinct patterns of layer-specific organization of excitatory inputs. Remarkably, the spatial extent of inhibitory inputs of excitatory neurons for a given layer closely mirrors that of their excitatory input sources, indicating that excitatory and inhibitory synaptic connectivity is spatially balanced across excitatory neuronal networks. Strong interlaminar inhibitory inputs are found, particularly for excitatory neurons in layers 2/3 and 5. This differs from earlier studies reporting that inhibitory cortical connections to excitatory neurons are generally localized within the same cortical layer.
On the basis of the functional mapping assays, we conducted a quantitative assessment of both excitatory and inhibitory synaptic laminar connections to excitatory cells at single cell resolution, establishing precise layer-by-layer synaptic wiring diagrams of excitatory neurons in the visual cortex.
Affiliation(s)
- Xiangmin Xu
  - Department of Anatomy and Neurobiology, School of Medicine
  - Department of Biomedical Engineering
- Nicholas D Olivas
  - Department of Anatomy and Neurobiology, School of Medicine
  - Present address: Department of Neurobiology, David Geffen School of Medicine, University of California, Los Angeles, CA, USA
- Taruna Ikrar
  - Department of Anatomy and Neurobiology, School of Medicine
- Tao Peng
  - Department of Mathematics
  - Center for Complex Biological Systems
- Todd C Holmes
  - Department of Physiology and Biophysics, School of Medicine, University of California, Irvine, CA, USA
- Qing Nie
  - Department of Biomedical Engineering
  - Department of Mathematics
  - Center for Complex Biological Systems
- Yulin Shi
  - Department of Anatomy and Neurobiology, School of Medicine
11
Papasaikas P, Rao A, Huggins P, Valcarcel J, Lopez A. Reconstruction of composite regulator-target splicing networks from high-throughput transcriptome data. BMC Genomics 2015; 16 Suppl 10:S7. [PMID: 26449793 PMCID: PMC4603746 DOI: 10.1186/1471-2164-16-s10-s7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/24/2023] Open
Abstract
We present a computational framework tailored for the modeling of the complex, dynamic relationships that are encountered in splicing regulation. The starting point is whole-genome transcriptomic data from high-throughput array or sequencing methods that are used to quantify gene expression and alternative splicing across multiple contexts. This information is used as input for state of the art methods for Graphical Model Selection in order to recover the structure of a composite network that simultaneously models exon co-regulation and their cognate regulators. Community structure detection and social network analysis methods are used to identify distinct modules and key actors within the network. As a proof of concept for our framework we studied the splicing regulatory network for Drosophila development using the publicly available modENCODE data. The final model offers a comprehensive view of the splicing circuitry that underlies fly development. Identified modules are associated with major developmental hallmarks including maternally loaded RNAs, onset of zygotic gene expression, transitions between life stages and sex differentiation. Within-module key actors include well-known developmental-specific splicing regulators from the literature while additional factors previously unassociated with developmental-specific splicing are also highlighted. Finally we analyze an extensive battery of Splicing Factor knock-down transcriptome data and demonstrate that our approach captures true regulatory relationships.
12
Abstract
With the development of high-throughput genomic technologies, large, genome-wide datasets have been collected, and the integration of these datasets should provide large-scale, multidimensional, and insightful views of biological systems. We developed a method for gene association network construction based on gene expression data that integrate a variety of biological resources. Assuming gene expression data are from a multivariate Gaussian distribution, a graphical lasso (glasso) algorithm is able to estimate the sparse inverse covariance matrix by a lasso (L1) penalty. The inverse covariance matrix can be seen as direct correlation between gene pairs in the gene association network. In our work, instead of using a single penalty, different penalty values were applied for gene pairs based on a priori knowledge as to whether the two genes should be connected. The a priori information can be calculated or retrieved from other biological data, e.g., Gene Ontology similarity, protein-protein interaction, gene regulatory network. By incorporating prior knowledge, the weighted graphical lasso (wglasso) outperforms the original glasso both on simulations and on data from Arabidopsis. Simulation studies show that even when some prior knowledge is not correct, the overall quality of the wglasso network was still greater than when not incorporating that information, e.g., glasso.
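The pair-specific penalty described in this abstract can be written, under the usual graphical lasso formulation, as the following objective. Here S denotes the sample covariance matrix of the expression data and Θ the inverse covariance (precision) matrix to estimate; the symbol λ_ij and the exclusion of diagonal entries from the penalty are notational assumptions of this sketch, and the abstract does not state how prior knowledge is mapped to the penalty values.

```latex
\hat{\Theta} \;=\; \arg\max_{\Theta \succ 0}\;
  \log\det\Theta \;-\; \operatorname{tr}(S\,\Theta)
  \;-\; \sum_{i \neq j} \lambda_{ij}\,\lvert \theta_{ij} \rvert
```

Setting every λ_ij to a single value λ recovers the standard glasso; assigning a smaller λ_ij to gene pairs that prior knowledge suggests are connected makes the corresponding edge more likely to be retained.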
|
13
|
Gong W, Koyano-Nakagawa N, Li T, Garry DJ. Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data. BMC Bioinformatics 2015; 16:74. [PMID: 25887857 PMCID: PMC4359553 DOI: 10.1186/s12859-015-0460-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 07/23/2014] [Accepted: 01/12/2015] [Indexed: 02/07/2023] Open
Abstract
Background Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. Results We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoters or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10^−100), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately −9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (−9435 to −8922 bp). 
TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. Conclusion We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0460-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wuming Gong
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA.
- Naoko Koyano-Nakagawa
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA.
- Tongbin Li
  - AccuraScience LLC, 5721 Merle Hay Road, Suite #16B, Johnston, IA, 50131, USA.
- Daniel J Garry
  - Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA.
|
14
|
Christley S, Cockrell C, An G. Computational Studies of the Intestinal Host-Microbiota Interactome. COMPUTATION (BASEL, SWITZERLAND) 2015; 3:2-28. [PMID: 34765258 PMCID: PMC8580329 DOI: 10.3390/computation3010002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
A large and growing body of research implicates aberrant immune response and compositional shifts of the intestinal microbiota in the pathogenesis of many intestinal disorders. The molecular and physical interaction between the host and the microbiota, known as the host-microbiota interactome, is one of the key drivers in the pathophysiology of many of these disorders. This host-microbiota interactome is a set of dynamic and complex processes, and needs to be treated as a distinct entity and subject for study. Disentangling this complex web of interactions will require novel approaches, using a combination of data-driven bioinformatics with knowledge-driven computational modeling. This review describes the computational approaches for investigating the host-microbiota interactome, with emphasis on the human intestinal tract and innate immunity, and highlights open challenges and existing gaps in the computation methodology for advancing our knowledge about this important facet of human health.
Affiliation(s)
- Scott Christley
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
- Chase Cockrell
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
- Gary An
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
|
15
|
Studham ME, Tjärnberg A, Nordling TEM, Nelander S, Sonnhammer ELL. Functional association networks as priors for gene regulatory network inference. Bioinformatics 2014; 30:i130-8. [PMID: 24931976 PMCID: PMC4058914 DOI: 10.1093/bioinformatics/btu285] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 12/11/2022]
Abstract
Motivation: Gene regulatory network (GRN) inference reveals the influences genes have on one another in cellular regulatory systems. If the experimental data are inadequate for reliable inference of the network, informative priors have been shown to improve the accuracy of inferences. Results: This study explores the potential of undirected, confidence-weighted networks, such as those in functional association databases, as a prior source for GRN inference. Such networks often erroneously indicate symmetric interaction between genes and may contain mostly correlation-based interaction information. Despite these drawbacks, our testing on synthetic datasets indicates that even noisy priors reflect some causal information that can improve GRN inference accuracy. Our analysis on yeast data indicates that using the functional association databases FunCoup and STRING as priors can give a small improvement in GRN inference accuracy with biological data. Contact: matthew.studham@scilifelab.se Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Matthew E Studham
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Andreas Tjärnberg
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Torbjörn E M Nordling
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Sven Nelander
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
- Erik L L Sonnhammer
  - Stockholm Bioinformatics Centre, Science for Life Laboratory, SE-171 65 Solna, Sweden, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden, Department of Immunology, Genetics and Pathology, Uppsala University, Rudbeck Laboratory, SE-751 05 Uppsala, Sweden and Swedish eScience Research Center, SE-100 44 Stockholm, Sweden
|
16
|
Artificial neural network inference (ANNI): a study on gene-gene interaction for biomarkers in childhood sarcomas. PLoS One 2014; 9:e102483. [PMID: 25025207 PMCID: PMC4099183 DOI: 10.1371/journal.pone.0102483] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 02/26/2014] [Accepted: 06/19/2014] [Indexed: 01/31/2023] Open
Abstract
Objective To model the potential interaction between previously identified biomarkers in childhood sarcomas using artificial neural network inference (ANNI). Method To concisely demonstrate the biological interactions between correlated genes in an interaction network map, only 2 types of sarcomas in the childhood small round blue cell tumors (SRBCTs) dataset are discussed in this paper. A backpropagation neural network was used to model the potential interaction between genes. The prediction weights and signal directions were used to model the strengths of the interaction signals and the direction of the interaction link between genes. The ANN model was validated using Monte Carlo cross-validation to minimize the risk of over-fitting and to optimize generalization ability of the model. Results Strong connection links on certain genes (TNNT1 and FNDC5 in rhabdomyosarcoma (RMS); FCGRT and OLFM1 in Ewing’s sarcoma (EWS)) suggested their potency as central hubs in the interconnection of genes with different functionalities. The results showed that the RMS patients in this dataset are likely to be congenital and at low risk of cardiomyopathy development. The EWS patients are likely to be complicated by EWS-FLI fusion and deficiency in various signaling pathways, including Wnt, Fas/Rho and intracellular oxygen. Conclusions The ANN network inference approach and the examination of identified genes in the published literature within the context of the disease highlight the substantial influence of certain genes in sarcomas.
|
17
|
A feature selection technique for inference of graphs from their known topological properties: Revealing scale-free gene regulatory networks. Inf Sci (N Y) 2014. [DOI: 10.1016/j.ins.2014.02.096] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
|
18
|
Zheng Z, Christley S, Chiu WT, Blitz IL, Xie X, Cho KWY, Nie Q. Inference of the Xenopus tropicalis embryonic regulatory network and spatial gene expression patterns. BMC SYSTEMS BIOLOGY 2014; 8:3. [PMID: 24397936 PMCID: PMC3896677 DOI: 10.1186/1752-0509-8-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/16/2013] [Accepted: 12/19/2013] [Indexed: 11/10/2022]
Abstract
BACKGROUND During embryogenesis, signaling molecules produced by one cell population direct gene regulatory changes in neighboring cells and influence their developmental fates and spatial organization. One of the earliest events in the development of the vertebrate embryo is the establishment of three germ layers, consisting of the ectoderm, mesoderm and endoderm. Attempts to measure gene expression in vivo in different germ layers and cell types are typically complicated by the heterogeneity of cell types within biological samples (i.e., embryos), as the responses of individual cell types are intermingled into an aggregate observation of heterogeneous cell types. Here, we propose a novel method to elucidate gene regulatory circuits from these aggregate measurements in embryos of the frog Xenopus tropicalis using gene network inference algorithms and then test the ability of the inferred networks to predict spatial gene expression patterns. RESULTS We use two inference models with different underlying assumptions that incorporate existing network information, an ODE model for steady-state data and a Markov model for time series data, and contrast the performance of the two models. We apply our method to both control and knockdown embryos at multiple time points to reconstruct the core mesoderm and endoderm regulatory circuits. Those inferred networks are then used in combination with known dorsal-ventral spatial expression patterns of a subset of genes to predict spatial expression patterns for other genes. Both models are able to predict spatial expression patterns for some of the core mesoderm and endoderm genes, but interestingly of different gene subsets, suggesting that neither model is sufficient to recapitulate all of the spatial patterns, yet they are complementary for the patterns that they do capture. 
CONCLUSION The presented methodology of gene network inference combined with spatial pattern prediction provides an additional layer of validation to elucidate the regulatory circuits controlling the spatial-temporal dynamics in embryonic development.
Affiliation(s)
- Qing Nie
  - Department of Mathematics, University of California, Irvine, CA 92697, USA.
|
19
|
Jia B, Wang X. Gene regulatory network inference by point-based Gaussian approximation filters incorporating the prior information. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2013; 2013:16. [PMID: 24341668 PMCID: PMC3977693 DOI: 10.1186/1687-4153-2013-16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/25/2013] [Accepted: 11/11/2013] [Indexed: 11/17/2022]
Abstract
The extended Kalman filter (EKF) has been applied to inferring gene regulatory networks. However, it is well known that the EKF becomes less accurate when the system exhibits high nonlinearity. In addition, certain prior information about the gene regulatory network exists in practice, and no systematic approach has been developed to incorporate such prior information into the Kalman-type filter for inferring the structure of the gene regulatory network. In this paper, an inference framework based on point-based Gaussian approximation filters that can exploit the prior information is developed to solve the gene regulatory network inference problem. Different point-based Gaussian approximation filters, including the unscented Kalman filter (UKF), the third-degree cubature Kalman filter (CKF3), and the fifth-degree cubature Kalman filter (CKF5), are employed. Several types of network prior information, including the existing network structure information, sparsity assumption, and the range constraint of parameters, are considered, and the corresponding filters incorporating the prior information are developed. Experiments on a synthetic network of eight genes and the yeast protein synthesis network of five genes are carried out to demonstrate the performance of the proposed framework. The results show that the proposed methods provide more accurate inference results than existing methods, such as the EKF and the traditional UKF.
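To make the "point-based" idea concrete, the following is a minimal Python sketch of the basic one-dimensional unscented transform, the building block behind the UKF: a Gaussian is represented by a few deterministically chosen sigma points, which are pushed through the nonlinear function and re-summarized as a mean and variance. The function name and the choice kappa = 2 are illustrative assumptions; this is not the filter implementation from the paper, and it omits the prior-information constraints the abstract describes.

```python
import math

def unscented_transform_1d(f, mean, var, kappa=2.0):
    """Propagate a 1-D Gaussian N(mean, var) through f using three
    sigma points (basic unscented transform, state dimension n = 1)."""
    n = 1
    spread = math.sqrt((n + kappa) * var)
    points = [mean, mean + spread, mean - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    ys = [f(x) for x in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

# A linear map is recovered exactly: mean 2*1 + 1, variance 2^2 * 4.
lin_mean, lin_var = unscented_transform_1d(lambda x: 2.0 * x + 1.0, 1.0, 4.0)

# A quadratic map of a standard normal: the sigma points capture the
# nonlinearity without any derivative of f.
sq_mean, sq_var = unscented_transform_1d(lambda x: x * x, 0.0, 1.0)
```

Unlike the EKF, nothing here linearizes f; the approximation quality depends only on where the sigma points are placed and how they are weighted.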
Affiliation(s)
- Xiaodong Wang
  - Department of Electrical Engineering, Columbia University, New York, NY 10027, USA.
|
20
|
Ye G, Tang M, Cai JF, Nie Q, Xie X. Low-rank regularization for learning gene expression programs. PLoS One 2013; 8:e82146. [PMID: 24358148 PMCID: PMC3866120 DOI: 10.1371/journal.pone.0082146] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/01/2013] [Accepted: 10/30/2013] [Indexed: 12/25/2022] Open
Abstract
Learning gene expression programs directly from a set of observations is challenging due to the complexity of gene regulation, high noise of experimental measurements, and insufficient number of experimental measurements. Imposing additional constraints with strong and biologically motivated regularizations is critical in developing reliable and effective algorithms for inferring gene expression programs. Here we propose a new form of regularization that constrains the number of independent connectivity patterns between regulators and targets, motivated by the modular design of gene regulatory programs and the belief that the total number of independent regulatory modules should be small. We formulate a multi-target linear regression framework to incorporate this type of regularization, in which the number of independent connectivity patterns is expressed as the rank of the connectivity matrix between regulators and targets. We then generalize the linear framework to nonlinear cases, and prove that the generalized low-rank regularization model is still convex. Efficient algorithms are derived to solve both the linear and nonlinear low-rank regularized problems. Finally, we test the algorithms on three gene expression datasets, and show that the low-rank regularization improves the accuracy of gene expression prediction in these three datasets.
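One common way to write a convex low-rank multi-target regression of this kind is given below. The symbols (X for regulator expression, Y for target expression, W for the connectivity matrix, λ for the regularization weight) and the use of the nuclear norm as the convex surrogate for rank are assumptions for illustration; the abstract states only that rank is constrained and that the model is convex, not the exact objective.

```latex
\min_{W \in \mathbb{R}^{p \times q}} \;
  \frac{1}{2}\,\lVert Y - XW \rVert_F^{2} \;+\; \lambda\,\lVert W \rVert_{*},
\qquad
\lVert W \rVert_{*} \;=\; \sum_{k=1}^{\min(p,q)} \sigma_k(W)
```

Here σ_k(W) are the singular values of W, so a larger λ drives more singular values to zero and leaves fewer independent connectivity patterns between the p regulators and q targets.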
Affiliation(s)
- Guibo Ye
  - Department of Computer Science, University of California Irvine, Irvine, California, United States of America
  - Department of Mathematics, University of California Irvine, Irvine, California, United States of America
- Mengfan Tang
  - Department of Computer Science, University of California Irvine, Irvine, California, United States of America
- Jian-Feng Cai
  - Department of Mathematics, University of Iowa, Iowa City, Iowa, United States of America
- Qing Nie
  - Department of Mathematics, University of California Irvine, Irvine, California, United States of America
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California, United States of America
- Xiaohui Xie
  - Department of Computer Science, University of California Irvine, Irvine, California, United States of America
  - Center for Complex Biological Systems, University of California Irvine, Irvine, California, United States of America
|
21
|
Jin S, Zou X. Construction of the influenza A virus infection-induced cell-specific inflammatory regulatory network based on mutual information and optimization. BMC SYSTEMS BIOLOGY 2013; 7:105. [PMID: 24138989 PMCID: PMC4016583 DOI: 10.1186/1752-0509-7-105] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/25/2012] [Accepted: 09/20/2013] [Indexed: 12/21/2022]
Abstract
Background Influenza A virus (IAV) infection-induced inflammatory regulatory networks (IRNs) are extremely complex and dynamic. Specific biological experiments for investigating the interactions between individual inflammatory factors cannot provide a detailed and insightful multidimensional view of IRNs. Recently, data from high-throughput technologies have permitted system-level analyses. The construction of large and cell-specific IRNs from high-throughput data is essential to understanding the pathogenesis of IAV infection. Results In this study, we proposed a computational method, which combines nonlinear ordinary differential equation (ODE)-based optimization with mutual information, to construct a cell-specific optimized IRN during IAV infection by integrating gene expression data with a prior knowledge of network topology. Moreover, we used the average relative error and sensitivity analysis to evaluate the effectiveness of our proposed approach. Furthermore, from the optimized IRN, we confirmed 45 interactions between proteins in biological experiments and identified 37 new regulatory interactions and 8 false positive interactions, including the following interactions: IL1β regulates TLR3, TLR3 regulates IFN-β and TNF regulates IL6. Most of these regulatory interactions are statistically significant by Z-statistic. The functional annotations of the optimized IRN demonstrated clearly that the defense response, immune response, response to wounding and regulation of cytokine production are the pivotal processes of IAV-induced inflammatory response. The pathway analysis results from the Kyoto Encyclopaedia of Genes and Genomes (KEGG) showed that 8 pathways are enriched significantly. The 5 pathways were validated by experiments, and 3 other pathways, including the intestinal immune network for IgA production, the cytosolic DNA-sensing pathway and the allograft rejection pathway, are the predicted novel pathways involved in the inflammatory response. 
Conclusions Integration of knowledge-driven and data-driven methods allows us to construct an effective IRN during IAV infection. Based on the constructed network, we have identified new interactions among inflammatory factors and biological pathways. These findings provide new insight into our understanding of the molecular mechanisms in the inflammatory network in response to IAV infection. Further characterization and experimental validation of the interaction mechanisms identified from this study may lead to a novel therapeutic strategy for the control of infections and inflammatory responses.
Affiliation(s)
- Xiufen Zou
  - School of Mathematics and Statistics, Wuhan University, Wuhan 430072, China.
|
22
|
Wang YK, Hurley DG, Schnell S, Print CG, Crampin EJ. Integration of steady-state and temporal gene expression data for the inference of gene regulatory networks. PLoS One 2013; 8:e72103. [PMID: 23967277 PMCID: PMC3743784 DOI: 10.1371/journal.pone.0072103] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 02/14/2013] [Accepted: 07/05/2013] [Indexed: 01/02/2023] Open
Abstract
We develop a new regression algorithm, cMIKANA, for inference of gene regulatory networks from combinations of steady-state and time-series gene expression data. Using simulated gene expression datasets to assess the accuracy of reconstructing gene regulatory networks, we show that steady-state and time-series data sets can successfully be combined to identify gene regulatory interactions using the new algorithm. Inferring gene networks from combined data sets was found to be advantageous when using noisy measurements collected with either lower sampling rates or a limited number of experimental replicates. We illustrate our method by applying it to a microarray gene expression dataset from human umbilical vein endothelial cells (HUVECs) which combines time series data from treatment with growth factor TNF and steady state data from siRNA knockdown treatments. Our results suggest that the combination of steady-state and time-series datasets may provide better prediction of RNA-to-RNA interactions, and may also reveal biological features that cannot be identified from dynamic or steady state information alone. Finally, we consider the experimental design of genomics experiments for gene regulatory network inference and show that network inference can be improved by incorporating steady-state measurements with time-series data.
Affiliation(s)
- Yi Kan Wang
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
- Daniel G. Hurley
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
  - Department of Molecular Medicine and Pathology, University of Auckland, Auckland, New Zealand
- Santiago Schnell
  - Department of Molecular & Integrative Physiology and Department of Computational Medicine & Bioinformatics, University of Michigan Medical School, Ann Arbor, Michigan, United States of America
- Cristin G. Print
  - Department of Molecular Medicine and Pathology, University of Auckland, Auckland, New Zealand
  - New Zealand Bioinformatics Institute, Auckland, New Zealand
  - Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
- Edmund J. Crampin
  - Auckland Bioengineering Institute, University of Auckland, Auckland, New Zealand
  - Maurice Wilkins Centre for Molecular Biodiscovery, Auckland, New Zealand
  - Department of Engineering Science, University of Auckland, Auckland, New Zealand
  - Melbourne School of Engineering, The University of Melbourne, Melbourne, Victoria, Australia
  - National ICT Australia Victoria Research Lab, Canberra, Victoria, Australia
|
23
|
Reverse-engineering the genetic circuitry of a cancer cell with predicted intervention in chronic lymphocytic leukemia. Proc Natl Acad Sci U S A 2012; 110:459-64. [PMID: 23267079 DOI: 10.1073/pnas.1211130110] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/29/2023] Open
Abstract
Cellular behavior is sustained by genetic programs that are progressively disrupted in pathological conditions--notably, cancer. High-throughput gene expression profiling has been used to infer statistical models describing these cellular programs, and development is now needed to guide orientated modulation of these systems. Here we develop a regression-based model to reverse-engineer a temporal genetic program, based on relevant patterns of gene expression after cell stimulation. This method integrates the temporal dimension of biological rewiring of genetic programs and enables the prediction of the effect of targeted gene disruption at the system level. We tested the performance accuracy of this model on synthetic data before reverse-engineering the response of primary cancer cells to a proliferative (protumorigenic) stimulation in a multistate leukemia biological model (i.e., chronic lymphocytic leukemia). To validate the ability of our method to predict the effects of gene modulation on the global program, we performed an intervention experiment on a targeted gene. Comparison of the predicted and observed gene expression changes demonstrates the possibility of predicting the effects of a perturbation in a gene regulatory network, a first step toward an orientated intervention in a cancer cell genetic program.
|
24
|
Zhang X, Liu K, Liu ZP, Duval B, Richer JM, Zhao XM, Hao JK, Chen L. NARROMI: a noise and redundancy reduction technique improves accuracy of gene regulatory network inference. Bioinformatics 2012; 29:106-13. [DOI: 10.1093/bioinformatics/bts619] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Indexed: 12/30/2022] Open
|
25
|
Poultney CS, Greenfield A, Bonneau R. Integrated inference and analysis of regulatory networks from multi-level measurements. Methods Cell Biol 2012; 110:19-56. [PMID: 22482944 PMCID: PMC5615108 DOI: 10.1016/b978-0-12-388403-9.00002-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]
Abstract
Regulatory and signaling networks coordinate the enormously complex interactions and processes that control cellular processes (such as metabolism and cell division), coordinate response to the environment, and carry out multiple cell decisions (such as development and quorum sensing). Regulatory network inference is the process of inferring these networks, traditionally from microarray data but increasingly incorporating other measurement types such as proteomics, ChIP-seq, metabolomics, and mass cytometry. We discuss existing techniques for network inference. We review in detail our pipeline, which consists of an initial biclustering step, designed to estimate co-regulated groups; a network inference step, designed to select and parameterize likely regulatory models for the control of the co-regulated groups from the biclustering step; and a visualization and analysis step, designed to find and communicate key features of the network. Learning biological networks from even the most complete data sets is challenging; we argue that integrating new data types into the inference pipeline produces networks of increased accuracy, validity, and biological relevance.
Affiliation(s)
- Christopher S Poultney
  - Department of Biology, Center for Genomics and Systems Biology, New York University, New York, NY, USA
|
26
|
Wang XD, Qi YX, Jiang ZL. Reconstruction of transcriptional network from microarray data using combined mutual information and network-assisted regression. IET Syst Biol 2011; 5:95-102. [PMID: 21405197 DOI: 10.1049/iet-syb.2010.0041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/20/2022] Open
Abstract
Many methods have been developed for inferring transcriptional networks from gene expression. However, new methods that disclose more detailed and exact network information are still needed. Using network-assisted regression, the authors combined the averaged three-way mutual information (AMI3) and a non-linear ordinary differential equation (ODE) model to infer the transcriptional network and to obtain both the topological structure and the regulatory dynamics. Synthetic and experimental data were used to evaluate the performance of this approach. In comparison with previous methods based on mutual information, AMI3 obtained higher precision at the same sensitivity. To describe the regulatory dynamics between transcription factors and target genes, network-assisted regression and regression without network were applied to the steady-state and time series microarray data, respectively. The results revealed that, compared with regression without network, network-assisted regression increased the precision but decreased the goodness of fit. The authors then reconstructed the transcriptional network of Escherichia coli and simulated the regulatory dynamics of genes. Furthermore, the authors' approach identified potential transcription factors regulating the yeast cell cycle. In conclusion, network-assisted regression, combining AMI3 and an ODE model, inferred the topological structure and the regulatory dynamics of transcriptional networks from microarray data more precisely. [Includes supplementary material].
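As background for the mutual-information step, here is a minimal Python sketch of plain pairwise mutual information between two discretized expression profiles. It is a deliberate simplification: AMI3 as named in the abstract is an averaged three-way quantity, which this sketch does not implement, and the function name and the discretization into integer levels are assumptions.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in bits) between two equally long sequences of
    discrete expression levels, estimated from empirical frequencies."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), count in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (count / n) * math.log2(count * n / (px[x] * py[y]))
    return mi

# A gene that perfectly tracks its regulator shares one full bit.
mi_dependent = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])

# Two genes whose levels are unrelated share no information.
mi_independent = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Gene pairs scoring above a chosen threshold would become candidate edges; how the three-way averaging and the ODE-based regression refine those candidates is specific to the paper and is not shown.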
Affiliation(s)
- X-D Wang
- Shanghai Jiao Tong University, Institute of Mechanobiology and Medical Engineering, Shanghai, People's Republic of China
|
27
|
Kerwin RE, Jimenez-Gomez JM, Fulop D, Harmer SL, Maloof JN, Kliebenstein DJ. Network quantitative trait loci mapping of circadian clock outputs identifies metabolic pathway-to-clock linkages in Arabidopsis. THE PLANT CELL 2011; 23:471-85. [PMID: 21343415 PMCID: PMC3077772 DOI: 10.1105/tpc.110.082065] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.2] [Received: 12/08/2010] [Revised: 01/19/2011] [Accepted: 01/30/2011] [Indexed: 05/18/2023]
Abstract
Modern systems biology permits the study of complex networks, such as circadian clocks, and the use of complex methodologies, such as quantitative genetics. However, it is difficult to combine these approaches due to factorial expansion in experiments when networks are examined using complex methods. We developed a genomic quantitative genetic approach to overcome this problem, allowing us to examine the function(s) of the plant circadian clock in different populations derived from natural accessions. Using existing microarray data, we defined 24 circadian time phase groups (i.e., groups of genes with peak phases of expression at particular times of day). These groups were used to examine natural variation in circadian clock function using existing single time point microarray experiments from a recombinant inbred line population. We identified naturally variable loci that altered circadian clock outputs and linked these circadian quantitative trait loci to preexisting metabolomics quantitative trait loci, thereby identifying possible links between clock function and metabolism. Using single-gene isogenic lines, we found that circadian clock output was altered by natural variation in Arabidopsis thaliana secondary metabolism. Specifically, genetic manipulation of a secondary metabolic enzyme led to altered free-running rhythms. This represents a unique and valuable approach to the study of complex networks using quantitative genetics.
Affiliation(s)
- Rachel E. Kerwin
- Department of Plant Sciences, University of California, Davis, California 95616
| | - Jose M. Jimenez-Gomez
- Department of Plant Biology, University of California, Davis, California 95616
- Max Planck Institute for Plant Breeding Research, Plant Breeding and Genetics Department, 50829 Cologne, Germany
| | - Daniel Fulop
- Department of Plant Biology, University of California, Davis, California 95616
| | - Stacey L. Harmer
- Department of Plant Biology, University of California, Davis, California 95616
| | - Julin N. Maloof
- Department of Plant Biology, University of California, Davis, California 95616
| | - Daniel J. Kliebenstein
- Department of Plant Sciences, University of California, Davis, California 95616
- Address correspondence to
|
28
|
Christley S, Lee B, Dai X, Nie Q. Integrative multicellular biological modeling: a case study of 3D epidermal development using GPU algorithms. BMC SYSTEMS BIOLOGY 2010; 4:107. [PMID: 20696053 PMCID: PMC2936904 DOI: 10.1186/1752-0509-4-107] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 01/28/2010] [Accepted: 08/09/2010] [Indexed: 12/18/2022]
Abstract
BACKGROUND Simulation of sophisticated biological models requires considerable computational power. These models typically integrate together numerous biological phenomena such as spatially-explicit heterogeneous cells, cell-cell interactions, cell-environment interactions and intracellular gene networks. The recent advent of programming for graphical processing units (GPU) opens up the possibility of developing more integrative, detailed and predictive biological models while at the same time decreasing the computational cost to simulate those models. RESULTS We construct a 3D model of epidermal development and provide a set of GPU algorithms that executes significantly faster than sequential central processing unit (CPU) code. We provide a parallel implementation of the subcellular element method for individual cells residing in a lattice-free spatial environment. Each cell in our epidermal model includes an internal gene network, which integrates cellular interaction of Notch signaling together with environmental interaction of basement membrane adhesion, to specify cellular state and behaviors such as growth and division. We take a pedagogical approach to describing how modeling methods are efficiently implemented on the GPU including memory layout of data structures and functional decomposition. We discuss various programmatic issues and provide a set of design guidelines for GPU programming that are instructive to avoid common pitfalls as well as to extract performance from the GPU architecture. CONCLUSIONS We demonstrate that GPU algorithms represent a significant technological advance for the simulation of complex biological models. We further demonstrate with our epidermal model that the integration of multiple complex modeling methods for heterogeneous multicellular biological processes is both feasible and computationally tractable using this new technology. 
We hope that the provided algorithms and source code will be a starting point for modelers to develop their own GPU implementations, and encourage others to implement their modeling methods on the GPU and to make that code available to the wider community.
Affiliation(s)
- Scott Christley
- Department of Mathematics, University of California, Irvine, CA 92697, USA.
|