51
BTNET : boosted tree based gene regulatory network inference algorithm using time-course measurement data. BMC SYSTEMS BIOLOGY 2018; 12:20. [PMID: 29560827 PMCID: PMC5861501 DOI: 10.1186/s12918-018-0547-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Indexed: 12/17/2022]
Abstract
Background: Identifying gene regulatory networks is an important task for understanding biological systems. Time-course measurement data have become a valuable resource for inferring gene regulatory networks. Various methods have been presented for reconstructing the networks from time-course measurement data. However, existing methods have been validated on only a limited number of benchmark datasets, and rarely verified on real biological systems. Results: We first integrated benchmark time-course gene expression datasets from previous studies and reassessed the baseline methods. We observed that GENIE3-time, a tree-based ensemble method, achieved the best performance among the baselines. In this study, we introduce BTNET, a boosted tree based gene regulatory network inference algorithm which improves on the state of the art. We quantitatively validated BTNET on the integrated benchmark dataset. The AUROC and AUPR scores of BTNET were higher than those of the baselines. We also qualitatively validated the results of BTNET through an experiment on neuroblastoma cells treated with an antidepressant. The inferred regulatory network from BTNET showed that brachyury, a transcription factor, was regulated by fluoxetine, an antidepressant, which was verified by the expression of its downstream genes. Conclusions: We present BTNET, which infers a GRN from time-course measurement data using boosting algorithms. Our model achieved the highest AUROC and AUPR scores on the integrated benchmark dataset. We further validated BTNET qualitatively through a wet-lab experiment and showed that BTNET can produce biologically meaningful results.
52
Windowed Granger causal inference strategy improves discovery of gene regulatory networks. Proc Natl Acad Sci U S A 2018; 115:2252-2257. [PMID: 29440433 DOI: 10.1073/pnas.1710936115] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 11/18/2022] Open
Abstract
Accurate inference of regulatory networks from experimental data facilitates the rapid characterization and understanding of biological systems. High-throughput technologies can provide a wealth of time-series data to better interrogate the complex regulatory dynamics inherent to organisms, but many network inference strategies do not effectively use temporal information. We address this limitation by introducing Sliding Window Inference for Network Generation (SWING), a generalized framework that incorporates multivariate Granger causality to infer network structure from time-series data. SWING moves beyond existing Granger methods by generating windowed models that simultaneously evaluate multiple upstream regulators at several potential time delays. We demonstrate that SWING elucidates network structure with greater accuracy in both in silico and experimentally validated in vitro systems. We estimate the apparent time delays present in each system and demonstrate that SWING infers time-delayed, gene-gene interactions that are distinct from baseline methods. By providing a temporal framework to infer the underlying directed network topology, SWING generates testable hypotheses for gene-gene influences.
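As a rough illustration of the windowing idea in this abstract (not the SWING implementation), the sketch below slides a fixed-width window along a single time series and pairs each response window with earlier explanatory windows at several lags. The function name `lagged_windows` and its parameters `width`, `min_lag`, and `max_lag` are hypothetical; a real method would fit a multivariate model over such pairs for many genes at once.

```python
def lagged_windows(series, width, min_lag, max_lag):
    """Pair each response window with earlier explanatory windows.

    Returns a list of (lag, explanatory_window, response_window) tuples.
    This is an illustrative sketch, not the published SWING code.
    """
    pairs = []
    n = len(series)
    for start in range(n - width + 1):
        response = series[start:start + width]
        for lag in range(min_lag, max_lag + 1):
            exp_start = start - lag
            if exp_start < 0:
                continue  # no explanatory window exists that far back
            explanatory = series[exp_start:exp_start + width]
            pairs.append((lag, explanatory, response))
    return pairs


# Toy expression values for one gene at five time points (made up).
pairs = lagged_windows([1, 2, 3, 4, 5], width=3, min_lag=1, max_lag=2)
for lag, explanatory, response in pairs:
    print(lag, explanatory, response)
```

Each tuple could then feed a regression of the response window on the explanatory windows of candidate regulators, so that several time delays are evaluated together.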
53
Ezer D, Shepherd SJK, Brestovitsky A, Dickinson P, Cortijo S, Charoensawan V, Box MS, Biswas S, Jaeger KE, Wigge PA. The G-Box Transcriptional Regulatory Code in Arabidopsis. PLANT PHYSIOLOGY 2017; 175:628-640. [PMID: 28864470 PMCID: PMC5619884 DOI: 10.1104/pp.17.01086] [Citation(s) in RCA: 81] [Impact Index Per Article: 11.6] [Received: 08/03/2017] [Accepted: 08/30/2017] [Indexed: 05/19/2023]
Abstract
Plants have significantly more transcription factor (TF) families than animals and fungi, and plant TF families tend to contain more genes; these expansions are linked to adaptation to environmental stressors. Many TF family members bind to similar or identical sequence motifs, such as G-boxes (CACGTG), so it is difficult to predict regulatory relationships. We determined that the flanking sequences near G-boxes help determine in vitro specificity but that this is insufficient to predict the transcription pattern of genes near G-boxes. Therefore, we constructed a gene regulatory network that identifies the set of bZIPs and bHLHs that are most predictive of the expression of genes downstream of perfect G-boxes. This network accurately predicts transcriptional patterns and reconstructs known regulatory subnetworks. Finally, we present Ara-BOX-cis (araboxcis.org), a Web site that provides interactive visualizations of the G-box regulatory network, a useful resource for generating predictions for gene regulatory relations.
Affiliation(s)
- Daphne Ezer
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Samuel J K Shepherd
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Anna Brestovitsky
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Patrick Dickinson
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Sandra Cortijo
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Varodom Charoensawan
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
  - Department of Biochemistry, Faculty of Science, and Integrative Computational BioScience Center, Mahidol University, Bangkok 10400, Thailand
- Mathew S Box
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Surojit Biswas
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Katja E Jaeger
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
- Philip A Wigge
  - Sainsbury Laboratory, University of Cambridge, Cambridge CB2 1LR, United Kingdom
  - Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, United Kingdom
54
Kordmahalleh MM, Sefidmazgi MG, Harrison SH, Homaifar A. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network. BioData Min 2017; 10:29. [PMID: 28785315 PMCID: PMC5543747 DOI: 10.1186/s13040-017-0146-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2017] [Accepted: 07/14/2017] [Indexed: 01/25/2023] Open
Abstract
Background: The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications. Methods: We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN. Results: Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network sizes and levels of stochastic noise. We found our HRNN method to be superior in terms of accuracy for nonlinear data sets with higher amounts of noise. Conclusions: The proposed method identifies time-delayed gene-gene interactions of GRNs. The topology-based advancement of our HRNN worked as expected by more effectively modeling nonlinear data sets. As a non-fully connected network, an added benefit to HRNN was how it helped to find the few genes which regulated the target gene over different time delays.
Affiliation(s)
- Mina Moradi Kordmahalleh
  - Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Mohammad Gorji Sefidmazgi
  - Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Scott H Harrison
  - Department of Biology, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
- Abdollah Homaifar
  - Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
55
MapReduce Algorithms for Inferring Gene Regulatory Networks from Time-Series Microarray Data Using an Information-Theoretic Approach. BIOMED RESEARCH INTERNATIONAL 2017; 2017:6261802. [PMID: 28243601 PMCID: PMC5294223 DOI: 10.1155/2017/6261802] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 06/07/2016] [Revised: 11/14/2016] [Accepted: 12/13/2016] [Indexed: 12/15/2022]
Abstract
Gene regulation is a series of processes that control gene expression and its extent. The connections among genes and their regulatory molecules, usually transcription factors, and a descriptive model of such connections are known as gene regulatory networks (GRNs). Elucidating GRNs is crucial to understand the inner workings of the cell and the complexity of gene interactions. To date, numerous algorithms have been developed to infer gene regulatory networks. However, as the number of identified genes increases and the complexity of their interactions is uncovered, networks and their regulatory mechanisms become cumbersome to test. Furthermore, prodding through experimental results requires an enormous amount of computation, resulting in slow data processing. Therefore, new approaches are needed to expeditiously analyze copious amounts of experimental data resulting from cellular GRNs. To meet this need, cloud computing is promising as reported in the literature. Here, we propose new MapReduce algorithms for inferring gene regulatory networks on a Hadoop cluster in a cloud environment. These algorithms employ an information-theoretic approach to infer GRNs using time-series microarray data. Experimental results show that our MapReduce program is much faster than an existing tool while achieving slightly better prediction accuracy than the existing tool.
56
Module Anchored Network Inference: A Sequential Module-Based Approach to Novel Gene Network Construction from Genomic Expression Data on Human Disease Mechanism. Int J Genomics 2017; 2017:8514071. [PMID: 28197408 PMCID: PMC5286469 DOI: 10.1155/2017/8514071] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/01/2016] [Accepted: 12/15/2016] [Indexed: 11/26/2022] Open
Abstract
Different computational approaches have been examined and compared for inferring network relationships from time-series genomic data on human disease mechanisms under the recent Dialogue on Reverse Engineering Assessment and Methods (DREAM) challenge. Many of these approaches infer all possible relationships among all candidate genes, often resulting in extremely crowded candidate network relationships with many more False Positives than True Positives. To overcome this limitation, we introduce a novel approach, Module Anchored Network Inference (MANI), that constructs networks by sequentially analyzing small adjacent building blocks (modules). Using MANI, we inferred a 7-gene adipogenesis network based on time-series gene expression data during adipocyte differentiation. MANI was also applied to infer two 10-gene networks based on time-course perturbation datasets from DREAM3 and DREAM4 challenges. MANI accurately inferred and distinguished serial, parallel, and time-dependent gene interactions and network cascades in these applications, showing superior performance to other in silico network inference techniques for discovering and reconstructing gene network relationships.
57
Barman S, Kwon YK. A novel mutual information-based Boolean network inference method from time-series gene expression data. PLoS One 2017; 12:e0171097. [PMID: 28178334 PMCID: PMC5298315 DOI: 10.1371/journal.pone.0171097] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 08/24/2016] [Accepted: 01/16/2017] [Indexed: 11/27/2022] Open
Abstract
Background: Inferring a gene regulatory network from time-series gene expression data in systems biology is a challenging problem. Many methods have been suggested, most of which have a scalability limitation due to the combinatorial cost of searching a regulatory set of genes. In addition, they have focused on the accurate inference of a network structure only. Therefore, there is a pressing need to develop a network inference method to search regulatory genes efficiently and to predict the network dynamics accurately. Results: In this study, we employed a Boolean network model with a restricted update rule scheme to capture coarse-grained dynamics, and propose a novel mutual information-based Boolean network inference (MIBNI) method. Given time-series gene expression data as an input, the method first identifies a set of initial regulatory genes using mutual information-based feature selection, and then improves the dynamics prediction accuracy by iteratively swapping a pair of genes between sets of the selected regulatory genes and the other genes. Through extensive simulations with artificial datasets, MIBNI showed consistently better performance than six well-known existing methods, REVEAL, Best-Fit, RelNet, CST, CLR, and BIBN in terms of both structural and dynamics prediction accuracy. We further tested the proposed method with two real gene expression datasets for an Escherichia coli gene regulatory network and a fission yeast cell cycle network, and also observed better results using MIBNI compared to the six other methods. Conclusions: Taken together, MIBNI is a promising tool for predicting both the structure and the dynamics of a gene regulatory network.
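To make the first step of this abstract concrete, here is a minimal sketch of ranking candidate regulators by mutual information on Boolean time series. It is not the MIBNI code: the one-step lag, the function names `lagged_mutual_information` and `rank_regulators`, and the toy data are all assumptions for illustration, and the iterative gene-swapping stage is not shown.

```python
from collections import Counter
from math import log2


def lagged_mutual_information(regulator, target):
    """Mutual information (in bits) between regulator[t] and target[t + 1]."""
    xs = regulator[:-1]   # regulator states at time t
    ys = target[1:]       # target states at time t + 1
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


def rank_regulators(series, target_name):
    """Order candidate regulators of one target by decreasing lagged MI."""
    target = series[target_name]
    scores = {
        name: lagged_mutual_information(values, target)
        for name, values in series.items()
        if name != target_name
    }
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy Boolean time series (made up): C copies the previous state of A.
series = {
    "A": [0, 1, 0, 1, 0, 1, 0, 1, 0],
    "B": [0, 0, 1, 1, 0, 1, 1, 0, 0],
    "C": [1, 0, 1, 0, 1, 0, 1, 0, 1],
}
ranking, scores = rank_regulators(series, "C")
print(ranking, scores)
```

In this toy case gene A carries one full bit of information about the next state of C while B carries none, so A is ranked first as a candidate regulator.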
Affiliation(s)
- Shohag Barman
  - School of Electrical Engineering, University of Ulsan, Daehak-ro, Nam-gu, Ulsan, Republic of Korea
- Yung-Keun Kwon
  - School of Electrical Engineering, University of Ulsan, Daehak-ro, Nam-gu, Ulsan, Republic of Korea
58
Henriques D, Villaverde AF, Rocha M, Saez-Rodriguez J, Banga JR. Data-driven reverse engineering of signaling pathways using ensembles of dynamic models. PLoS Comput Biol 2017; 13:e1005379. [PMID: 28166222 PMCID: PMC5319798 DOI: 10.1371/journal.pcbi.1005379] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 05/11/2016] [Revised: 02/21/2017] [Accepted: 01/24/2017] [Indexed: 11/19/2022] Open
Abstract
Despite significant efforts and remarkable progress, the inference of signaling networks from experimental data remains very challenging. The problem is particularly difficult when the objective is to obtain a dynamic model capable of predicting the effect of novel perturbations not considered during model training. The problem is ill-posed due to the nonlinear nature of these systems, the fact that only a fraction of the involved proteins and their post-translational modifications can be measured, and limitations on the technologies used for growing cells in vitro, perturbing them, and measuring their variations. As a consequence, there is a pervasive lack of identifiability. To overcome these issues, we present a methodology called SELDOM (enSEmbLe of Dynamic lOgic-based Models), which builds an ensemble of logic-based dynamic models, trains them to experimental data, and combines their individual simulations into an ensemble prediction. It also includes a model reduction step to prune spurious interactions and mitigate overfitting. SELDOM is a data-driven method, in the sense that it does not require any prior knowledge of the system: the interaction networks that act as scaffolds for the dynamic models are inferred from data using mutual information. We have tested SELDOM on a number of experimental and in silico signal transduction case-studies, including the recent HPN-DREAM breast cancer challenge. We found that its performance is highly competitive compared to state-of-the-art methods for the purpose of recovering network topology. More importantly, the utility of SELDOM goes beyond basic network inference (i.e. uncovering static interaction networks): it builds dynamic (based on ordinary differential equation) models, which can be used for mechanistic interpretations and reliable dynamic predictions in new experimental conditions (i.e. not used in the training). For this task, SELDOM's ensemble prediction is not only consistently better than predictions from individual models, but also often outperforms the state of the art represented by the methods used in the HPN-DREAM challenge.
Affiliation(s)
- David Henriques
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
- Alejandro F. Villaverde
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Miguel Rocha
  - Centre of Biological Engineering, University of Minho, Braga, Portugal
- Julio Saez-Rodriguez
  - Joint Research Center for Computational Biomedicine, RWTH-Aachen University, Aachen, Germany
  - European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, United Kingdom
- Julio R. Banga
  - Bioprocess Engineering Group, Spanish National Research Council, IIM-CSIC, Vigo, Spain
59
Gui S, Rice AP, Chen R, Wu L, Liu J, Miao H. A scalable algorithm for structure identification of complex gene regulatory network from temporal expression data. BMC Bioinformatics 2017; 18:74. [PMID: 28143596 PMCID: PMC5294888 DOI: 10.1186/s12859-017-1489-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 10/22/2016] [Accepted: 01/20/2017] [Indexed: 12/31/2022] Open
Abstract
Background: Gene regulatory interactions are of fundamental importance to various biological functions and processes. However, only a few previous computational studies have claimed success in revealing genome-wide regulatory landscapes from temporal gene expression data, especially for complex eukaryotes like humans. Moreover, recent work suggests that these methods still suffer from the curse of dimensionality when the network size increases to 100 or more. Results: Here we present a novel scalable algorithm for identifying genome-wide gene regulatory network (GRN) structures, and we have verified the algorithm's performance by extensive simulation studies based on the DREAM challenge benchmark data. The highlight of our method is that its superior performance does not degenerate even for a network size on the order of 10^4, and is thus readily applicable to large-scale complex networks. Such a breakthrough is achieved by considering both prior biological knowledge and multiple topological properties (i.e., sparsity and hub gene structure) of complex networks in the regularized formulation. We also validate and illustrate the application of our algorithm in practice using the time-course gene expression data from a study on human respiratory epithelial cells in response to influenza A virus (IAV) infection, as well as the ChIP-seq data from ENCODE on transcription factor (TF) and target gene interactions. An interesting finding, owing to the proposed algorithm, is that the biggest hub structures (e.g., top ten) in the GRN all center at some transcription factors in the context of epithelial cell infection by IAV. Conclusions: The proposed algorithm is the first scalable method for large complex network structure identification. The GRN structure identified by our algorithm could reveal possible biological links and help researchers to choose which gene functions to investigate in a biological event. The algorithm described in this article is implemented in MATLAB®, and the source code is freely available from https://github.com/Hongyu-Miao/DMI.git.
Affiliation(s)
- Shupeng Gui
  - Department of Computer Science, University of Rochester, Rochester, 14620, NY, USA
- Andrew P Rice
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, 77030, TX, USA
- Rui Chen
  - Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, 77030, TX, USA
- Liang Wu
  - Department of Biostatistics, University of Texas Health Science Center, Houston, 77030, TX, USA
- Ji Liu
  - Department of Computer Science, University of Rochester, Rochester, 14620, NY, USA
  - Goergen Institute for Data Science, University of Rochester, Rochester, 14620, NY, USA
- Hongyu Miao
  - Department of Biostatistics, University of Texas Health Science Center, Houston, 77030, TX, USA
60
Abstract
The goal of gene regulatory network (GRN) inference is to determine the interactions between genes given heterogeneous data capturing spatiotemporal gene expression. Since transcription underlies all cellular processes, the inference of GRNs is the first step in deciphering the determinants of the dynamics of biological systems. Here, we first describe the generic steps of the inference approaches that rely on similarity measures and group the similarity measures based on the computational methodology used. For each group of similarity measures, we not only review the existing approaches but also describe specifically the detailed steps of the existing state-of-the-art algorithms.
Affiliation(s)
- Nooshin Omranian
  - Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm, 14476, Germany
- Zoran Nikoloski
  - Systems Biology and Mathematical Modeling Group, Max Planck Institute of Molecular Plant Physiology, Am Mühlenberg 1, Potsdam-Golm, 14476, Germany
61
Young WC, Raftery AE, Yeung KY. A posterior probability approach for gene regulatory network inference in genetic perturbation data. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2016; 13:1241-1251. [PMID: 27775378 DOI: 10.3934/mbe.2016041] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 06/06/2023]
Abstract
Inferring gene regulatory networks is an important problem in systems biology. However, these networks can be hard to infer from experimental data because of the inherent variability in biological data as well as the large number of genes involved. We propose a fast, simple method for inferring regulatory relationships between genes from knockdown experiments in the NIH LINCS dataset by calculating posterior probabilities, incorporating prior information. We show that the method is able to find previously identified edges from TRANSFAC and JASPAR and discuss the merits and limitations of this approach.
Affiliation(s)
- William Chad Young
  - University of Washington, Department of Statistics, Box 354322, Seattle, WA 98195-4322, United States
62
Hu Y, Zhao H, Ai X. Inferring Weighted Directed Association Network from Multivariate Time Series with a Synthetic Method of Partial Symbolic Transfer Entropy Spectrum and Granger Causality. PLoS One 2016; 11:e0166084. [PMID: 27832153 PMCID: PMC5104482 DOI: 10.1371/journal.pone.0166084] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 07/12/2016] [Accepted: 10/21/2016] [Indexed: 11/18/2022] Open
Abstract
Complex network methodology is very useful for exploring complex systems. However, the relationships among variables in a complex system are usually not clear. Therefore, inferring association networks among variables from their observed data has been a popular research topic. We propose a synthetic method, named small-shuffle partial symbolic transfer entropy spectrum (SSPSTES), for inferring an association network from multivariate time series. The method synthesizes surrogate data, partial symbolic transfer entropy (PSTE) and Granger causality. Selecting a proper threshold is crucial for common correlation identification methods, but it is not easy for users. The proposed method can not only identify strong correlations without selecting a threshold but also quantify correlations, identify their direction, and identify their temporal relation. The method can be divided into three layers, i.e. data layer, model layer and network layer. In the model layer, the method identifies all the possible pair-wise correlations. In the network layer, we introduce a filter algorithm to remove indirect weak correlations and retain strong correlations. Finally, we build a weighted adjacency matrix, the value of each entry representing the correlation level between pair-wise variables, and then get the weighted directed association network. Two numerically simulated datasets, one from a linear system and one from a nonlinear system, illustrate the steps and performance of the proposed approach. Finally, the ability of the proposed method is demonstrated in an application.
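The abstract does not spell out how a series is made "symbolic". In the symbolic transfer entropy literature this usually means replacing each short window of values with its ordinal (rank) pattern, and the sketch below assumes that convention. It is only an illustration of that preprocessing step, not the SSPSTES method; the function name `ordinal_symbols` and the `order` parameter are hypothetical.

```python
def ordinal_symbols(series, order=3):
    """Map each length-`order` window to the permutation that sorts it.

    Assumption: symbols are ordinal patterns, as is common for symbolic
    transfer entropy; the abstract itself does not specify this.
    """
    symbols = []
    for i in range(len(series) - order + 1):
        window = series[i:i + order]
        # Indices of the window listed from smallest to largest value.
        symbols.append(tuple(sorted(range(order), key=lambda k: window[k])))
    return symbols


# Toy series (made up) to show the resulting symbol sequence.
symbols = ordinal_symbols([4, 7, 9, 10, 6, 11, 3], order=3)
print(symbols)
```

Entropy-based measures are then estimated from the frequencies of these symbols rather than from the raw values, which makes the estimate depend only on the relative ordering within each window.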
Affiliation(s)
- Yanzhu Hu
  - Beijing Key Laboratory of Work Safety Intelligent Monitoring, Beijing University of Posts and Telecommunications, Beijing, 100876, China
- Huiyang Zhao
  - Beijing Key Laboratory of Work Safety Intelligent Monitoring, Beijing University of Posts and Telecommunications, Beijing, 100876, China
  - School of Information Engineering, Xuchang University, Xuchang, 461000, China
- Xinbo Ai
  - Beijing Key Laboratory of Work Safety Intelligent Monitoring, Beijing University of Posts and Telecommunications, Beijing, 100876, China
63
McGoff KA, Guo X, Deckard A, Kelliher CM, Leman AR, Francey LJ, Hogenesch JB, Haase SB, Harer JL. The Local Edge Machine: inference of dynamic models of gene regulation. Genome Biol 2016; 17:214. [PMID: 27760556 PMCID: PMC5072315 DOI: 10.1186/s13059-016-1076-z] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/13/2016] [Accepted: 10/03/2016] [Indexed: 12/31/2022] Open
Abstract
We present a novel approach, the Local Edge Machine, for the inference of regulatory interactions directly from time-series gene expression data. We demonstrate its performance, robustness, and scalability on in silico datasets with varying behaviors, sizes, and degrees of complexity. Moreover, we demonstrate its ability to incorporate biological prior information and make informative predictions on a well-characterized in vivo system using data from budding yeast that have been synchronized in the cell cycle. Finally, we use an atlas of transcription data in a mammalian circadian system to illustrate how the method can be used for discovery in the context of large complex networks.
Affiliation(s)
- Kevin A McGoff
  - Department of Mathematics and Statistics, UNC Charlotte, 9201 University City Blvd., Charlotte, 28269, NC, USA
- Xin Guo
  - Department of Applied Mathematics, The Hong Kong Polytechnic University, Hong Kong, China
- Adam R Leman
  - Department of Biology, Duke University, Durham, NC, USA
- Lauren J Francey
  - Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John B Hogenesch
  - Department of Molecular and Cellular Physiology, University of Cincinnati, Cincinnati, OH, USA
- John L Harer
  - Department of Mathematics, Duke University, Durham, NC, USA
64
Banf M, Rhee SY. Computational inference of gene regulatory networks: Approaches, limitations and opportunities. BIOCHIMICA ET BIOPHYSICA ACTA-GENE REGULATORY MECHANISMS 2016; 1860:41-52. [PMID: 27641093 DOI: 10.1016/j.bbagrm.2016.09.003] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 04/13/2016] [Revised: 09/08/2016] [Accepted: 09/08/2016] [Indexed: 10/21/2022]
Abstract
Gene regulatory networks lie at the core of cell function control. In E. coli and S. cerevisiae, the study of gene regulatory networks has led to the discovery of regulatory mechanisms responsible for the control of cell growth, differentiation and responses to environmental stimuli. In plants, computational rendering of gene regulatory networks is gaining momentum, thanks to the recent availability of high-quality genomes and transcriptomes and development of computational network inference approaches. Here, we review current techniques, challenges and trends in gene regulatory network inference and highlight challenges and opportunities for plant science. We provide plant-specific application examples to guide researchers in selecting methodologies that suit their particular research questions. Given the interdisciplinary nature of gene regulatory network inference, we tried to cater to both biologists and computer scientists to help them engage in a dialogue about concepts and caveats in network inference. Specifically, we discuss problems and opportunities in heterogeneous data integration for eukaryotic organisms and common caveats to be considered during network model evaluation. This article is part of a Special Issue entitled: Plant Gene Regulatory Mechanisms and Networks, edited by Dr. Erich Grotewold and Dr. Nathan Springer.
Affiliation(s)
- Michael Banf
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford 93405, United States.
| | - Seung Y Rhee
- Department of Plant Biology, Carnegie Institution for Science, 260 Panama Street, Stanford 93405, United States.
|
65
|
Inferring Weighted Directed Association Networks from Multivariate Time Series with the Small-Shuffle Symbolic Transfer Entropy Spectrum Method. ENTROPY 2016. [DOI: 10.3390/e18090328] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/05/2023]
|
66
|
Anderson WD, Makadia HK, Greenhalgh AD, Schwaber JS, David S, Vadigepalli R. Computational modeling of cytokine signaling in microglia. MOLECULAR BIOSYSTEMS 2016; 11:3332-46. [PMID: 26440115 DOI: 10.1039/c5mb00488h] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 01/15/2023]
Abstract
Neuroinflammation due to glial activation has been linked to many CNS diseases. We developed a computational model of a microglial cytokine interaction network to study the regulatory mechanisms of microglia-mediated neuroinflammation. We established a literature-based cytokine network, including TNFα, TGFβ, and IL-10, and fitted a mathematical model to published data from LPS-treated microglia. The addition of a previously unreported TGFβ autoregulation loop to our model was required to account for experimental data. Global sensitivity analysis revealed that TGFβ- and IL-10-mediated inhibition of TNFα was critical for regulating network behavior. We assessed the sensitivity of the LPS-induced TNFα response profile to the initial TGFβ and IL-10 levels. The analysis showed two relatively shifted TNFα response profiles within separate domains of initial condition space. Further analysis revealed that TNFα exhibited adaptation to sustained LPS stimulation. We simulated the effects of functionally inhibiting TGFβ and IL-10 on TNFα adaptation. Our analysis showed that TGFβ and IL-10 knockouts (TGFβ KO and IL-10 KO) exert divergent effects on adaptation. TGFβ KO attenuated TNFα adaptation whereas IL-10 KO enhanced TNFα adaptation. We experimentally tested the hypothesis that IL-10 KO enhances TNFα adaptation in murine macrophages and found supporting evidence. These opposing effects could be explained by differential kinetics of negative feedback. Inhibition of IL-10 reduced early negative feedback that results in enhanced TNFα-mediated TGFβ expression. We propose that differential kinetics in parallel negative feedback loops constitute a novel mechanism underlying the complex and non-intuitive pro- versus anti-inflammatory effects of individual cytokine perturbations.
Affiliation(s)
- Warren D Anderson
- Daniel Baugh Institute for Functional Genomics and Computational Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA. and Graduate Program in Neuroscience, Jefferson College of Biomedical Sciences, Thomas Jefferson University, Philadelphia, PA, USA
| | - Hirenkumar K Makadia
- Daniel Baugh Institute for Functional Genomics and Computational Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA. and Department of Pathology, Anatomy, and Cell Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
| | - Andrew D Greenhalgh
- Center for Research in Neuroscience, The Research Institute of the McGill University Health Center, Montreal, Quebec, Canada
| | - James S Schwaber
- Daniel Baugh Institute for Functional Genomics and Computational Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA. and Graduate Program in Neuroscience, Jefferson College of Biomedical Sciences, Thomas Jefferson University, Philadelphia, PA, USA
| | - Samuel David
- Center for Research in Neuroscience, The Research Institute of the McGill University Health Center, Montreal, Quebec, Canada
| | - Rajanikanth Vadigepalli
- Daniel Baugh Institute for Functional Genomics and Computational Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA. and Graduate Program in Neuroscience, Jefferson College of Biomedical Sciences, Thomas Jefferson University, Philadelphia, PA, USA
|
67
|
Perkins EJ, Antczak P, Burgoon L, Falciani F, Garcia-Reyero N, Gutsell S, Hodges G, Kienzler A, Knapen D, McBride M, Willett C. Adverse Outcome Pathways for Regulatory Applications: Examination of Four Case Studies With Different Degrees of Completeness and Scientific Confidence. Toxicol Sci 2016; 148:14-25. [PMID: 26500288 DOI: 10.1093/toxsci/kfv181] [Citation(s) in RCA: 71] [Impact Index Per Article: 8.9] [Indexed: 12/29/2022] Open
Abstract
Adverse outcome pathways (AOPs) offer a pathway-based toxicological framework to support hazard assessment and regulatory decision-making. However, little has been discussed about the scientific confidence needed, or how complete a pathway should be, before use in a specific regulatory application. Here we review four case studies to explore the degree of scientific confidence and extent of completeness (in terms of causal events) that is required for an AOP to be useful for a specific purpose in a regulatory application: (i) Membrane disruption (Narcosis) leading to respiratory failure (low confidence), (ii) Hepatocellular proliferation leading to cancer (partial pathway, moderate confidence), (iii) Covalent binding to proteins leading to skin sensitization (high confidence), and (iv) Aromatase inhibition leading to reproductive dysfunction in fish (high confidence). Partially complete AOPs with unknown molecular initiating events, such as 'Hepatocellular proliferation leading to cancer', were found to be valuable. We demonstrate that scientific confidence in these pathways can be increased through the use of unconventional information (e.g., computational identification of potential initiators). AOPs at all levels of confidence can contribute to specific uses. A significant statistical or quantitative relationship between events and/or the adverse outcome relationships is a common characteristic of AOPs, both incomplete and complete, that have specific regulatory uses. For AOPs to be useful in a regulatory context they must be at least as useful as the tools that regulators currently possess, or the techniques currently employed by regulators.
Affiliation(s)
- Edward J Perkins
- *Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg Mississippi;
| | - Philipp Antczak
- Institute of Integrative Biology, University of Liverpool, Liverpool, Merseyside L69 3BX, UK
| | - Lyle Burgoon
- *Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg Mississippi
| | - Francesco Falciani
- Institute of Integrative Biology, University of Liverpool, Liverpool, Merseyside L69 3BX, UK
| | - Natàlia Garcia-Reyero
- Mississippi State University, Institute for Genomics, Biocomputing and Biotechnology, Starkville, Mississippi
| | - Steve Gutsell
- Unilever, Colworth Science Park, Sharnbrook MK44 1LQ, UK
| | - Geoff Hodges
- Unilever, Colworth Science Park, Sharnbrook MK44 1LQ, UK
| | - Aude Kienzler
- JRC Institute for Health and Consumer Protection, Ispra, Italy
| | - Dries Knapen
- University of Antwerp, Zebrafishlab, Universiteitsplein 1, 2610 Wilrijk, Belgium
| | - Mary McBride
- Agilent Technologies, Washington, District of Columbia; and
| | - Catherine Willett
- The Humane Society of the United States, Washington, District of Columbia, USA
|
68
|
Recurrent neural network based hybrid model for reconstructing gene regulatory network. Comput Biol Chem 2016; 64:322-334. [PMID: 27570069 DOI: 10.1016/j.compbiolchem.2016.08.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 01/08/2016] [Revised: 05/01/2016] [Accepted: 08/13/2016] [Indexed: 11/22/2022]
Abstract
One of the exciting problems in systems biology research is to decipher how the genome controls the development of complex biological systems. Gene regulatory networks (GRNs) help in the identification of regulatory interactions between genes and offer useful information about the functional role of individual genes in a cellular system. Discovering GRNs leads to a wide range of applications, including identifying disease-related pathways that provide novel tentative drug targets, predicting disease response, and assisting in diagnosing various diseases including cancer. Reconstruction of GRNs from available biological data is still an open problem. This paper proposes a recurrent neural network (RNN) based model of GRN, hybridized with a generalized extended Kalman filter for weight update in the backpropagation through time training algorithm. The RNN is a complex neural network that offers a better compromise between biological closeness and mathematical flexibility for modeling GRNs, and is also able to capture complex, non-linear and dynamic relationships among variables. Gene expression data are inherently noisy, and the Kalman filter performs well on estimation problems even with noisy data. Hence, we applied a non-linear version of the Kalman filter, known as the generalized extended Kalman filter, for weight update during RNN training. The developed model has been tested on four benchmark networks: the DNA SOS repair network, the IRMA network, and two synthetic networks from the DREAM Challenge. We compared our results with other state-of-the-art techniques, and the comparison shows the superiority of our proposed model. Further, 5% Gaussian noise was added to the dataset, and the results of the proposed model show a negligible effect of noise, demonstrating the noise tolerance of the model.
|
69
|
Li Y, Chen H, Zheng J, Ngom A. The Max-Min High-Order Dynamic Bayesian Network for Learning Gene Regulatory Networks with Time-Delayed Regulations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:792-803. [PMID: 26336144 DOI: 10.1109/tcbb.2015.2474409] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 06/05/2023]
Abstract
Accurately reconstructing gene regulatory networks (GRNs) from gene expression data is a challenging task in systems biology. Although some progress has been made, the performance of GRN reconstruction still has much room for improvement. Because many regulatory events are asynchronous, learning gene interactions with multiple time delays is an effective way to improve the accuracy of GRN reconstruction. Here, we propose a new approach, called Max-Min high-order dynamic Bayesian network (MMHO-DBN), by extending the Max-Min hill-climbing Bayesian network technique originally devised for learning a Bayesian network's structure from static data. Our MMHO-DBN can explicitly model the time lags between regulators and targets in an efficient manner. It first uses constraint-based ideas to limit the space of potential structures, and then applies search-and-score ideas to search for an optimal HO-DBN structure. The performance of MMHO-DBN in GRN reconstruction was evaluated using both synthetic and real gene expression time-series data. Results show that MMHO-DBN is more accurate than current time-delayed GRN learning methods, and has an intermediate computing performance. Furthermore, it is able to learn long time-delayed relationships between genes. We applied sensitivity analysis to our model to study how performance varies across different parameter settings. The result provides hints on the setting of parameters of MMHO-DBN.
|
70
|
Gómez-Vela F, Barranco CD, Díaz-Díaz N. Incorporating biological knowledge for construction of fuzzy networks of gene associations. Appl Soft Comput 2016. [DOI: 10.1016/j.asoc.2016.01.014] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
|
71
|
Yang B, Zhang W, Wang H, Song C, Chen Y. TDSDMI: Inference of time-delayed gene regulatory network using S-system model with delayed mutual information. Comput Biol Med 2016; 72:218-25. [DOI: 10.1016/j.compbiomed.2016.03.024] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 11/22/2015] [Revised: 03/04/2016] [Accepted: 03/29/2016] [Indexed: 01/06/2023]
|
72
|
Use of systems biology to decipher host-pathogen interaction networks and predict biomarkers. Clin Microbiol Infect 2016; 22:600-6. [PMID: 27113568 DOI: 10.1016/j.cmi.2016.04.014] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 02/12/2016] [Revised: 04/13/2016] [Accepted: 04/15/2016] [Indexed: 02/06/2023]
Abstract
In systems biology, researchers aim to understand complex biological systems as a whole, which is often achieved by mathematical modelling and the analyses of high-throughput data. In this review, we give an overview of medical applications of systems biology approaches with special focus on host-pathogen interactions. After introducing general ideas of systems biology, we focus on (1) the detection of putative biomarkers for improved diagnosis and support of therapeutic decisions, (2) network modelling for the identification of regulatory interactions between cellular molecules to reveal putative drug targets and (3) module discovery for the detection of phenotype-specific modules in molecular interaction networks. Biomarker detection applies supervised machine learning methods utilizing high-throughput data (e.g. single nucleotide polymorphism (SNP) detection, RNA-seq, proteomics) and clinical data. We demonstrate structural analysis of molecular networks, especially by identification of disease modules as a novel strategy, and discuss possible applications to host-pathogen interactions. Pioneering work was done to predict molecular host-pathogen interaction networks based on dual RNA-seq data. However, currently this network modelling is restricted to a small number of genes. With increasing number and quality of databases and data repositories, the prediction of large-scale networks will also become feasible, and such networks can be used for multidimensional diagnosis and decision support for prevention and therapy of diseases. Finally, we outline further perspective issues such as support of personalized medicine with high-throughput data and generation of multiscale host-pathogen interaction models.
|
73
|
Riccadonna S, Jurman G, Visintainer R, Filosi M, Furlanello C. DTW-MIC Coexpression Networks from Time-Course Data. PLoS One 2016; 11:e0152648. [PMID: 27031641 PMCID: PMC4816347 DOI: 10.1371/journal.pone.0152648] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/30/2014] [Accepted: 03/17/2016] [Indexed: 01/01/2023] Open
Abstract
When modeling coexpression networks from high-throughput time course data, Pearson Correlation Coefficient (PCC) is one of the most effective and popular similarity functions. However, its reliability is limited since it cannot capture non-linear interactions and time shifts. Here we propose to overcome these two issues by employing a novel similarity function, Dynamic Time Warping Maximal Information Coefficient (DTW-MIC), combining a measure taking care of functional interactions of signals (MIC) and a measure identifying time lag (DTW). By using the Hamming-Ipsen-Mikhailov (HIM) metric to quantify network differences, the effectiveness of the DTW-MIC approach is demonstrated on a set of four synthetic and one transcriptomic datasets, also in comparison to TimeDelay ARACNE and Transfer Entropy.
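The time-shift limitation of PCC mentioned in this abstract can be illustrated with a small sketch. This is not the authors' DTW-MIC implementation: it covers only a textbook dynamic time warping distance (no MIC component), and the signal, the 5-sample delay, and the function names `pearson` and `dtw_distance` are assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def dtw_distance(x, y):
    """Classic O(n*m) dynamic time warping distance with |a - b| local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical data: a sine wave and the same wave delayed by a quarter period.
x = [math.sin(2 * math.pi * i / 20) for i in range(40)]
y = [math.sin(2 * math.pi * (i - 5) / 20) for i in range(40)]

r = pearson(x, y)                                # near zero despite identical shape
lockstep = sum(abs(a - b) for a, b in zip(x, y)) # point-by-point distance, no warping
warped = dtw_distance(x, y)                      # distance after optimal warping

print(f"PCC = {r:.3f}, lockstep distance = {lockstep:.2f}, DTW distance = {warped:.2f}")
```

In this toy case the delayed copy is almost uncorrelated with the original under PCC, while the warped distance is far smaller than the point-by-point distance, which is the kind of lag tolerance the abstract attributes to the DTW component.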
Affiliation(s)
| | - Giuseppe Jurman
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Italy
| | - Roberto Visintainer
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Italy
| | - Michele Filosi
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Italy
| | - Cesare Furlanello
- Research and Innovation Centre, Fondazione Edmund Mach, San Michele all’Adige, Italy
|
74
|
Omranian N, Eloundou-Mbebi JMO, Mueller-Roeber B, Nikoloski Z. Gene regulatory network inference using fused LASSO on multiple data sets. Sci Rep 2016; 6:20533. [PMID: 26864687 PMCID: PMC4750075 DOI: 10.1038/srep20533] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 05/19/2015] [Accepted: 01/06/2016] [Indexed: 01/14/2023] Open
Abstract
Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.
Affiliation(s)
- Nooshin Omranian
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
- Department of Molecular Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, Haus 20, 14476 Potsdam, Germany
| | - Jeanne M. O. Eloundou-Mbebi
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
| | - Bernd Mueller-Roeber
- Department of Molecular Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, Haus 20, 14476 Potsdam, Germany
| | - Zoran Nikoloski
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
|
75
|
Analysis of spatial-temporal gene expression patterns reveals dynamics and regionalization in developing mouse brain. Sci Rep 2016; 6:19274. [PMID: 26786896 PMCID: PMC4726224 DOI: 10.1038/srep19274] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 09/15/2015] [Accepted: 12/10/2015] [Indexed: 01/14/2023] Open
Abstract
Allen Brain Atlas (ABA) provides a valuable resource of spatial/temporal gene expressions in mammalian brains. Despite rich information extracted from this database, current analyses suffer from several limitations. First, most studies are either gene-centric or region-centric, thus are inadequate to capture the superposition of multiple spatial-temporal patterns. Second, standard tools of expression analysis such as matrix factorization can capture those patterns but do not explicitly incorporate spatial dependency. To overcome those limitations, we proposed a computational method to detect recurrent patterns in the spatial-temporal gene expression data of developing mouse brains. We demonstrated that regional distinction in brain development could be revealed by localized gene expression patterns. The patterns expressed in the forebrain, medullary and pontomedullary, and basal ganglia are enriched with genes involved in forebrain development, locomotory behavior, and dopamine metabolism respectively. In addition, the timing of global gene expression patterns reflects the general trends of molecular events in mouse brain development. Furthermore, we validated functional implications of the inferred patterns by showing genes sharing similar spatial-temporal expression patterns with Lhx2 exhibited differential expression in the embryonic forebrains of Lhx2 mutant mice. These analysis outcomes confirm the utility of recurrent expression patterns in studying brain development.
|
76
|
Zhang X, Kuivenhoven JA, Groen AK. Forward Individualized Medicine from Personal Genomes to Interactomes. Front Physiol 2015; 6:364. [PMID: 26696898 PMCID: PMC4673427 DOI: 10.3389/fphys.2015.00364] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 09/12/2015] [Accepted: 11/16/2015] [Indexed: 12/23/2022] Open
Abstract
When considering the variation in the genome, transcriptome, proteome and metabolome, and their interaction with the environment, every individual can be rightfully considered as a unique biological entity. Individualized medicine promises to take this uniqueness into account to optimize disease treatment and thereby improve health benefits for every patient. The success of individualized medicine relies on a precise understanding of the genotype-phenotype relationship. Although omics technologies advance rapidly, there are several challenges that need to be overcome: Next generation sequencing can efficiently decipher genomic sequences, epigenetic changes, and transcriptomic variation in patients, but it does not automatically indicate how or whether the identified variation will cause pathological changes. This is likely due to the inability to account for (1) the consequences of gene-gene and gene-environment interactions, and (2) (post)transcriptional as well as (post)translational processes that eventually determine the concentration of key metabolites. The technologies to accurately measure changes in these latter layers are still under development, and such measurements in humans are also mainly restricted to blood and circulating cells. Despite these challenges, it is already possible to track dynamic changes in the human interactome in healthy and diseased states by using the integration of multi-omics data. In this review, we evaluate the potential value of current major bioinformatics and systems biology-based approaches, including genome wide association studies, epigenetics, gene regulatory and protein-protein interaction networks, and genome-scale metabolic modeling. Moreover, we address the question whether integrative analysis of personal multi-omics data will help understanding of personal genotype-phenotype relationships.
Affiliation(s)
- Xiang Zhang
- Department of Pediatrics, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen Groningen, Netherlands
| | - Jan A Kuivenhoven
- Section Molecular Genetics, Department of Pediatrics, University of Groningen, University Medical Center Groningen Groningen, Netherlands
| | - Albert K Groen
- Department of Pediatrics, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen Groningen, Netherlands ; Department of Laboratory Medicine, Center for Liver Digestive and Metabolic Diseases, University of Groningen, University Medical Center Groningen Groningen, Netherlands
|
77
|
Wang CCN, Sheu PCY, Tsai JJP. Towards Semantic Biomedical Problem Solving. INTERNATIONAL JOURNAL OF SEMANTIC COMPUTING 2015. [DOI: 10.1142/s1793351x15500075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Biological and medical intelligence (BMI) has been studied in silos, lacking a systematic methodology. In this paper, we describe how Semantic Computing can enhance biological and medical intelligence. Specifically, we show how Structured Natural Language (SNL) can express many problems in BMI with a finite number of sentence patterns, and show how biological tools, OLAP, data mining tools and statistical analysis tools may be linked to solve problems related to biomedical data.
Affiliation(s)
- Charles C. N. Wang
- Department of Biomedical Informatics, Asia University, 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan
| | - Phillip C.-Y. Sheu
- Department of Electrical Engineering and Computer Science, University of California – Irvine, 5200 Engineering Hall, Irvine, CA 92697, USA
| | - Jeffrey J. P. Tsai
- Department of Biomedical Informatics, Asia University, 500, Lioufeng Rd., Wufeng, Taichung 41354, Taiwan
|
78
|
Lo LY, Wong ML, Lee KH, Leung KS. High-order dynamic Bayesian Network learning with hidden common causes for causal gene regulatory network. BMC Bioinformatics 2015; 16:395. [PMID: 26608050 PMCID: PMC4659244 DOI: 10.1186/s12859-015-0823-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 05/23/2015] [Accepted: 11/11/2015] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND Inferring gene regulatory networks (GRNs) has been an important topic in Bioinformatics. Many computational methods infer the GRN from high-throughput expression data. Due to the presence of time delays in the regulatory relationships, the High-Order Dynamic Bayesian Network (HO-DBN) is a good model of GRNs. However, previous GRN inference methods assume causal sufficiency, i.e. no unobserved common cause. This assumption is convenient but unrealistic, because it is possible that relevant factors have not even been conceived of and are therefore unmeasured. Therefore an inference method that also handles hidden common cause(s) is highly desirable. Also, previous methods for discovering hidden common causes either do not handle multi-step time delays or require that the parents of hidden common causes not be observed genes. RESULTS We have developed a discrete HO-DBN learning algorithm that can also infer hidden common cause(s) from discrete time series expression data, with some assumptions on the conditional distribution, but is less restrictive than previous methods. We assume that each hidden variable has only observed variables as children and parents, with at least two children and possibly no parents. We also make the simplifying assumption that children of hidden variable(s) are not linked to each other. Moreover, our proposed algorithm can also utilize multiple short time series (not necessarily of the same length), as long time series are difficult to obtain. CONCLUSIONS We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. Experiment results show that our proposed algorithm can recover the causal GRNs adequately given the incomplete data. Using the limited real expression data and small subnetworks of the YEASTRACT network, we have also demonstrated the potential of our algorithm on real data, though more time series expression data is needed.
Affiliation(s)
- Leung-Yau Lo
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.
| | - Man-Leung Wong
- Department of Computing and Decision Sciences, Lingnan University, Tuen Mun, Hong Kong.
| | - Kin-Hong Lee
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.
| | - Kwong-Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, Hong Kong.
|
79
|
Yoon H, Lim J, Lim JS. Reconstructing time series GRN using a neuro-fuzzy system. JOURNAL OF INTELLIGENT & FUZZY SYSTEMS 2015. [DOI: 10.3233/ifs-151979] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- Heejin Yoon
- IT College, Jangan University, Whasung, South Korea
| | - Jongwoo Lim
- IT College, Gachon University, Seongnam, South Korea
| | - Joon S. Lim
- IT College, Gachon University, Seongnam, South Korea
|
80
|
Fronczuk M, Raftery AE, Yeung KY. CyNetworkBMA: a Cytoscape app for inferring gene regulatory networks. SOURCE CODE FOR BIOLOGY AND MEDICINE 2015; 10:11. [PMID: 26566394 PMCID: PMC4642660 DOI: 10.1186/s13029-015-0043-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/22/2014] [Accepted: 10/31/2015] [Indexed: 12/31/2022]
Abstract
Background Inference of gene networks from expression data is an important problem in computational biology. Many algorithms have been proposed for solving the problem efficiently. However, many of the available implementations are programming libraries that require users to write code, which limits their accessibility. Results We have developed a tool called CyNetworkBMA for inferring gene networks from expression data that integrates with Cytoscape. Our application offers a graphical user interface for networkBMA, an efficient implementation of Bayesian Model Averaging methods for network construction. The client-server architecture of CyNetworkBMA makes it possible to distribute or centralize computation depending on user needs. Conclusions CyNetworkBMA is an easy-to-use tool that makes network inference accessible to non-programmers through seamless integration with Cytoscape. CyNetworkBMA is available on the Cytoscape App Store at http://apps.cytoscape.org/apps/cynetworkbma. Electronic supplementary material The online version of this article (doi:10.1186/s13029-015-0043-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Maciej Fronczuk
- Institute of Technology, University of Washington, Tacoma, 98402 WA USA
| | - Adrian E Raftery
- Department of Statistics, University of Washington, Seattle, 98195 WA USA
| | - Ka Yee Yeung
- Institute of Technology, University of Washington, Tacoma, 98402 WA USA
|
81
|
Takenaka Y, Seno S, Matsuda H. Detecting shifts in gene regulatory networks during time-course experiments at single-time-point temporal resolution. J Bioinform Comput Biol 2015; 13:1543002. [PMID: 26508425 DOI: 10.1142/s0219720015430027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Comprehensively understanding the dynamics of biological systems is one of the greatest challenges in biology. Vastly improved biological technologies have provided vast amounts of information that must be understood by bioinformatics and systems biology researchers. Gene regulations have been frequently modeled by ordinary differential equations or graphical models based on time-course gene expression profiles. The state-of-the-art computational approaches for analyzing gene regulations assume that their models are the same throughout time-course experiments. However, these approaches cannot easily analyze transient changes at a time point, such as diauxic shift. We propose a score that analyzes the gene regulations at each time point. The score is based on the information gains of information criterion values. The method detects the shifts in gene regulatory networks (GRNs) during time-course experiments with single-time-point resolution. The effectiveness of the method is evaluated on the diauxic shift from glucose to lactose in Escherichia coli. Gene regulation shifts were detected at two time points: the first corresponding to the time at which the growth of E. coli ceased and the second corresponding to the end of the experiment, when the nutrient sources (glucose and lactose) had become exhausted. According to these results, the proposed score and method can appropriately detect the time of gene regulation shifts. The method based on the proposed score provides a new tool for analyzing dynamic biological systems. Because the score value indicates the strength of gene regulation at each time point in a gene expression profile, it can potentially infer hidden GRNs from time-course experiments.
Affiliation(s)
- Yoichi Takenaka: Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
- Shigeto Seno: Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
- Hideo Matsuda: Graduate School of Information Science and Technology, Osaka University, Yamadaoka 1-5, Suita, Osaka, Japan
|
82
|
Lo LY, Wong ML, Lee KH, Leung KS. Time Delayed Causal Gene Regulatory Network Inference with Hidden Common Causes. PLoS One 2015; 10:e0138596. [PMID: 26394325 PMCID: PMC4578777 DOI: 10.1371/journal.pone.0138596] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/03/2015] [Accepted: 09/01/2015] [Indexed: 01/07/2023]
Abstract
Inferring the gene regulatory network (GRN) is crucial to understanding the working of the cell. Many computational methods attempt to infer the GRN from time series expression data, instead of through expensive and time-consuming experiments. However, existing methods make the convenient but unrealistic assumption of causal sufficiency, i.e. all the relevant factors in the causal network have been observed and there are no unobserved common causes. In principle, in the real world, it is impossible to be certain that all relevant factors or common causes have been observed, because some factors may not have been conceived of, and therefore are impossible to measure. In view of this, we have developed a novel algorithm named HCC-CLINDE to infer a GRN from time series data allowing the presence of hidden common cause(s). We assume there is a sparse causal graph (possibly with cycles) of interest, where the variables are continuous and each causal link has a delay (possibly more than one time step). A small but unknown number of variables are not observed. Each unobserved variable has only observed variables as children and parents, with at least two children, and the children are not linked to each other. Since it is difficult to obtain very long time series, our algorithm is also capable of utilizing multiple short time series, which is more realistic. To our knowledge, our algorithm is far less restrictive than previous works. We have performed extensive experiments using synthetic data on GRNs of size up to 100, with up to 10 hidden nodes. The results show that our algorithm can adequately recover the true causal GRN and is robust to slight deviation from Gaussian distribution in the error terms. We have also demonstrated the potential of our algorithm on small YEASTRACT subnetworks using limited real data.
Affiliation(s)
- Leung-Yau Lo: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, Hong Kong
- Man-Leung Wong: Department of Computing and Decision Sciences, Lingnan University, Tuen Mun, Hong Kong
- Kin-Hong Lee: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, Hong Kong
- Kwong-Sak Leung: Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, Hong Kong
|
83
|
Folch-Fortuny A, Villaverde AF, Ferrer A, Banga JR. Enabling network inference methods to handle missing data and outliers. BMC Bioinformatics 2015; 16:283. [PMID: 26335628 PMCID: PMC4559359 DOI: 10.1186/s12859-015-0717-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 02/03/2015] [Accepted: 08/24/2015] [Indexed: 12/20/2022]
Abstract
Background The inference of complex networks from data is a challenging problem in biological sciences, as well as in a wide range of disciplines such as chemistry, technology, economics, or sociology. The quantity and quality of the data greatly affect the results. While many methodologies have been developed for this task, they seldom take into account issues such as missing data or outlier detection and correction, which need to be properly addressed before network inference. Results Here we present an approach to (i) handle missing data and (ii) detect and correct outliers based on multivariate projection to latent structures. The method, called trimmed scores regression (TSR), enables network inference methods to analyse incomplete datasets by imputing the missing values coherently with the latent data structure. Furthermore, it substitutes the faulty values in a dataset by proper estimations. We provide an implementation of this approach, and show how it can be integrated with any network inference method as a preliminary data curation step. This functionality is demonstrated with a state of the art network inference method based on mutual information distance and entropy reduction, MIDER. Conclusion The methodology presented here enables network inference methods to analyse a large number of incomplete and faulty datasets that could not be reliably analysed so far. Our comparative studies show the superiority of TSR over other missing data approaches used by practitioners. Furthermore, the method allows for outlier detection and correction. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0717-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Abel Folch-Fortuny: Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camino de Vera s/n, Valencia, 46022, Spain
- Alejandro F Villaverde: BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain; Centre of Biological Engineering, Universidade do Minho, Campus de Gualtar, Braga, 4710-057, Portugal; Department of Systems and Control Engineering, Universidade de Vigo, Rua Maxwell, Vigo, 36310, Spain
- Alberto Ferrer: Departamento de Estadística e Investigación Operativa Aplicadas y Calidad, Universitat Politècnica de València, Camino de Vera s/n, Valencia, 46022, Spain
- Julio R Banga: BioProcess Engineering Group, IIM-CSIC, Eduardo Cabello 6, Vigo, 36208, Spain
|
84
|
Lo LY, Leung KS, Lee KH. Inferring Time-Delayed Causal Gene Network Using Time-Series Expression Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:1169-1182. [PMID: 26451828 DOI: 10.1109/tcbb.2015.2394442] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 06/05/2023]
Abstract
Inferring a gene regulatory network (GRN) from microarray expression data is an important problem in bioinformatics, because knowing the GRN is an essential first step in understanding the inner workings of the cell and the related diseases. Time delays exist in the regulatory effects from one gene to another due to the time needed for transcription, translation, and the accumulation of a sufficient number of needed proteins. Also, it is known that the delays are important for oscillatory phenomena. Therefore, it is crucial to develop a causal gene network model, preferably as a function of time. In this paper, we propose an algorithm, CLINDE, to infer causal directed links in a GRN with time delays and regulatory effects in the links from time-series microarray gene expression data. It is one of the most comprehensive in terms of features compared to the state-of-the-art discrete gene network models. We have tested CLINDE on synthetic data, the in vivo IRMA (On and Off) datasets and the [1] yeast expression data validated using KEGG pathways. Results show that CLINDE can effectively recover the links, the time delays and the regulatory effects in the synthetic data, and outperforms other algorithms in the IRMA in vivo datasets.
|
85
|
Kim JR, Choo SM, Choi HS, Cho KH. Identification of Gene Networks with Time Delayed Regulation Based on Temporal Expression Profiles. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:1161-1168. [PMID: 26451827 DOI: 10.1109/tcbb.2015.2394312] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
There are fundamental limitations in inferring the functional interaction structure of a gene (regulatory) network only from sequence information such as binding motifs. To overcome such limitations, various approaches have been developed to infer the functional interaction structure from expression profiles. However, most of them have not been so successful due to the experimental limitations and computational complexity. Hence, there is a pressing need to develop a simple but effective methodology that can systematically identify the functional interaction structure of a gene network from time-series expression profiles. In particular, we need to take into account the different time delay effects in gene regulation since they are ubiquitously present. We have considered a new experiment that measures the overall expression changes after a perturbation on a specific gene. Based on this experiment, we have proposed a new inference method that can take account of the time delay induced while the perturbation affects its primary target genes. Specifically, we have developed an algebraic equation from which we can identify the subnetwork structure around the perturbed gene. We have also analyzed the influence of time delay on the inferred network structure. The proposed method is particularly useful for identification of a gene network with small variations in the time delay of gene regulation.
|
86
|
Gong W, Naoko KN, Garry DJ. Inferring Gene Regulatory Networks by Context Dependent and Independent Effects. J Med Device 2015. [DOI: 10.1115/1.4030577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Affiliation(s)
- Wuming Gong: Lillehei Heart Institute, University of Minnesota, Minneapolis, MN 55455
- Daniel J. Garry: Lillehei Heart Institute, University of Minnesota, Minneapolis, MN 55455
|
87
|
Yao S, Yoo S, Yu D. Prior knowledge driven Granger causality analysis on gene regulatory network discovery. BMC Bioinformatics 2015; 16:273. [PMID: 26316173 PMCID: PMC4551367 DOI: 10.1186/s12859-015-0710-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 03/20/2015] [Accepted: 08/17/2015] [Indexed: 12/20/2022]
Abstract
Background Our study focuses on discovering gene regulatory networks from time series gene expression data using the Granger causality (GC) model. However, the number of available time points (T) usually is much smaller than the number of target genes (n) in biological datasets. The widely applied pairwise GC model (PGC) and other regularization strategies can lead to a significant number of false identifications when n >> T. Results In this study, we proposed a new method, viz., CGC-2SPR (CGC using two-step prior Ridge regularization) to resolve the problem by incorporating prior biological knowledge about a target gene data set. In our simulation experiments, the proposed methodology CGC-2SPR showed significant performance improvement in terms of accuracy over other widely used GC modeling (PGC, Ridge and Lasso) and MI-based (MRNET and ARACNE) methods. In addition, we applied CGC-2SPR to a real biological dataset, i.e., the yeast metabolic cycle, and discovered more true positive edges with CGC-2SPR than with the other existing methods. Conclusions In our research, we noticed a “1+1>2” effect when we combined prior knowledge and gene expression data to discover regulatory networks. Based on causality networks, we made a functional prediction that the Abm1 gene (whose functions were previously unknown) might be related to the yeast’s responses to different levels of glucose. Our research improves causality modeling by combining heterogeneous knowledge, which is well aligned with the future direction in systems biology. Furthermore, we proposed a method of Monte Carlo significance estimation (MCSE) to calculate the edge significances, which provide statistical meaning to the discovered causality networks. All of our data and source code will be available at https://bitbucket.org/dtyu/granger-causality/wiki/Home.
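A Granger-style test with a ridge penalty can be sketched as follows. This is not CGC-2SPR and uses no prior knowledge: it is a minimal lag-1 pairwise comparison that asks how much the residual error of predicting a target gene drops when the lagged source gene is added. The function names, the penalty value, and the toy series are all hypothetical, and the regressions omit an intercept for brevity.

```python
def solve(a, b):
    """Solve a small linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ridge_rss(rows, y, lam):
    """Fit ridge regression (no intercept); return the residual sum of squares."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((t - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, t in zip(rows, y))


def granger_gain(source, target, lam=0.1):
    """Relative drop in RSS when lagged `source` joins lagged `target`."""
    y = target[1:]
    steps = range(len(target) - 1)
    rss_restricted = ridge_rss([[target[t]] for t in steps], y, lam)
    rss_full = ridge_rss([[target[t], source[t]] for t in steps], y, lam)
    return (rss_restricted - rss_full) / rss_restricted


# Toy data: the target is driven by the source with a one-step delay.
source = [1.0, 0.0, 2.0, 0.0, 1.0, 3.0, 0.0, 1.0, 2.0, 0.0]
target = [0.0]
for s in source[:-1]:
    target.append(0.5 * target[-1] + s)

forward = granger_gain(source, target)  # source -> target
reverse = granger_gain(target, source)  # target -> source
```

On this toy series the forward gain is close to 1, while the reverse gain is smaller but not zero, which hints at why short time series (T much smaller than n) can produce false identifications.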
Affiliation(s)
- Shun Yao: Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, 11790, NY, USA; Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
- Shinjae Yoo: Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
- Dantong Yu: Computational Science Center, Brookhaven National Laboratory, Upton, 11793, NY, USA
|
88
|
Parikshak NN, Gandal MJ, Geschwind DH. Systems biology and gene networks in neurodevelopmental and neurodegenerative disorders. Nat Rev Genet 2015; 16:441-58. [PMID: 26149713 PMCID: PMC4699316 DOI: 10.1038/nrg3934] [Citation(s) in RCA: 287] [Impact Index Per Article: 31.9] [Indexed: 02/07/2023]
Abstract
Genetic and genomic approaches have implicated hundreds of genetic loci in neurodevelopmental disorders and neurodegeneration, but mechanistic understanding continues to lag behind the pace of gene discovery. Understanding the role of specific genetic variants in the brain involves dissecting a functional hierarchy that encompasses molecular pathways, diverse cell types, neural circuits and, ultimately, cognition and behaviour. With a focus on transcriptomics, this Review discusses how high-throughput molecular, integrative and network approaches inform disease biology by placing human genetics in a molecular systems and neurobiological context. We provide a framework for interpreting network biology studies and leveraging big genomics data sets in neurobiology.
Affiliation(s)
- Neelroop N Parikshak: [1] Program in Neurobehavioral Genetics, Semel Institute, and Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA. [2] Interdepartmental Program in Neuroscience, University of California, Los Angeles, California 90095, USA
- Michael J Gandal: [1] Program in Neurobehavioral Genetics, Semel Institute, and Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA. [2] Center for Autism Treatment and Research, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
- Daniel H Geschwind: [1] Program in Neurobehavioral Genetics, Semel Institute, and Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA. [2] Interdepartmental Program in Neuroscience, University of California, Los Angeles, California 90095, USA. [3] Center for Autism Treatment and Research, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA. [4] Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, California 90095, USA
|
89
|
Inferring Broad Regulatory Biology from Time Course Data: Have We Reached an Upper Bound under Constraints Typical of In Vivo Studies? PLoS One 2015; 10:e0127364. [PMID: 25984725 PMCID: PMC4435750 DOI: 10.1371/journal.pone.0127364] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/08/2014] [Accepted: 04/13/2015] [Indexed: 12/21/2022]
Abstract
There is a growing appreciation for the network biology that regulates the coordinated expression of molecular and cellular markers; however, questions persist regarding the identifiability of these networks. Here we explore some of the issues relevant to recovering directed regulatory networks from time course data collected under experimental constraints typical of in vivo studies. NetSim simulations of sparsely connected biological networks were used to evaluate two simple feature selection techniques used in the construction of linear Ordinary Differential Equation (ODE) models, namely truncation of terms versus latent vector projection. Performance was compared with ODE-based Time Series Network Identification (TSNI) integral, and the information-theoretic Time-Delay ARACNE (TD-ARACNE). Projection-based techniques and TSNI integral outperformed truncation-based selection and TD-ARACNE on aggregate networks with edge densities of 10-30%, i.e. transcription factor, protein-protein cliques and immune signaling networks. All were more robust to noise than truncation-based feature selection. Performance was comparable on the in silico 10-node DREAM 3 network, a 5-node Yeast synthetic network designed for In vivo Reverse-engineering and Modeling Assessment (IRMA) and a 9-node human HeLa cell cycle network of similar size and edge density. Performance was more sensitive to the number of time courses than to sample frequency and extrapolated better to larger networks by grouping experiments. In all cases performance declined rapidly in larger networks with lower edge density. Limited recovery and high false positive rates obtained overall bring into question our ability to generate informative time course data rather than the design of any particular reverse engineering algorithm.
|
90
|
Lavenus J, Goh T, Guyomarc'h S, Hill K, Lucas M, Voß U, Kenobi K, Wilson MH, Farcot E, Hagen G, Guilfoyle TJ, Fukaki H, Laplaze L, Bennett MJ. Inference of the Arabidopsis lateral root gene regulatory network suggests a bifurcation mechanism that defines primordia flanking and central zones. THE PLANT CELL 2015; 27:1368-88. [PMID: 25944102 PMCID: PMC4456640 DOI: 10.1105/tpc.114.132993] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 10/14/2014] [Revised: 03/02/2015] [Accepted: 04/07/2015] [Indexed: 05/18/2023]
Abstract
A large number of genes involved in lateral root (LR) organogenesis have been identified over the last decade using forward and reverse genetic approaches in Arabidopsis thaliana. Nevertheless, how these genes interact to form a LR regulatory network largely remains to be elucidated. In this study, we developed a time-delay correlation algorithm (TDCor) to infer the gene regulatory network (GRN) controlling LR primordium initiation and patterning in Arabidopsis from a time-series transcriptomic data set. The predicted network topology links the very early-activated genes involved in LR initiation to later expressed cell identity markers through a multistep genetic cascade exhibiting both positive and negative feedback loops. The predictions were tested for the key transcriptional regulator AUXIN RESPONSE FACTOR7 node, and over 70% of its targets were validated experimentally. Intriguingly, the predicted GRN revealed a mutual inhibition between the ARF7 and ARF5 modules that would control an early bifurcation between two cell fates. Analyses of the expression pattern of ARF7 and ARF5 targets suggest that this patterning mechanism controls flanking and central zone specification in Arabidopsis LR primordia.
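The general idea behind time-delay correlation can be illustrated with a short sketch. It is not the TDCor algorithm, whose details the abstract does not give: it simply scans candidate delays and keeps the one at which the Pearson correlation between a regulator profile and the shifted target profile is strongest. The function names and toy expression profiles are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def best_delay(regulator, target, max_delay):
    """Return (delay, correlation) with the largest absolute correlation.

    For each delay d, regulator[t] is paired with target[t + d].
    """
    best = (0, 0.0)
    for d in range(max_delay + 1):
        x = regulator[: len(regulator) - d]
        y = target[d:]
        if len(x) < 3:
            break  # too few overlapping time points to be meaningful
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (d, r)
    return best


# Toy profiles: the target repeats the regulator two time points later.
regulator = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
target = [0.0, 0.0, 0.0, 1.0, 3.0, 2.0, 0.5, 0.0]
delay, corr = best_delay(regulator, target, max_delay=4)
```

The sign of the returned correlation would distinguish putative activation (positive) from repression (negative); a real method would also need to guard against indirect links and chance correlations in short series.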
Affiliation(s)
- Julien Lavenus: Institut de Recherche pour le Développement, UMR DIADE, 34394 Montpellier cedex 5, France; Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
- Tatsuaki Goh: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom; Department of Biology, Graduate School of Science, Kobe University, Kobe 657-8501, Japan
- Soazig Guyomarc'h: Université de Montpellier, UMR DIADE, 34394 Montpellier cedex 5, France
- Kristine Hill: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
- Mikael Lucas: Institut de Recherche pour le Développement, UMR DIADE, 34394 Montpellier cedex 5, France
- Ute Voß: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
- Kim Kenobi: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
- Michael H Wilson: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
- Etienne Farcot: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom; Inria, Virtual Plants Team, 34095 Montpellier cedex 5, France
- Hidehiro Fukaki: Department of Biology, Graduate School of Science, Kobe University, Kobe 657-8501, Japan
- Laurent Laplaze: Institut de Recherche pour le Développement, UMR DIADE, 34394 Montpellier cedex 5, France
- Malcolm J Bennett: Centre for Plant Integrative Biology, School of Biosciences, University of Nottingham, Loughborough, Leicestershire LE12 5RD, United Kingdom
|
91
|
Remo A, Simeone I, Pancione M, Parcesepe P, Finetti P, Cerulo L, Bensmail H, Birnbaum D, Van Laere SJ, Colantuoni V, Bonetti F, Bertucci F, Manfrin E, Ceccarelli M. Systems biology analysis reveals NFAT5 as a novel biomarker and master regulator of inflammatory breast cancer. J Transl Med 2015; 13:138. [PMID: 25928084 PMCID: PMC4438533 DOI: 10.1186/s12967-015-0492-2] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.7] [Received: 10/13/2014] [Accepted: 04/14/2015] [Indexed: 01/30/2023]
Abstract
Background Inflammatory breast cancer (IBC) is the most rare and aggressive variant of breast cancer (BC); however, only a limited number of specific gene signatures with low generalization abilities are available and few reliable biomarkers are helpful to improve IBC classification into a molecularly distinct phenotype. We applied a network-based strategy to gain insight into master regulators (MRs) linked to IBC pathogenesis. Methods In-silico modeling and Algorithm for the Reconstruction of Accurate Cellular Networks (ARACNe) on IBC/non-IBC (nIBC) gene expression data (n = 197) was employed to identify novel master regulators connected to the IBC phenotype. Pathway enrichment analysis was used to characterize predicted targets of candidate genes. The expression pattern of the most significant MRs was then evaluated by immunohistochemistry (IHC) in two independent cohorts of IBCs (n = 39) and nIBCs (n = 82) and normal breast tissues (n = 15) spotted on tissue microarrays. The staining pattern of non-neoplastic mammary epithelial cells was used as a normal control. Results Using in-silico modeling of network-based strategy, we identified three top enriched MRs (NFAT5, CTNNB1 or β-catenin, and MGA) strongly linked to the IBC phenotype. By IHC assays, we found that IBC patients displayed a higher number of NFAT5-positive cases than nIBC (69.2% vs. 19.5%; p-value = 2.79 × 10⁻⁷). Accordingly, the majority of NFAT5-positive IBC samples revealed an aberrant nuclear expression in comparison with nIBC samples (70% vs. 12.5%; p-value = 0.000797). NFAT5 nuclear accumulation occurs regardless of WNT/β-catenin activated signaling in a substantial portion of IBCs, suggesting that NFAT5 pathway activation may have a relevant role in IBC pathogenesis. Accordingly, cytoplasmic NFAT5 and membranous β-catenin expression were preferentially linked to nIBC, accounting for the better prognosis of this phenotype. Conclusions We provide evidence that NFAT-signaling pathway activation could help to identify aggressive forms of BC and potentially be a guide to assignment of phenotype-specific therapeutic agents. The NFAT5 transcription factor might be developed into routine clinical practice as a putative biomarker of IBC phenotype. Electronic supplementary material The online version of this article (doi:10.1186/s12967-015-0492-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Andrea Remo: Department of Pathology, Mater Salutis Hospital, Legnago, Italy
- Ines Simeone: Department of Science and Technology, University of Sannio, Benevento, Italy; Qatar Computing Research Institute (QCRI), Qatar Foundation, Doha, Qatar
- Massimo Pancione: Department of Science and Technology, University of Sannio, Benevento, Italy
- Pietro Parcesepe: Department of Pathology and Diagnosis, University of Verona, Verona, Italy
- Pascal Finetti: Department of Molecular Oncology, Institut Paoli-Calmettes, U1068 Inserm, Marseille, France
- Luigi Cerulo: Department of Science and Technology, University of Sannio, Benevento, Italy; Bioinformatics Laboratory, BIOGEM, Ariano Irpino, Avellino, Italy
- Halima Bensmail: Qatar Computing Research Institute (QCRI), Qatar Foundation, Doha, Qatar
- Daniel Birnbaum: Department of Molecular Oncology, Institut Paoli-Calmettes, U1068 Inserm, Marseille, France
- Vittorio Colantuoni: Department of Science and Technology, University of Sannio, Benevento, Italy
- Franco Bonetti: Department of Pathology and Diagnosis, University of Verona, Verona, Italy
- François Bertucci: Department of Molecular Oncology, Institut Paoli-Calmettes, U1068 Inserm, Marseille, France
- Erminia Manfrin: Department of Pathology and Diagnosis, University of Verona, Verona, Italy
- Michele Ceccarelli: Department of Science and Technology, University of Sannio, Benevento, Italy; Qatar Computing Research Institute (QCRI), Qatar Foundation, Doha, Qatar
|
92
|
Gong W, Koyano-Nakagawa N, Li T, Garry DJ. Inferring dynamic gene regulatory networks in cardiac differentiation through the integration of multi-dimensional data. BMC Bioinformatics 2015; 16:74. [PMID: 25887857 PMCID: PMC4359553 DOI: 10.1186/s12859-015-0460-0] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 07/23/2014] [Accepted: 01/12/2015] [Indexed: 02/07/2023]
Abstract
Background Decoding the temporal control of gene expression patterns is key to the understanding of the complex mechanisms that govern developmental decisions during heart development. High-throughput methods have been employed to systematically study the dynamic and coordinated nature of cardiac differentiation at the global level with multiple dimensions. Therefore, there is a pressing need to develop a systems approach to integrate these data from individual studies and infer the dynamic regulatory networks in an unbiased fashion. Results We developed a two-step strategy to integrate data from (1) temporal RNA-seq, (2) temporal histone modification ChIP-seq, (3) transcription factor (TF) ChIP-seq and (4) gene perturbation experiments to reconstruct the dynamic network during heart development. First, we trained a logistic regression model to predict the probability (LR score) of any base being bound by 543 TFs with known positional weight matrices. Second, four dimensions of data were combined using a time-varying dynamic Bayesian network model to infer the dynamic networks at four developmental stages in the mouse [mouse embryonic stem cells (ESCs), mesoderm (MES), cardiac progenitors (CP) and cardiomyocytes (CM)]. Our method not only infers the time-varying networks between different stages of heart development, but it also identifies the TF binding sites associated with promoter or enhancers of downstream genes. The LR scores of experimentally verified ESCs and heart enhancers were significantly higher than random regions (p < 10⁻¹⁰⁰), suggesting that a high LR score is a reliable indicator for functional TF binding sites. Our network inference model identified a region with an elevated LR score approximately −9400 bp upstream of the transcriptional start site of Nkx2-5, which overlapped with a previously reported enhancer region (−9435 to −8922 bp). TFs such as Tead1, Gata4, Msx2, and Tgif1 were predicted to bind to this region and participate in the regulation of Nkx2-5 gene expression. Our model also predicted the key regulatory networks for the ESC-MES, MES-CP and CP-CM transitions. Conclusion We report a novel method to systematically integrate multi-dimensional -omics data and reconstruct the gene regulatory networks. This method will allow one to rapidly determine the cis-modules that regulate key genes during cardiac differentiation. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0460-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Wuming Gong: Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
- Naoko Koyano-Nakagawa: Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
- Tongbin Li: AccuraScience LLC, 5721 Merle Hay Road, Suite #16B, Johnston, IA, 50131, USA
- Daniel J Garry: Lillehei Heart Institute, University of Minnesota, 2231 6th St S.E, 4-165 CCRB, Minneapolis, MN, 55114, USA
|
93
|
Linde J, Schulze S, Henkel SG, Guthke R. Data- and knowledge-based modeling of gene regulatory networks: an update. EXCLI JOURNAL 2015; 14:346-78. [PMID: 27047314 PMCID: PMC4817425 DOI: 10.17179/excli2015-168] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/29/2015] [Accepted: 02/10/2015] [Indexed: 02/01/2023]
Abstract
Gene regulatory network inference is a systems biology approach which predicts interactions between genes with the help of high-throughput data. In this review, we present current and updated network inference methods, focusing on novel techniques for data acquisition, network inference assessment, network inference for interacting species and the integration of prior knowledge. Following the advent of next-generation sequencing of cDNAs derived from RNA samples (RNA-Seq), we discuss in detail its application to network inference. Furthermore, we present progress in large-scale or even genome-wide network inference as well as in small-scale condensed network inference, and review advances in the evaluation of network inference methods by crowdsourcing. Finally, we reflect on the current availability of data and prior knowledge sources and give an outlook on the inference of gene regulatory networks that reflect interacting species, in particular pathogen-host interactions.
Affiliation(s)
- Jörg Linde
- Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
- Sylvie Schulze
- Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
- Reinhard Guthke
- Research Group Systems Biology / Bioinformatics, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Beutenbergstr. 11a, 07745 Jena, Germany
|
94
|
Aghdam R, Ganjali M, Zhang X, Eslahchi C. CN: a consensus algorithm for inferring gene regulatory networks using the SORDER algorithm and conditional mutual information test. MOLECULAR BIOSYSTEMS 2015; 11:942-9. [PMID: 25607659 DOI: 10.1039/c4mb00413b] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/03/2023]
Abstract
Inferring Gene Regulatory Networks (GRNs) from gene expression data is a major challenge in systems biology. The Path Consistency (PC) algorithm is one of the popular methods in this field. However, as an order-dependent algorithm, the PC algorithm is not robust because it yields different network topologies when gene orders are permuted. In addition, the performance of this algorithm depends on the threshold value used for independence tests. Consequently, selecting a suitable sequential ordering of nodes and an appropriate threshold value as inputs to the PC algorithm are challenges in inferring a good GRN. In this work, we propose a heuristic algorithm, namely SORDER, to find a suitable sequential ordering of nodes. Based on the SORDER algorithm and a suitable interval threshold for Conditional Mutual Information (CMI) tests, a network inference method, namely the Consensus Network (CN), has been developed. In the proposed method, a weighted value is defined for each edge of the complete graph. This value is considered as the reliability value of the dependency between two nodes. The final inferred network, obtained using the CN algorithm, contains the edges whose reliability value exceeds a defined threshold. The effectiveness of this method is benchmarked on several networks from the DREAM challenge and the widely used SOS DNA repair network in Escherichia coli. The results indicate that the CN algorithm is suitable for learning GRNs and that it considerably improves the precision of network inference. The source of data sets and codes are available at .
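The final thresholding step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how the per-edge weighted value is computed, so the sketch assumes, purely for illustration, that reliability is the fraction of candidate networks (for example, runs under different CMI thresholds) in which an undirected edge appears. The function name `consensus_network` and its parameters are hypothetical and are not taken from the authors' code.

```python
from collections import Counter


def consensus_network(networks, threshold):
    """Keep undirected edges whose reliability exceeds `threshold`.

    ASSUMPTION: reliability is the fraction of candidate networks that
    contain the edge; the abstract does not define the actual weighting.
    `networks` is a list of edge collections, each edge a pair of gene names.
    """
    counts = Counter()
    for edges in networks:
        # frozenset makes ("a", "b") and ("b", "a") the same undirected edge,
        # and the set comprehension counts each edge once per network.
        counts.update({frozenset(edge) for edge in edges})

    total = len(networks)
    return {
        tuple(sorted(edge)): count / total
        for edge, count in counts.items()
        if count / total > threshold
    }


candidates = [
    {("a", "b"), ("b", "c")},
    {("a", "b")},
    {("b", "a"), ("c", "d")},
]
consensus = consensus_network(candidates, threshold=0.5)
```

With these toy inputs only the a-b edge, present in all three candidate networks, survives the 0.5 cutoff.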
Affiliation(s)
- Rosa Aghdam
- Faculty of Mathematical Sciences, Department of Statistics, Shahid Beheshti University, G.C., Tehran, Iran.
|
95
|
Youseph ASK, Chetty M, Karmakar G. Decoupled Modeling of Gene Regulatory Networks Using Michaelis-Menten Kinetics. NEURAL INFORMATION PROCESSING 2015:497-505. [DOI: 10.1007/978-3-319-26555-1_56] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/01/2023]
|
96
|
Chen H, Mundra PA, Zhao LN, Lin F, Zheng J. Highly sensitive inference of time-delayed gene regulation by network deconvolution. BMC SYSTEMS BIOLOGY 2014; 8 Suppl 4:S6. [PMID: 25521243 PMCID: PMC4290726 DOI: 10.1186/1752-0509-8-s4-s6] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]
Abstract
Background The gene regulatory network (GRN) is a fundamental topic in systems biology. The dynamics of GRNs can shed light on cellular processes, which facilitates the understanding of disease mechanisms when these processes are dysregulated. Accurate reconstruction of GRNs could also provide guidelines for experimental biologists. Therefore, inferring gene regulatory networks from high-throughput gene expression data is a central problem in systems biology. However, due to the inherent complexity of gene regulation, noise in measuring the data and the short length of time-series data, it is very challenging to reconstruct accurate GRNs. On the other hand, a better understanding of gene regulation could help to improve the performance of GRN inference. Time delay is one of the most important characteristics of gene regulation. By incorporating the information of time delays, we can achieve more accurate inference of GRNs. Results In this paper, we propose a method to infer time-delayed gene regulation based on cross-correlation and network deconvolution (ND). First, we employ cross-correlation to obtain the probable time delays for the interactions between each target gene and its potential regulators. Then, based on the inferred delays, the technique of ND is applied to identify direct interactions between the target gene and its regulators. Experiments on real-life gene expression datasets show that our method achieves overall better performance than existing methods for inferring time-delayed GRNs. Conclusion By taking into account the time delays among gene interactions, our method is able to infer GRNs more accurately. The effectiveness of our method has been shown by experiments on three real-life gene expression datasets of yeast. Compared with other existing methods designed for learning time-delayed GRNs, our method has significantly higher sensitivity without much reduction in specificity.
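The first step named in this abstract, estimating a probable time delay by cross-correlation, can be sketched as follows. This is an illustrative reading only: it assumes equally spaced time points and uses the lagged Pearson correlation as the cross-correlation measure, and it does not cover the network deconvolution step. The function name `estimate_delay` and the `max_lag` parameter are hypothetical, not the authors' API.

```python
import numpy as np


def estimate_delay(regulator, target, max_lag):
    """Return (lag, correlation) maximizing |corr(regulator[t], target[t + lag])|.

    ASSUMPTION: equally spaced time points; lagged Pearson correlation
    stands in for the cross-correlation used in the paper.
    """
    regulator = np.asarray(regulator, dtype=float)
    target = np.asarray(target, dtype=float)
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        x = regulator[: len(regulator) - lag]  # regulator at time t
        y = target[lag:]                       # target at time t + lag
        if len(x) < 3 or x.std() == 0 or y.std() == 0:
            continue  # too few points or a constant series: correlation undefined
        corr = np.corrcoef(x, y)[0, 1]
        if abs(corr) > abs(best_corr):
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


rng = np.random.default_rng(0)
regulator = rng.normal(size=30)
target = np.roll(regulator, 2)  # target follows the regulator by two time steps
lag, corr = estimate_delay(regulator, target, max_lag=5)
```

Using the absolute correlation lets the same scan pick up delayed repression as well as delayed activation; the sign of the returned correlation distinguishes the two.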
|
97
|
Hurley DG, Cursons J, Wang YK, Budden DM, Print CG, Crampin EJ. NAIL, a software toolset for inferring, analyzing and visualizing regulatory networks. BIOINFORMATICS 2014; 31:277-8. [PMID: 25246431 DOI: 10.1093/bioinformatics/btu612] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]
Abstract
The wide variety of published approaches for the problem of regulatory network inference makes using multiple inference algorithms complex and time-consuming. Network Analysis and Inference Library (NAIL) is a set of software tools to simplify the range of computational activities involved in regulatory network inference. It uses a modular approach to connect different network inference algorithms to the same visualization and network-based analyses. NAIL is technology-independent and includes an interface layer to allow easy integration of components into other applications. Availability and implementation NAIL is implemented in MATLAB, runs on Windows, Linux and OSX, and is available from SourceForge at https://sourceforge.net/projects/nailsystemsbiology/ for all researchers to use. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Daniel G Hurley
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
- Joseph Cursons
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
- Yi Kan Wang
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
- David M Budden
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
- Cristin G Print
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
- Edmund J Crampin
- Auckland Bioengineering Institute, University of Auckland, Auckland 1001, New Zealand; Department of Molecular Medicine and Pathology, School of Medical Sciences, Faculty of Medical and Health Sciences, University of Auckland, Auckland 1001, New Zealand; Bioinformatics Institute, University of Auckland, Auckland 1001, New Zealand; Maurice Wilkins Centre, University of Auckland, Auckland 1001, New Zealand; Systems Biology Laboratory, Melbourne School of Engineering, University of Melbourne, Victoria 3010, Australia; Department of Mathematics and Statistics, University of Melbourne School of Medicine, University of Melbourne, Victoria 3010, Australia; and Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, Canada
|
98
|
Gene network biological validity based on gene-gene interaction relevance. ScientificWorldJournal 2014; 2014:540679. [PMID: 25295303 PMCID: PMC4175387 DOI: 10.1155/2014/540679] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2014] [Accepted: 07/11/2014] [Indexed: 01/17/2023] Open
Abstract
In recent years, gene networks have become one of the most useful tools for modeling biological processes. Many gene network inference algorithms have been developed as techniques for extracting knowledge from gene expression data. Ensuring the reliability of the inferred gene relationships is a crucial task in any study in order to show that the algorithms used are precise. Usually, this validation process is carried out using prior biological knowledge. The metabolic pathways stored in KEGG are one of the most widely used knowledge sources for analyzing relationships between genes. This paper introduces a new methodology, GeneNetVal, to assess the biological validity of gene networks based on the relevance of the gene-gene interactions stored in KEGG metabolic pathways. To this end, a complete conversion of KEGG pathways into a gene association network and a new matching distance based on gene-gene interaction relevance are proposed. The performance of GeneNetVal was established with three different experiments. Firstly, our proposal is tested in a comparative ROC analysis. Secondly, a randomness study is presented to show the behavior of GeneNetVal when the noise in the input network is increased. Finally, the ability of GeneNetVal to detect the biological functionality of the network is shown.
|
99
|
Penfold CA, Buchanan-Wollaston V. Modelling transcriptional networks in leaf senescence. JOURNAL OF EXPERIMENTAL BOTANY 2014; 65:3859-73. [PMID: 24600015 DOI: 10.1093/jxb/eru054] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/18/2023]
Abstract
The process of leaf senescence is induced by an extensive range of developmental and environmental signals and controlled by multiple, cross-linking pathways, many of which overlap with plant stress-response signals. Elucidation of this complex regulation requires a step beyond a traditional one-gene-at-a-time analysis. Application of a more global analysis using statistical and mathematical tools of systems biology is an approach that is being applied to address this problem. A variety of modelling methods applicable to the analysis of current and future senescence data are reviewed and discussed using some senescence-specific examples. Network modelling with a senescence transcriptome time course followed by testing predictions with gene-expression data illustrates the application of systems biology tools.
Affiliation(s)
- Vicky Buchanan-Wollaston
- Warwick Systems Biology Centre, University of Warwick, Coventry CV4 7AL, UK; School of Life Sciences, University of Warwick, Coventry CV4 7AL, UK
|
100
|
Ceccarelli M, Cerulo L, Santone A. De novo reconstruction of gene regulatory networks from time series data, an approach based on formal methods. Methods 2014; 69:298-305. [PMID: 24960286 DOI: 10.1016/j.ymeth.2014.06.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2014] [Revised: 06/11/2014] [Accepted: 06/15/2014] [Indexed: 01/18/2023] Open
Abstract
Reverse engineering of gene regulatory relationships from genomics data is a crucial task for dissecting the complex regulatory mechanisms occurring in a cell. From a computational point of view, the reconstruction of gene regulatory networks is an underdetermined problem, as the number of possible solutions is typically large relative to the number of available independent data points. Many possible solutions can fit the available data equally well, but only one of them can be the biologically true solution. Several strategies have been proposed in the literature to reduce the search space and/or extend the amount of independent information. In this paper we propose a novel algorithm based on formal methods, mathematically rigorous techniques widely adopted in engineering to specify and verify complex software and hardware systems. Starting with a formal specification of gene regulatory hypotheses, we are able to mathematically prove whether or not a time course experiment satisfies the formal specification, thereby determining whether a gene regulation exists. The method is able to detect both the direction and the sign (inhibition/activation) of regulations, whereas most methods in the literature are limited to undirected and/or unsigned relationships. We empirically evaluated the approach on experimental and synthetic datasets in terms of precision and recall. In most cases we observed high levels of accuracy, outperforming the current state of the art, although the computational cost increases exponentially with the size of the network. We have made the tool implementing the algorithm available at the following URL: http://www.bioinformatics.unisannio.it.
Affiliation(s)
- Michele Ceccarelli
- Dept. of Science and Technology, University of Sannio, Benevento, Italy; BioGeM, Institute of Genetic Research "Gaetano Salvatore", Ariano Irpino, AV, Italy
- Luigi Cerulo
- Dept. of Science and Technology, University of Sannio, Benevento, Italy; BioGeM, Institute of Genetic Research "Gaetano Salvatore", Ariano Irpino, AV, Italy.
|