1
Segura-Ortiz A, García-Nieto J, Aldana-Montes JF, Navas-Delgado I. Multi-objective context-guided consensus of a massive array of techniques for the inference of Gene Regulatory Networks. Comput Biol Med 2024; 179:108850. [PMID: 39013340 DOI: 10.1016/j.compbiomed.2024.108850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 07/03/2024] [Accepted: 07/03/2024] [Indexed: 07/18/2024]
Abstract
BACKGROUND AND OBJECTIVE: Gene Regulatory Network (GRN) inference is a fundamental task in biology and medicine, as it enables a deeper understanding of the intricate mechanisms of gene expression present in organisms. This bioinformatics problem has been addressed in the literature through multiple computational approaches. Techniques developed for inferring from expression data have employed Bayesian networks, ordinary differential equations (ODEs), machine learning, information theory measures and neural networks, among others. The diversity of implementations and their respective customization have led to the emergence of many tools and multiple specialized domains derived from them, understood as subsets of networks with specific characteristics that are challenging to detect a priori. This specialization has introduced significant uncertainty when choosing the most appropriate technique for a particular dataset. This proposal, named MO-GENECI, builds upon the basic idea of the previous proposal GENECI and optimizes consensus among different inference techniques, through a carefully refined multi-objective evolutionary algorithm guided by various objective functions, linked to the biological context at hand. METHODS: MO-GENECI has been tested on an extensive and diverse academic benchmark of 106 gene regulatory networks from multiple sources and sizes. The evaluation of MO-GENECI compared its performance to individual techniques using key metrics (AUROC and AUPR) for gene regulatory network inference. Friedman's statistical ranking provided an ordered classification, followed by non-parametric Holm tests to determine statistical significance. RESULTS: MO-GENECI's Pareto front approximation facilitates easy selection of an appropriate solution based on generic input data characteristics. The best solution consistently emerged as the winner in all statistical tests, and in many cases, the median precision solution showed no statistically significant difference compared to the winner. CONCLUSIONS: MO-GENECI has not only achieved more accurate results than individual techniques, but has also overcome the uncertainty associated with the initial choice due to its flexibility and adaptability. It is shown to intelligently select the most suitable techniques for each case. The source code is hosted in a public repository at GitHub under MIT license: https://github.com/AdrianSeguraOrtiz/MO-GENECI. Moreover, to facilitate its installation and use, the software associated with this implementation has been encapsulated in a Python package available at PyPI: https://pypi.org/project/geneci/.
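The abstract above mentions Holm tests applied after a Friedman ranking. As a rough illustration only, the following sketch shows Holm's step-down correction on a set of p-values. The function name `holm_reject` and the example p-values are hypothetical and are not taken from the paper or from the geneci package.

```python
def holm_reject(p_values, alpha=0.05):
    """Return a list of booleans: True where Holm's step-down procedure rejects H0."""
    m = len(p_values)
    # Visit hypotheses from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is compared with alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one comparison fails, all larger p-values are retained
    return reject


# Hypothetical p-values from comparing one control method against four others.
example_p = [0.001, 0.04, 0.03, 0.20]
print(holm_reject(example_p))
```

The thresholds become less strict as the procedure moves to larger p-values, which is what makes Holm's method uniformly more powerful than a plain Bonferroni correction at the same family-wise error rate.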
Affiliation(s)
- Adrián Segura-Ortiz
  - Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain
- José García-Nieto
  - Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain
  - Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
- José F Aldana-Montes
  - Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain
  - Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
- Ismael Navas-Delgado
  - Department de Lenguajes y Ciencias de la Computación, ITIS Software, Universidad de Málaga, Málaga, 29071, Spain
  - Biomedical Research Institute of Málaga (IBIMA), Universidad de Málaga, Málaga, Spain
2
Jia Y, Niu Y, Zhao H, Wang Z, Gao C, Wang C, Chen S, Wang Y. Hierarchical transcription factor and regulatory network for drought response in Betula platyphylla. Horticulture Research 2022; 9:uhac040. [PMID: 35184174 PMCID: PMC9070641 DOI: 10.1093/hr/uhac040] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 06/16/2021] [Revised: 01/03/2022] [Accepted: 02/05/2022] [Indexed: 05/16/2023]
Abstract
Although many genes and biological processes involved in abiotic stress response have been identified, how they are regulated remains largely unclear. Here, to study the regulatory mechanism of birch (Betula platyphylla) responding to drought induced by polyethylene glycol (PEG) 6000 (20%, w/v), a partial correlation coefficient-based algorithm for constructing gene regulatory networks (GRNs) was proposed, and a three-layer hierarchical GRN was constructed, including 68 transcription factors (TFs) and 252 structural genes. In total, 1448 predicted regulatory relationships are included, and most of them are novel. The reliability of the GRN was verified by ChIP-PCR and qRT-PCR based on transient transformation. About 55% of genes in the bottom layer of the GRN could confer drought tolerance. We selected two TFs, BpMADS11 and BpNAC090, from the upper layer and characterized their function in drought tolerance. Overexpression of either BpMADS11 or BpNAC090 reduces electrolyte leakage and ROS and MDA contents, giving greater drought tolerance than wild-type birch. According to this GRN, the important biological processes involved in drought were identified, including "signaling hormone pathways", "water transport", "regulation of stomatal movement" and "response to oxidative stress". This work indicated that BpERF017, BpAGL61 and BpNAC090 are the key upstream regulators in birch drought tolerance. Our data clearly revealed that the upstream regulators and TF-DNA interactions regulate different biological processes to adapt to drought stress.
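The abstract above refers to a partial correlation coefficient-based algorithm. The paper's own algorithm is not described here, so the sketch below only illustrates the underlying quantity: a first-order partial correlation, which measures the association between two genes after removing the linear effect of a third. The function names and the synthetic data are illustrative assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Synthetic chain tf -> mid -> target: tf and target are related only through mid.
rng = random.Random(0)
tf = [rng.gauss(0, 1) for _ in range(2000)]
mid = [a + 0.3 * rng.gauss(0, 1) for a in tf]
target = [b + 0.3 * rng.gauss(0, 1) for b in mid]

print(round(pearson(tf, target), 2))            # strong marginal correlation
print(round(partial_corr(tf, target, mid), 2))  # near zero once mid is controlled for
```

In a layered network this is the property that helps separate direct regulators from indirect ones: an edge whose partial correlation vanishes given an intermediate gene is a candidate for removal.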
Affiliation(s)
- Yaqi Jia
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Yani Niu
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Huimin Zhao
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Zhibo Wang
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Caiqiu Gao
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Chao Wang
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Su Chen
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
- Yucheng Wang
  - State Key Laboratory of Tree Genetics and Breeding, Northeast Forestry University, 26 Hexing Road, Harbin 150040, China
3
Naseri A, Sharghi M, Hasheminejad SMH. Enhancing gene regulatory networks inference through hub-based data integration. Comput Biol Chem 2021; 95:107589. [PMID: 34673384 DOI: 10.1016/j.compbiolchem.2021.107589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2021] [Revised: 08/11/2021] [Accepted: 10/04/2021] [Indexed: 12/09/2022]
Abstract
One of the main research topics in computational biology is Gene Regulatory Network (GRN) reconstruction, which refers to inferring the relationships between genes involved in regulating cell conditions in response to internal or external stimuli. To this end, most computational methods use only transcriptional gene expression data to reconstruct gene regulatory networks, but recent studies suggest that gene expression data must be integrated with other types of data to obtain more accurate models predicting real relationships between genes. In this study, a diffusion-based method is enhanced to integrate biological data of network types besides structural prior knowledge. The Random Walk with Restart algorithm (RWR) with an emphasis on hub nodes is executed separately on each network, and low-dimensional feature vectors for network nodes are then jointly optimized by diffusion component analysis. Next, these feature vectors are used to infer gene regulatory networks. Fourteen centrality measures are studied for the detection of hub nodes to be used in the RWR algorithm, and the best centrality measure having the greatest effect on the improvement of gene network inference is selected. A case study for the Saccharomyces cerevisiae and E. coli networks shows that using the proposed features in comparison with gene expression data alone results in 0.02-0.08 units improvement in the Area Under the Receiver Operating Characteristic (AUROC) criterion across different gene regulatory network inference methods. Furthermore, the proposed method was applied to the esophageal cancer data to infer its gene regulatory network. The proposed framework substantially improves accuracy and scalability of GRN inference. The fused features and the best centrality measure detected can be used to provide functional insights about genes or proteins in various biological applications. Moreover, it can serve as a general framework for network data and structural data integration and analysis problems in various scientific disciplines including biology.
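The abstract above builds on the Random Walk with Restart (RWR) algorithm. The sketch below shows plain RWR on a small undirected graph by iterating p ← (1 − r)·W·p + r·e until convergence. It does not reproduce the hub-node emphasis or the diffusion component analysis step of the paper. The function name, the restart probability of 0.3 and the toy graph are assumptions made for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Visiting probabilities of a walker that restarts at node `seed`.

    adj is a symmetric adjacency matrix given as a list of lists.
    """
    n = len(adj)
    # Column-normalise so that each column holds transition probabilities.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
             for i in range(n)]
    start = [1.0 if i == seed else 0.0 for i in range(n)]
    p = start[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
               + restart * start[i] for i in range(n)]
        # Stop when the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy graph: node 0 is a hub linked to 1, 2 and 3; node 3 is also linked to 4.
toy_adj = [
    [0, 1, 1, 1, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
profile = random_walk_with_restart(toy_adj, seed=4)
print([round(v, 3) for v in profile])
```

Each node's profile can be read as a diffusion state describing its neighbourhood; methods of this family then compress such profiles into low-dimensional feature vectors.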
Affiliation(s)
- Atefeh Naseri
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
- Mehran Sharghi
  - Department of Computer Engineering, Alzahra University, Tehran, Iran
4
Zhang Y, Chang X, Liu X. Inference of gene regulatory networks using pseudo-time series data. Bioinformatics 2021; 37:2423-2431. [PMID: 33576787 DOI: 10.1093/bioinformatics/btab099] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 11/04/2020] [Revised: 01/18/2021] [Accepted: 02/10/2021] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Inferring gene regulatory networks (GRNs) from high-throughput data is an important and challenging problem in systems biology. Although numerous GRN methods have been developed, most have focused on the verification of the specific data set. However, it is difficult to establish directed topological networks that are both suitable for time-series and non-time-series datasets due to the complexity and diversity of biological networks. RESULTS: Here, we proposed a novel method, GNIPLR (Gene networks inference based on projection and lagged regression) to infer GRNs from time-series or non-time-series gene expression data. GNIPLR projected gene data twice using the LASSO projection (LSP) algorithm and the linear projection (LP) approximation to produce a linear and monotonous pseudo-time series, and then determined the direction of regulation in combination with lagged regression analyses. The proposed algorithm was validated using simulated and real biological data. Moreover, we also applied the GNIPLR algorithm to the liver hepatocellular carcinoma (LIHC) and bladder urothelial carcinoma (BLCA) cancer expression datasets. These analyses revealed significantly higher accuracy and AUC values than other popular methods. AVAILABILITY: The GNIPLR tool is freely available at https://github.com/zyllluck/GNIPLR. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuelei Zhang
  - Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China
  - Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
- Xiao Chang
  - Institute of Statistics and Applied Mathematics, Anhui University of Finance and Economics, Bengbu, 233030, China
- Xiaoping Liu
  - Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, 310012, China
  - School of Mathematics and Statistics, Shandong University, Weihai, Shandong, 264209, China
5
Schubert M, Colomé-Tatché M, Foijer F. Gene networks in cancer are biased by aneuploidies and sample impurities. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1863:194444. [PMID: 31654805 DOI: 10.1016/j.bbagrm.2019.194444] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/31/2019] [Revised: 09/05/2019] [Accepted: 10/14/2019] [Indexed: 12/14/2022]
Abstract
Gene regulatory network inference is a standard technique for obtaining structured regulatory information from, for instance, gene expression measurements. Methods performing this task have been extensively evaluated on synthetic, and to a lesser extent real data sets. In contrast to these test evaluations, applications to gene expression data of human cancers are often limited by fewer samples and more potential regulatory links, and are biased by copy number aberrations as well as cell mixtures and sample impurities. Here, we take networks inferred from TCGA cohorts as an example to show that (1) transcription factor annotations are essential to obtain reliable networks, and (2) even for state of the art methods, we expect that between 20 and 80% of edges are caused by copy number changes and cell mixtures rather than transcription factor regulation.
Affiliation(s)
- Michael Schubert
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands
  - Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
- Maria Colomé-Tatché
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands
  - Institute of Computational Biology, Helmholtz Zentrum München, Ingolstädter Landstr. 1, 85764 Neuherberg, Germany
  - TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Floris Foijer
  - European Research Institute for the Biology of Ageing, University of Groningen, University Medical Center Groningen, 9713 AV, Groningen, the Netherlands
6
Wu J, Gu Y, Xiao Y, Xia C, Li H, Kang Y, Sun J, Shao Z, Lin Z, Zhao X. Characterization of DNA Methylation Associated Gene Regulatory Networks During Stomach Cancer Progression. Front Genet 2019; 9:711. [PMID: 30778372 PMCID: PMC6369581 DOI: 10.3389/fgene.2018.00711] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/16/2018] [Accepted: 12/18/2018] [Indexed: 01/11/2023] Open
Abstract
DNA methylation plays a critical role in tumorigenesis through regulating oncogene activation and tumor suppressor gene silencing. Although extensively analyzed, the implication of DNA methylation in gene regulatory network is less characterized. To address this issue, in this study we performed an integrative analysis on the alteration of DNA methylation patterns and the dynamics of gene regulatory network topology across distinct stages of stomach cancer. We found the global DNA methylation patterns in different stages are generally conserved, whereas some significantly differentially methylated genes were exclusively observed in the early stage of stomach cancer. Integrative analysis of DNA methylation and network topology alteration yielded several genes which have been reported to be involved in the progression of stomach cancer, such as IGF2, ERBB2, GSTP1, MYH11, TMEM59, and SST. Finally, we demonstrated that inhibition of SST promotes cell proliferation, suggesting that DNA methylation-associated SST suppression possibly contributes to the gastric cancer progression. Taken together, our study suggests the DNA methylation-associated regulatory network analysis could be used for identifying cancer-related genes. This strategy can facilitate the understanding of gene regulatory network in cancer biology and provide a new insight into the study of DNA methylation at system level.
Affiliation(s)
- Jun Wu
  - School of Life Sciences, East China Normal University, Shanghai, China
- Yunzhao Gu
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yawen Xiao
  - Department of Automation, Shanghai Jiao Tong University, Shanghai, China
- Chao Xia
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Hua Li
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Yani Kang
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Jielin Sun
  - Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
- Zhifeng Shao
  - Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, China
- Zongli Lin
  - Charles L. Brown Department of Electrical and Computer Engineering, University of Virginia, Charlottesville, VA, United States
- Xiaodong Zhao
  - Shanghai Center for Systems Biomedicine, Shanghai Jiao Tong University, Shanghai, China
7
Jurman G, Filosi M, Visintainer R, Riccadonna S, Furlanello C. Stability in GRN Inference. Methods Mol Biol 2019; 1883:323-346. [PMID: 30547407 DOI: 10.1007/978-1-4939-8882-2_14] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/18/2022]
Abstract
Reconstructing a gene regulatory network from one or more sets of omics measurements has been a major task of computational biology in the last 20 years. Despite an overwhelming number of algorithms proposed to solve the network inference problem either in the general scenario or in an ad-hoc tailored situation, assessing the stability of reconstruction is still an uncharted territory and exploratory studies mainly tackled theoretical aspects. We introduce here empirical stability, which is induced by variability of reconstruction as a function of data subsampling. By evaluating differences between networks that are inferred using different subsets of the same data we obtain quantitative indicators of the robustness of the algorithm, of the noise level affecting the data, and, overall, of the reliability of the reconstructed graph. We show that empirical stability can be used whenever no ground truth is available to compute a direct measure of the similarity between the inferred structure and the true network. The main ingredient here is a suite of indicators, called NetSI, providing statistics of distances between graphs generated by a given algorithm fed with different data subsets, where the chosen metric is the Hamming-Ipsen-Mikhailov (HIM) distance evaluating dissimilarity of graph topologies with shared nodes. Operatively, the NetSI family is demonstrated here on synthetic and high-throughput datasets, inferring graphs at different resolution levels (topology, direction, weight), showing how the stability indicators can be effectively used for the quantitative comparison of the stability of different reconstruction algorithms.
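The abstract above defines empirical stability as the variability of the reconstructed network under data subsampling. The sketch below illustrates that idea under strong simplifications: inference is a toy correlation threshold, and the distance between networks is only a normalised Hamming distance rather than the HIM distance used by the NetSI indicators. All function names, thresholds and the synthetic data are hypothetical.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def infer_edges(samples, threshold=0.5):
    """Toy inference: link gene pairs whose absolute correlation exceeds threshold."""
    genes = list(zip(*samples))  # one tuple of expression values per gene
    return {(i, j) for i, j in itertools.combinations(range(len(genes)), 2)
            if abs(pearson(genes[i], genes[j])) > threshold}


def hamming_stability(samples, n_resamples=20, fraction=0.8, seed=0):
    """Mean normalised Hamming distance between networks inferred from subsamples."""
    rng = random.Random(seed)
    n_genes = len(samples[0])
    n_pairs = n_genes * (n_genes - 1) // 2
    k = int(fraction * len(samples))
    nets = [infer_edges(rng.sample(samples, k)) for _ in range(n_resamples)]
    # Symmetric difference counts edges present in exactly one of the two networks.
    dists = [len(a ^ b) / n_pairs for a, b in itertools.combinations(nets, 2)]
    return sum(dists) / len(dists)


# Synthetic data: gene 1 closely follows gene 0; genes 2 and 3 are independent noise.
data_rng = random.Random(1)
samples = []
for _ in range(100):
    g0 = data_rng.gauss(0, 1)
    samples.append([g0, g0 + 0.1 * data_rng.gauss(0, 1),
                    data_rng.gauss(0, 1), data_rng.gauss(0, 1)])

print(hamming_stability(samples))
```

A value near 0 means the subsampled networks agree with each other, while larger values indicate that the reconstruction changes with the choice of samples. No ground-truth network is needed to compute it.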
Affiliation(s)
- Roberto Visintainer
  - The Microsoft Research - University of Trento Centre for Computational and Systems Biology (COSBI), Rovereto, Italy
8
Barbosa S, Niebel B, Wolf S, Mauch K, Takors R. A guide to gene regulatory network inference for obtaining predictive solutions: Underlying assumptions and fundamental biological and data constraints. Biosystems 2018; 174:37-48. [PMID: 30312740 DOI: 10.1016/j.biosystems.2018.10.008] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 08/10/2018] [Revised: 10/05/2018] [Accepted: 10/08/2018] [Indexed: 02/07/2023]
Abstract
The study of biological systems at a system level has become a reality due to increasingly powerful computational approaches able to handle increasingly large datasets. Uncovering the dynamic nature of gene regulatory networks in order to attain a system-level understanding and improve the predictive power of biological models is an important research field in systems biology. The task itself presents several challenges, since the problem is combinatorial in nature and depends heavily on several biological constraints and on the intended application. Given the intrinsic interdisciplinary nature of gene regulatory network inference, we present a review of the currently available approaches, their challenges and limitations. We propose guidelines to select the most appropriate method considering the underlying assumptions and fundamental biological and data constraints.
Affiliation(s)
- Sara Barbosa
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Bastian Niebel
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Sebastian Wolf
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Klaus Mauch
  - Insilico Biotechnology AG, Meitnerstrasse 9, 70563 Stuttgart, Germany
- Ralf Takors
  - Institute of Biochemical Engineering, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
9
An integrative method to decode regulatory logics in gene transcription. Nat Commun 2017; 8:1044. [PMID: 29051499 PMCID: PMC5715098 DOI: 10.1038/s41467-017-01193-0] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 01/20/2016] [Accepted: 08/25/2017] [Indexed: 12/27/2022] Open
Abstract
Modeling of transcriptional regulatory networks (TRNs) has been increasingly used to dissect the nature of gene regulation. Inference of regulatory relationships among transcription factors (TFs) and genes, especially among multiple TFs, is still challenging. In this study, we introduced an integrative method, LogicTRN, to decode TF–TF interactions that form TF logics in regulating target genes. By combining cis-regulatory logics and transcriptional kinetics into one single model framework, LogicTRN can naturally integrate dynamic gene expression data and TF-DNA-binding signals in order to identify the TF logics and to reconstruct the underlying TRNs. We evaluated the newly developed methodology using simulation, comparison and application studies, and the results not only show its consistency with existing knowledge, but also demonstrate its ability to accurately reconstruct TRNs in complex biological systems. Existing transcriptional regulatory network models fall short of deciphering the cooperation between multiple transcription factors on dynamic gene expression. Here the authors develop an integrative method that combines gene expression and transcription factor-DNA binding data to decode transcription regulatory logics.
10
Yu B, Xu JM, Li S, Chen C, Chen RX, Wang L, Zhang Y, Wang MH. Inference of time-delayed gene regulatory networks based on dynamic Bayesian network hybrid learning method. Oncotarget 2017; 8:80373-80392. [PMID: 29113310 PMCID: PMC5655205 DOI: 10.18632/oncotarget.21268] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/20/2017] [Accepted: 08/27/2017] [Indexed: 01/31/2023] Open
Abstract
Gene regulatory network (GRN) research reveals complex life phenomena from the perspective of gene interaction and is an important research field in systems biology. Traditional Bayesian networks have a high computational complexity, and the network structure scoring model has a single feature. Information-based approaches cannot identify the direction of regulation. In order to make up for the shortcomings of the above methods, this paper presents a novel hybrid learning method (DBNCS) based on the dynamic Bayesian network (DBN) to construct multiple time-delayed GRNs for the first time, combining the comprehensive score (CS) with the DBN model. The DBNCS algorithm first uses the CMI2NI (conditional mutual inclusive information-based network inference) algorithm for network structure profile learning, namely the construction of the search space. Then the redundant regulations are removed by using the recursive optimization algorithm (RO), thereby reducing the false positive rate. Next, the network structure profiles are decomposed into a set of cliques without loss, which can significantly reduce the computational complexity. Finally, the DBN model is used to identify the direction of gene regulation within the cliques and search for the optimal network structure. The performance of the DBNCS algorithm is evaluated on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in Escherichia coli, and compared with other state-of-the-art methods. The experimental results show the rationality of the algorithm design and the outstanding performance of the GRNs.
Affiliation(s)
- Bin Yu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - CAS Key Laboratory of Geospace Environment, Department of Geophysics and Planetary Science, University of Science and Technology of China, Hefei 230026, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Jia-Meng Xu
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Shan Li
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Cheng Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Rui-Xin Chen
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
- Lei Wang
  - Key Laboratory of Eco-chemical Engineering, Ministry of Education, Laboratory of Inorganic Synthesis and Applied Chemistry, College of Chemistry and Molecular Engineering, Qingdao University of Science and Technology, Qingdao 266042, China
- Yan Zhang
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
  - College of Electromechanical Engineering, Qingdao University of Science and Technology, Qingdao 266061, China
- Ming-Hui Wang
  - College of Mathematics and Physics, Qingdao University of Science and Technology, Qingdao 266061, China
  - Bioinformatics and Systems Biology Research Center, Qingdao University of Science and Technology, Qingdao 266061, China
11
Wang Z, Danziger SA, Heavner BD, Ma S, Smith JJ, Li S, Herricks T, Simeonidis E, Baliga NS, Aitchison JD, Price ND. Combining inferred regulatory and reconstructed metabolic networks enhances phenotype prediction in yeast. PLoS Comput Biol 2017; 13:e1005489. [PMID: 28520713 PMCID: PMC5453602 DOI: 10.1371/journal.pcbi.1005489] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 11/10/2016] [Revised: 06/01/2017] [Accepted: 03/30/2017] [Indexed: 01/24/2023] Open
Abstract
Gene regulatory and metabolic network models have been used successfully in many organisms, but inherent differences between them make networks difficult to integrate. Probabilistic Regulation Of Metabolism (PROM) provides a partial solution, but it does not incorporate network inference and underperforms in eukaryotes. We present an Integrated Deduced REgulation And Metabolism (IDREAM) method that combines statistically inferred Environment and Gene Regulatory Influence Network (EGRIN) models with the PROM framework to create enhanced metabolic-regulatory network models. We used IDREAM to predict phenotypes and genetic interactions between transcription factors and genes encoding metabolic activities in the eukaryote, Saccharomyces cerevisiae. IDREAM models contain many fewer interactions than PROM and yet produce significantly more accurate growth predictions. IDREAM consistently outperformed PROM using any of three popular yeast metabolic models and across three experimental growth conditions. Importantly, IDREAM's enhanced accuracy makes it possible to identify subtle synthetic growth defects. With experimental validation, these novel genetic interactions involving the pyruvate dehydrogenase complex suggested a new role for fatty acid-responsive factor Oaf1 in regulating acetyl-CoA production in glucose grown cells.
Collapse
Affiliation(s)
- Zhuo Wang
  - Key laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Bio-X Institutes, Shanghai Jiao Tong University, Shanghai, China
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
  - Institute for Systems Biology, Seattle, Washington, United States of America
- Samuel A. Danziger
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Center for Infectious Disease Research, Seattle, Washington, United States of America
- Benjamin D. Heavner
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Shuyi Ma
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Center for Infectious Disease Research, Seattle, Washington, United States of America
  - Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, Illinois, United States of America
- Jennifer J. Smith
  - Institute for Systems Biology, Seattle, Washington, United States of America
- Song Li
  - Institute for Systems Biology, Seattle, Washington, United States of America
- Thurston Herricks
  - Institute for Systems Biology, Seattle, Washington, United States of America
- Nitin S. Baliga
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Departments of Biology and Microbiology & Molecular and Cellular Biology Program, University of Washington, Seattle, Washington, United States of America
  - Lawrence Berkeley National Lab, Berkeley, California, United States of America
- John D. Aitchison
  - Institute for Systems Biology, Seattle, Washington, United States of America
  - Center for Infectious Disease Research, Seattle, Washington, United States of America
- Nathan D. Price
  - Institute for Systems Biology, Seattle, Washington, United States of America
|
12
|
Liu F, Zhang SW, Guo WF, Wei ZG, Chen L. Inference of Gene Regulatory Network Based on Local Bayesian Networks. PLoS Comput Biol 2016; 12:e1005024. [PMID: 27479082 PMCID: PMC4968793 DOI: 10.1371/journal.pcbi.1005024] [Citation(s) in RCA: 77] [Impact Index Per Article: 9.6] [Received: 10/27/2015] [Accepted: 06/20/2016] [Indexed: 11/18/2022]
Abstract
The inference of gene regulatory networks (GRNs) from expression data can mine the direct regulations among genes and yield deep insights into biological processes at the network level. During the past decades, numerous computational approaches have been introduced for inferring GRNs. However, many of them still suffer from various problems: Bayesian network (BN) methods cannot handle large-scale networks because of their high computational complexity, while information theory-based methods cannot identify the directions of regulatory interactions and also suffer from false positives and false negatives. To overcome these limitations, in this work we present a novel algorithm, the local Bayesian network (LBN), which infers GRNs from gene expression data using a network decomposition strategy and a false-positive edge elimination scheme. Specifically, the LBN algorithm first uses conditional mutual information (CMI) to construct an initial GRN, which is decomposed into a number of local GRNs. Then, the BN method is employed to generate a series of local BNs by selecting the k-nearest neighbors of each gene as its candidate regulatory genes, which significantly reduces the exponential search space of all possible GRN structures. Integrating these local BNs and applying CMI forms a tentative GRN in which redundant regulations are reduced, which alleviates the false-positive problem. The final GRN is obtained by iteratively applying CMI and local BN to the tentative network; during this iterative process, false or redundant regulations are gradually removed. When tested on the benchmark GRN datasets from the DREAM challenge as well as the SOS DNA repair network in E. coli, our results suggest that LBN significantly outperforms other state-of-the-art methods (ARACNE, GENIE3 and NARROMI), with more accurate and robust performance. In particular, the decomposition strategy with local Bayesian networks not only effectively reduces the computational cost of BN, owing to the much smaller sizes of the local GRNs, but also identifies the directions of the regulations.
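The CMI-based edge elimination described in this abstract can be illustrated with a minimal sketch. The abstract does not state which CMI estimator LBN uses, so the code below assumes jointly Gaussian expression values, for which CMI has a closed form in covariance determinants. The function name `gaussian_cmi` and the toy regulatory chain X -> Z -> Y are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_cmi(x, y, z=None):
    """Estimate I(X;Y|Z) in nats, ASSUMING the variables are jointly Gaussian.

    With Z omitted this reduces to the ordinary mutual information I(X;Y).
    Inputs are 1-D (or samples-by-features) arrays with one row per sample.
    """
    def as_columns(a):
        a = np.asarray(a, dtype=float)
        return a.reshape(a.shape[0], -1)

    def logdet_cov(*blocks):
        # Log-determinant of the sample covariance of the stacked variables.
        data = np.hstack(blocks)
        cov = np.atleast_2d(np.cov(data, rowvar=False))
        _, logdet = np.linalg.slogdet(cov)
        return logdet

    x, y = as_columns(x), as_columns(y)
    if z is None:
        return 0.5 * (logdet_cov(x) + logdet_cov(y) - logdet_cov(x, y))
    z = as_columns(z)
    return 0.5 * (
        logdet_cov(x, z) + logdet_cov(y, z) - logdet_cov(z) - logdet_cov(x, y, z)
    )


# Toy chain X -> Z -> Y (illustrative data, not from the paper):
# X and Y are strongly dependent, but only through Z.
rng = np.random.default_rng(0)
n_samples = 2000
x = rng.normal(size=n_samples)
z = x + 0.5 * rng.normal(size=n_samples)
y = z + 0.5 * rng.normal(size=n_samples)

mi_xy = gaussian_cmi(x, y)          # marginal dependence between X and Y
cmi_xy_z = gaussian_cmi(x, y, z)    # dependence left after conditioning on Z
```

In this toy case the marginal mutual information between X and Y is clearly positive, while the value conditioned on Z is close to zero, so a CMI threshold would drop the indirect X-Y edge and keep X-Z and Z-Y. The threshold itself is a tuning choice that this sketch does not address.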
Affiliation(s)
- Fei Liu
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
  - Institute of Physics and Optoelectronics Technology, Baoji University of Arts and Science, Baoji, China
- Shao-Wu Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Wei-Feng Guo
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Ze-Gang Wei
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
- Luonan Chen
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi’an, China
  - Key Laboratory of Systems Biology, Innovation Center for Cell Signaling Network, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai, China
|