1
Magni S, Sawlekar R, Capelle CM, Tslaf V, Baron A, Zeng N, Mombaerts L, Yue Z, Yuan Y, Hefeng FQ, Gonçalves J. Inferring upstream regulatory genes of FOXP3 in human regulatory T cells from time-series transcriptomic data. NPJ Syst Biol Appl 2024; 10:59. [PMID: 38811598 PMCID: PMC11137136 DOI: 10.1038/s41540-024-00387-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Accepted: 05/10/2024] [Indexed: 05/31/2024] [Open Access]
Abstract
The discovery of upstream regulatory genes of a gene of interest remains challenging. Here we applied a scalable computational method to predict, in an unbiased manner, candidate regulatory genes of critical transcription factors by searching the whole genome. We illustrated our approach with a case study on the master regulator FOXP3 of human primary regulatory T cells (Tregs). While target genes of FOXP3 have been identified, its upstream regulatory machinery remains elusive. Our methodology selected five top-ranked candidates that were tested via proof-of-concept experiments. Following knockdown, three out of five candidates showed significant effects on the mRNA expression of FOXP3 across multiple donors. This provides insights into the regulatory mechanisms modulating FOXP3 transcriptional expression in Tregs. Overall, at the genome level this represents a high level of accuracy in predicting upstream regulatory genes of key genes of interest.
Affiliation(s)
- Stefano Magni
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Rucha Sawlekar
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
  - Robotics and Artificial Intelligence, Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, Luleå, Sweden
- Christophe M Capelle
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
  - Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Vera Tslaf
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
  - Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
  - Transversal Translational Medicine, Luxembourg Institute of Health, Strassen, Luxembourg
- Alexandre Baron
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Ni Zeng
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Laurent Mombaerts
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Zuogong Yue
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
- Ye Yuan
  - School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
- Feng Q Hefeng
  - Department of Infection and Immunity, Luxembourg Institute of Health, Esch-sur-Alzette, Luxembourg
- Jorge Gonçalves
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Belvaux, Luxembourg
  - Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom

2
Chen Y, Mao R, Xu J, Huang Y, Xu J, Cui S, Zhu Z, Ji X, Huang S, Huang Y, Huang HY, Yen SC, Lin YCD, Huang HD. A Causal Regulation Modeling Algorithm for Temporal Events with Application to Escherichia coli's Aerobic to Anaerobic Transition. Int J Mol Sci 2024; 25:5654. [PMID: 38891842 PMCID: PMC11171773 DOI: 10.3390/ijms25115654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2024] [Revised: 05/10/2024] [Accepted: 05/21/2024] [Indexed: 06/21/2024] [Open Access]
Abstract
Time-series experiments are crucial for understanding the transient and dynamic nature of biological phenomena. These experiments, leveraging advanced classification and clustering algorithms, allow for a deep dive into the cellular processes. However, while these approaches effectively identify patterns and trends within data, they often fall short in elucidating the causal mechanisms behind these changes. Building on this foundation, our study introduces a novel algorithm for temporal causal signaling modeling, integrating established knowledge networks with sequential gene expression data to elucidate signal transduction pathways over time. Focusing on Escherichia coli's (E. coli) aerobic to anaerobic transition (AAT), this research marks a significant leap in understanding the organism's metabolic shifts. By applying our algorithm to a comprehensive E. coli regulatory network and a time-series microarray dataset, we constructed the cross-time point core signaling and regulatory processes of E. coli's AAT. Through gene expression analysis, we validated the primary regulatory interactions governing this process. We identified a novel regulatory scheme wherein environmentally responsive genes, soxR and oxyR, activate fur, modulating the nitrogen metabolism regulators fnr and nac. This regulatory cascade controls the stress regulators ompR and lrhA, ultimately affecting the cell motility gene flhD, unveiling a novel regulatory axis that elucidates the complex regulatory dynamics during the AAT process. Our approach, merging empirical data with prior knowledge, represents a significant advance in modeling cellular signaling processes, offering a deeper understanding of microbial physiology and its applications in biotechnology.
Affiliation(s)
- Yigang Chen
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Runbo Mao
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Jiatong Xu
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Yixian Huang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Jingyi Xu
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Shidong Cui
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Zihao Zhu
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Xiang Ji
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Shenghan Huang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Yanzhe Huang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Hsi-Yuan Huang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Shih-Chung Yen
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Yang-Chi-Duang Lin
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
- Hsien-Da Huang
  - School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China
  - Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen, Longgang District, Shenzhen 518172, China

3
Fang WQ, Wu YL, Hwang MJ. A Noise-Tolerating Gene Association Network Uncovering an Oncogenic Regulatory Motif in Lymphoma Transcriptomics. Life (Basel) 2023; 13:1331. [PMID: 37374114 DOI: 10.3390/life13061331] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 05/24/2023] [Accepted: 05/26/2023] [Indexed: 06/29/2023] [Open Access]
Abstract
In cancer genomics research, gene expression provides clues to gene regulation implicated in patients' survival risk. Gene expression, however, fluctuates due to noise arising internally and externally, making its use to infer gene associations, and hence regulatory mechanisms, problematic. Here, we develop a new regression approach to model gene association networks while accounting for uncertain biological noise. In a series of simulation experiments with varying levels of biological noise, the new method was shown to be robust and to perform better than conventional regression methods, as judged by a number of statistical measures of unbiasedness, consistency and accuracy. Application to infer gene associations in germinal-center B cells led to the discovery of a three-by-two regulatory motif gene expression and a three-gene prognostic signature for diffuse large B-cell lymphoma.
Affiliation(s)
- Wei-Quan Fang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
  - Division of New Drug, Center for Drug Evaluation, Taipei 115, Taiwan
- Yu-Le Wu
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan
- Ming-Jing Hwang
  - Institute of Biomedical Sciences, Academia Sinica, Taipei 115, Taiwan

4
Banerjee A, Chandra S, Ott E. Network inference from short, noisy, low time-resolution, partial measurements: Application to C. elegans neuronal calcium dynamics. Proc Natl Acad Sci U S A 2023; 120:e2216030120. [PMID: 36927154 PMCID: PMC10041139 DOI: 10.1073/pnas.2216030120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/22/2022] [Accepted: 02/04/2023] [Indexed: 03/18/2023] [Open Access]
Abstract
Network link inference from measured time series data of the behavior of dynamically interacting network nodes is an important problem with wide-ranging applications, e.g., estimating synaptic connectivity among neurons from measurements of their calcium fluorescence. Network inference methods typically begin by using the measured time series to assign to any given ordered pair of nodes a numerical score reflecting the likelihood of a directed link between those two nodes. In typical cases, the measured time series data may be subject to limitations, including limited duration, low sampling rate, observational noise, and partial nodal state measurement. However, it is unknown how the performance of link inference techniques on such datasets depends on these experimental limitations of data acquisition. Here, we utilize both synthetic data generated from coupled chaotic systems and experimental data obtained from Caenorhabditis elegans neural activity to systematically assess the influence of data limitations on the character of scores reflecting the likelihood of a directed link between a given node pair. We do this for three network inference techniques: Granger causality, transfer entropy, and a machine learning-based method. Furthermore, we assess the ability of appropriate surrogate data to determine statistical confidence levels associated with the results of link-inference techniques.
Affiliation(s)
- Amitava Banerjee
  - Department of Physics, University of Maryland, College Park, MD 20742
  - Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742
- Sarthak Chandra
  - Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
  - McGovern Institute, Massachusetts Institute of Technology, Cambridge, MA 02139
- Edward Ott
  - Department of Physics, University of Maryland, College Park, MD 20742
  - Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, MD 20742
  - Department of Electrical and Computer Engineering, University of Maryland, College Park, MD 20742

5
Song Q, Ruffalo M, Bar-Joseph Z. Using single cell atlas data to reconstruct regulatory networks. Nucleic Acids Res 2023; 51:e38. [PMID: 36762475 PMCID: PMC10123116 DOI: 10.1093/nar/gkad053] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/10/2022] [Revised: 12/16/2022] [Accepted: 01/19/2023] [Indexed: 02/11/2023] [Open Access]
Abstract
Inference of global gene regulatory networks from omics data is a long-term goal of systems biology. Most methods developed for inferring transcription factor (TF)-gene interactions either relied on a small dataset or used snapshot data, which is not suitable for inferring a process that is inherently temporal. Here, we developed a new computational method that combines neural networks and multi-task learning to predict RNA velocity rather than gene expression values. This allows our method to overcome many of the problems faced by prior methods, leading to a more accurate and more comprehensive set of identified regulatory interactions. Application of our method to atlas-scale single-cell data from 6 HuBMAP tissues led to several validated and novel predictions and greatly improved on prior methods proposed for this task.
Affiliation(s)
- Qi Song
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Matthew Ruffalo
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Ziv Bar-Joseph
  - Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Machine Learning Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA

6
Oh VKS, Li RW. Temporal Dynamic Methods for Bulk RNA-Seq Time Series Data. Genes (Basel) 2021; 12:352. [PMID: 33673721 PMCID: PMC7997275 DOI: 10.3390/genes12030352] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/26/2021] [Revised: 02/19/2021] [Accepted: 02/22/2021] [Indexed: 02/06/2023] [Open Access]
Abstract
Dynamic studies in time course experimental designs and clinical approaches have been widely used by the biomedical community. These applications are particularly relevant in stimuli-response models under environmental conditions, characterization of gradient biological processes in developmental biology, identification of therapeutic effects in clinical trials, disease progression models, cell cycle, and circadian periodicity. Despite their feasibility and popularity, sophisticated dynamic methods that are well validated in large-scale comparative studies, in terms of statistical and computational rigor, are benchmarked less often than their static counterparts. To date, a number of novel methods for bulk RNA-Seq data have been developed for various time-dependent stimuli, circadian rhythms, cell lineage in differentiation, and disease progression. Here, we comprehensively review a key set of representative dynamic strategies and discuss current issues associated with the detection of dynamically changing genes. We also provide recommendations for future directions for studying non-periodic and periodic time course data, and meta-dynamic datasets.
Affiliation(s)
- Vera-Khlara S. Oh
  - Animal Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA
  - Department of Computer Science and Statistics, College of Natural Sciences, Jeju National University, Jeju City 63243, Korea
- Robert W. Li
  - Animal Genomics and Improvement Laboratory, United States Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705, USA

7
Banerjee A, Pathak J, Roy R, Restrepo JG, Ott E. Using machine learning to assess short term causal dependence and infer network links. Chaos 2019; 29:121104. [PMID: 31893648 DOI: 10.1063/1.5134845] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/04/2019] [Accepted: 12/05/2019] [Indexed: 06/10/2023]
Abstract
We introduce and test a general machine-learning-based technique for the inference of short term causal dependence between state variables of an unknown dynamical system from time-series measurements of its state variables. Our technique leverages the results of a machine learning process for short time prediction to achieve our goal. The basic idea is to use the machine learning to estimate the elements of the Jacobian matrix of the dynamical flow along an orbit. The type of machine learning that we employ is reservoir computing. We present numerical tests on link inference of a network of interacting dynamical nodes. It is seen that dynamical noise can greatly enhance the effectiveness of our technique, while observational noise degrades the effectiveness. We believe that the competition between these two opposing types of noise will be the key factor determining the success of causal inference in many of the most important application situations.
Affiliation(s)
- Amitava Banerjee
  - Department of Physics and Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland 20742, USA
- Jaideep Pathak
  - Department of Physics and Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland 20742, USA
- Rajarshi Roy
  - Department of Physics and Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland 20742, USA
- Juan G Restrepo
  - Department of Applied Mathematics, University of Colorado, Boulder, Colorado 80309, USA
- Edward Ott
  - Department of Physics and Institute for Research in Electronics and Applied Physics, University of Maryland, College Park, Maryland 20742, USA

8
van der Wijst MGP, de Vries DH, Brugge H, Westra HJ, Franke L. An integrative approach for building personalized gene regulatory networks for precision medicine. Genome Med 2018; 10:96. [PMID: 30567569 PMCID: PMC6299585 DOI: 10.1186/s13073-018-0608-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Indexed: 02/01/2023] [Open Access]
Abstract
Only a small fraction of patients respond to the drug prescribed to treat their disease, which means that most are at risk of unnecessary exposure to side effects through ineffective drugs. This inter-individual variation in drug response is driven by differences in gene interactions caused by each patient's genetic background, environmental exposures, and the proportions of specific cell types involved in disease. These gene interactions can now be captured by building gene regulatory networks, by taking advantage of RNA velocity (the time derivative of the gene expression state), the ability to study hundreds of thousands of cells simultaneously, and the falling price of single-cell sequencing. Here, we propose an integrative approach that leverages these recent advances in single-cell data with the sensitivity of bulk data to enable the reconstruction of personalized, cell-type- and context-specific gene regulatory networks. We expect this approach will allow the prioritization of key driver genes for specific diseases and will provide knowledge that opens new avenues towards improved personalized healthcare.
Affiliation(s)
- Monique G P van der Wijst
  - Department of Genetics, 5th floor ERIBA building, Antonius Deusinglaan 1, 9713AV Groningen, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Dylan H de Vries
  - Department of Genetics, 5th floor ERIBA building, Antonius Deusinglaan 1, 9713AV Groningen, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Harm Brugge
  - Department of Genetics, 5th floor ERIBA building, Antonius Deusinglaan 1, 9713AV Groningen, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Harm-Jan Westra
  - Department of Genetics, 5th floor ERIBA building, Antonius Deusinglaan 1, 9713AV Groningen, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands
- Lude Franke
  - Department of Genetics, 5th floor ERIBA building, Antonius Deusinglaan 1, 9713AV Groningen, University of Groningen, University Medical Center Groningen, Groningen, The Netherlands

9
Abbaszadeh O, Khanteymoori AR, Azarpeyvand A. Parallel Algorithms for Inferring Gene Regulatory Networks: A Review. Curr Genomics 2018; 19:603-614. [PMID: 30386172 PMCID: PMC6194435 DOI: 10.2174/1389202919666180601081718] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/18/2017] [Revised: 02/20/2018] [Accepted: 05/22/2018] [Indexed: 11/22/2022] [Open Access]
Abstract
Systems biology problems such as whole-genome network construction from large-scale gene expression data are sophisticated and time-consuming. Therefore, using sequential algorithms is not feasible for obtaining a solution in an acceptable amount of time. Today, by using massively parallel computing, it is possible to infer large-scale gene regulatory networks. Recently, establishing gene regulatory networks from large-scale datasets has drawn the noticeable attention of researchers in the fields of parallel computing and systems biology. In this paper, we attempt to provide a more detailed overview of the recent parallel algorithms for constructing gene regulatory networks. Firstly, the fundamentals of gene regulatory network inference and the challenges of large-scale datasets are given. Secondly, a detailed description of four parallel frameworks and libraries, namely CUDA, OpenMP, MPI, and Hadoop, is given. Thirdly, parallel algorithms are reviewed. Finally, some conclusions and guidelines for parallel reverse engineering are described.
Affiliation(s)
- Omid Abbaszadeh
  - Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran
- Ali Reza Khanteymoori
  - Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran
- Ali Azarpeyvand
  - Department of Electrical and Computer Engineering, University of Zanjan, Zanjan, Iran

10
Li Y, Chen J, Jiang L, Zeng N, Jiang H, Du M. The p53–Mdm2 regulation relationship under different radiation doses based on the continuous–discrete extended Kalman filter algorithm. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.08.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
11
Simak M, Yeang CH, Lu HHS. Exploring candidate biological functions by Boolean Function Networks for Saccharomyces cerevisiae. PLoS One 2017; 12:e0185475. [PMID: 28981547 PMCID: PMC5628832 DOI: 10.1371/journal.pone.0185475] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 04/01/2017] [Accepted: 09/13/2017] [Indexed: 01/26/2023] [Open Access]
Abstract
The large amount of gene expression data poses a major challenge for the discovery of gene regulatory networks (GRNs). For network reconstruction and the investigation of regulatory relations, it is desirable to ensure directness of links between genes on a map, infer their directionality and explore candidate biological functions from high-throughput transcriptomic data. To address these problems, we introduce a Boolean Function Network (BFN) model based on the techniques of hidden Markov models (HMM), the likelihood ratio test and Boolean logic functions. BFN consists of two consecutive tests to establish links between pairs of genes and check their directness. We evaluate the performance of BFN through its application to S. cerevisiae time course data. BFN produces regulatory relations that are consistent with the succession of cell cycle phases. Furthermore, it improves sensitivity and specificity when compared with alternative methods of genetic network reverse engineering. Moreover, we demonstrate that BFN can provide proper resolution for GO enrichment of gene sets. Finally, the Boolean functions discovered by BFN can provide useful insights for the identification of control mechanisms of regulatory processes, which is a particular advantage of the proposed approach. In combination with low computational complexity, BFN can serve as an efficient screening tool to reconstruct gene relations at the whole-genome level. In addition, the BFN approach is also applicable to a wide range of time course datasets.
Affiliation(s)
- Maria Simak
  - Bioinformatics Program, Taiwan International Graduate Program, Institute of Information Science, Academia Sinica, Taipei, Taiwan
  - Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan
- Henry Horng-Shing Lu
  - Institute of Statistics, National Chiao Tung University, Hsinchu, Taiwan
  - Big Data Research Center, National Chiao Tung University, Hsinchu, Taiwan

12
Kordmahalleh MM, Sefidmazgi MG, Harrison SH, Homaifar A. Identifying time-delayed gene regulatory networks via an evolvable hierarchical recurrent neural network. BioData Min 2017; 10:29. [PMID: 28785315 PMCID: PMC5543747 DOI: 10.1186/s13040-017-0146-4] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/07/2017] [Accepted: 07/14/2017] [Indexed: 01/25/2023] [Open Access]
Abstract
BACKGROUND: The modeling of genetic interactions within a cell is crucial for a basic understanding of physiology and for applied areas such as drug design. Interactions in gene regulatory networks (GRNs) include effects of transcription factors, repressors, small metabolites, and microRNA species. In addition, the effects of regulatory interactions are not always simultaneous, but can occur after a finite time delay, or as a combined outcome of simultaneous and time delayed interactions. Powerful biotechnologies have been rapidly and successfully measuring levels of genetic expression to illuminate different states of biological systems. This has led to an ensuing challenge to improve the identification of specific regulatory mechanisms through regulatory network reconstructions. Solutions to this challenge will ultimately help to spur forward efforts based on the usage of regulatory network reconstructions in systems biology applications.
METHODS: We have developed a hierarchical recurrent neural network (HRNN) that identifies time-delayed gene interactions using time-course data. A customized genetic algorithm (GA) was used to optimize hierarchical connectivity of regulatory genes and a target gene. The proposed design provides a non-fully connected network with the flexibility of using recurrent connections inside the network. These features and the non-linearity of the HRNN facilitate the process of identifying temporal patterns of a GRN.
RESULTS: Our HRNN method was implemented with the Python language. It was first evaluated on simulated data representing linear and nonlinear time-delayed gene-gene interaction models across a range of network sizes and variances of noise. We then further demonstrated the capability of our method in reconstructing GRNs of the Saccharomyces cerevisiae synthetic network for in vivo benchmarking of reverse-engineering and modeling approaches (IRMA). We compared the performance of our method to TD-ARACNE, HCC-CLINDE, TSNI and ebdbNet across different network sizes and levels of stochastic noise. We found our HRNN method to be superior in terms of accuracy for nonlinear data sets with higher amounts of noise.
CONCLUSIONS: The proposed method identifies time-delayed gene-gene interactions of GRNs. The topology-based advancement of our HRNN worked as expected by more effectively modeling nonlinear data sets. As a non-fully connected network, an added benefit to HRNN was how it helped to find the few genes which regulated the target gene over different time delays.
Affiliation(s)
- Mina Moradi Kordmahalleh
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
| | - Mohammad Gorji Sefidmazgi
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
| | - Scott H Harrison
- Department of Biology, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
| | - Abdollah Homaifar
- Department of Electrical and Computer Engineering, North Carolina A&T State University, 1601 E. Market Street, Greensboro, 27411 NC USA
| |
|
13
|
Li X, Omotere O, Qian L, Dougherty ER. Review of stochastic hybrid systems with applications in biological systems modeling and analysis. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2017; 2017:8. [PMID: 28667450 PMCID: PMC5493609 DOI: 10.1186/s13637-017-0061-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/25/2017] [Accepted: 06/05/2017] [Indexed: 11/10/2022]
Abstract
Stochastic hybrid systems (SHS) have attracted considerable research interest in recent years. In this paper, we review some of the recent applications of SHS to biological systems modeling and analysis. Due to the nature of molecular interactions, many biological processes can be conveniently described as a mixture of continuous and discrete phenomena employing SHS models. With the advancement of SHS theory, it is expected that insights can be obtained about biological processes such as drug effects on gene regulation. Furthermore, combined with advanced experimental methods, in silico simulations using SHS modeling techniques can be carried out for massive and rapid verification or falsification of biological hypotheses. The hope is to substitute costly and time-consuming in vitro or in vivo experiments or provide guidance for those experiments and generate better hypotheses.
Affiliation(s)
- Xiangfang Li
- Department of Electrical and Computer Engineering, Prairie View A&M University, Prairie View, 77446, TX, USA.
| | - Oluwaseyi Omotere
- Department of Electrical and Computer Engineering, Prairie View A&M University, Prairie View, 77446, TX, USA
| | - Lijun Qian
- Department of Electrical and Computer Engineering, Prairie View A&M University, Prairie View, 77446, TX, USA
| | - Edward R Dougherty
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, 77843, TX, USA
| |
|
14
|
Modeling Delayed Dynamics in Biological Regulatory Networks from Time Series Data. ALGORITHMS 2017. [DOI: 10.3390/a10010008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/17/2022]
|
15
|
Cannoodt R, Saelens W, Saeys Y. Computational methods for trajectory inference from single-cell transcriptomics. Eur J Immunol 2016; 46:2496-2506. [DOI: 10.1002/eji.201646347] [Citation(s) in RCA: 112] [Impact Index Per Article: 14.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2016] [Revised: 08/30/2016] [Accepted: 09/26/2016] [Indexed: 12/22/2022]
Affiliation(s)
- Robrecht Cannoodt
- Data Mining and Modelling for Biomedicine group; VIB Inflammation Research Center; Ghent Belgium
- Department of Internal Medicine; Ghent University; Ghent Belgium
- Center for Medical Genetics; Ghent University; Ghent Belgium
- Cancer Research Institute Ghent (CRIG); Ghent Belgium
| | - Wouter Saelens
- Data Mining and Modelling for Biomedicine group; VIB Inflammation Research Center; Ghent Belgium
- Department of Internal Medicine; Ghent University; Ghent Belgium
| | - Yvan Saeys
- Data Mining and Modelling for Biomedicine group; VIB Inflammation Research Center; Ghent Belgium
- Department of Internal Medicine; Ghent University; Ghent Belgium
| |
|
16
|
Hodos RA, Kidd BA, Khader S, Readhead BP, Dudley JT. In silico methods for drug repurposing and pharmacology. WILEY INTERDISCIPLINARY REVIEWS. SYSTEMS BIOLOGY AND MEDICINE 2016; 8:186-210. [PMID: 27080087 PMCID: PMC4845762 DOI: 10.1002/wsbm.1337] [Citation(s) in RCA: 181] [Impact Index Per Article: 22.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 11/30/2015] [Revised: 02/08/2016] [Accepted: 02/11/2016] [Indexed: 12/18/2022]
Abstract
Data in the biological, chemical, and clinical domains are accumulating at ever-increasing rates and have the potential to accelerate and inform drug development in new ways. Challenges and opportunities now lie in developing analytic tools to transform these often complex and heterogeneous data into testable hypotheses and actionable insights. This is the aim of computational pharmacology, which uses in silico techniques to better understand and predict how drugs affect biological systems, which can in turn improve clinical use, avoid unwanted side effects, and guide selection and development of better treatments. One exciting application of computational pharmacology is drug repurposing-finding new uses for existing drugs. Already yielding many promising candidates, this strategy has the potential to improve the efficiency of the drug development process and reach patient populations with previously unmet needs such as those with rare diseases. While current techniques in computational pharmacology and drug repurposing often focus on just a single data modality such as gene expression or drug-target interactions, we argue that methods such as matrix factorization that can integrate data within and across diverse data types have the potential to improve predictive performance and provide a fuller picture of a drug's pharmacological action. WIREs Syst Biol Med 2016, 8:186-210. doi: 10.1002/wsbm.1337 For further resources related to this article, please visit the WIREs website.
Affiliation(s)
- Rachel A Hodos
- New York University and Icahn School of Medicine at Mt. Sinai, New York, NY
| | - Brian A Kidd
- Icahn School of Medicine at Mt. Sinai, New York, NY
| |
|
17
|
Neural model of gene regulatory network: a survey on supportive meta-heuristics. Theory Biosci 2016; 135:1-19. [DOI: 10.1007/s12064-016-0224-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/02/2015] [Accepted: 03/21/2016] [Indexed: 12/21/2022]
|
18
|
Acerbi E, Viganò E, Poidinger M, Mortellaro A, Zelante T, Stella F. Continuous time Bayesian networks identify Prdm1 as a negative regulator of TH17 cell differentiation in humans. Sci Rep 2016; 6:23128. [PMID: 26976045 PMCID: PMC4791550 DOI: 10.1038/srep23128] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/19/2015] [Accepted: 02/29/2016] [Indexed: 02/05/2023] Open
Abstract
T helper 17 (TH17) cells represent a pivotal adaptive cell subset involved in multiple immune disorders in mammalian species. Deciphering the molecular interactions regulating TH17 cell differentiation is particularly critical for novel drug target discovery designed to control maladaptive inflammatory conditions. Using continuous time Bayesian networks over a time-course gene expression dataset, we inferred the global regulatory network controlling TH17 differentiation. From the network, we identified the Prdm1 gene encoding the B lymphocyte-induced maturation protein 1 as a crucial negative regulator of human TH17 cell differentiation. The results have been validated by perturbing Prdm1 expression on freshly isolated CD4(+) naïve T cells: reduction of Prdm1 expression leads to augmentation of IL-17 release. These data unravel a possible novel target to control TH17 polarization in inflammatory disorders. Furthermore, this study represents the first in vitro validation of continuous time Bayesian networks as gene network reconstruction method and as hypothesis generation tool for wet-lab biological experiments.
Affiliation(s)
- Enzo Acerbi
- Singapore Centre on Environmental Life Sciences Engineering (Nanyang Technological University), Singapore 637551
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos #04-06, Singapore 138648
| | - Elena Viganò
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos #04-06, Singapore 138648
| | - Michael Poidinger
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos #04-06, Singapore 138648
| | - Alessandra Mortellaro
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos #04-06, Singapore 138648
| | - Teresa Zelante
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos #04-06, Singapore 138648
- Department of Experimental Medicine, University of Perugia, 06132 Perugia, Italy
| | - Fabio Stella
- Department of Informatics, Systems and Communication, University of Milano-Bicocca, Viale Sarca 336, Building U14, 20126 Milan, Italy
| |
|
19
|
Omranian N, Eloundou-Mbebi JMO, Mueller-Roeber B, Nikoloski Z. Gene regulatory network inference using fused LASSO on multiple data sets. Sci Rep 2016; 6:20533. [PMID: 26864687 PMCID: PMC4750075 DOI: 10.1038/srep20533] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/19/2015] [Accepted: 01/06/2016] [Indexed: 01/14/2023] Open
Abstract
Devising computational methods to accurately reconstruct gene regulatory networks given gene expression data is key to systems biology applications. Here we propose a method for reconstructing gene regulatory networks by simultaneous consideration of data sets from different perturbation experiments and corresponding controls. The method imposes three biologically meaningful constraints: (1) expression levels of each gene should be explained by the expression levels of a small number of transcription factor coding genes, (2) networks inferred from different data sets should be similar with respect to the type and number of regulatory interactions, and (3) relationships between genes which exhibit similar differential behavior over the considered perturbations should be favored. We demonstrate that these constraints can be transformed in a fused LASSO formulation for the proposed method. The comparative analysis on transcriptomics time-series data from prokaryotic species, Escherichia coli and Mycobacterium tuberculosis, as well as a eukaryotic species, mouse, demonstrated that the proposed method has the advantages of the most recent approaches for regulatory network inference, while obtaining better performance and assigning higher scores to the true regulatory links. The study indicates that the combination of sparse regression techniques with other biologically meaningful constraints is a promising framework for gene regulatory network reconstructions.
Affiliation(s)
- Nooshin Omranian
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
- Department of Molecular Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, Haus 20, 14476 Potsdam, Germany
| | - Jeanne M. O. Eloundou-Mbebi
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
| | - Bernd Mueller-Roeber
- Department of Molecular Biology, University of Potsdam, Karl-Liebknecht-Str. 24-25, Haus 20, 14476 Potsdam, Germany
| | - Zoran Nikoloski
- Systems Biology and Mathematical Modelling Group, Max Planck Institute for Molecular Plant Physiology, Am Muehlenberg 1, 14476 Potsdam, Germany
| |
|
20
|
Clustering and Differential Alignment Algorithm: Identification of Early Stage Regulators in the Arabidopsis thaliana Iron Deficiency Response. PLoS One 2015; 10:e0136591. [PMID: 26317202 PMCID: PMC4552565 DOI: 10.1371/journal.pone.0136591] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/08/2015] [Accepted: 08/05/2015] [Indexed: 11/25/2022] Open
Abstract
Time course transcriptome datasets are commonly used to predict key gene regulators associated with stress responses and to explore gene functionality. Techniques developed to extract causal relationships between genes from high throughput time course expression data are limited by low signal levels coupled with noise and sparseness in time points. We deal with these limitations by proposing the Cluster and Differential Alignment Algorithm (CDAA). This algorithm was designed to process transcriptome data by first grouping genes based on stages of activity and then using similarities in gene expression to predict influential connections between individual genes. Regulatory relationships are assigned based on pairwise alignment scores generated using the expression patterns of two genes and some inferred delay between the regulator and the observed activity of the target. We applied the CDAA to an iron deficiency time course microarray dataset to identify regulators that influence 7 target transcription factors known to participate in the Arabidopsis thaliana iron deficiency response. The algorithm predicted that 7 regulators previously unlinked to iron homeostasis influence the expression of these known transcription factors. We validated over half of predicted influential relationships using qRT-PCR expression analysis in mutant backgrounds. One predicted regulator-target relationship was shown to be a direct binding interaction according to yeast one-hybrid (Y1H) analysis. These results serve as a proof of concept emphasizing the utility of the CDAA for identifying unknown or missing nodes in regulatory cascades, providing the fundamental knowledge needed for constructing predictive gene regulatory networks. We propose that this tool can be used successfully for similar time course datasets to extract additional information and infer reliable regulatory connections for individual genes.
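The idea of scoring a regulator against a target at some inferred delay can be illustrated with a small sketch. It does not reproduce the CDAA alignment score, which the abstract does not specify; a plain lagged Pearson correlation is swapped in as a stand-in, and the names `pearson`, `best_lag`, and `max_lag` as well as the toy series are hypothetical.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def best_lag(regulator, target, max_lag):
    """Return (lag, score) maximizing |correlation| of regulator[t] with target[t + lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        r = regulator[: len(regulator) - lag]  # regulator values that still have a delayed partner
        t = target[lag:]                       # target shifted forward by `lag` time points
        if len(r) < 3:                         # too few overlapping points to score
            break
        score = pearson(r, t)
        if abs(score) > abs(best[1]):
            best = (lag, score)
    return best

# Toy series (assumed data): the target repeats the regulator two time points later.
regulator = [0.1, 0.9, 0.4, 1.5, 0.2, 1.1, 0.7, 0.3]
target = [0.0, 0.0] + regulator[:-2]
lag, score = best_lag(regulator, target, max_lag=3)
print(lag, round(score, 3))
```

With real time courses the number of overlapping points shrinks as the lag grows, so scores at long delays rest on less data and should be read with more caution.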
|
21
|
Chen C, Yao Y, Zhang L, Xu M, Jiang J, Dou T, Lin W, Zhao G, Huang M, Zhou Y. A Comprehensive Analysis of the Transcriptomes of Marssonina brunnea and Infected Poplar Leaves to Capture Vital Events in Host-Pathogen Interactions. PLoS One 2015. [PMID: 26222429 PMCID: PMC4519268 DOI: 10.1371/journal.pone.0134246] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open
Abstract
Background Understanding host-pathogen interaction mechanisms helps to elucidate the entire infection process and focus on important events, and it is a promising approach for improvement of disease control and selection of treatment strategy. Time-course host-pathogen transcriptome analyses and network inference have been applied to unravel the direct or indirect relationships of gene expression alterations. However, time series analyses can suffer from absent time points due to technical problems such as RNA degradation, which limits the application of algorithms that require strict sequential sampling. Here, we introduce an efficient method using independence test to infer an independent network that is exclusively concerned with the frequency of gene expression changes. Results Highly resistant NL895 poplar leaves and weakly resistant NL214 leaves were infected with highly active and weakly active Marssonina brunnea, respectively, and were harvested at different time points. The independent network inference illustrated the top 1,000 vital fungus-poplar relationships, which contained 768 fungal genes and 54 poplar genes. These genes could be classified into three categories: a fungal gene surrounded by many poplar genes; a poplar gene connected to many fungal genes; and other genes (possessing low degrees of connectivity). Notably, the fungal gene M6_08342 (a metalloprotease) was connected to 10 poplar genes, particularly including two disease-resistance genes. These core genes, which are surrounded by other genes, may be of particular importance in complicated infection processes and worthy of further investigation. Conclusions We provide a clear framework of the interaction network and identify a number of candidate key effectors in this process, which might assist in functional tests, resistant clone selection, and disease control in the future.
Affiliation(s)
- Chengwen Chen
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
- Shanghai Jiao Tong University School of Medicine, Shanghai, People's Republic of China
| | - Ye Yao
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Center for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, People’s Republic of China
| | - Liang Zhang
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
| | - Minjie Xu
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
| | - Jianping Jiang
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
| | - Tonghai Dou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
| | - Wei Lin
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Center for Computational Systems Biology and School of Mathematical Sciences, Fudan University, Shanghai, People’s Republic of China
| | - Guoping Zhao
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
| | - Minren Huang
- Jiangsu Key Laboratory for Poplar Germplasm Enhancement and Variety Improvement, Nanjing Forestry University, Nanjing, People’s Republic of China
| | - Yan Zhou
- State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Shanghai, People’s Republic of China
- Shanghai-MOST Key Laboratory of Health and Disease Genomics, Chinese National Human Genome Center at Shanghai, Shanghai, People's Republic of China
| |
|
22
|
System-wide analysis of the transcriptional network of human myelomonocytic leukemia cells predicts attractor structure and phorbol-ester-induced differentiation and dedifferentiation transitions. Sci Rep 2015; 5:8283. [PMID: 25655563 PMCID: PMC4319166 DOI: 10.1038/srep08283] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2014] [Accepted: 01/09/2015] [Indexed: 11/24/2022] Open
Abstract
We present a system-wide transcriptional network structure that controls cell types in the context of expression pattern transitions that correspond to cell type transitions. Co-expression based analyses uncovered a system-wide, ladder-like transcription factor cluster structure composed of nearly 1,600 transcription factors in a human transcriptional network. Computer simulations based on a transcriptional regulatory model deduced from the system-wide, ladder-like transcription factor cluster structure reproduced expression pattern transitions when human THP-1 myelomonocytic leukaemia cells cease proliferation and differentiate under phorbol myristate acetate stimulation. The behaviour of MYC, a reprogramming Yamanaka factor that was suggested to be essential for induced pluripotent stem cells during dedifferentiation, could be interpreted based on the transcriptional regulation predicted by the system-wide, ladder-like transcription factor cluster structure. This study introduces a novel system-wide structure to transcriptional networks that provides new insights into network topology.
|
23
|
Acerbi E, Zelante T, Narang V, Stella F. Gene network inference using continuous time Bayesian networks: a comparative study and application to Th17 cell differentiation. BMC Bioinformatics 2014; 15:387. [PMID: 25495206 PMCID: PMC4267461 DOI: 10.1186/s12859-014-0387-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2014] [Accepted: 11/17/2014] [Indexed: 12/17/2022] Open
Abstract
Background Dynamic aspects of gene regulatory networks are typically investigated by measuring system variables at multiple time points. Current state-of-the-art computational approaches for reconstructing gene networks directly build on such data, making a strong assumption that the system evolves in a synchronous fashion at fixed points in time. However, nowadays omics data are being generated with increasing time course granularity. Thus, modellers now have the possibility to represent the system as evolving in continuous time and to improve the models’ expressiveness. Results Continuous time Bayesian networks are proposed as a new approach for gene network reconstruction from time course expression data. Their performance was compared to two state-of-the-art methods: dynamic Bayesian networks and Granger causality analysis. On simulated data, the methods comparison was carried out for networks of increasing size, for measurements taken at different time granularity densities and for measurements unevenly spaced over time. Continuous time Bayesian networks outperformed the other methods in terms of the accuracy of regulatory interactions learnt from data for all network sizes. Furthermore, their performance degraded smoothly as the size of the network increased. Continuous time Bayesian networks were significantly better than dynamic Bayesian networks for all time granularities tested and better than Granger causality for dense time series. Both continuous time Bayesian networks and Granger causality performed robustly for unevenly spaced time series, with no significant loss of performance compared to the evenly spaced case, while the same did not hold true for dynamic Bayesian networks. The comparison included the IRMA experimental datasets which confirmed the effectiveness of the proposed method. 
Continuous time Bayesian networks were then applied to elucidate the regulatory mechanisms controlling murine T helper 17 (Th17) cell differentiation and were found to be effective in discovering well-known regulatory mechanisms, as well as new plausible biological insights. Conclusions Continuous time Bayesian networks were effective on networks of both small and large size and were particularly feasible when the measurements were not evenly distributed over time. Reconstruction of the murine Th17 cell differentiation network using continuous time Bayesian networks revealed several autocrine loops, suggesting that Th17 cells may be auto regulating their own differentiation process.
Affiliation(s)
- Enzo Acerbi
- Singapore Immunology Network (SIgN), A*STAR, 8A Biomedical Grove, Immunos Building, Level 4 138648, Singapore.
| |
|
24
|
Yin X, Sakata K, Komatsu S. Phosphoproteomics reveals the effect of ethylene in soybean root under flooding stress. J Proteome Res 2014; 13:5618-34. [PMID: 25316100 DOI: 10.1021/pr500621c] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
Flooding has severe negative effects on soybean growth. To explore the flooding-responsive mechanisms in early-stage soybean, a phosphoproteomic approach was used. Two-day-old soybean plants were treated without or with flooding for 3, 6, 12, and 24 h, and root tip proteins were then extracted and analyzed at each time point. After 3 h of flooding exposure, the fresh weight of soybeans increased, whereas the ATP content of soybean root tips decreased. Using a gel-free proteomic technique, a total of 114 phosphoproteins were identified in the root tip samples, and 34 of the phosphoproteins were significantly changed with respect to phosphorylation status after 3 h of flooding stress. Among these phosphoproteins, eukaryotic translation initiation factors were dephosphorylated, whereas several protein synthesis-related proteins were phosphorylated. The mRNA expression levels of sucrose phosphate synthase 1F and eukaryotic translation initiation factor 4 G were down-regulated, whereas UDP-glucose 6-dehydrogenase mRNA expression was up-regulated during growth but down-regulated under flooding stress. Furthermore, bioinformatic protein interaction analysis of flooding-responsive proteins based on temporal phosphorylation patterns indicated that eukaryotic translation initiation factor 4 G was located in the center of the network during flooding. Soybean eukaryotic translation initiation factor 4 G has homology to programmed cell death 4 protein and is implicated in ethylene signaling. The weight of soybeans was increased with treatment by an ethylene-releasing agent under flooding condition, but it was decreased when plants were exposed to an ethylene receptor antagonist. These results suggest that the ethylene signaling pathway plays an important role, via the protein phosphorylation, in mechanisms of plant tolerance to the initial stages of flooding stress in soybean root tips.
Affiliation(s)
- Xiaojian Yin
- Graduate School of Life and Environmental Sciences, University of Tsukuba , Tsukuba 305-8572, Japan
| |
|
25
|
Fu LM, Fu KA. Analysis of Parkinson's disease pathophysiology using an integrated genomics-bioinformatics approach. Pathophysiology 2014; 22:15-29. [PMID: 25466606 DOI: 10.1016/j.pathophys.2014.10.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/29/2014] [Revised: 10/11/2014] [Accepted: 10/20/2014] [Indexed: 11/28/2022]
Abstract
The pathogenesis and pathophysiology of a disease determine how it should be diagnosed and treated. Yet, understanding the cause and mechanisms of progression often requires intensive human efforts, especially for diseases with complex etiology. The latest genomic technology coupled with advanced, large-scale data analysis in the field known as bioinformatics has promised a high-throughput approach that can quickly identify disease-affected genes and pathways by examining tissue samples collected from patients and control subjects. Furthermore, significant biological themes indicative of genomic events can be recognized on the basis of affected genes. However, given identified biological themes, it is not clear how to organize genomic events to arrive at a coherent pathophysiological explanation about the disease. To address this important issue, we have developed an innovative method named "Expression Data Up-Stream Analysis" (EDUSA) that can perform a bioinformatics analysis to identify and rank upstream processes effectively. We applied it to Parkinson's disease (PD) using a genomic data set available at a public data repository known as Gene Expression Omnibus (GEO). In this study, disease-affected genes were identified using GEO2R software, and disease-pertinent processes were identified using EASE software. Then the EDUSA program was used to determine the upstream versus downstream hierarchy of the processes. The results confirmed the current misfolded protein theory about the pathogenesis of PD, and provided new insights as well. Particularly, our program discovered that RNA (ribonucleic acid) metabolism pathology was a potential cause of PD, which in fact, is an emerging theory of neurodegenerative disorders. In addition, it was found that the dysfunction of the transport system seemed to occur in the early phase of neurodegeneration, whereas mitochondrial dysfunction appeared at a later stage. 
Using this methodology, we have demonstrated how to determine the stages of disease development with single-point data collection.
Affiliation(s)
- Li M Fu
- Biomedical Engineering Department, AHMC Healthcare, Los Angeles, CA, USA.
| | - Katherine A Fu
- Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
| |
|
26
|
Henderson J, Michailidis G. Network reconstruction using nonparametric additive ODE models. PLoS One 2014; 9:e94003. [PMID: 24732037 PMCID: PMC3986056 DOI: 10.1371/journal.pone.0094003] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2013] [Accepted: 03/13/2014] [Indexed: 01/05/2023] Open
Abstract
Network representations of biological systems are widespread and reconstructing unknown networks from data is a focal problem for computational biologists. For example, the series of biochemical reactions in a metabolic pathway can be represented as a network, with nodes corresponding to metabolites and edges linking reactants to products. In a different context, regulatory relationships among genes are commonly represented as directed networks with edges pointing from influential genes to their targets. Reconstructing such networks from data is a challenging problem receiving much attention in the literature. There is a particular need for approaches tailored to time-series data and not reliant on direct intervention experiments, as the former are often more readily available. In this paper, we introduce an approach to reconstructing directed networks based on dynamic systems models. Our approach generalizes commonly used ODE models based on linear or nonlinear dynamics by extending the functional class for the functions involved from parametric to nonparametric models. Concomitantly we limit the complexity by imposing an additive structure on the estimated slope functions. Thus the submodel associated with each node is a sum of univariate functions. These univariate component functions form the basis for a novel coupling metric that we define in order to quantify the strength of proposed relationships and hence rank potential edges. We show the utility of the method by reconstructing networks using simulated data from computational models for the glycolytic pathway of Lactococcus lactis and a gene network regulating the pluripotency of mouse embryonic stem cells. For purposes of comparison, we also assess reconstruction performance using gene networks from the DREAM challenges. 
We compare our method to those that similarly rely on dynamic systems models and use the results to attempt to disentangle the distinct roles of linearity, sparsity, and derivative estimation.
Affiliation(s)
- James Henderson
- Department of Statistics, University of Michigan, Ann Arbor, Michigan, United States of America
- George Michailidis
- Department of Statistics, University of Michigan, Ann Arbor, Michigan, United States of America
|
27
|
CaSPIAN: a causal compressive sensing algorithm for discovering directed interactions in gene networks. PLoS One 2014; 9:e90781. [PMID: 24622336 PMCID: PMC3951243 DOI: 10.1371/journal.pone.0090781] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 12/04/2013] [Accepted: 02/05/2014] [Indexed: 11/21/2022] Open
Abstract
We introduce a novel algorithm for inference of causal gene interactions, termed CaSPIAN (Causal Subspace Pursuit for Inference and Analysis of Networks), which is based on coupling compressive sensing and Granger causality techniques. The core of the approach is to discover sparse linear dependencies between shifted time series of gene expressions using a sequential list-version of the subspace pursuit reconstruction algorithm and to estimate the direction of gene interactions via Granger-type elimination. The method is conceptually simple and computationally efficient, and it allows for dealing with noisy measurements. Its performance as a stand-alone platform without biological side-information was tested on simulated networks, on the synthetic IRMA network in Saccharomyces cerevisiae, and on data pertaining to the human HeLa cell network and the SOS network in E. coli. The results produced by CaSPIAN are compared to the results of several related algorithms, demonstrating significant improvements in inference accuracy of documented interactions. These findings highlight the importance of Granger causality techniques for reducing the number of false-positives, as well as the influence of noise and sampling period on the accuracy of the estimates. In addition, the performance of the method was tested in conjunction with biological side information of the form of sparse “scaffold networks”, to which new edges were added using available RNA-seq or microarray data. These biological priors aid in increasing the sensitivity and precision of the algorithm in the small sample regime.
|
28
|
Haye A, Albert J, Rooman M. Modeling the Drosophila gene cluster regulation network for muscle development. PLoS One 2014; 9:e90285. [PMID: 24594656 PMCID: PMC3940846 DOI: 10.1371/journal.pone.0090285] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/16/2013] [Accepted: 01/29/2014] [Indexed: 11/19/2022] Open
Abstract
The development of accurate and reliable dynamical modeling procedures that describe the time evolution of gene expression levels is a prerequisite to understanding and controlling the transcription process. We focused on data from DNA microarray time series for 20 Drosophila genes involved in muscle development during the embryonic stage. Genes with similar expression profiles were clustered on the basis of a translation-invariant and scale-invariant distance measure. The time evolution of these clusters was modeled using coupled differential equations. Three model structures involving a transcription term and a degradation term were tested. The parameters were identified in successive steps: network construction, parameter optimization, and parameter reduction. The solutions were evaluated on the basis of the data reproduction and the number of parameters, as well as on two biology-based requirements: the robustness with respect to parameter variations and the values of the expression levels not being unrealistically large upon extrapolation in time. Various solutions were obtained that satisfied all our evaluation criteria. The regulatory networks inferred from these solutions were compared with experimental data. The best solution has half of the experimental connections, which compares favorably with previous approaches. Biasing the network toward the experimental connections led to the identification of a model that is only slightly less good on the basis of the evaluation criteria. The non-uniqueness of the solutions and the variable agreement with experimental connections were discussed in the context of the different hypotheses underlying this type of approach.
Affiliation(s)
- Alexandre Haye
- BioModeling, BioInformatics & BioProcesses Department, Université Libre de Bruxelles, Bruxelles, Belgium
- Jaroslav Albert
- BioModeling, BioInformatics & BioProcesses Department, Université Libre de Bruxelles, Bruxelles, Belgium
- Marianne Rooman
- BioModeling, BioInformatics & BioProcesses Department, Université Libre de Bruxelles, Bruxelles, Belgium
|
29
|
Windhager L, Zierer J, Küffner R. Refining ensembles of predicted gene regulatory networks based on characteristic interaction sets. PLoS One 2014; 9:e84596. [PMID: 24498260 PMCID: PMC3911903 DOI: 10.1371/journal.pone.0084596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2013] [Accepted: 11/14/2013] [Indexed: 11/30/2022] Open
Abstract
Different ensemble voting approaches have been successfully applied for reverse-engineering of gene regulatory networks. They are based on the assumption that a good approximation of true network structure can be derived by considering the frequencies of individual interactions in a large number of predicted networks. Such approximations are typically superior in terms of prediction quality and robustness as compared to considering a single best scoring network only. Nevertheless, ensemble approaches only work well if the predicted gene regulatory networks are sufficiently similar to each other. If the topologies of predicted networks are considerably different, an ensemble of all networks obscures interesting individual characteristics. Instead, networks should be grouped according to local topological similarities and ensemble voting performed for each group separately. We argue that the presence of sets of co-occurring interactions is a suitable indicator for grouping predicted networks. A stepwise bottom-up procedure is proposed, where first mutual dependencies between pairs of interactions are derived from predicted networks. Pairs of co-occurring interactions are subsequently extended to derive characteristic interaction sets that distinguish groups of networks. Finally, ensemble voting is applied separately to the resulting topologically similar groups of networks to create distinct group-ensembles. Ensembles of topologically similar networks constitute distinct hypotheses about the reference network structure. Such group-ensembles are easier to interpret as their characteristic topology becomes clear and dependencies between interactions are known. The availability of distinct hypotheses facilitates the design of further experiments to distinguish between plausible network structures. The proposed procedure is a reasonable refinement step for non-deterministic reverse-engineering applications that produce a large number of candidate predictions for a gene regulatory network, e.g. due to probabilistic optimization or a cross-validation procedure.
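As a rough illustration of the plain ensemble voting that this abstract takes as its starting point, the sketch below counts how often each directed interaction appears across a set of predicted networks and keeps those at or above a frequency cutoff. The function name `ensemble_vote`, the edge-tuple representation, and the `min_freq` parameter are assumptions made for this example and are not taken from the paper; the grouping of networks by characteristic interaction sets is not implemented here.

```python
from collections import Counter


def ensemble_vote(networks, min_freq=0.5):
    """Return edges whose frequency across `networks` is at least `min_freq`.

    `networks` is a list of predicted networks, each given as a set of
    directed edges (regulator, target). Hypothetical helper for illustration.
    """
    if not networks:
        return {}
    # Count each edge at most once per predicted network.
    counts = Counter(edge for net in networks for edge in set(net))
    n = len(networks)
    return {edge: c / n for edge, c in counts.items() if c / n >= min_freq}


# Four hypothetical predicted networks over genes A, B, C.
predicted = [
    {("A", "B"), ("B", "C")},
    {("A", "B"), ("C", "A")},
    {("A", "B"), ("B", "C")},
    {("C", "A")},
]
consensus = ensemble_vote(predicted, min_freq=0.5)
```

In a group-wise variant, the same function could be applied separately to each group of topologically similar networks rather than to the full list.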
Affiliation(s)
- Lukas Windhager
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- Jonas Zierer
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
- Robert Küffner
- Institute for Informatics, Ludwig-Maximilians-Universität München, Munich, Germany
|
30
|
Han C, Yang P, Sakata K, Komatsu S. Quantitative proteomics reveals the role of protein phosphorylation in rice embryos during early stages of germination. J Proteome Res 2014; 13:1766-82. [PMID: 24460219 DOI: 10.1021/pr401295c] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 02/08/2023]
Abstract
Seed germination begins with water uptake and ends with radicle emergence. A gel-free phosphoproteomic technique was used to investigate the role of protein phosphorylation events in the early stages of rice seed germination. Both seed weight and ATP content increased gradually during the first 24 h following imbibition. Proteomic analysis indicated that carbohydrate metabolism- and protein synthesis/degradation-related proteins were predominantly increased and displayed temporal patterns of expression. Analyses of cluster and protein-protein interactions indicated that the regulation of sucrose synthases and alpha-amylases was the central event controlling germination. Phosphoproteomic analysis identified several proteins involved in protein modification and transcriptional regulation that exhibited significantly temporal changes in phosphorylation levels during germination. Cluster analysis indicated that 12 protein modification-related proteins had a peak abundance of phosphoproteins at 12 h after imbibition. These results suggest that the first 12 h following imbibition is a potentially important signal transduction phase for the initiation of rice seed germination. Three core components involved in brassinosteroid signal transduction displayed significant increases in phosphoprotein abundance during the early stages of germination. Brassinolide treatment increased the rice seed germination rate but not the rate of embryonic axis elongation. These findings suggest that brassinosteroid signal transduction likely triggers seed germination.
Affiliation(s)
- Chao Han
- Key Laboratory of Plant Germplasm Enhancement and Speciality Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Moshan, Wuchang, Wuhan 430074, China
|
31
|
Michailidis G, d'Alché-Buc F. Autoregressive models for gene regulatory network inference: sparsity, stability and causality issues. Math Biosci 2013; 246:326-34. [PMID: 24176667 DOI: 10.1016/j.mbs.2013.10.003] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 01/25/2013] [Revised: 10/09/2013] [Accepted: 10/14/2013] [Indexed: 10/26/2022]
Abstract
Reconstructing gene regulatory networks from high-throughput measurements represents a key problem in functional genomics. It also represents a canonical learning problem and thus has attracted a lot of attention in both the informatics and the statistical learning literature. Numerous approaches have been proposed, ranging from simple clustering to rather involved dynamic Bayesian network modeling, as well as hybrid ones that combine a number of modeling steps, such as employing ordinary differential equations coupled with genome annotation. These approaches are tailored to the type of data being employed. Available data sources include static steady state data and time course data obtained either for wild type phenotypes or from perturbation experiments. This review focuses on the class of autoregressive models using time course data for inferring gene regulatory networks. The central themes of sparsity, stability and causality are discussed as well as the ability to integrate prior knowledge for successful use of these models for the learning task at hand.
Affiliation(s)
- George Michailidis
- Department of Statistics, University of Michigan, Ann Arbor, MI 48109-1107, USA
|
32
|
Brouard C, Vrain C, Dubois J, Castel D, Debily MA, d'Alché-Buc F. Learning a Markov Logic network for supervised gene regulatory network inference. BMC Bioinformatics 2013; 14:273. [PMID: 24028533 PMCID: PMC3849013 DOI: 10.1186/1471-2105-14-273] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 08/16/2012] [Accepted: 09/03/2013] [Indexed: 11/23/2022] Open
Abstract
Background: Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate. Such a method builds a binary classifier able to assign a class (Regulation/No regulation) to an ordered pair of genes. Once learnt, the pairwise classifier can be used to predict new regulations. In this work, we explore the framework of Markov Logic Networks (MLN) that combine features of probabilistic graphical models with the expressivity of first-order logic rules. Results: We propose to learn a Markov Logic network, i.e. a set of weighted rules that conclude on the predicate “regulates”, starting from a known gene regulatory network involved in the switch proliferation/differentiation of keratinocyte cells, a set of experimental transcriptomic data and various descriptions of genes all encoded into first-order logic. As training data are unbalanced, we use asymmetric bagging to learn a set of MLNs. The prediction of a new regulation can then be obtained by averaging predictions of individual MLNs. As a side contribution, we propose three in silico tests to assess the performance of any pairwise classifier in various network inference tasks on real datasets. A first test consists of measuring the average performance on a balanced edge prediction problem; a second one deals with the ability of the classifier, once enhanced by asymmetric bagging, to update a given network. Finally our main result concerns a third test that measures the ability of the method to predict regulations with a new set of genes. As expected, MLN, when provided with only numerical discretized gene expression data, does not perform as well as a pairwise SVM in terms of AUPR. However, when a more complete description of gene properties is provided by heterogeneous sources, MLN achieves the same performance as a black-box model such as a pairwise SVM while providing relevant insights on the predictions. Conclusions: The numerical studies show that MLN achieves very good predictive performance while opening the door to some interpretability of the decisions. Besides the ability to suggest new regulations, such an approach allows experimental data to be cross-validated against existing knowledge.
Affiliation(s)
- Céline Brouard
- IBISC EA 4526, Université d'Évry-Val d'Essonne, 23 Boulevard de France, 91037, Évry, France.
|
33
|
Clarke J, Penas C, Pastori C, Komotar RJ, Bregy A, Shah AH, Wahlestedt C, Ayad NG. Epigenetic pathways and glioblastoma treatment. Epigenetics 2013; 8:785-95. [PMID: 23807265 PMCID: PMC3883781 DOI: 10.4161/epi.25440] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Indexed: 01/15/2023] Open
Abstract
Glioblastoma multiforme (GBM) is the most common malignant adult brain tumor. Standard GBM treatment includes maximal safe surgical resection with combination radiotherapy and adjuvant temozolomide (TMZ) chemotherapy. Alarmingly, patient survival at five years is below 10%. This is in part due to the invasive behavior of the tumor and the resulting inability to resect greater than 98% of some tumors. In fact, recurrence after such treatment may be inevitable, even in cases where gross total resection is achieved. The Cancer Genome Atlas (TCGA) research network performed whole genome sequencing of GBM tumors and found that GBM recurrence is linked to epigenetic mechanisms and pathways. Central to these pathways are epigenetic enzymes, which have recently emerged as possible new drug targets for multiple cancers, including GBM. Here we review GBM treatment and provide a systems approach to identifying epigenetic drivers of GBM tumor progression based on temporal modeling of putative GBM cells of origin. We also discuss advances in defining epigenetic mechanisms controlling GBM initiation and recurrence and the drug discovery considerations associated with targeting epigenetic enzymes for GBM treatment.
Affiliation(s)
- Jennifer Clarke
- Division of Biostatistics; Department of Epidemiology and Public Health; University of Miami Miller School of Medicine; Miami, FL USA
|
34
|
Cai X, Bazerque JA, Giannakis GB. Inference of gene regulatory networks with sparse structural equation models exploiting genetic perturbations. PLoS Comput Biol 2013; 9:e1003068. [PMID: 23717196 PMCID: PMC3662697 DOI: 10.1371/journal.pcbi.1003068] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 04/17/2012] [Accepted: 03/28/2013] [Indexed: 12/22/2022] Open
Abstract
Integrating genetic perturbations with gene expression data not only improves accuracy of regulatory network topology inference, but also enables learning of causal regulatory relations between genes. Although a number of methods have been developed to integrate both types of data, the need for efficient and powerful algorithms remains. In this paper, sparse structural equation models (SEMs) are employed to integrate both gene expression data and cis-expression quantitative trait loci (cis-eQTL), for modeling gene regulatory networks in accordance with biological evidence about genes regulating or being regulated by a small number of genes. A systematic inference method named sparsity-aware maximum likelihood (SML) is developed for SEM estimation. Using simulated directed acyclic or cyclic networks, the SML performance is compared with that of two state-of-the-art algorithms: the adaptive Lasso (AL) based scheme, and the QTL-directed dependency graph (QDG) method. Computer simulations demonstrate that the novel SML algorithm offers significantly better performance than the AL-based and QDG algorithms across all sample sizes from 100 to 1,000, in terms of detection power and false discovery rate, in all the cases tested that include acyclic or cyclic networks of 10, 30 and 300 genes. The SML method is further applied to infer a network of 39 human genes that are related to the immune function and are chosen to have a reliable eQTL per gene. The resulting network consists of 9 genes and 13 edges. Most of the edges represent interactions reasonably expected from experimental evidence, while the remainder may indicate new interactions. The sparse SEM and efficient SML algorithm provide an effective means of exploiting both gene expression and perturbation data to infer gene regulatory networks. An open-source computer program implementing the SML algorithm is freely available upon request.
Affiliation(s)
- Xiaodong Cai
- Department of Electrical and Computer Engineering, University of Miami, Coral Gables, FL, USA.
|
35
|
Gene regulation, modulation, and their applications in gene expression data analysis. Adv Bioinformatics 2013; 2013:360678. [PMID: 23573084 PMCID: PMC3610383 DOI: 10.1155/2013/360678] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 12/02/2012] [Accepted: 01/24/2013] [Indexed: 12/21/2022] Open
Abstract
Common microarray and next-generation sequencing data analyses concentrate on tumor subtype classification, marker detection, and transcriptional regulation discovery during biological processes by exploring correlated gene expression patterns and their shared functions. Genetic regulatory network (GRN) based approaches have been employed in many large studies to search for dysregulation and potential treatment controls. In addition to gene regulation and network construction, the concept of the network modulator that has significant systemic impact has been proposed, and detection algorithms have been developed in past years. Here we provide a unified mathematical description of these methods, followed by a brief survey of these modulator identification algorithms. As an early attempt to extend the concept to a new RNA regulation mechanism, competitive endogenous RNA (ceRNA), within a modulator framework, we provide two applications to illustrate the network construction, the modulation effect, and the preliminary findings from these networks. The methods we surveyed and developed are used to dissect the regulated network under different modulators. Beyond these applications, the concept of “modulation” can be adapted to various biological mechanisms to discover novel gene regulation mechanisms.
|
36
|
Abegaz F, Wit E. Sparse time series chain graphical models for reconstructing genetic networks. Biostatistics 2013; 14:586-99. [PMID: 23462022 DOI: 10.1093/biostatistics/kxt005] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.5] [Indexed: 11/13/2022] Open
Abstract
We propose a sparse high-dimensional time series chain graphical model for reconstructing genetic networks from gene expression data parametrized by a precision matrix and autoregressive coefficient matrix. We consider the time steps as blocks or chains. The proposed approach explores patterns of contemporaneous and dynamic interactions by efficiently combining Gaussian graphical models and Bayesian dynamic networks. We use penalized likelihood inference with a smoothly clipped absolute deviation penalty to explore the relationships among the observed time course gene expressions. The method is illustrated on simulated data and on real data examples from Arabidopsis thaliana and mammary gland time course microarray gene expressions.
Affiliation(s)
- Fentaw Abegaz
- Johann Bernoulli Institute of Mathematics and Computer Science, University of Groningen, Nijenborgh 9, The Netherlands.
|
37
|
Abstract
Biochemical systems theory (BST) is the foundation for a set of analytical and modeling tools that facilitate the analysis of dynamic biological systems. This paper depicts major developments in BST up to the current state of the art in 2012. It discusses its rationale, describes the typical strategies and methods of designing, diagnosing, analyzing, and utilizing BST models, and reviews areas of application. The paper is intended as a guide for investigators entering the fascinating field of biological systems analysis and as a resource for practitioners and experts.
|
38
|
Higa CHA, Andrade TP, Hashimoto RF. Growing seed genes from time series data and thresholded Boolean networks with perturbation. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2013; 10:37-49. [PMID: 23702542 DOI: 10.1109/tcbb.2012.169] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/02/2023]
Abstract
Models of gene regulatory networks (GRN) have been proposed along with algorithms for inferring their structure. By structure, we mean the relationships among the genes of the biological system under study. Despite the large number of genes found in the genome of an organism, it is believed that a small set of genes is responsible for maintaining a specific core regulatory mechanism (small subnetworks). We propose an algorithm for inference of subnetworks of genes from a small initial set of genes, called the seed, and time series gene expression data. The algorithm has two main steps: first, it grows the seed by adding genes to it, and second, it searches for subnetworks that can be biologically meaningful. The seed growing step is treated as a feature selection problem and we used a thresholded Boolean network with a perturbation model to design the criterion function that is used to select the features (genes). Given that the reverse engineering of GRN is a problem that does not necessarily have one unique solution, the proposed algorithm has as output a set of networks instead of one single network. The algorithm also analyzes the dynamics of the networks, which can be time-consuming. Nevertheless, the algorithm is suitable when the number of genes is small. The results showed that the algorithm is capable of recovering an acceptable rate of gene interactions and of generating regulatory hypotheses that can be explored in the wet lab.
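For readers unfamiliar with the model class, a thresholded Boolean network updates each gene from a weighted sum of its regulators. The sketch below uses one commonly used form of the rule (switch on if the input sum is positive, off if it is negative, keep the current state if it is zero) and models perturbation as an independent random flip of each gene with probability `p`. The exact rule, the weight matrix `W`, and all names are assumptions for illustration and are not taken from the paper.

```python
import random


def step(state, weights, p=0.0, rng=None):
    """Advance a thresholded Boolean network by one time step.

    state   -- list of 0/1 gene states at time t
    weights -- weights[i][j] is the influence of gene j on gene i
               (positive = activation, negative = inhibition)
    p       -- per-gene probability of a random flip (perturbation)
    """
    rng = rng or random.Random()
    nxt = []
    for i, row in enumerate(weights):
        total = sum(w * s for w, s in zip(row, state))
        if total > 0:
            value = 1
        elif total < 0:
            value = 0
        else:
            value = state[i]  # no net input: the gene keeps its state
        if rng.random() < p:
            value = 1 - value  # perturbation flips the gene
        nxt.append(value)
    return nxt


# Hypothetical 3-gene network: gene 0 activates gene 1, gene 1 activates
# gene 2, and gene 2 inhibits gene 0.
W = [
    [0, 0, -1],
    [1, 0, 0],
    [0, 1, 0],
]
trajectory = [[1, 0, 0]]
for _ in range(4):
    trajectory.append(step(trajectory[-1], W))
```

With `p=0.0` the update is deterministic, so a trajectory like this one can be compared directly against a binarized expression time series when scoring candidate regulators.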
Affiliation(s)
- Carlos H A Higa
- College of Computing, Federal University of Mato Grosso do Sul, Campo Grande MS, Brazil.
|
39
|
Comparison of co-expression measures: mutual information, correlation, and model based indices. BMC Bioinformatics 2012; 13:328. [PMID: 23217028 PMCID: PMC3586947 DOI: 10.1186/1471-2105-13-328] [Citation(s) in RCA: 256] [Impact Index Per Article: 21.3] [Received: 03/14/2012] [Accepted: 11/30/2012] [Indexed: 11/27/2022] Open
Abstract
Background: Co-expression measures are often used to define networks among genes. Mutual information (MI) is often used as a generalized correlation measure. It is not clear how much MI adds beyond standard (robust) correlation measures or regression model based association measures. Further, it is important to assess what transformations of these and other co-expression measures lead to biologically meaningful modules (clusters of genes). Results: We provide a comprehensive comparison between mutual information and several correlation measures in 8 empirical data sets and in simulations. We also study different approaches for transforming an adjacency matrix, e.g. using the topological overlap measure. Overall, we confirm close relationships between MI and correlation in all data sets which reflects the fact that most gene pairs satisfy linear or monotonic relationships. We discuss rare situations when the two measures disagree. We also compare correlation and MI based approaches when it comes to defining co-expression network modules. We show that a robust measure of correlation (the biweight midcorrelation transformed via the topological overlap transformation) leads to modules that are superior to MI based modules and maximal information coefficient (MIC) based modules in terms of gene ontology enrichment. We present a function that relates correlation to mutual information which can be used to approximate the mutual information from the corresponding correlation coefficient. We propose the use of polynomial or spline regression models as an alternative to MI for capturing non-linear relationships between quantitative variables. Conclusion: The biweight midcorrelation outperforms MI in terms of elucidating gene pairwise relationships. Coupled with the topological overlap matrix transformation, it often leads to more significantly enriched co-expression modules. Spline and polynomial networks form attractive alternatives to MI in case of non-linear relationships. Our results indicate that MI networks can safely be replaced by correlation networks when it comes to measuring co-expression relationships in stationary data.
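The biweight midcorrelation named in this abstract is a median-based analogue of the Pearson correlation that down-weights outlying samples. The sketch below follows the usual textbook definition; the use of the raw (unscaled) median absolute deviation, the tuning constant 9, and the function names are assumptions for illustration and are not taken from the paper's software.

```python
import math
from statistics import median


def _bicor_terms(values):
    """Return the normalized, weighted deviations for one variable."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # raw (unscaled) MAD, an assumption
    if mad == 0:
        raise ValueError("MAD is zero; biweight midcorrelation is undefined here")
    terms = []
    for v in values:
        u = (v - med) / (9.0 * mad)
        # Samples far from the median (|u| >= 1) get zero weight.
        w = (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0
        terms.append((v - med) * w)
    norm = math.sqrt(sum(t * t for t in terms))
    return [t / norm for t in terms]


def bicor(x, y):
    """Biweight midcorrelation of two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(a * b for a, b in zip(_bicor_terms(x), _bicor_terms(y)))


# Hypothetical expression profile with one outlying sample.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 100.0]
y = [2.0 * v + 1.0 for v in x]  # exact linear function of x
r = bicor(x, y)
```

Like the Pearson correlation, the value lies in [-1, 1] and is unchanged by shifting or positively rescaling either variable.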
|
40
|
|
41
|
Zhu H, Rao RSP, Zeng T, Chen L. Reconstructing dynamic gene regulatory networks from sample-based transcriptional data. Nucleic Acids Res 2012; 40:10657-67. [PMID: 23002138 PMCID: PMC3510506 DOI: 10.1093/nar/gks860] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022] Open
Abstract
The current method for reconstructing gene regulatory networks faces a dilemma concerning the study of bio-medical problems. On the one hand, static approaches assume that genes are expressed in a steady state and thus cannot exploit and describe the dynamic patterns of an evolving process. On the other hand, approaches that can describe the dynamical behaviours require time-course data, which are normally not available in many bio-medical studies. To overcome the limitations of both the static and dynamic approaches, we propose a dynamic cascaded method (DCM) to reconstruct dynamic gene networks from sample-based transcriptional data. Our method is based on the intra-stage steady-rate assumption and the continuity assumption, which can properly characterize the dynamic and continuous nature of gene transcription in a biological process. Our simulation study showed that compared with static approaches, the DCM not only can reconstruct dynamical network but also can significantly improve network inference performance. We further applied our method to reconstruct the dynamic gene networks of hepatocellular carcinoma (HCC) progression. The derived HCC networks were verified by functional analysis and network enrichment analysis. Furthermore, it was shown that the modularity and network rewiring in the HCC networks can clearly characterize the dynamic patterns of HCC progression.
Affiliation(s)
- Hailong Zhu
- Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China.
|
42
|
Haye A, Albert J, Rooman M. Robust non-linear differential equation models of gene expression evolution across Drosophila development. BMC Res Notes 2012; 5:46. [PMID: 22260205 PMCID: PMC3398324 DOI: 10.1186/1756-0500-5-46] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 11/02/2011] [Accepted: 01/19/2012] [Indexed: 01/20/2023] Open
Abstract
Background: This paper lies in the context of modeling the evolution of gene expression away from stationary states, for example in systems subject to external perturbations or during the development of an organism. We base our analysis on experimental data and proceed in a top-down approach, where we start from data on a system's transcriptome, and deduce rules and models from it without a priori knowledge. We focus here on a publicly available DNA microarray time series, representing the transcriptome of Drosophila across evolution from the embryonic to the adult stage.
Results: In the first step, genes were clustered on the basis of similarity of their expression profiles, measured by a translation-invariant and scale-invariant distance that proved appropriate for detecting transitions between development stages. Average profiles representing each cluster were computed and their time evolution was analyzed using coupled differential equations. A linear and several non-linear model structures involving a transcription and a degradation term were tested. The parameters were identified in three steps: determination of the strongest connections between genes, optimization of the parameters defining these connections, and elimination of the unnecessary parameters using various reduction schemes. Different solutions were compared on the basis of their abilities to reproduce the data, to keep realistic gene expression levels when extrapolated in time, to show the biologically expected robustness with respect to parameter variations, and to contain as few parameters as possible.
Conclusions: We showed that the linear model did very well in reproducing the data with few parameters, but was not sufficiently robust and yielded unrealistic values upon extrapolation in time. In contrast, the non-linear models all reached the latter two objectives, but some were unable to reproduce the data. A family of non-linear models, constructed from the exponential of linear combinations of expression levels, reached all the objectives. It defined networks with a mean number of connections equal to two, when restricted to the embryonic time series, and equal to five for the full time series. These networks were compared with experimental data about gene-transcription factor and protein-protein interactions. The non-uniqueness of the solutions was discussed in the context of plasticity and cluster versus single-gene networks.
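The abstract does not spell out the translation-invariant and scale-invariant distance used for clustering. As a rough illustration only, one common way to obtain both invariances is to centre each expression profile and rescale it to unit norm before taking the Euclidean distance. The function names `standardize` and `invariant_distance` below are hypothetical and are not taken from the paper; the sketch covers positive rescaling only.

```python
import math


def standardize(profile):
    """Centre a profile on zero mean and rescale it to unit Euclidean norm."""
    n = len(profile)
    mean = sum(profile) / n
    centered = [v - mean for v in profile]
    norm = math.sqrt(sum(c * c for c in centered))
    if norm == 0.0:
        # A flat profile has no shape to compare, so the distance is undefined.
        raise ValueError("constant profile cannot be standardized")
    return [c / norm for c in centered]


def invariant_distance(p, q):
    """Euclidean distance between standardized profiles.

    Assumed construction (not the paper's stated formula): the result is
    unchanged when a constant is added to a profile (translation) or when
    a profile is multiplied by a positive factor (scale).
    """
    a, b = standardize(p), standardize(q)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

Under this construction, two profiles with the same shape but different baselines or amplitudes are at distance zero, while a profile and its mirror image are at the maximum distance of 2.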
Affiliation(s)
- Alexandre Haye
- BioSystems, BioModeling & BioProcesses Department, Université Libre de Bruxelles, CP 165/61, Avenue Roosevelt 50, 1050 Bruxelles, Belgium
|
43
|
Yaghoobi H, Haghipour S, Hamzeiy H, Asadi-Khiavi M. A review of modeling techniques for genetic regulatory networks. Journal of Medical Signals & Sensors 2012; 2:61-70. [PMID: 23493097 PMCID: PMC3592506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2011] [Accepted: 01/15/2012] [Indexed: 12/04/2022]
Abstract
Understanding genetic regulatory networks, discovering interactions between genes, and understanding regulatory processes in a cell at the gene level are major goals of systems biology and computational biology. Modeling gene regulatory networks and describing the actions of cells at the molecular level are used in medicine and molecular biology applications such as metabolic pathways and drug discovery. Modeling these networks is also one of the important issues in genomic signal processing. Since the advent of microarray technology, it has been possible to model these networks using time-series data. In this paper, we provide an extensive review of methods that have been used on time-series data and present the features, advantages and disadvantages of each. We also classify these methods according to their nature. A parallel study of these methods can lead to the discovery of new synthetic methods or the improvement of previous ones.
Affiliation(s)
- Hanif Yaghoobi
- Department of Biomedical Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Siyamak Haghipour
- Department of Biomedical Engineering, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Hossein Hamzeiy
- Department of Pharmacology and Toxicology, School of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran
- Masoud Asadi-Khiavi
- School of Pharmacy, Zanjan University of Medical Sciences, Zanjan, Iran. Address for correspondence: Prof. Masoud Asadi-Khiavi, School of Pharmacy, Zanjan University of Medical Science, Zanjan, Iran. E-mail:
|
44
|
Rajapakse JC, Mundra PA. Stability of building gene regulatory networks with sparse autoregressive models. BMC Bioinformatics 2011; 12 Suppl 13:S17. [PMID: 22373004 PMCID: PMC3278833 DOI: 10.1186/1471-2105-12-s13-s17] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 11/17/2022]
Abstract
Background: Biological networks are constantly subjected to random perturbations, and efficient feedback and compensatory mechanisms exist to maintain their stability. There is increased interest in building gene regulatory networks (GRNs) from temporal gene expression data because of their numerous applications in the life sciences. However, because of the limited number of time points at which gene expression can be measured in practice, computational techniques for building GRNs often lead to inaccuracies and instabilities. This paper investigates the stability of sparse autoregressive models for building GRNs from gene expression data.
Results: Criteria for evaluating the stability of estimating GRN structure are proposed. The stability of multivariate vector autoregressive (MVAR) methods (ridge, lasso, and elastic-net) for building GRNs was studied by simulating temporal gene expression datasets on scale-free topologies as well as on real data gathered over the HeLa cell cycle. The effects of the number of time points on the stability of constructing GRNs are investigated. When the number of time points is relatively low compared to the size of the network, both accuracy and stability are adversely affected. At least as many time points as there are genes in the network are needed to achieve decent accuracy and stability of the networks. Our results on synthetic data indicate that the stability of the lasso and elastic-net MVAR methods is comparable, and their accuracies are much higher than that of ridge MVAR. As the size of the network grows, the number of time points required to achieve acceptable accuracy and stability is much smaller relative to the number of genes in the network. The effects of false negatives are easier to reduce by increasing the number of time points than those due to false positives. Application to a HeLa cell-cycle gene expression dataset shows that biologically stable GRNs can be obtained by introducing perturbations to the data.
Conclusions: Accuracy and stability of building GRNs are crucial for the investigation of gene regulation. Sparse MVAR techniques such as lasso and elastic-net provide accurate and stable methods for building GRNs, even of small size. The effect of false negatives is corrected much more easily with an increased number of time points than that of false positives. With real data, we demonstrate how stable networks can be derived by introducing random perturbations to the data.
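To make the idea of a sparse autoregressive GRN model concrete, here is a minimal sketch of a first-order lasso-penalised vector autoregression, fitted one target gene at a time by cyclic coordinate descent. It is an illustration under stated assumptions, not the authors' implementation: the model order (one lag), the absence of an intercept (data assumed centred), the solver, and the names `soft_threshold`, `lasso_cd` and `fit_sparse_var1` are all choices made for this example.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values inside [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent.

    X is a list of n rows with p features each; y is a list of n targets.
    No intercept is fitted, so the data are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero predictor carries no information
            # Correlation of feature j with the residual that excludes feature j.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            w[j] = soft_threshold(rho, lam * n) / col_sq[j]
    return w


def fit_sparse_var1(series, lam):
    """Fit x(t+1) ~ A x(t), where series[t][g] is the expression of gene g at time t.

    Returns A as a list of rows; a non-zero A[g][j] marks gene j as a
    candidate regulator of gene g.
    """
    predictors = series[:-1]  # states at times 0 .. T-2
    n_genes = len(series[0])
    return [
        lasso_cd(predictors, [row[g] for row in series[1:]], lam)
        for g in range(n_genes)
    ]
```

Larger values of `lam` drive more coefficients to exactly zero, which is what yields a sparse network. With few time points relative to the number of genes, the selected edges can change under small perturbations of the data, which is the instability the abstract refers to.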
Affiliation(s)
- Jagath C Rajapakse
- BioInformatics Research Centre, School of Computer Engineering, Nanyang Technological University, Singapore 639798.
|
45
|
Dimitrova ES, Mitra I, Jarrah AS. Probabilistic polynomial dynamical systems for reverse engineering of gene regulatory networks. EURASIP Journal on Bioinformatics & Systems Biology 2011; 2011:1. [PMID: 21910920 PMCID: PMC3171177 DOI: 10.1186/1687-4153-2011-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/09/2010] [Accepted: 06/06/2011] [Indexed: 02/08/2023]
Abstract
Elucidating the structure and/or dynamics of gene regulatory networks from experimental data is a major goal of systems biology. Stochastic models have the potential to absorb noise, account for uncertainty, and help avoid data overfitting. Within the framework of probabilistic polynomial dynamical systems, we present an algorithm for the reverse engineering of any gene regulatory network as a discrete, probabilistic polynomial dynamical system. The resulting stochastic model is assembled from all minimal models in the model space and the probability assignment is based on partitioning the model space according to the likeliness with which a minimal model explains the observed data. We used this method to identify stochastic models for two published synthetic network models. In both cases, the generated model retains the key features of the original model and compares favorably to the resulting models from other algorithms.
Affiliation(s)
- Elena S Dimitrova
- Department of Mathematical Sciences, Clemson University, Clemson, SC 29634-0975, USA
- Indranil Mitra
- Sealy Center of Molecular Medicine, University of Texas Medical Branch, Galveston, TX 77550, USA
- Abdul Salam Jarrah
- Virginia Bioinformatics Institute, Virginia Tech, Blacksburg, VA 24061-0477, USA
- Department of Mathematics and Statistics, American University of Sharjah, Sharjah, UAE
|
46
|
Constraint-based analysis of gene interactions using restricted boolean networks and time-series data. BMC Proc 2011; 5 Suppl 2:S5. [PMID: 21554763 PMCID: PMC3090763 DOI: 10.1186/1753-6561-5-s2-s5] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 01/14/2023]
Abstract
Background: A popular model for gene regulatory networks is the Boolean network model. In this paper, we propose an algorithm to analyze gene regulatory interactions using the Boolean network model and time-series data. The Boolean network is restricted in the sense that only a subset of all possible Boolean functions is considered. We explore some mathematical properties of restricted Boolean networks in order to avoid a full search. The problem is modeled as a Constraint Satisfaction Problem (CSP) and CSP techniques are used to solve it.
Results: We applied the proposed algorithm to two data sets. The first is an artificial data set obtained from a model of the budding yeast cell cycle. The second is derived from experiments performed using HeLa cells. The results show that some interactions can be fully or, at least, partially determined under the Boolean model considered.
Conclusions: The proposed algorithm can be used as a first step for the detection of gene/protein interactions. It is able to infer gene relationships from time-series data of gene expression, and this inference process can be aided by available a priori knowledge.
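The consistency constraints behind this kind of inference can be illustrated with a small sketch. The abstract does not state which subset of Boolean functions is allowed, so the code below assumes a threshold-style rule with regulator weights in {-1, 0, 1}: a gene switches on when its summed input is positive, off when it is negative, and keeps its value when the input is zero. It then enumerates every weight vector by brute force, which is exactly the full search the paper sets out to avoid with CSP techniques. The names `next_value` and `consistent_rules` are hypothetical.

```python
from itertools import product


def next_value(weights, state, current):
    """Assumed threshold update: on if the summed input is positive, off if it is
    negative, and unchanged when activating and inhibiting inputs cancel out."""
    total = sum(w * s for w, s in zip(weights, state))
    if total > 0:
        return 1
    if total < 0:
        return 0
    return current


def consistent_rules(series, target):
    """Return every weight vector in {-1, 0, 1}^n that reproduces the target
    gene's observed transitions.

    series[t] is the tuple of 0/1 gene states at time t. The search is
    exhaustive (3^n candidates), so it is only practical for tiny networks.
    """
    n_genes = len(series[0])
    solutions = []
    for weights in product((-1, 0, 1), repeat=n_genes):
        if all(
            next_value(weights, series[t], series[t][target]) == series[t + 1][target]
            for t in range(len(series) - 1)
        ):
            solutions.append(weights)
    return solutions
```

For the toy series `[(1, 0), (1, 1), (0, 1), (0, 0)]`, `consistent_rules(series, 1)` returns `[(1, -1)]`: the second gene must be activated by the first and must inhibit itself. When several weight vectors survive, an interaction is only partially determined, which matches the abstract's remark that some interactions can be fully or only partially determined.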
|
47
|
Jin Y, Meng Y. Morphogenetic Robotics: An Emerging New Field in Developmental Robotics. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) 2011. [DOI: 10.1109/tsmcc.2010.2057424] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 11/10/2022]
|
48
|
Dougherty ER. Validation of gene regulatory networks: scientific and inferential. Brief Bioinform 2010; 12:245-52. [PMID: 21183477 DOI: 10.1093/bib/bbq078] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.4] [Indexed: 11/15/2022]
Abstract
Gene regulatory network models are a major area of study in systems and computational biology, and the construction of network models is among the most important problems in these disciplines. The critical epistemological issue concerns validation. Validity can be approached from two different perspectives: (i) given a hypothesized network model, its scientific validity relates to the ability to make predictions from the model that can be checked against experimental observations; and (ii) the validity of a network inference procedure must be evaluated relative to its ability to infer a network from sample points generated by the network. This article examines both perspectives in the framework of a distance function between two networks. It considers some of the obstacles to validation and provides examples of both validation paradigms.
Affiliation(s)
- Edward R Dougherty
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, USA.
|
49
|
Küffner R, Petri T, Windhager L, Zimmer R. Petri Nets with Fuzzy Logic (PNFL): reverse engineering and parametrization. PLoS One 2010; 5. [PMID: 20862218 PMCID: PMC2942832 DOI: 10.1371/journal.pone.0012807] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 03/31/2010] [Accepted: 06/18/2010] [Indexed: 12/31/2022]
Abstract
Background: The recent DREAM4 blind assessment provided a particularly realistic and challenging setting for network reverse engineering methods. The in silico part of DREAM4 solicited the inference of cycle-rich gene regulatory networks from heterogeneous, noisy expression data including time courses as well as knockout, knockdown and multifactorial perturbations.
Methodology and Principal Findings: We inferred and parametrized simulation models based on Petri Nets with Fuzzy Logic (PNFL). This completely automated approach correctly reconstructed networks with cycles as well as oscillating network motifs. PNFL was evaluated as the best performer on DREAM4 in silico networks of size 10 with an area under the precision-recall curve (AUPR) of 81%. Besides topology, we inferred a range of additional mechanistic details with good reliability, e.g. distinguishing activation from inhibition as well as dependent from independent regulation. Our models also performed well on new experimental conditions such as double knockout mutations that were not included in the provided datasets.
Conclusions: The inference of biological networks substantially benefits from methods that are expressive enough to deal with diverse datasets in a unified way. At the same time, overly complex approaches could generate multiple different models that explain the data equally well. PNFL appears to strike the balance between expressive power and complexity. This also applies to the intuitive representation of PNFL models combining a straightforward graphical notation with colloquial fuzzy parameters.
Affiliation(s)
- Robert Küffner
- Institut für Informatik, Ludwig-Maximilians-Universität, München, Germany.
|
50
|
Knott S, Mostafavi S, Mousavi P. A neural network based modeling and validation approach for identifying gene regulatory networks. Neurocomputing 2010. [DOI: 10.1016/j.neucom.2010.04.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/23/2023]
|