1
Semiparametric Mixed-Effects Ordinary Differential Equation Models with Heavy-Tailed Distributions. Journal of Agricultural, Biological and Environmental Statistics 2021; 26:428-445. [PMID: 33840991 PMCID: PMC8020077 DOI: 10.1007/s13253-021-00446-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2019] [Revised: 02/12/2021] [Accepted: 02/24/2021] [Indexed: 11/01/2022]
Abstract
Ordinary differential equation (ODE) models are widely used to describe complex dynamical systems. When estimating ODE parameters from noisy data, a common distributional assumption is the Gaussian distribution. It is known that the Gaussian distribution is not robust when abnormal data exist. In this article, we develop a hierarchical semiparametric mixed-effects ODE model for longitudinal data under the Bayesian framework. For robust inference on ODE parameters, we consider a class of heavy-tailed distributions to model the random effects of ODE parameters and observation errors. An MCMC method is proposed to sample ODE parameters from the posterior distributions. Our proposed method is illustrated by studying a gene regulation experiment. Simulation studies show that our proposed method provides satisfactory results for the semiparametric mixed-effects ODE models with finite samples. Supplementary materials accompanying this paper appear online.
2
Ma CZ, Brent MR. Inferring TF activities and activity regulators from gene expression data with constraints from TF perturbation data. Bioinformatics 2021; 37:1234-1245. [PMID: 33135076 PMCID: PMC8189679 DOI: 10.1093/bioinformatics/btaa947] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/03/2020] [Revised: 09/26/2020] [Accepted: 10/27/2020] [Indexed: 12/20/2022] Open
Abstract
Motivation: The activity of a transcription factor (TF) in a sample of cells is the extent to which it is exerting its regulatory potential. Many methods of inferring TF activity from gene expression data have been described, but due to the lack of appropriate large-scale datasets, systematic and objective validation has not been possible until now. Results: We systematically evaluate and optimize the approach to TF activity inference in which a gene expression matrix is factored into a condition-independent matrix of control strengths and a condition-dependent matrix of TF activity levels. We find that expression data in which the activities of individual TFs have been perturbed are both necessary and sufficient for obtaining good performance. To a considerable extent, control strengths inferred using expression data from one growth condition carry over to other conditions, so the control strength matrices derived here can be used by others. Finally, we apply these methods to gain insight into the upstream factors that regulate the activities of yeast TFs Gcr2, Gln3, Gcn4 and Msn2. Availability and implementation: Evaluation code and data are available at https://doi.org/10.5281/zenodo.4050573. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cynthia Z Ma
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
- Michael R Brent
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO 63110, USA
  - Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
  - Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
3
Chang CY, Wang J, Zhao Y, Liu J, Yang X, Yue X, Wang H, Zhou F, Inclan-Rico JM, Ponessa JJ, Xie P, Zhang L, Siracusa MC, Feng Z, Hu W. Tumor suppressor p53 regulates intestinal type 2 immunity. Nat Commun 2021; 12:3371. [PMID: 34099671 PMCID: PMC8184793 DOI: 10.1038/s41467-021-23587-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 09/02/2020] [Accepted: 04/30/2021] [Indexed: 02/07/2023] Open
Abstract
The role of p53 in tumor suppression has been extensively studied and well-established. However, the role of p53 in parasitic infections and the intestinal type 2 immunity is unclear. Here, we report that p53 is crucial for intestinal type 2 immunity in response to the infection of parasites, such as Tritrichomonas muris and Nippostrongylus brasiliensis. Mechanistically, p53 plays a critical role in the activation of the tuft cell-IL-25-type 2 innate lymphoid cell circuit, partly via transcriptional regulation of Lrmp in tuft cells. Lrmp modulates Ca2+ influx and IL-25 release, which are critical triggers of type 2 innate lymphoid cell response. Our results thus reveal a previously unrecognized function of p53 in regulating intestinal type 2 immunity to protect against parasitic infections, highlighting the role of p53 as a guardian of immune integrity.
Affiliation(s)
- Chun-Yuan Chang
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Jianming Wang
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Yuhan Zhao
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Juan Liu
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Xue Yang
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Xuetian Yue
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Huaying Wang
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Fan Zhou
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Juan M Inclan-Rico
  - Department of Medicine, Rutgers New Jersey Medical School, Rutgers University, Newark, NJ, USA
- John J Ponessa
  - Department of Medicine, Rutgers New Jersey Medical School, Rutgers University, Newark, NJ, USA
- Ping Xie
  - Department of Cell Biology and Neuroscience, Rutgers University, Piscataway, NJ, USA
- Lanjing Zhang
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
  - Department of Pathology, Penn Medicine Princeton Medical Center, Plainsboro, NJ, USA
- Mark C Siracusa
  - Department of Medicine, Rutgers New Jersey Medical School, Rutgers University, Newark, NJ, USA
- Zhaohui Feng
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
- Wenwei Hu
  - Rutgers Cancer Institute of New Jersey, Rutgers University, New Brunswick, NJ, USA
4
Lopez-Lopera AF, Durrande N, Alvarez MA. Physically-Inspired Gaussian Process Models for Post-Transcriptional Regulation in Drosophila. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:656-666. [PMID: 31144643 DOI: 10.1109/tcbb.2019.2918774] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
The regulatory process of Drosophila is thoroughly studied for understanding a great variety of biological principles. While pattern-forming gene networks are analyzed in the transcription step, post-transcriptional events (e.g., translation, protein processing) play an important role in establishing protein expression patterns and levels. Since the post-transcriptional regulation of Drosophila depends on spatiotemporal interactions between mRNAs and gap proteins, proper physically-inspired stochastic models are required to study the link between both quantities. Previous research attempts have shown that using Gaussian processes (GPs) and differential equations leads to promising predictions when analyzing regulatory networks. Here, we aim at further investigating two types of physically-inspired GP models based on a reaction-diffusion equation where the main difference lies in where the prior is placed. While one of them has been studied previously using protein data only, the other is novel and yields a simple approach requiring only the differentiation of kernel functions. In contrast to other stochastic frameworks, discretizing the spatial domain is not required here. Both GP models are tested under different conditions depending on the availability of gap gene mRNA expression data. Finally, their performances are assessed on a high-resolution dataset describing the blastoderm stage of the early embryo of Drosophila melanogaster.
5
Hasegawa T, Yamaguchi R, Kakuta M, Sawada K, Kawatani K, Murashita K, Nakaji S, Imoto S. Prediction of blood test values under different lifestyle scenarios using time-series electronic health record. PLoS One 2020; 15:e0230172. [PMID: 32196517 PMCID: PMC7083324 DOI: 10.1371/journal.pone.0230172] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/18/2019] [Accepted: 02/24/2020] [Indexed: 12/13/2022] Open
Abstract
Owing to increasing medical expenses, researchers have attempted to detect clinical signs and preventive measures of diseases using electronic health records (EHRs). In particular, time-series EHRs collected through periodic medical check-ups enable us to clarify the relationships between check-up results and individual environmental factors such as lifestyle. However, such time-series data usually have many missing observations, and some results are strongly correlated with each other. These problems make the analysis difficult, and there is strong demand for methods that can detect clinical findings despite them. We focus on blood test values in medical check-up results and apply a time-series analysis methodology using a state space model. This approach can infer the internal medical states reflected in blood test values and handle missing observations. The estimated models enable us to predict one's blood test values under specified conditions and predict the effect of interventions, such as changes in body composition and lifestyle. We use time-series data of EHRs periodically collected in the Hirosaki cohort study in Japan and elucidate the effect of 17 environmental factors on 38 blood test values in elderly people. Using the estimated model, we then simulate and compare time transitions of participants' blood test values under several lifestyle scenarios. This visualizes the impact of lifestyle changes for the prevention of diseases. Finally, we show by example that prediction errors under participants' actual lifestyles can be partially explained by genetic variations, and some of their effects have not been investigated by traditional association studies.
Affiliation(s)
- Takanori Hasegawa
  - Health Intelligence Center, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan
- Rui Yamaguchi
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan
- Masanori Kakuta
  - Human Genome Center, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan
- Kaori Sawada
  - Department of Social Medicine, Graduate School of Medicine, Hirosaki University, Hirosaki, Aomori, Japan
- Kenichi Kawatani
  - COI Research Initiatives Organization, Hirosaki University, Hirosaki, Aomori, Japan
- Koichi Murashita
  - COI Research Initiatives Organization, Hirosaki University, Hirosaki, Aomori, Japan
- Shigeyuki Nakaji
  - Department of Social Medicine, Graduate School of Medicine, Hirosaki University, Hirosaki, Aomori, Japan
- Seiya Imoto
  - Health Intelligence Center, The Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo, Japan
6
Tavakoli M, Tsekouras K, Day R, Dunn KW, Pressé S. Quantitative Kinetic Models from Intravital Microscopy: A Case Study Using Hepatic Transport. J Phys Chem B 2019; 123:7302-7312. [PMID: 31298856 PMCID: PMC6857640 DOI: 10.1021/acs.jpcb.9b04729] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
The liver performs critical physiological functions, including metabolizing and removing substances, such as toxins and drugs, from the bloodstream. Hepatotoxicity itself is intimately linked to abnormal hepatic transport, and hepatotoxicity remains the primary reason drugs in development fail and approved drugs are withdrawn from the market. For this reason, we propose to analyze, across liver compartments, the transport kinetics of fluorescein, a fluorescent marker used as a proxy for drug molecules, using intravital microscopy data. To resolve the transport kinetics quantitatively from fluorescence data, we account for the effect that different liver compartments (with different chemical properties) have on fluorescein's emission rate. To do so, we develop ordinary differential equation transport models from the data where the kinetics is related to the observable fluorescence levels by "measurement parameters" that vary across different liver compartments. On account of the steep non-linearities in the kinetics and stochasticity inherent to the model, we infer kinetic and measurement parameters by generalizing the method of parameter cascades. For this application, the method of parameter cascades ensures fast and precise parameter estimates from noisy time traces.
Affiliation(s)
- Meysam Tavakoli
  - Department of Physics, Indiana University-Purdue University, Indianapolis, Indiana 46202, United States
- Richard Day
  - Department of Cellular and Integrative Physiology, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Kenneth W. Dunn
  - Department of Medicine and Biochemistry, Indiana University School of Medicine, Indianapolis, Indiana 46202, United States
- Steve Pressé
  - Center for Biological Physics, Arizona State University, Tempe, Arizona 85287, United States
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287, United States
7
Grzegorczyk M, Aderhold A, Husmeier D. Overview and Evaluation of Recent Methods for Statistical Inference of Gene Regulatory Networks from Time Series Data. Methods Mol Biol 2019; 1883:49-94. [PMID: 30547396 DOI: 10.1007/978-1-4939-8882-2_3] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 02/14/2023]
Abstract
A challenging problem in systems biology is the reconstruction of gene regulatory networks from postgenomic data. A variety of reverse engineering methods from machine learning and computational statistics have been proposed in the literature. However, deciding on the best method to adopt for a particular application or data set might be a confusing task. The present chapter provides a broad overview of state-of-the-art methods with an emphasis on conceptual understanding rather than a deluge of mathematical details, and the pros and cons of the various approaches are discussed. Guidance on practical applications with pointers to publicly available software implementations is included. The chapter concludes with a comprehensive comparative benchmark study on simulated data and a real-world application taken from current plant systems biology.
Affiliation(s)
- Marco Grzegorczyk
  - Johann Bernoulli Institute, University of Groningen, Groningen, The Netherlands
- Andrej Aderhold
  - Center for Computer Science, Universidade Federal do Rio Grande, Rio Grande, Brazil
- Dirk Husmeier
  - School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
8
Dony L, He F, Stumpf MPH. Parametric and non-parametric gradient matching for network inference: a comparison. BMC Bioinformatics 2019; 20:52. [PMID: 30683048 PMCID: PMC6346534 DOI: 10.1186/s12859-018-2590-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/04/2018] [Accepted: 12/21/2018] [Indexed: 11/24/2022] Open
Abstract
Background: Reverse engineering of gene regulatory networks from time series gene-expression data is a challenging problem, not only because of the vast sets of candidate interactions but also due to the stochastic nature of gene expression. We limit our analysis to nonlinear differential equation based inference methods. In order to avoid the computational cost of large-scale simulations, a two-step Gaussian process interpolation based gradient matching approach has been proposed to solve differential equations approximately. Results: We apply a gradient matching inference approach to a large number of candidate models, including parametric differential equations or their corresponding non-parametric representations, and we evaluate the network inference performance under various settings for different inference objectives. We use model averaging, based on the Bayesian Information Criterion (BIC), to combine the different inferences. The performance of different inference approaches is evaluated using area under the precision-recall curves. Conclusions: We found that parametric methods can provide comparable, and often improved inference compared to non-parametric methods; the latter, however, require no kinetic information and are computationally more efficient. Electronic supplementary material: The online version of this article (10.1186/s12859-018-2590-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Leander Dony
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
  - Institute of Computational Biology, Helmholtz Center Munich, German Research Center for Environmental Health, Neuherberg, 85764, Germany
  - Max Planck Institute of Psychiatry, Kraepelinstr. 2-10, Munich, 80804, Germany
- Fei He
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
  - School of Computing, Electronics, and Mathematics, Coventry University, Coventry, CV1 2JH, UK
- Michael P H Stumpf
  - Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London, SW7 2AZ, UK
  - Melbourne Integrative Genomics, School of BioScience & School of Mathematics and Statistics, University of Melbourne, Parkville Melbourne, 3010, Australia
9
Lopez-Lopera AF, Alvarez MA. Switched Latent Force Models for Reverse-Engineering Transcriptional Regulation in Gene Expression Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:322-335. [PMID: 29990003 DOI: 10.1109/tcbb.2017.2764908] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
To survive environmental conditions, cells transcribe their response activities into encoded mRNA sequences in order to produce certain amounts of protein concentrations. The external conditions are mapped into the cell through the activation of special proteins called transcription factors (TFs). Because TF behaviors are difficult to measure experimentally and their fast dynamics are challenging to capture, different types of models based on differential equations have been proposed. However, those approaches usually incur costly procedures, and they struggle to describe sudden changes in TF regulators. In this paper, we present a switched dynamical latent force model for reverse-engineering transcriptional regulation in gene expression data which allows exact inference over latent TF activities driving some observed gene expressions through a linear differential equation. To deal with discontinuities in the dynamics, we introduce an approach that switches between different TF activities and different dynamical systems. This creates a versatile representation of transcription networks that can capture discrete changes and non-linearities. We evaluate our model on both simulated data and real data (e.g., microaerobic shift in E. coli, yeast respiration), concluding that our framework allows for the fitting of the expression data while being able to infer continuous-time TF profiles.
10
Reverse engineering gene regulatory networks by modular response analysis - a benchmark. Essays Biochem 2018; 62:535-547. [PMID: 30315094 DOI: 10.1042/ebc20180012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/12/2018] [Revised: 08/13/2018] [Accepted: 08/24/2018] [Indexed: 11/17/2022]
Abstract
Gene regulatory networks control the cellular phenotype by changing the RNA and protein composition. Despite its importance, the gene regulatory network in higher organisms is only partly mapped out. Here, we investigate the potential of reverse engineering methods to unravel the structure of these networks. Particularly, we focus on modular response analysis (MRA), a method that can disentangle networks from perturbation data. We benchmark a version of MRA that was previously successfully applied to reconstruct a signalling-driven genetic network, termed MLMSMRA, to test cases mimicking various aspects of gene regulatory networks. We then investigate the performance in comparison with other MRA realisations and related methods. The benchmark shows that MRA has the potential to predict functional interactions, but also shows that successful application of MRA is restricted to small sparse networks and to data with a low signal-to-noise ratio.
11
Zhang F, Brenner M, Yang WL, Wang P. A cold-inducible RNA-binding protein (CIRP)-derived peptide attenuates inflammation and organ injury in septic mice. Sci Rep 2018; 8:3052. [PMID: 29434211 PMCID: PMC5809586 DOI: 10.1038/s41598-017-13139-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 07/14/2017] [Accepted: 09/19/2017] [Indexed: 12/29/2022] Open
Abstract
Cold-inducible RNA-binding protein (CIRP) is a novel sepsis inflammatory mediator and C23 is a putative CIRP competitive inhibitor. Therefore, we hypothesized that C23 can ameliorate sepsis-associated injury to the lungs and kidneys. First, we confirmed that C23 dose-dependently inhibited TNF-α release, IκBα degradation, and NF-κB nuclear translocation in macrophages stimulated with CIRP. Next, we observed that male C57BL/6 mice treated with C23 (8 mg/kg BW) at 2 h after cecal ligation and puncture (CLP) had lower serum levels of LDH, ALT, IL-6, TNF-α, and IL-1β (reduced by ≥39%) at 20 h after CLP compared with mice treated with vehicle. C23-treated mice also had improved lung histology, fewer TUNEL-positive cells, lower serum levels of creatinine (34%) and BUN (26%), and lower kidney expression of NGAL (50%) and KIM-1 (86%). C23-treated mice also had reduced lung and kidney levels of IL-6, TNF-α, and IL-1β. E-selectin and ICAM-1 mRNA levels were significantly lower in C23-treated mice. The 10-day survival after CLP of vehicle-treated mice was 55%, while that of C23-treated mice was 85%. In summary, C23 decreased systemic, lung, and kidney injury and inflammation, and improved the survival rate after CLP, suggesting that it may be developed as a new treatment for sepsis.
Affiliation(s)
- Fangming Zhang
  - Center for Immunology and Inflammation, The Feinstein Institute for Medical Research, Manhasset, NY, 11030, United States
- Max Brenner
  - Center for Immunology and Inflammation, The Feinstein Institute for Medical Research, Manhasset, NY, 11030, United States
- Weng-Lang Yang
  - Center for Immunology and Inflammation, The Feinstein Institute for Medical Research, Manhasset, NY, 11030, United States
  - Department of Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Manhasset, NY, 11030, United States
- Ping Wang
  - Center for Immunology and Inflammation, The Feinstein Institute for Medical Research, Manhasset, NY, 11030, United States
  - Department of Surgery, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell, Manhasset, NY, 11030, United States
12
Li Y, Chen J, Jiang L, Zeng N, Jiang H, Du M. The p53–Mdm2 regulation relationship under different radiation doses based on the continuous–discrete extended Kalman filter algorithm. Neurocomputing 2018. [DOI: 10.1016/j.neucom.2017.08.016] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
13
Attenuation of hemorrhage-associated lung injury by adjuvant treatment with C23, an oligopeptide derived from cold-inducible RNA-binding protein. J Trauma Acute Care Surg 2017; 83:690-697. [PMID: 28930962 DOI: 10.1097/ta.0000000000001566] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 01/09/2023]
Abstract
BACKGROUND: Hemorrhagic shock (HS) is an important cause of mortality. HS is associated with an elevated incidence of acute lung injury and acute respiratory distress syndrome, significantly contributing to HS morbidity and mortality. Cold-inducible RNA-binding protein (CIRP) is released into the circulation during HS and can cause lung injury. C23 is a CIRP-derived oligopeptide that binds with high affinity to the CIRP receptor and inhibits CIRP-induced phagocyte secretion of TNF-α. This study was designed to determine whether C23 is able to attenuate HS-associated lung injury. METHODS: C57BL/6 mice were subjected to controlled hemorrhage leading to a mean arterial pressure of 25 ± 3 mm Hg for 90 minutes. Mice were then volume-resuscitated for 30 minutes with normal saline solution alone (vehicle) or plus adjuvant treatment with C23 (8 mg/kg BW). At 4.5 hours after resuscitation, the blood and lungs were harvested. RESULTS: Serum levels of the organ injury markers lactate dehydrogenase and aspartate aminotransferase were significantly elevated in hemorrhaged mice receiving vehicle and were reduced by 51.3% and 52.2%, respectively, in mice adjuvantly treated with C23. Similarly, lung mRNA levels of IL-1β, TNF-α, and IL-6, and lung myeloperoxidase activity were elevated after HS and reduced by 66.1%, 54.4%, 69.7%, and 24.3%, respectively, in mice treated with C23. Adjuvant treatment with C23 also decreased the lung histology score by 33.9%, lung extravasation of albumin carrying Evans blue dye by 36.8%, and the protein level of intercellular adhesion molecule-1, an indicator of vascular endothelial cell activation, by 40.3%. CONCLUSION: Together, these results indicate that adjuvant treatment with the CIRP-derived oligopeptide C23 is able to improve lung inflammation and vascular endothelial activation secondary to HS, lending support to the development of CIRP-targeting adjuvant treatments to minimize lung injury after HS.
14
Grzegorczyk M, Aderhold A, Husmeier D. Targeting Bayes factors with direct-path non-equilibrium thermodynamic integration. Comput Stat 2017; 32:717-761. [PMID: 32103862 PMCID: PMC7010372 DOI: 10.1007/s00180-017-0721-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 07/22/2016] [Accepted: 02/27/2017] [Indexed: 11/21/2022]
Abstract
Thermodynamic integration (TI) for computing marginal likelihoods is based on an inverse annealing path from the prior to the posterior distribution. In many cases, the resulting estimator suffers from high variability, which particularly stems from the prior regime. When comparing complex models with differences in a comparatively small number of parameters, intrinsic errors from sampling fluctuations may outweigh the differences in the log marginal likelihood estimates. In the present article, we propose a TI scheme that directly targets the log Bayes factor. The method is based on a modified annealing path between the posterior distributions of the two models compared, which systematically avoids the high variance prior regime. We combine this scheme with the concept of non-equilibrium TI to minimise discretisation errors from numerical integration. Results obtained on Bayesian regression models applied to standard benchmark data, and a complex hierarchical model applied to biopathway inference, demonstrate a significant reduction in estimator variance over state-of-the-art TI methods.
Affiliation(s)
- Marco Grzegorczyk
  - Johann Bernoulli Institute (JBI), Groningen University, Groningen, The Netherlands
- Andrej Aderhold
  - School of Mathematics and Statistics, Glasgow University, Glasgow, UK
- Dirk Husmeier
  - School of Mathematics and Statistics, Glasgow University, Glasgow, UK
15
Sahni H, Ross S, Barbarulo A, Solanki A, Lau CI, Furmanski A, Saldaña JI, Ono M, Hubank M, Barenco M, Crompton T. A genome wide transcriptional model of the complex response to pre-TCR signalling during thymocyte differentiation. Oncotarget 2016; 6:28646-60. [PMID: 26415229 PMCID: PMC4745683 DOI: 10.18632/oncotarget.5796] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 07/07/2015] [Accepted: 09/08/2015] [Indexed: 01/19/2023] Open
Abstract
Developing thymocytes require pre-TCR signalling to differentiate from CD4-CD8- double negative to CD4+CD8+ double positive cells. Here we followed the transcriptional response to pre-TCR signalling in a synchronised population of differentiating double negative thymocytes. This time series analysis revealed a complex transcriptional response, in which thousands of genes were up- and down-regulated before changes in cell surface phenotype were detected. Genome-wide measurement of RNA degradation of individual genes showed great heterogeneity in the rate of degradation between different genes. We therefore used time course expression and degradation data and a genome wide transcriptional modelling (GWTM) strategy to model the transcriptional response of genes up-regulated on pre-TCR signal transduction. This analysis revealed five major temporally distinct transcriptional activities that up-regulate transcription through time, whereas down-regulation of expression occurred in three waves. Our model thus placed known regulators in a temporal perspective, and in addition identified novel candidate regulators of thymocyte differentiation.
Affiliation(s)
- Hemant Sahni
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Susan Ross
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Anisha Solanki
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Ching-In Lau
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Anna Furmanski
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Masahiro Ono
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Mike Hubank
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Martino Barenco
  - Institute of Child Health, University College London, London WC1N 1EH, UK
- Tessa Crompton
  - Institute of Child Health, University College London, London WC1N 1EH, UK
16
Wang J, Wu Q, Hu XT, Tian T. An integrated approach to infer dynamic protein-gene interactions - A case study of the human P53 protein. Methods 2016; 110:3-13. [PMID: 27514497 DOI: 10.1016/j.ymeth.2016.08.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 04/15/2016] [Revised: 07/18/2016] [Accepted: 08/01/2016] [Indexed: 11/19/2022] Open
Abstract
Investigating the dynamics of genetic regulatory networks through high throughput experimental data, such as microarray gene expression profiles, is a very important but challenging task. One of the major hindrances in building detailed mathematical models for genetic regulation is the large number of unknown model parameters. To tackle this challenge, a new integrated method is proposed by combining a top-down approach and a bottom-up approach. First, the top-down approach uses probabilistic graphical models to predict the network structure of DNA repair pathway that is regulated by the p53 protein. Two networks are predicted, namely a network of eight genes with eight inferred interactions and an extended network of 21 genes with 17 interactions. Then, the bottom-up approach using differential equation models is developed to study the detailed genetic regulations based on either a fully connected regulatory network or a gene network obtained by the top-down approach. Model simulation error, parameter identifiability and robustness property are used as criteria to select the optimal network. Simulation results together with permutation tests of input gene network structures indicate that the prediction accuracy and robustness property of the two predicted networks using the top-down approach are better than those of the corresponding fully connected networks. In particular, the proposed approach reduces computational cost significantly for inferring model parameters. Overall, the new integrated method is a promising approach for investigating the dynamics of genetic regulation.
Affiliation(s)
- Junbai Wang
- Department of Pathology, Oslo University Hospital - Norwegian Radium Hospital, Montebello, 0310 Oslo, Norway
- Qianqian Wu
- School of Mathematical Sciences, Monash University, Melbourne 3800, Victoria, Australia; School of Mathematics, Hefei University of Technology, Hefei, Anhui 230009, China
- Xiaohua Tony Hu
- College of Computing and Informatics, Drexel University, Philadelphia, PA 19104, USA
- Tianhai Tian
- School of Mathematical Sciences, Monash University, Melbourne 3800, Victoria, Australia
|
17
|
Aderhold A, Husmeier D, Grzegorczyk M. Approximate Bayesian inference in semi-mechanistic models. Statistics and Computing 2016; 27:1003-1040. [PMID: 32226236 PMCID: PMC7089672 DOI: 10.1007/s11222-016-9668-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 05/11/2015] [Accepted: 05/05/2016] [Indexed: 06/10/2023]
Abstract
Inference of interaction networks represented by systems of differential equations is a challenging problem in many scientific disciplines. In the present article, we follow a semi-mechanistic modelling approach based on gradient matching. We investigate the extent to which key factors, including the kinetic model, statistical formulation and numerical methods, impact upon performance at network reconstruction. We emphasize general lessons for computational statisticians when faced with the challenge of model selection, and we assess the accuracy of various alternative paradigms, including recent widely applicable information criteria and different numerical procedures for approximating Bayes factors. We conduct the comparative evaluation with a novel inferential pipeline that systematically disambiguates confounding factors via an ANOVA scheme.
Affiliation(s)
- Andrej Aderhold
- School of Mathematics and Statistics, Glasgow University, Glasgow, UK
- Dirk Husmeier
- School of Mathematics and Statistics, Glasgow University, Glasgow, UK
- Marco Grzegorczyk
- Johann Bernoulli Institute (JBI), Groningen University, Groningen, The Netherlands
|
18
|
Tellaroli P, Bazzi M, Donato M, Brazzale AR, Drăghici S. Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters. PLoS One 2016; 11:e0152333. [PMID: 27015427 PMCID: PMC4807765 DOI: 10.1371/journal.pone.0152333] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 08/17/2015] [Accepted: 03/11/2016] [Indexed: 11/19/2022] Open
Abstract
Four of the most common limitations of the many available clustering methods are: i) the lack of a proper strategy to deal with outliers; ii) the need for a good a priori estimate of the number of clusters to obtain reasonable results; iii) the lack of a method able to detect when partitioning of a specific data set is not appropriate; and iv) the dependence of the result on the initialization. Here we propose Cross-clustering (CC), a partial clustering algorithm that overcomes these four limitations by combining the principles of two well established hierarchical clustering algorithms: Ward’s minimum variance and Complete-linkage. We validated CC by comparing it with a number of existing clustering methods, including Ward’s and Complete-linkage. We show on both simulated and real datasets, that CC performs better than the other methods in terms of: the identification of the correct number of clusters, the identification of outliers, and the determination of real cluster memberships. We used CC to cluster samples in order to identify disease subtypes, and on gene profiles, in order to determine groups of genes with the same behavior. Results obtained on a non-biological dataset show that the method is general enough to be successfully used in such diverse applications. The algorithm has been implemented in the statistical language R and is freely available from the CRAN contributed packages repository.
Affiliation(s)
- Paola Tellaroli
- Department of Statistical Sciences, University of Padova, Padova, Italy
- Marco Bazzi
- Department of Statistical Sciences, University of Padova, Padova, Italy
- Michele Donato
- Department of Computer Science, Wayne State University, Detroit, MI, United States of America
- Sorin Drăghici
- Department of Computer Science, Wayne State University, Detroit, MI, United States of America
- Department of Obstetrics and Gynecology, Wayne State University School of Medicine, Detroit, MI, United States of America
|
19
|
Macdonald B, Husmeier D. Gradient Matching Methods for Computational Inference in Mechanistic Models for Systems Biology: A Review and Comparative Analysis. Front Bioeng Biotechnol 2015; 3:180. [PMID: 26636071 PMCID: PMC4654429 DOI: 10.3389/fbioe.2015.00180] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/15/2015] [Accepted: 10/23/2015] [Indexed: 11/13/2022] Open
Abstract
Parameter inference in mathematical models of biological pathways, expressed as coupled ordinary differential equations (ODEs), is a challenging problem in contemporary systems biology. Conventional methods involve repeatedly solving the ODEs by numerical integration, which is computationally onerous and does not scale up to complex systems. Aimed at reducing the computational costs, new concepts based on gradient matching have recently been proposed in the computational statistics and machine learning literature. In a preliminary smoothing step, the time series data are interpolated; then, in a second step, the parameters of the ODEs are optimized, so as to minimize some metric measuring the difference between the slopes of the tangents to the interpolants, and the time derivatives from the ODEs. In this way, the ODEs never have to be solved explicitly. This review provides a concise methodological overview of the current state-of-the-art methods for gradient matching in ODEs, followed by an empirical comparative evaluation based on a set of widely used and representative benchmark data.
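The two-step procedure described in this abstract can be sketched on a toy problem. The following is a minimal illustration only, not any of the methods compared in the review: it assumes a one-parameter decay model dx/dt = -theta * x, uses a polynomial fit as the smoother, and matches gradients by least squares. The function name `gradient_matching_decay`, the polynomial degree, and the synthetic data are all assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import Polynomial


def gradient_matching_decay(t, y, degree=5):
    """Estimate theta in dx/dt = -theta * x without solving the ODE.

    Hypothetical helper for illustration only.
    """
    # Step 1 (smoothing): fit a polynomial interpolant to the time series.
    interpolant = Polynomial.fit(t, y, degree)
    x_smooth = interpolant(t)
    # Slopes of the tangents to the interpolant at the sample times.
    dx_smooth = interpolant.deriv()(t)

    # Step 2 (matching): minimise sum_i (dx_i + theta * x_i)^2 over theta.
    # Because this toy ODE is linear in theta, the minimiser is closed-form.
    theta_hat = -np.sum(dx_smooth * x_smooth) / np.sum(x_smooth ** 2)
    return theta_hat


# Synthetic data from the assumed toy model x(t) = exp(-theta * t) plus noise.
true_theta = 1.5
t = np.linspace(0.0, 2.0, 50)
rng = np.random.default_rng(0)
y = np.exp(-true_theta * t) + rng.normal(scale=0.01, size=t.size)

theta_hat = gradient_matching_decay(t, y)
print(f"estimated theta: {theta_hat:.3f} (true value {true_theta})")
```

For ODEs that are nonlinear in their parameters, the closed-form step would be replaced by a numerical optimiser, but the ODE itself would still never be integrated.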
Affiliation(s)
- Benn Macdonald
- School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
- Dirk Husmeier
- School of Mathematics and Statistics, University of Glasgow, Glasgow, UK
|
20
|
Wang S, Shen Y, Hu J. Thermodynamics-based models of transcriptional regulation with gene sequence. Bioprocess Biosyst Eng 2015; 38:2469-76. [PMID: 26458822 DOI: 10.1007/s00449-015-1484-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/04/2015] [Accepted: 10/03/2015] [Indexed: 11/24/2022]
Abstract
Quantitative models of gene regulatory activity have the potential to improve our mechanistic understanding of transcriptional regulation. However, the few models available today have been based on simplistic assumptions about the sequences being modeled or heuristic approximations of the underlying regulatory mechanisms. In this work, we have developed a thermodynamics-based model to predict gene expression driven by any DNA sequence. The proposed model relies on a continuous time, differential equation description of transcriptional dynamics. The sequence features of the promoter are exploited to derive the binding affinity, which is based on statistical molecular thermodynamics. Experimental results show that the proposed model can effectively identify the activity levels of transcription factors and the regulatory parameters. Compared with previous models, the proposed model can reveal more biological insight.
Affiliation(s)
- Shuqiang Wang
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yanyan Shen
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Jinxing Hu
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
|
21
|
Hasegawa T, Mori T, Yamaguchi R, Shimamura T, Miyano S, Imoto S, Akutsu T. Genomic data assimilation using a higher moment filtering technique for restoration of gene regulatory networks. BMC Systems Biology 2015; 9:14. [PMID: 25890175 PMCID: PMC4371723 DOI: 10.1186/s12918-015-0154-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 09/24/2014] [Accepted: 02/20/2015] [Indexed: 11/20/2022]
Abstract
BACKGROUND As a result of recent advances in biotechnology, many findings related to intracellular systems have been published, e.g., transcription factor (TF) information. Although we can reproduce biological systems by incorporating such findings and describing their dynamics as mathematical equations, simulation results can be inconsistent with data from biological observations if there are inaccurate or unknown parts in the constructed system. For the completion of such systems, relationships among genes have been inferred through several computational approaches, which typically apply several abstractions, e.g., linearization, to handle the heavy computational cost in evaluating biological systems. However, since these approximations can generate false regulations, computational methods that can infer regulatory relationships based on less abstract models incorporating existing knowledge have been strongly required. RESULTS We propose a new data assimilation algorithm that utilizes a simple nonlinear regulatory model and a state space representation to infer gene regulatory networks (GRNs) using time-course observation data. For the estimation of the hidden state variables and the parameter values, we developed a novel method termed a higher moment ensemble particle filter (HMEnPF) that can retain the first four moments of the conditional distributions through filtering steps. Starting from the original model, e.g., derived from the literature, the proposed algorithm can sequentially evaluate candidate models, which are generated by partially changing the current best model, to find the model that can best predict the data. For the performance evaluation, we generated six synthetic datasets based on two real biological networks and evaluated effectiveness of the proposed algorithm by improving the networks inferred by previous methods. We then applied time-course observation data of rat skeletal muscle stimulated with corticosteroid. Since a corticosteroid pharmacogenomic pathway, its kinetic/dynamics and TF candidate genes have been partially elucidated, we incorporated these findings and inferred an extended pathway of rat pharmacogenomics. CONCLUSIONS Through the simulation study, the proposed algorithm outperformed previous methods and successfully improved the regulatory structure inferred by the previous methods. Furthermore, the proposed algorithm could extend a corticosteroid related pathway, which has been partially elucidated, by incorporating several information sources. ELECTRONIC SUPPLEMENTARY MATERIAL The online version of this article (doi:10.1186/s12918-015-0154-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Takanori Hasegawa
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
- Tomoya Mori
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
- Rui Yamaguchi
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Teppei Shimamura
- Division of Systems Biology, Nagoya University Graduate School of Medicine, 65 Tsurumai-cho, Nagoya, 466-8550 Showa-ku, Japan
- Satoru Miyano
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Seiya Imoto
- Human Genome Center, The Institute of Medical Science, The University of Tokyo, 4-6-1 Shirokanedai, Tokyo, 108-8639 Minato-ku, Japan
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Kyoto, 611-0011 Uji, Japan
|
22
|
Iorio F, Saez-Rodriguez J, Bernardo DD. Network based elucidation of drug response: from modulators to targets. BMC Systems Biology 2013; 7:139. [PMID: 24330611 PMCID: PMC3878740 DOI: 10.1186/1752-0509-7-139] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 11/23/2012] [Accepted: 07/19/2013] [Indexed: 11/20/2022]
Abstract
Network-based drug discovery aims at harnessing the power of networks to investigate the mechanism of action of existing drugs, or new molecules, in order to identify innovative therapeutic treatments. In this review, we describe some of the most recent advances in the field of network pharmacology, starting with approaches relying on computational models of transcriptional networks, then moving to protein and signaling network models and concluding with "drug networks". These networks are derived from different sources of experimental data, or literature-based analysis, and provide a complementary view of drug mode of action. Molecular and drug networks are powerful integrated computational and experimental approaches that will likely speed up and improve the drug discovery process, once fully integrated into the academic and industrial drug discovery pipeline.
Affiliation(s)
- Francesco Iorio
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Cancer Genome Project, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK
- Julio Saez-Rodriguez
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Diego di Bernardo
- Telethon Institute of Genetics and Medicine, Naples, Italy
- Department of Electrical Engineering and Information Technology, University of Naples “Federico II”, Naples, Italy
|
23
|
Hensman J, Lawrence ND, Rattray M. Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters. BMC Bioinformatics 2013; 14:252. [PMID: 23962281 PMCID: PMC3766667 DOI: 10.1186/1471-2105-14-252] [Citation(s) in RCA: 62] [Impact Index Per Article: 5.6] [Received: 08/13/2012] [Accepted: 08/13/2013] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering. Existing methodologies fail to capture either the temporal or replicated nature of the experiments, and often impose constraints on the data collection process, such as regularly spaced samples, or similar sampling schema across replications. RESULTS We propose hierarchical Gaussian processes as a general model of gene expression time-series, with application to a variety of problems. In particular, we illustrate the method's capacity for missing data imputation, data fusion and clustering. The method can impute data which is missing both systematically and at random: in a hold-out test on real data, performance is significantly better than commonly used imputation methods. The method's ability to model inter- and intra-cluster variance leads to more biologically meaningful clusters. The approach removes the necessity for evenly spaced samples, an advantage illustrated on a developmental Drosophila dataset with irregular replications. CONCLUSION The hierarchical Gaussian process model provides an excellent statistical basis for several gene-expression time-series tasks. It has only a few additional parameters over a regular GP, has negligible additional complexity, is easily implemented and can be integrated into several existing algorithms. Our experiments were implemented in Python, and are available from the authors' website: http://staffwww.dcs.shef.ac.uk/people/J.Hensman/.
Affiliation(s)
- James Hensman
- Department of Computer Science, The University of Sheffield, Sheffield, UK
|
24
|
Chen BS, Li CW. Analysing microarray data in drug discovery using systems biology. Expert Opin Drug Discov 2013; 2:755-68. [PMID: 23488963 DOI: 10.1517/17460441.2.5.755] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Abstract
The innovation of present drug design focuses on new targets. However, compound efficacy and safety in human metabolism, including toxicity and pharmacokinetic profiles, but not target selection, are the criteria that determine which drug candidates enter the clinic. Systems biology approaches to disease are developed from the idea that disease-perturbed regulatory networks differ from their normal counterparts. Microarray data analyses reveal global changes in gene or protein expression in response to genetic and environmental changes and, accordingly, are well suited to construct the normal, disease-perturbed and drug-affected networks, which are useful for drug discovery in the pharmaceutical industry. The integration of modelling, microarray data and systems biology approaches will allow for a true breakthrough in in silico absorption, distribution, metabolism, excretion and toxicity assessment in drug design. Therefore, drug discovery through systems biology by means of microarray analyses could significantly reduce the time and cost of new drug development.
Affiliation(s)
- Bor-Sen Chen
- National Tsing Hua University, Laboratory of Control and Systems Biology, 101, Sec 2, Kuang Fu Road, Hsinchu, 300, Taiwan
|
25
|
Wang Z, Wu H, Liang J, Cao J, Liu X. On Modeling and State Estimation for Genetic Regulatory Networks With Polytopic Uncertainties. IEEE Trans Nanobioscience 2013; 12:13-20. [DOI: 10.1109/tnb.2012.2215626] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 11/09/2022]
|
26
|
Murtuza Baker S, Poskar CH, Schreiber F, Junker BH. An improved constraint filtering technique for inferring hidden states and parameters of a biological model. Bioinformatics 2013; 29:1052-9. [PMID: 23434837 DOI: 10.1093/bioinformatics/btt097] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
MOTIVATION In systems biology, kinetic models represent the biological system using a set of ordinary differential equations (ODEs). The correct values of the parameters within these ODEs are critical for a reliable study of the dynamic behaviour of such systems. Typically, it is only possible to experimentally measure a fraction of these parameter values. The rest must be indirectly determined from measurements of other quantities. In this article, we propose a novel statistical inference technique to computationally estimate these unknown parameter values. By characterizing the ODEs with non-linear state-space equations, this inference technique models the unknown parameters as hidden states, which can then be estimated from noisy measurement data. RESULTS Here we extended the square-root unscented Kalman filter (SR-UKF) proposed by Merwe and Wan to include constraints with the state estimation process. We developed the constrained square-root unscented Kalman filter (CSUKF) to estimate parameters of non-linear state-space models. This probabilistic inference technique was successfully used to estimate parameters of a glycolysis model in yeast and a gene regulatory network. We showed that our method is numerically stable and can reliably estimate parameters within a biologically meaningful parameter space from noisy observations. When compared with the two common non-linear extensions of the Kalman filter in addition to four widely used global optimization algorithms, CSUKF is shown to be both accurate and computationally efficient. With CSUKF, statistical analysis is straightforward, as it directly provides the uncertainty on the estimation result. AVAILABILITY AND IMPLEMENTATION Matlab code available upon request from the author. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
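The idea of treating unknown parameters as hidden states, as described in this abstract, is commonly written as an augmented state-space model. The block below is a generic textbook-style sketch, not the paper's CSUKF equations; the symbols $f$, $h$, $w_k$, $v_k$, $\theta$, and the bounds $\theta_{\min}$, $\theta_{\max}$ are illustrative assumptions.

```latex
\begin{aligned}
x_{k+1} &= f(x_k, \theta_k) + w_k, \\
\theta_{k+1} &= \theta_k, \\
y_k &= h(x_k) + v_k,
\end{aligned}
\qquad
z_k = \begin{pmatrix} x_k \\ \theta_k \end{pmatrix},
\qquad
\theta_{\min} \le \theta_k \le \theta_{\max}.
```

A filter run on the stacked state $z_k$ then estimates the states and the parameters jointly from the noisy measurements $y_k$, and the inequality stands in for the kind of constraint that keeps estimates within a biologically meaningful range.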
Affiliation(s)
- Syed Murtuza Baker
- Systems Biology Group, Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Germany
|
27
|
Wang SQ, Li HX. Bayesian inference based modelling for gene transcriptional dynamics by integrating multiple source of knowledge. BMC Systems Biology 2012; 6 Suppl 1:S3. [PMID: 23046631 PMCID: PMC3403574 DOI: 10.1186/1752-0509-6-s1-s3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
BACKGROUND A key challenge in the post genome era is to identify genome-wide transcriptional regulatory networks, which specify the interactions between transcription factors and their target genes. Numerous methods have been developed for reconstructing gene regulatory networks from expression data. However, most of them are based on coarse grained qualitative models, and cannot provide a quantitative view of regulatory systems. RESULTS A binding affinity based regulatory model is proposed to quantify the transcriptional regulatory network. Multiple quantities, including binding affinity and the activity level of the transcription factor (TF), are incorporated into a general learning model. The sequence features of the promoter and the possible occupancy of nucleosomes are exploited to estimate the binding probability of regulators. Compared with previous models that only employ microarray data, the proposed model can bridge the gap between the relative background frequency of the observed nucleotide and the gene's transcription rate. CONCLUSIONS We test the proposed approach on two real-world microarray datasets. Experimental results show that the proposed model can effectively identify the parameters and the activity level of TF. Moreover, the kinetic parameters introduced in the proposed model can reveal more biological insight than previous models.
Affiliation(s)
- Shu-Qiang Wang
- Department of Systems Engineering and Engineering Management, City University of Hong Kong, Hong Kong
|
28
|
Titsias MK, Honkela A, Lawrence ND, Rattray M. Identifying targets of multiple co-regulating transcription factors from expression time-series by Bayesian model comparison. BMC Systems Biology 2012; 6:53. [PMID: 22647244 PMCID: PMC3527261 DOI: 10.1186/1752-0509-6-53] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 12/20/2011] [Accepted: 05/30/2012] [Indexed: 02/02/2023]
Abstract
BACKGROUND Complete transcriptional regulatory network inference is a huge challenge because of the complexity of the network and sparsity of available data. One approach to make it more manageable is to focus on the inference of context-specific networks involving a few interacting transcription factors (TFs) and all of their target genes. RESULTS We present a computational framework for Bayesian statistical inference of target genes of multiple interacting TFs from high-throughput gene expression time-series data. We use ordinary differential equation models that describe transcription of target genes taking into account combinatorial regulation. The method consists of a training and a prediction phase. During the training phase we infer the unobserved TF protein concentrations on a subnetwork of approximately known regulatory structure. During the prediction phase we apply Bayesian model selection on a genome-wide scale and score all alternative regulatory structures for each target gene. We use our methodology to identify targets of five TFs regulating Drosophila melanogaster mesoderm development. We find that confident predicted links between TFs and targets are significantly enriched for supporting ChIP-chip binding events and annotated TF-gene interactions. Our method statistically significantly outperforms existing alternatives. CONCLUSIONS Our results show that it is possible to infer regulatory links between multiple interacting TFs and their target genes even from a single relatively short time series and in the presence of unmodelled confounders and unreliable prior knowledge on training network connectivity. Introducing data from several different experimental perturbations significantly increases the accuracy.
Affiliation(s)
- Michalis K Titsias
- The Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, UK
|
29
|
Liu X, Niranjan M. State and parameter estimation of the heat shock response system using Kalman and particle filters. Bioinformatics 2012; 28:1501-7. [PMID: 22539674 DOI: 10.1093/bioinformatics/bts161] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Indexed: 02/05/2023]
Abstract
MOTIVATION Traditional models of systems biology describe dynamic biological phenomena as solutions to ordinary differential equations, which, when parameters in them are set to correct values, faithfully mimic observations. Often parameter values are tweaked by hand until desired results are achieved, or computed from biochemical experiments carried out in vitro. Of interest in this article is the use of probabilistic modelling tools with which parameters and unobserved variables, modelled as hidden states, can be estimated from limited noisy observations of parts of a dynamical system. RESULTS Here we focus on sequential filtering methods and take a detailed look at the capabilities of three members of this family: (i) extended Kalman filter (EKF), (ii) unscented Kalman filter (UKF) and (iii) the particle filter, in estimating parameters and unobserved states of cellular response to sudden temperature elevation of the bacterium Escherichia coli. While previous literature has studied this system with the EKF, we show that parameter estimation is only possible with this method when the initial guesses are sufficiently close to the true values. The same turns out to be true for the UKF. In this thorough empirical exploration, we show that the non-parametric method of particle filtering is able to reliably estimate parameters and states, converging from initial distributions relatively far away from the underlying true values. AVAILABILITY AND IMPLEMENTATION Software implementation of the three filters on this problem can be freely downloaded from http://users.ecs.soton.ac.uk/mn/HeatShock
Affiliation(s)
- Xin Liu
- School of Electronics and Computer Science, University of Southampton, Southampton, UK
|
30
|
Liu W, Niranjan M. Gaussian process modelling for bicoid mRNA regulation in spatio-temporal Bicoid profile. Bioinformatics 2011; 28:366-72. [PMID: 22130592 DOI: 10.1093/bioinformatics/btr658] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022]
Abstract
MOTIVATION Bicoid protein molecules, translated from maternally provided bicoid mRNA, establish a concentration gradient in Drosophila early embryonic development. There is experimental evidence that the synthesis and subsequent destruction of this protein is regulated at source by precise control of the stability of the maternal mRNA. Can we infer the driving function at the source from noisy observations of the spatio-temporal protein profile? We use non-parametric Gaussian process regression for modelling the propagation of Bicoid in the embryo and infer aspects of source regulation as a posterior function. RESULTS With synthetic data from a 1D diffusion model with a source simulated to model mRNA stability regulation, our results establish that the Gaussian process method can accurately infer the driving function and capture the spatio-temporal dynamics of embryonic Bicoid propagation. On real data from the FlyEx database, too, the reconstructed source function is indicative of stability regulation, but is temporally smoother than what we expected, partly due to the fact that the dataset is only partially observed. To be in line with recent thinking on the subject, we also analyse this model with a spatial gradient of maternal mRNA, rather than being fixed at only the anterior pole. CONTACT m.niranjan@southampton.ac.uk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wei Liu
- School of Electronics and Computer Science, University of Southampton, Southampton, SO17 1BJ, UK
|
31
|
Wu M, Liu L, Chan C. Identification of novel targets for breast cancer by exploring gene switches on a genome scale. BMC Genomics 2011; 12:547. [PMID: 22053771 PMCID: PMC3269833 DOI: 10.1186/1471-2164-12-547] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Received: 08/22/2011] [Accepted: 11/03/2011] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND An important feature that emerges from analyzing gene regulatory networks is the "switch-like behavior" or "bistability", a dynamic feature of a particular gene to preferentially toggle between two steady-states. The state of gene switches plays pivotal roles in cell fate decision, but identifying switches has been difficult. Therefore a challenge confronting the field is to be able to systematically identify gene switches. RESULTS We propose a top-down mining approach to exploring gene switches on a genome-scale level. Theoretical analysis, proof-of-concept examples, and experimental studies demonstrate the ability of our mining approach to identify bistable genes by sampling across a variety of different conditions. Applying the approach to human breast cancer data identified genes that show bimodality within the cancer samples, such as estrogen receptor (ER) and ERBB2, as well as genes that show bimodality between cancer and non-cancer samples, where tumor-associated calcium signal transducer 2 (TACSTD2) is uncovered. We further suggest a likely transcription factor that regulates TACSTD2. CONCLUSIONS Our mining approach demonstrates that one can capitalize on genome-wide expression profiling to capture dynamic properties of a complex network. To the best of our knowledge, this is the first attempt in applying mining approaches to explore gene switches on a genome-scale, and the identification of TACSTD2 demonstrates that single cell-level bistability can be predicted from microarray data. Experimental confirmation of the computational results suggest TACSTD2 could be a potential biomarker and attractive candidate for drug therapy against both ER+ and ER- subtypes of breast cancer, including the triple negative subtype.
Affiliation(s)
- Ming Wu
- Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, USA.
|
32
|
Ocone A, Sanguinetti G. Reconstructing transcription factor activities in hierarchical transcription network motifs. Bioinformatics 2011; 27:2873-9. [PMID: 21903631 DOI: 10.1093/bioinformatics/btr487] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/14/2022]
Abstract
MOTIVATION A knowledge of the dynamics of transcription factors is fundamental to understand the transcriptional regulation mechanism. Nowadays, an experimental measure of transcription factor activities in vivo represents a challenge. Several methods have been developed to infer these activities from easily measurable quantities such as mRNA expression of target genes. A limitation of these methods is represented by the fact that they rely on very simple single-layer structures, typically consisting of one or more transcription factors regulating a number of target genes. RESULTS We present a novel statistical inference methodology to reverse engineer the dynamics of transcription factors in hierarchical network motifs such as feed-forward loops. The approach we present is based on a continuous time representation of the system where the high-level master transcription factor is represented as a two state Markov jump process driving a system of differential equations. We solve the inference problem using an efficient variational approach and demonstrate our method on simulated data and two real datasets. The results on real data show that the predictions of our approach can capture biological behaviours in a more effective way than single-layer models of transcription, and can lead to novel biological insights. AVAILABILITY http://homepages.inf.ed.ac.uk/gsanguin/software.html CONTACT g.sanguinetti@ed.ac.uk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Andrea Ocone
- School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
|
33
|
Asif HMS, Sanguinetti G. Large-scale learning of combinatorial transcriptional dynamics from gene expression. Bioinformatics 2011; 27:1277-83. [PMID: 21367870 DOI: 10.1093/bioinformatics/btr113] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Indexed: 11/12/2022]
Abstract
MOTIVATION Knowledge of the activation patterns of transcription factors (TFs) is fundamental to elucidate the dynamics of gene regulation in response to environmental conditions. Direct experimental measurement of TFs' activities is, however, challenging, resulting in a need to develop statistical tools to infer TF activities from mRNA expression levels of target genes. Current models, however, neglect important features of transcriptional regulation; in particular, the combinatorial nature of regulation, which is fundamental for signal integration, is not accounted for. RESULTS We present a novel method to infer combinatorial regulation of gene expression by multiple transcription factors in large-scale transcriptional regulatory networks. The method implements a factorial hidden Markov model with a non-linear likelihood to represent the interactions between the hidden transcription factors. We explore our model's performance on artificial datasets and demonstrate the applicability of our method on genome-wide scale for three expression datasets. The results obtained using our model are biologically coherent and provide a tool to explore the concealed nature of combinatorial transcriptional regulation. AVAILABILITY http://homepages.inf.ed.ac.uk/gsanguin/software.html.
Affiliation(s)
- H M Shahzad Asif
- School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh, UK
|
34
|
Lèbre S, Becq J, Devaux F, Stumpf MPH, Lelandais G. Statistical inference of the time-varying structure of gene-regulation networks. BMC SYSTEMS BIOLOGY 2010; 4:130. [PMID: 20860793 PMCID: PMC2955603 DOI: 10.1186/1752-0509-4-130] [Citation(s) in RCA: 79] [Impact Index Per Article: 5.6] [Received: 03/22/2010] [Accepted: 09/22/2010] [Indexed: 01/08/2023]
Abstract
Background Biological networks are highly dynamic in response to environmental and physiological cues. This variability is in contrast to conventional analyses of biological networks, which have overwhelmingly employed static graph models which stay constant over time to describe biological systems and their underlying molecular interactions. Methods To overcome these limitations, we propose here a new statistical modelling framework, the ARTIVA formalism (Auto Regressive TIme VArying models), and an associated inferential procedure that allows us to learn temporally varying gene-regulation networks from biological time-course expression data. ARTIVA simultaneously infers the topology of a regulatory network and how it changes over time. It allows us to recover the chronology of regulatory associations for individual genes involved in a specific biological process (development, stress response, etc.). Results We demonstrate that the ARTIVA approach generates detailed insights into the function and dynamics of complex biological systems and exploits efficiently time-course data in systems biology. In particular, two biological scenarios are analyzed: the developmental stages of Drosophila melanogaster and the response of Saccharomyces cerevisiae to benomyl poisoning. Conclusions ARTIVA does recover essential temporal dependencies in biological systems from transcriptional data, and provide a natural starting point to learn and investigate their dynamics in greater detail.
Affiliation(s)
- Sophie Lèbre
- Center for Bioinformatics, Imperial College London, London, UK
|
35
|
Tang B, Wu X, Tan G, Chen SS, Jing Q, Shen B. Computational inference and analysis of genetic regulatory networks via a supervised combinatorial-optimization pattern. BMC SYSTEMS BIOLOGY 2010; 4 Suppl 2:S3. [PMID: 20840730 PMCID: PMC2982690 DOI: 10.1186/1752-0509-4-s2-s3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
Background Post-genome era brings about diverse categories of omics data. Inference and analysis of genetic regulatory networks act prominently in extracting inherent mechanisms, discovering and interpreting the related biological nature and living principles beneath mazy phenomena, and eventually promoting the well-beings of humankind. Results A supervised combinatorial-optimization pattern based on information and signal-processing theories is introduced into the inference and analysis of genetic regulatory networks. An associativity measure is proposed to define the regulatory strength/connectivity, and a phase-shift metric determines regulatory directions among components of the reconstructed networks. Thus, it solves the undirected regulatory problems arising from most of current linear/nonlinear relevance methods. In case of computational and topological redundancy, we constrain the classified group size of pair candidates within a multiobjective combinatorial optimization (MOCO) pattern. Conclusions We testify the proposed approach on two real-world microarray datasets of different statistical characteristics. Thus, we reveal the inherent design mechanisms for genetic networks by quantitative means, facilitating further theoretic analysis and experimental design with diverse research purposes. Qualitative comparisons with other methods and certain related focuses needing further work are illustrated within the discussion section.
Affiliation(s)
- Binhua Tang
- Department of Bioinformatics, Tongji University, Shanghai, China.
|
36
|
Using temporal correlation in factor analysis for reconstructing transcription factor activities. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2010:172840. [PMID: 18604288 DOI: 10.1155/2008/172840] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/01/2007] [Accepted: 04/13/2008] [Indexed: 11/17/2022]
Abstract
Two-level gene regulatory networks consist of the transcription factors (TFs) in the top level and their regulated genes in the second level. The expression profiles of the regulated genes are the observed high-throughput data given by experiments such as microarrays. The activity profiles of the TFs are treated as hidden variables as well as the connectivity matrix that indicates the regulatory relationships of TFs with their regulated genes. Factor analysis (FA) as well as other methods, such as the network component algorithm, has been suggested for reconstructing gene regulatory networks and also for predicting TF activities. They have been applied to E. coli and yeast data with the assumption that these datasets consist of identical and independently distributed samples. Thus, the main drawback of these algorithms is that they ignore any time correlation existing within the TF profiles. In this paper, we extend previously studied FA algorithms to include time correlation within the transcription factors. At the same time, we consider connectivity matrices that are sparse in order to capture the existing sparsity present in gene regulatory networks. The TFs activity profiles obtained by this approach are significantly smoother than profiles from previous FA algorithms. The periodicities in profiles from yeast expression data become prominent in our reconstruction. Moreover, the strength of the correlation between time points is estimated and can be used to assess the suitability of the experimental time interval.
|
37
|
To CC, Vohradsky J. Measurement variation determines the gene network topology reconstructed from experimental data: a case study of the yeast cyclin network. FASEB J 2010; 24:3468-78. [PMID: 20511392 DOI: 10.1096/fj.10-160515] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022]
Affiliation(s)
- Jiri Vohradsky
- Laboratory of Bioinformatics, Institute of Microbiology, Academy of Sciences of the Czech Republic, Prague, Czech Republic
|
38
|
Kiełbasa SM, Blüthgen N, Fähling M, Mrowka R. Targetfinder.org: a resource for systematic discovery of transcription factor target genes. Nucleic Acids Res 2010; 38:W233-8. [PMID: 20460454 PMCID: PMC2896086 DOI: 10.1093/nar/gkq374] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/23/2023] Open
Abstract
Targetfinder.org (http://targetfinder.org/) provides a web-based resource for finding genes that show a similar expression pattern to a group of user-selected genes. It is based on a large-scale gene expression compendium (>1200 experiments, >13 000 genes). The primary application of Targetfinder.org is to expand a list of known transcription factor targets by new candidate target genes. The user submits a group of genes (the ‘seed’), and as a result the web site provides a list of other genes ranked by similarity of their expression to the expression of the seed genes. Additionally, the web site provides information on a recovery/cross-validation test to check for consistency of the provided seed and the quality of the ranking. Furthermore, the web site allows to analyse affinities of a selected transcription factor to the promoter regions of the top-ranked genes in order to select the best new candidate target genes for further experimental analysis.
Affiliation(s)
- Szymon M. Kiełbasa
- Max Planck Institute of Molecular Genetics, Ihnestraße 73, D-14195 Berlin, Institute of Pathology, Institute of Theoretical Biology, Charité Universitätsmedizin Berlin, Charitéplatz 1 and Institute of Physiology, AG Systems Biology, Charité Universitätsmedizin Berlin, Tucholskystr. 2, D-10117 Berlin, Germany
- *To whom correspondence should be addressed. Tel: +49 30 8413 1169; Fax: +49 30 8413 1152;
- Nils Blüthgen
- Max Planck Institute of Molecular Genetics, Ihnestraße 73, D-14195 Berlin, Institute of Pathology, Institute of Theoretical Biology, Charité Universitätsmedizin Berlin, Charitéplatz 1 and Institute of Physiology, AG Systems Biology, Charité Universitätsmedizin Berlin, Tucholskystr. 2, D-10117 Berlin, Germany
- Michael Fähling
- Max Planck Institute of Molecular Genetics, Ihnestraße 73, D-14195 Berlin, Institute of Pathology, Institute of Theoretical Biology, Charité Universitätsmedizin Berlin, Charitéplatz 1 and Institute of Physiology, AG Systems Biology, Charité Universitätsmedizin Berlin, Tucholskystr. 2, D-10117 Berlin, Germany
- Ralf Mrowka
- Max Planck Institute of Molecular Genetics, Ihnestraße 73, D-14195 Berlin, Institute of Pathology, Institute of Theoretical Biology, Charité Universitätsmedizin Berlin, Charitéplatz 1 and Institute of Physiology, AG Systems Biology, Charité Universitätsmedizin Berlin, Tucholskystr. 2, D-10117 Berlin, Germany
|
39
|
Opper M, Sanguinetti G. Learning combinatorial transcriptional dynamics from gene expression data. Bioinformatics 2010; 26:1623-9. [DOI: 10.1093/bioinformatics/btq244] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 01/19/2023] Open
|
40
|
Elkon R, Zlotorynski E, Zeller KI, Agami R. Major role for mRNA stability in shaping the kinetics of gene induction. BMC Genomics 2010; 11:259. [PMID: 20409322 PMCID: PMC2864252 DOI: 10.1186/1471-2164-11-259] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.9] [Received: 01/11/2010] [Accepted: 04/21/2010] [Indexed: 01/20/2023] Open
Abstract
Background mRNA levels in cells are determined by the relative rates of RNA production and degradation. Yet, to date, most analyses of gene expression profiles were focused on mechanisms which regulate transcription, while the role of mRNA stability in modulating transcriptional networks was to a large extent overlooked. In particular, kinetic waves in transcriptional responses are usually interpreted as resulting from sequential activation of transcription factors. Results In this study, we examined on a global scale the role of mRNA stability in shaping the kinetics of gene response. Analyzing numerous expression datasets we revealed a striking global anti-correlation between rapidity of induction and mRNA stability, fitting the prediction of a kinetic mathematical model. In contrast, the relationship between kinetics and stability was less significant when gene suppression was analyzed. Frequently, mRNAs that are stable under standard conditions were very rapidly down-regulated following stimulation. Such effect cannot be explained even by a complete shut-off of transcription, and therefore indicates intense modulation of RNA stability. Conclusion Taken together, our results demonstrate the key role of mRNA stability in determining induction kinetics in mammalian transcriptional networks.
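The anti-correlation between induction speed and mRNA stability can be reproduced with a minimal sketch. It assumes the standard first-order model dm/dt = k - d*m, which may differ in detail from the kinetic model used in the paper; the function names and rate values are hypothetical. Under this assumption, after the production rate changes, the transcript closes half of the gap to its new steady state in ln(2)/d, so the response time depends only on the degradation rate d.

```python
import math

def mrna_level(t, k_new, d, m0):
    """Closed-form solution of dm/dt = k_new - d*m with m(0) = m0."""
    m_ss = k_new / d  # new steady-state level
    return m_ss + (m0 - m_ss) * math.exp(-d * t)

def half_response_time(d):
    """Time to cover half the distance to the new steady state."""
    return math.log(2) / d

# Hypothetical rates: an unstable and a stable transcript.
fast = half_response_time(d=1.0)  # short-lived mRNA
slow = half_response_time(d=0.1)  # long-lived mRNA
```

In this sketch the stable transcript (`d=0.1`) takes ten times longer to respond than the unstable one, whatever the size of the induction.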
Affiliation(s)
- Ran Elkon
- Division of Gene Regulation, The Netherlands Cancer Institute, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands.
|
41
|
Model-based method for transcription factor target identification with limited data. Proc Natl Acad Sci U S A 2010; 107:7793-8. [PMID: 20385836 DOI: 10.1073/pnas.0914285107] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Indexed: 01/25/2023] Open
Abstract
We present a computational method for identifying potential targets of a transcription factor (TF) using wild-type gene expression time series data. For each putative target gene we fit a simple differential equation model of transcriptional regulation, and the model likelihood serves as a score to rank targets. The expression profile of the TF is modeled as a sample from a Gaussian process prior distribution that is integrated out using a nonparametric Bayesian procedure. This results in a parsimonious model with relatively few parameters that can be applied to short time series datasets without noticeable overfitting. We assess our method using genome-wide chromatin immunoprecipitation (ChIP-chip) and loss-of-function mutant expression data for two TFs, Twist, and Mef2, controlling mesoderm development in Drosophila. Lists of top-ranked genes identified by our method are significantly enriched for genes close to bound regions identified in the ChIP-chip data and for genes that are differentially expressed in loss-of-function mutants. Targets of Twist display diverse expression profiles, and in this case a model-based approach performs significantly better than scoring based on correlation with TF expression. Our approach is found to be comparable or superior to ranking based on mutant differential expression scores. Also, we show how integrating complementary wild-type spatial expression data can further improve target ranking performance.
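The idea of ranking candidate targets by model likelihood can be sketched in simplified form. The sketch departs from the paper in several labelled ways: the TF activity `tf_activity` is taken as a known, hypothetical pulse rather than a Gaussian process that is integrated out; the equation dm/dt = basal + sens*f(t) - decay*m is an assumed form of a "simple differential equation model"; and parameters are chosen on a coarse grid rather than by Bayesian inference. Gene names and data are invented for illustration.

```python
import math

def tf_activity(t):
    """Hypothetical TF activity: a single pulse peaking at t = 2."""
    return math.exp(-(t - 2.0) ** 2)

def simulate(times, basal, sens, decay, m0, dt=0.01):
    """Euler-integrate dm/dt = basal + sens*f(t) - decay*m, sampled at `times`."""
    out, m, t = [], m0, 0.0
    for t_obs in times:
        for _ in range(int(round((t_obs - t) / dt))):
            m += (basal + sens * tf_activity(t) - decay * m) * dt
            t += dt
        out.append(m)
    return out

def log_likelihood(obs, pred, sigma=0.2):
    """Gaussian log-likelihood of observations given the model prediction."""
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (o - p) ** 2 / (2 * sigma ** 2) for o, p in zip(obs, pred))

def score_gene(times, obs, basal=0.1):
    """Best log-likelihood over a coarse (sensitivity, decay) grid."""
    grid = [0.5, 1.0, 2.0, 4.0]
    return max(log_likelihood(obs, simulate(times, basal, s, d, obs[0]))
               for s in grid for d in grid)

def rank_targets(times, profiles):
    """Sort gene names from best to worst model fit."""
    return sorted(profiles, key=lambda g: score_gene(times, profiles[g]),
                  reverse=True)

times = [0, 1, 2, 3, 4, 5]
profiles = {
    "geneA": simulate(times, 0.1, 2.0, 0.5, 0.2),  # generated by the model
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0, 0.5],        # unrelated steady decline
}
ranking = rank_targets(times, profiles)
```

Here `geneA` ranks first because its profile is reproduced exactly by one grid point, while `geneB` cannot be matched by any TF-driven trajectory on the grid.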
|
42
|
Zhang Y, Hatch KA, Bacon J, Wernisch L. An integrated machine learning approach for predicting DosR-regulated genes in Mycobacterium tuberculosis. BMC SYSTEMS BIOLOGY 2010; 4:37. [PMID: 20356371 PMCID: PMC2867773 DOI: 10.1186/1752-0509-4-37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/08/2009] [Accepted: 03/31/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND DosR is an important regulator of the response to stress such as limited oxygen availability in Mycobacterium tuberculosis. Time course gene expression data enable us to dissect this response on the gene regulatory level. The mRNA expression profile of a regulator, however, is not necessarily a direct reflection of its activity. Knowing the transcription factor activity (TFA) can be exploited to predict novel target genes regulated by the same transcription factor. Various approaches have been proposed to reconstruct TFAs from gene expression data. Most of them capture only a first-order approximation to the complex transcriptional processes by assuming linear gene responses and linear dynamics in TFA, or ignore the temporal information in data from such systems. RESULTS In this paper, we approach the problem of inferring dynamic hidden TFAs using Gaussian processes (GP). We are able to model dynamic TFAs and to account for both linear and nonlinear gene responses. To test the validity of the proposed approach, we reconstruct the hidden TFA of p53, a tumour suppressor activated by DNA damage, using published time course gene expression data. Our reconstructed TFA is closer to the experimentally determined profile of p53 concentration than that from the original study. We then apply the model to time course gene expression data obtained from chemostat cultures of M. tuberculosis under reduced oxygen availability. After estimation of the TFA of DosR based on a number of known target genes using the GP model, we predict novel DosR-regulated genes: the parameters of the model are interpreted as relevance parameters indicating an existing functional relationship between TFA and gene expression. We further improve the prediction by integrating promoter sequence information in a logistic regression model. Apart from the documented DosR-regulated genes, our prediction yields ten novel genes under direct control of DosR. 
CONCLUSIONS Chemostat cultures are an ideal experimental system for controlling noise and variability when monitoring the response of bacterial organisms such as M. tuberculosis to finely controlled changes in culture conditions and available metabolites. Nonlinear hidden TFA dynamics of regulators can be reconstructed remarkably well with Gaussian processes from such data. Moreover, estimated parameters of the GP can be used to assess whether a gene is controlled by the reconstructed TFA or not. It is straightforward to combine these parameters with further information, such as the presence of binding motifs, to increase prediction accuracy.
Affiliation(s)
- Yi Zhang
- School of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK
- Kim A Hatch
- TB research, Health Protection Agency, CEPR, Porton Down, Salisbury SP4 0JG, UK
- Joanna Bacon
- TB research, Health Protection Agency, CEPR, Porton Down, Salisbury SP4 0JG, UK
- Lorenz Wernisch
- School of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK
- MRC Biostatistics Unit, University Forvie Site, Robinson Way, Cambridge CB2 0SR, UK
|
43
|
Wang J, Tian T. Quantitative model for inferring dynamic regulation of the tumour suppressor gene p53. BMC Bioinformatics 2010; 11:36. [PMID: 20085646 PMCID: PMC2832896 DOI: 10.1186/1471-2105-11-36] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 02/03/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The availability of various "omics" datasets creates a prospect of performing the study of genome-wide genetic regulatory networks. However, one of the major challenges of using mathematical models to infer genetic regulation from microarray datasets is the lack of information for protein concentrations and activities. Most of the previous researches were based on an assumption that the mRNA levels of a gene are consistent with its protein activities, though it is not always the case. Therefore, a more sophisticated modelling framework together with the corresponding inference methods is needed to accurately estimate genetic regulation from "omics" datasets. RESULTS This work developed a novel approach, which is based on a nonlinear mathematical model, to infer genetic regulation from microarray gene expression data. By using the p53 network as a test system, we used the nonlinear model to estimate the activities of transcription factor (TF) p53 from the expression levels of its target genes, and to identify the activation/inhibition status of p53 to its target genes. The predicted top 317 putative p53 target genes were supported by DNA sequence analysis. A comparison between our prediction and the other published predictions of p53 targets suggests that most of putative p53 targets may share a common depleted or enriched sequence signal on their upstream non-coding region. CONCLUSIONS The proposed quantitative model can not only be used to infer the regulatory relationship between TF and its down-stream genes, but also be applied to estimate the protein activities of TF from the expression levels of its target genes.
Affiliation(s)
- Junbai Wang
- Division of Pathology, The Norwegian Radium Hospital, Rikshospitalet University Hospital, Montebello 0310 Oslo, Norway
|
44
|
Tian T. Stochastic models for inferring genetic regulation from microarray gene expression data. Biosystems 2009; 99:192-200. [PMID: 19945503 DOI: 10.1016/j.biosystems.2009.11.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Received: 03/22/2009] [Revised: 11/23/2009] [Accepted: 11/23/2009] [Indexed: 11/16/2022]
Abstract
Microarray expression profiles are inherently noisy and many different sources of variation exist in microarray experiments. It is still a significant challenge to develop stochastic models to realize noise in microarray expression profiles, which has profound influence on the reverse engineering of genetic regulation. Using the target genes of the tumour suppressor gene p53 as the test problem, we developed stochastic differential equation models and established the relationship between the noise strength of stochastic models and parameters of an error model for describing the distribution of the microarray measurements. Numerical results indicate that the simulated variance from stochastic models with a stochastic degradation process can be represented by a monomial in terms of the hybridization intensity and the order of the monomial depends on the type of stochastic process. The developed stochastic models with multiple stochastic processes generated simulations whose variance is consistent with the prediction of the error model. This work also established a general method to develop stochastic models from experimental information.
|
45
|
Barenco M, Brewer D, Papouli E, Tomescu D, Callard R, Stark J, Hubank M. Dissection of a complex transcriptional response using genome-wide transcriptional modelling. Mol Syst Biol 2009; 5:327. [PMID: 19920812 PMCID: PMC2795478 DOI: 10.1038/msb.2009.84] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 05/04/2009] [Accepted: 10/05/2009] [Indexed: 11/14/2022] Open
Abstract
Modern genomics technologies generate huge data sets creating a demand for systems level, experimentally verified, analysis techniques. We examined the transcriptional response to DNA damage in a human T cell line (MOLT4) using microarrays. By measuring both mRNA accumulation and degradation over a short time course, we were able to construct a mechanistic model of the transcriptional response. The model predicted three dominant transcriptional activity profiles—an early response controlled by NFκB and c-Jun, a delayed response controlled by p53, and a late response related to cell cycle re-entry. The method also identified, with defined confidence limits, the transcriptional targets associated with each activity. Experimental inhibition of NFκB, c-Jun and p53 confirmed that target predictions were accurate. Model predictions directly explained 70% of the 200 most significantly upregulated genes in the DNA-damage response. Genome-wide transcriptional modelling (GWTM) requires no prior knowledge of either transcription factors or their targets. GWTM is an economical and effective method for identifying the main transcriptional activators in a complex response and confidently predicting their targets.
Affiliation(s)
- Martino Barenco
- Department of Molecular Haematology and Cancer Biology, UCL Institute of Child Health, London, UK
|
46
|
Aijö T, Lähdesmäki H. Learning gene regulatory networks from gene expression measurements using non-parametric molecular kinetics. Bioinformatics 2009; 25:2937-44. [PMID: 19706742 DOI: 10.1093/bioinformatics/btp511] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 11/13/2022]
Abstract
MOTIVATION Regulation of gene expression is fundamental to the operation of a cell. Revealing the structure and dynamics of a gene regulatory network (GRN) is of great interest and represents a considerably challenging computational problem. The GRN estimation problem is complicated by the fact that the number of gene expression measurements is typically extremely small when compared with the dimension of the biological system. Further, because the gene regulation process is intrinsically complex, commonly used parametric models can provide too simple description of the underlying phenomena and, thus, can be unreliable. In this article, we propose a novel methodology for the inference of GRNs from time-series and steady-state gene expression measurements. The presented framework is based on the use of Bayesian analysis with ordinary differential equations (ODEs) and non-parametric Gaussian process modeling for the transcriptional-level regulation. RESULTS The performance of the proposed structure inference method is evaluated using a recently published in vivo dataset. By comparing the obtained results with those of existing ODE- and Bayesian-based inference methods we demonstrate that the proposed method provides more accurate network structure learning. The predictive capabilities of the method are examined by splitting the dataset into a training set and a test set and by predicting the test set based on the training set. AVAILABILITY A MATLAB implementation of the method will be available from http://www.cs.tut.fi/~aijo2/gp upon publication.
Affiliation(s)
- Tarmo Aijö
- Department of Signal Processing, Tampere University of Technology, Tampere, Finland
|
47
|
Haley B, Paunesku T, Protić M, Woloschak GE. Response of heterogeneous ribonuclear proteins (hnRNP) to ionising radiation and their involvement in DNA damage repair. Int J Radiat Biol 2009; 85:643-55. [PMID: 19579069 DOI: 10.1080/09553000903009548] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.3] [Indexed: 10/20/2022]
Abstract
PURPOSE: To determine the relationship between heterogeneous nuclear ribonucleoproteins (hnRNP) and DNA repair, particularly in response to ionising radiation (IR). MATERIALS AND METHODS: The literature was examined for papers related to the topics of hnRNP, IR and DNA repair. RESULTS: HnRNP orchestrate the processing of mRNA to which they are bound in response to IR. HnRNP A18, B1, C1/C2 and K interact with important proteins from DNA Damage Response (DDR) pathways, binding DNA-dependent protein kinase (DNA-PK), the Ku antigen (Ku) and tumour suppressor protein 53 (p53), respectively. Notably, irregularities in the expression of hnRNP A18, B1, K, P2 and L have been linked to cancer and radiosensitivity. Sixteen different hnRNP proteins have been reported to show either mRNA transcript or protein quantity changes following IR. Various protein modifications of hnRNP in response to IR have also been noted: hnRNP A18, C1/C2 and K are phosphorylated; hnRNP C1/C2 is a target of apoptotic proteases; and hnRNP K degradation is controlled by murine double minute ubiquitin ligase (MDM2). Evidence points to a role for hnRNP A1, A18, A2/B1, C1/C2, K and P2 in regulating double-stranded break (DSB) repair pathways by promoting either homologous recombination (HR) or non-homologous end rejoining (NHEJ) repair pathways following IR. CONCLUSIONS: HnRNP proteins play a pivotal role in coordinating repair pathways following exposure to IR, through protein-protein interactions and transcript regulation of key repair and stress response mRNA. In particular, several hnRNP proteins are critical in coordinating the choice of HR or NHEJ to repair DSB caused by IR.
Affiliation(s)
- Benjamin Haley
- Department of Radiation Oncology, Northwestern University, Chicago, Illinois, USA
|
48
|
Seok J, Xiao W, Moldawer LL, Davis RW, Covert MW. A dynamic network of transcription in LPS-treated human subjects. BMC Systems Biology 2009; 3:78. [PMID: 19638230 PMCID: PMC2729748 DOI: 10.1186/1752-0509-3-78] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 10/27/2008] [Accepted: 07/28/2009] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Understanding the transcriptional regulatory networks that map out the coordinated dynamic responses of signaling proteins, transcription factors and target genes over time would represent a significant advance in the application of genome-wide expression analysis. The primary challenge is monitoring transcription factor activities over time, which is not yet possible at large scale. Instead, there have been several developments to estimate activities computationally. For example, Network Component Analysis (NCA) is an approach that can predict transcription factor activities over time as well as the relative regulatory influence of factors on each target gene. RESULTS: In this study, we analyzed a gene expression data set in blood leukocytes from human subjects administered lipopolysaccharide (LPS), a prototypical inflammatory challenge, in the context of a reconstructed regulatory network including 10 transcription factors, 99 target genes and 149 regulatory interactions. We found that the computationally estimated activities were well correlated to their coordinated action. Furthermore, we found that clustering the genes in the context of regulatory influences greatly facilitated interpretation of the expression data, as clusters of gene expression corresponded to the activity of specific factors or, more interestingly, factor combinations, which suggests coordinated regulation of gene expression. The resulting clusters were therefore more biologically meaningful, and also led to identification of additional genes under the same regulation. CONCLUSION: Using NCA, we were able to build a network that accounted for 8-11% of the genes in the known transcriptional response to LPS in humans. The dynamic network illustrated changes of transcription factor activities and gene expressions as well as interactions of signaling proteins, transcription factors and target genes.
Affiliation(s)
- Junhee Seok
- Department of Bioengineering, Stanford University, Stanford, California, USA.
|
49
|
Turtoi A, Brown I, Oskamp D, Schneeweiss FHA. Early gene expression in human lymphocytes after gamma-irradiation–a genetic pattern with potential for biodosimetry. Int J Radiat Biol 2009; 84:375-87. [PMID: 18464067 DOI: 10.1080/09553000802029886] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 10/22/2022]
|
50
|
Kirk PDW, Stumpf MPH. Gaussian process regression bootstrapping: exploring the effects of uncertainty in time course data. Bioinformatics 2009; 25:1300-6. [PMID: 19289448 PMCID: PMC2677737 DOI: 10.1093/bioinformatics/btp139] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.9] [Indexed: 11/16/2022]
Abstract
Motivation: Although it is widely accepted that high-throughput biological data are typically highly noisy, the effects that this uncertainty has upon the conclusions we draw from these data are often overlooked. However, in order to assign any degree of confidence to our conclusions, we must quantify these effects. Bootstrap resampling is one method by which this may be achieved. Here, we present a parametric bootstrapping approach for time-course data, in which Gaussian process regression (GPR) is used to fit a probabilistic model from which replicates may then be drawn. This approach implicitly allows the time dependence of the data to be taken into account, and is applicable to a wide range of problems. Results: We apply GPR bootstrapping to two datasets from the literature. In the first example, we show how the approach may be used to investigate the effects of data uncertainty upon the estimation of parameters in an ordinary differential equations (ODE) model of a cell signalling pathway. Although we find that the parameter estimates inferred from the original dataset are relatively robust to data uncertainty, we also identify a distinct second set of estimates. In the second example, we use our method to show that the topology of networks constructed from time-course gene expression data appears to be sensitive to data uncertainty, although there may be individual edges in the network that are robust in light of present data. Availability: Matlab code for performing GPR bootstrapping is available from our web site: http://www3.imperial.ac.uk/theoreticalsystemsbiology/data-software/ Contact: paul.kirk@imperial.ac.uk, m.stumpf@imperial.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Paul D W Kirk
- Centre for Bioinformatics, Division of Molecular Biosciences, Imperial College London, London SW7 2AZ, UK.
|