1
Ranciati S, Wit EC, Viroli C. Bayesian smooth‐and‐match inference for ordinary differential equations models linear in the parameters. Stat Neerl 2020. [DOI: 10.1111/stan.12192] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Affiliation(s)
- Saverio Ranciati
- Department of Statistical Sciences, University of Bologna, Bologna, Italy
- Ernst C. Wit
- Institute of Computational Science, Università della Svizzera Italiana, Lugano, Switzerland
- Cinzia Viroli
- Department of Statistical Sciences, University of Bologna, Bologna, Italy
2
Lopez-Lopera AF, Alvarez MA. Switched Latent Force Models for Reverse-Engineering Transcriptional Regulation in Gene Expression Data. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:322-335. [PMID: 29990003 DOI: 10.1109/tcbb.2017.2764908] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 06/08/2023]
Abstract
To survive environmental conditions, cells transcribe their response activities into encoded mRNA sequences in order to produce specific amounts of protein. The external conditions are mapped into the cell through the activation of special proteins called transcription factors (TFs). Because TF behaviors are difficult to measure experimentally and their fast dynamics are hard to capture, different types of models based on differential equations have been proposed. However, those approaches usually incur costly procedures, and they have difficulty describing sudden changes in TF regulators. In this paper, we present a switched dynamical latent force model for reverse-engineering transcriptional regulation in gene expression data, which allows exact inference over latent TF activities driving some observed gene expressions through a linear differential equation. To deal with discontinuities in the dynamics, we introduce an approach that switches between different TF activities and different dynamical systems. This creates a versatile representation of transcription networks that can capture discrete changes and non-linearities. We evaluate our model on both simulated data and real data (e.g., microaerobic shift in E. coli, yeast respiration), concluding that our framework allows for the fitting of the expression data while being able to infer continuous-time TF profiles.
3
Modrák M, Vohradský J. Genexpi: a toolset for identifying regulons and validating gene regulatory networks using time-course expression data. BMC Bioinformatics 2018; 19:137. [PMID: 29653518 PMCID: PMC5899412 DOI: 10.1186/s12859-018-2138-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/17/2017] [Accepted: 03/26/2018] [Indexed: 12/19/2022] [Open access]
Abstract
BACKGROUND: Identifying regulons of sigma factors is a vital subtask of gene network inference. Integrating multiple sources of data is essential for correct identification of regulons and complete gene regulatory networks. Time series of expression data measured with microarrays or RNA-seq combined with static binding experiments (e.g., ChIP-seq) or literature mining may be used for inference of sigma factor regulatory networks. RESULTS: We introduce Genexpi: a tool to identify sigma factors by combining candidates obtained from ChIP experiments or literature mining with time-course gene expression data. While Genexpi can be used to infer other types of regulatory interactions, it was designed and validated on real biological data from bacterial regulons. In this paper, we put primary focus on CyGenexpi: a plugin integrating Genexpi with the Cytoscape software for ease of use. As a part of this effort, a plugin for handling time series data in Cytoscape called CyDataseries has been developed and made available. Genexpi is also available as a standalone command line tool and an R package. CONCLUSIONS: Genexpi is a useful part of the gene network inference toolbox. It provides meaningful information about the composition of regulons and delivers biologically interpretable results.
Affiliation(s)
- Martin Modrák
- Institute of Microbiology of the Czech Academy of Sciences, Vídeňská, 1083, Prague, Czech Republic.
- Jiří Vohradský
- Institute of Microbiology of the Czech Academy of Sciences, Vídeňská, 1083, Prague, Czech Republic
4
Minas G, Jenkins DJ, Rand DA, Finkenstädt B. Inferring transcriptional logic from multiple dynamic experiments. Bioinformatics 2017; 33:3437-3444. [PMID: 28666320 PMCID: PMC5860162 DOI: 10.1093/bioinformatics/btx407] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/30/2016] [Revised: 06/02/2017] [Accepted: 06/27/2017] [Indexed: 01/10/2023] [Open access]
Abstract
MOTIVATION: The availability of more data of dynamic gene expression under multiple experimental conditions provides new information that makes the key goal of identifying not only the transcriptional regulators of a gene but also the underlying logical structure attainable. RESULTS: We propose a novel method for inferring transcriptional regulation using a simple, yet biologically interpretable, model to find the logic by which a set of candidate genes and their associated transcription factors (TFs) regulate the transcriptional process of a gene of interest. Our dynamic model links the mRNA transcription rate of the target gene to the activation states of the TFs assuming that these interactions are consistent across multiple experiments and over time. A trans-dimensional Markov Chain Monte Carlo (MCMC) algorithm is used to efficiently sample the regulatory logic under different combinations of parents and rank the estimated models by their posterior probabilities. We demonstrate and compare our methodology with other methods using simulation examples and apply it to a study of transcriptional regulation of selected target genes of Arabidopsis thaliana from microarray time series data obtained under multiple biotic stresses. We show that our method is able to detect complex regulatory interactions that are consistent under multiple experimental conditions. AVAILABILITY AND IMPLEMENTATION: Programs are written in MATLAB and Statistics Toolbox Release 2016b, The MathWorks, Inc., Natick, Massachusetts, United States and are available on GitHub https://github.com/giorgosminas/TRS and at http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software. CONTACT: giorgos.minas@warwick.ac.uk or b.f.finkenstadt@warwick.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Giorgos Minas
- Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
- Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
- Dafyd J Jenkins
- Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
- David A Rand
- Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
- Zeeman Institute, Systems Biology and Infectious Disease Epidemiology Research, University of Warwick, Coventry, CV4 7AL, UK
5
System-wide analysis of the transcriptional network of human myelomonocytic leukemia cells predicts attractor structure and phorbol-ester-induced differentiation and dedifferentiation transitions. Sci Rep 2015; 5:8283. [PMID: 25655563 PMCID: PMC4319166 DOI: 10.1038/srep08283] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/10/2014] [Accepted: 01/09/2015] [Indexed: 11/24/2022] [Open access]
Abstract
We present a system-wide transcriptional network structure that controls cell types in the context of expression pattern transitions that correspond to cell type transitions. Co-expression based analyses uncovered a system-wide, ladder-like transcription factor cluster structure composed of nearly 1,600 transcription factors in a human transcriptional network. Computer simulations based on a transcriptional regulatory model deduced from the system-wide, ladder-like transcription factor cluster structure reproduced expression pattern transitions when human THP-1 myelomonocytic leukaemia cells cease proliferation and differentiate under phorbol myristate acetate stimulation. The behaviour of MYC, a reprogramming Yamanaka factor that was suggested to be essential for induced pluripotent stem cells during dedifferentiation, could be interpreted based on the transcriptional regulation predicted by the system-wide, ladder-like transcription factor cluster structure. This study introduces a novel system-wide structure to transcriptional networks that provides new insights into network topology.
6
Topa H, Jónás Á, Kofler R, Kosiol C, Honkela A. Gaussian process test for high-throughput sequencing time series: application to experimental evolution. Bioinformatics 2015; 31:1762-70. [PMID: 25614471 PMCID: PMC4443671 DOI: 10.1093/bioinformatics/btv014] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Received: 03/13/2014] [Accepted: 01/07/2015] [Indexed: 12/21/2022]
Abstract
MOTIVATION: Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but also monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analyzing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth. RESULTS: We present the beta-binomial Gaussian process model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present simulations exploring different experimental design choices and results on real data from a Drosophila experimental evolution experiment on temperature adaptation. AVAILABILITY AND IMPLEMENTATION: R software implementing the test is available at https://github.com/handetopa/BBGP.
Affiliation(s)
- Hande Topa
- Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Ágnes Jónás
- Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Robert Kofler
- Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Carolin Kosiol
- Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
- Antti Honkela
- Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
7
Wheeler MW, Dunson DB, Pandalai SP, Baker BA, Herring AH. Mechanistic Hierarchical Gaussian Processes. J Am Stat Assoc 2014; 109:894-904. [PMID: 25541568 PMCID: PMC4273873 DOI: 10.1080/01621459.2014.899234] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 01/01/2013] [Revised: 01/01/2014] [Indexed: 10/25/2022]
Abstract
The statistics literature on functional data analysis focuses primarily on flexible black-box approaches, which are designed to allow individual curves to have essentially any shape while characterizing variability. Such methods typically cannot incorporate mechanistic information, which is commonly expressed in terms of differential equations. Motivated by studies of muscle activation, we propose a nonparametric Bayesian approach that takes into account mechanistic understanding of muscle physiology. A novel class of hierarchical Gaussian processes is defined that favors curves consistent with differential equations defined on motor, damper, spring systems. A Gibbs sampler is proposed to sample from the posterior distribution and applied to a study of rats exposed to non-injurious muscle activation protocols. Although motivated by muscle force data, a parallel approach can be used to include mechanistic information in broad functional data analysis applications.
Affiliation(s)
- Matthew W. Wheeler
- National Institute for Occupational Safety and Health, 4676 Columbia Parkway, Cincinnati, Ohio 45226, MS C-15
- Sudha P. Pandalai
- National Institute for Occupational Safety and Health, 4676 Columbia Parkway, Cincinnati, Ohio 45226, MS C-15
- Brent A. Baker
- National Institute for Occupational Safety and Health, 4676 Columbia Parkway, Cincinnati, Ohio 45226, MS C-15
- Amy H. Herring
- Department of Biostatistics, University of North Carolina at Chapel Hill
8
Strakova E, Zikova A, Vohradsky J. Inference of sigma factor controlled networks by using numerical modeling applied to microarray time series data of the germinating prokaryote. Nucleic Acids Res 2013; 42:748-63. [PMID: 24157841 PMCID: PMC3902916 DOI: 10.1093/nar/gkt917] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] [Open access]
Abstract
A computational model of gene expression was applied to a novel test set of microarray time series measurements to reveal regulatory interactions between transcriptional regulators represented by 45 sigma factors and the genes expressed during germination of a prokaryote Streptomyces coelicolor. Using microarrays, the first 5.5 h of the process was recorded in 13 time points, which provided a database of gene expression time series on genome-wide scale. The computational modeling of the kinetic relations between the sigma factors, individual genes and genes clustered according to the similarity of their expression kinetics identified kinetically plausible sigma factor-controlled networks. Using genome sequence annotations, functional groups of genes that were predominantly controlled by specific sigma factors were identified. Using external binding data complementing the modeling approach, specific genes involved in the control of the studied process were identified and their function suggested.
Affiliation(s)
- Eva Strakova
- Laboratory of Bioinformatics, Institute of Microbiology, Academy of Sciences of the Czech Republic, Prague 142 20, Czech Republic
9
Kirk P, Thorne T, Stumpf MPH. Model selection in systems and synthetic biology. Curr Opin Biotechnol 2013; 24:767-74. [PMID: 23578462 DOI: 10.1016/j.copbio.2013.03.012] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 12/22/2012] [Revised: 03/07/2013] [Accepted: 03/14/2013] [Indexed: 11/17/2022]
Abstract
Developing mechanistic models has become an integral aspect of systems biology, as has the need to differentiate between alternative models. Parameterizing mathematical models has been widely perceived as a formidable challenge, which has spurred the development of statistical and optimisation routines for parameter inference. But now focus is increasingly shifting to problems that require us to choose from among a set of different models to determine which one offers the best description of a given biological system. Here we provide an overview of recent developments in the area of model selection. We focus on approaches that are both practical and built on solid statistical principles, and outline the conceptual foundations and the scope for application of such methods in systems biology.
Affiliation(s)
- Paul Kirk
- Centre for Integrative Systems Biology and Bioinformatics, Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
10
Äijö T, Granberg K, Lähdesmäki H. Sorad: a systems biology approach to predict and modulate dynamic signaling pathway response from phosphoproteome time-course measurements. Bioinformatics 2013; 29:1283-91. [PMID: 23505293 DOI: 10.1093/bioinformatics/btt130] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Signaling networks mediate responses to different stimuli using a multitude of feed-forward, feedback and cross-talk mechanisms, and malfunctions in these mechanisms have an important role in various diseases. To understand a disease and to help discover novel therapeutic approaches, we have to reveal the molecular mechanisms underlying signal transduction and use that information to design targeted perturbations. RESULTS: We have pursued this direction by developing an efficient computational approach, Sorad, which can estimate the structure of signal transduction networks and the associated continuous signaling dynamics from phosphoprotein time-course measurements. Further, Sorad can identify experimental conditions that modulate the signaling toward a desired response. We have analyzed comprehensive phosphoprotein time-course data from a human hepatocellular liver carcinoma cell line and demonstrate here that Sorad provides more accurate predictions of phosphoprotein responses to given stimuli than previously presented methods and, importantly, that Sorad can estimate experimental conditions to achieve a desired signaling response. Because Sorad is data driven, it has a high potential to generate novel hypotheses for further research. Our analysis of the hepatocellular liver carcinoma data predicts a regulatory connection where AKT activity is dependent on IKK in TGFα stimulated cells, which is supported by the original data but not included in the original model. AVAILABILITY: An implementation of the proposed computational methods will be available at http://research.ics.aalto.fi/csb/software/. CONTACT: tarmo.aijo@aalto.fi or harri.lahdesmaki@aalto.fi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tarmo Äijö
- Department of Information and Computer Science, Aalto University, FI-00076 AALTO, Finland.
11
Abstract
This chapter is split into two main sections; first, I will present an introduction to gene networks. Second, I will discuss various approaches to gene network modeling which will include some examples for using different data sources. Computational modeling has been used for many different biological systems and many approaches have been developed addressing the different needs posed by the different application fields. The modeling approaches presented here are not limited to gene regulatory networks and occasionally I will present other examples. The material covered here is an update based on several previous publications by Thomas Schlitt and Alvis Brazma (FEBS Lett 579(8),1859-1866, 2005; Philos Trans R Soc Lond B Biol Sci 361(1467), 483-494, 2006; BMC Bioinformatics 8(suppl 6), S9, 2007) that formed the foundation for a lecture on gene regulatory networks at the In Silico Systems Biology workshop series at the European Bioinformatics Institute in Hinxton.
Affiliation(s)
- Thomas Schlitt
- Department of Medical and Molecular Genetics, King's College London, London, UK