1. Awasthi A, Minin VM, Huang J, Chow D, Xu J. Fitting a stochastic model of intensive care occupancy to noisy hospitalization time series during the COVID-19 pandemic. Stat Med 2023; 42:5189-5206. [PMID: 37705508 DOI: 10.1002/sim.9907] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2022] [Revised: 07/28/2023] [Accepted: 09/01/2023] [Indexed: 09/15/2023]
Abstract
Intensive care occupancy is an important indicator of health care stress that has been used to guide policy decisions during the COVID-19 pandemic. Toward reliable decision-making as a pandemic progresses, estimating the rates at which patients are admitted to and discharged from hospitals and intensive care units (ICUs) is crucial. Since individual-level hospital data are rarely available to modelers in each geographic locality of interest, it is important to develop tools for inferring these rates from publicly available daily numbers of hospital and ICU beds occupied. We develop such an estimation approach based on an immigration-death process that models fluctuations of ICU occupancy. Our flexible framework allows for immigration and death rates to depend on covariates, such as hospital bed occupancy and daily SARS-CoV-2 test positivity rate, which may drive changes in hospital ICU operations. We demonstrate via simulation studies that the proposed method performs well on noisy time series data and apply our statistical framework to hospitalization data from the University of California, Irvine (UCI) Health and Orange County, California. By introducing a likelihood-based framework where immigration and death rates can vary with covariates, we find, through rigorous model selection, that hospitalization and positivity rates are crucial covariates for modeling ICU stay dynamics and validate our per-patient ICU stay estimates using anonymized patient-level UCI hospital data.
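As a rough illustration of the immigration-death process named in this abstract, the sketch below simulates ICU occupancy with a constant admission (immigration) rate and a per-patient discharge (death) rate using the standard Gillespie algorithm. The function name `simulate_immigration_death`, the parameter names `lam` and `mu`, and the constant rates are assumptions made for this example; the paper's covariate-dependent rates and its likelihood-based fitting procedure are not reproduced here.

```python
import random


def simulate_immigration_death(lam, mu, x0, t_end, seed=0):
    """Simulate an immigration-death process with the Gillespie algorithm.

    lam   : admission (immigration) rate per unit time, assumed constant and > 0
    mu    : per-patient discharge (death) rate, assumed constant and >= 0
    x0    : initial occupancy
    t_end : simulation horizon
    Returns the event times and the occupancy after each event.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, states = [0.0], [x0]
    while True:
        total_rate = lam + mu * x          # rate of the next event of either kind
        t += rng.expovariate(total_rate)   # exponential waiting time
        if t > t_end:
            break
        if rng.random() < lam / total_rate:
            x += 1                         # an admission
        else:
            x -= 1                         # a discharge (only possible when x > 0)
        times.append(t)
        states.append(x)
    return times, states


# Hypothetical rates: 5 admissions per day, mean stay of 2 days per patient.
times, states = simulate_immigration_death(lam=5.0, mu=0.5, x0=3, t_end=20.0, seed=1)
print(len(states), "recorded states; final occupancy:", states[-1])
```

With constant rates the long-run occupancy of this process is Poisson distributed with mean `lam / mu`, which gives a quick sanity check for simulated paths.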
Affiliation(s)
- Achal Awasthi
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
- Volodymyr M Minin
  - Department of Statistics, University of California, Irvine, Irvine, California, USA
- Jenny Huang
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
- Daniel Chow
  - School of Medicine, University of California, Irvine, Irvine, California, USA
- Jason Xu
  - Department of Biostatistics and Bioinformatics, Duke University, Durham, North Carolina, USA
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
2. Marin R, Runvik H, Medvedev A, Engblom S. Bayesian monitoring of COVID-19 in Sweden. Epidemics 2023; 45:100715. [PMID: 37703786 DOI: 10.1016/j.epidem.2023.100715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Revised: 07/28/2023] [Accepted: 08/16/2023] [Indexed: 09/15/2023] Open
Abstract
In an effort to provide regional decision support for the public healthcare, we design a data-driven compartment-based model of COVID-19 in Sweden. From national hospital statistics we derive parameter priors, and we develop linear filtering techniques to drive the simulations given data in the form of daily healthcare demands. We additionally propose a posterior marginal estimator which provides for an improved temporal resolution of the reproduction number estimate as well as supports robustness checks via a parametric bootstrap procedure. From our computational approach we obtain a Bayesian model of predictive value which provides important insight into the progression of the disease, including estimates of the effective reproduction number, the infection fatality rate, and the regional-level immunity. We successfully validate our posterior model against several different sources, including outputs from extensive screening programs. Since our required data in comparison is easy and non-sensitive to collect, we argue that our approach is particularly promising as a tool to support monitoring and decisions within public health. Significance: Using public data from Swedish patient registries we develop a national-scale computational model of COVID-19. The parametrized model produces valuable weekly predictions of healthcare demands at the regional level and validates well against several different sources. We also obtain critical epidemiological insights into the disease progression, including, e.g., reproduction number, immunity and disease fatality estimates. The success of the model hinges on our novel use of filtering techniques which allows us to design an accurate data-driven procedure using data exclusively from healthcare demands, i.e., our approach does not rely on public testing and is therefore very cost-effective.
Affiliation(s)
- Robin Marin
  - Division of Scientific Computing, Department of Information Technology, Uppsala University, SE-751 05, Uppsala, Sweden.
- Håkan Runvik
  - Division of Systems and Control, Department of Information Technology, Uppsala University, SE-751 05, Uppsala, Sweden.
- Alexander Medvedev
  - Division of Systems and Control, Department of Information Technology, Uppsala University, SE-751 05, Uppsala, Sweden.
- Stefan Engblom
  - Division of Scientific Computing, Department of Information Technology, Uppsala University, SE-751 05, Uppsala, Sweden.
3. Wadkin LE, Golightly A, Branson J, Hoppit A, Parker NG, Baggaley AW. Quantifying Invasive Pest Dynamics through Inference of a Two-Node Epidemic Network Model. Diversity 2023. [DOI: 10.3390/d15040496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
Abstract
Invasive woodland pests have substantial ecological, economic, and social impacts, harming biodiversity and ecosystem services. Mathematical modelling informed by Bayesian inference can deepen our understanding of the fundamental behaviours of invasive pests and provide predictive tools for forecasting future spread. A key invasive pest of concern in the UK is the oak processionary moth (OPM). OPM was established in the UK in 2006; it is harmful to both oak trees and humans, and its infestation area is continually expanding. Here, we use a computational inference scheme to estimate the parameters for a two-node network epidemic model to describe the temporal dynamics of OPM in two geographically neighbouring parks (Bushy Park and Richmond Park, London). We show the applicability of such a network model to describing invasive pest dynamics and our results suggest that the infestation within Richmond Park has largely driven the infestation within Bushy Park.
4. Tang M, Dudas G, Bedford T, Minin VN. Fitting stochastic epidemic models to gene genealogies using linear noise approximation. Ann Appl Stat 2023. [DOI: 10.1214/21-aoas1583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023]
Affiliation(s)
- Mingwei Tang
  - Department of Statistics, University of Washington, Seattle
- Gytis Dudas
  - Gothenburg Global Biodiversity Centre (GGBC)
- Trevor Bedford
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center
5. Fintzi J, Wakefield J, Minin VN. A linear noise approximation for stochastic epidemic models fit to partially observed incidence counts. Biometrics 2022; 78:1530-1541. [PMID: 34374071 DOI: 10.1111/biom.13538] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/11/2020] [Revised: 06/10/2021] [Accepted: 06/17/2021] [Indexed: 12/30/2022]
Abstract
Stochastic epidemic models (SEMs) fit to incidence data are critical to elucidating outbreak dynamics, shaping response strategies, and preparing for future epidemics. SEMs typically represent counts of individuals in discrete infection states using Markov jump processes (MJPs), but are computationally challenging as imperfect surveillance, lack of subject-level information, and temporal coarseness of the data obscure the true epidemic. Analytic integration over the latent epidemic process is impossible, and integration via Markov chain Monte Carlo (MCMC) is cumbersome due to the dimensionality and discreteness of the latent state space. Simulation-based computational approaches can address the intractability of the MJP likelihood, but are numerically fragile and prohibitively expensive for complex models. A linear noise approximation (LNA) that approximates the MJP transition density with a Gaussian density has been explored for analyzing prevalence data in large-population settings, but requires modification for analyzing incidence counts without assuming that the data are normally distributed. We demonstrate how to reparameterize SEMs to appropriately analyze incidence data, and fold the LNA into a data augmentation MCMC framework that outperforms deterministic methods, statistically, and simulation-based methods, computationally. Our framework is computationally robust when the model dynamics are complex and applies to a broad class of SEMs. We evaluate our method in simulations that reflect Ebola, influenza, and SARS-CoV-2 dynamics, and apply our method to national surveillance counts from the 2013-2015 West Africa Ebola outbreak.
Affiliation(s)
- Jonathan Fintzi
  - Biostatistics Research Branch, National Institute of Allergy and Infectious Diseases, Rockville, Maryland, USA
- Jon Wakefield
  - Departments of Biostatistics and Statistics, University of Washington, Seattle, Washington, USA
- Vladimir N Minin
  - Department of Statistics, University of California, Irvine, California, USA
6. Öcal K, Gutmann MU, Sanguinetti G, Grima R. Inference and uncertainty quantification of stochastic gene expression via synthetic models. J R Soc Interface 2022; 19:20220153. [PMID: 35858045 PMCID: PMC9277240 DOI: 10.1098/rsif.2022.0153] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/25/2022] [Accepted: 06/21/2022] [Indexed: 12/26/2022] Open
Abstract
Estimating uncertainty in model predictions is a central task in quantitative biology. Biological models at the single-cell level are intrinsically stochastic and nonlinear, creating formidable challenges for their statistical estimation which inevitably has to rely on approximations that trade accuracy for tractability. Despite intensive interest, a sweet spot in this trade-off has not been found yet. We propose a flexible procedure for uncertainty quantification in a wide class of reaction networks describing stochastic gene expression including those with feedback. The method is based on creating a tractable coarse-graining of the model that is learned from simulations, a synthetic model, to approximate the likelihood function. We demonstrate that synthetic models can substantially outperform state-of-the-art approaches on a number of non-trivial systems and datasets, yielding an accurate and computationally viable solution to uncertainty quantification in stochastic models of gene expression.
Affiliation(s)
- Kaan Öcal
  - School of Informatics, University of Edinburgh, Edinburgh EH9 3JH, UK
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
- Guido Sanguinetti
  - Scuola Internazionale Superiore di Studi Avanzati, 34136 Trieste, Italy
- Ramon Grima
  - School of Biological Sciences, University of Edinburgh, Edinburgh EH9 3JH, UK
7. Sherlock C, Golightly A. Exact Bayesian inference for discretely observed Markov Jump Processes using finite rate matrices. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2093886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, UK
8. Chkrebtii OA, García YE, Capistrán MA, Noyola DE. Inference for stochastic kinetic models from multiple data sources for joint estimation of infection dynamics from aggregate reports and virological data. Ann Appl Stat 2022. [DOI: 10.1214/21-aoas1527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Yury E. García
  - Área de Matemáticas Básicas, Centro de Investigación en Matemáticas
- Daniel E. Noyola
  - Department of Microbiology, Faculty of Medicine, Universidad Autónoma de San Luis Potosí
9. Wadkin LE, Branson J, Hoppit A, Parker NG, Golightly A, Baggaley AW. Inference for epidemic models with time-varying infection rates: Tracking the dynamics of oak processionary moth in the UK. Ecol Evol 2022; 12:e8871. [PMID: 35509609 PMCID: PMC9058805 DOI: 10.1002/ece3.8871] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2022] [Revised: 03/31/2022] [Accepted: 04/08/2022] [Indexed: 11/16/2022] Open
Abstract
Invasive pests pose a great threat to forest, woodland, and urban tree ecosystems. The oak processionary moth (OPM) is a destructive pest of oak trees, first reported in the UK in 2006. Despite great efforts to contain the outbreak within the original infested area of South‐East England, OPM continues to spread. Here, we analyze data consisting of the numbers of OPM nests removed each year from two parks in London between 2013 and 2020. Using a state‐of‐the‐art Bayesian inference scheme, we estimate the parameters for a stochastic compartmental SIR (susceptible, infested, and removed) model with a time‐varying infestation rate to describe the spread of OPM. We find that the infestation rate and subsequent basic reproduction number have remained constant since 2013 (with R0 between one and two). This shows further controls must be taken to reduce R0 below one and stop the advance of OPM into other areas of England. Synthesis. Our findings demonstrate the applicability of the SIR model to describing OPM spread and show that further controls are needed to reduce the infestation rate. The proposed statistical methodology is a powerful tool to explore the nature of a time‐varying infestation rate, applicable to other partially observed time series epidemic data.
Affiliation(s)
- Laura E Wadkin
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
- Julia Branson
  - GeoData, Geography and Environmental Science, University of Southampton, Southampton, UK
- Nicholas G Parker
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
- Andrew Golightly
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
  - Department of Mathematical Sciences, Durham University, Durham, UK
- Andrew W Baggaley
  - School of Mathematics, Statistics and Physics, Newcastle University, Newcastle upon Tyne, UK
10. Münch JL, Paul F, Schmauder R, Benndorf K. Bayesian inference of kinetic schemes for ion channels by Kalman filtering. eLife 2022; 11:e62714. [PMID: 35506659 PMCID: PMC9342998 DOI: 10.7554/elife.62714] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/02/2020] [Accepted: 04/22/2022] [Indexed: 11/16/2022] Open
Abstract
Inferring adequate kinetic schemes for ion channel gating from ensemble currents is a daunting task due to limited information in the data. We address this problem by using a parallelized Bayesian filter to specify hidden Markov models for current and fluorescence data. We demonstrate the flexibility of this algorithm by including different noise distributions. Our generalized Kalman filter outperforms both a classical Kalman filter and a rate equation approach when applied to patch-clamp data exhibiting realistic open-channel noise. The derived generalization also enables inclusion of orthogonal fluorescence data, making unidentifiable parameters identifiable and increasing the accuracy of the parameter estimates by an order of magnitude. By using Bayesian highest credibility volumes, we found that our approach, in contrast to the rate equation approach, yields a realistic uncertainty quantification. Furthermore, the Bayesian filter delivers negligibly biased estimates for a wider range of data quality. For some data sets, it identifies more parameters than the rate equation approach. These results also demonstrate the power of assessing the validity of algorithms by Bayesian credibility volumes in general. Finally, we show that our Bayesian filter is more robust against errors induced by either analog filtering before analog-to-digital conversion or by limited time resolution of fluorescence data than a rate equation approach.
Affiliation(s)
- Jan L Münch
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
- Fabian Paul
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, United States
- Ralf Schmauder
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
- Klaus Benndorf
  - Institut für Physiologie II, Universitätsklinikum Jena, Friedrich Schiller University Jena, Jena, Germany
11. Davidović A, Chait R, Batt G, Ruess J. Parameter inference for stochastic biochemical models from perturbation experiments parallelised at the single cell level. PLoS Comput Biol 2022; 18:e1009950. [PMID: 35303737 PMCID: PMC8967023 DOI: 10.1371/journal.pcbi.1009950] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/30/2021] [Revised: 03/30/2022] [Accepted: 02/21/2022] [Indexed: 01/30/2023] Open
Abstract
Understanding and characterising biochemical processes inside single cells requires experimental platforms that allow one to perturb and observe the dynamics of such processes as well as computational methods to build and parameterise models from the collected data. Recent progress with experimental platforms and optogenetics has made it possible to expose each cell in an experiment to an individualised input and automatically record cellular responses over days with fine time resolution. However, methods to infer parameters of stochastic kinetic models from single-cell longitudinal data have generally been developed under the assumption that experimental data is sparse and that responses of cells to at most a few different input perturbations can be observed. Here, we investigate and compare different approaches for calculating parameter likelihoods of single-cell longitudinal data based on approximations of the chemical master equation (CME) with a particular focus on coupling the linear noise approximation (LNA) or moment closure methods to a Kalman filter. We show that, as long as cells are measured sufficiently frequently, coupling the LNA to a Kalman filter allows one to accurately approximate likelihoods and to infer model parameters from data even in cases where the LNA provides poor approximations of the CME. Furthermore, the computational cost of filtering-based iterative likelihood evaluation scales advantageously in the number of measurement times and different input perturbations and is thus ideally suited for data obtained from modern experimental platforms. To demonstrate the practical usefulness of these results, we perform an experiment in which single cells, equipped with an optogenetic gene expression system, are exposed to various different light-input sequences and measured at several hundred time points and use parameter inference based on iterative likelihood evaluation to parameterise a stochastic model of the system.
Affiliation(s)
- Anđela Davidović
  - Department of Computational Biology, Institut Pasteur, Paris, France
- Remy Chait
  - Biosciences, Living Systems Institute, University of Exeter, Exeter, United Kingdom
  - Institute of Science and Technology Austria, Klosterneuburg, Austria
- Gregory Batt
  - Department of Computational Biology, Institut Pasteur, Paris, France
  - Inria Paris, Paris, France
- Jakob Ruess
  - Department of Computational Biology, Institut Pasteur, Paris, France
  - Inria Paris, Paris, France
12. Ion IG, Wildner C, Loukrezis D, Koeppl H, De Gersem H. Tensor-train approximation of the chemical master equation and its application for parameter inference. J Chem Phys 2021; 155:034102. [PMID: 34293878 DOI: 10.1063/5.0045521] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022] Open
Abstract
In this work, we perform Bayesian inference tasks for the chemical master equation in the tensor-train format. The tensor-train approximation has been proven to be very efficient in representing high-dimensional data arising from the explicit representation of the chemical master equation solution. An additional advantage of representing the probability mass function in the tensor-train format is that parametric dependency can be easily incorporated by introducing a tensor product basis expansion in the parameter space. Time is treated as an additional dimension of the tensor and a linear system is derived to solve the chemical master equation in time. We exemplify the tensor-train method by performing inference tasks such as smoothing and parameter inference using the tensor-train framework. A very high compression ratio is observed for storing the probability mass function of the solution. Since all linear algebra operations are performed in the tensor-train format, a significant reduction in the computational time is observed as well.
Affiliation(s)
- Ion Gabriel Ion
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Christian Wildner
  - Department of Electrical Engineering and Information Technology, Technische Universität Darmstadt, Darmstadt, Germany
- Dimitrios Loukrezis
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Heinz Koeppl
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
- Herbert De Gersem
  - Centre for Computational Engineering, Technische Universität Darmstadt, Darmstadt, Germany
13. Efficient inference for stochastic differential equation mixed-effects models using correlated particle pseudo-marginal algorithms. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107151] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/23/2022]
14. Sherlock C. Direct statistical inference for finite Markov jump processes via the matrix exponential. Comput Stat 2021; 36:2863-2887. [PMID: 33897113 PMCID: PMC8054858 DOI: 10.1007/s00180-021-01102-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/20/2020] [Accepted: 03/23/2021] [Indexed: 11/27/2022]
Abstract
Given noisy, partial observations of a time-homogeneous, finite-statespace Markov chain, conceptually simple, direct statistical inference is available, in theory, via its rate matrix, or infinitesimal generator, $\mathsf{Q}$, since $\exp(\mathsf{Q}t)$ is the transition matrix over time $t$. However, perhaps because of inadequate tools for matrix exponentiation in programming languages commonly used amongst statisticians or a belief that the necessary calculations are prohibitively expensive, statistical inference for continuous-time Markov chains with a large but finite state space is typically conducted via particle MCMC or other relatively complex inference schemes. When, as in many applications, $\mathsf{Q}$ arises from a reaction network, it is usually sparse. We describe variations on known algorithms which allow fast, robust and accurate evaluation of the product of a non-negative vector with the exponential of a large, sparse rate matrix. Our implementation uses relatively recently developed, efficient, linear algebra tools that take advantage of such sparsity. We demonstrate the straightforward statistical application of the key algorithm on a model for the mixing of two alleles in a population and on the Susceptible-Infectious-Removed epidemic model.
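To make concrete the idea of multiplying a non-negative vector by the exponential of a rate matrix, the sketch below uses uniformization, a standard textbook technique for this product. It is an illustrative example only and is not claimed to be the article's implementation; the function name `transition_probs`, the dense list-of-lists matrix, and the two-state example are assumptions chosen for brevity, and a practical version would use sparse storage.

```python
import math


def transition_probs(Q, p0, t, tol=1e-12, max_terms=10_000):
    """Return the row vector p0 * exp(Q t) for a finite rate matrix Q.

    Uniformization writes exp(Q t) as a Poisson(rho * t) mixture of powers of
    the stochastic matrix P = I + Q / rho, where rho bounds the exit rates.
    """
    n = len(Q)
    rho = max(-Q[i][i] for i in range(n))      # largest exit rate
    if rho == 0.0 or t == 0.0:
        return list(p0)
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rho for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rho * t)                # Poisson weight for k = 0
    v = list(p0)                               # holds p0 * P^k
    result = [weight * x for x in v]
    accumulated = weight
    k = 0
    # Add terms until the remaining Poisson tail mass is below tol.
    while 1.0 - accumulated > tol and k < max_terms:
        k += 1
        v = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rho * t / k                  # Poisson weight for this k
        result = [r + weight * x for r, x in zip(result, v)]
        accumulated += weight
    return result


# Hypothetical two-state chain: leave state 0 at rate 2, leave state 1 at rate 1.
Q = [[-2.0, 2.0],
     [1.0, -1.0]]
print(transition_probs(Q, [1.0, 0.0], t=0.5))
```

This plain form underflows when `rho * t` is very large, because the first Poisson weight `exp(-rho * t)` rounds to zero; careful implementations rescale or split the time interval to avoid that.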
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
15. Nguyen-Van-Yen B, Del Moral P, Cazelles B. Stochastic Epidemic Models inference and diagnosis with Poisson Random Measure Data Augmentation. Math Biosci 2021; 335:108583. [PMID: 33713696 DOI: 10.1016/j.mbs.2021.108583] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/21/2020] [Revised: 12/22/2020] [Accepted: 02/28/2021] [Indexed: 11/24/2022]
Abstract
We present a new Bayesian inference method for compartmental models that takes into account the intrinsic stochasticity of the process. We show how to formulate a SIR-type Markov jump process as the solution of a stochastic differential equation with respect to a Poisson Random Measure (PRM), and how to simulate the process trajectory deterministically from a parameter value and a PRM realization. This forms the basis of our Data Augmented MCMC, which consists of augmenting parameter space with the unobserved PRM value. The resulting simple Metropolis-Hastings sampler acts as an efficient simulation-based inference method, that can easily be transferred from model to model. Compared with a recent Data Augmentation method based on Gibbs sampling of individual infection histories, PRM-augmented MCMC scales much better with epidemic size and is far more flexible. It is also found to be competitive with Particle MCMC for moderate epidemics when using approximate simulations. PRM-augmented MCMC also yields a posteriori estimates of the PRM, that represent process stochasticity, and which can be used to validate the model. A pattern of deviation from the PRM prior distribution will indicate that the model underfits the data and help to understand the cause. We illustrate this by fitting a non-seasonal model to some simulated seasonal case count data. Applied to the Zika epidemic of 2013 in French Polynesia, our approach shows that a simple SEIR model cannot correctly reproduce both the initial sharp increase in the number of cases as well as the final proportion of seropositive. PRM augmentation thus provides a coherent story for Stochastic Epidemic Model inference, where explicitly inferring process stochasticity helps with model validation.
Affiliation(s)
- Benjamin Nguyen-Van-Yen
  - Institut Pasteur, Unité de Génétique Fonctionnelle des Maladies Infectieuses, UMR 2000 CNRS, Paris, France
  - Institut de Biologie de l'ENS (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
- Bernard Cazelles
  - Institut de Biologie de l'ENS (IBENS), Ecole Normale Supérieure, CNRS, INSERM, Université PSL, 75005 Paris, France
  - International Center for Mathematical and Computational Modeling of Complex Systems (UMMISCO), UMI 209, Sorbonne Université, France
  - iGLOBE, UMI CNRS 3157, University of Arizona, Tucson, AZ, United States of America
16. Browning AP, Warne DJ, Burrage K, Baker RE, Simpson MJ. Identifiability analysis for stochastic differential equation models in systems biology. J R Soc Interface 2020; 17:20200652. [PMID: 33323054 PMCID: PMC7811582 DOI: 10.1098/rsif.2020.0652] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/11/2020] [Accepted: 11/24/2020] [Indexed: 12/26/2022] Open
Abstract
Mathematical models are routinely calibrated to experimental data, with goals ranging from building predictive models to quantifying parameters that cannot be measured. Whether or not reliable parameter estimates are obtainable from the available data can easily be overlooked. Such issues of parameter identifiability have important ramifications for both the predictive power of a model, and the mechanistic insight that can be obtained. Identifiability analysis is well-established for deterministic, ordinary differential equation (ODE) models, but there are no commonly adopted methods for analysing identifiability in stochastic models. We provide an accessible introduction to identifiability analysis and demonstrate how existing ideas for analysis of ODE models can be applied to stochastic differential equation (SDE) models through four practical case studies. To assess structural identifiability, we study ODEs that describe the statistical moments of the stochastic process using open-source software tools. Using practically motivated synthetic data and Markov chain Monte Carlo methods, we assess parameter identifiability in the context of available data. Our analysis shows that SDE models can often extract more information about parameters than deterministic descriptions. All code used to perform the analysis is available on Github.
Affiliation(s)
- Alexander P. Browning
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- David J. Warne
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
- Kevin Burrage
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Plant Success in Nature and Agriculture, Queensland University of Technology, Brisbane, Australia
  - Department of Computer Science, University of Oxford, Oxford, UK
- Ruth E. Baker
  - Mathematical Institute, University of Oxford, Oxford, UK
- Matthew J. Simpson
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
  - ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia
|
17
|
Huang Z, Lan Y. Low-dimensional projection of stochastic cell-signalling dynamics via a variational approach. Phys Rev E 2020; 101:012402. [PMID: 32069661 DOI: 10.1103/physreve.101.012402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2019] [Indexed: 11/07/2022]
Abstract
Noise and fluctuations play vital roles in signal transduction in cells. Various numerical techniques for their simulation have been proposed, most of which are not efficient in cellular networks with a wide spectrum of timescales. In this paper, based on a recently developed variational technique, low-dimensional structures embedded in complex stochastic reaction dynamics are unfolded, which sheds light on new design principles for efficient simulation algorithms that treat noise in the mesoscopic world. This idea is effectively demonstrated in several popular regulation models with an empirical selection of test functions according to their reaction geometry, which not only captures complex distribution profiles of different molecular species but also considerably speeds up the computation.
Affiliation(s)
- Zhenzhen Huang
  - School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Yueheng Lan
  - School of Science, Beijing University of Posts and Telecommunications, Beijing 100876, China
  - State Key Lab of Information Photonics and Optical Communications, Beijing University of Posts and Telecommunications, Beijing 100876, China

18
Calderazzo S, Brancaccio M, Finkenstädt B. Filtering and inference for stochastic oscillators with distributed delays. Bioinformatics 2020; 35:1380-1387. [PMID: 30202930 PMCID: PMC6477979 DOI: 10.1093/bioinformatics/bty782] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/12/2018] [Revised: 08/08/2018] [Accepted: 09/06/2018] [Indexed: 01/30/2023] Open
Abstract
Motivation: The time evolution of molecular species involved in biochemical reaction networks often arises from complex stochastic processes involving many species and reaction events. Inference for such systems is profoundly challenged by the relative sparseness of experimental data, as measurements are often limited to a small subset of the participating species measured at discrete time points. The need for model reduction can be realistically achieved for oscillatory dynamics resulting from negative translational and transcriptional feedback loops by the introduction of probabilistic time-delays. Although this approach yields a simplified model, inference is challenging and subject to ongoing research. The linear noise approximation (LNA) has recently been proposed to address such systems in stochastic form and will be exploited here.
Results: We develop a novel filtering approach for the LNA in stochastic systems with distributed delays, which allows the parameter values and unobserved states of a stochastic negative feedback model to be inferred from univariate time-series data. The performance of the methods is tested for simulated data. Results are obtained for real data when the model is fitted to imaging data on Cry1, a key gene involved in the mammalian central circadian clock, observed via a luciferase reporter construct in a mouse suprachiasmatic nucleus.
Availability and implementation: Programmes are written in MATLAB and Statistics Toolbox Release 2016b, The MathWorks, Inc., Natick, Massachusetts, USA. Sample code and Cry1 data are available on GitHub https://github.com/scalderazzo/FLNADD.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Silvia Calderazzo
  - Department of Statistics, University of Warwick, Coventry, UK
  - Division of Biostatistics, German Cancer Research Center, Heidelberg, Germany
- Marco Brancaccio
  - Division of Neurobiology, Medical Research Council Laboratory of Molecular Biology, Cambridge, UK

19
Panchal V, Linder DF. Reverse engineering gene networks using global-local shrinkage rules. Interface Focus 2019; 10:20190049. [PMID: 31897291 DOI: 10.1098/rsfs.2019.0049] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Accepted: 11/13/2019] [Indexed: 12/26/2022] Open
Abstract
Inferring gene regulatory networks from high-throughput 'omics' data has proven to be a computationally demanding task of critical importance. Frequently, the classical methods break down owing to the curse of dimensionality, and popular strategies to overcome this are typically based on regularized versions of the classical methods. However, these approaches rely on loss functions that may not be robust and usually do not allow for the incorporation of prior information in a straightforward way. Fully Bayesian methods are equipped to handle both of these shortcomings quite naturally, and they offer the potential for improvements in network structure learning. We propose a Bayesian hierarchical model to reconstruct gene regulatory networks from time-series gene expression data, such as those common in perturbation experiments of biological systems. The proposed methodology uses global-local shrinkage priors for posterior selection of regulatory edges and relaxes the common normal likelihood assumption in order to allow for heavy-tailed data, which were shown in several of the cited references to severely impact network inference. We provide a sufficient condition for posterior propriety and derive an efficient Markov chain Monte Carlo via Gibbs sampling in the electronic supplementary material. We describe a novel way to detect multiple scales based on the corresponding posterior quantities. Finally, we demonstrate the performance of our approach in a simulation study and compare it with existing methods on real data from a T-cell activation study.
Affiliation(s)
- Viral Panchal
  - Department of Mathematics and Statistics, University of North Carolina Wilmington, Wilmington, NC 28403, USA
- Daniel F Linder
  - Medical College of Georgia, Augusta University, Augusta, GA 30912, USA

20
Zimmer C, Leuba SI, Cohen T, Yaesoubi R. Accurate quantification of uncertainty in epidemic parameter estimates and predictions using stochastic compartmental models. Stat Methods Med Res 2019; 28:3591-3608. [PMID: 30428780 PMCID: PMC6517086 DOI: 10.1177/0962280218805780] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/16/2022]
Abstract
Stochastic transmission dynamic models are needed to quantify the uncertainty in estimates and predictions during outbreaks of infectious diseases. We previously developed a calibration method for stochastic epidemic compartmental models, called Multiple Shooting for Stochastic Systems (MSS), and demonstrated its competitive performance against a number of existing state-of-the-art calibration methods. The existing MSS method, however, lacks a mechanism against filter degeneracy, a phenomenon that results in parameter posterior distributions that are weighted heavily around a single value. As such, when filter degeneracy occurs, the posterior distributions of parameter estimates will not yield reliable credible or prediction intervals for parameter estimates and predictions. In this work, we extend the MSS method by evaluating and incorporating two resampling techniques to detect and resolve filter degeneracy. Using simulation experiments, we demonstrate that an extended MSS method produces credible and prediction intervals with desired coverage in estimating key epidemic parameters (e.g. mean duration of infectiousness and R0) and short- and long-term predictions (e.g. one and three-week forecasts, timing and number of cases at the epidemic peak, and final epidemic size). Applying the extended MSS approach to a humidity-based stochastic compartmental influenza model, we were able to accurately predict influenza-like illness activity reported by U.S. Centers for Disease Control and Prevention from 10 regions as well as city-level influenza activity using real-time, city-specific Google search query data from 119 U.S. cities between 2003 and 2014.
Affiliation(s)
- Christoph Zimmer
  - Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
  - Bosch Center for Artificial Intelligence, Robert Bosch GmbH, Renningen, Germany
- Sequoia I Leuba
  - Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
- Ted Cohen
  - Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, CT, USA
- Reza Yaesoubi
  - Health Policy and Management, Yale School of Public Health, New Haven, CT, USA

21
Estimating numbers of intracellular molecules through analysing fluctuations in photobleaching. Sci Rep 2019; 9:15238. [PMID: 31645577 PMCID: PMC6811640 DOI: 10.1038/s41598-019-50921-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 05/31/2019] [Accepted: 09/18/2019] [Indexed: 01/18/2023] Open
Abstract
The impact of fluorescence microscopy has been limited by the difficulties of expressing measurements of fluorescent proteins in numbers of molecules. Absolute numbers enable the integration of results from different laboratories, empower mathematical modelling, and are the bedrock for a quantitative, predictive biology. Here we propose an estimator to infer numbers of molecules from fluctuations in the photobleaching of proteins tagged with Green Fluorescent Protein. Performing experiments in budding yeast, we show that our estimates of numbers agree, within an order of magnitude, with published biochemical measurements, for all six proteins tested. The experiments we require are straightforward and use only a wide-field fluorescence microscope. As such, our approach has the potential to become standard for those practising quantitative fluorescence microscopy.

22
Lötstedt P. The Linear Noise Approximation for Spatially Dependent Biochemical Networks. Bull Math Biol 2019; 81:2873-2901. [PMID: 29644520 PMCID: PMC6677697 DOI: 10.1007/s11538-018-0428-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/09/2018] [Accepted: 03/29/2018] [Indexed: 10/26/2022]
Abstract
An algorithm for computing the linear noise approximation (LNA) of the reaction-diffusion master equation (RDME) is developed and tested. The RDME is often used as a model for biochemical reaction networks. The LNA is derived for a general discretization of the spatial domain of the problem. If M is the number of chemical species in the network and N is the number of nodes in the discretization in space, then the computational work to determine approximations of the mean and the covariances of the probability distributions is proportional to [Formula: see text] in a straightforward implementation. In our LNA algorithm, the work is proportional to [Formula: see text]. Since N usually is larger than M, this is a significant reduction. The accuracy of the approximation in the algorithm is estimated analytically and evaluated in numerical experiments.
Affiliation(s)
- Per Lötstedt
  - Division of Scientific Computing, Department of Information Technology, Uppsala University, SE-75105, Uppsala, Sweden

23
Loskot P, Atitey K, Mihaylova L. Comprehensive Review of Models and Methods for Inferences in Bio-Chemical Reaction Networks. Front Genet 2019; 10:549. [PMID: 31258548 PMCID: PMC6588029 DOI: 10.3389/fgene.2019.00549] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/08/2019] [Accepted: 05/24/2019] [Indexed: 01/30/2023] Open
Abstract
The key processes in biological and chemical systems are described by networks of chemical reactions. From molecular biology to biotechnology applications, computational models of reaction networks are used extensively to elucidate their non-linear dynamics. The model dynamics are crucially dependent on the parameter values which are often estimated from observations. Over the past decade, the interest in parameter and state estimation in models of (bio-) chemical reaction networks (BRNs) grew considerably. The related inference problems are also encountered in many other tasks including model calibration, discrimination, identifiability, and checking, and optimum experiment design, sensitivity analysis, and bifurcation analysis. The aim of this review paper is to examine the developments in literature to understand what BRN models are commonly used, and for what inference tasks and inference methods. The initial collection of about 700 documents concerning estimation problems in BRNs, excluding books and textbooks in computational biology and chemistry, was screened to select over 270 research papers and 20 graduate research theses. The paper selection was facilitated by text mining scripts to automate the search for relevant keywords and terms. The outcomes are presented in tables revealing the levels of interest in different inference tasks and methods for given models in the literature, and the research trends are uncovered. Our findings indicate that many combinations of models, tasks and methods are still relatively unexplored, and there are many new research opportunities to explore combinations that have not been considered, perhaps for good reasons. The most common models of BRNs in literature involve differential equations, Markov processes, mass action kinetics, and state space representations whereas the most common tasks are the parameter inference and model identification. The most common methods in literature are Bayesian analysis, Monte Carlo sampling strategies, and model fitting to data using evolutionary algorithms. The new research problems which cannot be directly deduced from the text mining data are also discussed.
Affiliation(s)
- Pavel Loskot
  - College of Engineering, Swansea University, Swansea, United Kingdom
- Komlan Atitey
  - College of Engineering, Swansea University, Swansea, United Kingdom
- Lyudmila Mihaylova
  - Department of Automatic Control and Systems Engineering, University of Sheffield, Sheffield, United Kingdom

24
Cao Z, Grima R. Accuracy of parameter estimation for auto-regulatory transcriptional feedback loops from noisy data. J R Soc Interface 2019; 16:20180967. [PMID: 30940028 PMCID: PMC6505555 DOI: 10.1098/rsif.2018.0967] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 01/09/2023] Open
Abstract
Bayesian and non-Bayesian moment-based inference methods are commonly used to estimate the parameters defining stochastic models of gene regulatory networks from noisy single cell or population snapshot data. However, a systematic investigation of the accuracy of the predictions of these methods remains missing. Here, we present the results of such a study using synthetic noisy data of a negative auto-regulatory transcriptional feedback loop, one of the most common building blocks of complex gene regulatory networks. We study the error in parameter estimation as a function of (i) number of cells in each sample; (ii) the number of time points; (iii) the highest-order moment of protein fluctuations used for inference; (iv) the moment-closure method used for likelihood approximation. We find that for sample sizes typical of flow cytometry experiments, parameter estimation by maximizing the likelihood is as accurate as using Bayesian methods but with a much reduced computational time. We also show that the choice of moment-closure method is the crucial factor determining the maximum achievable accuracy of moment-based inference methods. Common likelihood approximation methods based on the linear noise approximation or the zero cumulants closure perform poorly for feedback loops with large protein-DNA binding rates or large protein bursts; this is exacerbated for highly heterogeneous cell populations. By contrast, approximating the likelihood using the linear-mapping approximation or conditional derivative matching leads to highly accurate parameter estimates for a wide range of conditions.

25
Buckingham-Jeffery E, Isham V, House T. Gaussian process approximations for fast inference from infectious disease data. Math Biosci 2018; 301:111-120. [PMID: 29471011 DOI: 10.1016/j.mbs.2018.02.003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/30/2017] [Revised: 02/13/2018] [Accepted: 02/17/2018] [Indexed: 10/18/2022]
Abstract
We present a flexible framework for deriving and quantifying the accuracy of Gaussian process approximations to non-linear stochastic individual-based models of epidemics. We develop this for the SIR and SEIR models, and we show how it can be used to perform quick maximum likelihood inference for the underlying parameters given population estimates of the number of infecteds or cases at given time points. We also show how the unobserved processes can be inferred at the same time as the underlying parameters.
Affiliation(s)
- Elizabeth Buckingham-Jeffery
  - Centre for Complexity Science, University of Warwick, Coventry, CV4 7AL, UK
  - School of Mathematics, University of Manchester, Manchester M13 9PL, UK
- Valerie Isham
  - Department of Statistical Science, University College London, London, WC1E 6BT, UK
- Thomas House
  - School of Mathematics, University of Manchester, Manchester M13 9PL, UK

26
Drovandi CC, Moores MT, Boys RJ. Accelerating pseudo-marginal MCMC using Gaussian processes. Comput Stat Data Anal 2018. [DOI: 10.1016/j.csda.2017.09.002] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/26/2022]

27
Folia MM, Rattray M. Trajectory inference and parameter estimation in stochastic models with temporally aggregated data. Statistics and Computing 2017; 28:1053-1072. [PMID: 30147250 PMCID: PMC6096750 DOI: 10.1007/s11222-017-9779-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/03/2017] [Accepted: 09/22/2017] [Indexed: 06/08/2023]
Abstract
Stochastic models are of fundamental importance in many scientific and engineering applications. For example, stochastic models provide valuable insights into the causes and consequences of intra-cellular fluctuations and inter-cellular heterogeneity in molecular biology. The chemical master equation can be used to model intra-cellular stochasticity in living cells, but analytical solutions are rare and numerical simulations are computationally expensive. Inference of system trajectories and estimation of model parameters from observed data are important tasks and are even more challenging. Here, we consider the case where the observed data are aggregated over time. Aggregation of data over time is required in studies of single cell gene expression using a luciferase reporter, where the emitted light can be very faint and is therefore collected for several minutes for each observation. We show how an existing approach to inference based on the linear noise approximation (LNA) can be generalised to the case of temporally aggregated data. We provide a Kalman filter (KF) algorithm which can be combined with the LNA to carry out inference of system variable trajectories and estimation of model parameters. We apply and evaluate our method on both synthetic and real data scenarios and show that it is able to accurately infer the posterior distribution of model parameters in these examples. We demonstrate how applying standard KF inference to aggregated data without accounting for aggregation will tend to underestimate the process noise and can lead to biased parameter estimates.
Affiliation(s)
- Maria Myrto Folia
  - Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Magnus Rattray
  - Division of Informatics, Imaging and Data Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK

28
Sherlock C, Golightly A, Henderson DA. Adaptive, Delayed-Acceptance MCMC for Targets With Expensive Likelihoods. J Comput Graph Stat 2017. [DOI: 10.1080/10618600.2016.1231064] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 10/21/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, Lancaster, United Kingdom
- Andrew Golightly
  - School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne, United Kingdom
- Daniel A. Henderson
  - School of Mathematics & Statistics, Newcastle University, Newcastle upon Tyne, United Kingdom

29
Zimmer C. Experimental Design for Stochastic Models of Nonlinear Signaling Pathways Using an Interval-Wise Linear Noise Approximation and State Estimation. PLoS One 2016; 11:e0159902. [PMID: 27583802 PMCID: PMC5008843 DOI: 10.1371/journal.pone.0159902] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/03/2015] [Accepted: 07/11/2016] [Indexed: 11/18/2022] Open
Abstract
BACKGROUND: Computational modeling is a key technique for analyzing models in systems biology. There are well established methods for the estimation of the kinetic parameters in models of ordinary differential equations (ODE). Experimental design techniques aim at devising experiments that maximize the information encoded in the data. For ODE models there are well established approaches for experimental design and even software tools. However, data from single cell experiments on signaling pathways in systems biology often shows intrinsic stochastic effects prompting the development of specialized methods. While simulation methods have been developed for decades and parameter estimation has been targeted for the last years, only very few articles focus on experimental design for stochastic models.
METHODS: The Fisher information matrix is the central measure for experimental design as it evaluates the information an experiment provides for parameter estimation. This article suggests an approach to calculate a Fisher information matrix for models containing intrinsic stochasticity and high nonlinearity. The approach makes use of a recently suggested multiple shooting for stochastic systems (MSS) objective function. The Fisher information matrix is calculated by evaluating pseudo data with the MSS technique.
RESULTS: The performance of the approach is evaluated with simulation studies on an Immigration-Death, a Lotka-Volterra, and a Calcium oscillation model. The Calcium oscillation model is a particularly appropriate case study as it contains the challenges inherent to signaling pathways: high nonlinearity, intrinsic stochasticity, a qualitatively different behavior from an ODE solution, and partial observability. The computational speed of the MSS approach for the Fisher information matrix allows for an application in realistic size models.
Affiliation(s)
- Christoph Zimmer
  - BIOMS, University of Heidelberg, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany

30
Georgoulas A, Hillston J, Sanguinetti G. Unbiased Bayesian inference for population Markov jump processes via random truncations. Statistics and Computing 2016; 27:991-1002. [PMID: 28690370 PMCID: PMC5477715 DOI: 10.1007/s11222-016-9667-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/28/2015] [Accepted: 05/02/2016] [Indexed: 05/24/2023]
Abstract
We consider continuous time Markovian processes where populations of individual agents interact stochastically according to kinetic rules. Despite the increasing prominence of such models in fields ranging from biology to smart cities, Bayesian inference for such systems remains challenging, as these are continuous time, discrete state systems with potentially infinite state-space. Here we propose a novel efficient algorithm for joint state/parameter posterior sampling in population Markov Jump processes. We introduce a class of pseudo-marginal sampling algorithms based on a random truncation method which enables a principled treatment of infinite state spaces. Extensive evaluation on a number of benchmark models shows that this approach achieves considerable savings compared to state of the art methods, retaining accuracy and fast convergence. We also present results on a synthetic biology data set showing the potential for practical usefulness of our work.
Affiliation(s)
- Jane Hillston
  - School of Informatics, University of Edinburgh, Edinburgh, UK
- Guido Sanguinetti
  - School of Informatics, University of Edinburgh, Edinburgh, UK
  - Synthetic and Systems Biology, University of Edinburgh, Edinburgh, UK

31
Koepke AA, Longini IM, Halloran ME, Wakefield J, Minin VN. Predictive modeling of cholera outbreaks in Bangladesh. Ann Appl Stat 2016; 10:575-595. [PMID: 27746850 PMCID: PMC5061460 DOI: 10.1214/16-aoas908] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 11/19/2022]
Abstract
Despite seasonal cholera outbreaks in Bangladesh, little is known about the relationship between environmental conditions and cholera cases. We seek to develop a predictive model for cholera outbreaks in Bangladesh based on environmental predictors. To do this, we estimate the contribution of environmental variables, such as water depth and water temperature, to cholera outbreaks in the context of a disease transmission model. We implement a method which simultaneously accounts for disease dynamics and environmental variables in a Susceptible-Infected-Recovered-Susceptible (SIRS) model. The entire system is treated as a continuous-time hidden Markov model, where the hidden Markov states are the numbers of people who are susceptible, infected, or recovered at each time point, and the observed states are the numbers of cholera cases reported. We use a Bayesian framework to fit this hidden SIRS model, implementing particle Markov chain Monte Carlo methods to sample from the posterior distribution of the environmental and transmission parameters given the observed data. We test this method using both simulation and data from Mathbaria, Bangladesh. Parameter estimates are used to make short-term predictions that capture the formation and decline of epidemic peaks. We demonstrate that our model can successfully predict an increase in the number of infected individuals in the population weeks before the observed number of cholera cases increases, which could allow for early notification of an epidemic and timely allocation of resources.

32
Golightly A, Wilkinson DJ. Bayesian inference for Markov jump processes with informative observations. Stat Appl Genet Mol Biol 2016; 14:169-88. [PMID: 25720091 DOI: 10.1515/sagmb-2014-0070] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 11/15/2022]
Abstract
In this paper we consider the problem of parameter inference for Markov jump process (MJP) representations of stochastic kinetic models. Since transition probabilities are intractable for most processes of interest yet forward simulation is straightforward, Bayesian inference typically proceeds through computationally intensive methods such as (particle) MCMC. Such methods ostensibly require the ability to simulate trajectories from the conditioned jump process. When observations are highly informative, use of the forward simulator is likely to be inefficient and may even preclude an exact (simulation based) analysis. We therefore propose three methods for improving the efficiency of simulating conditioned jump processes. A conditioned hazard is derived based on an approximation to the jump process, and used to generate end-point conditioned trajectories for use inside an importance sampling algorithm. We also adapt a recently proposed sequential Monte Carlo scheme to our problem. Essentially, trajectories are reweighted at a set of intermediate time points, with more weight assigned to trajectories that are consistent with the next observation. We consider two implementations of this approach, based on two continuous approximations of the MJP. We compare these constructs for a simple tractable jump process before using them to perform inference for a Lotka-Volterra system. The best performing construct is used to infer the parameters governing a simple model of motility regulation in Bacillus subtilis.

33
Hey KL, Momiji H, Featherstone K, Davis JRE, White MRH, Rand DA, Finkenstädt B. A stochastic transcriptional switch model for single cell imaging data. Biostatistics 2015; 16:655-69. [PMID: 25819987 PMCID: PMC4570576 DOI: 10.1093/biostatistics/kxv010] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 08/12/2014] [Accepted: 02/21/2015] [Indexed: 12/03/2022] Open
Abstract
Gene expression is made up of inherently stochastic processes within single cells and can be modeled through stochastic reaction networks (SRNs). In particular, SRNs capture the features of intrinsic variability arising from intracellular biochemical processes. We extend current models for gene expression to allow the transcriptional process within an SRN to follow a random step or switch function which may be estimated using reversible jump Markov chain Monte Carlo (MCMC). This stochastic switch model provides a generic framework to capture many different dynamic features observed in single cell gene expression. Inference for such SRNs is challenging due to the intractability of the transition densities. We derive a model-specific birth–death approximation and study its use for inference in comparison with the linear noise approximation where both approximations are considered within the unifying framework of state-space models. The methodology is applied to synthetic as well as experimental single cell imaging data measuring expression of the human prolactin gene in pituitary cells.
Affiliation(s)
- Kirsty L Hey, Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
- Hiroshi Momiji, Warwick Systems Biology, University of Warwick, Coventry CV4 7AL, UK
- Karen Featherstone, Centre for Endocrinology and Diabetes, University of Manchester, Manchester M13 9PT, UK
- Julian R E Davis, Centre for Endocrinology and Diabetes, University of Manchester, Manchester M13 9PT, UK
- Michael R H White, Systems Biology Centre, University of Manchester, Manchester M13 9PL, UK
- David A Rand, Warwick Systems Biology, University of Warwick, Coventry CV4 7AL, UK
|
34
|
Abstract
One of the greatest challenges in biology is to improve the understanding of the mechanisms which underpin aging and how these affect health. The need to better understand aging is amplified by demographic changes, which have caused a gradual increase in the global population of older people. Aging western populations have resulted in a rise in the prevalence of age-related pathologies. Of these diseases, cardiovascular disease is the most common underlying condition in older people. The dysregulation of lipid metabolism due to aging impinges significantly on cardiovascular health. However, the multifaceted nature of lipid metabolism and the complexities of its interaction with aging make it challenging to understand by conventional means. To address this challenge, computational modeling, a key component of the systems biology paradigm, is being used to study the dynamics of lipid metabolism. This mini-review briefly outlines the key regulators of lipid metabolism, their dysregulation, and how computational modeling is being used to gain increased insight into this system.
Affiliation(s)
- Mark T. Mc Auley, Faculty of Science and Engineering, Department of Chemical Engineering, Thornton Science Park, University of Chester, UK
- Kathleen M. Mooney, Faculty of Health and Social Care, Edge Hill University, Ormskirk, Lancashire, UK
|