1
Vandecasteele H, Samaey G. Pseudo-marginal approximation to the free energy in a micro-macro Markov chain Monte Carlo method. J Chem Phys 2024; 160:104702. [PMID: 38465681 DOI: 10.1063/5.0199562] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 02/13/2024] [Indexed: 03/12/2024] Open
Abstract
We introduce a generalized micro-macro Markov chain Monte Carlo (mM-MCMC) method with pseudo-marginal approximation to the free energy that is able to accelerate sampling of the microscopic Gibbs distributions when there is a time-scale separation between the macroscopic dynamics of a reaction coordinate and the remaining microscopic degrees of freedom. The mM-MCMC method attains this efficiency by iterating four steps: (i) propose a new value of the reaction coordinate, (ii) accept or reject the macroscopic sample, (iii) run a biased simulation that creates a microscopic molecular instance that lies close to the newly sampled macroscopic reaction coordinate value, and (iv) microscopic accept/reject step for the new microscopic sample. In the present paper, we eliminate the main computational bottleneck of earlier versions of this method: the necessity to have an accurate approximation of free energy. We show that the introduction of a pseudo-marginal approximation significantly reduces the computational cost of the microscopic accept/reject step while still providing unbiased samples. We illustrate the method's behavior on several molecular systems with low-dimensional reaction coordinates.
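The four steps listed in the abstract can be illustrated with a minimal two-stage sketch. It is a simplification and not the authors' implementation: the biased simulation of step (iii) is replaced by a direct draw of the remaining coordinate, no pseudo-marginal estimate is used, and the toy target, the deliberately inexact macroscopic density `log_mu0`, the reconstruction variance `REC_VAR`, and all function names are assumptions made for illustration.

```python
import math
import random

# Toy target (assumed): x1 ~ N(0, 1) and x2 | x1 ~ N(0.5 * x1, 0.1).
# The reaction coordinate is taken to be z = x1.
def log_mu(x1, x2):
    return -0.5 * x1 ** 2 - (x2 - 0.5 * x1) ** 2 / (2 * 0.1)

# Approximate macroscopic log density, inexact on purpose (variance 1.2).
def log_mu0(z):
    return -0.5 * z ** 2 / 1.2

REC_VAR = 0.2  # variance of the (also inexact) reconstruction distribution

def log_nu(x2, z):
    # Log density, up to a constant, of reconstructing x2 given z.
    return -(x2 - 0.5 * z) ** 2 / (2 * REC_VAR)

def mm_mcmc(n_steps, step, seed=0):
    rng = random.Random(seed)
    x1, x2 = 0.0, 0.0
    samples = []
    for _ in range(n_steps):
        # (i) propose a new reaction coordinate value (symmetric random walk)
        z_new = x1 + step * rng.gauss(0.0, 1.0)
        # (ii) macroscopic accept/reject against the approximate density
        if math.log(1.0 - rng.random()) < log_mu0(z_new) - log_mu0(x1):
            # (iii) reconstruct a microscopic sample consistent with z_new
            x2_new = rng.gauss(0.5 * z_new, math.sqrt(REC_VAR))
            # (iv) microscopic accept/reject, correcting for both approximations
            log_ratio = (log_mu(z_new, x2_new) - log_mu(x1, x2)
                         + log_mu0(x1) - log_mu0(z_new)
                         + log_nu(x2, x1) - log_nu(x2_new, z_new))
            if math.log(1.0 - rng.random()) < log_ratio:
                x1, x2 = z_new, x2_new
        samples.append((x1, x2))
    return samples
```

Because the second stage divides out the macroscopic density and the reconstruction density, the chain still targets the exact microscopic distribution even though both approximations are wrong; their quality only affects the acceptance rate.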
Affiliation(s)
- Hannes Vandecasteele
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, 3400 N. Charles Street, Baltimore, Maryland 21218, USA
  - Department of Computer Science, KU Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium
- Giovanni Samaey
  - Department of Computer Science, KU Leuven, Celestijnenlaan 200A, 3001 Leuven, Belgium
2
Optimal scaling of MCMC beyond Metropolis. Adv Appl Probab 2022. [DOI: 10.1017/apr.2022.37] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022]
Abstract
The problem of optimally scaling the proposal distribution in a Markov chain Monte Carlo algorithm is critical to the quality of the generated samples. Much work has gone into obtaining such results for various Metropolis–Hastings (MH) algorithms. Recently, acceptance probabilities other than MH are being employed in problems with intractable target distributions. There are few resources available on tuning the Gaussian proposal distributions for this situation. We obtain optimal scaling results for a general class of acceptance functions, which includes Barker’s and lazy MH. In particular, optimal values for Barker’s algorithm are derived and found to be significantly different from that obtained for the MH algorithm. Our theoretical conclusions are supported by numerical simulations indicating that when the optimal proposal variance is unknown, tuning to the optimal acceptance probability remains an effective strategy.
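For orientation, the acceptance functions named in the abstract can be written as functions of the usual Metropolis–Hastings ratio `r`. This is a small sketch, not code from the paper; the function names and the laziness parameter `eps` are assumptions. Each function satisfies the balancing condition a(r) = r · a(1/r), which is what keeps the chain reversible with respect to the target.

```python
def mh_accept(r):
    # Metropolis-Hastings acceptance probability.
    return min(1.0, r)

def barker_accept(r):
    # Barker's acceptance probability; never larger than the MH one.
    return r / (1.0 + r)

def lazy_mh_accept(r, eps=0.5):
    # Lazy MH: with probability eps the chain stays put regardless of r.
    return (1.0 - eps) * min(1.0, r)
```

Since Barker's function accepts less often than MH for the same proposal, it is plausible that its optimal proposal scale and acceptance rate differ, which is the tuning question the paper addresses.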
3
Andrieu C, Lee A, Power S, Wang AQ. Comparison of Markov chains via weak Poincaré inequalities with application to pseudo-marginal MCMC. Ann Stat 2022. [DOI: 10.1214/22-aos2241] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022]
Affiliation(s)
- Anthony Lee
  - School of Mathematics, University of Bristol
- Sam Power
  - School of Mathematics, University of Bristol
4
Roy A, Shen L, Balasubramanian K, Ghadimi S. Stochastic zeroth-order discretizations of Langevin diffusions for Bayesian inference. Bernoulli 2022. [DOI: 10.3150/21-bej1400] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/19/2022]
Affiliation(s)
- Abhishek Roy
  - Department of Statistics, University of California, Davis, Davis, CA 95616, USA
- Lingqing Shen
  - Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA 15213
- Saeed Ghadimi
  - Department of Management Sciences, University of Waterloo, Waterloo, ON N2L 3G1, Canada
5
Sherlock C, Golightly A. Exact Bayesian inference for discretely observed Markov Jump Processes using finite rate matrices. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2093886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, UK
6
Frazier DT, Nott DJ, Drovandi C, Kohn R. Bayesian inference using synthetic likelihood: asymptotics and adjustments. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2086132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- David T. Frazier
  - Department of Econometrics and Business Statistics, Monash University, Clayton VIC 3800, Australia
  - Australian Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- David J. Nott
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore 117546
  - Operations Research and Analytics Cluster, National University of Singapore, Singapore 119077
- Christopher Drovandi
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane 4000, Australia
  - Australian Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Robert Kohn
  - Australian School of Business, School of Economics, University of New South Wales, Sydney NSW 2052, Australia
  - Australian Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
7
Persson S, Welkenhuysen N, Shashkova S, Wiqvist S, Reith P, Schmidt GW, Picchini U, Cvijovic M. Scalable and flexible inference framework for stochastic dynamic single-cell models. PLoS Comput Biol 2022; 18:e1010082. [PMID: 35588132 PMCID: PMC9159578 DOI: 10.1371/journal.pcbi.1010082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/22/2021] [Revised: 06/01/2022] [Accepted: 04/05/2022] [Indexed: 01/22/2023] Open
Abstract
Understanding the inherited nature of how biological processes dynamically change over time and exhibit intra- and inter-individual variability, due to the different responses to environmental stimuli and when interacting with other processes, has been a major focus of systems biology. The rise of single-cell fluorescent microscopy has enabled the study of those phenomena. The analysis of single-cell data with mechanistic models offers an invaluable tool to describe dynamic cellular processes and to rationalise cell-to-cell variability within the population. However, extracting mechanistic information from single-cell data has proven difficult. This requires statistical methods to infer unknown model parameters from dynamic, multi-individual data accounting for heterogeneity caused by both intrinsic (e.g. variations in chemical reactions) and extrinsic (e.g. variability in protein concentrations) noise. Although several inference methods exist, the availability of efficient, general and accessible methods that facilitate modelling of single-cell data, remains lacking. Here we present a scalable and flexible framework for Bayesian inference in state-space mixed-effects single-cell models with stochastic dynamic. Our approach infers model parameters when intrinsic noise is modelled by either exact or approximate stochastic simulators, and when extrinsic noise is modelled by either time-varying, or time-constant parameters that vary between cells. We demonstrate the relevance of our approach by studying how cell-to-cell variation in carbon source utilisation affects heterogeneity in the budding yeast Saccharomyces cerevisiae SNF1 nutrient sensing pathway. We identify hexokinase activity as a source of extrinsic noise and deduce that sugar availability dictates cell-to-cell variability.
Affiliation(s)
- Sebastian Persson
  - Department of Mathematical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
- Niek Welkenhuysen
  - Department of Mathematical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
- Sviatlana Shashkova
  - Department of Microbiology and Immunology, Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
  - Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Samuel Wiqvist
  - Centre for Mathematical Sciences, Lund University, Lund, Sweden
- Patrick Reith
  - Department of Mathematical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
  - Department of Biology and Biological Engineering, Chalmers University of Technology, Gothenburg, Sweden
- Gregor W. Schmidt
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Umberto Picchini
  - Department of Mathematical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
- Marija Cvijovic
  - Department of Mathematical Sciences, University of Gothenburg, Gothenburg, Sweden
  - Department of Mathematical Sciences, Chalmers University of Technology, Gothenburg, Sweden
8
West B, Wood AJ, Ungar D. Computational Modeling of Glycan Processing in the Golgi for Investigating Changes in the Arrangements of Biosynthetic Enzymes. Methods Mol Biol 2022; 2370:209-222. [PMID: 34611871 DOI: 10.1007/978-1-0716-1685-7_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Modeling glycan biosynthesis is becoming increasingly important due to the far-reaching implications that glycosylation can exhibit, from pathologies to biopharmaceutical manufacturing. Here we describe a stochastic simulation approach, to overcome the deterministic nature of previous models, that aims to simulate the action of glycan modifying enzymes to produce a glycan profile. This is then coupled with an approximate Bayesian computation methodology to systematically fit to empirical data in order to determine which set of parameters adequately describes the organization of enzymes within the Golgi. The model is described in detail along with a proof of concept and therapeutic applications.
Affiliation(s)
- Ben West
  - Department of Biology, University of York, York, UK
- A Jamie Wood
  - Departments of Biology and Mathematics, University of York, York, UK
- Daniel Ungar
  - Department of Biology, University of York, York, UK
9
Sherlock C, Thiery AH, Golightly A. Efficiency of delayed-acceptance random walk Metropolis algorithms. Ann Stat 2021. [DOI: 10.1214/21-aos2068] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University
- Alexandre H. Thiery
  - Department of Statistics and Applied Probability, National University of Singapore
- Andrew Golightly
  - School of Mathematics, Statistics and Physics, Newcastle University
10
Quiroz M, Tran MN, Villani M, Kohn R, Dang KD. The Block-Poisson Estimator for Optimally Tuned Exact Subsampling MCMC. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2021.1917420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Matias Quiroz
  - School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, Australia
  - Research Division, Sveriges Riksbank, Stockholm, Sweden
- Minh-Ngoc Tran
  - Discipline of Business Analytics, University of Sydney, Sydney, Australia
- Mattias Villani
  - Department of Statistics, Stockholm University, Stockholm, Sweden
  - Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Robert Kohn
  - School of Economics, UNSW Business School, University of New South Wales, Kensington, Australia
- Khue-Dung Dang
  - School of Mathematical and Physical Sciences, University of Technology Sydney, Ultimo, Australia
11
Efficient inference for stochastic differential equation mixed-effects models using correlated particle pseudo-marginal algorithms. Comput Stat Data Anal 2021. [DOI: 10.1016/j.csda.2020.107151] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/23/2022]
12
Sherlock C. Direct statistical inference for finite Markov jump processes via the matrix exponential. Comput Stat 2021; 36:2863-2887. [PMID: 33897113 PMCID: PMC8054858 DOI: 10.1007/s00180-021-01102-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/20/2020] [Accepted: 03/23/2021] [Indexed: 11/27/2022]
Abstract
Given noisy, partial observations of a time-homogeneous, finite-statespace Markov chain, conceptually simple, direct statistical inference is available, in theory, via its rate matrix, or infinitesimal generator, $\mathsf{Q}$, since $\exp(\mathsf{Q}t)$ is the transition matrix over time $t$. However, perhaps because of inadequate tools for matrix exponentiation in programming languages commonly used amongst statisticians or a belief that the necessary calculations are prohibitively expensive, statistical inference for continuous-time Markov chains with a large but finite state space is typically conducted via particle MCMC or other relatively complex inference schemes. When, as in many applications, $\mathsf{Q}$ arises from a reaction network, it is usually sparse. We describe variations on known algorithms which allow fast, robust and accurate evaluation of the product of a non-negative vector with the exponential of a large, sparse rate matrix. Our implementation uses relatively recently developed, efficient, linear algebra tools that take advantage of such sparsity. We demonstrate the straightforward statistical application of the key algorithm on a model for the mixing of two alleles in a population and on the Susceptible-Infectious-Removed epidemic model.
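One standard way to evaluate the product of a non-negative vector with the exponential of a rate matrix, without forming the exponential itself, is uniformization. The sketch below is a generic textbook version and is not claimed to be the algorithm of the paper; it uses dense Python lists rather than sparse storage, and the function name, `tol`, and `max_terms` are assumptions.

```python
import math

def expm_action(v, Q, t, tol=1e-12, max_terms=100000):
    """Return the row vector v * exp(Q t) for a rate matrix Q (list of rows)."""
    n = len(Q)
    rho = max(-Q[i][i] for i in range(n))  # uniformization rate
    if rho == 0.0 or t == 0.0:
        return list(v)
    # P = I + Q / rho is a stochastic matrix, and
    # exp(Q t) = sum_k Poisson(k; rho * t) * P^k.
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / rho for j in range(n)]
         for i in range(n)]
    weight = math.exp(-rho * t)  # Poisson probability of k = 0
    total = weight               # accumulated Poisson mass
    term = list(v)               # holds v * P^k
    result = [weight * x for x in term]
    for k in range(1, max_terms + 1):
        if 1.0 - total <= tol:   # the remaining Poisson mass bounds the error
            break
        term = [sum(term[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= rho * t / k
        total += weight
        result = [r + weight * x for r, x in zip(result, term)]
    return result
```

Every quantity in the sum is non-negative, so there is no cancellation; for large values of `rho * t` the leading factor `exp(-rho * t)` underflows, so a practical implementation would need rescaling or time-splitting.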
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, Lancaster, UK
13
Vogrinc J, Kendall WS. Counterexamples for optimal scaling of Metropolis–Hastings chains with rough target densities. Ann Appl Probab 2021. [DOI: 10.1214/20-aap1612] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
14
Vihola M, Helske J, Franks J. Importance sampling type estimators based on approximate marginal Markov chain Monte Carlo. Scand Stat Theory Appl 2020. [DOI: 10.1111/sjos.12492] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023]
Affiliation(s)
- Matti Vihola
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
- Jouni Helske
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
  - Department of Science and Technology, Linköping University, Sweden
- Jordan Franks
  - Department of Mathematics and Statistics, University of Jyväskylä, Finland
  - School of Mathematics, Statistics and Physics, Newcastle University, United Kingdom
15
Yang J, Roberts GO, Rosenthal JS. Optimal scaling of random-walk metropolis algorithms on general target distributions. Stoch Process Their Appl 2020. [DOI: 10.1016/j.spa.2020.05.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/25/2022]
16
Alahmadi A, Belet S, Black A, Cromer D, Flegg JA, House T, Jayasundara P, Keith JM, McCaw JM, Moss R, Ross JV, Shearer FM, Tun STT, Walker CR, White L, Whyte JM, Yan AWC, Zarebski AE. Influencing public health policy with data-informed mathematical models of infectious diseases: Recent developments and new challenges. Epidemics 2020; 32:100393. [PMID: 32674025 DOI: 10.1016/j.epidem.2020.100393] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/24/2019] [Accepted: 04/25/2020] [Indexed: 12/16/2022] Open
Abstract
Modern data and computational resources, coupled with algorithmic and theoretical advances to exploit these, allow disease dynamic models to be parameterised with increasing detail and accuracy. While this enhances models' usefulness in prediction and policy, major challenges remain. In particular, lack of identifiability of a model's parameters may limit the usefulness of the model. While lack of parameter identifiability may be resolved through incorporation into an inference procedure of prior knowledge, formulating such knowledge is often difficult. Furthermore, there are practical challenges associated with acquiring data of sufficient quantity and quality. Here, we discuss recent progress on these issues.
Affiliation(s)
- Amani Alahmadi
  - School of Mathematics, Faculty of Science, Monash University, Melbourne, Australia
- Sarah Belet
  - School of Mathematics, Faculty of Science, Monash University, Melbourne, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Andrew Black
  - School of Mathematical Sciences, University of Adelaide, Adelaide, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Deborah Cromer
  - Kirby Institute for Infection and Immunity, UNSW Sydney, Sydney, Australia and School of Mathematics and Statistics, UNSW Sydney, Sydney, Australia
- Jennifer A Flegg
  - School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia
- Thomas House
  - Department of Mathematics, University of Manchester, Manchester, UK; IBM Research, Hartree Centre, Sci-Tech Daresbury, Warrington, UK
- Jonathan M Keith
  - School of Mathematics, Faculty of Science, Monash University, Melbourne, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- James M McCaw
  - School of Mathematics and Statistics, University of Melbourne, Melbourne, Australia; Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia
- Robert Moss
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia
- Joshua V Ross
  - School of Mathematical Sciences, University of Adelaide, Adelaide, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Freya M Shearer
  - Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Melbourne, Australia
- Sai Thein Than Tun
  - Big Data Institute, Nuffield Department of Medicine, University of Oxford, UK
- Camelia R Walker
  - School of Mathematical Sciences, University of Adelaide, Adelaide, Australia
- Lisa White
  - Big Data Institute, Nuffield Department of Medicine, University of Oxford, UK
- Jason M Whyte
  - Centre of Excellence for Biosecurity Risk Analysis (CEBRA), School of BioSciences, University of Melbourne, Melbourne, Australia; Australian Research Council Centre of Excellence for Mathematical and Statistical Frontiers (ACEMS)
- Ada W C Yan
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, UK
17
Gaythorpe KA, Hamlet A, Cibrelus L, Garske T, Ferguson NM. The effect of climate change on yellow fever disease burden in Africa. eLife 2020; 9:55619. [PMID: 32718436 PMCID: PMC7386919 DOI: 10.7554/elife.55619] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 01/30/2020] [Accepted: 07/01/2020] [Indexed: 12/27/2022] Open
Abstract
Yellow Fever (YF) is an arbovirus endemic in tropical regions of South America and Africa and it is estimated to cause 78,000 deaths a year in Africa alone. Climate change may have substantial effects on the transmission of YF and we present the first analysis of the potential impact on disease burden. We extend an existing model of YF transmission to account for rainfall and a temperature suitability index and project transmission intensity across the African endemic region in the context of four climate change scenarios. We use these transmission projections to assess the change in burden in 2050 and 2070. We find disease burden changes heterogeneously across the region. In the least severe scenario, we find a 93.0% [95% CI (92.7, 93.2%)] chance that annual deaths will increase in 2050. This change in epidemiology will complicate future control efforts. Thus, we may need to consider the effect of changing climatic variables on future intervention strategies.
Affiliation(s)
- Tini Garske
  - Imperial College London, London, United Kingdom
18
Schmon SM, Deligiannidis G, Doucet A, Pitt MK. Large-sample asymptotics of the pseudo-marginal method. Biometrika 2020. [DOI: 10.1093/biomet/asaa044] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/14/2022] Open
Abstract
Summary
The pseudo-marginal algorithm is a variant of the Metropolis–Hastings algorithm which samples asymptotically from a probability distribution when it is only possible to estimate unbiasedly an unnormalized version of its density. Practically, one has to trade off the computational resources used to obtain this estimator against the asymptotic variances of the ergodic averages obtained by the pseudo-marginal algorithm. Recent works on optimizing this trade-off rely on some strong assumptions, which can cast doubts over their practical relevance. In particular, they all assume that the distribution of the difference between the log-density, and its estimate is independent of the parameter value at which it is evaluated. Under regularity conditions we show that as the number of data points tends to infinity, a space-rescaled version of the pseudo-marginal chain converges weakly to another pseudo-marginal chain for which this assumption indeed holds. A study of this limiting chain allows us to provide parameter dimension-dependent guidelines on how to optimally scale a normal random walk proposal, and the number of Monte Carlo samples for the pseudo-marginal method in the large-sample regime. These findings complement and validate currently available results.
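The mechanism described in the first sentence of this abstract can be sketched on a toy model. Everything below is an illustrative assumption, not taken from the paper: a standard normal prior on `theta`, a latent variable `u ~ N(theta, 1)`, an observation `y ~ N(u, 1)`, and a likelihood estimated by averaging over `m` latent draws.

```python
import math
import random

def norm_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def lik_hat(theta, y, m, rng):
    # Unbiased Monte Carlo estimate of p(y | theta) = E[N(y; u, 1)], u ~ N(theta, 1).
    return sum(norm_pdf(y, rng.gauss(theta, 1.0), 1.0) for _ in range(m)) / m

def pseudo_marginal_mh(y, n_iter, m, step, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    post_hat = norm_pdf(theta, 0.0, 1.0) * lik_hat(theta, y, m, rng)
    chain = []
    for _ in range(n_iter):
        prop = theta + step * rng.gauss(0.0, 1.0)  # symmetric random walk
        prop_hat = norm_pdf(prop, 0.0, 1.0) * lik_hat(prop, y, m, rng)
        # Accept with probability min(1, prop_hat / post_hat). The estimate
        # attached to the current state is reused and never recomputed.
        if rng.random() * post_hat < prop_hat:
            theta, post_hat = prop, prop_hat
        chain.append(theta)
    return chain
```

In this toy model the exact posterior for `y = 1` is N(1/3, 2/3), which gives a reference for checking the output. The number of draws `m` is the quantity whose cost-versus-variance trade-off the abstract discusses: a larger `m` makes each iteration more expensive but reduces the chance of the chain getting stuck on an overestimated density.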
Affiliation(s)
- S M Schmon
  - Department of Statistics, University of Oxford, 24–29 St Giles’, Oxford OX1 3LB, U.K.
- G Deligiannidis
  - Department of Statistics, University of Oxford, 24–29 St Giles’, Oxford OX1 3LB, U.K.
- A Doucet
  - Department of Statistics, University of Oxford, 24–29 St Giles’, Oxford OX1 3LB, U.K.
- M K Pitt
  - Department of Mathematics, King’s College London, Strand, London WC2R 2LS, U.K.
19
Modeling Glycan Processing Reveals Golgi-Enzyme Homeostasis upon Trafficking Defects and Cellular Differentiation. Cell Rep 2020; 27:1231-1243.e6. [PMID: 31018136 PMCID: PMC6486481 DOI: 10.1016/j.celrep.2019.03.107] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 10/10/2018] [Revised: 01/24/2019] [Accepted: 03/27/2019] [Indexed: 01/11/2023] Open
Abstract
The decoration of proteins by carbohydrates is essential for eukaryotic life yet heterogeneous due to a lack of biosynthetic templates. This complex carbohydrate mixture, the glycan profile, is generated in the compartmentalized Golgi, in which level and localization of glycosylation enzymes are key determinants. Here, we develop and validate a computational model for glycan biosynthesis to probe how the biosynthetic machinery creates different glycan profiles. We combined stochastic modeling with Bayesian fitting that enables rigorous comparison to experimental data despite starting with uncertain initial parameters. This is an important development in the field of glycan modeling, which revealed biological insights about the glycosylation machinery in altered cellular states. We experimentally validated changes in N-linked glycan-modifying enzymes in cells with perturbed intra-Golgi-enzyme sorting and the predicted glycan-branching activity during osteogenesis. Our model can provide detailed information on altered biosynthetic paths, with potential for advancing treatments for glycosylation-related diseases and glyco-engineering of cells.
- Developed a stochastic model of N-glycosylation coupled with Bayesian fitting
- Validated predicted changes of Golgi organization in trafficking mutants
- Model pinpointed functionally relevant glycan alterations in osteogenesis
20
Vihola M, Franks J. On the use of approximate Bayesian computation Markov chain Monte Carlo with inflated tolerance and post-correction. Biometrika 2020. [DOI: 10.1093/biomet/asz078] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
Summary
Approximate Bayesian computation enables inference for complicated probabilistic models with intractable likelihoods using model simulations. The Markov chain Monte Carlo implementation of approximate Bayesian computation is often sensitive to the tolerance parameter: low tolerance leads to poor mixing and large tolerance entails excess bias. We propose an approach that involves using a relatively large tolerance for the Markov chain Monte Carlo sampler to ensure sufficient mixing and post-processing the output, leading to estimators for a range of finer tolerances. We introduce an approximate confidence interval for the related post-corrected estimators and propose an adaptive approximate Bayesian computation Markov chain Monte Carlo algorithm, which finds a balanced tolerance level automatically based on acceptance rate optimization. Our experiments show that post-processing-based estimators can perform better than direct Markov chain Monte Carlo targeting a fine tolerance, that our confidence intervals are reliable, and that our adaptive algorithm leads to reliable inference with little user specification.
Affiliation(s)
- Matti Vihola
  - Department of Mathematics and Statistics, University of Jyväskylä, P.O. Box 35, FI-40014 University of Jyväskylä, Finland
- Jordan Franks
  - Department of Mathematics and Statistics, University of Jyväskylä, P.O. Box 35, FI-40014 University of Jyväskylä, Finland
21
HyperTraPS: Inferring Probabilistic Patterns of Trait Acquisition in Evolutionary and Disease Progression Pathways. Cell Syst 2020; 10:39-51.e10. [PMID: 31786211 DOI: 10.1016/j.cels.2019.10.009] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 09/03/2018] [Revised: 08/23/2019] [Accepted: 10/26/2019] [Indexed: 01/15/2023]
Abstract
The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalizable statistical platform to infer the dynamic pathways by which many, potentially interacting, traits are acquired or lost over time. We use HyperTraPS (hypercubic transition path sampling) to efficiently learn progression pathways from cross-sectional, longitudinal, or phylogenetically linked data, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. This Bayesian approach allows inclusion of prior knowledge, quantifies uncertainty in pathway structure, and allows predictions, such as which symptom a patient will acquire next. We provide visualization tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.
22
Blath J, Buzzoni E, Koskela J, Wilke Berenguer M. Statistical tools for seed bank detection. Theor Popul Biol 2020; 132:1-15. [PMID: 31945384 DOI: 10.1016/j.tpb.2020.01.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/09/2019] [Revised: 12/24/2019] [Accepted: 01/02/2020] [Indexed: 10/25/2022]
Abstract
We derive statistical tools to analyze the patterns of genetic variability produced by models related to seed banks; in particular the Kingman coalescent, its time-changed counterpart describing so-called weak seed banks, the strong seed bank coalescent, and the two-island structured coalescent. As (strong) seed banks stratify a population, we expect them to produce a signal comparable to population structure. We present tractable formulas for Wright's $F_{ST}$ and the expected site frequency spectrum for these models, and show that they can distinguish between some models for certain ranges of parameters. We then use pseudo-marginal MCMC to show that the full likelihood can reliably distinguish between all models in the presence of parameter uncertainty under moderate stratification, and point out statistical pitfalls arising from stratification that is either too strong or too weak. We further show that it is possible to infer parameters, and in particular determine whether mutation is taking place in the (strong) seed bank.
Affiliation(s)
- Jochen Blath
  - Institut für Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany.
- Eugenio Buzzoni
  - Institut für Mathematik, Technische Universität Berlin, Straße des 17. Juni 136, 10623 Berlin, Germany.
- Jere Koskela
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, UK.
- Maite Wilke Berenguer
  - Fakultät für Mathematik, Ruhr-Universität Bochum, Universitätsstraße 150, 44801 Bochum, Germany.
|
23
|
Golightly A, Bradley E, Lowe T, Gillespie CS. Correlated pseudo-marginal schemes for time-discretised stochastic kinetic models. Comput Stat Data Anal 2019. [DOI: 10.1016/j.csda.2019.01.006] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
|
24
|
Abstract
Summary
We consider the problem of approximating the product of $n$ expectations with respect to a common probability distribution $\mu$. Such products routinely arise in statistics as values of the likelihood in latent variable models. Motivated by pseudo-marginal Markov chain Monte Carlo schemes, we focus on unbiased estimators of such products. The standard approach is to sample $N$ particles from $\mu$ and assign each particle to one of the expectations; this is wasteful and typically requires the number of particles to grow quadratically with the number of expectations. We propose an alternative estimator that approximates each expectation using most of the particles while preserving unbiasedness, which is computationally more efficient when the cost of simulations greatly exceeds the cost of likelihood evaluations. We carefully study the properties of our proposed estimator, showing that in latent variable contexts it needs only $O(n)$ particles to match the performance of the standard approach with $O(n^{2})$ particles. We demonstrate the procedure on two latent variable examples from approximate Bayesian computation and single-cell gene expression analysis, observing computational gains by factors of about 25 and 450, respectively.
Affiliation(s)
- A Lee
  - School of Mathematics, University of Bristol, University Walk, Bristol BS8 1TW, UK
- S Tiberi
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
- G Zanella
  - Department of Decision Sciences, BIDSA and IGIER, Bocconi University, Via Roentgen 1, 20136 Milan, Italy
|
25
|
Picchini U, Forman JL. Bayesian inference for stochastic differential equation mixed effects models of a tumour xenography study. J R Stat Soc Ser C Appl Stat 2019. [DOI: 10.1111/rssc.12347] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Affiliation(s)
- Umberto Picchini
  - Chalmers University of Technology and University of Gothenburg and Lund University, Sweden
|
26
|
Picchini U. Likelihood-free stochastic approximation EM for inference in complex models. Commun Stat Simul Comput 2019. [DOI: 10.1080/03610918.2017.1401082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Umberto Picchini
  - Centre for Mathematical Sciences, Lund University, Sölvegatan 18, Lund, Sweden
|
27
|
Quiroz M, Villani M, Kohn R, Tran MN, Dang KD. Subsampling MCMC - an Introduction for the Survey Statistician. Sankhya A 2018. [DOI: 10.1007/s13171-018-0153-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
|
28
|
|
29
|
Deligiannidis G, Doucet A, Pitt MK. The correlated pseudomarginal method. J R Stat Soc Series B Stat Methodol 2018. [DOI: 10.1111/rssb.12280] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 11/26/2022]
|
30
|
Affiliation(s)
- Matias Quiroz
  - Australian School of Business, University of New South Wales, Sydney, Australia
- Robert Kohn
  - Australian School of Business, University of New South Wales, Sydney, Australia
- Mattias Villani
  - Division of Statistics and Machine Learning, Department of Computer and Information Science, Linköping University, Linköping, Sweden
- Minh-Ngoc Tran
  - Discipline of Business Analytics, University of Sydney, Sydney, Australia
|
31
|
Alzahrani N, Neal P, Spencer SE, McKinley TJ, Touloupou P. Model selection for time series of count data. Comput Stat Data Anal 2018. [DOI: 10.1016/j.csda.2018.01.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/25/2022]
|
32
|
Andrieu C, Lee A, Vihola M. Uniform ergodicity of the iterated conditional SMC and geometric ergodicity of particle Gibbs samplers. Bernoulli 2018. [DOI: 10.3150/15-bej785] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 11/19/2022]
|
33
|
McKinley TJ, Vernon I, Andrianakis I, McCreesh N, Oakley JE, Nsubuga RN, Goldstein M, White RG. Approximate Bayesian Computation and Simulation-Based Inference for Complex Stochastic Epidemic Models. Stat Sci 2018. [DOI: 10.1214/17-sts618] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 11/19/2022]
|
34
|
|
35
|
Coupling stochastic EM and approximate Bayesian computation for parameter inference in state-space models. Comput Stat 2017. [DOI: 10.1007/s00180-017-0770-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/18/2022]
|
36
|
Quiroz M, Tran MN, Villani M, Kohn R. Speeding up MCMC by Delayed Acceptance and Data Subsampling. J Comput Graph Stat 2017. [DOI: 10.1080/10618600.2017.1307117] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 10/19/2022]
Affiliation(s)
- Matias Quiroz
  - Division of Statistics and Machine Learning, Linköping University, Linköping, Sweden
  - Research Division, Sveriges Riksbank, Stockholm, Sweden
- Minh-Ngoc Tran
  - Discipline of Business Analytics, University of Sydney, Camperdown NSW, Australia
- Mattias Villani
  - Division of Statistics and Machine Learning, Linköping University, Linköping, Sweden
- Robert Kohn
  - Australian School of Business, University of New South Wales, Sydney NSW, Australia
|
37
|
Affiliation(s)
- Anthony Lee
  - Department of Statistics, University of Warwick, Coventry, UK
|
38
|
Sherlock C, Thiery AH, Lee A. Pseudo-marginal Metropolis–Hastings sampling using averages of unbiased estimators. Biometrika 2017. [DOI: 10.1093/biomet/asx031] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/12/2022] Open
Abstract
Summary
We consider a pseudo-marginal Metropolis–Hastings kernel ${\mathbb{P}}_m$ that is constructed using an average of $m$ exchangeable random variables, and an analogous kernel ${\mathbb{P}}_s$ that averages $s<m$ of these same random variables. Using an embedding technique to facilitate comparisons, we provide a lower bound for the asymptotic variance of any ergodic average associated with ${\mathbb{P}}_m$ in terms of the asymptotic variance of the corresponding ergodic average associated with ${\mathbb{P}}_s$. We show that the bound is tight and disprove a conjecture that when the random variables to be averaged are independent, the asymptotic variance under ${\mathbb{P}}_m$ is never less than $s/m$ times the variance under ${\mathbb{P}}_s$. The conjecture does, however, hold for continuous-time Markov chains. These results imply that if the computational cost of the algorithm is proportional to $m$, it is often better to set $m=1$. We provide intuition as to why these findings differ so markedly from recent results for pseudo-marginal kernels employing particle filter approximations. Our results are exemplified through two simulation studies; in the first the computational cost is effectively proportional to $m$ and in the second there is a considerable start-up cost at each iteration.
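As an illustration of the kind of kernel this abstract discusses, here is a minimal sketch of a pseudo-marginal Metropolis–Hastings chain whose density estimate is an average of $m$ unbiased estimators. It is not the authors' code. The standard normal target, the log-normal noise model, and the names `log_target`, `log_estimate`, and `pseudo_marginal_mh` are all assumptions made for this example.

```python
import math
import random


def log_target(x):
    # Unnormalised log-density of a standard normal (assumed toy target).
    return -0.5 * x * x


def log_estimate(x, m, sigma, rng):
    # Log of an unbiased estimate of the target density at x: the exact
    # density times the average of m log-normal weights with mean 1.
    weights = [
        math.exp(sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma * sigma)
        for _ in range(m)
    ]
    return log_target(x) + math.log(sum(weights) / m)


def pseudo_marginal_mh(n_iter, m=1, sigma=0.5, step=2.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    log_hat = log_estimate(x, m, sigma, rng)
    samples, accepted = [], 0
    for _ in range(n_iter):
        # Symmetric random-walk proposal and a fresh estimate at the proposal.
        y = x + step * rng.gauss(0.0, 1.0)
        log_hat_y = log_estimate(y, m, sigma, rng)
        # The estimate at the current state is reused, never refreshed;
        # this is what keeps the exact target invariant.
        if math.log(1.0 - rng.random()) < log_hat_y - log_hat:
            x, log_hat = y, log_hat_y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_iter
```

In this sketch the cost per iteration is proportional to `m`, which is the setting in which the abstract reports that `m=1` is often the better choice; comparing the acceptance rate returned for a few values of `m` against the extra cost is one way to explore that trade-off on the toy target.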
Affiliation(s)
- Chris Sherlock
  - Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF, U.K.
- Alexandre H. Thiery
  - Department of Statistics and Applied Probability, National University of Singapore, Singapore 117543
- Anthony Lee
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, U.K.
|
39
|
Mingas G, Bottolo L, Bouganis CS. Particle MCMC algorithms and architectures for accelerating inference in state-space models. Int J Approx Reason 2017; 83:413-433. [PMID: 28373744 PMCID: PMC5362159 DOI: 10.1016/j.ijar.2016.10.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Abstract
Novel algorithmic and hardware techniques for fast SSM inference are proposed. New algorithm extends applicability of particle MCMC to multi-modal posteriors. FPGA architectures exploit particle and chain parallelism to accelerate sampling. 42x speedup vs. state-of-the-art CPU/GPU samplers is achieved for large problems.
Particle Markov Chain Monte Carlo (pMCMC) is a stochastic algorithm designed to generate samples from a probability distribution, when the density of the distribution does not admit a closed form expression. pMCMC is most commonly used to sample from the Bayesian posterior distribution in State-Space Models (SSMs), a class of probabilistic models used in numerous scientific applications. Nevertheless, this task is prohibitive when dealing with complex SSMs with massive data, due to the high computational cost of pMCMC and its poor performance when the posterior exhibits multi-modality. This paper aims to address both issues by: 1) Proposing a novel pMCMC algorithm (denoted ppMCMC), which uses multiple Markov chains (instead of the one used by pMCMC) to improve sampling efficiency for multi-modal posteriors, 2) Introducing custom, parallel hardware architectures, which are tailored for pMCMC and ppMCMC. The architectures are implemented on Field Programmable Gate Arrays (FPGAs), a type of hardware accelerator with massive parallelization capabilities. The new algorithm and the two FPGA architectures are evaluated using a large-scale case study from genetics. Results indicate that ppMCMC achieves 1.96x higher sampling efficiency than pMCMC when using sequential CPU implementations. The FPGA architecture of pMCMC is 12.1x and 10.1x faster than state-of-the-art, parallel CPU and GPU implementations of pMCMC and up to 53x more energy efficient; the FPGA architecture of ppMCMC increases these speedups to 34.9x and 41.8x respectively and is 173x more power efficient, bringing previously intractable SSM-based data analyses within reach.
Affiliation(s)
- Grigorios Mingas
  - Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, UK
- Leonardo Bottolo
  - Department of Medical Genetics, University of Cambridge, Cambridge, CB2 0QQ, UK
- Christos-Savvas Bouganis
  - Department of Electrical and Electronic Engineering, Imperial College London, London, SW7 2AZ, UK
|
40
|
Gwiazda P, Miasojedow B, Rosińska M. Bayesian inference for age-structured population model of infectious disease with application to varicella in Poland. J Theor Biol 2016; 407:38-50. [PMID: 27396357 DOI: 10.1016/j.jtbi.2016.07.008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/29/2016] [Revised: 06/20/2016] [Accepted: 07/05/2016] [Indexed: 10/21/2022]
Abstract
The dynamics of the infectious disease transmission are often best understood by taking into account the structure of population with respect to specific features, for example age or immunity level. The practical utility of such models depends on the appropriate calibration with the observed data. Here, we discuss the Bayesian approach to data assimilation in the case of a two-state age-structured model. Such models are frequently used to explore the disease dynamics (i.e. force of infection) based on prevalence data collected at several time points. We demonstrate that, in the case when the explicit solution to the model equation is known, accounting for the data collection process in the Bayesian framework allows us to obtain an unbiased posterior distribution for the parameters determining the force of infection. We further show analytically and through numerical tests that the posterior distribution of these parameters is stable with respect to a cohort approximation (Escalator Boxcar Train) of the solution. Finally, we apply the technique to calibrate the model based on observed sero-prevalence of varicella in Poland.
Affiliation(s)
- Piotr Gwiazda
  - Institute of Mathematics, Polish Academy of Sciences, Poland; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Poland
- Błażej Miasojedow
  - Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Poland
- Magdalena Rosińska
  - National Institute of Public Health - National Institute of Hygiene, Warsaw, Poland
|
41
|
Andrieu C, Vihola M. Establishing some order amongst exact approximations of MCMCs. Ann Appl Probab 2016. [DOI: 10.1214/15-aap1158] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/19/2022]
|
42
|
|
43
|
Packwood DM, Han P, Hitosugi T. State-space reduction and equivalence class sampling for a molecular self-assembly model. R Soc Open Sci 2016; 3:150681. [PMID: 27493765 PMCID: PMC4968457 DOI: 10.1098/rsos.150681] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/10/2015] [Accepted: 06/14/2016] [Indexed: 06/06/2023]
Abstract
Direct simulation of a model with a large state space will generate enormous volumes of data, much of which is not relevant to the questions under study. In this paper, we consider a molecular self-assembly model as a typical example of a large state-space model, and present a method for selectively retrieving 'target information' from this model. This method partitions the state space into equivalence classes, as identified by an appropriate equivalence relation. The set of equivalence classes H, which serves as a reduced state space, contains none of the superfluous information of the original model. After construction and characterization of a Markov chain with state space H, the target information is efficiently retrieved via Markov chain Monte Carlo sampling. This approach represents a new breed of simulation techniques which are highly optimized for studying molecular self-assembly and, moreover, serves as a valuable guideline for analysis of other large state-space models.
Affiliation(s)
- Daniel M. Packwood
  - Advanced Institute for Materials Research (AIMR), Tohoku University, Sendai 980-8577, Japan
  - Japan Science and Technology Agency (PRESTO), Kawaguchi, Saitama 332-0012, Japan
- Patrick Han
  - Advanced Institute for Materials Research (AIMR), Tohoku University, Sendai 980-8577, Japan
  - California Nanosystems Institute, University of California, Los Angeles, CA 90095, USA
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
  - Department of Materials Science and Engineering, University of California, Los Angeles, CA 90095, USA
- Taro Hitosugi
  - Advanced Institute for Materials Research (AIMR), Tohoku University, Sendai 980-8577, Japan
  - Department of Applied Chemistry, Graduate School of Science and Engineering, Tokyo Institute of Technology, Tokyo 152-8352, Japan
|
44
|
Georgoulas A, Hillston J, Sanguinetti G. Unbiased Bayesian inference for population Markov jump processes via random truncations. Stat Comput 2016; 27:991-1002. [PMID: 28690370 PMCID: PMC5477715 DOI: 10.1007/s11222-016-9667-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 09/28/2015] [Accepted: 05/02/2016] [Indexed: 05/24/2023]
Abstract
We consider continuous time Markovian processes where populations of individual agents interact stochastically according to kinetic rules. Despite the increasing prominence of such models in fields ranging from biology to smart cities, Bayesian inference for such systems remains challenging, as these are continuous time, discrete state systems with potentially infinite state-space. Here we propose a novel efficient algorithm for joint state/parameter posterior sampling in population Markov Jump processes. We introduce a class of pseudo-marginal sampling algorithms based on a random truncation method which enables a principled treatment of infinite state spaces. Extensive evaluation on a number of benchmark models shows that this approach achieves considerable savings compared to state of the art methods, retaining accuracy and fast convergence. We also present results on a synthetic biology data set showing the potential for practical usefulness of our work.
Affiliation(s)
- Jane Hillston
  - School of Informatics, University of Edinburgh, Edinburgh, UK
- Guido Sanguinetti
  - School of Informatics, University of Edinburgh, Edinburgh, UK
  - Synthetic and Systems Biology, University of Edinburgh, Edinburgh, UK
|
45
|
Fasiolo M, Pya N, Wood SN. A Comparison of Inferential Methods for Highly Nonlinear State Space Models in Ecology and Epidemiology. Stat Sci 2016. [DOI: 10.1214/15-sts534] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/19/2022]
|
46
|
Medina-Aguayo FJ, Lee A, Roberts GO. Stability of noisy Metropolis-Hastings. Stat Comput 2015; 26:1187-1211. [PMID: 32055107 PMCID: PMC6991990 DOI: 10.1007/s11222-015-9604-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/23/2015] [Accepted: 10/15/2015] [Indexed: 06/10/2023]
Abstract
Pseudo-marginal Markov chain Monte Carlo methods for sampling from intractable distributions have gained recent interest and have been theoretically studied in considerable depth. Their main appeal is that they are exact, in the sense that they target marginally the correct invariant distribution. However, the pseudo-marginal Markov chain can exhibit poor mixing and slow convergence towards its target. As an alternative, a subtly different Markov chain can be simulated, where better mixing is possible but the exactness property is sacrificed. This is the noisy algorithm, initially conceptualised as Monte Carlo within Metropolis, which has also been studied but to a lesser extent. The present article provides a further characterisation of the noisy algorithm, with a focus on fundamental stability properties like positive recurrence and geometric ergodicity. Sufficient conditions for inheriting geometric ergodicity from a standard Metropolis-Hastings chain are given, as well as convergence of the invariant distribution towards the true target distribution.
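The "subtly different" chain mentioned in this abstract can be illustrated with a minimal sketch, not taken from the paper. In the pseudo-marginal chain the density estimate at the current state is recycled, while in the noisy (Monte Carlo within Metropolis) chain it is redrawn at every iteration. The toy standard normal target, the log-normal noise, and the names `log_noise`, `run_chain`, and `refresh_current` are assumptions made for this example.

```python
import math
import random


def log_target(x):
    # Unnormalised log-density of a standard normal (assumed toy target).
    return -0.5 * x * x


def log_noise(sigma, rng):
    # Log of a log-normal weight with mean 1, so exp(log_target + log_noise)
    # is an unbiased estimate of the unnormalised density.
    return sigma * rng.gauss(0.0, 1.0) - 0.5 * sigma * sigma


def run_chain(n_iter, refresh_current, sigma=0.7, step=2.0, seed=0):
    rng = random.Random(seed)
    x = 0.0
    log_hat_x = log_target(x) + log_noise(sigma, rng)
    samples = []
    for _ in range(n_iter):
        if refresh_current:
            # Noisy variant: redraw the estimate at the current state.
            # Exactness is lost, but the chain cannot get stuck on a
            # lucky overestimate.
            log_hat_x = log_target(x) + log_noise(sigma, rng)
        y = x + step * rng.gauss(0.0, 1.0)
        log_hat_y = log_target(y) + log_noise(sigma, rng)
        if math.log(1.0 - rng.random()) < log_hat_y - log_hat_x:
            # Pseudo-marginal variant: the accepted estimate is stored and
            # reused until the next acceptance.
            x, log_hat_x = y, log_hat_y
        samples.append(x)
    return samples
```

Calling `run_chain(n, refresh_current=False)` gives the pseudo-marginal chain and `run_chain(n, refresh_current=True)` the noisy one; the two differ only in the single refreshed line, which is why one targets the correct invariant distribution and the other only an approximation to it.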
Affiliation(s)
- A. Lee
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
- G. O. Roberts
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, UK
|
47
|
Lyne AM, Girolami M, Atchadé Y, Strathmann H, Simpson D. On Russian Roulette Estimates for Bayesian Inference with Doubly-Intractable Likelihoods. Stat Sci 2015. [DOI: 10.1214/15-sts523] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Indexed: 11/19/2022]
|
48
|
Sherlock C. Optimal Scaling for the Pseudo-Marginal Random Walk Metropolis: Insensitivity to the Noise Generating Mechanism. Methodol Comput Appl Probab 2015. [DOI: 10.1007/s11009-015-9471-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]
|
49
|
Kantas N, Doucet A, Singh SS, Maciejowski J, Chopin N. On Particle Methods for Parameter Estimation in State-Space Models. Stat Sci 2015. [DOI: 10.1214/14-sts511] [Citation(s) in RCA: 231] [Impact Index Per Article: 25.7] [Indexed: 11/19/2022]
|
50
|
Bladt M, Finch S, Sørensen M. Simulation of multivariate diffusion bridges. J R Stat Soc Series B Stat Methodol 2015. [DOI: 10.1111/rssb.12118] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
|