1. Huang J, Morsomme R, Dunson D, Xu J. Detecting changes in the transmission rate of a stochastic epidemic model. Stat Med 2024; 43:1867-1882. [PMID: 38409877 DOI: 10.1002/sim.10050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 10/26/2023] [Accepted: 01/03/2024] [Indexed: 02/28/2024]
Abstract
Throughout the course of an epidemic, the rate at which disease spreads varies with behavioral changes, the emergence of new disease variants, and the introduction of mitigation policies. Estimating such changes in transmission rates can help us better model and predict the dynamics of an epidemic, and provide insight into the efficacy of control and intervention strategies. We present a method for likelihood-based estimation of parameters in the stochastic susceptible-infected-removed model under a time-inhomogeneous transmission rate composed of piecewise constant components. In doing so, our method simultaneously learns change points in the transmission rate via a Markov chain Monte Carlo algorithm. The method targets the exact model posterior in a difficult missing data setting given only partially observed case counts over time. We validate performance on simulated data before applying our approach to data from an Ebola outbreak in Western Africa and a COVID-19 outbreak on a university campus.
Affiliation(s)
- Jenny Huang
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
- Raphaël Morsomme
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
- David Dunson
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
- Jason Xu
  - Department of Statistical Science, Duke University, Durham, North Carolina, USA
2. Liang X, Livingstone S, Griffin J. Adaptive MCMC for Bayesian Variable Selection in Generalised Linear Models and Survival Models. Entropy (Basel, Switzerland) 2023; 25:1310. [PMID: 37761609 PMCID: PMC10528396 DOI: 10.3390/e25091310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 08/30/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023]
Abstract
Developing an efficient computational scheme for high-dimensional Bayesian variable selection in generalised linear models and survival models has always been a challenging problem due to the absence of closed-form solutions to the marginal likelihood. The Reversible Jump Markov Chain Monte Carlo (RJMCMC) approach can be employed to jointly sample models and coefficients, but the effective design of the trans-dimensional jumps of RJMCMC can be challenging, making it hard to implement. Alternatively, the marginal likelihood can be derived conditional on latent variables using a data-augmentation scheme (e.g., Pólya-gamma data augmentation for logistic regression) or using other estimation methods. However, suitable data-augmentation schemes are not available for every generalised linear model and survival model, and estimating the marginal likelihood using a Laplace approximation or a correlated pseudo-marginal method can be computationally expensive. In this paper, three main contributions are presented. Firstly, we present an extended Point-wise implementation of Adaptive Random Neighbourhood Informed proposal (PARNI) to efficiently sample models directly from the marginal posterior distributions of generalised linear models and survival models. Secondly, in light of the recently proposed approximate Laplace approximation, we describe an efficient and accurate estimation method for marginal likelihood that involves adaptive parameters. Additionally, we describe a new method to adapt the algorithmic tuning parameters of the PARNI proposal by replacing Rao-Blackwellised estimates with the combination of a warm-start estimate and the ergodic average. We present numerous numerical results from simulated data and eight high-dimensional genetic mapping data-sets to showcase the efficiency of the novel PARNI proposal compared with the baseline add-delete-swap proposal.
Affiliation(s)
- Xitong Liang
  - Department of Statistical Science, University College London, London WC1E 6BT, UK; (S.L.); (J.G.)
3. Karhunen V, Launonen I, Järvelin MR, Sebert S, Sillanpää MJ. Genetic fine-mapping from summary data using a nonlocal prior improves the detection of multiple causal variants. Bioinformatics 2023; 39:btad396. [PMID: 37348543 PMCID: PMC10326304 DOI: 10.1093/bioinformatics/btad396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2022] [Revised: 06/09/2023] [Accepted: 06/20/2023] [Indexed: 06/24/2023] Open
Abstract
MOTIVATION: Genome-wide association studies (GWAS) have been successful in identifying genomic loci associated with complex traits. Genetic fine-mapping aims to detect independent causal variants from the GWAS-identified loci, adjusting for linkage disequilibrium patterns. RESULTS: We present "FiniMOM" (fine-mapping using a product inverse-moment prior), a novel Bayesian fine-mapping method for summarized genetic associations. For causal effects, the method uses a nonlocal inverse-moment prior, which is a natural prior distribution to model non-null effects in finite samples. A beta-binomial prior is set for the number of causal variants, with a parameterization that can be used to control for potential misspecifications in the linkage disequilibrium reference. The results of simulation studies designed to mimic a typical GWAS on circulating protein levels show improved credible set coverage and power of the proposed method over the current state-of-the-art fine-mapping method SuSiE, especially in the case of multiple causal variants within a locus. AVAILABILITY AND IMPLEMENTATION: https://vkarhune.github.io/finimom/.
Affiliation(s)
- Ville Karhunen
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O.Box 8000, FI-90014, Finland
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
- Ilkka Launonen
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O.Box 8000, FI-90014, Finland
- Marjo-Riitta Järvelin
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
  - Department of Epidemiology and Biostatistics, Imperial College London, London, United Kingdom
  - Department of Life Sciences, College of Health and Life Sciences, Brunel University, London, United Kingdom
- Sylvain Sebert
  - Research Unit of Population Health, University of Oulu, Oulu, Finland
- Mikko J Sillanpää
  - Research Unit of Mathematical Sciences, University of Oulu, Oulu, P.O.Box 8000, FI-90014, Finland
4. De Blasi P, Gil–Leyva MF. Gibbs sampling for mixtures in order of appearance: the ordered allocation sampler. J Comput Graph Stat 2023. [DOI: 10.1080/10618600.2023.2177298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Affiliation(s)
- Pierpaolo De Blasi
  - Collegio Carlo Alberto and ESOMAS Department, University of Torino, C.so Unione Sovietica 218/bis, 10134, Torino, Italy
- María F. Gil–Leyva
  - Department of Probability and Statistics, IIMAS–UNAM, Escolar 3000, C.U., 04510, CDMX, México
5. Zhou Q, Yang J, Vats D, Roberts GO, Rosenthal JS. Dimension‐free mixing for high‐dimensional Bayesian variable selection. J R Stat Soc Series B Stat Methodol 2022. [DOI: 10.1111/rssb.12546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Affiliation(s)
- Quan Zhou
  - Department of Statistics, Texas A&M University, College Station, Texas, USA
- Jun Yang
  - Department of Statistics, University of Oxford, Oxford, UK
- Dootika Vats
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur, India
6. Geels V, Pratola MT, Herbei R. The taxicab sampler: MCMC for discrete spaces with application to tree models. J Stat Comput Sim 2022. [DOI: 10.1080/00949655.2022.2119972] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- Vincent Geels
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
- Radu Herbei
  - Department of Statistics, The Ohio State University, Columbus, OH, USA
7. Markov Chain Monte Carlo for generating ranked textual data. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.07.137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
8. van den Boom W, Beskos A, De Iorio M. The G-Wishart Weighted Proposal Algorithm: Efficient Posterior Computation for Gaussian Graphical Models. J Comput Graph Stat 2022. [DOI: 10.1080/10618600.2022.2050250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Alexandros Beskos
  - Department of Statistical Science, University College London
  - Alan Turing Institute, UK
- Maria De Iorio
  - Yong Loo Lin School of Medicine, National University of Singapore
  - Department of Statistical Science, University College London
  - Singapore Institute for Clinical Sciences, A*STAR
9. Livingstone S, Zanella G. The Barker proposal: Combining robustness and efficiency in gradient‐based MCMC. J R Stat Soc Series B Stat Methodol 2022; 84:496-523. [PMID: 35910401 PMCID: PMC9303935 DOI: 10.1111/rssb.12482] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/05/2020] [Accepted: 10/14/2021] [Indexed: 12/02/2022]
Abstract
There is a tension between robustness and efficiency when designing Markov chain Monte Carlo (MCMC) sampling algorithms. Here we focus on robustness with respect to tuning parameters, showing that more sophisticated algorithms tend to be more sensitive to the choice of step‐size parameter and less robust to heterogeneity of the distribution of interest. We characterise this phenomenon by studying the behaviour of spectral gaps as an increasingly poor step‐size is chosen for the algorithm. Motivated by these considerations, we propose a novel and simple gradient‐based MCMC algorithm, inspired by the classical Barker accept‐reject rule, with improved robustness properties. Extensive theoretical results, dealing with robustness to tuning, geometric ergodicity and scaling with dimension, suggest that the novel scheme combines the robustness of simple schemes with the efficiency of gradient‐based ones. We show numerically that this type of robustness is particularly beneficial in the context of adaptive MCMC, giving examples where our proposed scheme significantly outperforms state‐of‐the‐art alternatives.
Affiliation(s)
- Giacomo Zanella
  - Department of Decision Sciences, BIDSA and IGIER, Bocconi University, Milan, Italy
10. Vats D, Gonçalves FB, Łatuszyński K, Roberts GO. Efficient Bernoulli factory Markov chain Monte Carlo for intractable posteriors. Biometrika 2021. [DOI: 10.1093/biomet/asab031] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open
Abstract
Summary
Accept-reject-based Markov chain Monte Carlo algorithms have traditionally utilized acceptance probabilities that can be explicitly written as a function of the ratio of the target density at the two contested points. This feature is rendered almost useless in Bayesian posteriors with unknown functional forms. We introduce a new family of Markov chain Monte Carlo acceptance probabilities that has the distinguishing feature of not being a function of the ratio of the target density at the two points. We present two stable Bernoulli factories that generate events within this class of acceptance probabilities. The efficiency of our methods relies on obtaining reasonable local upper or lower bounds on the target density, and we present two classes of problems where such bounds are viable: Bayesian inference for diffusions, and Markov chain Monte Carlo on constrained spaces. The resulting portkey Barker’s algorithms are exact and computationally more efficient than the current state of the art.
Affiliation(s)
- D Vats
  - Department of Mathematics and Statistics, Indian Institute of Technology Kanpur, Kanpur 208016, India
- F B Gonçalves
  - Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Minas Gerais, CEP 31270-901, Brazil
- K Łatuszyński
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, U.K.
- G O Roberts
  - Department of Statistics, University of Warwick, Coventry CV4 7AL, U.K.
11. Shan M, Thomas KS, Gutman R. A multiple imputation procedure for record linkage and causal inference to estimate the effects of home-delivered meals. Ann Appl Stat 2021; 15:412-436. [PMID: 35755005 PMCID: PMC9222523 DOI: 10.1214/20-aoas1397] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/04/2024]
Abstract
Causal analysis of observational studies requires data that comprise a set of covariates, a treatment assignment indicator, and the observed outcomes. However, data confidentiality restrictions or the nature of data collection may distribute these variables across two or more datasets. In the absence of unique identifiers to link records across files, probabilistic record linkage algorithms can be leveraged to merge the datasets. Current applications of record linkage are concerned with estimation of associations between variables that are exclusive to one file and not causal relationships. We propose a Bayesian framework for record linkage and causal inference where one file comprises all the covariate and observed outcome information, and the second file consists of a list of all individuals who receive the active treatment. Under certain ignorability assumptions, the procedure properly propagates the error in the record linkage process, resulting in valid statistical inferences. To estimate the causal effects, we devise a two-stage procedure. The first stage of the procedure performs Bayesian record linkage to multiply impute the treatment assignment for all individuals in the first file, while adjustments for covariates' imbalance and imputation of missing potential outcomes are performed in the second stage. This procedure is used to evaluate the effect of Meals on Wheels services on mortality and healthcare utilization among homebound older adults in Rhode Island. In addition, an interpretable sensitivity analysis is developed to assess potential violations of the ignorability assumptions.
12. Marchant NG, Kaplan A, Elazar DN, Rubinstein BIP, Steorts RC. d-blink: Distributed End-to-End Bayesian Entity Resolution. J Comput Graph Stat 2021. [DOI: 10.1080/10618600.2020.1825451] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/23/2022]
Affiliation(s)
- Neil G. Marchant
  - School of Computing and Information Systems, University of Melbourne, Parkville, VIC, Australia
- Andee Kaplan
  - Department of Statistics, Colorado State University, Fort Collins, CO
- Daniel N. Elazar
  - Methodology Division, Australian Bureau of Statistics, Belconnen, ACT, Australia
- Rebecca C. Steorts
  - Department of Statistical Science and Computer Science, Duke University, Durham, NC
  - Principal Mathematical Statistician, United States Census Bureau (DRB #: CBDRB-FY20-309)
13.
Affiliation(s)
- Philippe Gagnon
  - Department of Mathematics and Statistics, Université de Montréal, C.P. 6128, Succursale Centre-ville, Montreal, QC, H3C 3J7, Canada
14.
Affiliation(s)
- Giacomo Zanella
  - Department of Decision Sciences, Bocconi University, BIDSA and IGIER, Milan, Italy
- Rebecca C. Steorts
  - Department of Statistical Science and Computer Science, Duke University, Durham, NC
15.
Affiliation(s)
- Philippe Gagnon
  - Department of Statistics, University of Oxford, Oxford, UK
- Arnaud Doucet
  - Department of Statistics, University of Oxford, Oxford, UK
16. Griffin JE, Łatuszyński KG, Steel MFJ. In search of lost mixing time: adaptive Markov chain Monte Carlo schemes for Bayesian variable selection with very large p. Biometrika 2020. [DOI: 10.1093/biomet/asaa055] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Abstract
Summary
The availability of datasets with large numbers of variables is rapidly increasing. The effective application of Bayesian variable selection methods for regression with these datasets has proved difficult since available Markov chain Monte Carlo methods do not perform well in typical problem sizes of interest. We propose new adaptive Markov chain Monte Carlo algorithms to address this shortcoming. The adaptive design of these algorithms exploits the observation that in large-$p$, small-$n$ settings, the majority of the $p$ variables will be approximately uncorrelated a posteriori. The algorithms adaptively build suitable nonlocal proposals that result in moves with squared jumping distance significantly larger than standard methods. Their performance is studied empirically in high-dimensional problems and speed-ups of up to four orders of magnitude are observed.
Affiliation(s)
- J E Griffin
  - Department of Statistical Science, University College London, Gower Street, London WC1E 6BT, U.K.
- K G Łatuszyński
  - Department of Statistics, University of Warwick, Coventry, CV4 7AL, U.K.
- M F J Steel
  - Department of Statistics, University of Warwick, Coventry, CV4 7AL, U.K.
17. Paulon G, Llanos F, Chandrasekaran B, Sarkar A. Bayesian Semiparametric Longitudinal Drift-Diffusion Mixed Models for Tone Learning in Adults. J Am Stat Assoc 2020; 116:1114-1127. [PMID: 34650315 PMCID: PMC8513775 DOI: 10.1080/01621459.2020.1801448] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/31/2020] [Revised: 06/10/2020] [Accepted: 07/22/2020] [Indexed: 02/07/2023]
Abstract
Understanding how adult humans learn nonnative speech categories such as tone information has yielded novel insights into the mechanisms underlying experience-dependent brain plasticity. Scientists have traditionally examined these questions using longitudinal learning experiments under a multi-category decision making paradigm. Drift-diffusion processes are popular in such contexts for their ability to mimic underlying neural mechanisms. Motivated by these problems, we develop a novel Bayesian semiparametric inverse Gaussian drift-diffusion mixed model for multi-alternative decision making in longitudinal settings. We design a Markov chain Monte Carlo algorithm for posterior computation. We evaluate the method's empirical performance through synthetic experiments. Applied to our motivating longitudinal tone learning study, the method provides novel insights into how the biologically interpretable model parameters evolve with learning, differ between input-response tone combinations, and differ between well and poorly performing adults. Supplementary materials for this article, including a standardized description of the materials available for reproducing the work, are available as an online supplement.
Affiliation(s)
- Giorgio Paulon
  - Department of Statistics and Data Sciences, University of Texas at Austin, Austin, TX
- Fernando Llanos
  - Department of Linguistics, University of Texas at Austin, Austin, TX
  - Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA
- Bharath Chandrasekaran
  - Department of Communication Science and Disorders, University of Pittsburgh, Pittsburgh, PA
- Abhra Sarkar
  - Department of Statistics and Data Sciences, University of Texas at Austin, Austin, TX
18. Zanella G, Roberts G. Scalable importance tempering and Bayesian variable selection. J R Stat Soc Series B Stat Methodol 2019. [DOI: 10.1111/rssb.12316] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/30/2022]