1. Wang MH, Onnela JP. Flexible Bayesian inference on partially observed epidemics. Journal of Complex Networks 2024; 12:cnae017. [PMID: 38533184 PMCID: PMC10962317 DOI: 10.1093/comnet/cnae017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2023] [Accepted: 03/02/2024] [Indexed: 03/28/2024]
Abstract
Individual-based models of contagious processes are useful for predicting epidemic trajectories and informing intervention strategies. In such models, the incorporation of contact network information can capture the non-randomness and heterogeneity of realistic contact dynamics. In this article, we consider Bayesian inference on the spreading parameters of an SIR contagion on a known, static network, where information regarding individual disease status is known only from a series of tests (positive or negative disease status). When the contagion model is complex or information such as infection and removal times is missing, the posterior distribution can be difficult to sample from. Previous work has considered the use of Approximate Bayesian Computation (ABC), which allows for simulation-based Bayesian inference on complex models. However, ABC methods usually require the user to select reasonable summary statistics. Here, we consider an inference scheme based on the Mixture Density Network compressed ABC, which minimizes the expected posterior entropy in order to learn informative summary statistics. This allows us to conduct Bayesian inference on the parameters of a partially observed contagious process while also circumventing the need for manual summary statistic selection. This methodology can be extended to incorporate additional simulation complexities, including behavioural change after positive tests or false test results.
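For orientation, the simulation-based inference described in this abstract can be illustrated with a minimal sketch of plain rejection ABC for the per-contact transmission probability of a discrete-time SIR process on a known static network. This is not the authors' method: the paper learns summary statistics with a Mixture Density Network, whereas this sketch swaps in a hand-picked summary (final outbreak size). The ring-lattice network, the uniform prior, the tolerance, and all function names are assumptions made for illustration.

```python
import random


def simulate_sir(adj, beta, gamma, seed_node, rng, max_steps=200):
    """Discrete-time SIR on a static network; returns the final outbreak size."""
    state = {v: "S" for v in adj}
    state[seed_node] = "I"
    for _ in range(max_steps):
        infected = [v for v in adj if state[v] == "I"]
        if not infected:
            break
        newly_infected, newly_removed = [], []
        for v in infected:
            # Each infected node tries to infect each susceptible neighbour.
            for u in adj[v]:
                if state[u] == "S" and rng.random() < beta:
                    newly_infected.append(u)
            # Each infected node is removed with probability gamma per step.
            if rng.random() < gamma:
                newly_removed.append(v)
        for u in newly_infected:
            state[u] = "I"
        for v in newly_removed:
            state[v] = "R"
    return sum(1 for s in state.values() if s != "S")


def abc_rejection(adj, observed_size, gamma, n_draws, tol, rng):
    """Keep prior draws of beta whose simulated final size is close to the data."""
    accepted = []
    for _ in range(n_draws):
        beta = rng.uniform(0.0, 1.0)  # assumed Uniform(0, 1) prior
        size = simulate_sir(adj, beta, gamma, seed_node=0, rng=rng)
        if abs(size - observed_size) <= tol:
            accepted.append(beta)
    return accepted


def ring_lattice(n, k):
    """Hypothetical test network: each node linked to its k nearest neighbours per side."""
    return {v: [(v + d) % n for d in range(-k, k + 1) if d != 0] for v in range(n)}


rng = random.Random(1)
network = ring_lattice(30, 2)
# Pretend this simulated outbreak is the observed data.
observed = simulate_sir(network, beta=0.3, gamma=0.3, seed_node=0, rng=rng)
posterior_sample = abc_rejection(network, observed, gamma=0.3,
                                 n_draws=500, tol=2, rng=rng)
```

The accepted values in `posterior_sample` approximate the posterior of `beta` only as well as the chosen summary allows, which is the limitation that learned summary statistics are meant to address.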
Affiliation(s)
- Maxwell H Wang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
- Jukka-Pekka Onnela
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, 677 Huntington Ave, Boston, MA 02115, USA
2. Susvitasari K, Tupper P, Stockdale JE, Colijn C. A method to estimate the serial interval distribution under partially-sampled data. Epidemics 2023; 45:100733. [PMID: 38056165 DOI: 10.1016/j.epidem.2023.100733] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Revised: 11/22/2023] [Accepted: 11/26/2023] [Indexed: 12/08/2023] Open
Abstract
The serial interval of an infectious disease is an important variable in epidemiology. It is defined as the period of time between the symptom onset times of the infector and infectee in a direct transmission pair. Under partially sampled data, purported infector-infectee pairs may actually be separated by one or more unsampled cases in between. Misunderstanding such pairs as direct transmissions will result in overestimating the length of serial intervals. On the other hand, two cases that are infected by an unseen third case (known as coprimary transmission) may be classified as a direct transmission pair, leading to an underestimation of the serial interval. Here, we introduce a method to jointly estimate the distribution of serial intervals factoring in these two sources of error. We simultaneously estimate the distribution of the number of unsampled intermediate cases between purported infector-infectee pairs, as well as the fraction of such pairs that are coprimary. We also extend our method to situations where each infectee has multiple possible infectors, and show how to factor this additional source of uncertainty into our estimates. We assess our method's performance on simulated data sets and find that our method provides consistent and robust estimates. We also apply our method to data from real-life outbreaks of four infectious diseases and compare our results with published results. With similar accuracy, our method of estimating serial interval distribution provides unique advantages, allowing its application in settings of low sampling rates and large population sizes, such as widespread community transmission tracked by routine public health surveillance.
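The two opposing biases named in this abstract can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' estimator: serial intervals are taken to be independent gamma draws with hypothetical shape and scale values, a pair separated by one unsampled case is modelled as the sum of two intervals, and a coprimary pair is modelled as the absolute difference of two intervals from the shared unseen infector.

```python
import random
from statistics import mean


def apparent_intervals(n, shape, scale, rng):
    """Mean apparent serial interval under three pairing scenarios."""
    direct, skipped, coprimary = [], [], []
    for _ in range(n):
        a = rng.gammavariate(shape, scale)  # true interval, first link
        b = rng.gammavariate(shape, scale)  # true interval, second link
        direct.append(a)                    # correctly identified direct pair
        skipped.append(a + b)               # one unsampled intermediate case
        coprimary.append(abs(a - b))        # both infected by an unseen third case
    return mean(direct), mean(skipped), mean(coprimary)


rng = random.Random(0)
m_direct, m_skipped, m_coprimary = apparent_intervals(
    20000, shape=2.0, scale=2.5, rng=rng
)
```

Under these assumptions the skipped-case mean exceeds the direct mean, while the coprimary mean falls below it, matching the over- and underestimation described above.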
Affiliation(s)
- Paul Tupper
  - Department of Mathematics, Simon Fraser University, Canada
3. Goyal R, Carnegie N, Slipher S, Turk P, Little SJ, De Gruttola V. Estimating contact network properties by integrating multiple data sources associated with infectious diseases. Stat Med 2023; 42:3593-3615. [PMID: 37392149 PMCID: PMC10825904 DOI: 10.1002/sim.9816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Revised: 05/09/2023] [Accepted: 05/19/2023] [Indexed: 07/03/2023]
Abstract
To effectively mitigate the spread of communicable diseases, it is necessary to understand the interactions that enable disease transmission among individuals in a population; we refer to the set of these interactions as a contact network. The structure of the contact network can have profound effects on both the spread of infectious diseases and the effectiveness of control programs. Therefore, understanding the contact network permits more efficient use of resources. Measuring the structure of the network, however, is a challenging problem. We present a Bayesian approach to integrate multiple data sources associated with the transmission of infectious diseases to more precisely and accurately estimate important properties of the contact network. An important aspect of the approach is the use of the congruence class models for networks. We conduct simulation studies modeling pathogens resembling SARS-CoV-2 and HIV to assess the method; subsequently, we apply our approach to HIV data from the University of California San Diego Primary Infection Resource Consortium. Based on simulation studies, we demonstrate that the integration of epidemiological and viral genetic data with risk behavior survey data can lead to large decreases in mean squared error (MSE) in contact network estimates compared to estimates based strictly on risk behavior information. This decrease in MSE is present even in settings where the risk behavior surveys contain measurement error. Through these simulations, we also highlight certain settings where the approach does not improve MSE.
Affiliation(s)
- Ravi Goyal
  - Division of Infectious Diseases and Global Public Health, University of California San Diego, San Diego, California, USA
- Sally Slipher
  - Department of Mathematical Sciences, Montana State University, Bozeman, Montana, USA
- Philip Turk
  - Department of Data Science, University of Mississippi Medical Center, Jackson, Mississippi, USA
- Susan J Little
  - Division of Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, California, USA
- Victor De Gruttola
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
4. DeGruttola V, Goyal R, Martin NK, Wang R. Network methods and design of randomized trials: Application to investigation of COVID-19 vaccination boosters. Clin Trials 2022; 19:363-374. [PMID: 35894099 DOI: 10.1177/17407745221111818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Network science methods can be useful in design, monitoring, and analysis of randomized trials for control of spread of infections. Their usefulness arises from the role of statistical network models in molecular epidemiology and in study design. Computational models, such as agent-based models that propagate disease on simulated contact networks, can be used to investigate the properties of different study designs and analysis plans. Particularly valuable is the use of these methods to assess how magnitude and detectability of intervention effects depend on both individual-level and network-level characteristics of the enrolled populations. Such investigation also provides an important approach to assessing consequences of study data being incomplete or measured with error. To address these goals, we consider two statistical network models: exponential random graph models and the more flexible congruence class models. We focus first on an historical use of these methods in design and monitoring of a cluster randomized trial in Botswana to evaluate the effect of combination HIV prevention modalities compared to standard of care on HIV incidence. We then present a framework for the design of a study of booster vaccine effects on infection with, and forward transmission of, SARS-CoV-2 variants. Motivation for the study is driven in part by guidance from the United Kingdom to base approval of booster vaccines with "strain changes" that target variants on results of neutralizing antibody tests and information about safety, but without requiring evidence of clinical efficacy. Using designs informed by our agent-based network models, we show it may be feasible to conduct a trial of novel SARS-CoV-2 vaccines in a single large campus to obtain useful information regarding vaccine efficacy against susceptibility and infectiousness. If needed, the sample size could be increased by extending the study to a small number of campuses. 
Novel network methods may be useful in developing pragmatic SARS-CoV-2 vaccine trials that can leverage existing infrastructure to reduce costs and hasten the development of results.
Affiliation(s)
- Victor DeGruttola
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Division of Infectious Diseases & Global Public Health, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Ravi Goyal
  - Division of Infectious Diseases & Global Public Health, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Natasha K Martin
  - Division of Infectious Diseases & Global Public Health, Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Rui Wang
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
  - Department of Population Medicine, Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA
5. Manlove K, Wilber M, White L, Bastille‐Rousseau G, Yang A, Gilbertson MLJ, Craft ME, Cross PC, Wittemyer G, Pepin KM. Defining an epidemiological landscape that connects movement ecology to pathogen transmission and pace‐of‐life. Ecol Lett 2022; 25:1760-1782. [DOI: 10.1111/ele.14032] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/31/2021] [Revised: 04/21/2022] [Accepted: 05/03/2022] [Indexed: 12/20/2022]
Affiliation(s)
- Kezia Manlove
  - Department of Wildland Resources and Ecology Center, Utah State University, Logan, Utah, USA
- Mark Wilber
  - Department of Forestry, Wildlife, and Fisheries, University of Tennessee Institute of Agriculture, Knoxville, Tennessee, USA
- Lauren White
  - National Socio‐Environmental Synthesis Center, University of Maryland, Annapolis, Maryland, USA
- Anni Yang
  - Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, Colorado, USA
  - National Wildlife Research Center, United States Department of Agriculture, Animal and Plant Health Inspection Service, Wildlife Services, National Wildlife Research Center, Fort Collins, Colorado, USA
  - Department of Geography and Environmental Sustainability, University of Oklahoma, Norman, Oklahoma, USA
- Marie L. J. Gilbertson
  - Department of Veterinary Population Medicine, University of Minnesota, St. Paul, Minnesota, USA
  - Wisconsin Cooperative Wildlife Research Unit, Department of Forest and Wildlife Ecology, University of Wisconsin–Madison, Madison, Wisconsin, USA
- Meggan E. Craft
  - Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, Minnesota, USA
- Paul C. Cross
  - U.S. Geological Survey, Northern Rocky Mountain Science Center, Bozeman, Montana, USA
- George Wittemyer
  - Department of Fish, Wildlife, and Conservation Biology, Colorado State University, Fort Collins, Colorado, USA
- Kim M. Pepin
  - National Wildlife Research Center, United States Department of Agriculture, Animal and Plant Health Inspection Service, Wildlife Services, National Wildlife Research Center, Fort Collins, Colorado, USA
6. Schweinberger M, Bomiriya RP, Babkin S. A Semiparametric Bayesian Approach to Epidemics, with Application to the Spread of the Coronavirus MERS in South Korea in 2015. J Nonparametr Stat 2022; 34:628-662. [PMID: 36172077 PMCID: PMC9512273 DOI: 10.1080/10485252.2021.1972294] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
Abstract
We consider incomplete observations of stochastic processes governing the spread of infectious diseases through finite populations by way of contact. We propose a flexible semiparametric modeling framework with at least three advantages. First, it enables researchers to study the structure of a population contact network and its impact on the spread of infectious diseases. Second, it can accommodate short- and long-tailed degree distributions and detect potential superspreaders, who represent an important public health concern. Third, it addresses the important issue of incomplete data. Starting from first principles, we show when the incomplete-data generating process is ignorable for the purpose of Bayesian inference for the parameters of the population model. We demonstrate the semiparametric modeling framework by simulations and an application to the partially observed MERS epidemic in South Korea in 2015. We conclude with an extended discussion of open questions and directions for future research.
Affiliation(s)
- Michael Schweinberger
  - Corresponding author. Address: Department of Statistics, University of Missouri, 146 Middlebush Hall, Columbia, MO 65211, USA. Phone: +1 713-348-2278. Fax: +1 713-348-5476
7. Jamshidi B, Alavi SMR, Parham GA. The distribution of the number of the infected individuals in a stochastic SIR model on regular rooted trees. Commun Stat Simul Comput 2021. [DOI: 10.1080/03610918.2019.1584299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2022]
Affiliation(s)
- Babak Jamshidi
  - Statistics Department, Faculty of Mathematics and Computer Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran
- Sayed Mohammad Reza Alavi
  - Statistics Department, Faculty of Mathematics and Computer Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran
- Gholam Ali Parham
  - Statistics Department, Faculty of Mathematics and Computer Sciences, Shahid Chamran University of Ahvaz, Ahvaz, Iran
8. Almutiry W, Deardon R. Contact network uncertainty in individual level models of infectious disease transmission. Statistical Communications in Infectious Diseases 2021; 13:20190012. [PMID: 35880993 PMCID: PMC8865399 DOI: 10.1515/scid-2019-0012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2019] [Accepted: 11/20/2020] [Indexed: 06/15/2023]
Abstract
Infectious disease transmission between individuals in a heterogeneous population is often best modelled through a contact network. This contact network can be spatial in nature, with connections between individuals closer in space being more likely. However, contact network data are often unobserved. Here, we consider the fit of an individual level model containing a spatially-based contact network that is either entirely, or partially, unobserved within a Bayesian framework, using data augmented Markov chain Monte Carlo (MCMC). We also incorporate the uncertainty about event history in the disease data. We also examine the performance of the data augmented MCMC analysis in the presence or absence of contact network observational models based upon either knowledge about the degree distribution or the total number of connections in the network. We find that the latter tend to provide better estimates of the model parameters and the underlying contact network.
Affiliation(s)
- Waleed Almutiry
  - Mathematics, Arts and Science College in Ar Rass, Qassim University, Buraidah, Saudi Arabia
- Rob Deardon
  - Production Animal Health, University of Calgary, Calgary, Alberta, Canada
  - Mathematics and Statistics, University of Calgary, Calgary, Alberta, Canada
9. Schweinberger M, Krivitsky PN, Butts CT, Stewart JR. Exponential-Family Models of Random Graphs: Inference in Finite, Super and Infinite Population Scenarios. Stat Sci 2020. [DOI: 10.1214/19-sts743] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 11/19/2022]
10. Moshiri N, Ragonnet-Cronin M, Wertheim JO, Mirarab S. FAVITES: simultaneous simulation of transmission networks, phylogenetic trees and sequences. Bioinformatics 2020; 35:1852-1861. [PMID: 30395173 DOI: 10.1093/bioinformatics/bty921] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/15/2018] [Revised: 10/29/2018] [Accepted: 11/01/2018] [Indexed: 11/14/2022] Open
Abstract
Motivation: The ability to simulate epidemics as a function of model parameters allows insights that are unobtainable from real datasets. Further, reconstructing transmission networks for fast-evolving viruses like Human Immunodeficiency Virus (HIV) may have the potential to greatly enhance epidemic intervention, but transmission network reconstruction methods have been inadequately studied, largely because it is difficult to obtain 'truth' sets on which to test them and properly measure their performance.
Results: We introduce FrAmework for VIral Transmission and Evolution Simulation (FAVITES), a robust framework for simulating realistic datasets for epidemics that are caused by fast-evolving pathogens like HIV. FAVITES creates a generative model to produce contact networks, transmission networks, phylogenetic trees and sequence datasets, and to add error to the data. FAVITES is designed to be extensible by dividing the generative model into modules, each of which is expressed as a fixed API that can be implemented using various models. We use FAVITES to simulate HIV datasets and study the realism of the simulated datasets. We then use the simulated data to study the impact of the increased treatment efforts on epidemiological outcomes. We also study two transmission network reconstruction methods and their effectiveness in detecting fast-growing clusters.
Availability and implementation: FAVITES is available at https://github.com/niemasd/FAVITES, and a Docker image can be found on DockerHub (https://hub.docker.com/r/niemasd/favites).
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Niema Moshiri
  - Bioinformatics and Systems Biology Graduate Program, UC San Diego, La Jolla, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, UC San Diego, La Jolla, USA
11. Almutiry W, Deardon R. Incorporating Contact Network Uncertainty in Individual Level Models of Infectious Disease using Approximate Bayesian Computation. Int J Biostat 2019; 16:ijb-2017-0092. [PMID: 31812945 DOI: 10.1515/ijb-2017-0092] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/15/2017] [Accepted: 11/19/2019] [Indexed: 11/15/2022]
Abstract
Infectious disease transmission between individuals in a heterogeneous population is often best modelled through a contact network. However, such contact network data are often unobserved. Such missing data can be accounted for in a Bayesian data augmented framework using Markov chain Monte Carlo (MCMC). Unfortunately, fitting models in such a framework can be highly computationally intensive. We investigate the fitting of network-based infectious disease models with completely unknown contact networks using approximate Bayesian computation population Monte Carlo (ABC-PMC) methods. This is done in the context of both simulated data, and data from the UK 2001 foot-and-mouth disease epidemic. We show that ABC-PMC is able to obtain reasonable approximations of the underlying infectious disease model with huge savings in computation time when compared to a full Bayesian MCMC analysis.
Affiliation(s)
- Waleed Almutiry
  - Department of Mathematics, College of Science and Arts, Qassim University, Ar Rass, Qassim, Saudi Arabia
- Rob Deardon
  - Department of Mathematics and Statistics and Department of Production Animal Health, University of Calgary, Calgary, Alberta, Canada
12. Carnegie NB. Effects of contact network structure on epidemic transmission trees: implications for data required to estimate network structure. Stat Med 2018; 37:236-248. [PMID: 28192859 PMCID: PMC6126904 DOI: 10.1002/sim.7259] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 07/22/2016] [Revised: 01/18/2017] [Accepted: 01/29/2017] [Indexed: 12/30/2022]
Abstract
Understanding the dynamics of disease spread is key to developing effective interventions to control or prevent an epidemic. The structure of the network of contacts over which the disease spreads has been shown to have a strong influence on the outcome of the epidemic, but an open question remains as to whether it is possible to estimate contact network features from data collected in an epidemic. The approach taken in this paper is to examine the distributions of epidemic outcomes arising from epidemics on networks with particular structural features to assess whether that structure could be measured from epidemic data and what other constraints might be needed to make the problem identifiable. To this end, we vary the network size, mean degree, and transmissibility of the pathogen, as well as the network feature of interest: clustering, degree assortativity, or attribute-based preferential mixing. We record several standard measures of the size and spread of the epidemic, as well as measures that describe the shape of the transmission tree in order to ascertain whether there are detectable signals in the final data from the outbreak. The results suggest that there is potential to estimate contact network features from transmission trees or pure epidemic data, particularly for diseases with high transmissibility or for which the relevant contact network is of low mean degree. Copyright © 2017 John Wiley & Sons, Ltd.
Affiliation(s)
- Nicole Bohme Carnegie
  - Joseph J. Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI, U.S.A
13. tsiR: An R package for time-series Susceptible-Infected-Recovered models of epidemics. PLoS One 2017; 12:e0185528. [PMID: 28957408 PMCID: PMC5619791 DOI: 10.1371/journal.pone.0185528] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 05/26/2017] [Accepted: 09/14/2017] [Indexed: 11/25/2022] Open
Abstract
tsiR is an open source software package implemented in the R programming language designed to analyze infectious disease time-series data. The software extends a well-studied and widely-applied algorithm, the time-series Susceptible-Infected-Recovered (TSIR) model, to infer parameters from incidence data, such as contact seasonality, and to forward simulate the underlying mechanistic model. The tsiR package aggregates a number of different fitting features previously described in the literature in a user-friendly way, providing support for their broader adoption in infectious disease research. Also included in tsiR are a number of diagnostic tools to assess the fit of the TSIR model. This package should be useful for researchers analyzing incidence data for fully-immunizing infectious diseases.
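The forward-simulation step mentioned in this abstract can be sketched as follows. The abstract gives no equations, so the update rule used here (new infections proportional to beta * S * I^alpha / N, with susceptibles replenished by births) is the form commonly written for TSIR models and is stated as an assumption. The sketch is deterministic, written in Python, and does not use or reproduce the tsiR package's own functions; every name and parameter value is hypothetical.

```python
def tsir_forward(s0, i0, births, beta_by_season, alpha, n_pop):
    """Deterministic TSIR-style forward simulation, one step per reporting period."""
    s, i = float(s0), float(i0)
    trajectory = []
    for t, b in enumerate(births):
        # Seasonal transmission rate, cycling through the supplied values.
        beta = beta_by_season[t % len(beta_by_season)]
        # New infections cannot exceed the current susceptible pool.
        new_i = min(beta * s * (i ** alpha) / n_pop, s)
        # Susceptibles gain births and lose the newly infected.
        s = s + b - new_i
        i = new_i
        trajectory.append((s, i))
    return trajectory


# Hypothetical inputs: 26 periods per year, higher transmission mid-year.
season = [30.0 if 9 <= k < 20 else 20.0 for k in range(26)]
path = tsir_forward(s0=5000, i0=100, births=[50] * 52,
                    beta_by_season=season, alpha=0.97, n_pop=100000)
```

Each element of `path` is a `(susceptible, infected)` pair for one period; fitting the seasonal `beta` values to incidence data is the inference problem the package addresses and is not attempted here.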
14. Lee C, Garbett A, Wilkinson DJ. A network epidemic model for online community commissioning data. Statistics and Computing 2017; 28:891-904. [PMID: 31983814 PMCID: PMC6953976 DOI: 10.1007/s11222-017-9770-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/27/2017] [Accepted: 07/26/2017] [Indexed: 06/10/2023]
Abstract
A statistical model assuming a preferential attachment network, which is generated by adding nodes sequentially according to a few simple rules, usually describes real-life networks better than a model assuming, for example, a Bernoulli random graph, in which any two nodes have the same probability of being connected, does. Therefore, to study the propagation of "infection" across a social network, we propose a network epidemic model by combining a stochastic epidemic model and a preferential attachment model. A simulation study based on the subsequent Markov Chain Monte Carlo algorithm reveals an identifiability issue with the model parameters. Finally, the network epidemic model is applied to a set of online commissioning data.
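The contrast between the two network models named in this abstract can be sketched in a few lines. The abstract does not spell out the paper's attachment rules, so the version below is an assumption: each new node attaches by a single edge to an existing node chosen with probability proportional to its degree. The Bernoulli random graph connects every pair independently with the same probability. All function names are hypothetical.

```python
import random


def bernoulli_graph(n, p, rng):
    """Every pair of nodes is connected independently with probability p."""
    return [(u, v) for u in range(n) for v in range(u + 1, n) if rng.random() < p]


def preferential_attachment(n, rng):
    """Nodes arrive one at a time; each attaches to one degree-weighted target."""
    edges = [(0, 1)]
    endpoints = [0, 1]  # each node appears once per unit of degree
    for new in range(2, n):
        target = rng.choice(endpoints)  # degree-proportional choice
        edges.append((target, new))
        endpoints.extend([target, new])
    return edges


def degrees(n, edges):
    """Degree of every node given an edge list."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


rng = random.Random(7)
n = 200
pa_degrees = degrees(n, preferential_attachment(n, rng))
# Match the expected edge count of the preferential attachment graph (n - 1 edges).
er_degrees = degrees(n, bernoulli_graph(n, p=2.0 / n, rng=rng))
```

Comparing `max(pa_degrees)` with `max(er_degrees)` on a few seeds gives a feel for the heavier-tailed degree distribution that sequential attachment tends to produce.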
Affiliation(s)
- Clement Lee
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne, UK
  - Open Lab, Newcastle University, Newcastle upon Tyne, UK
- Darren J. Wilkinson
  - School of Mathematics and Statistics, Newcastle University, Newcastle upon Tyne, UK
16. Chen Y, Crespi N, Ortiz AM, Shu L. Reality mining: A prediction algorithm for disease dynamics based on mobile big data. Inf Sci (N Y) 2017. [DOI: 10.1016/j.ins.2016.07.075] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 11/30/2022]
17. Burch MG, Jacobsen KA, Tien JH, Rempala GA. Network-based analysis of a small Ebola outbreak. Mathematical Biosciences and Engineering: MBE 2017; 14:67-77. [PMID: 27879120 DOI: 10.3934/mbe.2017005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]
Abstract
We present a method for estimating epidemic parameters in network-based stochastic epidemic models when the total number of infections is assumed to be small. We illustrate the method by reanalyzing the data from the 2014 Democratic Republic of the Congo (DRC) Ebola outbreak described in Maganga et al. (2014).
Affiliation(s)
- Mark G Burch
  - College of Public Health, The Ohio State University, Columbus, OH 43210, United States.
18. Giardina F, Romero-Severson EO, Albert J, Britton T, Leitner T. Inference of Transmission Network Structure from HIV Phylogenetic Trees. PLoS Comput Biol 2017; 13:e1005316. [PMID: 28085876 PMCID: PMC5279806 DOI: 10.1371/journal.pcbi.1005316] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 06/07/2016] [Revised: 01/30/2017] [Accepted: 12/19/2016] [Indexed: 11/22/2022] Open
Abstract
Phylogenetic inference is an attractive means to reconstruct transmission histories and epidemics. However, there is not a perfect correspondence between transmission history and virus phylogeny. Both node height and topological differences may occur, depending on the interaction between within-host evolutionary dynamics and between-host transmission patterns. To investigate these interactions, we added a within-host evolutionary model in epidemiological simulations and examined if the resulting phylogeny could recover different types of contact networks. To further improve realism, we also introduced patient-specific differences in infectivity across disease stages, and on the epidemic level we considered incomplete sampling and the age of the epidemic. Second, we implemented an inference method based on approximate Bayesian computation (ABC) to discriminate among three well-studied network models and jointly estimate both network parameters and key epidemiological quantities such as the infection rate. Our ABC framework used both topological and distance-based tree statistics for comparison between simulated and observed trees. Overall, our simulations showed that a virus time-scaled phylogeny (genealogy) may be substantially different from the between-host transmission tree. This has important implications for the interpretation of what a phylogeny reveals about the underlying epidemic contact network. In particular, we found that while the within-host evolutionary process obscures the transmission tree, the diversification process and infectivity dynamics also add discriminatory power to differentiate between different types of contact networks. We also found that the possibility to differentiate contact networks depends on how far an epidemic has progressed, where distance-based tree statistics have more power early in an epidemic. Finally, we applied our ABC inference on two different outbreaks from the Swedish HIV-1 epidemic.
Affiliation(s)
- Federica Giardina
  - Department of Mathematics, Stockholm University, Stockholm, Sweden
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Ethan Obie Romero-Severson
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Jan Albert
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
  - Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden
- Tom Britton
  - Department of Mathematics, Stockholm University, Stockholm, Sweden
- Thomas Leitner
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
19. Estimating enhanced prevaccination measles transmission hotspots in the context of cross-scale dynamics. Proc Natl Acad Sci U S A 2016; 113:14595-14600. [PMID: 27872300 DOI: 10.1073/pnas.1604976113] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022] Open
Abstract
A key question in clarifying human-environment interactions is how dynamic complexity develops across integrative scales from molecular to population and global levels. Apart from its public health importance, measles is an excellent test bed for such an analysis. Simple mechanistic models have successfully illuminated measles dynamics at the city and country levels, revealing seasonal forcing of transmission as a major driver of long-term epidemic behavior. Seasonal forcing ties closely to patterns of school aggregation at the individual and community levels, but there are few explicit estimates of school transmission due to the relative lack of epidemic data at this scale. Here, we use data from a 1904 measles outbreak in schools in Woolwich, London, coupled with a stochastic Susceptible-Infected-Recovered model to analyze measles incidence data. Our results indicate that transmission within schools and age classes is higher than previous population-level serological data would suggest. This analysis sheds quantitative light on the role of school-aged children in measles cross-scale dynamics, as we illustrate with references to the contemporary vaccination landscape.
20
Sainudiin R, Welch D. The transmission process: A combinatorial stochastic process for the evolution of transmission trees over networks. J Theor Biol 2016; 410:137-170. [PMID: 27519948 DOI: 10.1016/j.jtbi.2016.07.038] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/19/2016] [Revised: 07/22/2016] [Accepted: 07/22/2016] [Indexed: 10/21/2022]
Abstract
We derive a combinatorial stochastic process for the evolution of the transmission tree over the infected vertices of a host contact network in a susceptible-infected (SI) model of an epidemic. Models of transmission trees are crucial to understanding the evolution of pathogen populations. We provide an explicit description of the transmission process on the product state space of (rooted planar ranked labelled) binary transmission trees and labelled host contact networks with SI-tags as a discrete-state continuous-time Markov chain. We give the exact probability of any transmission tree when the host contact network is a complete, star or path network - three illustrative examples. We then develop a biparametric Beta-splitting model that directly generates transmission trees with exact probabilities as a function of the model parameters, but without explicitly modelling the underlying contact network, and show that for specific values of the parameters we can recover the exact probabilities for our three example networks through the Markov chain construction that explicitly models the underlying contact network. We use the maximum likelihood estimator (MLE) to consistently infer the two parameters driving the transmission process based on observations of the transmission trees and use the exact MLE to characterize equivalence classes over the space of contact networks with a single initial infection. An exploratory simulation study of the MLEs from transmission trees sampled from three other deterministic and four random families of classical contact networks is conducted to shed light on the relation between the MLEs of these families with some implications for statistical inference along with pointers to further extensions of our models. 
The insights developed here are also applicable to the simplest models of "meme" evolution in online social media networks through transmission events that can be distilled from observable actions such as "likes", "mentions", "retweets" and "+1s" along with any concomitant comments.
Affiliation(s)
- Raazesh Sainudiin
- Laboratory for Mathematical Statistical Experiments, Christchurch Centre and Biomathematics Research Centre, School of Mathematics and Statistics, University of Canterbury, Private Bag 4800, Christchurch 8041, New Zealand.
- David Welch
- Computational Evolution Group and Department of Computer Science, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand.
21
Zhao L, Chen J, Chen F, Wang W, Lu CT, Ramakrishnan N. SimNest: Social Media Nested Epidemic Simulation via Online Semi-supervised Deep Learning. Proceedings. IEEE International Conference on Data Mining 2015; 2015:639-648. [PMID: 27453696 DOI: 10.1109/icdm.2015.39] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 11/06/2022]
Abstract
Infectious disease epidemics such as influenza and Ebola pose a serious threat to global public health. It is crucial to characterize the disease and the evolution of the ongoing epidemic efficiently and accurately. Computational epidemiology can model the disease progress and underlying contact network, but suffers from the lack of real-time and fine-grained surveillance data. Social media, on the other hand, provides timely and detailed disease surveillance, but is insensible to the underlying contact network and disease model. This paper proposes a novel semi-supervised deep learning framework that integrates the strengths of computational epidemiology and social media mining techniques. Specifically, this framework learns the social media users' health states and intervention actions in real time, which are regularized by the underlying disease model and contact network. Conversely, the learned knowledge from social media can be fed into computational epidemic model to improve the efficiency and accuracy of disease diffusion modeling. We propose an online optimization algorithm to substantialize the above interactive learning process iteratively to achieve a consistent stage of the integration. The extensive experimental results demonstrated that our approach can effectively characterize the spatio-temporal disease diffusion, outperforming competing methods by a substantial margin on multiple metrics.
22
Potter GE, Smieszek T, Sailer K. Modeling workplace contact networks: The effects of organizational structure, architecture, and reporting errors on epidemic predictions. Network Science (Cambridge University Press) 2015; 3:298-325. [PMID: 26634122 PMCID: PMC4663701 DOI: 10.1017/nws.2015.22] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
Abstract
Face-to-face social contacts are potentially important transmission routes for acute respiratory infections, and understanding the contact network can improve our ability to predict, contain, and control epidemics. Although workplaces are important settings for infectious disease transmission, few studies have collected workplace contact data and estimated workplace contact networks. We use contact diaries, architectural distance measures, and institutional structures to estimate social contact networks within a Swiss research institute. Some contact reports were inconsistent, indicating reporting errors. We adjust for this with a latent variable model, jointly estimating the true (unobserved) network of contacts and duration-specific reporting probabilities. We find that contact probability decreases with distance, and that research group membership, role, and shared projects are strongly predictive of contact patterns. Estimated reporting probabilities were low only for 0-5 min contacts. Adjusting for reporting error changed the estimate of the duration distribution, but did not change the estimates of covariate effects and had little effect on epidemic predictions. Our epidemic simulation study indicates that inclusion of network structure based on architectural and organizational structure data can improve the accuracy of epidemic forecasting models.
Affiliation(s)
- Gail E. Potter
- California Polytechnic State University, San Luis Obispo, CA, USA; Center for Statistics and Quantitative Infectious Disease, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Timo Smieszek
- Center for Infectious Disease Dynamics, Pennsylvania State University; Modelling and Economics Unit, Centre for Infectious Disease Surveillance and Control, Public Health England, London, UK; MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, Imperial College School of Public Health, London, UK; NIHR Health Protection Research Unit in Modelling Methodology, Department of Infectious Disease Epidemiology, Imperial College School of Public Health, London, UK
- Kerstin Sailer
- The Bartlett School of Graduate Studies, University College London
23
Romanescu R, Deardon R. Modeling two strains of disease via aggregate-level infectivity curves. J Math Biol 2015; 72:1195-224. [PMID: 26084408 DOI: 10.1007/s00285-015-0910-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/21/2014] [Revised: 06/07/2015] [Indexed: 11/30/2022]
Abstract
Well formulated models of disease spread, and efficient methods to fit them to observed data, are powerful tools for aiding the surveillance and control of infectious diseases. Our project considers the problem of the simultaneous spread of two related strains of disease in a context where spatial location is the key driver of disease spread. We start our modeling work with the individual level models (ILMs) of disease transmission, and extend these models to accommodate the competing spread of the pathogens in a two-tier hierarchical population (whose levels we refer to as 'farm' and 'animal'). The postulated interference mechanism between the two strains is a period of cross-immunity following infection. We also present a framework for speeding up the computationally intensive process of fitting the ILM to data, typically done using Markov chain Monte Carlo (MCMC) in a Bayesian framework, by turning the inference into a two-stage process. First, we approximate the number of animals infected on a farm over time by infectivity curves. These curves are fit to data sampled from farms, using maximum likelihood estimation, then, conditional on the fitted curves, Bayesian MCMC inference proceeds for the remaining parameters. Finally, we use posterior predictive distributions of salient epidemic summary statistics, in order to assess the model fitted.
Affiliation(s)
- Razvan Romanescu
- Department of Mathematics and Statistics, University of Guelph, 50 Stone Road East, Guelph, ON, N1G 2W1, Canada.
- Rob Deardon
- Department of Production Animal Health, Faculty of Veterinary Medicine, University of Calgary, HRIC 2AC66, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, HRIC 2AC66, 3280 Hospital Drive NW, Calgary, AB, T2N 4Z6, Canada
24
Lee XJ, Drovandi CC, Pettitt AN. Model choice problems using approximate Bayesian computation with applications to pathogen transmission data sets. Biometrics 2014; 71:198-207. [PMID: 25303085 DOI: 10.1111/biom.12249] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/01/2013] [Revised: 08/01/2014] [Accepted: 09/01/2014] [Indexed: 11/30/2022]
Abstract
Analytically or computationally intractable likelihood functions can arise in complex statistical inferential problems making them inaccessible to standard Bayesian inferential methods. Approximate Bayesian computation (ABC) methods address such inferential problems by replacing direct likelihood evaluations with repeated sampling from the model. ABC methods have been predominantly applied to parameter estimation problems and less to model choice problems due to the added difficulty of handling multiple model spaces. The ABC algorithm proposed here addresses model choice problems by extending Fearnhead and Prangle (2012, Journal of the Royal Statistical Society, Series B 74, 1-28) where the posterior mean of the model parameters estimated through regression formed the summary statistics used in the discrepancy measure. An additional stepwise multinomial logistic regression is performed on the model indicator variable in the regression step and the estimated model probabilities are incorporated into the set of summary statistics for model choice purposes. A reversible jump Markov chain Monte Carlo step is also included in the algorithm to increase model diversity for thorough exploration of the model space. This algorithm was applied to a validating example to demonstrate the robustness of the algorithm across a wide range of true model probabilities. Its subsequent use in three pathogen transmission examples of varying complexity illustrates the utility of the algorithm in inferring preference of particular transmission models for the pathogens.
Affiliation(s)
- Xing Ju Lee
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, 4000, Australia
- Christopher C Drovandi
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, 4000, Australia
- Anthony N Pettitt
- School of Mathematical Sciences, Queensland University of Technology, Brisbane, Queensland, 4000, Australia
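The core ABC idea summarized in the abstract of Lee et al. above (replacing likelihood evaluations with repeated sampling from the model) can be illustrated with a minimal rejection sampler. This is only a sketch under stated assumptions: the toy binomial infection model, the uniform prior, the tolerance, and all function names are hypothetical and are not taken from the paper, which uses regression-based summary statistics and a reversible jump MCMC step rather than plain rejection.

```python
import random


def simulate(beta, n_trials=50, rng=random):
    """Toy model (an assumption for illustration): count infections among
    n_trials contacts, each infected independently with probability beta."""
    return sum(rng.random() < beta for _ in range(n_trials))


def abc_rejection(observed, n_draws=20000, tolerance=1, seed=0):
    """Plain rejection ABC: keep prior draws whose simulated count lies
    within `tolerance` of the observed count."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        beta = rng.random()                  # draw from a Uniform(0, 1) prior
        simulated = simulate(beta, rng=rng)  # sample data from the model
        if abs(simulated - observed) <= tolerance:
            accepted.append(beta)            # no likelihood is ever evaluated
    return accepted


# Hypothetical observation: 15 infections out of 50 contacts.
posterior_sample = abc_rejection(observed=15)
posterior_mean = sum(posterior_sample) / len(posterior_sample)
print(len(posterior_sample), round(posterior_mean, 3))
```

Here the simulated count itself serves as the summary statistic. In realistic epidemic models the choice of summary statistics is the hard part, which is the problem the regression-based approach described in the abstract is meant to address.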
25
van Boven M, Ruijs WLM, Wallinga J, O'Neill PD, Hahné S. Estimation of vaccine efficacy and critical vaccination coverage in partially observed outbreaks. PLoS Comput Biol 2013; 9:e1003061. [PMID: 23658512 PMCID: PMC3642050 DOI: 10.1371/journal.pcbi.1003061] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 10/29/2012] [Accepted: 03/28/2013] [Indexed: 11/18/2022]
Abstract
Classical approaches to estimate vaccine efficacy are based on the assumption that a person's risk of infection does not depend on the infection status of others. This assumption is untenable for infectious disease data where such dependencies abound. We present a novel approach to estimating vaccine efficacy in a Bayesian framework using disease transmission models. The methodology is applied to outbreaks of mumps in primary schools in the Netherlands. The total study population consisted of 2,493 children in ten primary schools, of which 510 (20%) were known to have been infected, and 832 (33%) had unknown infection status. The apparent vaccination coverage ranged from 12% to 93%, and the apparent infection attack rate varied from 1% to 76%. Our analyses show that vaccination reduces the probability of infection per contact substantially but not perfectly ([Formula: see text] = 0.933; 95%CrI: 0.908-0.954). Mumps virus appears to be moderately transmissible in the school setting, with each case yielding an estimated 2.5 secondary cases in an unvaccinated population ([Formula: see text] = 2.49; 95%CrI: 2.36-2.63), resulting in moderate estimates of the critical vaccination coverage (64.2%; 95%CrI: 61.7-66.7%). The indirect benefits of vaccination are highest in populations with vaccination coverage just below the critical vaccination coverage. In these populations, it is estimated that almost two infections can be prevented per vaccination. We discuss the implications for the optimal control of mumps in heterogeneously vaccinated populations.
Affiliation(s)
- Michiel van Boven
- Centre for Infectious Disease Control, National Institute for Public Health and the Environment, Bilthoven, The Netherlands.
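As a rough consistency check on the figures quoted in the abstract of van Boven et al. above, the standard textbook relation between critical vaccination coverage, the reproduction number in an unvaccinated population, and vaccine efficacy can be evaluated at the reported point estimates. The symbols $v_c$, $R_0$, and $\mathrm{VE}$ are introduced here for illustration, and it is an assumption that this simple relation approximates the paper's own model-based calculation.

```latex
v_c \;=\; \frac{1 - 1/R_0}{\mathrm{VE}}
\;\approx\; \frac{1 - 1/2.49}{0.933}
\;\approx\; \frac{0.598}{0.933}
\;\approx\; 0.641
```

The result, about 64.1%, is close to the reported 64.2%. A small gap is expected, since a posterior summary of a nonlinear function of the parameters need not equal the function evaluated at the parameters' point estimates.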
26
Stack JC, Bansal S, Kumar VSA, Grenfell B. Inferring population-level contact heterogeneity from common epidemic data. J R Soc Interface 2013; 10:20120578. [PMID: 23034353 PMCID: PMC3565785 DOI: 10.1098/rsif.2012.0578] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 07/19/2012] [Accepted: 09/10/2012] [Indexed: 11/27/2022]
Abstract
Models of infectious disease spread that incorporate contact heterogeneity through contact networks are an important tool for epidemiologists studying disease dynamics and assessing intervention strategies. One of the challenges of contact network epidemiology has been the difficulty of collecting individual and population-level data needed to develop an accurate representation of the underlying host population's contact structure. In this study, we evaluate the utility of common epidemiological measures (R0, epidemic peak size, duration and final size) for inferring the degree of heterogeneity in a population's unobserved contact structure through a Bayesian approach. We test the method using ground truth data and find that some of these epidemiological metrics are effective at classifying contact heterogeneity. The classification is also consistent across pathogen transmission probabilities, and so can be applied even when this characteristic is unknown. In particular, the reproductive number, R0, turns out to be a poor classifier of the degree heterogeneity, while, unexpectedly, final epidemic size is a powerful predictor of network structure across the range of heterogeneity. We also evaluate our framework on empirical epidemiological data from past and recent outbreaks to demonstrate its application in practice and to gather insights about the relevance of particular contact structures for both specific systems and general classes of infectious disease. We thus introduce a simple approach that can shed light on the unobserved connectivity of a host population given epidemic data. Our study has the potential to inform future data-collection efforts and study design by driving our understanding of germane epidemic measures, and highlights a general inferential approach to learning about host contact structure in contemporary or historic populations of humans and animals.
Affiliation(s)
- J. Conrad Stack
- Department of Biology, Pennsylvania State University, University Park, PA 16802-5301, USA
- Shweta Bansal
- Center for Infectious Disease Dynamics, Pennsylvania State University, University Park, PA 16802-5301, USA
- Fogarty International Center, National Institutes of Health, Bethesda, MD 20892-220, USA
- V. S. Anil Kumar
- Department of Computer Science and Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA
- Bryan Grenfell
- Fogarty International Center, National Institutes of Health, Bethesda, MD 20892-220, USA
- Department of Ecology and Evolutionary Biology and Woodrow Wilson School, Princeton University, Princeton, NJ 08540, USA
27
Hunter DR, Krivitsky PN, Schweinberger M. Computational Statistical Methods for Social Network Models. J Comput Graph Stat 2012; 21:856-882. [PMID: 23828720 PMCID: PMC3697157 DOI: 10.1080/10618600.2012.732921] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 10/27/2022]
Abstract
We review the broad range of recent statistical work in social network models, with emphasis on computational aspects of these methods. Particular focus is applied to exponential-family random graph models (ERGM) and latent variable models for data on complete networks observed at a single time point, though we also briefly review many methods for incompletely observed networks and networks observed at multiple time points. Although we mention far more modeling techniques than we can possibly cover in depth, we provide numerous citations to current literature. We illustrate several of the methods on a small, well-known network dataset, Sampson's monks, providing code where possible so that these analyses may be duplicated.
Affiliation(s)
- David R. Hunter
- Department of Statistics, Pennsylvania State University, University Park, PA
- Pavel N. Krivitsky
- Department of Statistics, Pennsylvania State University, University Park, PA