51
Klinkenberg D, Backer JA, Didelot X, Colijn C, Wallinga J. Simultaneous inference of phylogenetic and transmission trees in infectious disease outbreaks. PLoS Comput Biol 2017; 13:e1005495. [PMID: 28545083 PMCID: PMC5436636 DOI: 10.1371/journal.pcbi.1005495] [Citation(s) in RCA: 66] [Impact Index Per Article: 9.4] [Received: 08/16/2016] [Accepted: 04/03/2017] [Indexed: 01/22/2023] Open
Abstract
Whole-genome sequencing of pathogens from host samples becomes more and more routine during infectious disease outbreaks. These data provide information on possible transmission events which can be used for further epidemiologic analyses, such as identification of risk factors for infectivity and transmission. However, the relationship between transmission events and sequence data is obscured by uncertainty arising from four largely unobserved processes: transmission, case observation, within-host pathogen dynamics and mutation. To properly resolve transmission events, these processes need to be taken into account. Recent years have seen much progress in theory and method development, but existing applications make simplifying assumptions that often break up the dependency between the four processes, or are tailored to specific datasets with matching model assumptions and code. To obtain a method with wider applicability, we have developed a novel approach to reconstruct transmission trees with sequence data. Our approach combines elementary models for transmission, case observation, within-host pathogen dynamics, and mutation, under the assumption that the outbreak is over and all cases have been observed. We use Bayesian inference with MCMC for which we have designed novel proposal steps to efficiently traverse the posterior distribution, taking account of all unobserved processes at once. This allows for efficient sampling of transmission trees from the posterior distribution, and robust estimation of consensus transmission trees. We implemented the proposed method in a new R package phybreak. The method performs well in tests of both new and published simulated data. We apply the model to five datasets on densely sampled infectious disease outbreaks, covering a wide range of epidemiological settings. Using only sampling times and sequences as data, our analyses confirmed the original results or improved on them: the more realistic infection times place more confidence in the inferred transmission trees.

It is becoming easier and cheaper to obtain (whole genome) sequences of pathogen samples during outbreaks of infectious diseases. If all hosts during an outbreak are sampled, and these samples are sequenced, the small differences between the sequences (single nucleotide polymorphisms, SNPs) give information on the transmission tree, i.e. who infected whom, and when. However, correctly inferring this tree is not straightforward, because SNPs arise from unobserved processes including infection events, as well as pathogen growth and mutation within the hosts. Several methods have been developed in recent years, but often for specific applications or with limiting assumptions, so that they are not easily applied to new settings and datasets. We have developed a new model and method to infer transmission trees without putting prior limiting constraints on the order of unobserved events. The method is easily accessible in an R package implementation. We show that the method performs well on new and previously published simulated data. We illustrate applicability to a wide range of infectious diseases and settings by analysing five published datasets on densely sampled infectious disease outbreaks, confirming or improving the original results.
Affiliation(s)
- Don Klinkenberg
  - Department of Epidemiology and Surveillance, National Institute for Public Health and the Environment, Bilthoven, The Netherlands
- Jantien A. Backer
  - Department of Epidemiology and Surveillance, National Institute for Public Health and the Environment, Bilthoven, The Netherlands
- Xavier Didelot
  - Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
- Caroline Colijn
  - Department of Mathematics, Imperial College London, London, United Kingdom
- Jacco Wallinga
  - Department of Epidemiology and Surveillance, National Institute for Public Health and the Environment, Bilthoven, The Netherlands
  - Department of Medical Statistics and Bio-Informatics, Leiden University Medical Center, Leiden, The Netherlands
52
Didelot X, Fraser C, Gardy J, Colijn C. Genomic Infectious Disease Epidemiology in Partially Sampled and Ongoing Outbreaks. Mol Biol Evol 2017; 34:997-1007. [PMID: 28100788 PMCID: PMC5850352 DOI: 10.1093/molbev/msw275] [Citation(s) in RCA: 109] [Impact Index Per Article: 15.6] [Indexed: 11/13/2022] Open
Abstract
Genomic data are increasingly being used to understand infectious disease epidemiology. Isolates from a given outbreak are sequenced, and the patterns of shared variation are used to infer which isolates within the outbreak are most closely related to each other. Unfortunately, the phylogenetic trees typically used to represent this variation are not directly informative about who infected whom: a phylogenetic tree is not a transmission tree. However, a transmission tree can be inferred from a phylogeny while accounting for within-host genetic diversity by coloring the branches of a phylogeny according to which host those branches were in. Here we extend this approach and show that it can be applied to partially sampled and ongoing outbreaks. This requires computing the correct probability of an observed transmission tree and we herein demonstrate how to do this for a large class of epidemiological models. We also demonstrate how the branch coloring approach can incorporate a variable number of unique colors to represent unsampled intermediates in transmission chains. The resulting algorithm is a reversible jump Monte-Carlo Markov Chain, which we apply to both simulated data and real data from an outbreak of tuberculosis. By accounting for unsampled cases and an outbreak which may not have reached its end, our method is uniquely suited to use in a public health environment during real-time outbreak investigations. We implemented this transmission tree inference methodology in an R package called TransPhylo, which is freely available from https://github.com/xavierdidelot/TransPhylo.
Affiliation(s)
- Xavier Didelot
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, London, United Kingdom
- Christophe Fraser
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place, London, United Kingdom
  - Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Jennifer Gardy
  - Communicable Disease Prevention and Control Services, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
  - School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada
- Caroline Colijn
  - Department of Mathematics, Imperial College, London, United Kingdom
53
Rasmussen DA, Kouyos R, Günthard HF, Stadler T. Phylodynamics on local sexual contact networks. PLoS Comput Biol 2017; 13:e1005448. [PMID: 28350852 PMCID: PMC5388502 DOI: 10.1371/journal.pcbi.1005448] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 10/25/2016] [Revised: 04/11/2017] [Accepted: 03/10/2017] [Indexed: 12/26/2022] Open
Abstract
Phylodynamic models are widely used in infectious disease epidemiology to infer the dynamics and structure of pathogen populations. However, these models generally assume that individual hosts contact one another at random, ignoring the fact that many pathogens spread through highly structured contact networks. We present a new framework for phylodynamics on local contact networks based on pairwise epidemiological models that track the status of pairs of nodes in the network rather than just individuals. Shifting our focus from individuals to pairs leads naturally to coalescent models that describe how lineages move through networks and the rate at which lineages coalesce. These pairwise coalescent models not only consider how network structure directly shapes pathogen phylogenies, but also how the relationship between phylogenies and contact networks changes depending on epidemic dynamics and the fraction of infected hosts sampled. By considering pathogen phylogenies in a probabilistic framework, these coalescent models can also be used to estimate the statistical properties of contact networks directly from phylogenies using likelihood-based inference. We use this framework to explore how much information phylogenies retain about the underlying structure of contact networks and to infer the structure of a sexual contact network underlying a large HIV-1 sub-epidemic in Switzerland.

Phylodynamic models relate the branching pattern of a pathogen’s phylogenetic tree to the tree-like growth of an epidemic as it spreads through a host population. Such models are increasingly used to learn about the epidemiology of different pathogens. We extend current models to consider the structure of host contact networks: the web of physical interactions through which pathogens spread. By considering how local interactions among hosts shape the phylogeny of a pathogen, our models offer a “pathogen’s eye view” of these networks. Our models also provide a statistical framework that can be used to infer network structure directly from phylogenies, which we use to estimate the properties of a sexual contact network in Switzerland from a HIV phylogeny.
Affiliation(s)
- David A. Rasmussen
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Roger Kouyos
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, University of Zürich, Zürich, Switzerland
  - Institute of Medical Virology, University of Zürich, Zürich, Switzerland
- Huldrych F. Günthard
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, University of Zürich, Zürich, Switzerland
  - Institute of Medical Virology, University of Zürich, Zürich, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
54
Colijn C, Jones N, Johnston IG, Yaliraki S, Barahona M. Toward Precision Healthcare: Context and Mathematical Challenges. Front Physiol 2017; 8:136. [PMID: 28377724 PMCID: PMC5359292 DOI: 10.3389/fphys.2017.00136] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 12/01/2016] [Accepted: 02/22/2017] [Indexed: 12/12/2022] Open
Abstract
Precision medicine refers to the idea of delivering the right treatment to the right patient at the right time, usually with a focus on a data-centered approach to this task. In this perspective piece, we use the term "precision healthcare" to describe the development of precision approaches that bridge from the individual to the population, taking advantage of individual-level data, but also taking the social context into account. These problems give rise to a broad spectrum of technical, scientific, policy, ethical and social challenges, and new mathematical techniques will be required to meet them. To ensure that the science underpinning "precision" is robust, interpretable and well-suited to meet the policy, ethical and social questions that such approaches raise, the mathematical methods for data analysis should be transparent, robust, and able to adapt to errors and uncertainties. In particular, precision methodologies should capture the complexity of data, yet produce tractable descriptions at the relevant resolution while preserving intelligibility and traceability, so that they can be used by practitioners to aid decision-making. Through several case studies in this domain of precision healthcare, we argue that this vision requires the development of new mathematical frameworks, both in modeling and in data analysis and interpretation.
Affiliation(s)
- Caroline Colijn
  - Department of Mathematics, Imperial College London, London, UK
  - EPSRC Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
- Nick Jones
  - Department of Mathematics, Imperial College London, London, UK
  - EPSRC Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
- Iain G. Johnston
  - EPSRC Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
  - School of Biosciences, University of Birmingham, Birmingham, UK
- Sophia Yaliraki
  - EPSRC Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
  - Department of Chemistry, Imperial College London, London, UK
- Mauricio Barahona
  - Department of Mathematics, Imperial College London, London, UK
  - EPSRC Centre for Mathematics of Precision Healthcare, Imperial College London, London, UK
55
Guthrie JL, Gardy JL. A brief primer on genomic epidemiology: lessons learned from Mycobacterium tuberculosis. Ann N Y Acad Sci 2016; 1388:59-77. [PMID: 28009051 DOI: 10.1111/nyas.13273] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 07/05/2016] [Revised: 09/02/2016] [Accepted: 09/13/2016] [Indexed: 12/13/2022]
Abstract
Genomics is now firmly established as a technique for the investigation and reconstruction of communicable disease outbreaks, with many genomic epidemiology studies focusing on revealing transmission routes of Mycobacterium tuberculosis. In this primer, we introduce the basic techniques underlying transmission inference from genomic data, using illustrative examples from M. tuberculosis and other pathogens routinely sequenced by public health agencies. We describe the laboratory and epidemiological scenarios under which genomics may or may not be used, provide an introduction to sequencing technologies and bioinformatics approaches to identifying transmission-informative variation and resistance-associated mutations, and discuss how variation must be considered in the light of available clinical and epidemiological information to infer transmission.
Affiliation(s)
- Jennifer L Guthrie
  - School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada
- Jennifer L Gardy
  - School of Population and Public Health, University of British Columbia, Vancouver, British Columbia, Canada
  - Communicable Disease Prevention and Control Services, British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada
56
Parratt SR, Numminen E, Laine AL. Infectious Disease Dynamics in Heterogeneous Landscapes. Annu Rev Ecol Evol Syst 2016. [DOI: 10.1146/annurev-ecolsys-121415-032321] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Indexed: 11/09/2022]
Abstract
Infectious diseases dynamics are affected by both spatial and temporal heterogeneity in their environments. Our ability to quantify and predict how this heterogeneity impacts risks of infection and disease emergence is the key to successful disease prevention efforts. Here, we review the literature on infectious diseases from human, agricultural, and wildlife ecosystems to describe the rapid ecological and evolutionary responses in pathogens to environmental heterogeneity, with expected impacts on their epidemiology. To date, the underlying network structures through which disease transmission proceeds have been notoriously difficult to quantify because of this variation. We show that with recent advances in statistical methods and genomic approaches, it is now more feasible than ever to trace disease transmission networks, the molecular underpinning of infection, and the environmental variation relevant to disease dynamics. We end by identifying major new opportunities and challenges in understanding disease dynamics in an ever-changing world.
Affiliation(s)
- Steven R. Parratt
  - Metapopulation Research Centre, Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
- Elina Numminen
  - Metapopulation Research Centre, Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
- Anna-Liisa Laine
  - Metapopulation Research Centre, Department of Biosciences, University of Helsinki, FI-00014 Helsinki, Finland
57
De Maio N, Wu CH, Wilson DJ. SCOTTI: Efficient Reconstruction of Transmission within Outbreaks with the Structured Coalescent. PLoS Comput Biol 2016; 12:e1005130. [PMID: 27681228 PMCID: PMC5040440 DOI: 10.1371/journal.pcbi.1005130] [Citation(s) in RCA: 78] [Impact Index Per Article: 9.8] [Received: 03/30/2016] [Accepted: 09/05/2016] [Indexed: 11/18/2022] Open
Abstract
Exploiting pathogen genomes to reconstruct transmission represents a powerful tool in the fight against infectious disease. However, their interpretation rests on a number of simplifying assumptions that regularly ignore important complexities of real data, in particular within-host evolution and non-sampled patients. Here we propose a new approach to transmission inference called SCOTTI (Structured COalescent Transmission Tree Inference). This method is based on a statistical framework that models each host as a distinct population, and transmissions between hosts as migration events. Our computationally efficient implementation of this model enables the inference of host-to-host transmission while accommodating within-host evolution and non-sampled hosts. SCOTTI is distributed as an open source package for the phylogenetic software BEAST2. We show that SCOTTI can generally infer transmission events even in the presence of considerable within-host variation, can account for the uncertainty associated with the possible presence of non-sampled hosts, and can efficiently use data from multiple samples of the same host, although there is some reduction in accuracy when samples are collected very close to the infection time. We illustrate the features of our approach by investigating transmission from genetic and epidemiological data in a Foot and Mouth Disease Virus (FMDV) veterinary outbreak in England and a Klebsiella pneumoniae outbreak in a Nepali neonatal unit. Transmission histories inferred with SCOTTI will be important in devising effective measures to prevent and halt transmission.

We present a new tool, SCOTTI, to efficiently reconstruct transmission events within outbreaks. Our approach combines genetic information from infection samples with epidemiological information of patient exposure to infection. While epidemiological information has been traditionally used to understand who infected whom in an outbreak, detailed genetic information is increasingly becoming available with the steady progress of sequencing technologies. However, many complications, if unaccounted for, can affect the accuracy with which the transmission history is reconstructed. SCOTTI efficiently accounts for several complications, in particular within-patient genetic variation of the infectious organism, and non-sampled patients (such as asymptomatic patients). Thanks to these features, SCOTTI provides accurate reconstructions of transmission in complex scenarios, which will be important in finding and limiting the sources and routes of transmission, preventing the spread of infectious disease.
Affiliation(s)
- Nicola De Maio
  - Institute for Emerging Infections, Oxford Martin School, University of Oxford, Oxford, United Kingdom
  - Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Chieh-Hsi Wu
  - Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Daniel J Wilson
  - Institute for Emerging Infections, Oxford Martin School, University of Oxford, Oxford, United Kingdom
  - Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
  - Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, United Kingdom
58
Hall MD, Woolhouse MEJ, Rambaut A. Using genomics data to reconstruct transmission trees during disease outbreaks. Rev Sci Tech OIE 2016; 35:287-96. [PMID: 27217184 DOI: 10.20506/rst.35.1.2433] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 11/23/2022]
Abstract
Genetic sequence data from pathogens present a novel means to investigate the spread of infectious disease between infected hosts or infected premises, complementing traditional contact-tracing approaches, and much recent work has gone into developing methods for this purpose. The objective is to recover the epidemic transmission tree, which identifies who infected whom. This paper reviews the various approaches that have been taken. The first step is to define a measure of difference between sequences, which must be done while taking into account such factors as recombination and convergent evolution. Three broad categories of method exist, of increasing complexity: those that assume no within-host genetic diversity or mutation, those that assume no within-host diversity but allow mutation, and those that allow both. Until recently, the assumption was usually made that every host in the epidemic could be identified, but this is now being relaxed, and some methods are intended for sparsely sampled data, concentrating on the identification of pairs of sequences that are likely to be the result of direct transmission rather than inferring the complete transmission tree. Many of the procedures described here are available to researchers as free software.
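As an illustration of the "measure of difference between sequences" that the abstract names as the first step, the sketch below counts pairwise SNP differences between aligned sequences. It is a minimal example and not a method taken from the paper: the function names `snp_distance` and `distance_matrix`, the choice to skip ambiguous bases (`N`) and gaps (`-`), and the toy sequences are all assumptions made here. The count also ignores recombination and convergent evolution, which the abstract says a real measure must take into account.

```python
from itertools import combinations


def snp_distance(seq_a, seq_b, ignore="N-"):
    """Count positions where two aligned sequences differ.

    Positions where either sequence has an ambiguous base or a gap
    (characters in `ignore`) are skipped. Hypothetical helper, for illustration.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in ignore and b not in ignore
    )


def distance_matrix(seqs):
    """Return {(name_i, name_j): distance} for every unordered pair of hosts."""
    return {
        (x, y): snp_distance(seqs[x], seqs[y])
        for x, y in combinations(sorted(seqs), 2)
    }


# Toy aligned sequences, invented for this example.
samples = {
    "host_A": "ACGTACGTAC",
    "host_B": "ACGTACGTAT",
    "host_C": "ACGAACNTAT",
}
pairwise = distance_matrix(samples)
```

Pairs with small distances are candidates for direct transmission, but a raw SNP count alone cannot establish who infected whom.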
59
Vasylyeva TI, Friedman SR, Paraskevis D, Magiorkinis G. Integrating molecular epidemiology and social network analysis to study infectious diseases: Towards a socio-molecular era for public health. Infect Genet Evol 2016; 46:248-255. [PMID: 27262354 PMCID: PMC5135626 DOI: 10.1016/j.meegid.2016.05.042] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 03/11/2016] [Revised: 05/26/2016] [Accepted: 05/31/2016] [Indexed: 12/30/2022]
Abstract
The number of public health applications for molecular epidemiology and social network analysis has increased rapidly since the improvement in computational capacities and the development of new sequencing techniques. Currently, molecular epidemiology methods are used in a variety of settings: from infectious disease surveillance systems to the description of disease transmission pathways. The latter are of great epidemiological importance as they let us describe how a virus spreads in a community, make predictions for the further epidemic developments, and plan preventive interventions. Social network methods are used to understand how infections spread through communities and what the risk factors for this are, as well as in improved contact tracing and message-dissemination interventions. Research is needed on how to combine molecular and social network data as both include essential, but not fully sufficient information on infection transmission pathways. The main differences between the two data sources are that, firstly, social network data include uninfected individuals unlike the molecular data sampled only from infected network members. Thus, social network data include a more detailed picture of a network and can improve inferences made from molecular data. Secondly, network data refer to the current state and interactions within the social network, while molecular data refer to the time points when transmissions happened, which might have happened years before the sampling date. As of today, there have been attempts to combine and compare the data obtained from the two sources. Even though there is no consensus on whether and how social and genetic data complement each other, this research might significantly improve our understanding of how viruses spread through communities.

We summarise and analyse the roles of molecular evolution studies in molecular epidemiology of infectious diseases. We review how social network and molecular sequence data have been integrated in the past. We show how integrating social network and molecular evolution approaches may change the study of infectious diseases.
Affiliation(s)
- Tetyana I Vasylyeva
  - Department of Zoology, University of Oxford, South Parks Road, OX1 3PS Oxford, United Kingdom
- Samuel R Friedman
  - Institute for Infectious Disease Research, National Development and Research Institutes, New York, NY 10010, USA
- Dimitrios Paraskevis
  - Department of Hygiene, Epidemiology, and Medical Statistics, Athens University Medical School, 75, M. Asias Street, Athens 115 27, Greece
- Gkikas Magiorkinis
  - Department of Zoology, University of Oxford, South Parks Road, OX1 3PS Oxford, United Kingdom
60
Kenah E, Britton T, Halloran ME, Longini IM. Molecular Infectious Disease Epidemiology: Survival Analysis and Algorithms Linking Phylogenies to Transmission Trees. PLoS Comput Biol 2016; 12:e1004869. [PMID: 27070316 PMCID: PMC4829193 DOI: 10.1371/journal.pcbi.1004869] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 07/15/2015] [Accepted: 03/15/2016] [Indexed: 12/20/2022] Open
Abstract
Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. In a survival analysis framework, estimation of transmission parameters is based on sums or averages over the possible transmission trees. A phylogeny can increase the precision of these estimates by providing partial information about who infected whom. The leaves of the phylogeny represent sampled pathogens, which have known hosts. The interior nodes represent common ancestors of sampled pathogens, which have unknown hosts. Starting from assumptions about disease biology and epidemiologic study design, we prove that there is a one-to-one correspondence between the possible assignments of interior node hosts and the transmission trees simultaneously consistent with the phylogeny and the epidemiologic data on person, place, and time. We develop algorithms to enumerate these transmission trees and show these can be used to calculate likelihoods that incorporate both epidemiologic data and a phylogeny. A simulation study confirms that this leads to more efficient estimates of hazard ratios for infectiousness and baseline hazards of infectious contact, and we use these methods to analyze data from a foot-and-mouth disease virus outbreak in the United Kingdom in 2001. These results demonstrate the importance of data on individuals who escape infection, which is often overlooked. The combination of survival analysis and algorithms linking phylogenies to transmission trees is a rigorous but flexible statistical foundation for molecular infectious disease epidemiology. 
Recent work has attempted to use whole-genome sequence data from pathogens to reconstruct the transmission trees linking infectors and infectees in outbreaks. However, transmission trees from one outbreak do not generalize to future outbreaks. Reconstruction of transmission trees is most useful to public health if it leads to generalizable scientific insights about disease transmission. Accurate estimates of transmission parameters can help identify risk factors for transmission and aid the design and evaluation of public health interventions for emerging infections. Using statistical methods for time-to-event data (survival analysis), estimation of transmission parameters is based on sums or averages over the possible transmission trees. By providing partial information about who infected whom, a pathogen phylogeny can reduce the set of possible transmission trees and increase the precision of transmission parameter estimates. We derive algorithms that enumerate the transmission trees consistent with a pathogen phylogeny and epidemiologic data, show how to calculate likelihoods for transmission data with a phylogeny, and apply these methods to a foot and mouth disease outbreak in the United Kingdom in 2001. These methods will allow pathogen genetic sequences to be incorporated into the analysis of outbreak investigations, vaccine trials, and other studies of infectious disease transmission.
Affiliation(s)
- Eben Kenah
  - Biostatistics Department and Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
  - Center for Inference and Dynamics of Infectious Diseases, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Tom Britton
  - Department of Mathematics, Stockholm University, Stockholm, Sweden
- M. Elizabeth Halloran
  - Center for Inference and Dynamics of Infectious Diseases, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Vaccine and Infectious Diseases Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
  - Department of Biostatistics, University of Washington, Seattle, Washington, United States of America
- Ira M. Longini
  - Biostatistics Department and Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
  - Center for Inference and Dynamics of Infectious Diseases, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
|
61
Gardy JL. Translating phylogeny into action for HIV surveillance. Lancet HIV 2016; 3:e196-7. [PMID: 27126483 DOI: 10.1016/s2352-3018(16)30012-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/16/2016] [Accepted: 03/17/2016] [Indexed: 11/30/2022]
Affiliation(s)
- Jennifer L Gardy
  - Communicable Disease Prevention and Control Services, British Columbia Centre for Disease Control, Vancouver, BC, Canada
  - School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada
62
Plazzotta G, Kwan C, Boyd M, Colijn C. Effects of memory on the shapes of simple outbreak trees. Sci Rep 2016; 6:21159. [PMID: 26888437 PMCID: PMC4758066 DOI: 10.1038/srep21159] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/07/2015] [Accepted: 01/07/2016] [Indexed: 12/15/2022] Open
Abstract
Genomic tools, including phylogenetic trees derived from sequence data, are increasingly used to understand outbreaks of infectious diseases. One challenge is to link phylogenetic trees to patterns of transmission. Particularly in bacteria that cause chronic infections, this inference is affected by variable infectious periods and infectivity over time. It is known that non-exponential infectious periods can have substantial effects on pathogens’ transmission dynamics. Here we ask how this non-Markovian nature of an outbreak process affects the branching trees describing that process, with particular focus on tree shapes. We simulate Crump-Mode-Jagers branching processes and compare different patterns of infectivity over time. We find that memory (non-Markovian-ness) in the process can have a pronounced effect on the shapes of the outbreak’s branching pattern. However, memory also has a pronounced effect on the sizes of the trees, even when the duration of the simulation is fixed. When the sizes of the trees are constrained to a constant value, memory in our processes has little direct effect on tree shapes, but can bias inference of the birth rate from trees. We compare simulated branching trees to phylogenetic trees from an outbreak of tuberculosis in Canada, and discuss the relevance of memory to this dataset.
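To make the contrast between memoryless and non-Markovian outbreaks concrete, the sketch below simulates a very simple branching outbreak in which each case is infectious for a gamma-distributed period and transmits at a constant rate while infectious. It is an illustrative toy and not the authors' simulation code: the function name `simulate_outbreak`, its parameters, the constant per-case transmission rate, and the gamma-distributed infectious period are all assumptions made here. Setting `shape=1` gives the exponential (memoryless) case; a larger `shape` with the same mean gives a less variable infectious period, which is one simple way of adding memory.

```python
import random


def simulate_outbreak(beta, mean_infectious, shape, t_max, max_cases=1000, seed=0):
    """Simulate a simple branching outbreak (hypothetical illustration).

    Each case is infectious for a gamma-distributed period with the given
    mean and shape, and infects new cases at constant rate `beta` while
    infectious. Returns a list of (infector_index, infection_time) tuples;
    the index case has infector None and infection time 0.0.
    """
    rng = random.Random(seed)
    cases = [(None, 0.0)]
    i = 0
    while i < len(cases) and len(cases) < max_cases:
        _, t_infected = cases[i]
        # random.gammavariate(alpha, beta) has mean alpha * beta
        duration = rng.gammavariate(shape, mean_infectious / shape)
        t = t_infected
        while len(cases) < max_cases:
            t += rng.expovariate(beta)  # waiting time to the next transmission
            if t > t_infected + duration or t > t_max:
                break
            cases.append((i, t))
        i += 1
    return cases


# Same mean infectious period, different amounts of memory.
markovian = simulate_outbreak(beta=0.5, mean_infectious=4.0, shape=1.0, t_max=10.0)
with_memory = simulate_outbreak(beta=0.5, mean_infectious=4.0, shape=5.0, t_max=10.0)
```

Cases are processed in index order rather than strictly in time order, so truncating at `max_cases` can bias the resulting tree. For comparisons of tree shape at a fixed size, as described in the abstract, an event-ordered simulation would be the more careful choice.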
Affiliation(s)
- Christopher Kwan
  - Department of Electrical and Electronic Engineering, Imperial College London, London, UK
- Michael Boyd
  - Department of Mathematics, University of Cambridge, Cambridge, UK
- Caroline Colijn
  - Department of Mathematics, Imperial College London, London, UK