1
Seidel S, Stadler T, Vaughan TG. Estimating pathogen spread using structured coalescent and birth-death models: A quantitative comparison. Epidemics 2024; 49:100795. [PMID: 39461051 DOI: 10.1016/j.epidem.2024.100795] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 09/09/2024] [Accepted: 09/19/2024] [Indexed: 10/29/2024]
Abstract
Elucidating disease spread between subpopulations is crucial in guiding effective disease control efforts. Genomic epidemiology and phylodynamics have emerged as key principles to estimate such spread from pathogen phylogenies derived from molecular data. Two well-established structured phylodynamic methodologies - based on the coalescent and the birth-death model - are frequently employed to estimate viral spread between populations. Nonetheless, these methodologies operate under distinct assumptions whose impact on the accuracy of migration rate inference is yet to be thoroughly investigated. In this manuscript, we present a simulation study, contrasting the inferential outcomes of the structured coalescent model with constant population size and the multitype birth-death model with a constant rate. We explore this comparison across a range of migration rates in endemic diseases and epidemic outbreaks. The results of the epidemic outbreak analysis revealed that the birth-death model exhibits a superior ability to retrieve accurate migration rates compared to the coalescent model, regardless of the actual migration rate. Thus, to estimate accurate migration rates, the population dynamics have to be accounted for. On the other hand, for the endemic disease scenario, our investigation demonstrates that both models produce comparable coverage and accuracy of the migration rates, with the coalescent model generating more precise estimates. Regardless of the specific scenario, both models similarly estimated the source location of the disease. This research offers tangible modelling advice for infectious disease analysts, suggesting the use of either model for endemic diseases. For epidemic outbreaks, or scenarios with varying population size, structured phylodynamic models relying on the Kingman coalescent with constant population size should be avoided as they can lead to inaccurate estimates of the migration rate. 
Instead, coalescent models accounting for varying population size or birth-death models should be favoured. Importantly, our study emphasises the value of directly capturing exponential growth dynamics which could be a useful enhancement for structured coalescent models.
Affiliation(s)
- Sophie Seidel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; Swiss Institute of Bioinformatics (SIB), Basel, Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; Swiss Institute of Bioinformatics (SIB), Basel, Switzerland
- Timothy G Vaughan
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; Swiss Institute of Bioinformatics (SIB), Basel, Switzerland
2
Carnegie L, McCrone JT, du Plessis L, Hasan M, Ali MZ, Begum R, Hassan MZ, Islam S, Rahman MH, Uddin ASM, Sarker MS, Das T, Hossain M, Khan M, Razu MH, Akram A, Arina S, Hoque E, Molla MMA, Nafisaa T, Angra P, Rambaut A, Pullan ST, Osman KL, Hoque MA, Biswas P, Flora MS, Raghwani J, Fournié G, Samad MA, Hill SC. Genomic epidemiology of early SARS-CoV-2 transmission dynamics in Bangladesh. Virol J 2024; 21:291. [PMID: 39538264 PMCID: PMC11562509 DOI: 10.1186/s12985-024-02560-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 10/26/2024] [Indexed: 11/16/2024]
Abstract
BACKGROUND Genomic epidemiology has helped reconstruct the global and regional movement of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). However, there is still a lack of understanding of SARS-CoV-2 spread in some of the world's least developed countries (LDCs). METHODS To begin to address this disparity, we studied the transmission dynamics of the virus in Bangladesh during the country's first COVID-19 wave by analysing case reports and whole-genome sequences from all eight divisions of the country. RESULTS We detected > 50 virus introductions to the country during the period, including during a period of national lockdown. Additionally, through discrete phylogeographic analyses, we identified that geographical distance and population density and/or size influenced virus spatial dispersal in Bangladesh. CONCLUSIONS Overall, this study expands our knowledge of SARS-CoV-2 genomic epidemiology in Bangladesh, shedding light on crucial transmission characteristics within the country, while also acknowledging resemblances and differences to patterns observed in other nations.
Affiliation(s)
- L Carnegie
  - Department of Pathobiology and Population Sciences, Royal Veterinary College (RVC), Hatfield, Hertfordshire, UK
- J T McCrone
  - Institute of Ecology and Evolution, University of Edinburgh, King's Buildings, Edinburgh, UK
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- L du Plessis
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- M Hasan
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- M Z Ali
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- R Begum
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- M Z Hassan
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- S Islam
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
  - Global Change Center, Virginia Tech, Blacksburg, VA, USA
- M H Rahman
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- A S M Uddin
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- M S Sarker
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- T Das
  - Chattogram Veterinary and Animal Sciences University (CVASU), Khulshi, Chattogram, Bangladesh
  - School of Agricultural, Environmental and Veterinary Sciences, Charles Sturt University, Wagga Wagga, NSW, Australia
- M Hossain
  - NSU Genome Research Institute (NGRI), North South University, Bashundhara, Dhaka, Bangladesh
  - Department of Biochemistry and Microbiology, North South University, Bashundhara, Dhaka, Bangladesh
- M Khan
  - Bangladesh Reference Institute for Chemical Measurements (BRiCM), Dhanmondi, Dhaka, Bangladesh
- M H Razu
  - Bangladesh Reference Institute for Chemical Measurements (BRiCM), Dhanmondi, Dhaka, Bangladesh
- A Akram
  - National Institute of Laboratory Medicine and Referral Centre (NILMRC), Agargoan, Dhaka, Bangladesh
- S Arina
  - National Institute of Laboratory Medicine and Referral Centre (NILMRC), Agargoan, Dhaka, Bangladesh
- E Hoque
  - National Institute of Laboratory Medicine and Referral Centre (NILMRC), Agargoan, Dhaka, Bangladesh
- M M A Molla
  - National Institute of Laboratory Medicine and Referral Centre (NILMRC), Agargoan, Dhaka, Bangladesh
- T Nafisaa
  - National Institute of Laboratory Medicine and Referral Centre (NILMRC), Agargoan, Dhaka, Bangladesh
- P Angra
  - Centers for Disease Control and Prevention (CDC), Atlanta, GA, USA
- A Rambaut
  - Institute of Ecology and Evolution, University of Edinburgh, King's Buildings, Edinburgh, UK
- S T Pullan
  - United Kingdom Health Security Agency (UKHSA), Porton Down, Salisbury, UK
- K L Osman
  - United Kingdom Health Security Agency (UKHSA), Porton Down, Salisbury, UK
- M A Hoque
  - Chattogram Veterinary and Animal Sciences University (CVASU), Khulshi, Chattogram, Bangladesh
- P Biswas
  - Chattogram Veterinary and Animal Sciences University (CVASU), Khulshi, Chattogram, Bangladesh
- M S Flora
  - National Institute of Preventive and Social Medicine (NIPSOM), Ministry of Health and Family Welfare, Dhaka, Bangladesh
- J Raghwani
  - Department of Pathobiology and Population Sciences, Royal Veterinary College (RVC), Hatfield, Hertfordshire, UK
- G Fournié
  - Department of Pathobiology and Population Sciences, Royal Veterinary College (RVC), Hatfield, Hertfordshire, UK
  - Université de Lyon, INRAE, VetAgro Sup, UMR EPIA, Marcy l'Etoile, France
  - Université Clermont Auvergne, INRAE, VetAgro Sup, UMR EPIA, Saint Genes Champanelle, France
- M A Samad
  - Bangladesh Livestock Research Institute (BLRI), Savar, Dhaka, Bangladesh
- S C Hill
  - Department of Pathobiology and Population Sciences, Royal Veterinary College (RVC), Hatfield, Hertfordshire, UK
3
Tay JH, Kocher A, Duchene S. Assessing the effect of model specification and prior sensitivity on Bayesian tests of temporal signal. PLoS Comput Biol 2024; 20:e1012371. [PMID: 39504312 PMCID: PMC11573219 DOI: 10.1371/journal.pcbi.1012371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 11/18/2024] [Accepted: 10/23/2024] [Indexed: 11/08/2024]
Abstract
Our understanding of the evolution of many microbes has been revolutionised by the molecular clock, a statistical tool to infer evolutionary rates and timescales from analyses of biomolecular sequences. In all molecular clock models, evolutionary rates and times are jointly unidentifiable and 'calibration' information must therefore be used. For many organisms, sequences sampled at different time points can be employed for such calibration. Before attempting to do so, it is recommended to verify that the data carry sufficient information for molecular dating, a practice referred to as evaluation of temporal signal. Recently, a fully Bayesian approach, BETS (Bayesian Evaluation of Temporal Signal), was proposed to overcome known limitations of other commonly used techniques such as root-to-tip regression or date randomisation tests. BETS requires the specification of a full Bayesian phylogenetic model, posing several considerations for untangling the impact of model choice on the detection of temporal signal. Here, we aimed to (i) explore the effect of molecular clock model and tree prior specification on the results of BETS and (ii) provide guidelines for improving our confidence in molecular clock estimates. Using microbial molecular sequence data sets and simulation experiments, we assess the impact of the tree prior and its hyperparameters on the accuracy of temporal signal detection. In particular, highly informative priors that are inconsistent with the data can result in the incorrect detection of temporal signal. In consequence, we recommend: (i) using prior predictive simulations to determine whether the prior generates a reasonable expectation of parameters of interest, such as the evolutionary rate and age of the root node, (ii) conducting prior sensitivity analyses to assess the robustness of the posterior to the choice of prior, and (iii) selecting a molecular clock model that reasonably describes the evolutionary process.
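As an illustration of the kind of comparison a BETS-style test rests on, the sketch below computes a log Bayes factor from two log marginal likelihoods: one for a model that uses the real sampling dates and one that treats all samples as contemporaneous. This is a minimal sketch, not the authors' implementation. The function names, the input values, and the default threshold of 5 (a commonly used "very strong support" convention on the natural-log scale) are assumptions made for illustration; in practice the marginal likelihoods would come from an estimator such as path sampling or nested sampling.

```python
def log_bayes_factor(log_ml_dated: float, log_ml_undated: float) -> float:
    """Log Bayes factor in favour of the model that uses sampling dates.

    Both arguments are natural-log marginal likelihoods (hypothetical inputs,
    e.g. produced by a marginal likelihood estimator run outside this sketch).
    """
    return log_ml_dated - log_ml_undated


def supports_temporal_signal(
    log_ml_dated: float,
    log_ml_undated: float,
    threshold: float = 5.0,  # assumed cut-off, not taken from the abstract
) -> bool:
    """Return True when the dated model is favoured by at least `threshold`."""
    return log_bayes_factor(log_ml_dated, log_ml_undated) >= threshold


if __name__ == "__main__":
    # Made-up numbers purely to show the call pattern.
    dated, undated = -1000.0, -1010.0
    print(log_bayes_factor(dated, undated))
    print(supports_temporal_signal(dated, undated))
```

Because the Bayes factor depends on the full model, including the tree prior and its hyperparameters, the same data can cross or miss the threshold under different priors, which is the sensitivity the abstract warns about.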
Affiliation(s)
- John H. Tay
  - Peter Doherty Institute for Infection and Immunity, Department of Microbiology and Immunology, University of Melbourne, Melbourne, Australia
- Arthur Kocher
  - Transmission, Infection, Diversification and Evolution Group, Max Planck Institute of Geoanthropology, Jena, Germany
  - Department of Archaeogenetics, Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany
- Sebastian Duchene
  - Peter Doherty Institute for Infection and Immunity, Department of Microbiology and Immunology, University of Melbourne, Melbourne, Australia
  - DEMI unit, Department of Computational Biology, Institut Pasteur, Paris, France
4
Featherstone LA, McGaughran A. The effect of missing data on evolutionary analysis of sequence capture bycatch, with application to an agricultural pest. Mol Genet Genomics 2024; 299:11. [PMID: 38381254 PMCID: PMC10881687 DOI: 10.1007/s00438-024-02097-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2023] [Accepted: 12/29/2023] [Indexed: 02/22/2024]
Abstract
Sequence capture is a genomic technique that selectively enriches target sequences before high throughput next-generation sequencing, to generate specific sequences of interest. Off-target or 'bycatch' data are often discarded from capture experiments, but can be leveraged to address evolutionary questions under some circumstances. Here, we investigated the effects of missing data on a variety of evolutionary analyses using bycatch from an exon capture experiment on the global pest moth, Helicoverpa armigera. We added > 200 new samples from across Australia in the form of mitogenomes obtained as bycatch from targeted sequence capture, and combined these into an additional larger dataset to total > 1000 mitochondrial cytochrome c oxidase subunit I (COI) sequences across the species' global distribution. Using discriminant analysis of principal components and Bayesian coalescent analyses, we showed that mitogenomes assembled from bycatch with up to 75% missing data were able to return evolutionary inferences consistent with higher coverage datasets and the broader literature surrounding H. armigera. For example, low-coverage sequences broadly supported the delineation of two H. armigera subspecies and also provided new insights into the potential for geographic turnover among these subspecies. However, we also identified key effects of dataset coverage and composition on our results. Thus, low-coverage bycatch data can offer valuable information for population genetic and phylodynamic analyses, but caution is required to ensure the reduced information does not introduce confounding factors, such as sampling biases, that drive inference. 
We encourage more researchers to consider maximizing the potential of the targeted sequence approach by examining evolutionary questions with their off-target bycatch where possible, especially in cases where no previous mitochondrial data exist, but recommend stratifying data at different genome coverage thresholds to separate sampling effects from genuine genomic signals, and to understand their implications for evolutionary research.
Affiliation(s)
- Leo A Featherstone
  - Research School of Biology, Division of Ecology and Evolution, Australian National University, Canberra, ACT, 2601, Australia
  - Peter Doherty Institute for Infection and Immunity, The University of Melbourne, Melbourne, VIC, 3000, Australia
- Angela McGaughran
  - Research School of Biology, Division of Ecology and Evolution, Australian National University, Canberra, ACT, 2601, Australia
  - Te Aka Mātuatua, School of Science, University of Waikato, Private Bag 3105, Hamilton, 3240, New Zealand
5
Hollingsworth BD, Grubaugh ND, Lazzaro BP, Murdock CC. Leveraging insect-specific viruses to elucidate mosquito population structure and dynamics. PLoS Pathog 2023; 19:e1011588. [PMID: 37651317 PMCID: PMC10470969 DOI: 10.1371/journal.ppat.1011588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Several aspects of mosquito ecology that are important for vectored disease transmission and control have been difficult to measure at epidemiologically important scales in the field. In particular, the ability to describe mosquito population structure and movement rates has been hindered by difficulty in quantifying fine-scale genetic variation among populations. The mosquito virome represents a possible avenue for quantifying population structure and movement rates across multiple spatial scales. Mosquito viromes contain a diversity of viruses, including several insect-specific viruses (ISVs) and "core" viruses that have high prevalence across populations. To date, virome studies have focused on viral discovery and have only recently begun examining viral ecology. While nonpathogenic ISVs may be of little public health relevance themselves, they provide a possible route for quantifying mosquito population structure and dynamics. For example, vertically transmitted viruses could behave as a rapidly evolving extension of the host's genome. It should be possible to apply established analytical methods to appropriate viral phylogenies and incidence data to generate novel approaches for estimating mosquito population structure and dispersal over epidemiologically relevant timescales. By studying the virome through the lens of spatial and genomic epidemiology, it may be possible to investigate otherwise cryptic aspects of mosquito ecology. A better understanding of mosquito population structure and dynamics is key for understanding mosquito-borne disease ecology, and methods based on ISVs could provide a powerful tool for informing mosquito control programs.
Affiliation(s)
- Brandon D Hollingsworth
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
- Nathan D Grubaugh
  - Yale School of Public Health, New Haven, Connecticut, United States of America
  - Yale University, New Haven, Connecticut, United States of America
- Brian P Lazzaro
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
- Courtney C Murdock
  - Department of Entomology, Cornell University, Ithaca, New York, United States of America
  - Cornell Institute for Host Microbe Interaction and Disease, Cornell University, Ithaca, New York, United States of America
  - Northeast Regional Center for Excellence in Vector-borne Diseases, Cornell University, Ithaca, New York, United States of America
6
Duvvuri VR, Hicks JT, Damodaran L, Grunnill M, Braukmann T, Wu J, Gubbay JB, Patel SN, Bahl J. Comparing the transmission potential from sequence and surveillance data of 2009 North American influenza pandemic waves. Infect Dis Model 2023; 8:240-252. [PMID: 36844759 PMCID: PMC9944206 DOI: 10.1016/j.idm.2023.02.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/02/2022] [Revised: 02/10/2023] [Accepted: 02/15/2023] [Indexed: 02/18/2023]
Abstract
Technological advancements in phylodynamic modeling coupled with the accessibility of real-time pathogen genetic data are increasingly important for understanding infectious disease transmission dynamics. In this study, we compare the transmission potentials of North American influenza A(H1N1)pdm09 derived from sequence data to that derived from surveillance data. The impact of the choice of tree-priors, informative epidemiological priors, and evolutionary parameters on the transmission potential estimation is evaluated. North American Influenza A(H1N1)pdm09 hemagglutinin (HA) gene sequences are analyzed using the coalescent and birth-death tree prior models to estimate the basic reproduction number (R0). Epidemiological priors gathered from published literature are used to simulate the birth-death skyline models. Path-sampling marginal likelihood estimation is conducted to assess model fit. A bibliographic search was conducted to gather surveillance-based R0 values. Sequence-based R0 estimates were consistently lower (mean ≤ 1.2) when estimated by coalescent models than by the birth-death models with informative priors on the duration of infectiousness (mean ≥ 1.3 to ≤2.88 days). The user-defined informative priors for use in the birth-death model shift the directionality of epidemiological and evolutionary parameters compared to non-informative estimates. While there was no certain impact of clock rate and tree height on the R0 estimation, an opposite relationship was observed between coalescent and birth-death tree priors. There was no significant difference (p = 0.46) between the birth-death model and surveillance R0 estimates. This study concludes that tree-prior methodological differences may have a substantial impact on the transmission potential estimation as well as the evolutionary parameters. The study also reports a consensus between the sequence-based R0 estimation and surveillance-based R0 estimates.
Altogether, these outcomes shed light on the potential role of phylodynamic modeling to augment existing surveillance and epidemiological activities to better assess and respond to emerging infectious diseases.
Affiliation(s)
- Venkata R. Duvvuri
  - Public Health Ontario, Toronto, Ontario, Canada
  - Department of Laboratory Medicine and Pathobiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
  - Laboratory for Industrial and Applied Mathematics, Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada
  - Center for the Ecology of Infectious Disease, Department of Infectious Diseases, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Department of Epidemiology and Biostatistics, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Corresponding author: Public Health Ontario, Toronto, Ontario, Canada
- Joseph T. Hicks
  - Center for the Ecology of Infectious Disease, Department of Infectious Diseases, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Department of Epidemiology and Biostatistics, Institute of Bioinformatics, University of Georgia, Athens, Georgia
- Lambodhar Damodaran
  - Center for the Ecology of Infectious Disease, Department of Infectious Diseases, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Department of Epidemiology and Biostatistics, Institute of Bioinformatics, University of Georgia, Athens, Georgia
- Martin Grunnill
  - Public Health Ontario, Toronto, Ontario, Canada
  - Laboratory for Industrial and Applied Mathematics, Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada
- Jianhong Wu
  - Laboratory for Industrial and Applied Mathematics, Department of Mathematics and Statistics, York University, Toronto, Ontario, Canada
- Jonathan B. Gubbay
  - Public Health Ontario, Toronto, Ontario, Canada
  - Department of Laboratory Medicine and Pathobiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Samir N. Patel
  - Public Health Ontario, Toronto, Ontario, Canada
  - Department of Laboratory Medicine and Pathobiology, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada
- Justin Bahl
  - Center for the Ecology of Infectious Disease, Department of Infectious Diseases, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Department of Epidemiology and Biostatistics, Institute of Bioinformatics, University of Georgia, Athens, Georgia
  - Duke-NUS Graduate Medical School, Singapore
  - Corresponding author: Center for the Ecology of Infectious Disease, Department of Infectious Diseases, Institute of Bioinformatics, University of Georgia, Athens, Georgia, USA
7
Plagued by a cryptic clock: insight and issues from the global phylogeny of Yersinia pestis. Commun Biol 2023; 6:23. [PMID: 36658311 PMCID: PMC9852431 DOI: 10.1038/s42003-022-04394-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 12/21/2022] [Indexed: 01/21/2023]
Abstract
Plague has an enigmatic history as a zoonotic pathogen. This infectious disease will unexpectedly appear in human populations and disappear just as suddenly. As a result, a long-standing line of inquiry has been to estimate when and where plague appeared in the past. However, there have been significant disparities between phylogenetic studies of the causative bacterium, Yersinia pestis, regarding the timing and geographic origins of its reemergence. Here, we curate and contextualize an updated phylogeny of Y. pestis using 601 genome sequences sampled globally. Through a detailed Bayesian evaluation of temporal signal in subsets of these data we demonstrate that a Y. pestis-wide molecular clock is unstable. To resolve this, we developed a new approach in which each Y. pestis population was assessed independently, enabling us to recover substantial temporal signal in five populations, including the ancient pandemic lineages which we now estimate may have emerged decades, or even centuries, before a pandemic was historically documented from European sources. Despite this methodological advancement, we only obtain robust divergence dates from populations sampled over a period of at least 90 years, indicating that genetic evidence alone is insufficient for accurately reconstructing the timing and spread of short-term plague epidemics.
8
Suster CJE, Arnott A, Blackwell G, Gall M, Draper J, Martinez E, Drew AP, Rockett RJ, Chen SCA, Kok J, Dwyer DE, Sintchenko V. Guiding the design of SARS-CoV-2 genomic surveillance by estimating the resolution of outbreak detection. Front Public Health 2022; 10:1004201. [PMID: 36276383 PMCID: PMC9581317 DOI: 10.3389/fpubh.2022.1004201] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 07/27/2022] [Accepted: 09/16/2022] [Indexed: 01/27/2023]
Abstract
Genomic surveillance of SARS-CoV-2 has been essential to inform public health response to outbreaks. The high incidence of infection has resulted in a smaller proportion of cases undergoing whole genome sequencing due to finite resources. We present a framework for estimating the impact of reduced depths of genomic surveillance on the resolution of outbreaks, based on a clustering approach using pairwise genetic and temporal distances. We apply the framework to simulated outbreak data to show that outbreaks are detected less frequently when fewer cases are subjected to whole genome sequencing. The impact of sequencing fewer cases depends on the size of the outbreaks, and on the genetic and temporal similarity of the index cases of the outbreaks. We also apply the framework to an outbreak of the SARS-CoV-2 Delta variant in New South Wales, Australia. We find that the detection of clusters in the outbreak would have been delayed if fewer cases had been sequenced. Existing recommendations for genomic surveillance estimate the minimum number of cases to sequence in order to detect and monitor new virus variants, assuming representative sampling of cases. Our method instead measures the resolution of clustering, which is important for genomic epidemiology, and accommodates sampling biases.
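The clustering idea in the abstract above, linking cases that are close in both genetic and temporal distance, can be sketched roughly as follows. This is a hedged illustration and not the authors' framework: the single-linkage rule, the use of Hamming distance on equal-length sequences, the `(sequence, sample_day)` case representation, and the default thresholds `max_snps` and `max_days` are all assumptions chosen for the example.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def cluster_cases(cases, max_snps=2, max_days=14):
    """Single-linkage clustering of cases given as (sequence, sample_day).

    Two cases are linked when they differ by at most `max_snps` sites and
    were sampled at most `max_days` apart (both thresholds are hypothetical).
    Returns a sorted list of clusters, each a sorted list of case indices.
    """
    parent = list(range(len(cases)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cases)), 2):
        (seq_i, day_i), (seq_j, day_j) = cases[i], cases[j]
        if hamming(seq_i, seq_j) <= max_snps and abs(day_i - day_j) <= max_days:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(cases)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


if __name__ == "__main__":
    # Toy data: three related cases, one unrelated case, one late case.
    all_cases = [("AAAA", 0), ("AAAT", 3), ("AATT", 10), ("GGGG", 5), ("AAAA", 100)]
    print(cluster_cases(all_cases))
    # Dropping the intermediate case and tightening the SNP threshold
    # splits what was one cluster into separate singletons.
    print(cluster_cases([all_cases[0], all_cases[2]], max_snps=1))
```

The second call hints at why sequencing fewer cases can reduce resolution under this kind of rule: when an intermediate case is not sequenced, the remaining cases may no longer be linked.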
Affiliation(s)
- Carl J. E. Suster
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
- Alicia Arnott
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Grace Blackwell
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Mailie Gall
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Jenny Draper
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Elena Martinez
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Alexander P. Drew
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Rebecca J. Rockett
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
- Sharon C.-A. Chen
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Jen Kok
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Dominic E. Dwyer
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
- Vitali Sintchenko
  - Centre for Infectious Diseases and Microbiology Public Health, Westmead Hospital, Westmead, NSW, Australia
  - Sydney Institute for Infectious Diseases, The University of Sydney, Westmead, NSW, Australia
  - Centre for Infectious Diseases and Microbiology Laboratory Services, Institute of Clinical Pathology and Medical Research, NSW Health Pathology, Westmead, NSW, Australia
9
Porter AF, Sherry N, Andersson P, Johnson SA, Duchene S, Howden BP. New rules for genomics-informed COVID-19 responses: lessons learned from the first waves of the Omicron variant in Australia. PLoS Genet 2022; 18:e1010415. [PMID: 36227810 PMCID: PMC9560517 DOI: 10.1371/journal.pgen.1010415] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/06/2022]
Affiliation(s)
- Ashleigh F. Porter
- Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Norelle Sherry
- Microbiological Diagnostic Unit Public Health Laboratory, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Patiyan Andersson
- Microbiological Diagnostic Unit Public Health Laboratory, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Sandra A. Johnson
- Microbiological Diagnostic Unit Public Health Laboratory, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Sebastian Duchene
- Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
| | - Benjamin P. Howden
- Department of Microbiology and Immunology, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Microbiological Diagnostic Unit Public Health Laboratory, The University of Melbourne at The Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
|
10
|
Attwood SW, Hill SC, Aanensen DM, Connor TR, Pybus OG. Phylogenetic and phylodynamic approaches to understanding and combating the early SARS-CoV-2 pandemic. Nat Rev Genet 2022; 23:547-562. [PMID: 35459859 PMCID: PMC9028907 DOI: 10.1038/s41576-022-00483-8] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Accepted: 03/23/2022] [Indexed: 01/05/2023]
Abstract
Determining the transmissibility, prevalence and patterns of movement of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infections is central to our understanding of the impact of the pandemic and to the design of effective control strategies. Phylogenies (evolutionary trees) have provided key insights into the international spread of SARS-CoV-2 and enabled investigation of individual outbreaks and transmission chains in specific settings. Phylodynamic approaches combine evolutionary, demographic and epidemiological concepts and have helped track virus genetic changes, identify emerging variants and inform public health strategy. Here, we review and synthesize studies that illustrate how phylogenetic and phylodynamic techniques were applied during the first year of the pandemic, and summarize their contributions to our understanding of SARS-CoV-2 transmission and control.
Affiliation(s)
- Stephen W Attwood
- Department of Zoology, University of Oxford, Oxford, UK
- Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
| | - Sarah C Hill
- Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, London, UK
| | - David M Aanensen
- Centre for Genomic Pathogen Surveillance, Wellcome Genome Campus, Hinxton, UK
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, UK
| | - Thomas R Connor
- Pathogen Genomics Unit, Public Health Wales NHS Trust, Cardiff, UK
- School of Biosciences, Cardiff University, Cardiff, UK
| | - Oliver G Pybus
- Department of Zoology, University of Oxford, Oxford, UK
- Department of Pathobiology and Population Sciences, Royal Veterinary College, University of London, London, UK
|
11
|
Phylogeographic analysis of the Bantu language expansion supports a rainforest route. Proc Natl Acad Sci U S A 2022; 119:e2112853119. [PMID: 35914165 PMCID: PMC9372543 DOI: 10.1073/pnas.2112853119] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022] Open
Abstract
Southern Africa has been shaped by the large-scale expansion of Bantu populations fueled by agriculture: Currently, 240 million people speak one of the more than 500 Bantu languages. However, the timing and geographic routes undergone by the Bantu populations remain largely unknown. We use cutting-edge phylogeographic techniques to show that Bantu populations migrated through the Central African tropical rainforest around 4,400 y ago. This adds to the growing evidence that agricultural expansions can successfully overcome ecological challenges as they unfold. The Bantu expansion transformed the linguistic, economic, and cultural composition of sub-Saharan Africa. However, the exact dates and routes taken by the ancestors of the speakers of the more than 500 current Bantu languages remain uncertain. Here, we use the recently developed “break-away” geographical diffusion model, specially designed for modeling migrations, with “augmented” geographic information, to reconstruct the Bantu language family expansion. This Bayesian phylogeographic approach with augmented geographical data provides a powerful way of linking linguistic, archaeological, and genetic data to test hypotheses about large language family expansions. We compare four hypotheses: an early major split north of the rainforest; a migration through the Sangha River Interval corridor around 2,500 BP; a coastal migration around 4,000 BP; and a migration through the rainforest before the corridor opening, at 4,000 BP. Our results produce a topology and timeline for the Bantu language family, which supports the hypothesis of an expansion through Central African tropical forests at 4,420 BP (4,040 to 5,000 95% highest posterior density interval), well before the Sangha River Interval was open.
|
12
|
Featherstone LA, Zhang JM, Vaughan TG, Duchene S. Epidemiological inference from pathogen genomes: A review of phylodynamic models and applications. Virus Evol 2022; 8:veac045. [PMID: 35775026 PMCID: PMC9241095 DOI: 10.1093/ve/veac045] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 12/15/2021] [Revised: 05/23/2022] [Accepted: 06/02/2022] [Indexed: 11/24/2022] Open
Abstract
Phylodynamics requires an interdisciplinary understanding of phylogenetics, epidemiology, and statistical inference. It has also experienced more intense application than ever before amid the SARS-CoV-2 pandemic. In light of this, we present a review of phylodynamic models beginning with foundational models and assumptions. Our target audience is public health researchers, epidemiologists, and biologists seeking a working knowledge of the links between epidemiology, evolutionary models, and resulting epidemiological inference. We discuss the assumptions linking evolutionary models of pathogen population size to epidemiological models of the infected population size. We then describe statistical inference for phylodynamic models and list how output parameters can be rearranged for epidemiological interpretation. We go on to cover more sophisticated models and finish by highlighting future directions.
Affiliation(s)
- Leo A Featherstone
- Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3000, Australia
| | - Joshua M Zhang
- Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3000, Australia
| | - Timothy G Vaughan
- Department of Biosystems Science and Engineering, ETH Zurich, Basel 4058, Switzerland
- Swiss Institute of Bioinformatics, Geneva 1015, Switzerland
| | - Sebastian Duchene
- Peter Doherty Institute for Infection and Immunity, University of Melbourne, Melbourne, VIC 3000, Australia
|
13
|
Andréoletti J, Zwaans A, Warnock RCM, Aguirre-Fernández G, Barido-Sottani J, Gupta A, Stadler T, Manceau M. The Occurrence Birth-Death Process for combined-evidence analysis in macroevolution and epidemiology. Syst Biol 2022; 71:1440-1452. [PMID: 35608305 PMCID: PMC9558841 DOI: 10.1093/sysbio/syac037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2021] [Revised: 05/02/2022] [Accepted: 05/06/2022] [Indexed: 11/28/2022] Open
Abstract
Phylodynamic models generally aim at jointly inferring phylogenetic relationships, model parameters, and more recently, the number of lineages through time, based on molecular sequence data. In the fields of epidemiology and macroevolution, these models can be used to estimate, respectively, the past number of infected individuals (prevalence) or the past number of species (paleodiversity) through time. Recent years have seen the development of “total-evidence” analyses, which combine molecular and morphological data from extant and past sampled individuals in a unified Bayesian inference framework. Even sampled individuals characterized only by their sampling time, that is, lacking morphological and molecular data, which we call occurrences, provide invaluable information to estimate the past number of lineages. Here, we present new methodological developments around the fossilized birth–death process enabling us to (i) incorporate occurrence data in the likelihood function; (ii) consider piecewise-constant birth, death, and sampling rates; and (iii) estimate the past number of lineages, with or without knowledge of the underlying tree. We implement our method in the RevBayes software environment, enabling its use along with a large set of models of molecular and morphological evolution, and validate the inference workflow using simulations under a wide range of conditions. We finally illustrate our new implementation using two empirical data sets stemming from the fields of epidemiology and macroevolution. In epidemiology, we infer the prevalence of the coronavirus disease 2019 outbreak on the Diamond Princess ship, by taking into account jointly the case count record (occurrences) along with viral sequences for a fraction of infected individuals. In macroevolution, we infer the diversity trajectory of cetaceans using molecular and morphological data from extant taxa, morphological data from fossils, as well as numerous fossil occurrences. 
The joint modeling of occurrences and trees holds the promise to further bridge the gap between traditional epidemiology and pathogen genomics, as well as paleontology and molecular phylogenetics. [Birth–death model; epidemiology; fossils; macroevolution; occurrences; phylogenetics; skyline.]
Affiliation(s)
- Jérémy Andréoletti
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Antoine Zwaans
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Rachel C M Warnock
- GeoZentrum Nordbayern, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
| | | | - Joëlle Barido-Sottani
- Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, USA
| | - Ankit Gupta
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
| | - Marc Manceau
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
|
14
|
Cappello L, Kim J, Liu S, Palacios JA. Statistical Challenges in Tracking the Evolution of SARS-CoV-2. Stat Sci 2022; 37:162-182. [PMID: 36034090 PMCID: PMC9409356 DOI: 10.1214/22-sts853] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]
Abstract
Genomic surveillance of SARS-CoV-2 has been instrumental in tracking the spread and evolution of the virus during the pandemic. The availability of SARS-CoV-2 molecular sequences isolated from infected individuals, coupled with phylodynamic methods, has provided insights into the origin of the virus, its evolutionary rate, the timing of introductions, the patterns of transmission, and the rise of novel variants that have spread through populations. Despite enormous global efforts of governments, laboratories, and researchers to collect and sequence molecular data, many challenges remain in analyzing and interpreting the data collected. Here, we describe the models and methods currently used to monitor the spread of SARS-CoV-2, discuss long-standing and new statistical challenges, and propose a method for tracking the rise of novel variants during the epidemic.
Affiliation(s)
- Lorenzo Cappello
- Departments of Economics and Business, Universitat Pompeu Fabra, 08005, Spain
| | - Jaehee Kim
- Department of Computational Biology, Cornell University, Ithaca, New York 14853, USA
| | - Sifan Liu
- Department of Statistics, Stanford University, Stanford, California 94305, USA
| | - Julia A Palacios
- Departments of Statistics and Biomedical Data Sciences, Stanford University, Stanford, California 94305, USA
|
15
|
Progress and challenges in virus genomic epidemiology. Trends Parasitol 2021; 37:1038-1049. [PMID: 34620561 DOI: 10.1016/j.pt.2021.08.007] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/21/2021] [Revised: 08/24/2021] [Accepted: 08/26/2021] [Indexed: 12/18/2022]
Abstract
Genomic epidemiology, which links pathogen genomes with associated metadata to understand disease transmission, has become a key component of outbreak response. Decreasing costs of genome sequencing and increasing computational power provide opportunities to generate and analyse large viral genomic datasets that aim to uncover the spatial scales of transmission, the demographics contributing to transmission patterns, and to forecast epidemic trends. Emerging sources of genomic data and associated metadata provide new opportunities to further unravel transmission patterns. Key challenges include how to integrate genomic data with metadata from multiple sources, how to generate efficient computational algorithms to cope with large datasets, and how to establish sampling frameworks to enable robust conclusions.
|