1
Susvitasari K, Tupper PF, Cancino-Muños I, Lòpez MG, Comas I, Colijn C. Epidemiological cluster identification using multiple data sources: an approach using logistic regression. Microb Genom 2023; 9. [PMID: 36867086 PMCID: PMC10132077 DOI: 10.1099/mgen.0.000929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/04/2023]
Abstract
In the management of infectious disease outbreaks, grouping cases into clusters and understanding their underlying epidemiology are fundamental tasks. In genomic epidemiology, clusters are typically identified either using pathogen sequences alone or with sequences in combination with epidemiological data such as location and time of collection. However, it may not be feasible to culture and sequence all pathogen isolates, so sequence data may not be available for all cases. This presents challenges for identifying clusters and understanding epidemiology, because these cases may be important for transmission. Demographic, clinical and location data are likely to be available for unsequenced cases, and comprise partial information about their clustering. Here, we use statistical modelling to assign unsequenced cases to clusters already identified by genomic methods, assuming that a more direct method of linking individuals, such as contact tracing, is not available. We build our model on pairwise similarity between cases to predict whether cases cluster together, in contrast to using individual case data to predict the cases' clusters. We then develop methods that allow us to determine whether a pair of unsequenced cases are likely to cluster together, to group them into their most probable clusters, to identify which are most likely to be members of a specific (known) cluster, and to estimate the true size of a known cluster given a set of unsequenced cases. We apply our method to tuberculosis data from Valencia, Spain. Among other applications, we find that clustering can be predicted successfully using spatial distance between cases and whether nationality is the same. We can identify the correct cluster for an unsequenced case, among 38 possible clusters, with an accuracy of approximately 35 %, higher than both direct multinomial regression (17 %) and random selection (< 5 %).
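The pairwise idea in this abstract can be illustrated with a minimal sketch. It assumes a logistic model on two pairwise features named in the abstract (spatial distance and whether nationality is the same) and assigns an unsequenced case to the known cluster with the highest mean pairwise probability. The coefficient values, the field names `xy` and `nat`, and the mean-probability assignment rule are hypothetical choices for illustration only, not the authors' fitted model or procedure.

```python
import math

def pair_probability(dist_km, same_nationality, coef):
    """Logistic probability that two cases belong to the same cluster."""
    z = (coef["intercept"]
         + coef["dist_km"] * dist_km
         + coef["same_nat"] * (1.0 if same_nationality else 0.0))
    return 1.0 / (1.0 + math.exp(-z))

def most_probable_cluster(case, clusters, coef):
    """Score each known cluster by the mean pairwise probability (assumed rule)."""
    scores = {}
    for name, members in clusters.items():
        probs = [
            pair_probability(math.dist(case["xy"], m["xy"]),
                             case["nat"] == m["nat"], coef)
            for m in members
        ]
        scores[name] = sum(probs) / len(probs)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical coefficients: clustering is less likely with distance,
# more likely when nationality matches.
coef = {"intercept": -1.0, "dist_km": -0.3, "same_nat": 1.5}

# Toy sequenced cases grouped into two known clusters.
clusters = {
    "A": [{"xy": (1.0, 0.0), "nat": "X"}, {"xy": (0.0, 2.0), "nat": "X"}],
    "B": [{"xy": (10.0, 10.0), "nat": "Y"}],
}
unsequenced = {"xy": (0.0, 0.0), "nat": "X"}
best, scores = most_probable_cluster(unsequenced, clusters, coef)
```

Working on pairs rather than individuals means the model never needs one parameter set per cluster, which is one plausible reason a pairwise approach can cope with many small clusters.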
Affiliation(s)
- Paul F Tupper
  - Department of Mathematics, Simon Fraser University, Burnaby, Canada
- Irving Cancino-Muños
  - I2SysBio, University of Valencia-CSIC, Valencia, Spain
  - FISABIO Public Health, Valencia, Spain
- Mariana G Lòpez
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia (IBV-CSIC), Valencia, Spain
- Iñaki Comas
  - Tuberculosis Genomics Unit, Instituto de Biomedicina de Valencia (IBV-CSIC), Valencia, Spain
  - Ciber en Epidemiología y Salud Pública (CIBERESP), Madrid, Spain
- Caroline Colijn
  - Department of Mathematics, Simon Fraser University, Burnaby, Canada
2
Liu M, Chato C, Poon AFY. From components to communities: bringing network science to clustering for molecular epidemiology. Virus Evol 2023; 9:vead026. [PMID: 37187604 PMCID: PMC10175948 DOI: 10.1093/ve/vead026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 01/30/2023] [Accepted: 04/17/2023] [Indexed: 05/17/2023]
Abstract
Defining clusters of epidemiologically related infections is a common problem in the surveillance of infectious disease. A popular method for generating clusters is pairwise distance clustering, which assigns pairs of sequences to the same cluster if their genetic distance falls below some threshold. The result is often represented as a network or graph of nodes. A connected component is a set of interconnected nodes in a graph that are not connected to any other node. The prevailing approach to pairwise clustering is to map clusters to the connected components of the graph on a one-to-one basis. We propose that this definition of clusters is unnecessarily rigid. For instance, the connected components can collapse into one cluster by the addition of a single sequence that bridges nodes in the respective components. Moreover, the distance thresholds typically used for viruses like HIV-1 tend to exclude a large proportion of new sequences, making it difficult to train models for predicting cluster growth. These issues may be resolved by revisiting how we define clusters from genetic distances. Community detection is a promising class of clustering methods from the field of network science. A community is a set of nodes that are more densely inter-connected relative to the number of their connections to external nodes. Thus, a connected component may be partitioned into two or more communities. Here we describe community detection methods in the context of genetic clustering for epidemiology, demonstrate how a popular method (Markov clustering) enables us to resolve variation in transmission rates within a giant connected component of HIV-1 sequences, and identify current challenges and directions for further work.
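As a rough illustration of the prevailing approach described above, the following sketch maps clusters to connected components of a thresholded distance graph using a union-find structure. The function name, the toy distances, and the 0.015 threshold are assumptions made for this example and are not taken from the paper.

```python
def threshold_clusters(distances, n, threshold):
    """Connected components of the graph whose edges join pairs of
    sequences with genetic distance below the threshold."""
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), d in distances.items():
        if d < threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[rj] = ri

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy pairwise distances between five sequences (hypothetical values).
toy = {(0, 1): 0.010, (1, 2): 0.012, (0, 2): 0.030,
       (2, 3): 0.050, (3, 4): 0.008}
clusters = threshold_clusters(toy, 5, 0.015)

# Adding one new sequence (index 5) close to both groups bridges them.
bridged = dict(toy)
bridged[(2, 5)] = 0.010
bridged[(5, 3)] = 0.010
merged = threshold_clusters(bridged, 6, 0.015)
```

In this toy case `clusters` holds two components, while `merged` holds a single one: the extra sequence alone is enough to collapse them, which is the rigidity the abstract points to. Community detection methods such as Markov clustering would instead partition within a component and are not shown here.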
Affiliation(s)
- Molly Liu
  - Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044, London, ON N6A 5C1, Canada
- Connor Chato
  - Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044, London, ON N6A 5C1, Canada
3
Characterization of HIV-1 Transmission Clusters Inferred from the Brazilian Nationwide Genotyping Service Database. Viruses 2022; 14:v14122768. [PMID: 36560771 PMCID: PMC9783618 DOI: 10.3390/v14122768] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Revised: 11/23/2022] [Accepted: 11/28/2022] [Indexed: 12/14/2022]
Abstract
The study of HIV-1 transmission networks inferred from viral genetic data can be used to clarify important factors in the dynamics of HIV-1 transmission, such as network growth rate and demographic composition. In Brazil, HIV transmission has been stable since the early 2000s, and the study of transmission clusters can provide valuable data for understanding the drivers of virus spread. In this work, we analyzed a nationwide database of approximately 53,000 HIV-1 nucleotide pol sequences sampled from genotyped patients from 2008 to 2017. Phylogenetic trees were reconstructed for the HIV-1 subtypes B, C and F1 in Brazil, and transmission clusters were inferred by applying genetic distance thresholds of 1.5%, 3.0% and 4.5%, as well as high (>0.9) cluster statistical support. An odds ratio test revealed that young men (15-24 years) and individuals with more years of education had higher odds of clustering. The assortativity coefficient revealed that individuals with similar demographic features tended to cluster together, particularly for features such as place of residence and age. We also observed that assortativity weakens as the genetic distance threshold increases. Our results indicate that the phylogenetic clusters identified here are likely representative of the contact networks that shape HIV transmission, and that this approach is a valuable tool even in settings with low sampling density, such as Brazil.
4
Optimized phylogenetic clustering of HIV-1 sequence data for public health applications. PLoS Comput Biol 2022; 18:e1010745. [PMID: 36449514 PMCID: PMC9744331 DOI: 10.1371/journal.pcbi.1010745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 12/12/2022] [Accepted: 11/17/2022] [Indexed: 12/02/2022]
Abstract
Clusters of genetically similar infections suggest rapid transmission and may indicate priorities for public health action or reveal underlying epidemiological processes. However, clusters often require user-defined thresholds and are sensitive to non-epidemiological factors, such as non-random sampling. Consequently, the ideal threshold for public health applications varies substantially across settings. Here, we present a method that selects optimal thresholds for phylogenetic (subset tree) clustering based on population. We evaluated this method on HIV-1 pol datasets (n = 14,221 sequences) from four sites in the USA (Tennessee, Washington), Canada (Northern Alberta) and China (Beijing). Clusters were defined by tips descending from an ancestral node (with a minimum bootstrap support of 95%) through a series of branches, each with a length below a given threshold. Next, we used pplacer to graft new cases to the fixed tree by maximum likelihood. We evaluated the effect of varying branch-length thresholds on cluster growth as a count outcome by fitting two Poisson regression models: a null model that predicts growth from cluster size, and an alternative model that includes mean collection date as an additional covariate. The alternative model was favoured by AIC across most thresholds, with optimal (greatest difference in AIC) thresholds ranging from 0.007 to 0.013 across sites. The range of optimal thresholds was more variable when re-sampling 80% of the data by location (IQR 0.008-0.016, n = 100 replicates). Our results use prospective phylogenetic cluster growth and suggest that there is more variation in effective thresholds for public health than those typically used in clustering studies.
5
Illingworth CJR, Hamilton WL, Jackson C, Warne B, Popay A, Meredith L, Hosmillo M, Jahun A, Fieldman T, Routledge M, Houldcroft CJ, Caller L, Caddy S, Yakovleva A, Hall G, Khokhar FA, Feltwell T, Pinckert ML, Georgana I, Chaudhry Y, Curran M, Parmar S, Sparkes D, Rivett L, Jones NK, Sridhar S, Forrest S, Dymond T, Grainger K, Workman C, Gkrania-Klotsas E, Brown NM, Weekes MP, Baker S, Peacock SJ, Gouliouris T, Goodfellow I, De Angelis D, Török ME. A2B-COVID: A Tool for Rapidly Evaluating Potential SARS-CoV-2 Transmission Events. Mol Biol Evol 2022; 39:msac025. [PMID: 35106603 PMCID: PMC8892943 DOI: 10.1093/molbev/msac025] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/19/2023]
Abstract
Identifying linked cases of infection is a critical component of the public health response to viral infectious diseases. In a clinical context, there is a need to make rapid assessments of whether cases of infection have arrived independently onto a ward, or are potentially linked via direct transmission. Viral genome sequence data are of great value in making these assessments, but are often not the only form of data available. Here, we describe A2B-COVID, a method for the rapid identification of potentially linked cases of COVID-19 infection designed for clinical settings. Our method combines knowledge about infection dynamics, data describing the movements of individuals, and evolutionary analysis of genome sequences to assess whether data collected from cases of infection are consistent or inconsistent with linkage via direct transmission. A retrospective analysis of data from two wards at Cambridge University Hospitals NHS Foundation Trust during the first wave of the pandemic showed qualitatively different patterns of linkage between cases on designated COVID-19 and non-COVID-19 wards. The subsequent real-time application of our method to data from the second epidemic wave highlights its value for monitoring cases of infection in a clinical context.
Affiliation(s)
- Christopher J R Illingworth
  - MRC-University of Glasgow Centre for Virus Research, Glasgow, United Kingdom
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
  - Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, United Kingdom
  - Institut für Biologische Physik, Universität zu Köln, Köln, Germany
- William L Hamilton
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Ben Warne
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Ashley Popay
  - Public Health England Field Epidemiology Unit, Cambridge Institute of Public Health, Cambridge, United Kingdom
- Luke Meredith
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Myra Hosmillo
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Aminu Jahun
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Tom Fieldman
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Matthew Routledge
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Sarah Caddy
  - Cambridge Institute for Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, United Kingdom
- Anna Yakovleva
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Grant Hall
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Fahad A Khokhar
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Theresa Feltwell
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
- Malte L Pinckert
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Iliana Georgana
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Yasmin Chaudhry
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Martin Curran
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Surendra Parmar
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Dominic Sparkes
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Lucy Rivett
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Nick K Jones
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Sushmita Sridhar
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge Institute for Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, United Kingdom
  - Wellcome Sanger Institute, Hinxton, United Kingdom
- Tom Dymond
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Kayleigh Grainger
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Chris Workman
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Effrossyni Gkrania-Klotsas
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - MRC Epidemiology Unit, University of Cambridge, Level 3 Institute of Metabolic Science, Cambridge, United Kingdom
  - School of Clinical Medicine, University of Cambridge, Cambridge, United Kingdom
- Nicholas M Brown
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Michael P Weekes
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge Institute for Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, United Kingdom
- Stephen Baker
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge Institute for Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge, United Kingdom
- Sharon J Peacock
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Wellcome Sanger Institute, Hinxton, United Kingdom
- Theodore Gouliouris
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Clinical Microbiology and Public Health Laboratory, Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
- Ian Goodfellow
  - Department of Pathology, Division of Virology, University of Cambridge, Cambridge, United Kingdom
- Daniela De Angelis
  - MRC Biostatistics Unit, University of Cambridge, Cambridge, United Kingdom
  - Public Health England, National Infection Service, London, United Kingdom
- M Estée Török
  - Department of Medicine, University of Cambridge, Cambridge, United Kingdom
  - Cambridge University Hospitals NHS Foundation Trust, Cambridge, United Kingdom
6
Cuadros DF, de Oliveira T, Gräf T, Junqueira DM, Wilkinson E, Lemey P, Bärnighausen T, Kim HY, Tanser F. The role of high-risk geographies in the perpetuation of the HIV epidemic in rural South Africa: A spatial molecular epidemiology study. PLOS Global Public Health 2022; 2:e0000105. [PMID: 36962341 PMCID: PMC10021703 DOI: 10.1371/journal.pgph.0000105] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/07/2021] [Accepted: 11/15/2021] [Indexed: 11/18/2022]
Abstract
In this study, we hypothesize that HIV geographical clusters (geospatial areas with significantly higher numbers of HIV-positive individuals) can behave as the highly connected nodes in the transmission network. Using data from one of the most comprehensive demographic surveillance systems in Africa, we found that more than 70% of the HIV transmission links identified were directly connected to an HIV geographical cluster located in a peri-urban area. Moreover, we identified a single large central community of highly connected nodes located within the HIV cluster. This module was composed of nodes that were highly connected to one another, forming a central structure of the network that was also connected to the smaller, sparser modules located outside the HIV geographical cluster. Our study supports the evidence of a high level of connectivity between HIV geographical high-risk populations and the entire community.
Affiliation(s)
- Diego F. Cuadros
  - Department of Geography and Geographic Information Science, University of Cincinnati, Cincinnati, OH, United States of America
  - Health Geography and Disease Modeling Laboratory, University of Cincinnati, Cincinnati, OH, United States of America
- Tulio de Oliveira
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - School of Laboratory Medicine and Medical Science, Department of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Tiago Gräf
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - Fundação Oswaldo Cruz (FIOCRUZ), Instituto Gonçalo Moniz, Salvador, Brazil
- Dennis M. Junqueira
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - School of Laboratory Medicine and Medical Science, Department of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Eduan Wilkinson
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
  - School of Laboratory Medicine and Medical Science, Department of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Philippe Lemey
  - Department of Microbiology and Immunology, Rega Institute for Medical Research, University of Leuven, Leuven, Belgium
- Till Bärnighausen
  - Africa Health Research Institute, University of KwaZulu-Natal, Durban, South Africa
  - Heidelberg Institute for Public Health, University of Heidelberg, Heidelberg, Germany
  - Department of Global Health and Population, Harvard T.H. Chan School of Public Health, Boston, MA, United States of America
- Hae-Young Kim
  - Africa Health Research Institute, University of KwaZulu-Natal, Durban, South Africa
  - Department of Population Health, New York University Grossman School of Medicine, New York, NY, United States of America
- Frank Tanser
  - Africa Health Research Institute, University of KwaZulu-Natal, Durban, South Africa
  - School of Nursing and Public Health, University of KwaZulu-Natal, Durban, South Africa
  - Lincoln International Institute for Rural Health, University of Lincoln, Lincoln, United Kingdom
  - Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
7
Helekal D, Ledda A, Volz E, Wyllie D, Didelot X. Bayesian inference of clonal expansions in a dated phylogeny. Syst Biol 2021; 71:1073-1087. [PMID: 34893904 PMCID: PMC9366454 DOI: 10.1093/sysbio/syab095] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/01/2021] [Revised: 11/23/2021] [Accepted: 11/29/2021] [Indexed: 11/16/2022]
Abstract
Microbial population genetics models often assume that all lineages are constrained by the same population size dynamics over time. However, many neutral and selective events can invalidate this assumption and can contribute to the clonal expansion of a specific lineage relative to the rest of the population. Such differential phylodynamic properties between lineages result in asymmetries and imbalances in phylogenetic trees that are sometimes described informally but which are difficult to analyze formally. To this end, we developed a model of how clonal expansions occur and affect the branching patterns of a phylogeny. We show how the parameters of this model can be inferred from a given dated phylogeny using Bayesian statistics, which allows us to assess the probability that one or more clonal expansion events occurred. For each putative clonal expansion event, we estimate its date of emergence and subsequent phylodynamic trajectory, including its long-term evolutionary potential which is important to determine how much effort should be placed on specific control measures. We demonstrate the applicability of our methodology on simulated and real data sets. Inference under our clonal expansion model can reveal important features in the evolution and epidemiology of infectious disease pathogens. [Clonal expansion; genomic epidemiology; microbial population genomics; phylodynamics.]
Affiliation(s)
- David Helekal
  - Centre for Doctoral Training in Mathematics for Real-World Systems, University of Warwick, United Kingdom
- Alice Ledda
  - Healthcare Associated Infections and Antimicrobial Resistance Division, National Infection Service, Public Health England, United Kingdom
- Erik Volz
  - Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, United Kingdom
- David Wyllie
  - Field Service, East of England, National Infection Service, Public Health England, Cambridge, United Kingdom
- Xavier Didelot
  - School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
8
McLaughlin A, Sereda P, Brumme CJ, Brumme ZL, Barrios R, Montaner JSG, Joy JB. Concordance of HIV transmission risk factors elucidated using viral diversification rate and phylogenetic clustering. Evolution, Medicine, and Public Health 2021; 9:338-348. [PMID: 34754454 PMCID: PMC8573190 DOI: 10.1093/emph/eoab028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/24/2020] [Accepted: 09/14/2021] [Indexed: 11/22/2022]
Abstract
Background and objectives: Although HIV sequence clustering is routinely used to identify subpopulations experiencing elevated transmission, it over-simplifies transmission dynamics and is sensitive to methodology. Complementarily, viral diversification rates can be used to approximate historical transmission rates. Here, we investigated the concordance and sensitivity of HIV transmission risk factors identified by phylogenetic clustering, viral diversification rate, changes in viral diversification rate and a combined approach. Methodology: Viral sequences from 9848 people living with HIV in British Columbia, Canada, sampled between 1996 and February 2019, were used to infer phylogenetic trees, from which clusters were identified and viral diversification rates of each tip were calculated. Factors associated with heightened transmission risk were compared across models of cluster membership, viral diversification rate, changes in diversification rate, and viral diversification rate among clusters. Results: Viruses within larger clusters had higher diversification rates and lower changes in diversification rate than those within smaller clusters; however, rates within individual clusters, independent of size, varied widely. Risk factors for both cluster membership and elevated viral diversification rate included being male, young, a resident of health authority E, previous injection drug use, previous hepatitis C virus infection or a high recent viral load. In a sensitivity analysis, models based on cluster membership had wider confidence intervals and lower concordance of significant effects than viral diversification rate for lower sampling rates. Conclusions and implications: Viral diversification rate complements phylogenetic clustering, offering a means of evaluating transmission dynamics to guide provision of treatment and prevention services.
Lay Summary: Understanding HIV transmission dynamics within clusters can help prioritize public health resource allocation. We compared socio-demographic and clinical risk factors associated with phylogenetic cluster membership and viral diversification rate, a historical branching rate, in order to assess their relative concordance and sampling sensitivity.
Affiliation(s)
- Angela McLaughlin
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Department of Bioinformatics, University of British Columbia, Genome Sciences Centre, British Columbia Cancer Agency, 100-570 West 7th Avenue, Vancouver, BC V5Z 4S6, Canada
- Paul Sereda
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
- Chanson J Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
- Zabrina L Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Faculty of Health Sciences, Simon Fraser University, Blusson Hall, Room 11300, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Rolando Barrios
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
- Julio S G Montaner
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
- Jeffrey B Joy
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul's Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Department of Bioinformatics, University of British Columbia, Genome Sciences Centre, British Columbia Cancer Agency, 100-570 West 7th Avenue, Vancouver, BC V5Z 4S6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
9
Volz EM, Carsten W, Grad YH, Frost SDW, Dennis AM, Didelot X. Identification of Hidden Population Structure in Time-Scaled Phylogenies. Syst Biol 2021; 69:884-896. [PMID: 32049340 PMCID: PMC8559910 DOI: 10.1093/sysbio/syaa009] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 08/01/2019] [Revised: 01/09/2020] [Accepted: 01/23/2020] [Indexed: 11/13/2022]
Abstract
Population structure influences genealogical patterns; however, data pertaining to how populations are structured are often unavailable or not directly observable. Inference of population structure is highly important in molecular epidemiology, where pathogen phylogenetics is increasingly used to infer transmission patterns and detect outbreaks. Discrepancies between observed and idealized genealogies, such as those generated by the coalescent process, can be quantified, and where significant differences occur, may reveal the action of natural selection, host population structure, or other demographic and epidemiological heterogeneities. We have developed a fast non-parametric statistical test for detection of cryptic population structure in time-scaled phylogenetic trees. The test is based on contrasting estimated phylogenies with the theoretically expected phylodynamic ordering of common ancestors in two clades within a coalescent framework. These statistical tests have also motivated the development of algorithms which can be used to quickly screen a phylogenetic tree for clades which are likely to share a distinct demographic or epidemiological history. Epidemiological applications include identification of outbreaks in vulnerable host populations or rapid expansion of genotypes with a fitness advantage. To demonstrate the utility of these methods for outbreak detection, we applied the new methods to large phylogenies reconstructed from thousands of HIV-1 partial pol sequences. This revealed the presence of clades which had grown rapidly in the recent past and were significantly concentrated in young men, suggesting recent and rapid transmission in that group. Furthermore, to demonstrate the utility of these methods for the study of antimicrobial resistance, we applied the new methods to a large phylogeny reconstructed from whole genome Neisseria gonorrhoeae sequences. We find that population structure detected using these methods closely overlaps with the appearance and expansion of mutations conferring antimicrobial resistance. [Antimicrobial resistance; coalescent; HIV; population structure.]
Affiliation(s)
- Erik M Volz
  - Department of Infectious Disease Epidemiology and MRC Centre for Global Infectious Disease Analysis, Imperial College London, Norfolk Place, W2 1PG London, UK
- Carsten Wiuf
  - Department of Mathematical Sciences, University of Copenhagen, Universitetsparken 5, DK-2100 Copenhagen, Denmark
- Yonatan H Grad
  - Department of Immunology and Infectious Diseases, TH Chan School of Public Health, Harvard University, 677 Huntington Ave, Boston, MA 02115, USA
- Simon D W Frost
  - Department of Veterinary Medicine, University of Cambridge, Madingley Rd, Cambridge CB3 0ES, UK
  - The Alan Turing Institute, 96 Euston Rd, London NW1 2DB, UK
- Ann M Dennis
  - Department of Medicine, University of North Carolina Chapel Hill, 321 S Columbia St, Chapel Hill, NC 27516, USA
- Xavier Didelot
  - School of Life Sciences and Department of Statistics, University of Warwick, Coventry, CV4 7AL, UK
|
10
|
Zhang Y, Leitner T, Albert J, Britton T. Inferring transmission heterogeneity using virus genealogies: Estimation and targeted prevention. PLoS Comput Biol 2020; 16:e1008122. [PMID: 32881984 PMCID: PMC7494101 DOI: 10.1371/journal.pcbi.1008122] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/25/2019] [Revised: 09/16/2020] [Accepted: 07/02/2020] [Indexed: 12/19/2022]
Abstract
Spread of HIV typically involves uneven transmission patterns where some individuals spread to a large number of individuals while others to only a few or none. Such transmission heterogeneity can impact how fast and how much an epidemic spreads. Further, more efficient interventions may be achieved by taking such transmission heterogeneity into account. To address these issues, we developed two phylogenetic methods based on virus sequence data: 1) to generally detect if significant transmission heterogeneity is present, and 2) to pinpoint where in a phylogeny high-level spread is occurring. We derive inference procedures to estimate model parameters, including the amount of transmission heterogeneity, in a sampled epidemic. We show that it is possible to detect transmission heterogeneity under a wide range of simulated situations, including incomplete sampling, varying levels of heterogeneity, and including within-host genetic diversity. When evaluating real HIV-1 data from different epidemic scenarios, we found a lower level of transmission heterogeneity in slowly spreading situations and a higher level of heterogeneity in data that included a rapid outbreak, while R0 and Sackin's index (overall tree shape statistic) were similar in the two scenarios, suggesting that our new method is able to detect transmission heterogeneity in real data. We then show by simulations that targeted prevention, where we pinpoint high-level spread using a coalescence measurement, is efficient when sequence data are collected in an ongoing surveillance system. Such phylogeny-guided prevention is efficient under both single-step contact tracing as well as iterative contact tracing as compared to random intervention.
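The abstract refers to Sackin's index as an overall tree shape statistic. As a minimal sketch, assuming the common definition (the sum, over all leaves, of the number of edges between the leaf and the root) and a hypothetical nested-tuple tree encoding that is not taken from the paper, the index can be computed as follows:

```python
def sackin_index(tree, depth=0):
    """Sum of leaf depths for a rooted tree.

    Assumed encoding (illustrative only): an internal node is a tuple of
    child subtrees; anything that is not a tuple is treated as a leaf.
    """
    if not isinstance(tree, tuple):
        # A leaf contributes its own depth (edges from the root).
        return depth
    # An internal node contributes the total of its subtrees, one level deeper.
    return sum(sackin_index(child, depth + 1) for child in tree)


# Two toy four-leaf trees: a balanced one and a fully unbalanced (caterpillar) one.
balanced = (("a", "b"), ("c", "d"))
caterpillar = ((("a", "b"), "c"), "d")

balanced_score = sackin_index(balanced)
caterpillar_score = sackin_index(caterpillar)
```

On these toy trees the unbalanced shape scores higher than the balanced one, which is why the index is read as a measure of overall tree imbalance. It summarizes the whole tree in one number and does not by itself locate where in the tree the imbalance arises.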
Affiliation(s)
- Yunjun Zhang
  - Department of Biostatistics, School of Public Health, Peking University, Beijing, China
  - Department of Mathematics, Stockholm University, Stockholm, Sweden
- Thomas Leitner
  - Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- Jan Albert
  - Department of Microbiology, Tumor and Cell Biology, Karolinska Institute, Stockholm, Sweden
  - Department of Clinical Microbiology, Karolinska University Hospital, Stockholm, Sweden
- Tom Britton
  - Department of Mathematics, Stockholm University, Stockholm, Sweden
|
11
|
Jung S, Moon J, Hwang E. Cluster-Based Analysis of Infectious Disease Occurrences Using Tensor Decomposition: A Case Study of South Korea. Int J Environ Res Public Health 2020; 17:ijerph17134872. [PMID: 32640742 PMCID: PMC7370004 DOI: 10.3390/ijerph17134872] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/09/2020] [Revised: 07/01/2020] [Accepted: 07/04/2020] [Indexed: 11/23/2022]
Abstract
For a long time, various epidemics, such as lower respiratory infections and diarrheal diseases, have caused serious social losses and costs. Various methods for analyzing infectious disease occurrences have been proposed for effective prevention and proactive response to reduce such losses and costs. However, the results of the occurrence analyses were limited because numerous factors affect the outbreak of infectious diseases and there are complex interactions between these factors. To alleviate this limitation, we propose a cluster-based analysis scheme of infectious disease occurrences that can discover commonalities or differences between clusters by grouping elements with similar occurrence patterns. To do this, we collect and preprocess infectious disease occurrence data according to time, region, and disease. Then, we construct a tensor for the data and apply Tucker decomposition to extract latent features in the dimensions of time, region, and disease. Based on these latent features, we conduct k-means clustering and analyze the results for each dimension. To demonstrate the effectiveness of this scheme, we conduct a case study on data from South Korea and report some of the results.
|
12
|
James N, Menzies M. Cluster-based dual evolution for multivariate time series: Analyzing COVID-19. Chaos 2020; 30:061108. [PMID: 32611104 PMCID: PMC7328914 DOI: 10.1063/5.0013156] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 05/08/2020] [Accepted: 06/11/2020] [Indexed: 05/20/2023]
Abstract
This paper proposes a cluster-based method to analyze the evolution of multivariate time series and applies this to the COVID-19 pandemic. On each day, we partition countries into clusters according to both their cases and death counts. The total number of clusters and individual countries' cluster memberships are algorithmically determined. We study the change in both quantities over time, demonstrating a close similarity in the evolution of cases and deaths. The changing number of clusters of the case counts precedes that of the death counts by 32 days. On the other hand, there is an optimal offset of 16 days with respect to the greatest consistency between cluster groupings, determined by a new method of comparing affinity matrices. With this offset in mind, we identify anomalous countries in the progression from COVID-19 cases to deaths. This analysis can aid in highlighting the most and least significant public policies in minimizing a country's COVID-19 mortality rate.
Affiliation(s)
- Nick James
  - School of Mathematics and Statistics, University of Sydney, NSW 2006, Australia
- Max Menzies (corresponding author)
  - Yau Mathematical Sciences Center, Tsinghua University, Beijing 100084, China
|
13
|
Han AX, Parker E, Scholer F, Maurer-Stroh S, Russell CA. Phylogenetic Clustering by Linear Integer Programming (PhyCLIP). Mol Biol Evol 2020; 36:1580-1595. [PMID: 30854550 PMCID: PMC6573476 DOI: 10.1093/molbev/msz053] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 12/14/2022]
Abstract
Subspecies nomenclature systems of pathogens are increasingly based on sequence data. The use of phylogenetics to identify and differentiate between clusters of genetically similar pathogens is particularly prevalent in virology from the nomenclature of human papillomaviruses to highly pathogenic avian influenza (HPAI) H5Nx viruses. These nomenclature systems rely on absolute genetic distance thresholds to define the maximum genetic divergence tolerated between viruses designated as closely related. However, the phylogenetic clustering methods used in these nomenclature systems are limited by the arbitrariness of setting intra- and intercluster diversity thresholds. The lack of a consensus ground truth to define well-delineated, meaningful phylogenetic subpopulations amplifies the difficulties in identifying an informative distance threshold. Consequently, phylogenetic clustering often becomes an exploratory, ad hoc exercise. Phylogenetic Clustering by Linear Integer Programming (PhyCLIP) was developed to provide a statistically principled phylogenetic clustering framework that negates the need for an arbitrarily defined distance threshold. Using the pairwise patristic distance distributions of an input phylogeny, PhyCLIP parameterizes the intra- and intercluster divergence limits as statistical bounds in an integer linear programming model which is subsequently optimized to cluster as many sequences as possible. When applied to the hemagglutinin phylogeny of HPAI H5Nx viruses, PhyCLIP was not only able to recapitulate the current WHO/OIE/FAO H5 nomenclature system but also further delineated informative higher resolution clusters that capture geographically distinct subpopulations of viruses. PhyCLIP is pathogen-agnostic and can be generalized to a wide variety of research questions concerning the identification of biologically informative clusters in pathogen phylogenies. PhyCLIP is freely available at http://github.com/alvinxhan/PhyCLIP, last accessed March 15, 2019.
Affiliation(s)
- Alvin X Han
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore (NUS), Singapore
  - Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
- Edyth Parker
  - Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Frits Scholer
  - Department of Medical Microbiology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
- Sebastian Maurer-Stroh
  - Bioinformatics Institute, Agency for Science, Technology and Research (A*STAR), Singapore
  - NUS Graduate School for Integrative Sciences and Engineering, National University of Singapore (NUS), Singapore
  - Department of Biological Sciences, National University of Singapore, Singapore
- Colin A Russell
  - Laboratory of Applied Evolutionary Biology, Department of Medical Microbiology, Academic Medical Centre, University of Amsterdam, Amsterdam, The Netherlands
|
14
|
Barido-Sottani J, Vaughan TG, Stadler T. Detection of HIV transmission clusters from phylogenetic trees using a multi-state birth-death model. J R Soc Interface 2019; 15:rsif.2018.0512. [PMID: 30185544 PMCID: PMC6170769 DOI: 10.1098/rsif.2018.0512] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 07/06/2018] [Accepted: 08/13/2018] [Indexed: 12/03/2022]
Abstract
HIV patients form clusters in HIV transmission networks. Accurate identification of these transmission clusters is essential to effectively target public health interventions. One reason for clustering is that the underlying contact network contains many local communities. We present a new maximum-likelihood method for identifying transmission clusters caused by community structure, based on phylogenetic trees. The method employs a multi-state birth–death (MSBD) model which detects changes in transmission rate, which are interpreted as the introduction of the epidemic into a new susceptible community, i.e. the formation of a new cluster. We show that the MSBD method is able to reliably infer the clusters and the transmission parameters from a pathogen phylogeny based on our simulations. In contrast to existing cutpoint-based methods for cluster identification, our method does not require that clusters be monophyletic nor is it dependent on the selection of a difficult-to-interpret cutpoint parameter. We present an application of our method to data from the Swiss HIV Cohort Study. The method is available as an easy-to-use R package.
Affiliation(s)
- Joëlle Barido-Sottani
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Switzerland
- Timothy G Vaughan
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Switzerland
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics (SIB), Switzerland
|
15
|
McLaughlin A, Sereda P, Oliveira N, Barrios R, Brumme CJ, Brumme ZL, Montaner JSG, Joy JB. Detection of HIV transmission hotspots in British Columbia, Canada: A novel framework for the prioritization and allocation of treatment and prevention resources. EBioMedicine 2019; 48:405-413. [PMID: 31628022 PMCID: PMC6838403 DOI: 10.1016/j.ebiom.2019.09.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/26/2019] [Accepted: 09/14/2019] [Indexed: 01/05/2023]
Abstract
Background: Identifying populations at high risk of HIV transmission is critical for prioritizing treatment and prevention resources and achieving the UNAIDS 90-90-90 Targets. Methods: HIV transmission rates can be estimated from phylogenetic trees as viral lineage-level diversification rates. To identify HIV-1 transmission foci in British Columbia, Canada, we inferred diversification rates from phylogenetic trees of 36 271 HIV-1 sequences from 9630 anonymized individuals. Diversification rates were combined with sociodemographic and clinical data, then aggregated by patients’ area of residence to predict the distribution of new HIV cases between 2008 and 2018. The predictive power of the model was compared with a phylogenetically uninformed model. Findings: Aggregated diversification rate measures were predictive of new HIV cases in the subsequent year after adjusting for prevalent and incident cases in the previous year. For every one-unit increase in the mean of the top five diversification rates, the number of new HIV cases increased by on average 1·38-fold (95% CI, 1·28–1·49). In a blind prediction of 2018 cases, diversification rate improved the model's specificity by 12%, accuracy by 9%, top 20 agreement by 100%, and correlation of predicted and observed values by 162% relative to a model that incorporated epidemiological data alone. Interpretation: By predicting the distribution of future HIV cases, a combined phylogenetic and epidemiological approach identifies hotspots where public health resources are needed most. Funding: Canadian Institutes of Health Research, University of British Columbia, Public Health Agency of Canada, Genome Canada, Genome BC, Michael Smith Foundation for Health Research, and BC Centre for Excellence in HIV/AIDS.
Affiliation(s)
- Angela McLaughlin
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada; Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Paul Sereda
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Natalia Oliveira
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Rolando Barrios
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada
- Chanson J Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Zabrina L Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Julio S G Montaner
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Jeffrey B Joy
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
|
16
|
Kosakovsky Pond SL, Weaver S, Leigh Brown AJ, Wertheim JO. HIV-TRACE (TRAnsmission Cluster Engine): a Tool for Large Scale Molecular Epidemiology of HIV-1 and Other Rapidly Evolving Pathogens. Mol Biol Evol 2019; 35:1812-1819. [PMID: 29401317 DOI: 10.1093/molbev/msy016] [Citation(s) in RCA: 162] [Impact Index Per Article: 32.4] [Indexed: 12/18/2022]
Abstract
In modern applications of molecular epidemiology, genetic sequence data are routinely used to identify clusters of transmission in rapidly evolving pathogens, most notably HIV-1. Traditional 'shoe-leather' epidemiology infers transmission clusters by tracing chains of partners sharing epidemiological connections (e.g., sexual contact). Here, we present a computational tool for identifying a molecular transmission analog of such clusters: HIV-TRACE (TRAnsmission Cluster Engine). HIV-TRACE implements an approach inspired by traditional epidemiology, by identifying chains of partners whose viral genetic relatedness implies direct or indirect epidemiological connections. Molecular transmission clusters are constructed using codon-aware pairwise alignment to a reference sequence followed by pairwise genetic distance estimation among all sequences. This approach is computationally tractable and is capable of identifying HIV-1 transmission clusters in large surveillance databases comprising tens or hundreds of thousands of sequences in near real time, that is, on the order of minutes to hours. HIV-TRACE is available at www.hivtrace.org and from www.github.com/veg/hivtrace, along with the accompanying result visualization module from www.github.com/veg/hivtrace-viz. Importantly, the approach underlying HIV-TRACE is not limited to the study of HIV-1 and can be applied to study outbreaks and epidemics of other rapidly evolving pathogens.
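The general idea described in this abstract, linking sequences whose pairwise genetic distance is small and reading clusters off the resulting chains of direct and indirect links, can be illustrated with a short sketch. This is not the HIV-TRACE implementation: the distance below is a plain proportion of differing sites on pre-aligned toy sequences, and the function names, the toy data and the threshold value are all assumptions made for illustration.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_clusters(seqs, threshold):
    """Connected components of the graph linking pairs with distance <= threshold.

    `seqs` maps a sequence id to its aligned sequence (hypothetical input format).
    """
    parent = {name: name for name in seqs}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(seqs, 2):
        if p_distance(seqs[a], seqs[b]) <= threshold:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for name in seqs:
        groups.setdefault(find(name), set()).add(name)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Toy alignment: s1-s2 and s2-s3 each differ at one site, s1-s3 at two sites.
toy = {
    "s1": "ACGTACGTAC",
    "s2": "ACGTACGTAT",
    "s3": "ACGTACGAAT",
    "s4": "TTTTGGGGCC",
}
clusters = distance_clusters(toy, threshold=0.1)
```

In the toy data, s1 and s3 are farther apart than the threshold but still share a cluster because both link to s2, which mirrors the "direct or indirect" connections mentioned in the abstract. Comparing all pairs is quadratic in the number of sequences, so a sketch like this says nothing about how the real tool reaches the scale it reports.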
Affiliation(s)
- Steven Weaver
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA
- Andrew J Leigh Brown
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Joel O Wertheim
  - Department of Medicine, University of California, San Diego, CA
|
17
|
Wertheim JO, Chato C, Poon AFY. Comparative analysis of HIV sequences in real time for public health. Curr Opin HIV AIDS 2019; 14:213-220. [PMID: 30882486 DOI: 10.1097/coh.0000000000000539] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
Purpose of review: The purpose of this study is to summarize recent advances in public health applications of comparative methods for HIV-1 sequence analysis in real time, including genetic clustering methods. Recent findings: Over the past 2 years, several groups have reported the deployment of established genetic clustering methods to guide public health decisions for HIV prevention in 'near real time'. However, it remains unresolved how well the readouts of comparative methods like clusters translate to events that are actionable for public health. A small number of recent studies have begun to elucidate the linkage between clusters and HIV-1 incidence, whereas others continue to refine and develop new comparative methods for such applications. Summary: Although the use of established methods to cluster HIV-1 sequence databases has become a widespread activity, there remains a critical gap between clusters and public health value.
Affiliation(s)
- Joel O Wertheim
  - Department of Medicine, University of California, San Diego, California, USA
- Art F Y Poon
  - Department of Pathology and Laboratory Medicine
  - Department of Microbiology and Immunology, Western University, London, Ontario, Canada
|
18
|
Carroll LM, Wiedmann M, Mukherjee M, Nicholas DC, Mingle LA, Dumas NB, Cole JA, Kovac J. Characterization of Emetic and Diarrheal Bacillus cereus Strains From a 2016 Foodborne Outbreak Using Whole-Genome Sequencing: Addressing the Microbiological, Epidemiological, and Bioinformatic Challenges. Front Microbiol 2019; 10:144. [PMID: 30809204 PMCID: PMC6379260 DOI: 10.3389/fmicb.2019.00144] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 08/24/2018] [Accepted: 01/21/2019] [Indexed: 12/21/2022]
Abstract
The Bacillus cereus group comprises multiple species capable of causing emetic or diarrheal foodborne illness. Although the group is responsible for tens of thousands of illnesses each year in the U.S. alone, whole-genome sequencing (WGS) is not yet routinely employed to characterize B. cereus group isolates from foodborne outbreaks. Here, we describe the first WGS-based characterization of isolates linked to an outbreak caused by members of the B. cereus group. In conjunction with a 2016 outbreak traced to a supplier of refried beans served by a fast food restaurant chain in upstate New York, a total of 33 B. cereus group isolates were obtained from human cases (n = 7) and food samples (n = 26). Emetic (n = 30) and diarrheal (n = 3) isolates were most closely related to B. paranthracis (group III) and B. cereus sensu stricto (group IV), respectively. WGS indicated that the 30 emetic isolates (24 and 6 from food and humans, respectively) were closely related and formed a well-supported clade distinct from publicly available emetic group III genomes with an identical sequence type (ST 26). The 30 emetic group III isolates from this outbreak differed from each other by a mean of 8.3 to 11.9 core single nucleotide polymorphisms (SNPs), while differing from publicly available emetic group III ST 26 B. cereus group genomes by a mean of 301.7-528.0 core SNPs, depending on the SNP calling methodology used. Using a WST-1 cell proliferation assay, the strains isolated from this outbreak had only mild detrimental effects on HeLa cell metabolic activity compared to reference diarrheal strain B. cereus ATCC 14579. We hypothesize that the outbreak was a single source outbreak caused by emetic group III B. cereus belonging to the B. paranthracis species, although food samples were not tested for presence of the emetic toxin cereulide. In addition to showcasing how WGS can be used to characterize B. cereus group strains linked to a foodborne outbreak, we also discuss potential microbiological and epidemiological challenges presented by B. cereus group outbreaks, and we offer recommendations for analyzing WGS data from the isolates associated with them.
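The abstract summarizes relatedness as the mean number of core SNPs by which isolates differ from each other. A minimal sketch of that summary statistic is given below, assuming the core genome has already been reduced to aligned strings of variable sites; the function names and the toy sequences are hypothetical and do not reproduce any SNP-calling pipeline used in the study.

```python
from itertools import combinations
from statistics import mean


def snp_distance(a, b):
    """Number of differing sites between two aligned, equal-length core-SNP strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_snps(seqs):
    """Mean SNP distance over all unordered pairs of isolates (needs at least two)."""
    return mean(snp_distance(a, b) for a, b in combinations(seqs, 2))


# Toy core-SNP strings for three hypothetical isolates.
toy_isolates = ["ACGTAC", "ACGTAT", "ACGAAT"]
toy_mean = mean_pairwise_snps(toy_isolates)
```

Because the result depends on which sites are counted as core SNPs, the same isolates can yield different means under different SNP-calling methods, which is consistent with the ranges reported in the abstract.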
Affiliation(s)
- Laura M. Carroll
  - Department of Food Science, Cornell University, Ithaca, NY, United States
- Martin Wiedmann
  - Department of Food Science, Cornell University, Ithaca, NY, United States
- Manjari Mukherjee
  - Department of Food Science, The Pennsylvania State University, University Park, PA, United States
- David C. Nicholas
  - New York State Department of Health, Corning Tower, Empire State Plaza, Albany, NY, United States
- Lisa A. Mingle
  - New York State Department of Health, Wadsworth Center, Albany, NY, United States
- Nellie B. Dumas
  - New York State Department of Health, Wadsworth Center, Albany, NY, United States
- Jocelyn A. Cole
  - New York State Department of Health, Wadsworth Center, Albany, NY, United States
- Jasna Kovac
  - Department of Food Science, The Pennsylvania State University, University Park, PA, United States
|