1
Weaver S, Dávila Conn VM, Ji D, Verdonk H, Ávila-Ríos S, Leigh Brown AJ, Wertheim JO, Kosakovsky Pond SL. AUTO-TUNE: selecting the distance threshold for inferring HIV transmission clusters. Frontiers in Bioinformatics 2024; 4:1400003. [PMID: 39086842 PMCID: PMC11289888 DOI: 10.3389/fbinf.2024.1400003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Accepted: 05/17/2024] [Indexed: 08/02/2024] Open
Abstract
Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
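The quantities named in this abstract (cluster count and the ratio of the largest to the second-largest cluster) can be computed across candidate thresholds with a short script. The following is a minimal sketch, not the published AUTO-TUNE score: the toy distances, the function names, and the `max_ratio` selection rule are all hypothetical and chosen only for illustration.

```python
# Toy pairwise distances (substitutions/site); hypothetical data for illustration only.
DISTANCES = {
    ("A", "B"): 0.004, ("C", "D"): 0.004, ("E", "F"): 0.009,
    ("G", "H"): 0.012, ("B", "C"): 0.014, ("D", "E"): 0.014,
    ("F", "G"): 0.020,
}
THRESHOLDS = [0.005, 0.010, 0.015, 0.020]


def clusters_at(distances, threshold):
    """Sizes (descending) of connected components with >= 2 members,
    linking every pair whose distance is <= threshold."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), d in distances.items():
        if d <= threshold:
            parent[find(a)] = find(b)  # union the two components

    sizes = {}
    for node in list(parent):
        root = find(node)
        sizes[root] = sizes.get(root, 0) + 1
    return sorted((s for s in sizes.values() if s >= 2), reverse=True)


def summarize(distances, thresholds):
    """Per-threshold cluster count and largest/second-largest size ratio."""
    rows = []
    for t in thresholds:
        sizes = clusters_at(distances, t)
        largest = sizes[0] if sizes else 0
        second = sizes[1] if len(sizes) > 1 else 0
        ratio = largest / second if second else float("inf")
        rows.append({"threshold": t, "n_clusters": len(sizes), "ratio": ratio})
    return rows


def pick_threshold(rows, max_ratio=3.0):
    """Hypothetical rule (an assumption, not the paper's score): among thresholds
    whose ratio stays at or below max_ratio, take the one with the most clusters,
    breaking ties toward the smaller threshold."""
    ok = [r for r in rows if r["ratio"] <= max_ratio]
    if not ok:
        return None
    return max(ok, key=lambda r: (r["n_clusters"], -r["threshold"]))["threshold"]


rows = summarize(DISTANCES, THRESHOLDS)
best = pick_threshold(rows)
```

On this toy input, the largest threshold merges everything into a single giant cluster, which is the failure mode the abstract describes; the hypothetical rule instead settles on an intermediate threshold that keeps several clusters separate.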
Affiliation(s)
- Steven Weaver
  - Center for Viral Evolution, Temple University, Philadelphia, PA, United States
- Vanessa M. Dávila Conn
  - Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Daniel Ji
  - Department of Medicine, University of California San Diego, La Jolla, CA, United States
- Hannah Verdonk
  - Center for Viral Evolution, Temple University, Philadelphia, PA, United States
- Andrew J. Leigh Brown
  - Department of Medicine, University of California San Diego, La Jolla, CA, United States
- Joel O. Wertheim
  - Department of Medicine, University of California San Diego, La Jolla, CA, United States
2
Sun C, Fang R, Salemi M, Prosperi M, Rife Magalis B. DeepDynaForecast: Phylogenetic-informed graph deep learning for epidemic transmission dynamic prediction. PLoS Comput Biol 2024; 20:e1011351. [PMID: 38598563 PMCID: PMC11034642 DOI: 10.1371/journal.pcbi.1011351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 04/22/2024] [Accepted: 03/11/2024] [Indexed: 04/12/2024] Open
Abstract
In the midst of an outbreak or sustained epidemic, reliable prediction of transmission risks and patterns of spread is critical to inform public health programs. Projections of transmission growth or decline among specific risk groups can aid in optimizing interventions, particularly when resources are limited. Phylogenetic trees have been widely used in the detection of transmission chains and high-risk populations. Moreover, tree topology and the incorporation of population parameters (phylodynamics) can be useful in reconstructing the evolutionary dynamics of an epidemic across space and time among individuals. We now demonstrate the utility of phylodynamic trees for transmission modeling and forecasting, developing a phylogeny-based deep learning system, referred to as DeepDynaForecast. Our approach leverages a primal-dual graph learning structure with shortcut multi-layer aggregation, which is suited for the early identification and prediction of transmission dynamics in emerging high-risk groups. We demonstrate the accuracy of DeepDynaForecast using simulated outbreak data and the utility of the learned model using empirical, large-scale data from the human immunodeficiency virus epidemic in Florida between 2012 and 2020. Our framework is available as open-source software (MIT license) at github.com/lab-smile/DeepDynaForcast.
Affiliation(s)
- Chaoyue Sun
  - Department of Electrical and Computer Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
- Ruogu Fang
  - Department of Electrical and Computer Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
  - J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
  - Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, Florida, United States of America
- Marco Salemi
  - Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, Florida, United States of America
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
- Mattia Prosperi
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
  - Department of Epidemiology, University of Florida, Gainesville, Florida, United States of America
- Brittany Rife Magalis
  - Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, Florida, United States of America
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
3
Weaver S, Dávila-Conn V, Ji D, Verdonk H, Ávila-Ríos S, Leigh Brown AJ, Wertheim JO, Kosakovsky Pond SL. AUTO-TUNE: selecting the distance threshold for inferring HIV transmission clusters. bioRxiv 2024:2024.03.11.584522. [PMID: 38559140 PMCID: PMC10979987 DOI: 10.1101/2024.03.11.584522] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Molecular surveillance of viral pathogens and inference of transmission networks from genomic data play an increasingly important role in public health efforts, especially for HIV-1. For many methods, the genetic distance threshold used to connect sequences in the transmission network is a key parameter informing the properties of inferred networks. Using a distance threshold that is too high can result in a network with many spurious links, making it difficult to interpret. Conversely, a distance threshold that is too low can result in a network with too few links, which may not capture key insights into clusters of public health concern. Published research using the HIV-TRACE software package frequently uses the default threshold of 0.015 substitutions/site for HIV pol gene sequences, but in many cases, investigators heuristically select other threshold parameters to better capture the underlying dynamics of the epidemic they are studying. Here, we present a general heuristic scoring approach for tuning a distance threshold adaptively, which seeks to prevent the formation of giant clusters. We prioritize the ratio of the sizes of the largest and the second largest cluster, maximizing the number of clusters present in the network. We apply our scoring heuristic to outbreaks with different characteristics, such as regional or temporal variability, and demonstrate the utility of using the scoring mechanism's suggested distance threshold to identify clusters exhibiting risk factors that would have otherwise been more difficult to identify. For example, while we found that a 0.015 substitutions/site distance threshold is typical for US-like epidemics, recent outbreaks like the CRF07_BC subtype among men who have sex with men (MSM) in China have been found to have a lower optimal threshold of 0.005 to better capture the transition from injected drug use (IDU) to MSM as the primary risk factor. 
Alternatively, in communities surrounding Lake Victoria in Uganda, where there has been sustained heterosexual transmission for many years, we found that a larger distance threshold is necessary to capture a more risk factor-diverse population with sparse sampling over a longer period of time. Such identification may allow for more informed intervention action by respective public health officials.
Affiliation(s)
- Steven Weaver
  - Center for Viral Evolution, Temple University, Philadelphia, PA, USA
- Vanessa Dávila-Conn
  - Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Daniel Ji
  - Department of Computer Science & Engineering, UC San Diego, La Jolla, CA 92093, USA
- Hannah Verdonk
  - Center for Viral Evolution, Temple University, Philadelphia, PA, USA
- Santiago Ávila-Ríos
  - Center for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Andrew J Leigh Brown
  - School of Biological Sciences, University of Edinburgh, Edinburgh, Scotland, United Kingdom
- Joel O Wertheim
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
4
Switzer WM, Shankar A, Jia H, Knyazev S, Ambrosio F, Kelly R, Zheng H, Campbell EM, Cintron R, Pan Y, Saduvala N, Panneer N, Richman R, Singh MB, Thoroughman DA, Blau EF, Khalil GM, Lyss S, Heneine W. High HIV diversity, recombination, and superinfection revealed in a large outbreak among persons who inject drugs in Kentucky and Ohio, USA. Virus Evol 2024; 10:veae015. [PMID: 38510920 PMCID: PMC10953796 DOI: 10.1093/ve/veae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 01/30/2024] [Accepted: 02/05/2024] [Indexed: 03/22/2024] Open
Abstract
We investigated transmission dynamics of a large human immunodeficiency virus (HIV) outbreak among persons who inject drugs (PWID) in KY and OH during 2017-20 by using detailed phylogenetic, network, recombination, and cluster dating analyses. Using polymerase (pol) sequences from 193 people associated with the investigation, we document high HIV-1 diversity, including Subtype B (44.6 per cent); numerous circulating recombinant forms (CRFs) including CRF02_AG (2.5 per cent) and CRF02_AG-like (21.8 per cent); and many unique recombinant forms composed of CRFs with major subtypes and sub-subtypes [CRF02_AG/B (24.3 per cent), B/CRF02_AG/B (0.5 per cent), and A6/D/B (6.4 per cent)]. Cluster analysis of sequences using a 1.5 per cent genetic distance identified thirteen clusters, including a seventy-five-member cluster composed of CRF02_AG-like and CRF02_AG/B, an eighteen-member CRF02_AG/B cluster, Subtype B clusters of sizes ranging from two to twenty-three, and a nine-member A6/D and A6/D/B cluster. Recombination and phylogenetic analyses identified CRF02_AG/B variants with ten unique breakpoints likely originating from Subtype B and CRF02_AG-like viruses in the largest clusters. The addition of contact tracing results from OH to the genetic networks identified linkage between persons with Subtype B, CRF02_AG, and CRF02_AG/B sequences in the clusters supporting de novo recombinant generation. Superinfection prevalence was 13.3 per cent (8/60) in persons with multiple specimens and included infection with B and CRF02_AG; B and CRF02_AG/B; or B and A6/D/B. In addition to the presence of multiple, distinct molecular clusters associated with this outbreak, cluster dating inferred transmission associated with the largest molecular cluster occurred as early as 2006, with high transmission rates during 2017-8 in certain other molecular clusters. 
This outbreak among PWID in KY and OH was likely driven by rapid transmission of multiple HIV-1 variants including de novo viral recombinants from circulating viruses within the community. Our findings documenting the high HIV-1 transmission rate and clustering through partner services and molecular clusters emphasize the importance of leveraging multiple different data sources and analyses, including those from disease intervention specialist investigations, to better understand outbreak dynamics and interrupt HIV spread.
Affiliation(s)
- William M Switzer
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Anupama Shankar
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Hongwei Jia
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Sergey Knyazev
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
  - Oak Ridge Institute for Science and Education, 1299 Bethel Valley Rd, Oak Ridge, TN 37830, USA
- Frank Ambrosio
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Reagan Kelly
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
  - General Dynamics Information Technology, 3150 Fairview Park Dr, Falls Church, VA 22042, USA
- HaoQiang Zheng
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Roxana Cintron
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Yi Pan
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Nivedha Panneer
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Rhiannon Richman
  - HIV Surveillance Program, Bureau of HIV/STI/Viral Hepatitis, Ohio Department of Health, 246 North High Street, Columbus, OH 43215, USA
- Manny B Singh
  - Division of Epidemiology and Health Planning, Kentucky Department for Public Health, Frankfort, KY 40621, USA
- Douglas A Thoroughman
  - Division of Epidemiology and Health Planning, Kentucky Department for Public Health, Frankfort, KY 40621, USA
  - ORR/Division of State and Local Readiness/Field Services Branch/CEFO Program, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Erin F Blau
  - Division of Epidemiology and Health Planning, Kentucky Department for Public Health, Frankfort, KY 40621, USA
  - Epidemic Intelligence Service, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- George M Khalil
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
- Sheryl Lyss
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
  - HIV Surveillance Program, Bureau of HIV/STI/Viral Hepatitis, Ohio Department of Health, 246 North High Street, Columbus, OH 43215, USA
  - Division of Epidemiology and Health Planning, Kentucky Department for Public Health, Frankfort, KY 40621, USA
  - Hamilton County Public Health, 250 William Howard Taft Rd, Cincinnati, OH 45219, USA
  - Northern Kentucky Health Department, 8001 Veterans Memorial Drive, Florence, KY 41042, USA
- Walid Heneine
  - Division of HIV Prevention, CDC, 1600 Clifton Rd, Atlanta, GA 30329, USA
5
DeGruttola V, Nakazawa M, Lin T, Liu J, Goyal R, Little S, Tu X, Mehta S. Modeling homophily in dynamic networks with application to HIV molecular surveillance. BMC Infect Dis 2023; 23:656. [PMID: 37794364 PMCID: PMC10548762 DOI: 10.1186/s12879-023-08598-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 09/11/2023] [Indexed: 10/06/2023] Open
Abstract
BACKGROUND Efforts to control the HIV epidemic can benefit from knowledge of the relationships between the characteristics of people who have transmitted HIV and those who became infected by them. Investigation of this relationship is facilitated by the use of HIV genetic linkage analyses, which allow inference about possible transmission events among people with HIV infection. Two persons with HIV (PWH) are considered linked if the genetic distance between their HIV sequences is less than a given threshold, which implies proximity in a transmission network. The tendency of pairs of nodes (in our case PWH) that share (or differ in) certain attributes to be linked is denoted homophily. Below, we describe a novel approach to modeling homophily with application to analyses of HIV viral genetic sequences from clinical series of participants followed in San Diego. Over the 22-year period of follow-up, increases in cluster size result from HIV transmissions to new people from those already in the cluster, either directly or through intermediaries. METHODS Our analytical approach makes use of a logistic model to describe homophily with regard to demographic, clinical, and behavioral characteristics; that is, we investigate whether similarities (or differences) between PWH in these characteristics are associated with their sequences being linked. To investigate the performance of our methods, we conducted a simulation study for which data sets were generated in a way that reproduced the structure of the observed database. RESULTS Our results demonstrated strong positive homophily associated with Hispanic ethnicity and strong negative homophily with birth year difference. The second result implies that the larger the difference between the age of a newly infected PWH and the average age for an available cluster, the lower the odds of a newly infected person joining that cluster. 
We did not observe homophily associated with prior diagnosis of sexually transmitted diseases. Our simulation studies demonstrated the validity of our approach for modeling homophily, by showing that the estimates it produced matched the specified values of the statistical network generating model. CONCLUSIONS Our novel methods provide a simple and flexible statistical network-based approach for modeling the growth of viral (or other microbial) genetic clusters from linkage to new infections based on genetic distance.
Affiliation(s)
- Victor DeGruttola
  - Division of Biostatistics and Bioinformatics Herbert Wertheim School of Public Health and Human Longevity Science, University of California, 9500 Gilman Dr., 92093-0628, San Diego, La Jolla, CA, USA
- Tuo Lin
  - Division of Biostatistics and Bioinformatics Herbert Wertheim School of Public Health and Human Longevity Science, University of California, 9500 Gilman Dr., 92093-0628, San Diego, La Jolla, CA, USA
- Jinyuan Liu
  - Vanderbilt University, Department of Medicine, Nashville, USA
- Ravi Goyal
  - Division of Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, CA, USA
- Susan Little
  - Division of Infectious Diseases and Global Public Health, University of California San Diego, La Jolla, CA, USA
- Xin Tu
  - Division of Biostatistics and Bioinformatics Herbert Wertheim School of Public Health and Human Longevity Science, University of California, 9500 Gilman Dr., 92093-0628, San Diego, La Jolla, CA, USA
- Sanjay Mehta
  - Veterans Affairs, San Diego Healthcare System, San Diego, CA, USA
6
Labarile M, Loosli T, Zeeb M, Kusejko K, Huber M, Hirsch HH, Perreau M, Ramette A, Yerly S, Cavassini M, Battegay M, Rauch A, Calmy A, Notter J, Bernasconi E, Fux C, Günthard HF, Pasin C, Kouyos RD, Aebi-Popp K, Anagnostopoulos A, Battegay M, Bernasconi E, Braun DL, Bucher HC, Calmy A, Cavassini M, Ciuffi A, Dollenmaier G, Egger M, Elzi L, Fehr J, Fellay J, Furrer H, Fux CA, Günthard HF, Hachfeld A, Haerry D, Hasse B, Hirsch HH, Hoffmann M, Hösli I, Huber M, Kahlert CR, Kaiser L, Keiser O, Klimkait T, Kouyos RD, Kovari H, Kusejko K, Martinetti G, Martinez de Tejada B, Marzolini C, Metzner KJ, Müller N, Nemeth J, Nicca D, Paioni P, Pantaleo G, Perreau M, Rauch A, Schmid P, Speck R, Stöckle M, Tarr P, Trkola A, Wandeler G, Yerly S. Quantifying and Predicting Ongoing Human Immunodeficiency Virus Type 1 Transmission Dynamics in Switzerland Using a Distance-Based Clustering Approach. J Infect Dis 2023; 227:554-564. [PMID: 36433831 DOI: 10.1093/infdis/jiac457] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/03/2022] [Revised: 11/11/2022] [Accepted: 11/25/2022] [Indexed: 11/27/2022] Open
Abstract
BACKGROUND Despite effective prevention approaches, ongoing human immunodeficiency virus 1 (HIV-1) transmission remains a public health concern indicating a need for identifying its drivers. METHODS We combined a network-based clustering method using evolutionary distances between viral sequences with statistical learning approaches to investigate the dynamics of HIV transmission in the Swiss HIV Cohort Study and to predict the drivers of ongoing transmission. RESULTS We found that only a minority of clusters and patients acquired links to new infections between 2007 and 2020. While the growth of clusters and the probability of individual patients acquiring new links in the transmission network was associated with epidemiological, behavioral, and virological predictors, the strength of these associations decreased substantially when adjusting for network characteristics. Thus, these network characteristics can capture major heterogeneities beyond classical epidemiological parameters. When modeling the probability of a newly diagnosed patient being linked with future infections, we found that the best predictive performance (median area under the curve receiver operating characteristic AUCROC = 0.77) was achieved by models including characteristics of the network as predictors and that models excluding them performed substantially worse (median AUCROC = 0.54). CONCLUSIONS These results highlight the utility of molecular epidemiology-based network approaches for analyzing and predicting ongoing HIV transmission dynamics. This approach may serve for real-time prospective assessment of HIV transmission.
Affiliation(s)
- Marco Labarile
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Tom Loosli
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Marius Zeeb
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Katharina Kusejko
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Michael Huber
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Hans H Hirsch
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, University of Basel, Basel, Switzerland
  - Transplantation and Clinical Virology, Department of Biomedicine, University of Basel, Basel, Switzerland
- Matthieu Perreau
  - Division of Immunology and Allergy, Lausanne University Hospital, University of Lausanne, Lausanne, Switzerland
- Alban Ramette
  - Institute for Infectious Diseases, University of Bern, Bern, Switzerland
- Sabine Yerly
  - Laboratory of Virology and Division of Infectious Diseases, Geneva University Hospital, University of Geneva, Geneva, Switzerland
- Matthias Cavassini
  - Division of Infectious Diseases, Lausanne University Hospital, Lausanne, Switzerland
- Manuel Battegay
  - Transplantation and Clinical Virology, Department of Biomedicine, University of Basel, Basel, Switzerland
- Andri Rauch
  - Department of Infectious Diseases, Bern University Hospital, University of Bern, Bern, Switzerland
- Alexandra Calmy
  - Laboratory of Virology and Division of Infectious Diseases, Geneva University Hospital, University of Geneva, Geneva, Switzerland
- Julia Notter
  - Division of Infectious Diseases, Cantonal Hospital St Gallen, St Gallen, Switzerland
- Enos Bernasconi
  - Division of Infectious Diseases, Regional Hospital Lugano, Lugano, Switzerland
- Christoph Fux
  - Department of Infectious Diseases, Kantonsspital Aarau, Aarau, Switzerland
- Huldrych F Günthard
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Chloé Pasin
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
- Roger D Kouyos
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zurich, Zurich, Switzerland
  - Institute of Medical Virology, University of Zurich, Zurich, Switzerland
7
Liu M, Chato C, Poon AFY. From components to communities: bringing network science to clustering for molecular epidemiology. Virus Evol 2023; 9:vead026. [PMID: 37187604 PMCID: PMC10175948 DOI: 10.1093/ve/vead026] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 01/30/2023] [Accepted: 04/17/2023] [Indexed: 05/17/2023] Open
Abstract
Defining clusters of epidemiologically related infections is a common problem in the surveillance of infectious disease. A popular method for generating clusters is pairwise distance clustering, which assigns pairs of sequences to the same cluster if their genetic distance falls below some threshold. The result is often represented as a network or graph of nodes. A connected component is a set of interconnected nodes in a graph that are not connected to any other node. The prevailing approach to pairwise clustering is to map clusters to the connected components of the graph on a one-to-one basis. We propose that this definition of clusters is unnecessarily rigid. For instance, the connected components can collapse into one cluster by the addition of a single sequence that bridges nodes in the respective components. Moreover, the distance thresholds typically used for viruses like HIV-1 tend to exclude a large proportion of new sequences, making it difficult to train models for predicting cluster growth. These issues may be resolved by revisiting how we define clusters from genetic distances. Community detection is a promising class of clustering methods from the field of network science. A community is a set of nodes that are more densely inter-connected relative to the number of their connections to external nodes. Thus, a connected component may be partitioned into two or more communities. Here we describe community detection methods in the context of genetic clustering for epidemiology, demonstrate how a popular method (Markov clustering) enables us to resolve variation in transmission rates within a giant connected component of HIV-1 sequences, and identify current challenges and directions for further work.
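The rigidity this abstract points out, where one bridging sequence collapses two connected components into a single cluster, is easy to reproduce. The sketch below is illustrative only: the sequence names, distances, and threshold are made up, and pairs not listed in the dictionary are assumed to lie above the threshold.

```python
from collections import defaultdict, deque


def threshold_edges(dist, threshold):
    """Keep pairs whose genetic distance falls below the threshold."""
    return [pair for pair, d in dist.items() if d < threshold]


def connected_components(nodes, edges):
    """Breadth-first search over an undirected graph; returns sorted member lists."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, components = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        members, queue = [], deque([start])
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(sorted(members))
    return components


THRESHOLD = 0.015  # hypothetical cutoff, substitutions/site

# Two well-separated pairs (hypothetical distances).
before = {("s1", "s2"): 0.010, ("s3", "s4"): 0.012, ("s1", "s3"): 0.040}
comps_before = connected_components(
    ["s1", "s2", "s3", "s4"], threshold_edges(before, THRESHOLD)
)

# A single new sequence s5 that is close to one member of each pair.
after = dict(before)
after[("s2", "s5")] = 0.013
after[("s4", "s5")] = 0.014
comps_after = connected_components(
    ["s1", "s2", "s3", "s4", "s5"], threshold_edges(after, THRESHOLD)
)
```

Here `comps_before` holds two components, while `comps_after` holds one component containing all five sequences, even though no distance among the original four changed. Community detection methods such as the Markov clustering discussed in the abstract aim to partition such a merged component further; that step is not implemented in this sketch.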
Affiliation(s)
- Molly Liu
  - Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044, London, ON N6A 5C1, Canada
- Connor Chato
  - Department of Pathology and Laboratory Medicine, Western University, Dental Sciences Building, Rm. 4044, London, ON N6A 5C1, Canada
8
Optimized phylogenetic clustering of HIV-1 sequence data for public health applications. PLoS Comput Biol 2022; 18:e1010745. [PMID: 36449514 PMCID: PMC9744331 DOI: 10.1371/journal.pcbi.1010745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Revised: 12/12/2022] [Accepted: 11/17/2022] [Indexed: 12/02/2022] Open
Abstract
Clusters of genetically similar infections suggest rapid transmission and may indicate priorities for public health action or reveal underlying epidemiological processes. However, clusters often require user-defined thresholds and are sensitive to non-epidemiological factors, such as non-random sampling. Consequently, the ideal threshold for public health applications varies substantially across settings. Here, we show a method which selects optimal thresholds for phylogenetic (subset tree) clustering based on population. We evaluated this method on HIV-1 pol datasets (n = 14,221 sequences) from four sites in the USA (Tennessee, Washington), Canada (Northern Alberta) and China (Beijing). Clusters were defined by tips descending from an ancestral node (with a minimum bootstrap support of 95%) through a series of branches, each with a length below a given threshold. Next, we used pplacer to graft new cases to the fixed tree by maximum likelihood. We evaluated the effect of varying branch-length thresholds on cluster growth as a count outcome by fitting two Poisson regression models: a null model that predicts growth from cluster size, and an alternative model that includes mean collection date as an additional covariate. The alternative model was favoured by AIC across most thresholds, with optimal (greatest difference in AIC) thresholds ranging 0.007-0.013 across sites. The range of optimal thresholds was more variable when re-sampling 80% of the data by location (IQR 0.008-0.016, n = 100 replicates). Our results use prospective phylogenetic cluster growth and suggest that there is more variation in effective thresholds for public health than those typically used in clustering studies.
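The threshold criterion in this abstract can be written compactly using the standard definition of the Akaike information criterion. The notation is assumed here rather than taken from the paper: $k$ is the number of fitted parameters, $\hat{L}$ the maximized likelihood, $t$ the branch-length threshold, and the sign convention makes a larger difference favour the alternative model.

```latex
\mathrm{AIC} = 2k - 2\ln\hat{L}, \qquad
\Delta\mathrm{AIC}(t) = \mathrm{AIC}_{\text{null}}(t) - \mathrm{AIC}_{\text{alt}}(t), \qquad
t^{*} = \arg\max_{t}\, \Delta\mathrm{AIC}(t)
```

Under this reading, both Poisson models are refitted at every candidate threshold, and the reported optimum is the threshold at which adding mean collection date improves the fit the most.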
9
Ragonnet-Cronin M, Hayford C, D’Aquila R, Ma F, Ward C, Benbow N, Wertheim JO. Forecasting HIV-1 Genetic Cluster Growth in Illinois, United States. J Acquir Immune Defic Syndr 2022; 89:49-55. [PMID: 34878434 PMCID: PMC8667185 DOI: 10.1097/qai.0000000000002821] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/28/2021] [Accepted: 09/08/2021] [Indexed: 01/03/2023]
Abstract
BACKGROUND HIV intervention activities directed toward both those most likely to transmit and their HIV-negative partners have the potential to substantially disrupt HIV transmission. Using HIV sequence data to construct molecular transmission clusters can reveal individuals whose viruses are connected. The utility of various cluster prioritization schemes measuring cluster growth has been demonstrated using surveillance data in New York City and, by the Centers for Disease Control and Prevention (CDC), across the United States. METHODS We examined clustering and cluster growth prioritization schemes using Illinois HIV sequence data that include cases from Chicago, a large urban center with high HIV prevalence, to compare their ability to predict future cluster growth. RESULTS We found that past cluster growth was a far better predictor of future cluster growth than cluster membership alone but found no substantive difference between the schemes used by CDC and the relative cluster growth scheme previously used in New York City (NYC). Focusing on individuals selected simultaneously by both the CDC and the NYC schemes did not provide additional improvements. CONCLUSION Growth-based prioritization schemes can easily be automated in HIV surveillance tools and can be used by health departments to identify and respond to clusters where HIV transmission may be actively occurring.
Affiliation(s)
- Manon Ragonnet-Cronin
  - Department of Medicine, University of California San Diego, San Diego, USA
  - MRC Centre for Global Infectious Disease Analysis, Department of Infectious Disease Epidemiology, Imperial College London, London, UK
- Christina Hayford
  - Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
- Richard D’Aquila
  - Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
- Fangchao Ma
  - Illinois Department of Public Health, Chicago, USA
- Cheryl Ward
  - Illinois Department of Public Health, Chicago, USA
- Nanette Benbow
  - Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, USA
- Joel O. Wertheim
  - Department of Medicine, University of California San Diego, San Diego, USA
|
10
|
Dávila‐Conn V, García‐Morales C, Matías‐Florentino M, López‐Ortiz E, Paz‐Juárez HE, Beristain‐Barreda Á, Cárdenas‐Sandoval M, Tapia‐Trejo D, López‐Sánchez DM, Becerril‐Rodríguez M, García‐Esparza P, Macías‐González I, Iracheta‐Hernández P, Weaver S, Wertheim JO, Reyes‐Terán G, González‐Rodríguez A, Ávila‐Ríos S. Characteristics and growth of the genetic HIV transmission network of Mexico City during 2020. J Int AIDS Soc 2021; 24:e25836. [PMID: 34762774 PMCID: PMC8583431 DOI: 10.1002/jia2.25836] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/15/2021] [Accepted: 10/13/2021] [Indexed: 12/04/2022] Open
Abstract
INTRODUCTION: Molecular surveillance systems could provide public health benefits to focus strategies to improve the HIV care continuum. Here, we infer the HIV genetic network of Mexico City in 2020, and identify actively growing clusters that could represent relevant targets for intervention. METHODS: All new diagnoses, referrals from other institutions, as well as persons returning to care, enrolling at the largest HIV clinic in Mexico City were invited to participate in the study. The network was inferred from HIV pol sequences, using pairwise genetic distance methods, with a locally hosted, secure version of the HIV-TRACE tool: Seguro HIV-TRACE. Socio-demographic, clinical and behavioural metadata were overlaid across the network to design focused prevention interventions. RESULTS: A total of 3168 HIV sequences from unique individuals were included. In total, 1150 (36%) sequences formed 1361 links within 386 transmission clusters in the network. Cluster size varied from 2 to 14 (63% were dyads). After adjustment for covariates, lower age (adjusted odds ratio [aOR]: 0.37, p<0.001; >34 vs. <24 years), being a man who has sex with men (MSM) (aOR: 2.47, p = 0.004; MSM vs. cisgender women), having higher viral load (aOR: 1.28, p<0.001) and higher CD4+ T cell count (aOR: 1.80, p<0.001; ≥500 vs. <200 cells/mm³) remained associated with higher odds of clustering. Compared to MSM, cisgender women and heterosexual men had significantly lower education (none or any elementary: 59.1% and 54.2% vs. 16.6%, p<0.001) and socio-economic status (low income: 36.4% and 29.0% vs. 18.6%, p = 0.03). We identified 10 (2.6%) clusters with constant growth, for prioritized intervention, that included intersecting sexual risk groups, highly connected nodes and bridge nodes between possible sub-clusters with high growth potential. CONCLUSIONS: HIV transmission in Mexico City is strongly driven by young MSM with higher education level and recent infection. Nevertheless, leveraging network inference, we identified actively growing clusters that could be prioritized for focused intervention with demographic and risk characteristics that do not necessarily reflect the ones observed in the overall clustering population. Further studies evaluating different models to predict growing clusters are warranted. Focused interventions will have to consider structural and risk disparities between the MSM and the heterosexual populations.
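The pairwise-distance network inference mentioned in this abstract can be illustrated with a small sketch: sequences are linked when their genetic distance is at or below a threshold, and clusters are the connected components of the resulting graph. This is not HIV-TRACE code. It assumes the pairwise distances are already computed, the function name `threshold_clusters` and the sample distances are hypothetical, and the 0.015 substitutions/site default is taken from the HIV-TRACE default cited elsewhere in this list rather than from this study.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def threshold_clusters(
    distances: Iterable[Tuple[str, str, float]],
    threshold: float = 0.015,
) -> List[Set[str]]:
    """Link pairs with distance <= threshold; return connected components.

    Sequences without any link never enter the structure, so every
    returned cluster has at least two members.
    """
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, d in distances:
        if d <= threshold:
            parent[find(a)] = find(b)  # union the two components

    groups: Dict[str, Set[str]] = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(groups.values(), key=len, reverse=True)


# Hypothetical pairwise distances (substitutions/site), for illustration.
pairs = [
    ("A", "B", 0.004),
    ("B", "C", 0.012),
    ("C", "D", 0.020),
    ("E", "F", 0.009),
]
print([sorted(c) for c in threshold_clusters(pairs)])
# prints: [['A', 'B', 'C'], ['E', 'F']]
```

Lowering the threshold to 0.005 on the same data leaves only the A-B dyad, which shows how sensitive cluster membership is to this single parameter.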
Affiliation(s)
- Vanessa Dávila‐Conn
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Claudia García‐Morales
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Eduardo López‐Ortiz
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Héctor E. Paz‐Juárez
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Ángeles Beristain‐Barreda
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Daniela Tapia‐Trejo
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Dulce M. López‐Sánchez
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Manuel Becerril‐Rodríguez
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Pedro García‐Esparza
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
- Steven Weaver
  - Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, Pennsylvania, USA
- Joel O. Wertheim
  - Department of Medicine, University of California San Diego, La Jolla, California, USA
- Gustavo Reyes‐Terán
  - Coordinating Commission of the National Institutes of Health and High Specialty Hospitals, Mexico City, Mexico
- Santiago Ávila‐Ríos
  - Centre for Research in Infectious Diseases, National Institute of Respiratory Diseases, Mexico City, Mexico
|
11
|
McLaughlin A, Sereda P, Brumme CJ, Brumme ZL, Barrios R, Montaner JSG, Joy JB. Concordance of HIV transmission risk factors elucidated using viral diversification rate and phylogenetic clustering. Evol Med Public Health 2021; 9:338-348. [PMID: 34754454 PMCID: PMC8573190 DOI: 10.1093/emph/eoab028] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/24/2020] [Accepted: 09/14/2021] [Indexed: 11/22/2022] Open
Abstract
BACKGROUND AND OBJECTIVES: Although HIV sequence clustering is routinely used to identify subpopulations experiencing elevated transmission, it over-simplifies transmission dynamics and is sensitive to methodology. Complementarily, viral diversification rates can be used to approximate historical transmission rates. Here, we investigated the concordance and sensitivity of HIV transmission risk factors identified by phylogenetic clustering, viral diversification rate, changes in viral diversification rate and a combined approach. METHODOLOGY: Viral sequences from 9848 people living with HIV in British Columbia, Canada, sampled between 1996 and February 2019, were used to infer phylogenetic trees, from which clusters were identified and viral diversification rates of each tip were calculated. Factors associated with heightened transmission risk were compared across models of cluster membership, viral diversification rate, changes in diversification rate, and viral diversification rate among clusters. RESULTS: Viruses within larger clusters had higher diversification rates and lower changes in diversification rate than those within smaller clusters; however, rates within individual clusters, independent of size, varied widely. Risk factors for both cluster membership and elevated viral diversification rate included being male, young, a resident of health authority E, previous injection drug use, previous hepatitis C virus infection or a high recent viral load. In a sensitivity analysis, models based on cluster membership had wider confidence intervals and lower concordance of significant effects than viral diversification rate for lower sampling rates. CONCLUSIONS AND IMPLICATIONS: Viral diversification rate complements phylogenetic clustering, offering a means of evaluating transmission dynamics to guide provision of treatment and prevention services.
LAY SUMMARY: Understanding HIV transmission dynamics within clusters can help prioritize public health resource allocation. We compared socio-demographic and clinical risk factors associated with phylogenetic cluster membership and viral diversification rate, a historical branching rate, in order to assess their relative concordance and sampling sensitivity.
Affiliation(s)
- Angela McLaughlin
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Department of Bioinformatics, University of British Columbia, Genome Sciences Centre, British Columbia Cancer Agency, 100-570 West 7th Avenue, Vancouver, BC V5Z 4S6, Canada
- Paul Sereda
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
- Chanson J Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
- Zabrina L Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Faculty of Health Sciences, Simon Fraser University, Blusson Hall, Room 11300, 8888 University Drive, Burnaby, BC V5A 1S6, Canada
- Rolando Barrios
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
- Julio S G Montaner
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
- Jeffrey B Joy
  - British Columbia Centre for Excellence in HIV/AIDS, St. Paul’s Hospital, 608-1081 Burrard Street, Vancouver, BC V6Z 1Y6, Canada
  - Department of Bioinformatics, University of British Columbia, Genome Sciences Centre, British Columbia Cancer Agency, 100-570 West 7th Avenue, Vancouver, BC V5Z 4S6, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, 452D, Heather Pavilion East, Vancouver General Hospital, 2733 Heather Street, Vancouver, BC V5Z 3J5, Canada
|
12
|
Rich SN, Richards VL, Mavian CN, Switzer WM, Rife Magalis B, Poschman K, Geary S, Broadway SE, Bennett SB, Blanton J, Leitner T, Boatwright JL, Stetten NE, Cook RL, Spencer EC, Salemi M, Prosperi M. Employing Molecular Phylodynamic Methods to Identify and Forecast HIV Transmission Clusters in Public Health Settings: A Qualitative Study. Viruses 2020; 12:E921. [PMID: 32842636 PMCID: PMC7551766 DOI: 10.3390/v12090921] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/19/2020] [Revised: 08/18/2020] [Accepted: 08/21/2020] [Indexed: 01/19/2023] Open
Abstract
Molecular HIV surveillance is a promising public health strategy for curbing the HIV epidemic. Clustering technologies used by health departments to date are limited in their ability to infer/forecast cluster growth trajectories. Resolution of the spatiotemporal dynamics of clusters, through phylodynamic and phylogeographic modelling, is one potential strategy to develop a forecasting tool; however, the projected utility of this approach needs assessment. Prior to incorporating novel phylodynamic-based molecular surveillance tools, we sought to identify possible issues related to their feasibility, acceptability, interpretation, and utility. Qualitative data were collected via focus groups among field experts (n = 17, 52.9% female) using semi-structured, open-ended questions. Data were coded using an iterative process, first through the development of provisional themes and subthemes, followed by independent line-by-line coding by two coders. Most participants routinely used molecular methods for HIV surveillance. All agreed that linking molecular sequences to epidemiological data is important for improving HIV surveillance. We found that, in addition to methodological challenges, a variety of implementation barriers are expected in relation to the uptake of phylodynamic methods for HIV surveillance. The participants identified several opportunities to enhance current methods, as well as increase the usability and utility of promising works-in-progress.
Affiliation(s)
- Shannan N. Rich
  - Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL 32610, USA
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
- Veronica L. Richards
  - Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL 32610, USA
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
- Carla N. Mavian
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- William M. Switzer
  - Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA 30322, USA
- Brittany Rife Magalis
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Karalee Poschman
  - Division of HIV/AIDS Prevention, Centers for Disease Control and Prevention, Atlanta, GA 30322, USA
  - Florida Department of Health, Division of Disease Control and Health Protection, Bureau of Communicable Diseases, HIV/AIDS Section, Tallahassee, FL 32399, USA
- Shana Geary
  - Division of Public Health, Injury and Violence Prevention Branch, North Carolina Department of Health and Human Services, Raleigh, NC 27699, USA
- Steven E. Broadway
  - Florida Department of Health, Division of Disease Control and Health Protection, Bureau of Communicable Diseases, HIV/AIDS Section, Tallahassee, FL 32399, USA
- Spencer B. Bennett
  - Florida Department of Health, Bureau of Public Health Laboratories, Jacksonville, FL 32202, USA
- Jason Blanton
  - Florida Department of Health, Bureau of Public Health Laboratories, Jacksonville, FL 32202, USA
- Thomas Leitner
  - Theoretical Biology & Biophysics Group, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- J. Lucas Boatwright
  - Department of Plant and Environmental Sciences, Clemson University, Clemson, SC 29634, USA
  - Advanced Plant Technology Program, Clemson University, Clemson, SC 29634, USA
- Nichole E. Stetten
  - Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Robert L. Cook
  - Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL 32610, USA
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
- Emma C. Spencer
  - Florida Department of Health, Division of Disease Control and Health Protection, Bureau of Communicable Diseases, HIV/AIDS Section, Tallahassee, FL 32399, USA
- Marco Salemi
  - Emerging Pathogens Institute, University of Florida, Gainesville, FL 32610, USA
  - Department of Pathology, Immunology, and Laboratory Medicine, College of Medicine, University of Florida, Gainesville, FL 32610, USA
- Mattia Prosperi
  - Department of Epidemiology, College of Public Health and Health Professions & College of Medicine, University of Florida, Gainesville, FL 32610, USA
|