1
Nanduri S, Black A, Bedford T, Huddleston J. Dimensionality reduction distills complex evolutionary relationships in seasonal influenza and SARS-CoV-2. Virus Evol 2024; 10:veae087. [PMID: 39610652 PMCID: PMC11604119 DOI: 10.1093/ve/veae087] [Received: 05/03/2024] [Revised: 09/30/2024] [Accepted: 10/11/2024] [Indexed: 11/30/2024] Open
Abstract
Public health researchers and practitioners commonly infer phylogenies from viral genome sequences to understand transmission dynamics and identify clusters of genetically-related samples. However, viruses that reassort or recombine violate phylogenetic assumptions and require more sophisticated methods. Even when phylogenies are appropriate, they can be unnecessary or difficult to interpret without specialty knowledge. For example, pairwise distances between sequences can be enough to identify clusters of related samples or assign new samples to existing phylogenetic clusters. In this work, we tested whether dimensionality reduction methods could capture known genetic groups within two human pathogenic viruses that cause substantial human morbidity and mortality and frequently reassort or recombine, respectively: seasonal influenza A/H3N2 and SARS-CoV-2. We applied principal component analysis, multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection to sequences with well-defined phylogenetic clades and either reassortment (H3N2) or recombination (SARS-CoV-2). For each low-dimensional embedding of sequences, we calculated the correlation between pairwise genetic and Euclidean distances in the embedding and applied a hierarchical clustering method to identify clusters in the embedding. We measured the accuracy of clusters compared to previously defined phylogenetic clades, reassortment clusters, or recombinant lineages. We found that MDS embeddings accurately represented pairwise genetic distances including the intermediate placement of recombinant SARS-CoV-2 lineages between parental lineages. Clusters from t-SNE embeddings accurately recapitulated known phylogenetic clades, H3N2 reassortment groups, and SARS-CoV-2 recombinant lineages. 
We show that simple statistical methods without a biological model can accurately represent known genetic relationships for relevant human pathogenic viruses. Our open source implementation of these methods for analysis of viral genome sequences can be easily applied when phylogenetic methods are either unnecessary or inappropriate.
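The workflow the abstract describes (pairwise genetic distances, a low-dimensional embedding, then a correlation between genetic and Euclidean distances) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the Hamming distance, the classical (Torgerson) MDS variant, the toy sequences, and all function names (`hamming`, `distance_matrix`, `classical_mds`, `embedding_fidelity`) are choices made here for the example.

```python
import math
from itertools import combinations


def hamming(a, b):
    """Number of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Symmetric matrix of pairwise Hamming distances."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = float(hamming(seqs[i], seqs[j]))
    return d


def classical_mds(d, dims=2, iters=200):
    """Classical MDS: top eigenvectors of the double-centred squared distances.

    Eigenvectors are found by power iteration with deflation (an assumption
    made to keep the sketch dependency-free).
    """
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    # Double-centre the squared distances to obtain a Gram matrix.
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [math.sin(i + k + 1) for i in range(n)]  # deterministic start vector
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(x * y for x, y in zip(v, bv))  # Rayleigh quotient
        if lam <= 1e-12:
            break  # no positive eigenvalue left; remaining axes stay at zero
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]  # deflate the component just found
    return coords


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def embedding_fidelity(d, coords):
    """Correlation between genetic distances and Euclidean embedding distances."""
    pairs = list(combinations(range(len(d)), 2))
    genetic = [d[i][j] for i, j in pairs]
    embedded = [math.dist(coords[i], coords[j]) for i, j in pairs]
    return pearson(genetic, embedded)


if __name__ == "__main__":
    # Hypothetical toy alignment with two obvious genetic groups.
    toy = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT",
           "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]
    dist = distance_matrix(toy)
    print(embedding_fidelity(dist, classical_mds(dist)))
```

A correlation close to 1 indicates that distances in the embedding track the genetic distances, which is the property the abstract reports for MDS. Real analyses would use full genome alignments and an established library rather than this hand-rolled eigen-solver.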
Affiliation(s)
- Sravani Nanduri
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, United States
- Allison Black
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
- Trevor Bedford
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
  - Howard Hughes Medical Institute, Seattle, WA, United States
- John Huddleston
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, United States
2
Gogishvili M, Arora AK, White TM, Lazarus JV. Recommendations for the equitable integration of digital health interventions across the HIV care cascade. Communications Medicine 2024; 4:226. [PMID: 39489853 PMCID: PMC11532406 DOI: 10.1038/s43856-024-00645-1] [Received: 04/22/2024] [Accepted: 10/14/2024] [Indexed: 11/05/2024] Open
Abstract
Gogishvili et al. highlight the crucial role of digital health interventions (DHIs) in improving HIV care outcomes and experiences. They provide recommendations for the equitable integration of DHIs in the HIV care cascade, emphasizing the need to address the digital divide to ensure inclusive access to healthcare.
Affiliation(s)
- Megi Gogishvili
  - Centre of Epidemiological Studies of HIV/AIDS and STI of Catalonia (CEEISCAT), Badalona, Spain
  - Health Department, Generalitat de Catalunya, Badalona, Spain
  - Germans Trias i Pujol Research Institute (IGTP), Campus Can Ruti, Badalona, Spain
- Anish K Arora
  - Department of Family Medicine, Faculty of Medicine and Health Sciences, McGill University, Montreal, Canada
  - Department of Family & Community Medicine, Temerty Faculty of Medicine, University of Toronto, Toronto, Canada
- Trenton M White
  - City University of New York Graduate School of Public Health and Health Policy (CUNY SPH), New York City, NY, USA
  - Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
  - Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
- Jeffrey V Lazarus
  - City University of New York Graduate School of Public Health and Health Policy (CUNY SPH), New York City, NY, USA
  - Barcelona Institute for Global Health (ISGlobal), Barcelona, Spain
  - Faculty of Medicine and Health Sciences, University of Barcelona, Barcelona, Spain
3
Kupperman MD, Ke R, Leitner T. Identifying Impacts of Contact Tracing on Epidemiological Inference from Phylogenetic Data. bioRxiv 2024:2023.11.30.567148. [PMID: 38076930 PMCID: PMC10705478 DOI: 10.1101/2023.11.30.567148] [Indexed: 12/23/2023]
Abstract
Robust sampling methods are foundational to inferences using phylogenies. Yet the impact of using contact tracing, a type of non-uniform sampling used in public health applications such as infectious disease outbreak investigations, has not been investigated in the molecular epidemiology field. To understand how contact tracing influences a recovered phylogeny, we developed a new simulation tool called SEEPS (Sequence Evolution and Epidemiological Process Simulator) that allows for the simulation of contact tracing and the resulting transmission tree, pathogen phylogeny, and corresponding virus genetic sequences. Importantly, SEEPS takes within-host evolution into account when generating pathogen phylogenies and sequences from transmission histories. Using SEEPS, we demonstrate that contact tracing can significantly impact the structure of the resulting tree, as described by popular tree statistics. Contact tracing generates phylogenies that are less balanced than the underlying transmission process, less representative of the larger epidemiological process, and affects the internal/external branch length ratios that characterize specific epidemiological scenarios. We also examined real data from a 2007-2008 Swedish HIV-1 outbreak and the broader 1998-2010 European HIV-1 epidemic to highlight the differences in contact tracing and expected phylogenies. Aided by SEEPS, we show that the data collection of the Swedish outbreak was strongly influenced by contact tracing even after downsampling, while the broader European Union epidemic showed little evidence of universal contact tracing, agreeing with the known epidemiological information about sampling and spread. Overall, our results highlight the importance of including possible non-uniform sampling schemes when examining phylogenetic trees. For that, SEEPS serves as a useful tool to evaluate such impacts, thereby facilitating better phylogenetic inferences of the characteristics of a disease outbreak. 
SEEPS is available at github.com/MolEvolEpid/SEEPS.
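To make the notion of "tree statistics" concrete, the sketch below computes two quantities of the kind the abstract mentions: a balance statistic and an internal/external branch length ratio. The abstract does not name the statistics used, so the Sackin index is an assumption picked as one widely used balance measure, and the ratio is defined here (also an assumption) as mean internal branch length divided by mean external branch length. The `Node` class and function names are hypothetical and unrelated to the SEEPS code base.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node; `length` is the branch leading to this node from its parent."""
    length: float = 0.0
    children: list = field(default_factory=list)


def sackin_index(root):
    """Sum of leaf depths, counted in edges from the root.

    For a fixed number of leaves, a larger value means a less balanced tree.
    """
    total = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node.children:
            stack.extend((child, depth + 1) for child in node.children)
        else:
            total += depth
    return total


def internal_external_ratio(root):
    """Mean internal branch length divided by mean external branch length.

    The root's own branch is ignored (assumption: the tree has no root edge).
    """
    internal, external = [], []
    stack = list(root.children)
    while stack:
        node = stack.pop()
        if node.children:
            internal.append(node.length)
            stack.extend(node.children)
        else:
            external.append(node.length)
    if not internal or not external:
        raise ValueError("tree needs at least one internal and one external branch")
    return (sum(internal) / len(internal)) / (sum(external) / len(external))


if __name__ == "__main__":
    # Two hypothetical four-leaf trees: fully balanced versus ladder-shaped.
    balanced = Node(children=[
        Node(2.0, [Node(1.0), Node(1.0)]),
        Node(2.0, [Node(1.0), Node(1.0)]),
    ])
    ladder = Node(children=[
        Node(2.0, [Node(2.0, [Node(1.0), Node(1.0)]), Node(1.0)]),
        Node(1.0),
    ])
    print(sackin_index(balanced), sackin_index(ladder))
    print(internal_external_ratio(balanced))
```

Comparing such statistics between trees simulated with and without contact tracing is the kind of analysis the abstract describes; the simulation itself is outside the scope of this sketch.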
Affiliation(s)
- Michael D. Kupperman
  - Theoretical Biology and Biophysics, Los Alamos National Laboratory, New Mexico, United States of America
  - Department of Applied Mathematics, University of Washington, Washington, United States of America
- Ruian Ke
  - Theoretical Biology and Biophysics, Los Alamos National Laboratory, New Mexico, United States of America
- Thomas Leitner
  - Theoretical Biology and Biophysics, Los Alamos National Laboratory, New Mexico, United States of America
4
Nanduri S, Black A, Bedford T, Huddleston J. Dimensionality reduction distills complex evolutionary relationships in seasonal influenza and SARS-CoV-2. bioRxiv 2024:2024.02.07.579374. [PMID: 39253501 PMCID: PMC11383015 DOI: 10.1101/2024.02.07.579374] [Indexed: 09/11/2024]
Abstract
Public health researchers and practitioners commonly infer phylogenies from viral genome sequences to understand transmission dynamics and identify clusters of genetically-related samples. However, viruses that reassort or recombine violate phylogenetic assumptions and require more sophisticated methods. Even when phylogenies are appropriate, they can be unnecessary or difficult to interpret without specialty knowledge. For example, pairwise distances between sequences can be enough to identify clusters of related samples or assign new samples to existing phylogenetic clusters. In this work, we tested whether dimensionality reduction methods could capture known genetic groups within two human pathogenic viruses that cause substantial human morbidity and mortality and frequently reassort or recombine, respectively: seasonal influenza A/H3N2 and SARS-CoV-2. We applied principal component analysis (PCA), multidimensional scaling (MDS), t-distributed stochastic neighbor embedding (t-SNE), and uniform manifold approximation and projection (UMAP) to sequences with well-defined phylogenetic clades and either reassortment (H3N2) or recombination (SARS-CoV-2). For each low-dimensional embedding of sequences, we calculated the correlation between pairwise genetic and Euclidean distances in the embedding and applied a hierarchical clustering method to identify clusters in the embedding. We measured the accuracy of clusters compared to previously defined phylogenetic clades, reassortment clusters, or recombinant lineages. We found that MDS embeddings accurately represented pairwise genetic distances including the intermediate placement of recombinant SARS-CoV-2 lineages between parental lineages. Clusters from t-SNE embeddings accurately recapitulated known phylogenetic clades, H3N2 reassortment groups, and SARS-CoV-2 recombinant lineages. 
We show that simple statistical methods without a biological model can accurately represent known genetic relationships for relevant human pathogenic viruses. Our open source implementation of these methods for analysis of viral genome sequences can be easily applied when phylogenetic methods are either unnecessary or inappropriate.
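The clustering and accuracy steps in the abstract (cluster the embedding hierarchically, then compare clusters with known clades) can be illustrated as follows. The abstract does not say which hierarchical method or accuracy measure was used, so both choices here are stand-ins: single-linkage clustering cut at a distance threshold, and the Rand index as the agreement score. The embedding coordinates, clade labels, and function names are hypothetical.

```python
import math
from itertools import combinations


def single_linkage_clusters(points, threshold):
    """Cut a single-linkage hierarchy at `threshold`.

    Two points share a cluster when a chain of points connects them in which
    every consecutive pair is at most `threshold` apart.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) <= threshold:
            parent[find(i)] = find(j)

    # Relabel clusters 0, 1, 2, ... in order of first appearance.
    relabel = {}
    return [relabel.setdefault(find(i), len(relabel)) for i in range(len(points))]


def rand_index(labels_a, labels_b):
    """Fraction of sample pairs on which two labelings agree.

    A pair agrees when both labelings put it together or both keep it apart.
    """
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


if __name__ == "__main__":
    # Hypothetical 2D embedding coordinates and known clade labels.
    embedding = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (10.0, 0.0), (10.5, 0.0)]
    clades = ["A", "A", "A", "B", "B"]
    clusters = single_linkage_clusters(embedding, threshold=1.0)
    print(clusters, rand_index(clusters, clades))
```

The threshold plays the role of the cut height in a dendrogram; in practice it would need to be tuned per embedding method, because t-SNE and MDS place points on different distance scales.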
Affiliation(s)
- Sravani Nanduri
  - Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA
- Allison Black
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
- Trevor Bedford
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
  - Howard Hughes Medical Institute, Seattle, WA, USA
- John Huddleston
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
5
Sun C, Fang R, Salemi M, Prosperi M, Rife Magalis B. DeepDynaForecast: Phylogenetic-informed graph deep learning for epidemic transmission dynamic prediction. PLoS Comput Biol 2024; 20:e1011351. [PMID: 38598563 PMCID: PMC11034642 DOI: 10.1371/journal.pcbi.1011351] [Received: 07/14/2023] [Revised: 04/22/2024] [Accepted: 03/11/2024] [Indexed: 04/12/2024] Open
Abstract
In the midst of an outbreak or sustained epidemic, reliable prediction of transmission risks and patterns of spread is critical to inform public health programs. Projections of transmission growth or decline among specific risk groups can aid in optimizing interventions, particularly when resources are limited. Phylogenetic trees have been widely used in the detection of transmission chains and high-risk populations. Moreover, tree topology and the incorporation of population parameters (phylodynamics) can be useful in reconstructing the evolutionary dynamics of an epidemic across space and time among individuals. We now demonstrate the utility of phylodynamic trees for transmission modeling and forecasting, developing a phylogeny-based deep learning system, referred to as DeepDynaForecast. Our approach leverages a primal-dual graph learning structure with shortcut multi-layer aggregation, which is suited for the early identification and prediction of transmission dynamics in emerging high-risk groups. We demonstrate the accuracy of DeepDynaForecast using simulated outbreak data and the utility of the learned model using empirical, large-scale data from the human immunodeficiency virus epidemic in Florida between 2012 and 2020. Our framework is available as open-source software (MIT license) at github.com/lab-smile/DeepDynaForcast.
Affiliation(s)
- Chaoyue Sun
  - Department of Electrical and Computer Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
- Ruogu Fang
  - Department of Electrical and Computer Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
  - J. Crayton Pruitt Family Department of Biomedical Engineering, Herbert Wertheim College of Engineering, University of Florida, Gainesville, Florida, United States of America
  - Center for Cognitive Aging and Memory, McKnight Brain Institute, University of Florida, Gainesville, Florida, United States of America
- Marco Salemi
  - Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, Florida, United States of America
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
- Mattia Prosperi
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America
  - Department of Epidemiology, University of Florida, Gainesville, Florida, United States of America
- Brittany Rife Magalis
  - Department of Pathology, Immunology, and Laboratory Medicine, University of Florida, Gainesville, Florida, United States of America
  - Emerging Pathogens Institute, University of Florida, Gainesville, Florida, United States of America