1
Cen S, Rasmussen DA. Exploring the Accuracy and Limits of Algorithms for Localizing Recombination Breakpoints. Mol Biol Evol 2024; 41:msae133. [PMID: 38917277 PMCID: PMC11229816 DOI: 10.1093/molbev/msae133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 06/04/2024] [Accepted: 06/11/2024] [Indexed: 06/27/2024] Open
Abstract
Phylogenetic methods are widely used to reconstruct the evolutionary relationships among species and individuals. However, recombination can obscure ancestral relationships as individuals may inherit different regions of their genome from different ancestors. It is, therefore, often necessary to detect recombination events, locate recombination breakpoints, and select recombination-free alignments prior to reconstructing phylogenetic trees. While many earlier studies have examined the power of different methods to detect recombination, very few have examined the ability of these methods to accurately locate recombination breakpoints. In this study, we simulated genome sequences based on ancestral recombination graphs and explored the accuracy of three popular recombination detection methods: MaxChi, 3SEQ, and Genetic Algorithm Recombination Detection. The accuracy of inferred breakpoint locations was evaluated along with the key factors contributing to variation in accuracy across datasets. While many different genomic features contribute to the variation in performance across methods, the number of informative sites consistent with the pattern of inheritance between parent and recombinant child sequences always has the greatest contribution to accuracy. While partitioning sequence alignments based on identified recombination breakpoints can greatly decrease phylogenetic error, the quality of phylogenetic reconstructions depends very little on how breakpoints are chosen to partition the alignment. Our work sheds light on how different features of recombinant genomes affect the performance of recombination detection methods and suggests best practices for reconstructing phylogenies based on recombination-free alignments.
Affiliation(s)
- Shi Cen
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- David A Rasmussen
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, USA
2
Chen YJ, Lin YC, Wu MT, Kuo JY, Wang CH. Prevention of Viral Hepatitis and HIV Infection among People Who Inject Drugs: A Systematic Review and Meta-Analysis. Viruses 2024; 16:142. [PMID: 38257842 PMCID: PMC10820947 DOI: 10.3390/v16010142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 01/11/2024] [Indexed: 01/24/2024] Open
Abstract
This study aimed to explore the current evidence on preventing blood-borne virus infections among people who inject drugs (PWID). We conducted a comprehensive search across three databases (PubMed, Embase, Cochrane Library) for relevant articles published in English between 2014 and 2023. We followed the Preferred Reporting Items for Systematic Reviews and Meta Analysis (PRISMA) guidelines, assessed the quality of the paper using the revised Cochrane Risk of Bias Tool (ROB 2), and conducted a meta-analysis using RevMan 5.3. Completing the harm reduction program (HRP) participation and receiving all three vaccine doses resulted in a 28% reduction in the risk of HBV infection (OR: 0.72, 95% CI: 0.37-1.42). Various interventions increased the willingness of PWIDs to undergo HCV treatment (OR: 5.91, 95% CI: 2.46-14.24) and promoted treatment adherence (OR: 15.04, 95% CI: 2.80-80.61). Taking PrEP, participating in HRP, and modifying risky behaviors were associated with a 33% reduction in the risk of HIV infection (OR: 0.67, 95% CI: 0.61-0.74). Conducting referrals, providing counseling, and implementing antiretroviral therapy resulted in a 44% reduction in the risk of viral transmission (OR: 0.56, 95% CI: 0.47-0.66). Co-infection may potentially compromise effectiveness, so it is important to consider drug resistance.
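The percentage reductions quoted in this abstract appear to follow from the odds ratios as (1 - OR) × 100. The sketch below reproduces that arithmetic and checks whether each reported 95% CI excludes the null value of 1. The function names are illustrative and do not come from the study, and reading (1 - OR) as a percentage reduction is an assumption based on the abstract's own wording.

```python
def percent_reduction(odds_ratio: float) -> float:
    """Percentage reduction implied by an odds ratio below 1 (assumed reading)."""
    return (1.0 - odds_ratio) * 100.0


def ci_excludes_null(lower: float, upper: float) -> bool:
    """True when a confidence interval for an odds ratio does not contain 1."""
    return not (lower <= 1.0 <= upper)


# Values copied from the abstract: (label, OR, CI lower, CI upper)
reported = [
    ("HBV infection", 0.72, 0.37, 1.42),
    ("HIV infection", 0.67, 0.61, 0.74),
    ("viral transmission", 0.56, 0.47, 0.66),
]

for label, odds_ratio, lower, upper in reported:
    reduction = round(percent_reduction(odds_ratio))
    print(label, reduction, ci_excludes_null(lower, upper))
```

Under this reading, the HBV interval (0.37-1.42) spans 1, whereas the two HIV-related intervals do not.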
Affiliation(s)
- Yen-Ju Chen
  - Research Assistant Center, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan 701033, Taiwan
  - Department of Food Nutrition, Chung Hwa University of Medical Technology, Tainan 717302, Taiwan
- Yu-Chen Lin
  - Research Assistant Center, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan 701033, Taiwan
- Meng-Tien Wu
  - Research Assistant Center, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan 701033, Taiwan
- Jenn-Yuan Kuo
  - Department of Hepatogastroenterology, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan 701033, Taiwan
- Chun-Hsiang Wang
  - Department of Hepatogastroenterology, Tainan Municipal Hospital (Managed by Show Chwan Medical Care Corporation), Tainan 701033, Taiwan
3
Park Y, Martin MA, Koelle K. Epidemiological inference for emerging viruses using segregating sites. Nat Commun 2023; 14:3105. [PMID: 37248255 DOI: 10.1038/s41467-023-38809-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2022] [Accepted: 05/16/2023] [Indexed: 05/31/2023] Open
Abstract
Epidemiological models are commonly fit to case and pathogen sequence data to estimate parameters and to infer unobserved disease dynamics. Here, we present an inference approach based on sequence data that is well suited for model fitting early on during the expansion of a viral lineage. Our approach relies on a trajectory of segregating sites to infer epidemiological parameters within a Sequential Monte Carlo framework. Using simulated data, we first show that our approach accurately recovers key epidemiological quantities under a single-introduction scenario. We then apply our approach to SARS-CoV-2 sequence data from France, estimating a basic reproduction number of approximately 2.3-2.7 under an epidemiological model that allows for multiple introductions. Our approach presented here indicates that inference approaches that rely on simple population genetic summary statistics can be informative of epidemiological parameters and can be used for reconstructing infectious disease dynamics during the early expansion of a viral lineage.
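The summary statistic named in this abstract, the number of segregating sites, is the count of alignment columns in which the sampled sequences differ. The sketch below computes it for one set of aligned sequences. Ignoring gaps and ambiguous `N` calls is an assumption made here for illustration, the function name is hypothetical, and the code is not the authors' implementation; a trajectory would repeat the count over successive time windows.

```python
def count_segregating_sites(sequences, ignore="-N"):
    """Count alignment columns that carry more than one distinct nucleotide."""
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")
    ignored = set(ignore)
    count = 0
    for column in zip(*sequences):
        # Distinct bases observed at this site, excluding gaps and ambiguity codes
        alleles = {base.upper() for base in column} - ignored
        if len(alleles) > 1:
            count += 1
    return count


# Toy alignment: sites 1 and 4 vary; the gap at site 3 is ignored
alignment = ["ACGT", "ACGA", "AC-T", "TCGT"]
print(count_segregating_sites(alignment))
```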
Affiliation(s)
- Yeongseon Park
  - Graduate Program in Population Biology, Ecology, and Evolution, Emory University, Atlanta, GA, 30322, USA
- Michael A Martin
  - Graduate Program in Population Biology, Ecology, and Evolution, Emory University, Atlanta, GA, 30322, USA
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Katia Koelle
  - Department of Biology, Emory University, Atlanta, GA, 30322, USA
  - Emory Center of Excellence for Influenza Research and Response (CEIRR), Atlanta, GA, USA
4
Cholette F, Lazarus L, Macharia P, Thompson LH, Githaiga S, Mathenge J, Walimbwa J, Kuria I, Okoth S, Wambua S, Albert H, Mwangi P, Adhiambo J, Kasiba R, Juma E, Battacharjee P, Kimani J, Sandstrom P, Meyers AFA, Joy JB, Thomann M, McLaren PJ, Shaw S, Mishra S, Becker ML, McKinnon L, Lorway R. Community Insights in Phylogenetic HIV Research: The CIPHR Project Protocol. Glob Public Health 2023; 18:2269435. [PMID: 37851872 DOI: 10.1080/17441692.2023.2269435] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 10/04/2023] [Indexed: 10/20/2023]
Abstract
Inferring HIV transmission networks from HIV sequences is gaining popularity in the field of HIV molecular epidemiology. However, HIV sequences are often analyzed at a distance from those affected by HIV epidemics, namely without the involvement of communities most affected by HIV. These remote analyses often mean that knowledge is generated in the absence of lived experiences and socio-economic realities that could inform the ethical application of network-derived information in 'real world' programmes. Procedures to engage communities are noticeably absent from the HIV molecular epidemiology literature. Here we present our team's protocol for engaging community activists living in Nairobi, Kenya in a knowledge exchange process - The CIPHR Project (Community Insights in Phylogenetic HIV Research). Drawing upon a community-based participatory approach, our team will (1) explore the possibilities and limitations of HIV molecular epidemiology for key population programmes, (2) pilot a community-based HIV molecular study, and (3) co-develop policy guidelines on conducting ethically safe HIV molecular epidemiology. Critical dialogue with activist communities will offer insight into the potential uses and abuses of using such information to sharpen HIV prevention programmes. The outcome of this process holds importance to the development of policy frameworks that will guide the next generation of the global response.
Affiliation(s)
- François Cholette
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - Sexually Transmitted and Blood-Borne Infections, National Microbiology Laboratory at JC Wilt Infectious Diseases Research Centre, Public Health Agency of Canada, Winnipeg, Canada
- Lisa Lazarus
  - Institute for Global Public Health, Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
- Pascal Macharia
  - Health Options for Young Men on HIV/AIDS and STIs (HOYMAS), Nairobi, Kenya
- Laura H Thompson
  - Sexually Transmitted and Blood-Borne Infections Surveillance Division, Centre for Communicable Diseases and Infection Control, Public Health Agency of Canada, Ottawa, Canada
- Samuel Githaiga
  - Health Options for Young Men on HIV/AIDS and STIs (HOYMAS), Nairobi, Kenya
- John Mathenge
  - Health Options for Young Men on HIV/AIDS and STIs (HOYMAS), Nairobi, Kenya
- Irene Kuria
  - Key Population Consortium of Kenya, Nairobi, Kenya
- Silvia Okoth
  - Bar Hostess Empowerment and Support Programme, Nairobi, Kenya
- Harrison Albert
  - Health Options for Young Men on HIV/AIDS and STIs (HOYMAS), Nairobi, Kenya
- Peninah Mwangi
  - Bar Hostess Empowerment and Support Programme, Nairobi, Kenya
- Joyce Adhiambo
  - Partners for Health Development in Africa (PHDA), Nairobi, Kenya
  - Sex Worker Outreach Programme (SWOP), Nairobi, Kenya
- Esther Juma
  - Sex Worker Outreach Programme (SWOP), Nairobi, Kenya
- Joshua Kimani
  - Sex Worker Outreach Programme (SWOP), Nairobi, Kenya
  - Department of Medical Microbiology, University of Nairobi, Nairobi, Kenya
- Paul Sandstrom
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - Sexually Transmitted and Blood-Borne Infections, National Microbiology Laboratory at JC Wilt Infectious Diseases Research Centre, Public Health Agency of Canada, Winnipeg, Canada
- Adrienne F A Meyers
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - Sexually Transmitted and Blood-Borne Infections, National Microbiology Laboratory at JC Wilt Infectious Diseases Research Centre, Public Health Agency of Canada, Winnipeg, Canada
- Jeffrey B Joy
  - British Columbia Centre for Excellence in HIV/AIDS (BCCfE), St. Paul's Hospital, Vancouver, Canada
  - Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, Canada
  - Bioinformatics Programme, University of British Columbia, Vancouver, Canada
- Matthew Thomann
  - Department of Anthropology, University of Maryland, College Park, MD, USA
- Paul J McLaren
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - Sexually Transmitted and Blood-Borne Infections, National Microbiology Laboratory at JC Wilt Infectious Diseases Research Centre, Public Health Agency of Canada, Winnipeg, Canada
- Souradet Shaw
  - Institute for Global Public Health, Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
- Sharmistha Mishra
  - MAP Centre for Urban Health Solutions, St. Michael's Hospital, Toronto, Canada
  - Department of Medicine, University of Toronto, Toronto, Canada
  - Institute of Medical Sciences, University of Toronto, Toronto, Canada
  - Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
- Marissa L Becker
  - Institute for Global Public Health, Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
- Lyle McKinnon
  - Department of Medical Microbiology and Infectious Diseases, University of Manitoba, Winnipeg, Canada
  - Department of Medical Microbiology, University of Nairobi, Nairobi, Kenya
  - Centre for the AIDS Programme of Research in South Africa, Durban, South Africa
- Robert Lorway
  - Institute for Global Public Health, Department of Community Health Sciences, University of Manitoba, Winnipeg, Canada
5
Zhang C, Bzikadze AV, Safonova Y, Mirarab S. A scalable model for simulating multi-round antibody evolution and benchmarking of clonal tree reconstruction methods. Front Immunol 2022; 13:1014439. [PMID: 36618367 PMCID: PMC9815712 DOI: 10.3389/fimmu.2022.1014439] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/08/2022] [Accepted: 10/26/2022] [Indexed: 12/12/2022] Open
Abstract
Affinity maturation (AM) of B cells through somatic hypermutations (SHMs) enables the immune system to evolve to recognize diverse pathogens. The accumulation of SHMs leads to the formation of clonal lineages of antibody-secreting B cells that have evolved from a common naïve B cell. Advances in high-throughput sequencing have enabled deep scans of B cell receptor repertoires, paving the way for reconstructing clonal trees. However, it is not clear if clonal trees, which capture microevolutionary time scales, can be reconstructed using traditional phylogenetic reconstruction methods with adequate accuracy. In fact, several clonal tree reconstruction methods have been developed to fix supposed shortcomings of phylogenetic methods. Nevertheless, no consensus has been reached regarding the relative accuracy of these methods, partially because evaluation is challenging. Benchmarking the performance of existing methods and developing better methods would both benefit from realistic models of clonal lineage evolution specifically designed for emulating B cell evolution. In this paper, we propose a model of B cell clonal lineage evolution and use this model to benchmark several existing clonal tree reconstruction methods. Our model, designed to be extensible, has several features: by evolving the clonal tree and sequences simultaneously, it allows modeling selective pressure due to changes in affinity binding; it enables scalable simulations of large numbers of cells; it enables several rounds of infection by an evolving pathogen; and it models building of memory. In addition, we also suggest a set of metrics for comparing clonal trees and measuring their properties. Our results show that while maximum likelihood phylogenetic reconstruction methods can fail to capture key features of clonal tree expansion if applied naively, a simple post-processing of their results, where short branches are contracted, leads to inferences that are better than alternative methods.
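The post-processing step mentioned at the end of this abstract, contracting short branches, can be illustrated on a small rooted tree. The sketch below is not the authors' code: the `Node` class, the `contract_short_branches` name, the threshold value, and the choice to contract only internal branches (leaving leaves in place) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str
    length: float = 0.0  # length of the branch leading to this node
    children: List["Node"] = field(default_factory=list)


def contract_short_branches(node: Node, threshold: float) -> Node:
    """Return a copy of the tree with short internal branches collapsed."""
    new_children = []
    for child in node.children:
        child = contract_short_branches(child, threshold)
        if child.children and child.length < threshold:
            # Collapse the short internal branch: its children attach to `node`
            new_children.extend(child.children)
        else:
            new_children.append(child)
    return Node(node.name, node.length, new_children)


tree = Node("root", 0.0, [
    Node("x", 0.001, [Node("a", 0.1), Node("b", 0.2)]),
    Node("c", 0.3),
])
collapsed = contract_short_branches(tree, threshold=0.01)
print([child.name for child in collapsed.children])
```

In this toy tree the branch leading to `x` falls below the threshold, so `a` and `b` become direct children of the root and the node turns into a multifurcation.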
Affiliation(s)
- Chao Zhang
  - Bioinformatics and Systems Biology, University of California, San Diego, San Diego, CA, United States
- Andrey V. Bzikadze
  - Bioinformatics and Systems Biology, University of California, San Diego, San Diego, CA, United States
- Yana Safonova
  - Computer Science and Engineering Department, University of California, San Diego, San Diego, CA, United States
- Siavash Mirarab (corresponding author)
  - Electrical and Computer Engineering Department, University of California, San Diego, San Diego, CA, United States
6
Pickles M, Cori A, Probert WJM, Sauter R, Hinch R, Fidler S, Ayles H, Bock P, Donnell D, Wilson E, Piwowar-Manning E, Floyd S, Hayes RJ, Fraser C. PopART-IBM, a highly efficient stochastic individual-based simulation model of generalised HIV epidemics developed in the context of the HPTN 071 (PopART) trial. PLoS Comput Biol 2021; 17:e1009301. [PMID: 34473700 PMCID: PMC8478209 DOI: 10.1371/journal.pcbi.1009301] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/27/2020] [Revised: 09/28/2021] [Accepted: 07/22/2021] [Indexed: 11/23/2022] Open
Abstract
Mathematical models are powerful tools in HIV epidemiology, producing quantitative projections of key indicators such as HIV incidence and prevalence. In order to improve the accuracy of predictions, such models need to incorporate a number of behavioural and biological heterogeneities, especially those related to the sexual network within which HIV transmission occurs. An individual-based model, which explicitly models sexual partnerships, is thus often the most natural type of model to choose. In this paper we present PopART-IBM, a computationally efficient individual-based model capable of simulating 50 years of an HIV epidemic in a large, high-prevalence community in under a minute. We show how the model calibrates within a Bayesian inference framework to detailed age- and sex-stratified data from multiple sources on HIV prevalence, awareness of HIV status, ART status, and viral suppression for an HPTN 071 (PopART) study community in Zambia, and present future projections of HIV prevalence and incidence for this community in the absence of trial intervention. In this paper we present PopART-IBM, an individual-based model used to simulate HIV transmission in communities in high prevalence settings. We show that PopART-IBM can simulate transmission over a span of decades in a large community in less than a minute. This computational efficiency allows us to calibrate the model within an inference framework, and we show an illustrative example of calibration using an adaptive population Monte Carlo Approximate Bayesian Computation algorithm for a community in Zambia that was part of the HPTN-071 (PopART) trial. We compare the detailed model output to real-world data collected during the trial from this community. Finally, we project how the HIV epidemic would have changed over time in this community if no intervention from the trial had occurred.
Affiliation(s)
- Michael Pickles
  - Medical Research Council Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
- Anne Cori
  - Medical Research Council Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, London, United Kingdom
- William J. M. Probert
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Rafael Sauter
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Robert Hinch
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Sarah Fidler
  - Department of Infectious Disease, Imperial College London, London, United Kingdom
  - Imperial College NIHR BRC, London, United Kingdom
- Helen Ayles
  - Zambart, School of Public Health, University of Zambia, Ridgeway Campus, Lusaka, Zambia
  - Department of Clinical Research, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Peter Bock
  - Desmond Tutu TB Centre, Department of Paediatrics and Child Health, Stellenbosch University, Cape Town, South Africa
- Deborah Donnell
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Ethan Wilson
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Estelle Piwowar-Manning
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland, United States of America
- Sian Floyd
  - Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Richard J. Hayes
  - Department of Infectious Disease Epidemiology, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Christophe Fraser
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
7
Almaraz K, Jang T, Lewis M, Ngo T, Song M, Moshiri N. SEPIA: simulation-based evaluation of prioritization algorithms. BMC Med Inform Decis Mak 2021; 21:177. [PMID: 34082739 PMCID: PMC8173910 DOI: 10.1186/s12911-021-01536-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2021] [Accepted: 05/23/2021] [Indexed: 11/18/2022] Open
Abstract
Background The ability to prioritize people living with HIV (PLWH) by risk of future transmissions could aid public health officials in optimizing epidemiological intervention. While methods exist to perform such prioritization based on molecular data, their effectiveness and accuracy are poorly understood, and it is unclear how one can directly compare the accuracy of different methods. We introduce SEPIA (Simulation-based Evaluation of PrIoritization Algorithms), a novel simulation-based framework for determining the effectiveness of prioritization algorithms. SEPIA expands upon prior related work by defining novel metrics of effectiveness with which to compare prioritization techniques, as well as by creating a simulation-based tool with which to perform such effectiveness comparisons. Under several metrics of effectiveness that we propose, we compare two existing prioritization approaches: one phylogenetic (ProACT) and one distance-based (growth of HIV-TRACE transmission clusters). Results Using all proposed metrics, ProACT consistently slightly outperformed the transmission cluster growth approach. However, both methods consistently performed just marginally better than random, suggesting that there is significant room for improvement in prioritization tools. Conclusion We hope that, by providing ways to quantify the effectiveness of prioritization methods in simulation, SEPIA will aid researchers in developing novel risk prioritization tools for PLWH.
Affiliation(s)
- Kimberly Almaraz
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Tyler Jang
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- McKenna Lewis
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Titan Ngo
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Miranda Song
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
- Niema Moshiri
  - Department of Computer Science and Engineering, University of California San Diego, 9500 Gilman Drive, La Jolla, CA, 92093, USA
8
Pekar J, Worobey M, Moshiri N, Scheffler K, Wertheim JO. Timing the SARS-CoV-2 index case in Hubei province. Science 2021; 372:412-417. [PMID: 33737402 PMCID: PMC8139421 DOI: 10.1126/science.abf8003] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 11/20/2020] [Accepted: 03/15/2021] [Indexed: 12/14/2022]
Abstract
Understanding when severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) emerged is critical to evaluating our current approach to monitoring novel zoonotic pathogens and understanding the failure of early containment and mitigation efforts for COVID-19. We used a coalescent framework to combine retrospective molecular clock inference with forward epidemiological simulations to determine how long SARS-CoV-2 could have circulated before the time of the most recent common ancestor of all sequenced SARS-CoV-2 genomes. Our results define the period between mid-October and mid-November 2019 as the plausible interval when the first case of SARS-CoV-2 emerged in Hubei province, China. By characterizing the likely dynamics of the virus before it was discovered, we show that more than two-thirds of SARS-CoV-2-like zoonotic events would be self-limited, dying out without igniting a pandemic. Our findings highlight the shortcomings of zoonosis surveillance approaches for detecting highly contagious pathogens with moderate mortality rates.
Affiliation(s)
- Jonathan Pekar
  - Bioinformatics and Systems Biology Graduate Program, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Biomedical Informatics, University of California San Diego, La Jolla, CA 92093, USA
- Michael Worobey
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ 85721, USA
- Niema Moshiri
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA 92093, USA
- Joel O Wertheim
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
9
Moshiri N, Smith DM, Mirarab S. HIV Care Prioritization Using Phylogenetic Branch Length. J Acquir Immune Defic Syndr 2021; 86:626-637. [PMID: 33394616 PMCID: PMC7933099 DOI: 10.1097/qai.0000000000002612] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/14/2020] [Accepted: 12/14/2020] [Indexed: 12/22/2022]
Abstract
BACKGROUND The structure of the HIV transmission networks can be dictated by just a few individuals. Public health intervention, such as ensuring people living with HIV adhere to antiretroviral therapy and remain virally suppressed, can help control the spread of the virus. However, such intervention requires using limited public health resource allocations. Determining which individuals are most at risk of transmitting HIV could allow public health officials to focus their limited resources on these individuals. SETTING Molecular epidemiology can help prioritize people living with HIV by patterns of transmission inferred from their sampled viral sequences. Such prioritization has been previously suggested and performed by monitoring cluster growth. In this article, we introduce Prioritization using AnCesTral edge lengths (ProACT), a phylogenetic approach for prioritizing individuals living with HIV. METHODS ProACT starts from a phylogeny inferred from sequence data and orders individuals according to their terminal branch length, breaking ties using ancestral branch lengths. We evaluated ProACT on a real data set of 926 HIV-1 subtype B pol data obtained in San Diego between 2005 and 2014 and a simulation data set modeling the same epidemic. Prioritization methods are compared by their ability to predict individuals who transmit most after the prioritization. RESULTS Across all simulation conditions and most real data sampling conditions, ProACT outperformed monitoring cluster growth for multiple metrics of prioritization efficacy. CONCLUSION The simple strategy used by ProACT improves the effectiveness of prioritization compared with state-of-the-art methods that rely on monitoring the growth of transmission clusters defined based on genetic distance.
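The ordering rule described in the METHODS sentence of this abstract (sort by terminal branch length, break ties with ancestral branch lengths) can be sketched as a lexicographic sort. This is an illustration only, not the ProACT implementation: the input format (each individual mapped to the branch lengths on the path from its leaf toward the root), the `prioritize` name, and the ascending direction of the sort are assumptions.

```python
def prioritize(branch_paths):
    """Order individuals by terminal branch length, then by ancestral lengths.

    branch_paths maps an individual's identifier to a list of branch lengths,
    starting with the terminal branch and moving toward the root (a
    hypothetical input format). Ascending order is an assumption.
    """
    # Tuples compare element by element, so ancestral lengths only break ties
    return sorted(branch_paths, key=lambda name: tuple(branch_paths[name]))


paths = {
    "A": [0.02, 0.01],
    "B": [0.01, 0.05],
    "C": [0.01, 0.03],
}
print(prioritize(paths))
```

Here `B` and `C` share the same terminal branch length, so the length of the branch above each one decides their relative order.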
Affiliation(s)
- Niema Moshiri
  - Department of Computer Science and Engineering, University of California, San Diego, La Jolla, 92093, USA
- Davey M. Smith
  - Department of Medicine, University of California, San Diego, La Jolla, 92093, USA
- Siavash Mirarab
  - Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, 92093, USA
10
Pacioni C, Vaughan TG, Strive T, Campbell S, Ramsey DSL. Field validation of phylodynamic analytical methods for inference on epidemiological processes in wildlife. Transbound Emerg Dis 2021; 69:1020-1029. [PMID: 33683829 DOI: 10.1111/tbed.14058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2020] [Revised: 02/24/2021] [Accepted: 03/02/2021] [Indexed: 11/30/2022]
Abstract
Amongst newly developed approaches to analyse molecular data, phylodynamic models are receiving much attention because of their potential to reveal changes to viral populations over short periods. This knowledge can be very important for understanding disease impacts. However, their accuracy needs to be fully understood, especially in relation to wildlife disease epidemiology, where sampling and prior knowledge may be limited. The release of the rabbit haemorrhagic disease virus (RHDV) as biological control in naïve rabbit populations in Australia in 1996 provides a unique data set with which to validate phylodynamic models. By comparing results obtained from RHDV sequence data with our current understanding of RHDV epidemiology in Australia, we evaluated the performances of these recently developed models. In line with our expectations, coalescent analyses detected a sharp increase in the virus population size in the first few months after release, followed by a more gradual increase. Phylodynamic analyses using a birth-death model generated effective reproductive number estimates (the average number of secondary infections per infectious case, Re) larger than one for most of the epochs considered. However, the possible range of the initial Re included estimates lower than one despite the known rapid spread of RHDV in Australia. Furthermore, the analyses that accounted for geographical structuring failed to converge. We argue that the difficulties that we encountered most likely stem from the fact that the samples available from 1996 to 2014 were too sparse with respect to both geographic and within-outbreak coverage to adequately infer some of the model parameters. In general, while these phylodynamic analyses proved to be greatly informative in some regards, we caution that their interpretation may not be straightforward. We recommend further research to evaluate the robustness of these models to assumption violations and sensitivity to sampling regimes.
Affiliation(s)
- Carlo Pacioni
- Arthur Rylah Institute for Environmental Research, Department of Environment, Land, Water and Planning, Heidelberg, VIC, Australia; School of Veterinary and Life Sciences, Murdoch University, Murdoch, WA, Australia; Centre for Invasive Species Solutions, Bruce, ACT, Australia
- Timothy G Vaughan
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Tanja Strive
- Centre for Invasive Species Solutions, Bruce, ACT, Australia; Commonwealth Scientific and Industrial Research Organisation, Canberra, ACT, Australia
- Susan Campbell
- Centre for Invasive Species Solutions, Bruce, ACT, Australia; Department of Primary Industries and Regional Development, Albany, WA, Australia
- David S L Ramsey
- Arthur Rylah Institute for Environmental Research, Department of Environment, Land, Water and Planning, Heidelberg, VIC, Australia; Centre for Invasive Species Solutions, Bruce, ACT, Australia
|
11
|
Little SJ, Chen T, Wang R, Anderson C, Kosakovsky Pond S, Nakazawa M, Mathews WC, DeGruttola V, Smith DM. Effective HIV Molecular Surveillance Requires Identification of Incident Cases of Infection. Clin Infect Dis 2021; 73:842-849. [PMID: 33588434 DOI: 10.1093/cid/ciab140] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/17/2020] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND Ending the HIV epidemic requires knowledge of key drivers of spread of HIV infection. METHODS Between 1996 and 2018, 1119 newly and previously diagnosed, therapy-naïve persons with HIV (PWH) from San Diego were followed. A genetic distance-based network was inferred using pol sequences, and genetic clusters grew over time through linkage of sequences from newly observed infections. Cox proportional hazards models were used to identify factors associated with the rate of growth. These results were used to predict the impact of a hypothetical intervention targeting PWH with incident infection. Comparison was made to the CDC EHE molecular surveillance strategy, which prioritizes clusters recently linked to all new HIV diagnoses and does not incorporate data on incident infections. RESULTS Overall, 219 genetic linkages to incident infections were identified over a median follow-up of 8.8 years. Incident cluster growth was strongly associated with proportion of PWH in the cluster who themselves had incident infection (HR 44.09; 95% CI: 17.09, 113.78). The CDC EHE molecular surveillance strategy identified 11 linkages to incident infections at a genetic distance threshold of 0.5%, and 24 linkages at 1.5%. CONCLUSIONS Over the past two decades, incident infections drove incident HIV cluster growth in San Diego. The current CDC EHE molecular detection and response strategy would not have identified most transmission events arising from those with incident infection in San Diego. Molecular surveillance that includes detection of incident cases will provide a more effective strategy for EHE.
Affiliation(s)
- Susan J Little
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA
- Tom Chen
- Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA
- Rui Wang
- Harvard Pilgrim Health Care Institute and Harvard Medical School, Boston, MA, USA; Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Christy Anderson
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA
- Sergei Kosakovsky Pond
- Institute for Genomics and Evolutionary Medicine, Temple University, Philadelphia, PA, USA
- Masato Nakazawa
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA
- Victor DeGruttola
- Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
- Davey M Smith
- Division of Infectious Diseases and Global Public Health, University of California San Diego, CA, USA; San Diego Veterans Affairs Healthcare System, San Diego, CA, USA
|
12
|
Didelot X, Kendall M, Xu Y, White PJ, McCarthy N. Genomic Epidemiology Analysis of Infectious Disease Outbreaks Using TransPhylo. Curr Protoc 2021; 1:e60. [PMID: 33617114 PMCID: PMC7995038 DOI: 10.1002/cpz1.60] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
Abstract
Comparing the pathogen genomes from several cases of an infectious disease has the potential to help us understand and control outbreaks. Many methods exist to reconstruct a phylogeny from such genomes, which represents how the genomes are related to one another. However, such a phylogeny is not directly informative about transmission events between individuals. TransPhylo is a software tool implemented as an R package designed to bridge the gap between pathogen phylogenies and transmission trees. TransPhylo is based on a combined model of transmission between hosts and pathogen evolution within each host. It can simulate both phylogenies and transmission trees jointly under this combined model. TransPhylo can also reconstruct a transmission tree based on a dated phylogeny, by exploring the space of transmission trees compatible with the phylogeny. A transmission tree can be represented as a coloring of a phylogeny where each color represents a different host of the pathogen, and TransPhylo provides convenient ways to plot these colorings and explore the results. This article presents the basic protocols that can be used to make the most of TransPhylo. © 2021 The Authors. Basic Protocol 1: First steps with TransPhylo; Basic Protocol 2: Simulation of outbreak data; Basic Protocol 3: Inference of transmission; Basic Protocol 4: Exploring the results of inference.
Affiliation(s)
- Xavier Didelot
- School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Michelle Kendall
- School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Yuanwei Xu
- Center for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, United Kingdom
- Peter J. White
- Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, United Kingdom
- Medical Research Council Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, United Kingdom
- National Institute for Health Research Health Protection Research Unit in Modelling and Health Economics, School of Public Health, Imperial College London, United Kingdom
- Modelling and Economics Unit, National Infection Service, Public Health England, London, United Kingdom
- Noel McCarthy
- Warwick Medical School, University of Warwick, United Kingdom
|
13
|
Manceau M, Gupta A, Vaughan T, Stadler T. The probability distribution of the ancestral population size conditioned on the reconstructed phylogenetic tree with occurrence data. J Theor Biol 2021; 509:110400. [PMID: 32739241 PMCID: PMC7733867 DOI: 10.1016/j.jtbi.2020.110400] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/28/2019] [Revised: 05/07/2020] [Accepted: 07/03/2020] [Indexed: 01/10/2023]
Abstract
We consider a homogeneous birth-death process with three different sampling schemes. First, individuals can be sampled through time and included in a reconstructed phylogenetic tree. Second, they can be sampled through time and only recorded as a point 'occurrence' along a timeline. Third, extant individuals can be sampled and included in the reconstructed phylogenetic tree with a fixed probability. We further consider that sampled individuals can be removed or not from the process, upon sampling, with fixed probability. We derive the probability distribution of the population size at any time in the past conditional on the joint observation of a reconstructed phylogenetic tree and a record of occurrences not included in the tree. We also provide an algorithm to simulate ancestral population size trajectories given the observation of a reconstructed phylogenetic tree and occurrences. This distribution can be readily used to draw inferences about the ancestral population size in the field of epidemiology and macroevolution. In epidemiology, these results will allow data from epidemiological case count studies to be used in conjunction with molecular sequencing data (yielding reconstructed phylogenetic trees) to coherently estimate prevalence through time. In macroevolution, it will foster the joint examination of the fossil record and extant taxa to reconstruct past biodiversity.
Affiliation(s)
- Marc Manceau
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Ankit Gupta
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Timothy Vaughan
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
- Tanja Stadler
- Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
|
14
|
Novitsky V, Steingrimsson JA, Howison M, Gillani FS, Li Y, Manne A, Fulton J, Spence M, Parillo Z, Marak T, Chan PA, Bertrand T, Bandy U, Alexander-Scott N, Dunn CW, Hogan J, Kantor R. Empirical comparison of analytical approaches for identifying molecular HIV-1 clusters. Sci Rep 2020; 10:18547. [PMID: 33122765 PMCID: PMC7596705 DOI: 10.1038/s41598-020-75560-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/20/2020] [Accepted: 09/21/2020] [Indexed: 01/10/2023] Open
Abstract
Public health interventions guided by clustering of HIV-1 molecular sequences may be impacted by choices of analytical approaches. We identified commonly-used clustering analytical approaches, applied them to 1886 HIV-1 Rhode Island sequences from 2004-2018, and compared concordance in identifying molecular HIV-1 clusters within and between approaches. We used strict (topological support ≥ 0.95; distance 0.015 substitutions/site) and relaxed (topological support 0.80-0.95; distance 0.030-0.045 substitutions/site) thresholds to reflect different epidemiological scenarios. We found that clustering differed by method and threshold and depended more on distance than topological support thresholds. Clustering concordance analyses demonstrated some differences across analytical approaches, with RAxML having the highest (91%) mean summary percent concordance when strict thresholds were applied, and three (RAxML-, FastTree regular bootstrap- and IQ-Tree regular bootstrap-based) analytical approaches having the highest (86%) mean summary percent concordance when relaxed thresholds were applied. We conclude that different analytical approaches can yield diverse HIV-1 clustering outcomes and may need to be differentially used in diverse public health scenarios. Recognizing the variability and limitations of commonly-used methods in cluster identification is important for guiding clustering-triggered interventions to disrupt new transmissions and end the HIV epidemic.
Affiliation(s)
- Mark Howison
- Research Improving People's Lives, Providence, RI, USA
- Philip A Chan
- Brown University, Providence, RI, USA
- Rhode Island Department of Health, Providence, RI, USA
- Utpala Bandy
- Rhode Island Department of Health, Providence, RI, USA
|
15
|
What Should Health Departments Do with HIV Sequence Data? Viruses 2020; 12:v12091018. [PMID: 32932642 PMCID: PMC7551807 DOI: 10.3390/v12091018] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 08/04/2020] [Revised: 09/09/2020] [Accepted: 09/11/2020] [Indexed: 11/27/2022] Open
Abstract
Many countries and US states have mandatory statutes that require reporting of HIV clinical data, including genetic sequencing results, to public health departments. Because genetic sequencing is a part of routine care for HIV infected persons, health departments have extensive sequence collections spanning years and even decades of the HIV epidemic. How should these data be used (or not) in public health practice? This is a complex, multi-faceted question that weighs personal risks against public health benefit. The answer is neither straightforward nor universal. However, to make that judgement, of how genetic sequence data should be used in describing and combating the HIV epidemic, we need a clear image of what a phylogenetically enhanced HIV surveillance system can do and what benefit it might provide. In this paper, we present a positive case for how up-to-date analysis of HIV sequence databases managed by health departments can provide unique and actionable information of how HIV is spreading in local communities. We discuss this question broadly, with examples from the US, as it is globally relevant for all health authorities that collect HIV genetic data.
|
16
|
Villabona-Arenas CJ, Hall M, Lythgoe KA, Gaffney SG, Regoes RR, Hué S, Atkins KE. Number of HIV-1 founder variants is determined by the recency of the source partner infection. Science 2020; 369:103-108. [PMID: 32631894 DOI: 10.1126/science.aba5443] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/12/2019] [Accepted: 05/11/2020] [Indexed: 01/10/2023]
Abstract
During sexual transmission, the high genetic diversity of HIV-1 within an individual is frequently reduced to one founder variant that initiates infection. Understanding the drivers of this bottleneck is crucial to developing effective infection control strategies. Little is known about the importance of the source partner during this bottleneck. To test the hypothesis that the source partner affects the number of HIV founder variants, we developed a phylodynamic model calibrated using genetic and epidemiological data on all existing transmission pairs for whom the direction of transmission and the infection stage of the source partner are known. Our results suggest that acquiring infection from someone in the acute (early) stage of infection increases the risk of multiple-founder variant transmission compared with acquiring infection from someone in the chronic (later) stage of infection. This study provides the first direct test of source partner characteristics to explain the low frequency of multiple-founder strain infections.
Affiliation(s)
- Ch Julián Villabona-Arenas
- Department of Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK
- Matthew Hall
- Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Katrina A Lythgoe
- Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Stephen G Gaffney
- Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
- Roland R Regoes
- Institute of Integrative Biology, Department of Environmental Systems Science, ETH Zurich, Zurich, Switzerland
- Stéphane Hué
- Department of Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK
- Katherine E Atkins
- Department of Infectious Disease Epidemiology, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, UK; Centre for Mathematical Modelling of Infectious Diseases, London School of Hygiene and Tropical Medicine, London, UK; Centre for Global Health, Usher Institute of Population Health Sciences and Informatics, University of Edinburgh, Edinburgh, UK
|
17
|
Gibson KM, Steiner MC, Rentia U, Bendall ML, Pérez-Losada M, Crandall KA. Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses. Viruses 2020; 12:E758. [PMID: 32674515 PMCID: PMC7412389 DOI: 10.3390/v12070758] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/28/2020] [Revised: 07/03/2020] [Accepted: 07/06/2020] [Indexed: 01/04/2023] Open
Abstract
Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resolution of NGS data, we developed HAplotype PHylodynamics PIPEline (HAPHPIPE), an open-source tool for the de novo and reference-based assembly of viral NGS data, with both consensus sequence assembly and a focus on the quantification of intra-host variation through haplotype reconstruction. We validate and compare the consensus sequence assembly methods of HAPHPIPE to those of two alternative software packages, HyDRA and Geneious, using simulated HIV and empirical HIV, HCV, and SARS-CoV-2 datasets. Our validation methods included read mapping, genetic distance, and genetic diversity metrics. In simulated NGS data, HAPHPIPE generated pol consensus sequences significantly closer to the true consensus sequence than those produced by HyDRA and Geneious and performed comparably to Geneious for HIV gp120 sequences. Furthermore, using empirical data from multiple viruses, we demonstrate that HAPHPIPE can analyze larger sequence datasets due to its greater computational speed. Therefore, we contend that HAPHPIPE provides a more user-friendly platform for users with and without bioinformatics experience to implement current best practices for viral NGS assembly than other currently available options.
Affiliation(s)
- Keylie M. Gibson
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Margaret C. Steiner
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Uzma Rentia
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Matthew L. Bendall
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Marcos Pérez-Losada
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- CIBIO-InBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, Universidade do Porto, Campus Agrário de Vairão, 4169-007 Vairão, Portugal
- Keith A. Crandall
- Computational Biology Institute, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
- Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052, USA
|
18
|
Moshiri N, Ragonnet-Cronin M, Wertheim JO, Mirarab S. FAVITES: simultaneous simulation of transmission networks, phylogenetic trees and sequences. Bioinformatics 2020; 35:1852-1861. [PMID: 30395173 DOI: 10.1093/bioinformatics/bty921] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 06/15/2018] [Revised: 10/29/2018] [Accepted: 11/01/2018] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION The ability to simulate epidemics as a function of model parameters allows insights that are unobtainable from real datasets. Further, reconstructing transmission networks for fast-evolving viruses like Human Immunodeficiency Virus (HIV) may have the potential to greatly enhance epidemic intervention, but transmission network reconstruction methods have been inadequately studied, largely because it is difficult to obtain 'truth' sets on which to test them and properly measure their performance. RESULTS We introduce FrAmework for VIral Transmission and Evolution Simulation (FAVITES), a robust framework for simulating realistic datasets for epidemics that are caused by fast-evolving pathogens like HIV. FAVITES creates a generative model to produce contact networks, transmission networks, phylogenetic trees and sequence datasets, and to add error to the data. FAVITES is designed to be extensible by dividing the generative model into modules, each of which is expressed as a fixed API that can be implemented using various models. We use FAVITES to simulate HIV datasets and study the realism of the simulated datasets. We then use the simulated data to study the impact of the increased treatment efforts on epidemiological outcomes. We also study two transmission network reconstruction methods and their effectiveness in detecting fast-growing clusters. AVAILABILITY AND IMPLEMENTATION FAVITES is available at https://github.com/niemasd/FAVITES, and a Docker image can be found on DockerHub (https://hub.docker.com/r/niemasd/favites). SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Niema Moshiri
- Bioinformatics and Systems Biology Graduate Program, UC San Diego, La Jolla, USA
- Siavash Mirarab
- Department of Electrical and Computer Engineering, UC San Diego, La Jolla, USA
|
19
|
Capoferri AA, Lamers SL, Grabowski MK, Rose R, Wawer MJ, Serwadda D, Gray RH, Quinn TC, Kigozi G, Kagaayi J, Laeyendecker O, Abeler-Dörner L, Ayles H, Bonsall D, Bowden R, Calvez V, Cohen M, Denis A, Frampton D, de Oliveira T, Essex M, Fidler S, Fraser C, Golubchik T, Hayes R, Herbeck JT, Hoppe A, Kaleebu P, Kellam P, Kityo C, Leigh-Brown A, Lingappa JR, Novitsky V, Paton N, Pillay D, Rambaut A, Ratmann O, Seeley J, Ssemwanga D, Tanser F. Recombination Analysis of Near Full-Length HIV-1 Sequences and the Identification of a Potential New Circulating Recombinant Form from Rakai, Uganda. AIDS Res Hum Retroviruses 2020; 36:467-474. [PMID: 31914792 DOI: 10.1089/aid.2019.0150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Abstract
The Phylogenetics And Networks for Generalized HIV Epidemics in Africa (PANGEA-HIV) consortium has been vital in the generation and examination of near full-length HIV-1 sequences generated from Sub-Saharan Africa. In this study, we examined a subset (n = 275) of sequences from Rakai, Uganda, collected between August 2011 and January 2015. Sequences were initially screened with COMET for subtyping and then evaluated using bootscanning and phylogenetic inference. Among 275 sequences, 38.6% were subtype D, 19.3% were subtype A, 2.9% were subtype C, and 39.3% were recombinant. The recombinants were structurally diverse in the number of breakpoints observed, the location of recombinant segments, and represented subtypes, with AD recombinants accounting for the majority of all recombinants (29.8%). Within the AD subpopulation, we identified a potential new circulating recombinant form in five individuals where the polymerase gene was subtype D and most of env was subtype A (D-A junctures at HXB2 6760 and 8709). While the breakpoints were identical for the viruses from these individuals, the viral fragments did not cluster together. These results suggest selection for a viral strain where properties of the subtype A and subtype D portions of the virus confer a survival advantage. The continued study of recombinants will increase our breadth of knowledge for the genetic diversity and evolution of HIV-1, which can further contribute to our understanding toward a universal HIV-1 vaccine.
Affiliation(s)
- Adam A. Capoferri
- The Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- Mary Kate Grabowski
- The Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Rakai Health Sciences Program, Entebbe, Uganda
- Maria J. Wawer
- The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Rakai Health Sciences Program, Entebbe, Uganda
- David Serwadda
- Rakai Health Sciences Program, Entebbe, Uganda
- Makerere University School of Public Health, Kampala, Uganda
- Ronald H. Gray
- The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Rakai Health Sciences Program, Entebbe, Uganda
- Thomas C. Quinn
- The Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Baltimore, Maryland, USA
- Oliver Laeyendecker
- The Johns Hopkins University School of Medicine, Baltimore, Maryland, USA
- The Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, USA
- Laboratory of Immunoregulation, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Baltimore, Maryland, USA
|
20
|
Liesenborgs J, Hendrickx DM, Kuylen E, Niyukuri D, Hens N, Delva W. SimpactCyan 1.0: An Open-source Simulator for Individual-Based Models in HIV Epidemiology with R and Python Interfaces. Sci Rep 2019; 9:19289. [PMID: 31848434 PMCID: PMC6917719 DOI: 10.1038/s41598-019-55689-4] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/31/2018] [Accepted: 11/29/2019] [Indexed: 01/21/2023] Open
Abstract
SimpactCyan is an open-source simulator for individual-based models in HIV epidemiology. Its core algorithm is written in C++ for computational efficiency, while the R and Python interfaces aim to make the tool accessible to the fast-growing community of R and Python users. Transmission, treatment and prevention of HIV infections in dynamic sexual networks are simulated by discrete events. A generic “intervention” event allows model parameters to be changed over time, and can be used to model medical and behavioural HIV prevention programmes. First, we describe a more efficient variant of the modified Next Reaction Method that drives our continuous-time simulator. Next, we outline key built-in features and assumptions of individual-based models formulated in SimpactCyan, and provide code snippets for how to formulate, execute and analyse models in SimpactCyan through its R and Python interfaces. Lastly, we give two examples of applications in HIV epidemiology: the first demonstrates how the software can be used to estimate the impact of progressive changes to the eligibility criteria for HIV treatment on HIV incidence. The second example illustrates the use of SimpactCyan as a data-generating tool for assessing the performance of a phylodynamic inference framework.
Affiliation(s)
- Jori Liesenborgs
- Expertise Centre for Digital Media, Hasselt University - tUL, Diepenbeek, Belgium
- Diana M Hendrickx
- Center for Statistics, I-BioStat, Hasselt University, Diepenbeek, Belgium
- Elise Kuylen
- IDLab, University of Antwerp, Antwerp, Belgium; Centre for Health Economics Research and Modelling Infectious Diseases and Centre for the Evaluation of Vaccination, Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
- David Niyukuri
- The South African Department of Science and Technology-National Research Foundation (DST-NRF) Centre of Excellence in Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Stellenbosch, South Africa; Department of Global Health, Faculty of Medicine and Health, Stellenbosch University, Stellenbosch, South Africa
- Niel Hens
- Center for Statistics, I-BioStat, Hasselt University, Diepenbeek, Belgium; Centre for Health Economics Research and Modelling Infectious Diseases and Centre for the Evaluation of Vaccination, Vaccine & Infectious Disease Institute, University of Antwerp, Antwerp, Belgium
- Wim Delva
- Center for Statistics, I-BioStat, Hasselt University, Diepenbeek, Belgium; The South African Department of Science and Technology-National Research Foundation (DST-NRF) Centre of Excellence in Epidemiological Modelling and Analysis (SACEMA), Stellenbosch University, Stellenbosch, South Africa; Department of Global Health, Faculty of Medicine and Health, Stellenbosch University, Stellenbosch, South Africa; International Centre for Reproductive Health, Ghent University, Ghent, Belgium; Rega Institute for Medical Research, KU Leuven, Leuven, Belgium; School for Data Science and Computational Thinking, Stellenbosch University, Stellenbosch, South Africa
|
21
|
Nascimento FF, Baral S, Geidelberg L, Mukandavire C, Schwartz SR, Turpin G, Turpin N, Diouf D, Diouf NL, Coly K, Kane CT, Ndour C, Vickerman P, Boily MC, Volz EM. Phylodynamic analysis of HIV-1 subtypes B, C and CRF 02_AG in Senegal. Epidemics 2019; 30:100376. [PMID: 31767497 PMCID: PMC10066795 DOI: 10.1016/j.epidem.2019.100376] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/05/2019] [Revised: 10/28/2019] [Accepted: 11/04/2019] [Indexed: 01/12/2023] Open
Abstract
Surveillance of HIV epidemics in key populations and in developing countries is often challenging due to sparse, incomplete, or low-quality data. Analysis of HIV sequence data can provide an alternative source of information about epidemic history, population structure, and transmission patterns. To understand HIV-1 dynamics and transmission patterns in Senegal, we carried out model-based phylodynamic analyses using the structured-coalescent approach using HIV-1 sequence data from three different subgroups: reproductive aged males and females from the adult Senegalese population and men who have sex with other men (MSM). We fitted these phylodynamic analyses to time-scaled phylogenetic trees individually for subtypes C and CRF 02_AG, and for the combined data for subtypes B, C and CRF 02_AG. In general, the combined analysis showed a decreasing proportion of effective number of infections among all reproductive aged adults relative to MSM. However, we observed a nearly time-invariant distribution for subtype CRF 02_AG and an increasing trend for subtype C on the proportion of effective number of infections. The population attributable fraction also differed between analyses: subtype CRF 02_AG showed little contribution from MSM, while for subtype C and combined analyses this contribution was much higher. Despite observed differences, results suggested that the combination of high assortativity among MSM and the unmet HIV prevention and treatment needs represent a significant component of the HIV epidemic in Senegal.
Affiliation(s)
- Fabrícia F Nascimento
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place W2 1PG, UK
- Stefan Baral
  - Department of Epidemiology, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Lily Geidelberg
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place W2 1PG, UK
- Christinah Mukandavire
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place W2 1PG, UK
- Sheree R Schwartz
  - Department of Epidemiology, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Gnilane Turpin
  - Department of Epidemiology, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Nafissatou Leye Diouf
  - Institut de Recherche en Santé, de Surveillance Epidemiologique et de Formations, Dakar, Senegal
- Karleen Coly
  - Department of Epidemiology, Johns Hopkins School of Public Health, Baltimore, MD, USA
- Coumba Toure Kane
  - Institut de Recherche en Santé, de Surveillance Epidemiologique et de Formations, Dakar, Senegal
- Cheikh Ndour
  - Division de La Lutte Contre Le Sida et Les IST, Ministry of Health, Dakar, Senegal
- Peter Vickerman
  - Population Health Sciences, Bristol Medical School, University of Bristol, Bristol, UK
- Marie-Claude Boily
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place W2 1PG, UK
- Erik M Volz
  - Department of Infectious Disease Epidemiology, Imperial College London, Norfolk Place W2 1PG, UK; MRC Centre for Global Infectious Disease Analysis, Imperial College London, UK.
22
Lewitus E, Rolland M. A non-parametric analytic framework for within-host viral phylogenies and a test for HIV-1 founder multiplicity. Virus Evol 2019; 5:vez044. [PMID: 31700680 PMCID: PMC6826062 DOI: 10.1093/ve/vez044] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/20/2023] Open
Abstract
Phylogenetics is a powerful tool for understanding the diversification dynamics of viral pathogens. Here we present an extension of the spectral density profile of the modified graph Laplacian, which facilitates the characterization of within-host molecular evolution of viruses and the direct comparison of diversification dynamics between hosts. This approach is non-parametric and therefore fast and model-free. We used simulations of within-host evolutionary scenarios to evaluate the efficiency of our approach and to demonstrate the significance of interpreting a viral phylogeny by its spectral density profile in terms of diversification dynamics. The key features that are captured by the profile are positive selection on the viral gene (or genome), temporal changes in substitution rates, mutational fitness, and time between sampling. Using sequences from individuals infected with HIV-1, we showed the utility of this approach for characterizing within-host diversification dynamics, for comparing dynamics between hosts, and for charting disease progression in infected individuals sampled over multiple years. We furthermore propose a heuristic test for assessing founder heterogeneity, which allows us to classify infections with single and multiple HIV-1 founder viruses. This non-parametric approach can be a valuable complement to existing parametric approaches.
Affiliation(s)
- Eric Lewitus
  - U.S. Military HIV Research Program (MHRP), WRAIR, 503 Robert Grant Avenue, Silver Spring, MD, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr, Bethesda, MD, USA
- Morgane Rolland
  - U.S. Military HIV Research Program (MHRP), WRAIR, 503 Robert Grant Avenue, Silver Spring, MD, USA; Henry M. Jackson Foundation for the Advancement of Military Medicine, Inc., 6720A Rockledge Dr, Bethesda, MD, USA
23
McLaughlin A, Sereda P, Oliveira N, Barrios R, Brumme CJ, Brumme ZL, Montaner JSG, Joy JB. Detection of HIV transmission hotspots in British Columbia, Canada: A novel framework for the prioritization and allocation of treatment and prevention resources. EBioMedicine 2019; 48:405-413. [PMID: 31628022 PMCID: PMC6838403 DOI: 10.1016/j.ebiom.2019.09.026] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/26/2019] [Accepted: 09/14/2019] [Indexed: 01/05/2023] Open
Abstract
Background: Identifying populations at high risk of HIV transmission is critical for prioritizing treatment and prevention resources and achieving the UNAIDS 90-90-90 Targets. Methods: HIV transmission rates can be estimated from phylogenetic trees as viral lineage-level diversification rates. To identify HIV-1 transmission foci in British Columbia, Canada, we inferred diversification rates from phylogenetic trees of 36 271 HIV-1 sequences from 9630 anonymized individuals. Diversification rates were combined with sociodemographic and clinical data, then aggregated by patients’ area of residence to predict the distribution of new HIV cases between 2008 and 2018. The predictive power of the model was compared with a phylogenetically uninformed model. Findings: Aggregated diversification rate measures were predictive of new HIV cases in the subsequent year after adjusting for prevalent and incident cases in the previous year. For every one-unit increase in the mean of the top five diversification rates, the number of new HIV cases increased by on average 1·38-fold (95% CI, 1·28–1·49). In a blind prediction of 2018 cases, diversification rate improved the model's specificity by 12%, accuracy by 9%, top 20 agreement by 100%, and correlation of predicted and observed values by 162% relative to a model that incorporated epidemiological data alone. Interpretation: By predicting the distribution of future HIV cases, a combined phylogenetic and epidemiological approach identifies hotspots where public health resources are needed most. Funding: Canadian Institutes of Health Research, University of British Columbia, Public Health Agency of Canada, Genome Canada, Genome BC, Michael Smith Foundation for Health Research, and BC Centre for Excellence in HIV/AIDS.
Affiliation(s)
- Angela McLaughlin
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada; Bioinformatics, University of British Columbia, Vancouver, BC, Canada
- Paul Sereda
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Natalia Oliveira
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Rolando Barrios
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; School of Population and Public Health, University of British Columbia, Vancouver, BC, Canada
- Chanson J Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada
- Zabrina L Brumme
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Faculty of Health Sciences, Simon Fraser University, Burnaby, BC, Canada
- Julio S G Montaner
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, BC, Canada
- Jeffrey B Joy
  - British Columbia Centre for Excellence in HIV/AIDS, University of British Columbia Department of Medicine, 608-1081 Burrard Street, Vancouver, BC, V6Z 1Y6, Canada; Division of Infectious Diseases, Department of Medicine, University of British Columbia, Vancouver, BC, Canada.
24
Hidano A, Gates MC. Assessing biases in phylodynamic inferences in the presence of super-spreaders. Vet Res 2019; 50:74. [PMID: 31558163 PMCID: PMC6764146 DOI: 10.1186/s13567-019-0692-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/10/2019] [Accepted: 08/28/2019] [Indexed: 12/03/2022] Open
Abstract
Phylodynamic analyses using pathogen genetic data have become popular for making epidemiological inferences. However, many methods assume that the underlying host population follows homogenous mixing patterns. Nevertheless, in real disease outbreaks, a small number of individuals infect a disproportionately large number of others (super-spreaders). Our objective was to quantify the degree of bias in estimating the epidemic starting date in the presence of super-spreaders using different sample selection strategies. We simulated 100 epidemics of a hypothetical pathogen (fast evolving foot and mouth disease virus-like) over a real livestock movement network, allowing genetic mutations in the pathogen sequence. Genetic sequences were sampled serially over the epidemic, which were then used to estimate the epidemic starting date using Extended Bayesian Coalescent Skyline plot (EBSP) and Birth–death skyline plot (BDSKY) models. Our results showed that the degree of bias varies over different epidemic situations, with substantial overestimation of the epidemic duration occurring on some occasions. While the accuracy and precision of BDSKY deteriorated when a super-spreader generated a larger proportion of secondary cases, those of EBSP deteriorated when epidemics were shorter. The accuracies of the inference were similar irrespective of whether the analysis used all sampled sequences or only a subset of them, although the former required substantially longer computational times. When phylodynamic analyses need to be performed under a time constraint to inform policy makers, we suggest that multiple phylodynamic models be used simultaneously on a subset of the data to ascertain the robustness of inferences.
Affiliation(s)
- Arata Hidano
  - EpiCentre, School of Veterinary Science, Massey University, Palmerston North, New Zealand.
- M Carolyn Gates
  - EpiCentre, School of Veterinary Science, Massey University, Palmerston North, New Zealand
25
Volz EM, Le Vu S, Ratmann O, Tostevin A, Dunn D, Orkin C, O'Shea S, Delpech V, Brown A, Gill N, Fraser C. Molecular Epidemiology of HIV-1 Subtype B Reveals Heterogeneous Transmission Risk: Implications for Intervention and Control. J Infect Dis 2018; 217:1522-1529. [PMID: 29506269 PMCID: PMC5913615 DOI: 10.1093/infdis/jiy044] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 08/15/2017] [Accepted: 01/22/2018] [Indexed: 11/25/2022] Open
Abstract
Background: The impact of HIV pre-exposure prophylaxis (PrEP) depends on infections averted by protecting vulnerable individuals as well as infections averted by preventing transmission by those who would have been infected if not receiving PrEP. Analysis of HIV phylogenies reveals risk factors for transmission, which we examine as potential criteria for allocating PrEP. Methods: We analyzed 6912 HIV-1 partial pol sequences from men who have sex with men (MSM) in the United Kingdom combined with global reference sequences and patient-level metadata. Population genetic models were developed that adjust for stage of infection, global migration of HIV lineages, and changing incidence of infection through time. Models were extended to simulate the effects of providing susceptible MSM with PrEP. Results: We found that young age <25 years confers higher risk of HIV transmission (relative risk = 2.52 [95% confidence interval, 2.32–2.73]) and that young MSM are more likely to transmit to one another than expected by chance. Simulated interventions indicate that 4-fold more infections can be averted over 5 years by focusing PrEP on young MSM. Conclusions: Concentrating PrEP doses on young individuals can avert more infections than random allocation.
Affiliation(s)
- Erik M Volz
  - Department of Infectious Disease Epidemiology and the National Institute for Health Research Health Protection Research Unit on Modeling Methodology, Imperial College London
- Stephane Le Vu
  - Department of Infectious Disease Epidemiology and the National Institute for Health Research Health Protection Research Unit on Modeling Methodology, Imperial College London
- Oliver Ratmann
  - Department of Infectious Disease Epidemiology and the National Institute for Health Research Health Protection Research Unit on Modeling Methodology, Imperial College London
- Anna Tostevin
  - Institute for Global Health, University College London
- David Dunn
  - Institute for Global Health, University College London
- Siobhan O'Shea
  - Infection Sciences, Viapath Analytics, Guy's and St Thomas' NHS Foundation Trust, London
- Christophe Fraser
  - Li Ka Shing Centre for Health Information and Discovery, Oxford University, United Kingdom
26
Ishikawa SA, Zhukova A, Iwasaki W, Gascuel O. A Fast Likelihood Method to Reconstruct and Visualize Ancestral Scenarios. Mol Biol Evol 2019; 36:2069-2085. [PMID: 31127303 PMCID: PMC6735705 DOI: 10.1093/molbev/msz131] [Citation(s) in RCA: 116] [Impact Index Per Article: 23.2] [Indexed: 12/23/2022] Open
Abstract
The reconstruction of ancestral scenarios is widely used to study the evolution of characters along phylogenetic trees. One commonly uses the marginal posterior probabilities of the character states, or the joint reconstruction of the most likely scenario. However, marginal reconstructions provide users with state probabilities, which are difficult to interpret and visualize, whereas joint reconstructions select a unique state for every tree node and thus do not reflect the uncertainty of inferences. We propose a simple and fast approach, which is in between these two extremes. We use decision-theory concepts (namely, the Brier score) to associate each node in the tree to a set of likely states. A unique state is predicted in tree regions with low uncertainty, whereas several states are predicted in uncertain regions, typically around the tree root. To visualize the results, we cluster the neighboring nodes associated with the same states and use graph visualization tools. The method is implemented in the PastML program and web server. The results on simulated data demonstrate the accuracy and robustness of the approach. PastML was applied to the phylogeography of Dengue serotype 2 (DENV2), and the evolution of drug resistances in a large HIV data set. These analyses took a few minutes and provided convincing results. PastML retrieved the main transmission routes of human DENV2 and showed the uncertainty of the human-sylvatic DENV2 geographic origin. With HIV, the results show that resistance mutations mostly emerge independently under treatment pressure, but resistance clusters are found, corresponding to transmissions among untreated patients.
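The Brier-score idea mentioned in this abstract can be illustrated with a small sketch. It is only one plausible reading of the abstract, not PastML's actual code: the function name `likely_states`, the input format, and the exact scoring rule (comparing the marginal probabilities with a prediction that spreads probability evenly over the k most probable states) are assumptions made for illustration.

```python
from typing import Dict, List


def likely_states(marginals: Dict[str, float]) -> List[str]:
    """Pick the set of top-ranked states that minimizes a Brier-type score.

    `marginals` maps each character state to its marginal posterior
    probability at one tree node (assumed to sum to 1).
    """
    # Rank states from most to least probable.
    ranked = sorted(marginals.items(), key=lambda kv: kv[1], reverse=True)
    best_k, best_score = 1, float("inf")
    for k in range(1, len(ranked) + 1):
        # Candidate prediction: probability 1/k on each of the top-k states,
        # 0 on all the others. Score it by squared distance to the marginals.
        score = sum((p - 1.0 / k) ** 2 for _, p in ranked[:k])
        score += sum(p ** 2 for _, p in ranked[k:])
        # Keep the smallest k on ties (small tolerance for float noise).
        if score < best_score - 1e-12:
            best_k, best_score = k, score
    return [state for state, _ in ranked[:best_k]]


if __name__ == "__main__":
    # Low uncertainty: a single state is kept.
    print(likely_states({"A": 0.90, "B": 0.05, "C": 0.05}))
    # Two states are equally plausible: both are kept.
    print(likely_states({"A": 0.45, "B": 0.45, "C": 0.10}))
```

Under this reading, a node with one dominant state receives a unique prediction, while a node with evenly spread probabilities receives several states, which matches the behaviour the abstract describes for low-uncertainty and uncertain tree regions.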
Affiliation(s)
- Sohta A Ishikawa
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
  - Department of Biological Sciences, The University of Tokyo, Tokyo, Japan
  - Evolutionary Genomics of RNA Viruses, Virology Department, Institut Pasteur, Paris, France
- Anna Zhukova
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
- Wataru Iwasaki
  - Department of Biological Sciences, The University of Tokyo, Tokyo, Japan
- Olivier Gascuel
  - Unité Bioinformatique Evolutive, Institut Pasteur, C3BI USR 3756 IP & CNRS, Paris, France
27
Abeler-Dörner L, Grabowski MK, Rambaut A, Pillay D, Fraser C. PANGEA-HIV 2: Phylogenetics And Networks for Generalised Epidemics in Africa. Curr Opin HIV AIDS 2019; 14:173-180. [PMID: 30946141 PMCID: PMC6629166 DOI: 10.1097/coh.0000000000000542] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/05/2023]
Abstract
PURPOSE OF REVIEW: The HIV epidemic in sub-Saharan Africa is far from being under control and the ambitious UNAIDS targets are unlikely to be met by 2020, as declines in per-capita incidence are being largely offset by demographic trends. There is an increasing number of proven and specific HIV prevention tools, but little consensus on how best to deploy them. RECENT FINDINGS: Traditionally, phylogenetics has been used in HIV research to reconstruct the history of the epidemic and date zoonotic infections, whereas more recent publications focus on HIV diversity and drug resistance. However, it is also the most powerful method of source attribution available for the study of HIV transmission. The PANGEA (Phylogenetics And Networks for Generalized Epidemics in Africa) consortium has generated over 18 000 NGS HIV sequences from five countries in sub-Saharan Africa. Using phylogenetic methods, we will identify characteristics of individuals or groups, which are most likely to be at risk of infection or at risk of infecting others. SUMMARY: Combining phylogenetics, phylodynamics and epidemiology will allow PANGEA to highlight where prevention efforts should be focussed to reduce the HIV epidemic most effectively. To maximise the public health benefit of the data, PANGEA offers accreditation to external researchers, allowing them to access the data and join the consortium. We also welcome submissions of other HIV sequences from sub-Saharan Africa to the database.
Affiliation(s)
- Lucie Abeler-Dörner
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Mary K. Grabowski
  - Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Rakai Health Sciences Program, Baltimore, USA
- Andrew Rambaut
  - Institute of Evolutionary Biology, University of Edinburgh, Ashworth Laboratories, Edinburgh, UK
- Deenan Pillay
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection and Immunity, University College London, London, UK
- Christophe Fraser
  - Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, UK
28
Duchene S, Bouckaert R, Duchene DA, Stadler T, Drummond AJ. Phylodynamic Model Adequacy Using Posterior Predictive Simulations. Syst Biol 2019; 68:358-364. [PMID: 29945220 PMCID: PMC6368481 DOI: 10.1093/sysbio/syy048] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 02/25/2018] [Accepted: 06/15/2018] [Indexed: 11/18/2022] Open
Abstract
Rapidly evolving pathogens, such as viruses and bacteria, accumulate genetic change on a timescale similar to that over which their epidemiological processes occur, such that it is possible to make inferences about their infectious spread using phylogenetic time-trees. For this purpose it is necessary to choose a phylodynamic model. However, the resulting inferences are contingent on whether the model adequately describes key features of the data. Model adequacy methods allow formal rejection of a model if it cannot generate the main features of the data. We present TreeModelAdequacy, a package for the popular BEAST2 software that allows assessing the adequacy of phylodynamic models. We illustrate its utility by analyzing phylogenetic trees from two viral outbreaks of Ebola and H1N1 influenza. The main features of the Ebola data were adequately described by the coalescent exponential-growth model, whereas the H1N1 influenza data were best described by the birth–death susceptible-infected-recovered model.
Affiliation(s)
- Sebastian Duchene
  - Department of Biochemistry and Molecular Biology, Bio21 Molecular Science and Biotechnology Institute, University of Melbourne, Melbourne, Australia
- Remco Bouckaert
  - Centre for Computational Evolution, University of Auckland, Auckland, New Zealand; Max Planck Institute for the Science of Human History, Jena, Germany
- David A Duchene
  - School of Life and Environmental Sciences, University of Sydney, Sydney, Australia
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland; Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Alexei J Drummond
  - Centre for Computational Evolution, University of Auckland, Auckland, New Zealand
29
Firestone SM, Hayama Y, Bradhurst R, Yamamoto T, Tsutsui T, Stevenson MA. Reconstructing foot-and-mouth disease outbreaks: a methods comparison of transmission network models. Sci Rep 2019; 9:4809. [PMID: 30886211 PMCID: PMC6423326 DOI: 10.1038/s41598-019-41103-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 03/29/2018] [Accepted: 02/28/2019] [Indexed: 12/22/2022] Open
Abstract
A number of transmission network models are available that combine genomic and epidemiological data to reconstruct networks of who infected whom during infectious disease outbreaks. For such models to reliably inform decision-making they must be transparently validated, robust, and capable of producing accurate predictions within the short data collection and inference timeframes typical of outbreak responses. A lack of transparent multi-model comparisons reduces confidence in the accuracy of transmission network model outputs, negatively impacting on their more widespread use as decision-support tools. We undertook a formal comparison of the performance of nine published transmission network models based on a set of foot-and-mouth disease outbreaks simulated in a previously free country, with corresponding simulated phylogenies and genomic samples from animals on infected premises. Of the transmission network models tested, Lau’s systematic Bayesian integration framework was found to be the most accurate for inferring the transmission network and timing of exposures, correctly identifying the source of 73% of the infected premises (with 91% accuracy for sources with model support >0.80). The Structured COalescent Transmission Tree Inference provided the most accurate inference of molecular clock rates. This validation study points to which models might be reliably used to reconstruct similar future outbreaks and how to interpret the outputs to inform control. Further research could involve extending the best-performing models to explicitly represent within-host diversity so they can handle next-generation sequencing data, incorporating additional animal and farm-level covariates and combining predictions using Ensemble methods and other approaches.
Affiliation(s)
- Simon M Firestone
  - Asia-Pacific Centre for Animal Health, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia.
- Yoko Hayama
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Richard Bradhurst
  - Centre of Excellence for Biosecurity Risk Analysis, The University of Melbourne, Parkville, VIC, 3010, Australia
- Takehisa Yamamoto
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Toshiyuki Tsutsui
  - Viral Disease and Epidemiology Research Division, National Institute of Animal Health, National Agriculture Research Organization, Tsukuba, Ibaraki, 305-0856, Japan
- Mark A Stevenson
  - Asia-Pacific Centre for Animal Health, Melbourne Veterinary School, Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, 3010, Australia
30
Fleming TR, DeGruttola V, Donnell D. Designing & Conducting Trials To Reliably Evaluate HIV Prevention Interventions. Stat Commun Infect Dis 2019; 11. [PMID: 33777327 DOI: 10.1515/scid-2019-0001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
While much has been achieved, much remains to be accomplished in the science of preventing the spread of HIV infection. Clinical trials that are properly designed, conducted and analyzed are of integral importance in the pursuit of reliable insights about HIV prevention. As we build on previous scientific breakthroughs, there will be an increasing need for clinical trials to be designed to efficiently achieve insights without compromising their reliability and generalizability. Key design features should continue to include: 1) the use of randomization and evidence-based controls, 2) specifying the use of intention-to-treat analyses to preserve the integrity of randomization and to increase interpretability of results, 3) obtaining direct assessments of effects on clinical endpoints such as the risk of HIV infection, 4) using either superiority designs or non-inferiority designs with rigorous non-inferiority margins, and 5) enhancing generalizability through the choice of a relative risk rather than risk difference metric. When interventions have complementary and potentially synergistic effects, factorial designs should be considered to increase efficiency as well as to obtain clinically important insights about interaction and the contribution of component interventions to the efficacy and safety of combination regimens. Key trial conduct issues include timely enrollment of participants at high HIV risk recruited from populations with high viral burden, obtaining 'best real-world achievable' levels of adherence to the interventions being assessed and ensuring high levels of retention. High quality of trial conduct occurs through active rather than passive monitoring, using pre-specified targeted levels of performance with defined methods to achieve those targets. 
During trial conduct, active monitoring of the performance standards not only holds the trial leaders accountable but also can assist in the development and implementation of creative alternative approaches to increase the quality of trial conduct. Designing, conducting and analyzing HIV prevention trials with the quality needed to obtain reliable insights is an ethical as well as scientific imperative.
Affiliation(s)
- Thomas R Fleming
  - Department of Biostatistics, University of Washington, Seattle, WA, USA; Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Deborah Donnell
  - Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
31
Coltart CEM, Hoppe A, Parker M, Dawson L, Amon JJ, Simwinga M, Geller G, Henderson G, Laeyendecker O, Tucker JD, Eba P, Novitsky V, Vandamme AM, Seeley J, Dallabetta G, Harling G, Grabowski MK, Godfrey-Faussett P, Fraser C, Cohen MS, Pillay D. Ethical considerations in global HIV phylogenetic research. Lancet HIV 2018; 5:e656-e666. [PMID: 30174214 PMCID: PMC7327184 DOI: 10.1016/s2352-3018(18)30134-6] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 01/29/2018] [Revised: 05/28/2018] [Accepted: 06/06/2018] [Indexed: 01/01/2023]
Abstract
Phylogenetic analysis of pathogens is an increasingly powerful way to reduce the spread of epidemics, including HIV. As a result, phylogenetic approaches are becoming embedded in public health and research programmes, as well as outbreak responses, presenting unique ethical, legal, and social issues that are not adequately addressed by existing bioethics literature. We formed a multidisciplinary working group to explore the ethical issues arising from the design of, conduct in, and use of results from HIV phylogenetic studies, and to propose recommendations to minimise the associated risks to both individuals and groups. We identified eight key ethical domains, within which we highlighted factors that make HIV phylogenetic research unique. In this Review, we endeavoured to provide a framework to assist researchers, public health practitioners, and funding institutions to ensure that HIV phylogenetic studies are designed, done, and disseminated in an ethical manner. Our conclusions also have broader relevance for pathogen phylogenetics.
Affiliation(s)
- Anne Hoppe
  - Division of Infection and Immunity, University College London, London, UK.
- Michael Parker
  - The Wellcome Centre for Ethics and Humanities (Ethox), Nuffield Department of Population Health, University of Oxford, Oxford, UK
- Liza Dawson
  - Division of AIDS, National Institutes of Health, Bethesda, MD, USA
- Joseph J Amon
  - Woodrow Wilson School of Public and International Affairs, Princeton University, Princeton, NJ, USA
- Gail Geller
  - Berman Institute of Bioethics and School of Medicine, Johns Hopkins University, Baltimore, MD, USA
- Gail Henderson
  - Center for Genomics and Society, Department of Social Medicine, University of North Carolina, Chapel Hill, NC, USA
- Oliver Laeyendecker
  - National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, USA
- Joseph D Tucker
  - Institute for Global Health and Infectious Diseases, University of North Carolina, Chapel Hill, NC, USA
- Patrick Eba
  - Community Support, Social Justice and Inclusion Department, Geneva, Switzerland; School of Law, University of KwaZulu-Natal, Pietermaritzburg, South Africa
- Vladimir Novitsky
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Anne-Mieke Vandamme
  - Clinical and Epidemiological Virology, Rega Institute for Medical Research, Department of Microbiology and Immunology, KU Leuven-University of Leuven, Leuven, Belgium; Center for Global Health and Tropical Medicine, Unidade de Microbiologia, Instituto de Higiene e Medicina Tropical, Universidade Nova de Lisboa, Lisbon, Portugal
- Janet Seeley
  - Africa Health Research Institute, KwaZulu-Natal, South Africa; Department of Global Health and Development, London School of Hygiene & Tropical Medicine, London, UK
- Guy Harling
  - Institute for Global Health, University College London, London, UK; Africa Health Research Institute, KwaZulu-Natal, South Africa
- M Kate Grabowski
  - Department of Pathology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA; Department of Rakai Community Cohort Study, Rakai Health Sciences Program, Kalisizo, Uganda
- Peter Godfrey-Faussett
  - Joint United Nations Programme on HIV/AIDS, Geneva, Switzerland; Department of Clinical Research, London School of Hygiene & Tropical Medicine, London, UK
- Christophe Fraser
  - Big Data Institute, Nuffield Department of Medicine, University of Oxford, Oxford, UK
- Myron S Cohen
  - Institute for Global Health and Infectious Diseases, University of North Carolina, Chapel Hill, NC, USA; Department of Medicine, University of North Carolina, Chapel Hill, NC, USA
- Deenan Pillay
  - Division of Infection and Immunity, University College London, London, UK; Africa Health Research Institute, KwaZulu-Natal, South Africa
| |
|
32
|
Volz EM, Siveroni I. Bayesian phylodynamic inference with complex models. PLoS Comput Biol 2018; 14:e1006546. [PMID: 30422979 PMCID: PMC6258546 DOI: 10.1371/journal.pcbi.1006546] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Received: 02/23/2018] [Revised: 11/27/2018] [Accepted: 10/05/2018] [Indexed: 12/20/2022]
Abstract
Population genetic modeling can enhance Bayesian phylogenetic inference by providing a realistic prior on the distribution of branch lengths and times of common ancestry. The parameters of a population genetic model may also have intrinsic importance, and simultaneous estimation of a phylogeny and model parameters has enabled phylodynamic inference of population growth rates, reproduction numbers, and effective population size through time. Phylodynamic inference based on pathogen genetic sequence data has emerged as a useful supplement to epidemic surveillance; however, commonly used mechanistic models that are typically fitted to non-genetic surveillance data are rarely fitted to pathogen genetic data due to a dearth of software tools, and the theory required to conduct such inference has been developed only recently. We present a framework for coalescent-based phylogenetic and phylodynamic inference which enables highly flexible modeling of demographic and epidemiological processes. This approach builds upon previous structured coalescent approaches and includes enhancements for computational speed, accuracy, and stability. A flexible markup language is described for translating parametric demographic or epidemiological models into a structured coalescent model, enabling simultaneous estimation of demographic or epidemiological parameters and time-scaled phylogenies. We demonstrate the utility of these approaches by fitting compartmental epidemiological models to Ebola virus and Influenza A virus sequence data, demonstrating how important features of these epidemics, such as the reproduction number and epidemic curves, can be gleaned from genetic data. These approaches are provided as an open-source package, PhyDyn, for the BEAST2 phylogenetics platform.
Affiliation(s)
- Erik M. Volz
  - Department of Infectious Disease Epidemiology and the MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
- Igor Siveroni
  - Department of Infectious Disease Epidemiology and the MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, United Kingdom
|
33
|
Dennis AM, Volz E, Frost AMSD, Hossain M, Poon AF, Rebeiro PF, Vermund SH, Sterling TR, Kalish ML. HIV-1 Transmission Clustering and Phylodynamics Highlight the Important Role of Young Men Who Have Sex with Men. AIDS Res Hum Retroviruses 2018; 34:879-888. [PMID: 30027754 PMCID: PMC6204570 DOI: 10.1089/aid.2018.0039] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 02/05/2023]
Abstract
More persons living with HIV reside in the Southern United States than in any other region, yet little is known about HIV molecular epidemiology in the South. We used cluster and phylodynamic analyses to evaluate HIV transmission patterns in middle Tennessee. We performed cross-sectional analyses of HIV-1 pol sequences and clinical data collected from 2001 to 2015 among persons attending the Vanderbilt Comprehensive Care Clinic. Transmission clusters were identified using maximum likelihood phylogenetics and patristic distance differences. Demographic, risk behavior, and clinical factors were assessed evaluating “active” clusters (clusters including sequences sampled 2011–2015) and associations estimated with logistic regression. Transmission risk ratios for men who have sex with men (MSM) were estimated with phylodynamic models. Among 2915 persons (96% subtype-B sequences), 963 (33%) were members of 292 clusters (distance ≤1.5%, size range 2–39). Most clusters (62%, n = 690 persons) were active, either being newly identified (n = 80) or showing expansion on existing clusters (n = 101). Correlates of active clustering among persons with sequences collected during 2011–2015 included MSM risk and ≤30 years of age. Active clusters were significantly more concentrated in MSM and younger persons than historical clusters. Young MSM (YMSM) (≤26.4 years) had high estimated transmission risk [risk ratio = 4.04 (2.85–5.65) relative to older MSM] and were much more likely to transmit to YMSM. In this Tennessee cohort, transmission clusters over time were more concentrated by MSM and younger age, with high transmission risk among and between YMSM, highlighting the importance of interventions among this group. Detecting active clusters could help direct interventions to disrupt ongoing transmission chains.
Affiliation(s)
- Ann M. Dennis
  - Division of Infectious Diseases, University of North Carolina, Chapel Hill, North Carolina
- Erik Volz
  - Department of Infectious Disease Epidemiology and Centre for Outbreak Analysis and Modeling, Imperial College, London, United Kingdom
- Mukarram Hossain
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Art F.Y. Poon
  - Department of Pathology and Laboratory Medicine, Western University, London, Canada
- Peter F. Rebeiro
  - Division of Infectious Diseases, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Sten H. Vermund
  - Department of Epidemiology of Microbial Diseases, Yale University School of Public Health, New Haven, Connecticut
- Timothy R. Sterling
  - Division of Infectious Diseases, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
- Marcia L. Kalish
  - Division of Infectious Diseases, Department of Medicine, Vanderbilt University School of Medicine, Nashville, Tennessee
|
34
|
Kusejko K, Kadelka C, Marzel A, Battegay M, Bernasconi E, Calmy A, Cavassini M, Hoffmann M, Böni J, Yerly S, Klimkait T, Perreau M, Rauch A, Günthard HF, Kouyos RD. Inferring the age difference in HIV transmission pairs by applying phylogenetic methods on the HIV transmission network of the Swiss HIV Cohort Study. Virus Evol 2018; 4:vey024. [PMID: 30250751 PMCID: PMC6143731 DOI: 10.1093/ve/vey024] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Indexed: 11/19/2022]
Abstract
Age-mixing patterns are of key importance for understanding the dynamics of human immunodeficiency virus (HIV)-epidemics and target public health interventions. We use the densely sampled Swiss HIV Cohort Study (SHCS) resistance database to study the age difference at infection in HIV transmission pairs using phylogenetic methods. In addition, we investigate whether the mean age difference of pairs in the phylogenetic tree is influenced by sampling as well as by additional distance thresholds for including pairs. HIV-1 pol-sequences of 11,922 SHCS patients and approximately 240,000 Los Alamos background sequences were used to build a phylogenetic tree. Using this tree, 100 per cent down to 1 per cent of the tips were sampled repeatedly to generate pruned trees (N = 500 for each sample proportion), of which pairs of SHCS patients were extracted. The mean of the absolute age differences of the pairs, measured as the absolute difference of the birth years, was analyzed with respect to this sample proportion and a distance criterion for inclusion of the pairs. In addition, the transmission groups men having sex with men (MSM), intravenous drug users (IDU), and heterosexuals (HET) were analyzed separately. Considering the tree with all 11,922 SHCS patients, 2,991 pairs could be extracted, with 954 (31.9 per cent) MSM-pairs, 635 (21.2 per cent) HET-pairs, 414 (13.8 per cent) IDU-pairs, and 352 (11.8 per cent) HET/IDU-pairs. For all transmission groups, the age difference at infection was significantly (P < 0.001) smaller for pairs in the tree compared with randomly assigned pairs, meaning that patients of similar age are more likely to be pairs. The mean age difference in the phylogenetic analysis, using a fixed distance of 0.05, was 9.2, 9.0, 7.3 and 5.6 years for MSM-, HET-, HET/IDU-, and IDU-pairs, respectively. Decreasing the cophenetic distance threshold from 0.05 to 0.01 significantly decreased the mean age difference. Similarly, repeated sampling of 100 per cent down to 1 per cent of the tips revealed an increased age difference at lower sample proportions. HIV-transmission is age-assortative, but the age difference of transmission pairs detected by phylogenetic analyses depends on both sampling proportion and distance criterion. The mean age difference decreases when using more conservative distance thresholds, implying an underestimation of age-assortativity when using liberal distance criteria. Similarly, overestimation of the mean age difference occurs for pairs from sparsely sampled trees, as it is often the case in sub-Saharan Africa.
Affiliation(s)
- Katharina Kusejko
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Rämistrasse 100, Zürich, Switzerland; Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Claus Kadelka
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Rämistrasse 100, Zürich, Switzerland; Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Alex Marzel
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Rämistrasse 100, Zürich, Switzerland; Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Manuel Battegay
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Basel, Petersgraben 4, CH-4031 Basel; University of Basel, Petersplatz 1, Basel, Switzerland
- Enos Bernasconi
  - Division of Infectious Diseases, Regional Hospital Lugano, Via Tesserete 46, Lugano, Switzerland
- Alexandra Calmy
  - Laboratory of Virology and Division of Infectious Diseases, Genève University Hospital, Rue Gabrielle-Perret-Gentil 4, CH-1205 Genève; University of Genève, 24 rue du Général-Dufour, Genève, Switzerland
- Matthias Cavassini
  - Division of Infectious Diseases, Lausanne University Hospital, Rue du Bugnon 46, Lausanne, Switzerland
- Matthias Hoffmann
  - Division of Infectious Diseases, Cantonal Hospital St Gallen, Rorschacher Strasse 95, St. Gallen, Switzerland
- Jürg Böni
  - Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Sabine Yerly
  - Laboratory of Virology and Division of Infectious Diseases, Genève University Hospital, Rue Gabrielle-Perret-Gentil 4, CH-1205 Genève; University of Genève, 24 rue du Général-Dufour, Genève, Switzerland
- Thomas Klimkait
  - Molecular Virology, Department of Biomedicine, University of Basel, Petersplatz 10, Basel, Switzerland
- Matthieu Perreau
  - Division of Infectious Diseases, Lausanne University Hospital, Rue du Bugnon 46, Lausanne, Switzerland
- Andri Rauch
  - Clinic for Infectious Diseases, Bern University Hospital, Freiburgstrasse 18, Bern; University of Bern, Hochschulstrasse 6, CH-3012 Bern, Switzerland
- Huldrych F Günthard
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Rämistrasse 100, Zürich, Switzerland; Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
- Roger D Kouyos
  - Division of Infectious Diseases and Hospital Epidemiology, University Hospital Zürich, Rämistrasse 100, Zürich, Switzerland; Institute of Medical Virology, University of Zürich, Winterthurerstrasse 190, Zürich, Switzerland
|
35
|
HIV-1 diversity among young women in rural South Africa: HPTN 068. PLoS One 2018; 13:e0198999. [PMID: 29975689 PMCID: PMC6033411 DOI: 10.1371/journal.pone.0198999] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 12/14/2017] [Accepted: 04/21/2018] [Indexed: 12/15/2022]
Abstract
Background South Africa has one of the highest rates of HIV-1 (HIV) infection world-wide, with the highest rates among young women. We analyzed the molecular epidemiology and evolutionary history of HIV in young women attending high school in rural South Africa. Methods Samples were obtained from the HPTN 068 randomized controlled trial, which evaluated the effect of cash transfers for school attendance on HIV incidence in women aged 13–20 years (Mpumalanga province, 2011–2015). Plasma samples from HIV-infected participants were analyzed using the ViroSeq HIV-1 Genotyping assay. Phylogenetic analysis was performed using 200 pol gene study sequences and 2,294 subtype C reference sequences from South Africa. Transmission clusters were identified using Cluster Picker and HIV-TRACE, and were characterized using demographic and other epidemiological data. Phylodynamic analyses were performed using the BEAST software. Results The study enrolled 2,533 young women who were followed through their expected high school graduation date (main study); some participants had a post-study assessment (follow-up study). Two-hundred-twelve of 2,533 enrolled young women had HIV infection. HIV pol sequences were obtained for 94% (n = 201/212) of the HIV-infected participants. All but one of the sequences were HIV-1 subtype C; the non-C subtype sequence was excluded from further analysis. Median pairwise genetic distance between the subtype C sequences was 6.4% (IQR: 5.6–7.2). Overall, 26% of study sequences fell into 21 phylogenetic clusters with 2–6 women per cluster. Thirteen (62%) clusters included women who were HIV-infected at enrollment. Clustering was not associated with study arm, demographic or other epidemiological factors. 
The estimated date of origin of HIV subtype C in the study population was 1958 (95% highest posterior density [HPD]: 1931–1980), and the median estimated substitution rate among study pol sequences was 1.98 × 10⁻³ (95% HPD: 1.15 × 10⁻³–2.81 × 10⁻³) per site per year. Conclusions Phylogenetic analysis suggests that multiple HIV subtype C sublineages circulate among school age girls in South Africa. There were no substantive differences in the molecular epidemiology of HIV between control and intervention arms in the HPTN 068 trial.
|
36
|
Rasmussen DA, Wilkinson E, Vandormael A, Tanser F, Pillay D, Stadler T, de Oliveira T. Tracking external introductions of HIV using phylodynamics reveals a major source of infections in rural KwaZulu-Natal, South Africa. Virus Evol 2018; 4:vey037. [PMID: 30555720 PMCID: PMC6290119 DOI: 10.1093/ve/vey037] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
Abstract
Despite increasing access to antiretrovirals, HIV incidence in rural KwaZulu-Natal remains among the highest ever reported in Africa. While many epidemiological factors have been invoked to explain such high incidence, widespread human mobility and viral movement suggest that transmission between communities may be a major source of new infections. High cross-community transmission rates call into question how effective increasing the coverage of antiretroviral therapy locally will be at preventing new infections, especially if many new cases arise from external introductions. To help address this question, we use a phylodynamic model to reconstruct epidemic dynamics and estimate the relative contribution of local transmission versus external introductions to overall incidence in KwaZulu-Natal from HIV-1 phylogenies. By comparing our results with population-based surveillance data, we show that we can reliably estimate incidence from viral phylogenies once viral movement in and out of the local population is accounted for. Our analysis reveals that early epidemic dynamics were largely driven by external introductions. More recently, we estimate that 35 per cent (95% confidence interval: 20-60%) of new infections arise from external introductions. These results highlight the growing need to consider larger-scale regional transmission dynamics when designing and testing prevention strategies.
Affiliation(s)
- David A Rasmussen
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, NC, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Eduan Wilkinson
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Alain Vandormael
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
  - School of Nursing and Public Health, University of KwaZulu-Natal, Durban, South Africa
- Frank Tanser
  - School of Nursing and Public Health, University of KwaZulu-Natal, Durban, South Africa
  - Africa Health Research Institute, Durban, South Africa
  - Research Department of Infection & Population Health, University College London, UK
- Deenan Pillay
  - Africa Health Research Institute, Durban, South Africa
  - Division of Infection and Immunity, University College London, UK
- Tanja Stadler
  - Department of Biosystems Science and Engineering, ETH Zürich, Basel, Switzerland
  - Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Tulio de Oliveira
  - KwaZulu-Natal Research Innovation and Sequencing Platform (KRISP), College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
  - Centre for the AIDS Programme of Research in South Africa (CAPRISA), Durban, South Africa
  - Department of Global Health, University of Washington, Seattle, USA
|
37
|
Brenner BG, Ibanescu RI, Hardy I, Roger M. Genotypic and Phylogenetic Insights on Prevention of the Spread of HIV-1 and Drug Resistance in "Real-World" Settings. Viruses 2017; 10:v10010010. [PMID: 29283390 PMCID: PMC5795423 DOI: 10.3390/v10010010] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 12/07/2017] [Revised: 12/22/2017] [Accepted: 12/24/2017] [Indexed: 12/15/2022]
Abstract
HIV continues to spread among vulnerable heterosexual (HET), Men-having-Sex with Men (MSM) and intravenous drug user (IDU) populations, influenced by a complex array of biological, behavioral and societal factors. Phylogenetic analyses of large sequence datasets from national drug resistance testing programs reveal the evolutionary interrelationships of viral strains implicated in the dynamic spread of HIV in different regional settings. Viral phylogenetics can be combined with demographic and behavioral information to gain insights on epidemiological processes shaping transmission networks at the population level. Drug resistance testing programs also reveal emergent mutational pathways leading to resistance to the 23 antiretroviral drugs used in HIV-1 management in low-, middle- and high-income settings. This article describes how genotypic and phylogenetic information from Quebec and elsewhere provides critical information on HIV transmission and resistance. Cumulative findings can be used to optimize public health strategies to tackle the challenges of HIV in “real-world” settings.
Affiliation(s)
- Bluma G Brenner
  - McGill University AIDS Centre, Lady Davis Institute for Medical Research, Montreal, QC H3T 1E2, Canada.
- Ruxandra-Ilinca Ibanescu
  - McGill University AIDS Centre, Lady Davis Institute for Medical Research, Montreal, QC H3T 1E2, Canada.
- Isabelle Hardy
  - Département de Microbiologie et d'Immunologie et Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC H2X 0A9, Canada.
- Michel Roger
  - Département de Microbiologie et d'Immunologie et Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CHUM), Montreal, QC H2X 0A9, Canada.
|
38
|
Ratmann O, Wymant C, Colijn C, Danaviah S, Essex M, Frost S, Gall A, Gaseitsiwe S, Grabowski MK, Gray R, Guindon S, von Haeseler A, Kaleebu P, Kendall M, Kozlov A, Manasa J, Minh BQ, Moyo S, Novitsky V, Nsubuga R, Pillay S, Quinn TC, Serwadda D, Ssemwanga D, Stamatakis A, Trifinopoulos J, Wawer M, Brown AL, de Oliveira T, Kellam P, Pillay D, Fraser C, on behalf of the PANGEA-HIV Consort. HIV-1 full-genome phylogenetics of generalized epidemics in sub-Saharan Africa: impact of missing nucleotide characters in next-generation sequences. AIDS Res Hum Retroviruses 2017; 33:1083-1098. [PMID: 28540766 PMCID: PMC5597042 DOI: 10.1089/aid.2017.0061] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 12/16/2022]
Abstract
To characterize HIV-1 transmission dynamics in regions where the burden of HIV-1 is greatest, the “Phylogenetics and Networks for Generalised HIV Epidemics in Africa” consortium (PANGEA-HIV) is sequencing full-genome viral isolates from across sub-Saharan Africa. We report the first 3,985 PANGEA-HIV consensus sequences from four cohort sites (Rakai Community Cohort Study, n = 2,833; MRC/UVRI Uganda, n = 701; Mochudi Prevention Project, n = 359; Africa Health Research Institute Resistance Cohort, n = 92). Next-generation sequencing success rates varied: more than 80% of the viral genome from the gag to the nef genes could be determined for all sequences from South Africa, 75% of sequences from Mochudi, 60% of sequences from MRC/UVRI Uganda, and 22% of sequences from Rakai. Partial sequencing failure was primarily associated with low viral load, increased for amplicons closer to the 3′ end of the genome, was not associated with subtype diversity except HIV-1 subtype D, and remained significantly associated with sampling location after controlling for other factors. We assessed the impact of the missing data patterns in PANGEA-HIV sequences on phylogeny reconstruction in simulations. We found a threshold in terms of taxon sampling below which the patchy distribution of missing characters in next-generation sequences (NGS) has an excess negative impact on the accuracy of HIV-1 phylogeny reconstruction, which is attributable to tree reconstruction artifacts that accumulate when branches in viral trees are long. The large number of PANGEA-HIV sequences provides unprecedented opportunities for evaluating HIV-1 transmission dynamics across sub-Saharan Africa and identifying prevention opportunities. Molecular epidemiological analyses of these data must proceed cautiously because sequence sampling remains below the identified threshold and a considerable negative impact of missing characters on phylogeny reconstruction is expected.
Affiliation(s)
- Oliver Ratmann
  - MRC Centre for Outbreak Analyses and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom
- Chris Wymant
  - Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
- Caroline Colijn
  - Department of Mathematics, Imperial College London, London, United Kingdom
- Siva Danaviah
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Max Essex
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana
- Simon Frost
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Astrid Gall
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom
- Mary K. Grabowski
  - Department of Epidemiology Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
  - Rakai Health Sciences Program, Entebbe, Uganda
- Ronald Gray
  - Department of Epidemiology Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
  - Rakai Health Sciences Program, Entebbe, Uganda
- Stephane Guindon
  - Department of Statistics, University of Auckland, Auckland, New Zealand
  - Laboratoire d'Informatique, de Robotique et de Microelectronique de Montpellier–UMR 5506, CNRS & UM, Montpellier, France
- Arndt von Haeseler
  - Centre for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Vienna, Austria
  - Bioinformatics and Computational Biology, Faculty of Computer Science, University of Vienna, Vienna, Austria
- Michelle Kendall
  - Department of Mathematics, Imperial College London, London, United Kingdom
- Alexey Kozlov
  - Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
- Justen Manasa
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
- Bui Quang Minh
  - Centre for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Vienna, Austria
- Sikhulile Moyo
  - Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana
- Vlad Novitsky
  - Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, Massachusetts
  - Botswana Harvard AIDS Institute Partnership, Gaborone, Botswana
- Thomas C. Quinn
  - Rakai Health Sciences Program, Entebbe, Uganda
  - Division of Intramural Research, National Institute of Allergy and Infectious Diseases, NIH, Bethesda, Maryland
  - Department of Medicine Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
- David Serwadda
  - Rakai Health Sciences Program, Entebbe, Uganda
  - Makerere University School of Public Health, Makerere University College of Health Sciences, Kampala, Uganda
- Alexandros Stamatakis
  - Heidelberg Institute for Theoretical Studies, Heidelberg, Germany
  - Institute for Theoretical Informatics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Jana Trifinopoulos
  - Centre for Integrative Bioinformatics Vienna, Max F. Perutz Laboratories, University of Vienna, Medical University of Vienna, Vienna, Austria
- Maria Wawer
  - Department of Epidemiology Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland
  - Rakai Health Sciences Program, Entebbe, Uganda
- Andy Leigh Brown
  - School of Biological Sciences, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, United Kingdom
- Tulio de Oliveira
  - Nelson R. Mandela School of Medicine, School of Laboratory Medicine and Medical Sciences, College of Health Sciences, University of KwaZulu-Natal, Durban, South Africa
- Paul Kellam
  - Department of Infectious Diseases and Immunity, Imperial College London, United Kingdom
- Deenan Pillay
  - Africa Health Research Institute, KwaZulu-Natal, South Africa
  - Division of Infection & Immunity, Faculty of Medical Sciences, University College London, London, United Kingdom
- Christophe Fraser
  - Oxford Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
|
39
|
Dearlove BL, Xiang F, Frost SDW. Biased phylodynamic inferences from analysing clusters of viral sequences. Virus Evol 2017; 3:vex020. [PMID: 28852573 PMCID: PMC5570026 DOI: 10.1093/ve/vex020] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 01/04/2023]
Abstract
Phylogenetic methods are being increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, which appear to follow a ‘power law’ behaviour, with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic, and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population, and is not being driven by non-epidemiological effects. We qualify the effect of using a falsely identified ‘transmission cluster’ of sequences to estimate phylodynamic parameters including the effective population size and exponential growth rate under several demographic scenarios. Our simulation studies show that taking the maximum size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources.
Affiliation(s)
- Bethany L Dearlove
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, UK
- Fei Xiang
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, UK
- Simon D W Frost
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, UK
|
40
|
Inferring epidemiological parameters from phylogenies using regression-ABC: A comparative study. PLoS Comput Biol 2017; 13:e1005416. [PMID: 28263987 PMCID: PMC5358897 DOI: 10.1371/journal.pcbi.1005416] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 04/27/2016] [Revised: 03/20/2017] [Accepted: 02/16/2017] [Indexed: 02/06/2023]
Abstract
Inferring epidemiological parameters such as the R0 from time-scaled phylogenies is a timely challenge. Most current approaches rely on likelihood functions, which raise specific issues that range from computing these functions to finding their maxima numerically. Here, we present a new regression-based Approximate Bayesian Computation (ABC) approach, which we base on a large variety of summary statistics intended to capture the information contained in the phylogeny and its corresponding lineage-through-time plot. The regression step involves the Least Absolute Shrinkage and Selection Operator (LASSO) method, which is a robust machine learning technique. It allows us to readily deal with the large number of summary statistics, while avoiding resorting to Markov Chain Monte Carlo (MCMC) techniques. To compare our approach to existing ones, we simulated target trees under a variety of epidemiological models and settings, and inferred parameters of interest using the same priors. We found that, for large phylogenies, the accuracy of our regression-ABC is comparable to that of likelihood-based approaches involving birth-death processes implemented in BEAST2. Our approach even outperformed these when inferring the host population size with a Susceptible-Infected-Removed epidemiological model. It also clearly outperformed a recent kernel-ABC approach when assuming a Susceptible-Infected epidemiological model with two host types. Lastly, by re-analyzing data from the early stages of the recent Ebola epidemic in Sierra Leone, we showed that regression-ABC provides more realistic estimates for the duration parameters (latency and infectiousness) than the likelihood-based method. Overall, ABC based on a large variety of summary statistics and a regression method able to perform variable selection and avoid overfitting is a promising approach to analyze large phylogenies. 
Given the rapid evolution of many pathogens, analysing their genomes by means of phylogenies can inform us about how they spread. This is the focus of the field known as “phylodynamics”. Most existing methods inferring epidemiological parameters from virus phylogenies are limited by the difficulty of handling complex likelihood functions, which commonly incorporate latent variables. Here, we use an alternative method known as regression-based Approximate Bayesian Computation (ABC), which circumvents this problem by using simulations and dataset comparisons. Since phylogenies are difficult to compare to one another, we introduce many summary statistics to describe them and take advantage of current machine learning techniques able to perform variable selection. We show that the accuracy we reach is comparable to that of existing methods. This accuracy increases with phylogeny size and can even be higher than that of existing methods for some parameters. Overall, regression-based ABC opens new perspectives to infer epidemiological parameters from large phylogenies.
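The regression-ABC idea summarized above (simulate under the prior, reduce each simulation to summary statistics, then let a LASSO regression map statistics to parameters) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy Gaussian simulator stands in for phylogeny simulation, the five summary statistics are arbitrary, and the hand-written coordinate-descent LASSO is not the implementation used in the study.

```python
import random
import statistics


def simulate(theta, n, rng):
    """Toy simulator (assumption): n Gaussian draws centred on theta."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def summaries(data, rng):
    """Summary statistics; the last one is pure noise on purpose."""
    return [statistics.fmean(data), statistics.median(data),
            max(data), min(data), rng.random()]


def soft_threshold(z, lam):
    """Soft-thresholding operator used by the LASSO update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_fit(X, y, lam, n_iter=200):
    """Coordinate-descent LASSO on standardized columns and centred y."""
    n, p = len(X), len(X[0])
    means = [statistics.fmean(col) for col in zip(*X)]
    sds = [statistics.pstdev(col) or 1.0 for col in zip(*X)]
    Z = [[(row[j] - means[j]) / sds[j] for j in range(p)] for row in X]
    y_mean = statistics.fmean(y)
    resid = [yi - y_mean for yi in y]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the partial residual (column j removed).
            rho = sum(Z[i][j] * (resid[i] + Z[i][j] * beta[j])
                      for i in range(n)) / n
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Z[i][j] * delta
                beta[j] = new
    return beta, means, sds, y_mean


def regression_abc(observed_stats, n_sims=500, n_obs=30, lam=0.05, seed=1):
    """Simulate under a Uniform(1, 5) prior, fit LASSO, predict theta."""
    rng = random.Random(seed)
    thetas, stats = [], []
    for _ in range(n_sims):
        theta = rng.uniform(1.0, 5.0)
        thetas.append(theta)
        stats.append(summaries(simulate(theta, n_obs, rng), rng))
    beta, means, sds, y_mean = lasso_fit(stats, thetas, lam)
    estimate = y_mean + sum(
        b * (s - m) / sd
        for b, s, m, sd in zip(beta, observed_stats, means, sds))
    return estimate, beta


if __name__ == "__main__":
    rng = random.Random(42)
    observed = summaries(simulate(4.0, 30, rng), rng)
    estimate, beta = regression_abc(observed)
    print("estimate:", estimate)
    print("coefficients:", beta)
```

In this sketch the L1 penalty is what lets the regression cope with many summary statistics: uninformative ones, such as the deliberately added noise statistic, tend to receive coefficients at or near zero.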
|
41
|
Yebra G, Hodcroft EB, Ragonnet-Cronin ML, Pillay D, Brown AJL. Using nearly full-genome HIV sequence data improves phylogeny reconstruction in a simulated epidemic. Sci Rep 2016; 6:39489. [PMID: 28008945 PMCID: PMC5180198 DOI: 10.1038/srep39489] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 06/14/2016] [Accepted: 11/21/2016] [Indexed: 01/09/2023]
Abstract
HIV molecular epidemiology studies analyse viral pol gene sequences due to their availability, but whole genome sequencing allows other genes to be used. We aimed to determine what gene(s) provide(s) the best approximation to the real phylogeny by analysing a simulated epidemic (created as part of the PANGEA_HIV project) with a known transmission tree. We sub-sampled a simulated dataset of 4662 sequences into different combinations of genes (gag-pol-env, gag-pol, gag, pol, env and partial pol) and sampling depths (100%, 60%, 20% and 5%), generating 100 replicates for each case. We built maximum-likelihood trees for each combination using RAxML (GTR + Γ), and compared their topologies to the corresponding true tree’s using CompareTree. The accuracy of the trees was significantly proportional to the length of the sequences used, with the gag-pol-env datasets showing the best performance and gag and partial pol sequences showing the worst. The lowest sampling depths (20% and 5%) greatly reduced the accuracy of tree reconstruction and showed high variability among replicates, especially when using the shortest gene datasets. In conclusion, using longer sequences derived from nearly whole genomes will improve the reliability of phylogenetic reconstruction. With low sample coverage, results can be highly variable, particularly when based on short sequences.
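The sub-sampling design and the topology comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: `subsample_ids` and `shared_clade_fraction` are hypothetical helper names, trees are represented as nested Python tuples, and the comparison uses rooted clades as a simple stand-in for the measure reported by CompareTree.

```python
import random


def subsample_ids(ids, depth, n_replicates, seed=0):
    """Draw n_replicates random subsets, each holding a fraction `depth` of ids."""
    rng = random.Random(seed)
    k = max(1, round(len(ids) * depth))
    return [sorted(rng.sample(ids, k)) for _ in range(n_replicates)]


def clades(tree):
    """Return the leaf set under every internal node of a nested-tuple tree."""
    found = set()

    def walk(node):
        if not isinstance(node, tuple):
            return frozenset([node])
        leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def shared_clade_fraction(true_tree, inferred_tree):
    """Fraction of non-root clades of the true tree found in the inferred tree."""
    true_clades = clades(true_tree)
    root = max(true_clades, key=len)  # the root clade holds every leaf
    true_clades.discard(root)
    if not true_clades:
        return 1.0  # nothing to recover in a star tree
    return len(true_clades & clades(inferred_tree)) / len(true_clades)


if __name__ == "__main__":
    ids = list(range(4662))  # dataset size taken from the abstract
    for depth in (1.0, 0.6, 0.2, 0.05):
        replicates = subsample_ids(ids, depth, n_replicates=100)
        print(depth, len(replicates), len(replicates[0]))

    true_tree = (("A", "B"), ("C", "D"))
    print(shared_clade_fraction(true_tree, ((("A", "B"), "C"), "D")))
```

A score near 1 means the inferred tree recovers most groupings of the true tree; averaging the score over the replicates at each sampling depth gives the kind of accuracy summary the abstract refers to.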
Affiliation(s)
- Gonzalo Yebra, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Emma B Hodcroft, Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK
- Deenan Pillay, Wellcome Trust-Africa Centre for Health and Population Studies, University of KwaZulu-Natal, Durban, South Africa
|