1
|
Walter KS, Cohen T, Mathema B, Colijn C, Sobkowiak B, Comas I, Goig GA, Croda J, Andrews JR. Signatures of transmission in within-host M. tuberculosis variation. medRxiv 2023:2023.12.28.23300451. [PMID: 38234741 PMCID: PMC10793532 DOI: 10.1101/2023.12.28.23300451] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Background: Because M. tuberculosis evolves slowly, transmission clusters often contain multiple individuals with identical consensus genomes, making it difficult to reconstruct transmission chains. Finding additional sources of shared M. tuberculosis variation could help overcome this problem. Previous studies have reported M. tuberculosis diversity within infected individuals; however, whether within-host variation improves transmission inferences remains unclear.
Methods: To evaluate the transmission information present in within-host M. tuberculosis variation, we re-analyzed publicly available sequence data from three household transmission studies, using household membership as a proxy for transmission linkage between donor-recipient pairs.
Findings: We found moderate levels of minority variation in M. tuberculosis sequence data from cultured isolates, and these levels varied significantly across studies (mean: 6, 7, and 170 minority variants above a 1% minor allele frequency threshold, outside of PE/PPE genes). Isolates from household members shared more minority variants than did isolates from unlinked individuals in the three studies (mean 98 shared minority variants vs. 10, 0.8 vs. 0.2, and 0.7 vs. 0.2, respectively). Shared within-host variation was significantly associated with household membership (OR: 1.51 [1.30, 1.71] for a one standard deviation increase in shared minority variants). Models that included shared within-host variation predicted household membership more accurately in all three studies than models without within-host variation (AUC: 0.95 versus 0.92, 0.99 versus 0.95, and 0.93 versus 0.91).
Interpretation: Within-host M. tuberculosis variation persists through culture and could enhance the resolution of transmission inferences. The substantial differences in minority variation recovered across studies highlight the need to optimize approaches to recover and incorporate within-host variation into automated phylogenetic and transmission inference.
Funding: NIAID: 5K01AI173385.
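The shared-minority-variant count that the abstract reports can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the data layout (a dictionary from a hypothetical `(position, allele)` key to an allele frequency) and the function names are assumptions made for illustration, and only the 1% minor allele frequency threshold is taken from the abstract.

```python
from typing import Dict, Set, Tuple

# Hypothetical layout: each variant is keyed by (genome position, alternate allele)
# and mapped to the frequency of that allele within one sample.
Variant = Tuple[int, str]


def minority_variants(freqs: Dict[Variant, float], min_maf: float = 0.01) -> Set[Variant]:
    """Return variants whose minor allele frequency is at least min_maf.

    The minor allele frequency is min(f, 1 - f), so near-fixed variants
    (f close to 0 or 1) are excluded.
    """
    return {v for v, f in freqs.items() if min(f, 1.0 - f) >= min_maf}


def shared_minority_variants(
    sample_a: Dict[Variant, float],
    sample_b: Dict[Variant, float],
    min_maf: float = 0.01,
) -> int:
    """Count minority variants present in both samples."""
    return len(minority_variants(sample_a, min_maf) & minority_variants(sample_b, min_maf))


# Toy data (invented for illustration, not taken from any study).
isolate_1 = {(100, "A"): 0.05, (200, "T"): 0.999, (300, "G"): 0.30}
isolate_2 = {(100, "A"): 0.02, (300, "G"): 0.004, (400, "C"): 0.50}
n_shared = shared_minority_variants(isolate_1, isolate_2)
```

A count like `n_shared` could then serve as one predictor of household membership, alongside consensus-level genetic distance, in a classification model of the kind the abstract describes.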
Affiliation(s)
| | - Ted Cohen
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
| | - Barun Mathema
- Department of Epidemiology, Columbia University Mailman School of Public Health; New York, United States
| | - Caroline Colijn
- Department of Mathematics, Simon Fraser University; Burnaby, Canada
| | - Benjamin Sobkowiak
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
| | - Iñaki Comas
- Institute of Biomedicine of Valencia (CSIC), Valencia, Spain
| | - Galo A Goig
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
| | - Julio Croda
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
- Federal University of Mato Grosso do Sul - UFMS, Campo Grande, MS, Brazil
- Oswaldo Cruz Foundation Mato Grosso do Sul, Mato Grosso do Sul, Brazil
| | - Jason R Andrews
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA
| |
|
2
|
Pérez-Llanos FJ, Dreyer V, Barilar I, Utpatel C, Kohl TA, Murcia MI, Homolka S, Merker M, Niemann S. Transmission Dynamics of a Mycobacterium tuberculosis Complex Outbreak in an Indigenous Population in the Colombian Amazon Region. Microbiol Spectr 2023; 11:e0501322. [PMID: 37222610 PMCID: PMC10269451 DOI: 10.1128/spectrum.05013-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Accepted: 05/04/2023] [Indexed: 05/25/2023]
Abstract
Whole genome sequencing (WGS) has become the main tool for studying the transmission of Mycobacterium tuberculosis complex (MTBC) strains; however, the clonal expansion of one strain often limits its application in local MTBC outbreaks. The use of an alternative reference genome and the inclusion of repetitive regions in the analysis could potentially increase the resolution, but the added value has not yet been defined. Here, we leveraged short and long WGS read data of a previously reported MTBC outbreak in the Colombian Amazon Region to analyze possible transmission chains among 74 patients in the indigenous setting of Puerto Nariño (March to October 2016). In total, 90.5% (67/74) of the patients were infected with one distinct MTBC strain belonging to lineage 4.3.3. Employing a reference genome from an outbreak strain and highly confident single nucleotide polymorphisms (SNPs) in repetitive genomic regions, e.g., the proline-glutamic acid/proline-proline-glutamic-acid (PE/PPE) gene family, increased the phylogenetic resolution compared to a classical H37Rv reference mapping approach. Specifically, the number of differentiating SNPs increased from 890 to 1,094, which resulted in a more granular transmission network as judged by an increasing number of individual nodes in a maximum parsimony tree, i.e., 5 versus 9 nodes. We also found heterogeneous alleles at phylogenetically informative sites in 29.9% (20/67) of the outbreak isolates, suggesting that these patients were infected with more than one clone. In conclusion, customized SNP calling thresholds and employment of a local reference genome for a mapping approach can improve the phylogenetic resolution in highly clonal MTBC populations and help elucidate within-host MTBC diversity.
IMPORTANCE: The Colombian Amazon around Puerto Nariño has a high tuberculosis burden, with a prevalence of 1,267/100,000 people in 2016. Recently, an outbreak of Mycobacterium tuberculosis complex (MTBC) bacteria among the indigenous populations was identified with classical MTBC genotyping methods. Here, we employed a whole-genome sequencing-based outbreak investigation in order to improve the phylogenetic resolution and gain new insights into the transmission dynamics in this remote Colombian Amazon Region. The inclusion of well-supported single nucleotide polymorphisms in repetitive regions and a de novo-assembled local reference genome provided a more granular picture of the circulating outbreak strain and revealed new transmission chains. Multiple patients from different settlements were possibly infected with at least two different clones in this high-incidence setting. Thus, our results have the potential to improve molecular surveillance studies in other high-burden settings, especially regions with few clonal multidrug-resistant (MDR) MTBC lineages/clades.
Affiliation(s)
| | - Viola Dreyer
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Ivan Barilar
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Christian Utpatel
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Thomas A. Kohl
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| | - Martha Isabel Murcia
- Grupo MICOBAC-UN, Departamento de Microbiología, Facultad de Medicina, Universidad Nacional de Colombia, Bogotá, Colombia
| | - Susanne Homolka
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
| | - Matthias Merker
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
- Evolution of the Resistome, Research Center Borstel, Borstel, Germany
| | - Stefan Niemann
- Molecular and Experimental Mycobacteriology, Research Center Borstel, Borstel, Germany
- German Center for Infection Research, Hamburg-Lübeck-Borstel-Riems, Germany
| |
|
3
|
Li X, Muñoz JF, Gade L, Argimon S, Bougnoux ME, Bowers JR, Chow NA, Cuesta I, Farrer RA, Maufrais C, Monroy-Nieto J, Pradhan D, Uehling J, Vu D, Yeats CA, Aanensen DM, d’Enfert C, Engelthaler DM, Eyre DW, Fisher MC, Hagen F, Meyer W, Singh G, Alastruey-Izquierdo A, Litvintseva AP, Cuomo CA. Comparing genomic variant identification protocols for Candida auris. Microb Genom 2023; 9:mgen000979. [PMID: 37043380 PMCID: PMC10210944 DOI: 10.1099/mgen.0.000979] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/10/2022] [Accepted: 02/09/2023] [Indexed: 04/13/2023]
Abstract
Genomic analyses are widely applied to epidemiological, population genetic and experimental studies of pathogenic fungi. A wide range of methods is employed to carry out these analyses, typically without including controls that gauge the accuracy of variant prediction. The importance of tracking outbreaks at a global scale has raised the urgency of establishing high-accuracy pipelines that generate consistent results between research groups. To evaluate currently employed methods for whole-genome variant detection and elaborate best practices for fungal pathogens, we compared how 14 independent variant calling pipelines performed across 35 Candida auris isolates from 4 distinct clades and evaluated the performance of variant calling, single-nucleotide polymorphism (SNP) counts and phylogenetic inference results. Although these pipelines used different variant callers and filtering criteria, we found high overall agreement of SNPs from each pipeline. This concordance correlated with site quality, as SNPs discovered by a few pipelines tended to show lower mapping quality scores and depth of coverage than those recovered by all pipelines. We observed that the major differences between pipelines were due to variation in read trimming strategies, SNP calling methods and parameters, and downstream filtration criteria. We calculated specificity and sensitivity for each pipeline by aligning three isolates with chromosomal level assemblies and found that the GATK-based pipelines were well balanced between these metrics. Selection of trimming methods had a greater impact on SAMtools-based pipelines than those using GATK. Phylogenetic trees inferred by each pipeline showed high consistency at the clade level, but there was more variability between isolates from a single outbreak, with pipelines that used more stringent cutoffs having lower resolution. This project generated two truth datasets useful for routine benchmarking of C. auris variant calling: a consensus VCF of genotypes discovered by 10 or more pipelines across these 35 diverse isolates, and variants for 2 samples identified from whole-genome alignments. This study provides a foundation for evaluating SNP calling pipelines and developing best practices for future fungal genomic studies.
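The idea of a consensus call set, that is, sites reported by at least a minimum number of pipelines, can be sketched in a few lines of Python. The site representation `(contig, position, alternate allele)`, the pipeline names, and the function name are assumptions made for illustration; the study itself worked from VCF files, and only the "10 or more of 14 pipelines" rule comes from the abstract.

```python
from collections import Counter
from typing import Dict, Set, Tuple

# Hypothetical site key: (contig name, 1-based position, alternate allele).
Site = Tuple[str, int, str]


def consensus_sites(calls_by_pipeline: Dict[str, Set[Site]], min_support: int) -> Set[Site]:
    """Return sites called by at least min_support pipelines.

    Each pipeline contributes at most one vote per site because its calls
    are stored as a set.
    """
    support = Counter(site for calls in calls_by_pipeline.values() for site in calls)
    return {site for site, n in support.items() if n >= min_support}


# Toy example with three invented pipelines and a support threshold of two.
calls = {
    "pipeline_a": {("chr1", 10, "T"), ("chr1", 50, "G")},
    "pipeline_b": {("chr1", 10, "T"), ("chr2", 7, "A")},
    "pipeline_c": {("chr1", 10, "T"), ("chr1", 50, "G"), ("chr2", 9, "C")},
}
consensus = consensus_sites(calls, min_support=2)
```

With 14 pipelines, calling `consensus_sites(calls, min_support=10)` would mirror the threshold described in the abstract.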
Affiliation(s)
- Xiao Li
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - José F. Muñoz
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
| | - Lalitha Gade
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, US Department of Health and Human Services, Atlanta, GA, 30329, USA
| | - Silvia Argimon
- Centre for Genomic Pathogen Surveillance, Big Data Institute, University of Oxford, Oxford, UK
| | - Marie-Elisabeth Bougnoux
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie et Pathogénicité Fongiques, Paris, France
- Université Paris Cité, Hôpital Necker-Enfants-Malades, Unité de Parasitologie-Mycologie, Assistance Publique des Hôpitaux de Paris, Paris, France
| | - Jolene R. Bowers
- Translational Genomics Research Institute, Pathogen and Microbiome Division, Flagstaff, AZ 86005, USA
| | - Nancy A. Chow
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, US Department of Health and Human Services, Atlanta, GA, 30329, USA
| | - Isabel Cuesta
- Mycology Reference Laboratory, National Centre for Microbiology, Instituto de Salud Carlos III, Madrid, Spain
| | - Rhys A. Farrer
- Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Medical Research Council Centre for Medical Mycology, University of Exeter, Exeter, EX4 4PY, UK
| | - Corinne Maufrais
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie et Pathogénicité Fongiques, Paris, France
- Institut Pasteur, Université Paris Cité, CNRS USR 3756, Hub de Bioinformatique et Biostatistique, Paris, France
| | - Juan Monroy-Nieto
- Translational Genomics Research Institute, Pathogen and Microbiome Division, Flagstaff, AZ 86005, USA
| | - Dibyabhaba Pradhan
- All India Institute of Medical Sciences, Ansari Nagar, New Delhi, 110029, India
| | - Jessie Uehling
- Botany and Plant Pathology, Oregon State University, Corvallis, OR 97330, USA
| | - Duong Vu
- Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584CT, Utrecht, Netherlands
| | - Corin A. Yeats
- Centre for Genomic Pathogen Surveillance, Big Data Institute, University of Oxford, Oxford, UK
| | - David M. Aanensen
- Centre for Genomic Pathogen Surveillance, Big Data Institute, University of Oxford, Oxford, UK
| | - Christophe d’Enfert
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie et Pathogénicité Fongiques, Paris, France
| | - David M. Engelthaler
- Translational Genomics Research Institute, Pathogen and Microbiome Division, Flagstaff, AZ 86005, USA
| | - David W. Eyre
- NIHR Oxford Biomedical Research Centre, University of Oxford, Oxford, UK
| | - Matthew C. Fisher
- MRC Centre for Global Infectious Disease Analysis, Imperial College London, London, UK
| | - Ferry Hagen
- Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584CT, Utrecht, Netherlands
- Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, Amsterdam, Netherlands
- Department of Medical Microbiology, University Medical Center Utrecht, Utrecht, Netherlands
| | - Wieland Meyer
- Sydney Medical School, University of Sydney, Sydney, NSW 2050, Australia
| | - Gagandeep Singh
- All India Institute of Medical Sciences, Ansari Nagar, New Delhi, 110029, India
| | - Ana Alastruey-Izquierdo
- Mycology Reference Laboratory, National Centre for Microbiology, Instituto de Salud Carlos III, Madrid, Spain
| | - Anastasia P. Litvintseva
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, US Department of Health and Human Services, Atlanta, GA, 30329, USA
| | | |
|
4
|
Sanabria GE, Sequera G, Aguirre S, Méndez J, Dos Santos PCP, Gustafson NW, Godoy M, Ortiz A, Cespedes C, Martínez G, García-Basteiro AL, Andrews JR, Croda J, Walter KS. Phylogeography and transmission of Mycobacterium tuberculosis spanning prisons and surrounding communities in Paraguay. Nat Commun 2023; 14:303. [PMID: 36658111 PMCID: PMC9849832 DOI: 10.1038/s41467-023-35813-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Accepted: 01/04/2023] [Indexed: 01/20/2023]
Abstract
Recent rises in incident tuberculosis (TB) cases in Paraguay and the increasing concentration of TB within prisons highlight the urgency of targeting strategies to interrupt transmission and prevent new infections. However, whether specific cities or carceral institutions play a disproportionate role in transmission remains unknown. We conducted prospective genomic surveillance, sequencing 471 Mycobacterium tuberculosis complex genomes, from inside and outside prisons in Paraguay's two largest urban areas, Asunción and Ciudad del Este, from 2016 to 2021. We found genomic evidence of frequent recent transmission within prisons and transmission linkages spanning prisons and surrounding populations. We identified a signal of frequent M. tuberculosis spread between urban areas and marked recent population size expansion of the three largest genomic transmission clusters. Together, our findings highlight the urgency of strengthening TB control programs to reduce transmission risk within prisons in Paraguay, where incidence was 70 times that outside prisons in 2021.
Affiliation(s)
| | - Guillermo Sequera
- Instituto de Salud Global de Barcelona (ISGLOBAL), Barcelona, Spain
- Programa Nacional de Control de la Tuberculosis, Ministerio de Salud Pública y Bienestar Social (MSPyBS), Asunción, Paraguay
| | - Sarita Aguirre
- Programa Nacional de Control de la Tuberculosis, Ministerio de Salud Pública y Bienestar Social (MSPyBS), Asunción, Paraguay
| | - Julieta Méndez
- Instituto Regional de Investigación en Salud, Caaguazú, Paraguay
| | - Paulo César Pereira Dos Santos
- Postgraduate Program in Infectious and Parasitic Diseases, Federal University of Mato Grosso do Sul, Mato Grosso do Sul, Brazil
| | - Natalie Weiler Gustafson
- Laboratorio Central de Salud Pública (LCSP), Ministerio de Salud Publica y Bienestar Social (MSPyBS), Asunción, Paraguay
| | - Margarita Godoy
- Laboratorio Central de Salud Pública (LCSP), Ministerio de Salud Publica y Bienestar Social (MSPyBS), Asunción, Paraguay
| | - Analía Ortiz
- Instituto Regional de Investigación en Salud, Caaguazú, Paraguay
| | - Cynthia Cespedes
- Programa Nacional de Control de la Tuberculosis, Ministerio de Salud Pública y Bienestar Social (MSPyBS), Asunción, Paraguay
| | - Gloria Martínez
- Instituto Regional de Investigación en Salud, Caaguazú, Paraguay
| | - Alberto L García-Basteiro
- Instituto de Salud Global de Barcelona (ISGLOBAL), Barcelona, Spain
- Centro de Investigação em Saude de Manhiça (CISM), Maputo, Mozambique
- Centro de Investigación Biomédica en Red de Enfermedades Infecciosas (CIBERINFEC), Barcelona, Spain
| | - Jason R Andrews
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA, USA
| | - Julio Croda
- Federal University of Mato Grosso do Sul - UFMS, Campo Grande, MS, Brazil
- Oswaldo Cruz Foundation Mato Grosso do Sul, Mato Grosso do Sul, Brazil
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, USA
| | - Katharine S Walter
- Division of Epidemiology, University of Utah, Salt Lake City, UT, 84105, USA.
| |
|
5
|
Walter KS, Dos Santos PCP, Gonçalves TO, da Silva BO, da Silva Santos A, de Cássia Leite A, da Silva AM, Figueira Moreira FM, de Oliveira RD, Lemos EF, Cunha E, Liu YE, Ko AI, Colijn C, Cohen T, Mathema B, Croda J, Andrews JR. The role of prisons in disseminating tuberculosis in Brazil: A genomic epidemiology study. Lancet Regional Health. Americas 2022; 9. [PMID: 35647574 PMCID: PMC9140320 DOI: 10.1016/j.lana.2022.100186] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/24/2022]
Abstract
Background: Globally, prisons are high-incidence settings for tuberculosis. Yet the role of prisons as reservoirs of M. tuberculosis, propagating epidemics through spillover to surrounding communities, has been difficult to measure directly.
Methods: To quantify the role of prisons in driving wider community M. tuberculosis transmission, we conducted prospective genomic surveillance in Central West Brazil from 2014 to 2019. We whole genome sequenced 1152 M. tuberculosis isolates collected during active and passive surveillance inside and outside prisons and linked genomes to detailed incarceration histories. We applied multiple phylogenetic and genomic clustering approaches and inferred timed transmission trees.
Findings: M. tuberculosis sequences from incarcerated and non-incarcerated people were closely related in a maximum likelihood phylogeny. The majority (70.8%; 46/65) of genomic clusters including people with no incarceration history also included individuals with a recent history of incarceration. Among cases in individuals with no incarceration history, 50.6% (162/320) were in clusters that included individuals with recent incarceration history, suggesting that transmission chains often span prisons and communities. We identified a minimum of 18 highly probable spillover events (M. tuberculosis transmission from people with a recent incarceration history to people with no prior history of incarceration), occurring in the state’s four largest cities and across sampling years. We additionally found that frequent transfers of people between the state’s prisons create a highly connected prison network that likely disseminates M. tuberculosis across the state.
Interpretation: We developed a framework for measuring spillover from high-incidence environments to surrounding communities by integrating genomic and spatial information. Our findings indicate that, in this setting, prisons serve not only as disease reservoirs but also disseminate M. tuberculosis across highly connected prison networks, both amplifying and propagating M. tuberculosis risk in surrounding communities.
Funding: Brazil’s National Council for Scientific and Technological Development and US National Institutes of Health.
Affiliation(s)
- Katharine S Walter
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA 94305, United States
| | | | | | - Bruna Oliveira da Silva
- Health Sciences Research Laboratory, Federal University of Grande Dourados, Dourados, Brazil
| | - Andrea da Silva Santos
- Health Sciences Research Laboratory, Federal University of Grande Dourados, Dourados, Brazil
| | | | - Alessandra Moura da Silva
- School of Medicine, Federal University of Mato Grosso do Sul, School of Medicine, Campo Grande, Brazil
| | | | | | - Everton Ferreira Lemos
- School of Medicine, Federal University of Mato Grosso do Sul, School of Medicine, Campo Grande, Brazil
| | - Eunice Cunha
- Laboratory of Bacteriology, Central Laboratory of Mato Grosso do Sul, Campo Grande, Brazil
| | - Yiran E Liu
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA 94305, United States; Cancer Biology Graduate Program, Stanford University School of Medicine, Stanford, United States
| | - Albert I Ko
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States; Instituto Gonçalo Moniz, Fundação Oswaldo Cruz, Salvador, BA, Brazil
| | - Caroline Colijn
- Department of Mathematics, Simon Fraser University, Burnaby, Canada
| | - Ted Cohen
- Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States
| | - Barun Mathema
- Department of Epidemiology, Columbia University Mailman School of Public Health, New York, United States
| | - Julio Croda
- School of Medicine, Federal University of Mato Grosso do Sul, School of Medicine, Campo Grande, Brazil; Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, United States; Mato Grosso do Sul Office, Oswaldo Cruz Foundation, Campo Grande, Brazil
| | - Jason R Andrews
- Division of Infectious Diseases and Geographic Medicine, Stanford University School of Medicine, Stanford, CA 94305, United States
| |
|
6
|
Dookie N, Khan A, Padayatchi N, Naidoo K. Application of Next Generation Sequencing for Diagnosis and Clinical Management of Drug-Resistant Tuberculosis: Updates on Recent Developments in the Field. Front Microbiol 2022; 13:775030. [PMID: 35401475 PMCID: PMC8988194 DOI: 10.3389/fmicb.2022.775030] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 09/13/2021] [Accepted: 02/17/2022] [Indexed: 11/30/2022]
Abstract
The World Health Organization’s End TB Strategy prioritizes universal access to early diagnosis and comprehensive drug susceptibility testing (DST) for all individuals with tuberculosis (TB) as a key component of integrated, patient-centered TB care. Next generation whole genome sequencing (WGS) and its associated technology have demonstrated exceptional potential for reliable and comprehensive resistance prediction for Mycobacterium tuberculosis isolates, allowing for accurate clinical decisions. This review presents a descriptive analysis of research describing the potential of WGS to accelerate delivery of individualized care, recent advances in sputum-based WGS technology and the role of targeted sequencing for resistance detection. We provide an update on recent research describing the mechanisms of resistance to new and repurposed drugs and the dynamics of mixed infections and their potential implications for TB diagnosis and treatment. Whilst the studies reviewed here have greatly improved our understanding of recent advances in this arena, they also highlight significant challenges that remain. The widespread introduction of new drugs in the absence of standardized DST has led to rapid emergence of drug resistance. This review highlights apparent gaps in our knowledge of the mechanisms contributing to resistance to these new drugs and challenges that limit the clinical utility of next generation sequencing techniques. We recommend a combination of genotypic and phenotypic techniques to monitor treatment response and to curb the emergence and further dissemination of drug resistance.
Affiliation(s)
- Navisha Dookie
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
- *Correspondence: Navisha Dookie
| | - Azraa Khan
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
| | - Nesri Padayatchi
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
- South African Medical Research Council (SAMRC), CAPRISA HIV-TB Pathogenesis and Treatment Research Unit, Durban, South Africa
| | - Kogieleum Naidoo
- Centre for the AIDS Programme of Research in South Africa (CAPRISA), University of KwaZulu-Natal, Durban, South Africa
- South African Medical Research Council (SAMRC), CAPRISA HIV-TB Pathogenesis and Treatment Research Unit, Durban, South Africa
| |
|
7
|
Lorente-Leal V, Farrell D, Romero B, Álvarez J, de Juan L, Gordon SV. Performance and Agreement Between WGS Variant Calling Pipelines Used for Bovine Tuberculosis Control: Toward International Standardization. Front Vet Sci 2022; 8:780018. [PMID: 34970617 PMCID: PMC8712436 DOI: 10.3389/fvets.2021.780018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Accepted: 11/25/2021] [Indexed: 11/29/2022]
Abstract
Whole genome sequencing (WGS) and allied variant calling pipelines are a valuable tool for the control and eradication of infectious diseases, since they allow the assessment of the genetic relatedness of strains of animal pathogens. In the context of the control of tuberculosis (TB) in livestock, mainly caused by Mycobacterium bovis, these tools offer a high-resolution alternative to traditional molecular methods in the study of herd breakdown events. However, despite the increased use and efforts in the standardization of WGS methods in human tuberculosis around the world, the application of these WGS-enabled approaches to control TB in livestock is still in early development. Our study pursued an initial evaluation of the performance and agreement of four publicly available pipelines for the analysis of M. bovis WGS data (vSNP, SNiPgenie, BovTB, and MTBseq) on a set of simulated Illumina reads generated from a real-world setting with high TB prevalence in cattle and wildlife in the Republic of Ireland. The overall performance of the evaluated pipelines was high, with recall and precision rates above 99% once repeat-rich and problematic regions were removed from the analyses. In addition, when the same filters were applied, distances between inferred phylogenetic trees were similar and pairwise comparison revealed that most of the differences were due to the positioning of polytomies. Hence, under the studied conditions, all pipelines offer similar performance for variant calling to underpin real-world studies of M. bovis transmission dynamics.
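Recall and precision of a variant call set against a known truth set, as used in benchmarks on simulated reads, can be computed as in the following minimal Python sketch. The variant keys and the function name are hypothetical, and the example numbers are invented; the sketch only illustrates the standard definitions, not the evaluation code used in the study.

```python
from typing import Hashable, Set, Tuple


def precision_recall(called: Set[Hashable], truth: Set[Hashable]) -> Tuple[float, float]:
    """Return (precision, recall) of a call set against a truth set.

    precision = true positives / all called variants
    recall    = true positives / all true variants
    An empty denominator yields 0.0 rather than raising an error.
    """
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy example: variants identified only by genome position.
truth_positions = {2, 3, 4, 5, 6}
called_positions = {1, 2, 3, 4}
precision, recall = precision_recall(called_positions, truth_positions)
```

Masking repeat-rich regions before the comparison, as the abstract describes, would amount to removing the same set of positions from both `called_positions` and `truth_positions` before calling the function.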
Affiliation(s)
- Víctor Lorente-Leal
- VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain; Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
| | - Damien Farrell
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| | - Beatriz Romero
- VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain; Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
| | - Julio Álvarez
- VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain; Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
| | - Lucía de Juan
- VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain; Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
| | - Stephen V Gordon
- UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
| |
|
8
|
Bagal UR, Phan J, Welsh RM, Misas E, Wagner D, Gade L, Litvintseva AP, Cuomo CA, Chow NA. MycoSNP: A Portable Workflow for Performing Whole-Genome Sequencing Analysis of Candida auris. Methods Mol Biol 2022; 2517:215-228. [PMID: 35674957 DOI: 10.1007/978-1-0716-2417-3_17] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 06/15/2023]
Abstract
Candida auris is an urgent public health threat characterized by high rates of drug resistance and rapid spread in healthcare settings worldwide. As part of the C. auris response, molecular surveillance has helped public health officials track the global spread and investigate local outbreaks. Here, we describe whole-genome sequencing analysis methods used for routine C. auris molecular surveillance in the United States; methods include reference selection, reference preparation, quality assessment and control of sequencing reads, read alignment, and single-nucleotide polymorphism calling and filtration. We also describe the newly developed pipeline MycoSNP, a portable workflow for performing whole-genome sequencing analysis of fungal organisms including C. auris.
Affiliation(s)
- Ujwal R Bagal
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- John Phan
- Centers for Disease Control and Prevention, Atlanta, GA, USA
- Rory M Welsh
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Elizabeth Misas
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Lalitha Gade
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Christina A Cuomo
- Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nancy A Chow
- Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
9
Heupink TH, Verboven L, Warren RM, Van Rie A. Comprehensive and accurate genetic variant identification from contaminated and low-coverage Mycobacterium tuberculosis whole genome sequencing data. Microb Genom 2021; 7:000689. [PMID: 34793294 PMCID: PMC8743552 DOI: 10.1099/mgen.0.000689] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/09/2021] [Accepted: 09/09/2021] [Indexed: 12/30/2022] Open
Abstract
An improved understanding of the genomic variants that allow Mycobacterium tuberculosis (Mtb) to acquire drug resistance or tolerance and to increase its virulence is important for controlling the current tuberculosis epidemic. Current approaches to Mtb sequencing, however, cannot reveal Mtb's full genomic diversity due to the strict requirements of low contamination levels, high Mtb sequence coverage and elimination of complex regions. We have developed the XBS (compleX Bacterial Samples) bioinformatics pipeline, which implements joint calling and machine-learning-based variant filtering tools to specifically improve variant detection in the important Mtb samples that do not meet these criteria, such as those from unbiased sputum samples. Using novel simulated datasets, which permit exact accuracy verification, XBS was compared to the UVP and MTBseq pipelines. Accuracy statistics showed that all three pipelines performed equally well for sequence data that resemble those obtained from culture isolates of high depth of coverage and low-level contamination. In the complex genomic regions, however, XBS accurately identified 9.0% more SNPs and 8.1% more single nucleotide insertions and deletions than the WHO-endorsed unified analysis variant pipeline. XBS also had superior accuracy for sequence data that resemble those obtained directly from sputum samples, where depth of coverage is typically very low and contamination levels are high. XBS was the only pipeline not affected by low depth of coverage (5–10×), type of contamination and excessive contamination levels (>50%). The simulation results were corroborated by whole genome sequencing (WGS) data from clinical samples, which confirmed the superior performance of XBS: a higher sensitivity (98.8%) when analysing culture isolates, and identification of 13.9% more variable sites in WGS data from sputum samples as compared to MTBseq, without evidence of false positive variants when rRNA regions were excluded.
The XBS pipeline facilitates sequencing of less-than-perfect Mtb samples. These advances will benefit future clinical applications of Mtb sequencing, especially WGS directly from clinical specimens, thereby avoiding in vitro biases and making many more samples available for drug resistance and other genomic analyses. The additional genetic resolution and increased sample success rate will improve genome-wide association studies and sequence-based transmission studies.
Affiliation(s)
- Tim H. Heupink
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Lennert Verboven
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
- Robin M. Warren
- South African Medical Research Council Centre for Tuberculosis Research and DST/NRF Centre of Excellence for Biomedical Tuberculosis Research, Division of Molecular Biology and Human Genetics, Stellenbosch University, Stellenbosch, South Africa
- Annelies Van Rie
- Family Medicine and Population Health (FAMPOP), Faculty of Medicine and Health Sciences, University of Antwerp, Antwerp, Belgium
10
Bogaerts B, Delcourt T, Soetaert K, Boarbi S, Ceyssens PJ, Winand R, Van Braekel J, De Keersmaecker SCJ, Roosens NHC, Marchal K, Mathys V, Vanneste K. A Bioinformatics Whole-Genome Sequencing Workflow for Clinical Mycobacterium tuberculosis Complex Isolate Analysis, Validated Using a Reference Collection Extensively Characterized with Conventional Methods and In Silico Approaches. J Clin Microbiol 2021; 59:e00202-21. [PMID: 33789960 PMCID: PMC8316078 DOI: 10.1128/jcm.00202-21] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/27/2021] [Accepted: 03/27/2021] [Indexed: 01/18/2023] Open
Abstract
The use of whole-genome sequencing (WGS) for routine typing of bacterial isolates has increased substantially in recent years. For Mycobacterium tuberculosis (MTB), in particular, WGS has the benefit of drastically reducing the time required to generate results compared to most conventional phenotypic methods. Consequently, a multitude of solutions for analyzing WGS MTB data have been developed, but their successful integration in clinical and national reference laboratories is hindered by the requirement for their validation, for which a consensus framework is still largely absent. We developed a bioinformatics workflow for (Illumina) WGS-based routine typing of MTB complex (MTBC) member isolates allowing complete characterization, including (sub)species confirmation and identification (16S, csb/RD, hsp65), single nucleotide polymorphism (SNP)-based antimicrobial resistance (AMR) prediction, and pathogen typing (spoligotyping, SNP barcoding, and core genome multilocus sequence typing). Workflow performance was validated on a per-assay basis using a collection of 238 in-house-sequenced MTBC isolates, extensively characterized with conventional molecular biology-based approaches supplemented with public data. For SNP-based AMR prediction, results from molecular genotyping methods were supplemented with in silico modified data sets, allowing us to greatly increase the set of evaluated mutations. The workflow demonstrated very high performance with performance metrics of >99% for all assays, except for spoligotyping, where sensitivity dropped to ∼90%. The validation framework for our WGS-based bioinformatics workflow can aid in the standardization of bioinformatics tools by the MTB community and other SNP-based applications regardless of the targeted pathogen(s). The bioinformatics workflow is available for academic and nonprofit use through the Galaxy instance of our institute at https://galaxy.sciensano.be.
Affiliation(s)
- Bert Bogaerts
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Thomas Delcourt
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Raf Winand
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Julien Van Braekel
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Nancy H C Roosens
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Kathleen Marchal
- Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Department of Genetics, University of Pretoria, Pretoria, South Africa
- Kevin Vanneste
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
11
Didelot X, Kendall M, Xu Y, White PJ, McCarthy N. Genomic Epidemiology Analysis of Infectious Disease Outbreaks Using TransPhylo. Curr Protoc 2021; 1:e60. [PMID: 33617114 PMCID: PMC7995038 DOI: 10.1002/cpz1.60] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
Abstract
Comparing the pathogen genomes from several cases of an infectious disease has the potential to help us understand and control outbreaks. Many methods exist to reconstruct a phylogeny from such genomes, which represents how the genomes are related to one another. However, such a phylogeny is not directly informative about transmission events between individuals. TransPhylo is a software tool implemented as an R package designed to bridge the gap between pathogen phylogenies and transmission trees. TransPhylo is based on a combined model of transmission between hosts and pathogen evolution within each host. It can simulate both phylogenies and transmission trees jointly under this combined model. TransPhylo can also reconstruct a transmission tree based on a dated phylogeny, by exploring the space of transmission trees compatible with the phylogeny. A transmission tree can be represented as a coloring of a phylogeny where each color represents a different host of the pathogen, and TransPhylo provides convenient ways to plot these colorings and explore the results. This article presents the basic protocols that can be used to make the most of TransPhylo. © 2021 The Authors. Basic Protocol 1: First steps with TransPhylo; Basic Protocol 2: Simulation of outbreak data; Basic Protocol 3: Inference of transmission; Basic Protocol 4: Exploring the results of inference.
Affiliation(s)
- Xavier Didelot
- School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Michelle Kendall
- School of Life Sciences and Department of Statistics, University of Warwick, United Kingdom
- Yuanwei Xu
- Center for Computational Biology, Institute of Cancer and Genomic Sciences, University of Birmingham, United Kingdom
- Peter J. White
- Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, United Kingdom
- Medical Research Council Centre for Global Infectious Disease Analysis, School of Public Health, Imperial College London, United Kingdom
- National Institute for Health Research Health Protection Research Unit in Modelling and Health Economics, School of Public Health, Imperial College London, United Kingdom
- Modelling and Economics Unit, National Infection Service, Public Health England, London, United Kingdom
- Noel McCarthy
- Warwick Medical School, University of Warwick, United Kingdom
12
Valiente-Mullor C, Beamud B, Ansari I, Francés-Cuesta C, García-González N, Mejía L, Ruiz-Hueso P, González-Candelas F. One is not enough: On the effects of reference genome for the mapping and subsequent analyses of short-reads. PLoS Comput Biol 2021; 17:e1008678. [PMID: 33503026 PMCID: PMC7870062 DOI: 10.1371/journal.pcbi.1008678] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 04/27/2020] [Revised: 02/08/2021] [Accepted: 01/05/2021] [Indexed: 12/17/2022] Open
Abstract
Mapping of high-throughput sequencing (HTS) reads to a single arbitrary reference genome is a frequently used approach in microbial genomics. However, the choice of a reference may represent a source of errors that may affect subsequent analyses such as the detection of single nucleotide polymorphisms (SNPs) and phylogenetic inference. In this work, we evaluated the effect of reference choice on short-read sequence data from five clinically and epidemiologically relevant bacteria (Klebsiella pneumoniae, Legionella pneumophila, Neisseria gonorrhoeae, Pseudomonas aeruginosa and Serratia marcescens). Publicly available whole-genome assemblies encompassing the genomic diversity of these species were selected as reference sequences, and read alignment statistics, SNP calling, recombination rates, dN/dS ratios, and phylogenetic trees were evaluated depending on the mapping reference. The choice of different reference genomes proved to have an impact on almost all the parameters considered in the five species. In addition, these biases had potential epidemiological implications such as including/excluding isolates of particular clades and the estimation of genetic distances. These findings suggest that the single reference approach might introduce systematic errors during mapping that affect subsequent analyses, particularly for data sets with isolates from genetically diverse backgrounds. In any case, exploring the effects of different references on the final conclusions is highly recommended.
Mapping consists of aligning reads (i.e., DNA fragments) obtained through high-throughput genome sequencing to a previously assembled reference sequence. It is common practice in genomic studies to use a single reference for mapping, usually the ‘reference genome’ of a species, which is a high-quality assembly. However, the selection of an optimal reference is hindered by intrinsic intra-species genetic variability, particularly in bacteria. It is known that genetic differences between the reference genome and the read sequences may produce incorrect alignments during mapping. Eventually, these errors could lead to misidentification of variants and biased reconstruction of phylogenetic trees (which reflect ancestry between different bacterial lineages). To our knowledge, this is the first work to systematically examine the effect of different references for mapping on the inference of tree topology as well as the impact on recombination and natural selection inferences. Furthermore, the novelty of this work lies in a procedure that guarantees that we are evaluating only the effect of the reference. This effect has proved to be pervasive in the five bacterial species that we have studied and, in some cases, alterations in phylogenetic trees could lead to incorrect epidemiological inferences. Hence, the use of different reference genomes may be advisable to assess the potential biases of mapping.
Affiliation(s)
- Carlos Valiente-Mullor
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Beatriz Beamud
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- * E-mail: (BB); (FG-C)
- Iván Ansari
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Carlos Francés-Cuesta
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Neris García-González
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Lorena Mejía
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Instituto de Microbiología, Colegio de Ciencias Biológicas y Ambientales, Universidad San Francisco de Quito, Quito, Ecuador
- Paula Ruiz-Hueso
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- Fernando González-Candelas
- Joint Research Unit “Infection and Public Health” FISABIO-University of Valencia, Institute for Integrative Systems Biology (I2SysBio), Valencia, Spain
- CIBER in Epidemiology and Public Health, Valencia, Spain
- * E-mail: (BB); (FG-C)