1
Antão-Sousa S, Pinto N, Rende P, Amorim A, Gusmão L. The sequence of the repetitive motif influences the frequency of multistep mutations in Short Tandem Repeats. Sci Rep 2023; 13:10251. [PMID: 37355683 PMCID: PMC10290632 DOI: 10.1038/s41598-023-32137-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/28/2022] [Accepted: 03/23/2023] [Indexed: 06/26/2023] Open
Abstract
Microsatellites, or Short Tandem Repeats (STRs), are subject to frequent length mutations that involve the loss or gain of an integer number of repeats. This work aimed to investigate the correlation between the specific repetitive motif composition of STRs and their mutational dynamics, specifically the occurrence of single- or multistep mutations. Allelic transmission data, comprising 323,818 allele transfers and 1,297 mutations, were gathered for 35 Y-chromosomal STRs with simple structure. Six structure groups were established: ATT, CTT, TCTA/GATA, GAAA/CTTT, CTTTT, and AGAGAT, according to the repetitive motif present in the DNA leading strand of the markers. Results show that the occurrence of multistep mutations varies significantly among groups of markers defined by the repetitive motif. The group with the highest frequency of multistep mutations was the one with repetitive motif CTTTT (25% of the detected mutations), and the group with the lowest was the one with repetitive motifs TCTA/GATA (0.93%). Statistically significant differences (α = 0.05) were found between groups whose repetitive motifs differ in length: between TCTA/GATA and each of ATT (p = 0.0168), CTT (p < 0.0001) and CTTTT (p < 0.0001), as well as between GAAA/CTTT and CTTTT (p = 0.0102). The same occurred between the two tetrameric groups GAAA/CTTT and TCTA/GATA (p < 0.0001), the first showing 5.7 times more multistep mutations than the second. When considering the number of repeats of the mutated paternal alleles, statistically significant differences were found for alleles with 10 or 12 repeats, between GATA and ATT structure groups. These results, which demonstrate the heterogeneity of mutational dynamics across repeat motifs, have implications for population genetics, epidemiology, and phylogeography, and whenever STR mutation models are used in evolutionary studies in general.
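The pairwise group comparisons described in this abstract can be illustrated with a two-sided Fisher's exact test on a 2x2 table of multistep versus single-step mutation counts. This is only a sketch: the abstract does not say which test the authors used, and the counts in the usage lines are hypothetical placeholders, not data from the paper. The function name `fisher_exact_two_sided` is likewise an assumption made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more probable than the observed one.
    # The small tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# HYPOTHETICAL counts, for illustration only (not taken from the paper):
# group A: 3 multistep and 9 single-step mutations
# group B: 5 multistep and 535 single-step mutations
p_value = fisher_exact_two_sided(3, 9, 5, 535)
print(f"two-sided p = {p_value:.3g}")
```

An exact test is a natural choice here because some motif groups may contain very few multistep events, where large-sample approximations become unreliable.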
Affiliation(s)
- Sofia Antão-Sousa
  - Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Porto, Portugal
  - DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
- Nádia Pinto
  - Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal
  - Center of Mathematics of University of Porto (CMUP), Porto, Portugal
- Pablo Rende
  - Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Porto, Portugal
- António Amorim
  - Instituto de Investigação e Inovação em Saúde (i3S), University of Porto, Porto, Portugal
  - Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Porto, Portugal
  - Department of Biology, Faculty of Sciences of University of Porto (FCUP), Porto, Portugal
- Leonor Gusmão
  - DNA Diagnostic Laboratory (LDD), State University of Rio de Janeiro (UERJ), Rio de Janeiro, Brazil
2
Karabatsos G, Leisen F. An approximate likelihood perspective on ABC methods. Statistics Surveys 2018. [DOI: 10.1214/18-ss120] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/19/2022]
3
The Evolution of Strain Typing in the Mycobacterium tuberculosis Complex. Advances in Experimental Medicine and Biology 2017; 1019:43-78. [PMID: 29116629 DOI: 10.1007/978-3-319-64371-7_3] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 12/24/2022]
Abstract
Tuberculosis (TB) is a contagious disease with a complex epidemiology. Therefore, molecular typing (genotyping) of Mycobacterium tuberculosis complex (MTBC) strains is of primary importance to effectively guide outbreak investigations, define transmission dynamics and assist global epidemiological surveillance of the disease. Large-scale genotyping is also needed to get better insights into the biological diversity and the evolution of the pathogen. Thanks to its shorter turnaround and simple numerical nomenclature system, mycobacterial interspersed repetitive unit-variable-number tandem repeat (MIRU-VNTR) typing, based on 24 standardized plus 4 hypervariable loci, optionally combined with spoligotyping, has replaced IS6110 DNA fingerprinting over the last decade as a gold standard among classical strain typing methods for many applications. With the continuous progress and decreasing costs of next-generation sequencing (NGS) technologies, typing based on whole genome sequencing (WGS) is now increasingly performed for near complete exploitation of the available genetic information. However, some important challenges remain such as the lack of standardization of WGS analysis pipelines, the need of databases for sharing WGS data at a global level, and a better understanding of the relevant genomic distances for defining clusters of recent TB transmission in different epidemiological contexts. This chapter provides an overview of the evolution of genotyping methods over the last three decades, which culminated with the development of WGS-based methods. It addresses the relative advantages and limitations of these techniques, indicates current challenges and potential directions for facilitating standardization of WGS-based typing, and provides suggestions on what method to use depending on the specific research question.
4
Fu S, Octavia S, Wang Q, Tanaka MM, Tay CY, Sintchenko V, Lan R. Evolution of Variable Number Tandem Repeats and Its Relationship with Genomic Diversity in Salmonella Typhimurium. Front Microbiol 2016; 7:2002. [PMID: 28082952 PMCID: PMC5183578 DOI: 10.3389/fmicb.2016.02002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 07/04/2016] [Accepted: 11/30/2016] [Indexed: 01/06/2023] Open
Abstract
Salmonella enterica serovar Typhimurium is the most common Salmonella serovar causing human infections in Australia and many other countries. A total of 12,112 S. Typhimurium isolates from New South Wales were analyzed by multi-locus variable number of tandem repeat (VNTR) analysis (MLVA) using five VNTRs from 2007 to 2014. We found that mid ranges of repeat units of 8–14 in VNTR locus STTR5, 6–13 in STTR6, and 9–12 in STTR10 were always predominant in the population (>50%). In vitro passaging experiments using MLVA types carrying extreme-length alleles found that the majority of long alleles mutated to shorter ones and short alleles mutated to longer ones. Both lines of evidence suggest directional mutability of VNTRs toward mid-range repeats. Sequencing of 28 isolates from a newly emerged MLVA type and its five single locus variants revealed that single nucleotide variation between isolates with up to two MLVA differences ranged from 0 to 12 single nucleotide polymorphisms (SNPs). However, there was no relationship between SNP and VNTR differences. A population genetic model of the joint distribution of VNTR and SNP variation was used to estimate the mutation rates of the two markers, yielding a ratio of 1 VNTR change to 6.9 SNP changes. When only one VNTR repeat difference was considered, the majority of pairwise SNP differences between isolates were 4 SNPs or fewer. Based on this observation and our previous findings of SNP differences of outbreak isolates, we suggest that investigation of S. Typhimurium community outbreaks should include cases of 1 repeat difference to increase sensitivity. This study offers new insights into the short-term VNTR evolution of S. Typhimurium and its application for epidemiological typing.
Affiliation(s)
- Songzhe Fu
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales (UNSW), Sydney, NSW, Australia
- Sophie Octavia
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales (UNSW), Sydney, NSW, Australia
- Qinning Wang
  - Centre for Infectious Diseases and Microbiology-Public Health, Institute of Clinical Pathology and Medical Research, Westmead Hospital, Sydney, NSW, Australia
- Mark M Tanaka
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales (UNSW), Sydney, NSW, Australia
- Chin Yen Tay
  - Pathology and Laboratory Medicine, University of Western Australia, Perth, WA, Australia
- Vitali Sintchenko
  - Centre for Infectious Diseases and Microbiology-Public Health, Institute of Clinical Pathology and Medical Research, Westmead Hospital, Sydney, NSW, Australia
  - Marie Bashir Institute for Infectious Diseases and Biosecurity, University of Sydney, Sydney, NSW, Australia
- Ruiting Lan
  - School of Biotechnology and Biomolecular Sciences, University of New South Wales (UNSW), Sydney, NSW, Australia
5
Armed conflict and population displacement as drivers of the evolution and dispersal of Mycobacterium tuberculosis. Proc Natl Acad Sci U S A 2016; 113:13881-13886. [PMID: 27872285 DOI: 10.1073/pnas.1611283113] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Indexed: 12/14/2022] Open
Abstract
The "Beijing" Mycobacterium tuberculosis (Mtb) lineage 2 (L2) is spreading globally and has been associated with accelerated disease progression and increased antibiotic resistance. Here we performed a phylodynamic reconstruction of one of the L2 sublineages, the central Asian clade (CAC), which has recently spread to western Europe. We find that recent historical events have contributed to the evolution and dispersal of the CAC. Our timing estimates indicate that the clade was likely introduced to Afghanistan during the 1979-1989 Soviet-Afghan war and spread further after population displacement in the wake of the American invasion in 2001. We also find that drug resistance mutations accumulated on a massive scale in Mtb isolates from former Soviet republics after the fall of the Soviet Union, a pattern that was not observed in CAC isolates from Afghanistan. Our results underscore the detrimental effects of political instability and population displacement on tuberculosis control and demonstrate the power of phylodynamic methods in exploring bacterial evolution in space and time.
6
Chindelevitch L, Colijn C, Moodley P, Wilson D, Cohen T. ClassTR: Classifying Within-Host Heterogeneity Based on Tandem Repeats with Application to Mycobacterium tuberculosis Infections. PLoS Comput Biol 2016; 12:e1004475. [PMID: 26829497 PMCID: PMC4734664 DOI: 10.1371/journal.pcbi.1004475] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 01/30/2015] [Accepted: 07/22/2015] [Indexed: 11/18/2022] Open
Abstract
Genomic tools have revealed genetically diverse pathogens within some hosts. Within-host pathogen diversity, which we refer to as "complex infection", is increasingly recognized as a determinant of treatment outcome for infections like tuberculosis. Complex infection arises through two mechanisms: within-host mutation (which results in clonal heterogeneity) and reinfection (which results in mixed infections). Estimates of the frequency of within-host mutation and reinfection in populations are critical for understanding the natural history of disease. These estimates influence projections of disease trends and effects of interventions. The genotyping technique MLVA (multiple loci variable-number tandem repeats analysis) can identify complex infections, but the current method to distinguish clonal heterogeneity from mixed infections is based on a rather simple rule. Here we describe ClassTR, a method which leverages MLVA information from isolates collected in a population to distinguish mixed infections from clonal heterogeneity. We formulate the resolution of complex infections into their constituent strains as an optimization problem, and show its NP-completeness. We solve it efficiently by using mixed integer linear programming and graph decomposition. Once the complex infections are resolved into their constituent strains, ClassTR probabilistically classifies isolates as clonally heterogeneous or mixed by using a model of tandem repeat evolution. We first compare ClassTR with the standard rule-based classification on 100 simulated datasets. ClassTR outperforms the standard method, improving classification accuracy from 48% to 80%. We then apply ClassTR to a sample of 436 strains collected from tuberculosis patients in a South African community, of which 92 had complex infections. We find that ClassTR assigns an alternate classification to 18 of the 92 complex infections, suggesting important differences in practice. 
By explicitly modeling tandem repeat evolution, ClassTR helps to improve our understanding of the mechanisms driving within-host diversity of pathogens like Mycobacterium tuberculosis.
Affiliation(s)
- Leonid Chindelevitch
  - Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
- Caroline Colijn
  - Department of Mathematics, Imperial College, London, United Kingdom
- Prashini Moodley
  - School of Laboratory Medicine and Medical Sciences, Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
- Douglas Wilson
  - Department of Medicine, Edendale Hospital, Pietermaritzburg, South Africa
  - Nelson R Mandela School of Medicine, University of KwaZulu-Natal, Durban, South Africa
- Ted Cohen
  - Department of Epidemiology of Microbial Diseases, Yale School of Public Health, New Haven, Connecticut, United States of America
7
Couvin D, Rastogi N. Tuberculosis – A global emergency: Tools and methods to monitor, understand, and control the epidemic with specific example of the Beijing lineage. Tuberculosis (Edinb) 2015; 95 Suppl 1:S177-89. [DOI: 10.1016/j.tube.2015.02.023] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 11/26/2022]
8
Ahlstrom C, Barkema HW, Stevenson K, Zadoks RN, Biek R, Kao R, Trewby H, Haupstein D, Kelton DF, Fecteau G, Labrecque O, Keefe GP, McKenna SLB, De Buck J. Limitations of variable number of tandem repeat typing identified through whole genome sequencing of Mycobacterium avium subsp. paratuberculosis on a national and herd level. BMC Genomics 2015; 16:161. [PMID: 25765045 PMCID: PMC4356054 DOI: 10.1186/s12864-015-1387-6] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 10/29/2014] [Accepted: 02/24/2015] [Indexed: 01/14/2023] Open
Abstract
Background: Mycobacterium avium subsp. paratuberculosis (MAP), the causative bacterium of Johne’s disease in dairy cattle, is widespread in the Canadian dairy industry and has significant economic and animal welfare implications. An understanding of the population dynamics of MAP can be used to identify introduction events, improve control efforts and target transmission pathways, although this requires an adequate understanding of MAP diversity and distribution between herds and across the country. Whole genome sequencing (WGS) offers a detailed assessment of the SNP-level diversity and genetic relationship of isolates, whereas several molecular typing techniques used to investigate the molecular epidemiology of MAP, such as variable number of tandem repeat (VNTR) typing, target relatively unstable repetitive elements in the genome that may be too unpredictable to draw accurate conclusions. The objective of this study was to evaluate the diversity of bovine MAP isolates in Canadian dairy herds using WGS and then determine if VNTR typing can distinguish truly related and unrelated isolates.
Results: Phylogenetic analysis based on 3,039 SNPs identified through WGS of 124 MAP isolates identified eight genetically distinct subtypes in dairy herds from seven Canadian provinces, with the dominant type including over 80% of MAP isolates. VNTR typing of 527 MAP isolates identified 12 types, including “bison type” isolates, from seven different herds. At a national level, MAP isolates differed from each other by 1–2 to 239–240 SNPs, regardless of whether they belonged to the same or different VNTR types. A herd-level analysis of MAP isolates demonstrated that VNTR typing may both over-estimate and under-estimate the relatedness of MAP isolates found within a single herd.
Conclusions: The presence of multiple MAP subtypes in Canada suggests multiple introductions into the country including what has now become one dominant type, an important finding for Johne’s disease control. VNTR typing often failed to identify closely and distantly related isolates, limiting the applicability of using this typing scheme to study the molecular epidemiology of MAP at a national and herd level.
Affiliation(s)
- Ruth N Zadoks
  - Moredun Research Institute, Penicuik, Scotland
  - University of Glasgow, Glasgow, Scotland
- Roman Biek
  - University of Glasgow, Glasgow, Scotland
- Olivia Labrecque
  - Laboratoire d'épidémiosurveillance animale du Québec, Saint-Hyacinthe, Québec, Canada
- Greg P Keefe
  - University of Prince Edward Island, Charlottetown, Prince Edward Island, Canada
- Shawn L B McKenna
  - University of Prince Edward Island, Charlottetown, Prince Edward Island, Canada
9
Putman AI, Carbone I. Challenges in analysis and interpretation of microsatellite data for population genetic studies. Ecol Evol 2014; 4:4399-428. [PMID: 25540699 PMCID: PMC4267876 DOI: 10.1002/ece3.1305] [Citation(s) in RCA: 204] [Impact Index Per Article: 18.5] [Received: 05/02/2014] [Revised: 10/02/2014] [Accepted: 10/03/2014] [Indexed: 12/14/2022] Open
Abstract
Advancing technologies have facilitated the ever-widening application of genetic markers such as microsatellites into new systems and research questions in biology. In light of the data and experience accumulated from several years of using microsatellites, we present here a literature review that synthesizes the limitations of microsatellites in population genetic studies. With a focus on population structure, we review the widely used fixation (F_ST) statistics and Bayesian clustering algorithms and find that the former can be confusing and problematic for microsatellites and that the latter may be confounded by complex population models and lack power in certain cases. Clustering, multivariate analyses, and diversity-based statistics are increasingly being applied to infer population structure, but in some instances these methods lack formalization with microsatellites. Migration-specific methods perform well only under narrow constraints. We also examine the use of microsatellites for inferring effective population size, changes in population size, and deeper demographic history, and find that these methods are untested and/or highly context-dependent. Overall, each method possesses important weaknesses for use with microsatellites, and there are significant constraints on inferences commonly made using microsatellite markers in the areas of population structure, admixture, and effective population size. To ameliorate and better understand these constraints, researchers are encouraged to analyze simulated datasets both prior to and following data collection and analysis, the latter of which is formalized within the approximate Bayesian computation framework. We also examine trends in the literature and show that microsatellites continue to be widely used, especially in non-human subject areas.
This review assists with study design and molecular marker selection, facilitates sound interpretation of microsatellite data while fostering respect for their practical limitations, and identifies lessons that could be applied toward emerging markers and high-throughput technologies in population genetics.
Affiliation(s)
- Alexander I Putman
  - Department of Plant Pathology, North Carolina State University, Raleigh, North Carolina 27695-7616
- Ignazio Carbone
  - Department of Plant Pathology, North Carolina State University, Raleigh, North Carolina 27695-7616
10
Bühlmann A, Dreo T, Rezzonico F, Pothier JF, Smits THM, Ravnikar M, Frey JE, Duffy B. Phylogeography and population structure of the biologically invasive phytopathogen Erwinia amylovora inferred using minisatellites. Environ Microbiol 2013; 16:2112-25. [PMID: 24112873 DOI: 10.1111/1462-2920.12289] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Received: 07/18/2013] [Accepted: 09/14/2013] [Indexed: 01/08/2023]
Abstract
Erwinia amylovora causes a major disease of pome fruit trees worldwide, and is regulated as a quarantine organism in many countries. While some diversity of isolates has been observed, molecular epidemiology of this bacterium is hindered by a lack of simple molecular typing techniques with sufficiently high resolution. We report a molecular typing system of E. amylovora based on variable number of tandem repeats (VNTR) analysis. Repeats in the E. amylovora genome were identified with comparative genomic tools, and VNTR markers were developed and validated. A Multiple-Locus VNTR Analysis (MLVA) was applied to E. amylovora isolates from bacterial collections representing global and regional distribution of the pathogen. Based on six repeats, MLVA allowed the distinction of 227 haplotypes among a collection of 833 isolates of worldwide origin. Three geographically separated groups were recognized among global isolates using Bayesian clustering methods. Analysis of regional outbreaks confirmed the presence of diverse haplotypes but also high representation of certain haplotypes during outbreaks. MLVA is a practical method for epidemiological studies of E. amylovora, identifying previously unresolved population structure within outbreaks. Knowledge of such structure can increase our understanding of how plant diseases emerge and spread over a given geographical region.
Affiliation(s)
- Andreas Bühlmann
  - Plant Protection Division, Agroscope Changins-Wädenswil Research Station ACW, CH-8820, Wädenswil, Switzerland
11
Ponder EL, Freundlich JS, Sarker M, Ekins S. Computational models for neglected diseases: gaps and opportunities. Pharm Res 2013; 31:271-7. [PMID: 23990313 DOI: 10.1007/s11095-013-1170-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/02/2013] [Accepted: 07/28/2013] [Indexed: 01/22/2023]
Abstract
Neglected diseases, such as Chagas disease, African sleeping sickness, and intestinal worms, affect millions of the world's poor. They disproportionately affect marginalized populations and either lack effective treatments or vaccines or have existing products that are not accessible to the populations affected. Computational approaches have been used across many of these diseases for various aspects of research or development, and yet data produced by computational approaches are not integrated and widely accessible to others. Here, we identify gaps in which computational approaches have been used for some neglected diseases and not others. We also make recommendations for the broad-spectrum integration of these techniques into a neglected disease drug discovery and development workflow.
Affiliation(s)
- Elizabeth L Ponder
  - Center for Emerging and Neglected Diseases, Berkeley, 444A Li Ka Shing Center, Berkeley, California, 94720-3370, USA
12
Ragheb MN, Ford CB, Chase MR, Lin PL, Flynn JL, Fortune SM. The mutation rate of mycobacterial repetitive unit loci in strains of M. tuberculosis from cynomolgus macaque infection. BMC Genomics 2013; 14:145. [PMID: 23496945 PMCID: PMC3635867 DOI: 10.1186/1471-2164-14-145] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 12/20/2012] [Accepted: 02/26/2013] [Indexed: 11/25/2022] Open
Abstract
Background: Mycobacterial interspersed repetitive units (MIRUs) are minisatellites within the Mycobacterium tuberculosis (Mtb) genome. Copy number variation (CNV) in MIRU loci is used for epidemiological typing, making the rate of variation important for tracking the transmission of Mtb strains. In this study, we developed and assessed a whole-genome sequencing (WGS) approach to detect MIRU CNV in Mtb. We applied this methodology to a panel of Mtb strains isolated from the macaque model of tuberculosis (TB), the animal model that best mimics human disease. From these data, we have estimated the rate of MIRU variation in the host environment, providing a benchmark rate for future epidemiologic work.
Results: We assessed variation at the 24 MIRU loci used for typing in a set of Mtb strains isolated from infected cynomolgus macaques. We previously performed WGS of these strains and here have applied both read depth (RD) and paired-end mapping (PEM) metrics to identify putative copy number variants. To assess the relative power of these approaches, all MIRU loci were resequenced using Sanger sequencing. We detected two insertion/deletion events, both of which could be identified as candidates by PEM criteria. With these data, we estimate a MIRU mutation rate of 2.70 × 10^-3 (95% CI: 3.30 × 10^-4 to 9.80 × 10^-3) per locus, per year.
Conclusion: Our results represent the first experimental estimate of the MIRU mutation rate in Mtb. This rate is comparable to the highest previous estimates gathered from epidemiologic data and meta-analyses. Our findings allow for a more rigorous interpretation of data gathered from MIRU typing.
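A rate estimated from a handful of events, as in this abstract, is often summarized with an exact Poisson confidence interval. The sketch below shows one way such an interval could be computed for two observed events. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not describe how the interval was derived, and the exposure value `exposure_locus_years` is a hypothetical figure chosen only so that the point estimate lands near the reported rate.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def bisect_decreasing(f, lo, hi, iters=200):
    """Root of a decreasing function f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_poisson_ci(k, alpha=0.05):
    """Exact two-sided interval for a Poisson mean given k events.

    Intended for small counts only; exp(-mu) underflows for very large means.
    """
    hi = 2 * k + 50  # generous upper bracket for small k
    if k == 0:
        lower = 0.0
    else:
        # Lower limit: P(X >= k | mu) = alpha / 2
        lower = bisect_decreasing(
            lambda mu: poisson_cdf(k - 1, mu) - (1 - alpha / 2), 0.0, hi)
    # Upper limit: P(X <= k | mu) = alpha / 2
    upper = bisect_decreasing(lambda mu: poisson_cdf(k, mu) - alpha / 2, 0.0, hi)
    return lower, upper


events = 2                      # from the abstract: two insertion/deletion events
exposure_locus_years = 740.0    # HYPOTHETICAL exposure, not stated in the abstract
low, high = exact_poisson_ci(events)
print(f"rate = {events / exposure_locus_years:.2e} per locus per year")
print(f"95% CI = {low / exposure_locus_years:.2e} to {high / exposure_locus_years:.2e}")
```

The interval is strongly asymmetric around the point estimate, which is expected when only two events are observed.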
Affiliation(s)
- Mark N Ragheb
  - Department of Immunology and Infectious Diseases, Harvard School of Public Health, Boston, MA, USA
13
Abstract
The ability to survey polymorphism on a genomic scale has enabled genome-wide scans for the targets of natural selection. Theory that connects patterns of genetic variation to evidence of natural selection most often assumes a diallelic locus and no recurrent mutation. Although these assumptions are suitable to selection that targets single nucleotide variants, fundamentally different types of mutation generate abundant polymorphism in genomes. Moreover, recent empirical results suggest that mutationally complex, multiallelic loci including microsatellites and copy number variants are sometimes targeted by natural selection. Given their abundance, the lack of inference methods tailored to the mutational peculiarities of these types of loci represents a notable gap in our ability to interrogate genomes for signatures of natural selection. Previous theoretical investigations of mutation-selection balance at multiallelic loci include assumptions that limit their application to inference from empirical data. Focusing on microsatellites, we assess the dynamics and population-level consequences of selection targeting mutationally complex variants. We develop general models of a multiallelic fitness surface, a realistic model of microsatellite mutation, and an efficient simulation algorithm. Using these tools, we explore mutation-selection-drift equilibrium at microsatellites and investigate the mutational history and selective regime of the microsatellite that causes Friedreich's ataxia. We characterize microsatellite selective events by their duration and cost, note similarities to sweeps from standing point variation, and conclude that it is premature to label microsatellites as ubiquitous agents of efficient adaptive change. Together, our models and simulation algorithm provide a powerful framework for statistical inference, which can be used to test the neutrality of microsatellites and other multiallelic variants.
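To make the idea of mutation-selection-drift dynamics at a microsatellite concrete, the following is a minimal sketch of a haploid Wright-Fisher simulation in which alleles are repeat counts. It is not the authors' model or algorithm: the strict one-step mutation rule, the Gaussian fitness surface, and every name and parameter value (`fitness`, `mutate`, `optimum`, `width`, `mu`) are simplifying assumptions introduced here for illustration.

```python
import math
import random


def fitness(repeats, optimum=20, width=5.0):
    """Hypothetical Gaussian fitness surface over repeat number."""
    return math.exp(-((repeats - optimum) ** 2) / (2 * width ** 2))


def mutate(repeats, mu, rng, min_repeats=1):
    """Strict stepwise mutation: with probability mu, gain or lose one repeat."""
    if rng.random() < mu:
        repeats += rng.choice((-1, 1))
    return max(min_repeats, repeats)


def next_generation(pop, mu, rng):
    """One Wright-Fisher generation: fitness-weighted sampling, then mutation."""
    weights = [fitness(a) for a in pop]
    parents = rng.choices(pop, weights=weights, k=len(pop))
    return [mutate(a, mu, rng) for a in parents]


def simulate(n=200, generations=500, start=30, mu=1e-3, seed=1):
    """Evolve a population that starts fixed for one allele length."""
    rng = random.Random(seed)
    pop = [start] * n
    for _ in range(generations):
        pop = next_generation(pop, mu, rng)
    return pop


final = simulate()
print("mean repeat number:", sum(final) / len(final))
```

Real microsatellite mutation is more complex than this one-step rule, for example through multistep changes and length-dependent rates, so the sketch should be read as a starting point rather than a realistic model.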
Affiliation(s)
- Ryan J Haasl
  - Laboratory of Genetics, University of Wisconsin, USA