1
Seah YM, Stewart MK, Hoogestraat D, Ryder M, Cookson BT, Salipante SJ, Hoffman NG. In Silico Evaluation of Variant Calling Methods for Bacterial Whole-Genome Sequencing Assays. J Clin Microbiol 2023; 61:e0184222. [PMID: 37428072 PMCID: PMC10446864 DOI: 10.1128/jcm.01842-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 06/18/2023] [Indexed: 07/11/2023] Open
Abstract
Identification and analysis of clinically relevant strains of bacteria increasingly relies on whole-genome sequencing. The downstream bioinformatics steps necessary for calling variants from short-read sequences are well-established but seldom validated against haploid genomes. We devised an in silico workflow to introduce single nucleotide polymorphisms (SNPs) and indels into bacterial reference genomes, and to computationally generate sequencing reads based on the mutated genomes. We then applied the method to Mycobacterium tuberculosis H37Rv, Staphylococcus aureus NCTC 8325, and Klebsiella pneumoniae HS11286, and used the synthetic reads as truth sets for evaluating several popular variant callers. Insertions proved especially challenging for most variant callers to identify correctly, relative to deletions and single nucleotide polymorphisms. With adequate read depth, however, variant callers that use high-quality soft-clipped reads and base mismatches to perform local realignment consistently had the highest precision and recall in identifying insertions and deletions ranging from 1 to 50 bp. The remaining variant callers had lower recall values associated with identification of insertions greater than 20 bp.
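The evaluation idea in this abstract (mutate a reference, keep the introduced variants as a truth set, then score a caller's output by precision and recall) can be illustrated with a minimal Python sketch. The function names `simulate_variants` and `precision_recall`, the VCF-like `(position, ref, alt)` representation, and the spacing rule that keeps variants from overlapping are all assumptions made for illustration; they are not taken from the authors' workflow, and read simulation and the variant callers themselves are outside the sketch.

```python
import random


def simulate_variants(reference, n_snps, n_indels, max_indel=50, seed=0):
    """Introduce SNPs and indels into a reference string (hypothetical helper).

    Returns (mutated_sequence, truth), where truth is a set of
    (0-based position, ref allele, alt allele) tuples in a VCF-like style.
    """
    rng = random.Random(seed)
    # Candidate positions are spaced max_indel + 1 apart so variants cannot overlap.
    candidates = range(0, len(reference) - max_indel, max_indel + 1)
    positions = rng.sample(candidates, n_snps + n_indels)
    truth = set()
    for i, pos in enumerate(positions):
        ref_base = reference[pos]
        if i < n_snps:
            # SNP: substitute a different base.
            alt = rng.choice([b for b in "ACGT" if b != ref_base])
            truth.add((pos, ref_base, alt))
        elif rng.random() < 0.5:
            # Insertion: anchor base followed by 1..max_indel random bases.
            length = rng.randint(1, max_indel)
            inserted = "".join(rng.choice("ACGT") for _ in range(length))
            truth.add((pos, ref_base, ref_base + inserted))
        else:
            # Deletion: remove 1..max_indel bases after the anchor base.
            length = rng.randint(1, max_indel)
            truth.add((pos, reference[pos:pos + 1 + length], ref_base))
    # Apply variants from right to left so earlier coordinates stay valid.
    mutated = reference
    for pos, ref, alt in sorted(truth, reverse=True):
        mutated = mutated[:pos] + alt + mutated[pos + len(ref):]
    return mutated, truth


def precision_recall(truth, called):
    """Compare a call set against the truth set by exact match."""
    tp = len(truth & called)
    fp = len(called - truth)
    fn = len(truth - called)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    reference = "ACGT" * 2500  # toy 10 kb "genome"
    mutated, truth = simulate_variants(reference, n_snps=5, n_indels=5)
    # Pretend a caller missed every insertion; score it against the truth set.
    called = {v for v in truth if len(v[2]) <= len(v[1])}
    print(precision_recall(truth, called))
```

Exact matching of `(position, ref, alt)` tuples is a simplification: in practice, equivalent indels can be reported at different positions in repetitive sequence, so a real comparison would likely normalize variant representations before scoring.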
Affiliation(s)
- Yee Mey Seah
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Mary K. Stewart
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Daniel Hoogestraat
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Molly Ryder
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Brad T. Cookson
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
  - Department of Microbiology, University of Washington, Seattle, Washington, USA
- Stephen J. Salipante
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
- Noah G. Hoffman
  - Department of Laboratory Medicine and Pathology, University of Washington Medical Center, Seattle, Washington, USA
2
Small Insertions and Deletions Drive Genomic Plasticity during Adaptive Evolution of Yersinia pestis. Microbiol Spectr 2022; 10:e0224221. [PMID: 35438532 PMCID: PMC9248902 DOI: 10.1128/spectrum.02242-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
Abstract
The life cycle of Yersinia pestis has changed substantially to adapt to flea-borne transmission since it evolved from an enteric pathogen, Yersinia pseudotuberculosis. Small insertions and deletions (indels), especially frameshift mutations, can have major effects on phenotypes and contribute to virulence and host adaptation through gene disruption and inactivation. Here, we analyzed 365 Y. pestis genomes and identified 2,092 genome-wide indels on the core genome. As recently reported in Mycobacterium tuberculosis, we also detected "indel pockets" in Y. pestis, with average complexity scores declining around indel positions, which we speculate might also exist in other prokaryotes. Phylogenetic analysis showed that an indel-based phylogenetic tree largely reflected the phylogenetic relationships of the major phylogroups in Y. pestis, except for some inconsistency around the Big Bang polytomy. We observed 83 indels arising in the trunk of the phylogeny, which played a role in the accumulation of pseudogenes related to key metabolism and, putatively, pathogenicity. We also discovered 32 homoplasies at the level of phylogroups and 7 frameshift scars (i.e., a disrupted reading frame being rescued by a second frameshift). Additionally, our analysis showed evidence of parallel evolution at the level of genes, with sspA, rpoS, rnd, and YPO0624 having enriched mutations in Brazilian isolates, which might be advantageous for Y. pestis in coping with fluctuating environments. The diversified selection signals observed here demonstrate that indels are important contributors to the adaptive evolution of Y. pestis. We also provide potential targets for further exploration, as some of the genes/pseudogenes with indels that we focus on remain uncharacterized.
IMPORTANCE Yersinia pestis, the causative agent of plague, is a highly pathogenic clone of Yersinia pseudotuberculosis. Previous genome-wide SNP analysis provided few adaptive signatures during its evolution. Here, by investigating 365 public genomes of Y. pestis, we give a comprehensive overview of the general features of genome-wide indels on the core genome and their roles in Y. pestis evolution. Detection of "indel pockets," with average complexity scores declining around indel positions, in both Mycobacterium tuberculosis and Y. pestis suggests that this phenomenon might appear in other bacterial genomes. Importantly, the identification of four different forms of selection signals in indels would improve our understanding of the adaptive evolution of Y. pestis and provide targets for further research into the physiological mechanisms of this pathogen. As evolutionary research based on genome-wide indels is still rare in bacteria, our study may serve as a helpful reference in deciphering the role of indels in other species.
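The notion of a "frameshift scar" (a disrupted reading frame rescued by a second frameshift) comes down to simple arithmetic on indel lengths modulo 3. The sketch below is illustrative only: the helper names `frame_offsets` and `is_frameshift_scar` are hypothetical, and the two-indel criterion is an assumption, since the abstract does not state the detection rules the authors used.

```python
def frame_offsets(indel_lengths):
    """Cumulative reading-frame offset (mod 3) after each indel, in gene order.

    Lengths are signed: positive for insertions, negative for deletions.
    """
    offsets = []
    total = 0
    for delta in indel_lengths:
        total += delta
        offsets.append(total % 3)
    return offsets


def is_frameshift_scar(indel_lengths):
    """True if two frameshifting indels together restore the original frame.

    Assumed criterion: exactly two indels, each one a frameshift on its own
    (length not divisible by 3), with a net length change divisible by 3.
    """
    if len(indel_lengths) != 2:
        return False
    each_is_frameshift = all(delta % 3 != 0 for delta in indel_lengths)
    return each_is_frameshift and frame_offsets(indel_lengths)[-1] == 0


if __name__ == "__main__":
    print(is_frameshift_scar([-1, +1]))  # 1-bp deletion, then 1-bp insertion
    print(is_frameshift_scar([+1, +1]))  # two insertions, frame still shifted
```

Only the codons between the two indels are read out of frame in such a case, which is why the downstream protein sequence can be preserved.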
3
New evaluation methods of read mapping by 17 aligners on simulated and empirical NGS data: an updated comparison of DNA- and RNA-Seq data from Illumina and Ion Torrent technologies. Neural Comput Appl 2021; 33:15669-15692. [PMID: 34155424 PMCID: PMC8208613 DOI: 10.1007/s00521-021-06188-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2020] [Accepted: 06/02/2021] [Indexed: 12/13/2022]
Abstract
During the last 15 years, improved omics sequencing technologies have expanded the scale and resolution of various biological applications, generating high-throughput datasets that require carefully chosen software tools to be processed. Following these sequencing developments, bioinformatics researchers have been challenged to implement alignment algorithms for next-generation sequencing reads. However, the selection of aligners based on genome characteristics remains poorly studied, so our benchmarking study extended the state of the art by comparing 17 different aligners. The chosen tools were assessed on empirical human DNA- and RNA-Seq data, as well as on simulated human and mouse datasets, evaluating a set of parameters not previously considered in this kind of benchmark. As expected, we found that each tool performed best under specific conditions. For Ion Torrent single-end RNA-Seq samples, the most suitable aligners were CLC and BWA-MEM, which achieved the best results in terms of efficiency, accuracy, duplication rate, saturation profile and running time. For Illumina paired-end osteomyelitis transcriptomics data, in contrast, the best-performing aligner, together with the aforementioned CLC, was Novoalign, which excelled in the accuracy and saturation analyses. Segemehl and DNASTAR performed best on both DNA-Seq datasets, with Segemehl particularly suitable for exome data. In conclusion, our study could guide users in selecting a suitable aligner based on genome and transcriptome characteristics. However, several other aspects that emerged from our work should be considered as alignment research evolves, such as the use of artificial intelligence to support cloud computing and mapping to multiple genomes.
4
Steglich M, Hofmann JD, Helmecke J, Sikorski J, Spröer C, Riedel T, Bunk B, Overmann J, Neumann-Schaal M, Nübel U. Convergent Loss of ABC Transporter Genes From Clostridioides difficile Genomes Is Associated With Impaired Tyrosine Uptake and p-Cresol Production. Front Microbiol 2018; 9:901. [PMID: 29867812 PMCID: PMC5951980 DOI: 10.3389/fmicb.2018.00901] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 02/05/2018] [Accepted: 04/18/2018] [Indexed: 11/13/2022] Open
Abstract
We report the frequent, convergent loss of two genes encoding the substrate-binding protein and the ATP-binding protein of an ATP-binding cassette (ABC) transporter from the genomes of unrelated Clostridioides difficile strains. This specific genomic deletion was strongly associated with reduced uptake of tyrosine and phenylalanine and reduced production of derived Stickland fermentation products, including p-cresol, suggesting that the affected ABC transporter had been responsible for the import of aromatic amino acids. In contrast, the transporter gene loss did not measurably affect bacterial growth or production of enterotoxins. Phylogenomic analysis of publicly available genome sequences indicated that this transporter gene deletion had occurred multiple times in diverse clonal lineages of C. difficile, with a particularly high prevalence in ribotype 027 isolates, where 48 of 195 genomes (25%) were affected. The transporter gene deletion was likely facilitated by the repetitive structure of its genomic location. While at least some of the observed transporter gene deletions are likely to have occurred during the natural life cycle of C. difficile, we also provide evidence for the emergence of this mutation during long-term laboratory cultivation of reference strain R20291.
Affiliation(s)
- Matthias Steglich
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Julia D Hofmann
  - Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Julia Helmecke
  - Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Johannes Sikorski
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Cathrin Spröer
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
- Thomas Riedel
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Boyke Bunk
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
- Jörg Overmann
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Meina Neumann-Schaal
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
- Ulrich Nübel
  - Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures, Braunschweig, Germany
  - German Center for Infection Research (DZIF), Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), Braunschweig, Germany
5
Leung KSS, Siu GKH, Tam KKG, To SWC, Rajwani R, Ho PL, Wong SSY, Zhao WW, Ma OCK, Yam WC. Comparative Genomic Analysis of Two Clonally Related Multidrug Resistant Mycobacterium tuberculosis by Single Molecule Real Time Sequencing. Front Cell Infect Microbiol 2017; 7:478. [PMID: 29188195 PMCID: PMC5694780 DOI: 10.3389/fcimb.2017.00478] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 09/07/2017] [Accepted: 10/31/2017] [Indexed: 12/02/2022] Open
Abstract
Background: Multidrug-resistant tuberculosis (MDR-TB) poses a major threat to global TB control. In this study, we focused on two consecutive MDR-TB isolates obtained from the same patient before and after the initiation of anti-TB treatment. To better understand the genomic characteristics of MDR-TB, Single Molecule Real-Time (SMRT) Sequencing and comparative genomic analyses were performed to identify mutations that contributed to the stepwise development of drug resistance and growth fitness in MDR-TB under in vivo challenge with anti-TB drugs. Results: Both the pre-treatment and post-treatment strains demonstrated concordant phenotypic and genotypic susceptibility profiles toward rifampicin, pyrazinamide, streptomycin, fluoroquinolones, aminoglycosides, cycloserine, ethionamide, and para-aminosalicylic acid. However, although both strains carried identical missense mutations at rpoB S531L, inhA C-15T, and embB M306V, the MYCOTB Sensititre assay showed that the post-treatment strain had 16-, 8-, and 4-fold elevations in the minimum inhibitory concentrations (MICs) of rifabutin, isoniazid, and ethambutol, respectively. These results indicated the presence of additional resistance-related mutations governing the stepwise development of MDR-TB. Further comparative genomic analyses identified three additional polymorphisms between the clinical isolates. These include a single nucleotide deletion at nucleotide position 360 of rv0888 in the pre-treatment strain, and a missense mutation at rv3303c (lpdA) V44I and a 6-bp in-frame deletion at codons 67-68 in rv2071c (cobM) in the post-treatment strain. Multiple sequence alignment showed that these mutations occurred in regions highly conserved among pathogenic mycobacteria. Using structure-based and sequence-based algorithms, we further predicted that the mutations potentially have deleterious effects on protein function.
Conclusion: This is the first study to compare the full genomes of two clonally related MDR-TB clinical isolates during the course of anti-TB treatment. Our work has demonstrated the robustness of SMRT Sequencing in identifying mutations among MDR-TB clinical isolates. Comparative genome analysis also suggested novel mutations at rv0888, lpdA, and cobM that might explain the difference in antibiotic resistance and growth pattern between the two MDR-TB strains.
Affiliation(s)
- Kenneth Siu-Sing Leung
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong
- Gilman Kit-Hang Siu
  - Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, Hong Kong
- Kingsley King-Gee Tam
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong
- Sabrina Wai-Chi To
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong
- Rahim Rajwani
  - Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong, Hong Kong
- Pak-Leung Ho
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong
- Samson Sai-Yin Wong
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong
- Wei W. Zhao
  - KingMed Diagnostics, Science Park, Hong Kong, Hong Kong
- Wing-Cheong Yam
  - Department of Microbiology, Queen Mary Hospital, The University of Hong Kong, Hong Kong, Hong Kong