1. Deb S, Basu J, Choudhary M. An overview of next generation sequencing strategies and genomics tools used for tuberculosis research. J Appl Microbiol 2024; 135:lxae174. [PMID: 39003248 DOI: 10.1093/jambio/lxae174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 06/07/2024] [Accepted: 07/10/2024] [Indexed: 07/15/2024]
Abstract
Tuberculosis (TB) is a grave public health concern and is considered the foremost contributor to human mortality resulting from infectious disease. Because Mycobacterium tuberculosis (M.tb) populations are strictly clonal and have extremely restricted genomic diversity, conventional methods prove inefficient for in-depth exploration of their minor genomic variations and evolutionary dynamics. Until now, the majority of reviews have primarily focused on delineating the application of whole-genome sequencing (WGS) in predicting antibiotic resistance genes, surveillance of drug-resistant strains, and M.tb lineage classifications. Despite the growing use of next generation sequencing (NGS) and WGS analysis in TB research, there are limited studies that provide a comprehensive summary of their role in studying macroevolution, minor genetic variations, assessing mixed TB infections, and tracking transmission networks at an individual level. This highlights the need for a systematic effort to fully explore the potential of WGS and its associated tools in advancing our understanding of TB epidemiology and disease transmission. We delve into the recent bioinformatics pipelines and NGS strategies that leverage various genetic features and simultaneous exploration of host-pathogen protein expression profiles to decipher the genetic heterogeneity and host-pathogen interaction dynamics of M.tb infections. This review highlights the potential benefits and limitations of NGS and bioinformatics tools and discusses their role in TB detection and epidemiology. Overall, this review could be a valuable resource for researchers and clinicians interested in NGS-based approaches in TB research.
Affiliation(s)
- Sushanta Deb
  - Department of Veterinary Microbiology and Pathology, College of Veterinary Medicine, Washington State University, Pullman 99164, WA, United States
  - All India Institute of Medical Sciences, New Delhi 110029, India
- Jhinuk Basu
  - Department of Clinical Immunology and Rheumatology, Kalinga Institute of Medical Sciences (KIMS), KIIT University, Bhubaneswar 751024, India
- Megha Choudhary
  - All India Institute of Medical Sciences, New Delhi 110029, India
2. Godin MJ, Sebastian A, Albert I, Lindner SE. Long-Read Genome Assembly and Gene Model Annotations for the Rodent Malaria Parasite Plasmodium yoelii 17XNL. J Biol Chem 2023:104871. [PMID: 37247760 PMCID: PMC10320607 DOI: 10.1016/j.jbc.2023.104871] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/11/2023] [Revised: 05/20/2023] [Accepted: 05/22/2023] [Indexed: 05/31/2023]
Abstract
Malaria causes over 600 thousand fatalities each year, with most cases attributed to the human-infectious Plasmodium falciparum species. Many rodent-infectious Plasmodium species, like Plasmodium berghei and Plasmodium yoelii, have been used as model species that can expedite studies of this pathogen. P. yoelii is an especially good model for investigating the mosquito and liver stages of parasite development because key attributes closely resemble those of P. falciparum. Because of its importance, in 2002 the 17XNL strain of P. yoelii was the first rodent malaria parasite to be sequenced. While a breakthrough effort, the assembly consisted of >5000 contiguous sequences that adversely impacted the annotated gene models. While other rodent malaria parasite genomes have been sequenced and annotated since then, including the related P. yoelii 17X strain, the 17XNL strain has not. As a result, genomic data for 17X has become the de facto reference genome for the 17XNL strain while leaving open questions surrounding possible differences between the 17XNL and 17X genomes. In this work, we present a high-quality genome assembly for P. yoelii 17XNL using PacBio DNA sequencing. In addition, we use Nanopore and Illumina RNA sequencing of mixed blood stages to create complete gene models that include coding sequences, alternate isoforms, and UTR designations. A comparison of the 17X and this new 17XNL assembly revealed biologically meaningful differences between the strains due to the presence of coding sequence variants. Taken together, our work provides a new genomic framework for studies with this commonly used rodent malaria model species.
Affiliation(s)
- Mitchell J Godin
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
- Aswathy Sebastian
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802
- Istvan Albert
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802
- Scott E Lindner
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
3. Godin MJ, Sebastian A, Albert I, Lindner SE. Long-Read Genome Assembly and Gene Model Annotations for the Rodent Malaria Parasite Plasmodium yoelii 17XNL. bioRxiv 2023:2023.01.06.523040. [PMID: 36711553 PMCID: PMC9882011 DOI: 10.1101/2023.01.06.523040] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/03/2023]
Abstract
Malaria causes over 200 million infections and over 600 thousand fatalities each year, with most cases attributed to a human-infectious Plasmodium species, Plasmodium falciparum. Many rodent-infectious Plasmodium species, like Plasmodium berghei, Plasmodium chabaudi, and Plasmodium yoelii, have been used as genetically tractable model species that can expedite studies of this pathogen. In particular, P. yoelii is an especially good model for investigating the mosquito and liver stages of parasite development because key attributes closely resemble those of P. falciparum. Because of its importance to malaria research, in 2002 the 17XNL strain of P. yoelii was the first rodent malaria parasite to be sequenced. While sequencing and assembling this genome was a breakthrough effort, the final assembly consisted of >5000 contiguous sequences that impacted the creation of annotated gene models. While other important rodent malaria parasite genomes have been sequenced and annotated since then, including the related P. yoelii 17X strain, the 17XNL strain has not. As a result, genomic data for 17X has become the de facto reference genome for the 17XNL strain while leaving open questions surrounding possible differences between the 17XNL and 17X genomes. In this work, we present a high-quality genome assembly for P. yoelii 17XNL using HiFi PacBio long-read DNA sequencing. In addition, we use Nanopore long-read direct RNA-seq and Illumina short-read sequencing of mixed blood stages to create complete gene models that include not only coding sequences but also alternate transcript isoforms and 5' and 3' UTR designations. A comparison of the 17X and this new 17XNL assembly revealed biologically meaningful differences between the strains due to the presence of coding sequence variants. Taken together, our work provides a new genomic and gene expression framework for studies with this commonly used rodent malaria model species.
Affiliation(s)
- Mitchell J. Godin
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
- Aswathy Sebastian
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802
- Istvan Albert
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
  - Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16802
- Scott E. Lindner
  - Department of Biochemistry and Molecular Biology, The Huck Center for Malaria Research, The Center for Eukaryotic Gene Regulation, Pennsylvania State University, University Park, PA, 16802
4. Gu H, Xie R, Adam DC, Tsui JLH, Chu DK, Chang LDJ, Cheuk SSY, Gurung S, Krishnan P, Ng DYM, Liu GYZ, Wan CKC, Cheng SSM, Edwards KM, Leung KSM, Wu JT, Tsang DNC, Leung GM, Cowling BJ, Peiris M, Lam TTY, Dhanasekaran V, Poon LLM. Genomic epidemiology of SARS-CoV-2 under an elimination strategy in Hong Kong. Nat Commun 2022; 13:736. [PMID: 35136039 PMCID: PMC8825829 DOI: 10.1038/s41467-022-28420-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 07/12/2021] [Accepted: 01/19/2022] [Indexed: 12/15/2022]
Abstract
Hong Kong employed a strategy of intermittent public health and social measures alongside increasingly stringent travel regulations to eliminate domestic SARS-CoV-2 transmission. By analyzing 1899 genome sequences (>18% of confirmed cases) from 23-January-2020 to 26-January-2021, we reveal the effects of fluctuating control measures on the evolution and epidemiology of SARS-CoV-2 lineages in Hong Kong. Despite numerous importations, only three introductions were responsible for 90% of locally-acquired cases. Community outbreaks were caused by novel introductions rather than a resurgence of circulating strains. Thus, local outbreak prevention requires strong border control and community surveillance, especially during periods of less stringent social restriction. Non-adherence to prolonged preventative measures may explain sustained local transmission observed during wave four in late 2020 and early 2021. We also found that, due to a tight transmission bottleneck, transmission of low-frequency single nucleotide variants between hosts is rare.
Affiliation(s)
- Haogao Gu
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Ruopeng Xie
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - HKU-Pasteur Research Pole, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Dillon C Adam
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Joseph L-H Tsui
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Daniel K Chu
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Lydia D J Chang
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Sammi S Y Cheuk
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Shreya Gurung
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Pavithra Krishnan
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Daisy Y M Ng
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Gigi Y Z Liu
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Carrie K C Wan
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Samuel S M Cheng
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Kimberly M Edwards
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - HKU-Pasteur Research Pole, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Kathy S M Leung
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Laboratory of Data Discovery for Health, Hong Kong Science and Technology Park, Hong Kong, China
- Joseph T Wu
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Laboratory of Data Discovery for Health, Hong Kong Science and Technology Park, Hong Kong, China
- Dominic N C Tsang
  - Centre for Health Protection, Department of Health, The Government of Hong Kong Special Administrative Region, Hong Kong, China
- Gabriel M Leung
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Laboratory of Data Discovery for Health, Hong Kong Science and Technology Park, Hong Kong, China
- Benjamin J Cowling
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Laboratory of Data Discovery for Health, Hong Kong Science and Technology Park, Hong Kong, China
- Malik Peiris
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - HKU-Pasteur Research Pole, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Centre for Immunology & Infection, Hong Kong Science and Technology Park, Hong Kong, China
- Tommy T Y Lam
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Laboratory of Data Discovery for Health, Hong Kong Science and Technology Park, Hong Kong, China
  - Centre for Immunology & Infection, Hong Kong Science and Technology Park, Hong Kong, China
- Vijaykrishna Dhanasekaran
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - HKU-Pasteur Research Pole, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Leo L M Poon
  - School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - HKU-Pasteur Research Pole, School of Public Health, LKS Faculty of Medicine, The University of Hong Kong, Hong Kong, China
  - Centre for Immunology & Infection, Hong Kong Science and Technology Park, Hong Kong, China
5. Flageul A, Lucas P, Hirchaud E, Touzain F, Blanchard Y, Eterradossi N, Brown P, Grasland B. Viral variant visualizer (VVV): A novel bioinformatic tool for rapid and simple visualization of viral genetic diversity. Virus Res 2020; 291:198201. [PMID: 33080244 DOI: 10.1016/j.virusres.2020.198201] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/15/2020] [Revised: 09/13/2020] [Accepted: 10/15/2020] [Indexed: 12/15/2022]
Abstract
Here, a bioinformatic pipeline, VVV, has been developed to analyse viral populations in a given sample from Next Generation Sequencing (NGS) data. To date, handling large amounts of data from NGS requires the expertise of bioinformaticians, both for data processing and result analysis. Consequently, VVV was designed to help non-bioinformaticians to perform these tasks. By providing only the NGS data file, the developed pipeline generated consensus sequences and determined the composition of the viral population for an avian Metapneumovirus (AMPV) and three different animal coronaviruses (Porcine Epidemic Diarrhea Virus (PEDV), Turkey Coronavirus (TCoV) and Infectious Bronchitis Virus (IBV)). In all cases, the pipeline produced viral consensus genomes corresponding to the known consensus sequences and made it possible to highlight the presence of viral genetic variants through a single graphic representation. The method was validated by comparing the viral populations of an AMPV field sample and of a copy of this virus produced from a DNA clone. VVV demonstrated that the cloned virus population was homogeneous (as designed) at position 2934, where the wild-type virus demonstrated two variant populations at a ratio of almost 50:50. A total of 18, 10, 3 and 28 viral genetic variants were detected for AMPV, PEDV, TCoV and IBV, respectively. The simplicity of this pipeline makes the study of viral genetic variants more accessible to a wide variety of biologists, which should ultimately increase the rate of understanding of the mechanisms of viral genetic evolution.
Affiliation(s)
- Alexandre Flageul
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES) Laboratory of Ploufragan-Plouzané-Niort, Virology, Immunology and Parasitology in Poultry and Rabbit (VIPAC) Unit, Université Bretagne Loire (UBL), France
- Pierrick Lucas
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES), Laboratory of Ploufragan-Plouzané-Niort, Viral Genetic and Biosafety (GVB) Unit, France
- Edouard Hirchaud
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES), Laboratory of Ploufragan-Plouzané-Niort, Viral Genetic and Biosafety (GVB) Unit, France
- Fabrice Touzain
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES), Laboratory of Ploufragan-Plouzané-Niort, Viral Genetic and Biosafety (GVB) Unit, France
- Yannick Blanchard
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES), Laboratory of Ploufragan-Plouzané-Niort, Viral Genetic and Biosafety (GVB) Unit, France
- Nicolas Eterradossi
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES) Laboratory of Ploufragan-Plouzané-Niort, Virology, Immunology and Parasitology in Poultry and Rabbit (VIPAC) Unit, Université Bretagne Loire (UBL), France
- Paul Brown
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES) Laboratory of Ploufragan-Plouzané-Niort, Virology, Immunology and Parasitology in Poultry and Rabbit (VIPAC) Unit, Université Bretagne Loire (UBL), France
- Béatrice Grasland
  - Agence National de Sécurité Sanitaire, de l'environnement et du travail (ANSES) Laboratory of Ploufragan-Plouzané-Niort, Virology, Immunology and Parasitology in Poultry and Rabbit (VIPAC) Unit, Université Bretagne Loire (UBL), France
6. Deng ZL, Dhingra A, Fritz A, Götting J, Münch PC, Steinbrück L, Schulz TF, Ganzenmüller T, McHardy AC. Evaluating assembly and variant calling software for strain-resolved analysis of large DNA viruses. Brief Bioinform 2020; 22:5868070. [PMID: 34020538 PMCID: PMC8138829 DOI: 10.1093/bib/bbaa123] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 12/21/2019] [Revised: 05/18/2020] [Accepted: 05/19/2020] [Indexed: 02/06/2023]
Abstract
Infection with human cytomegalovirus (HCMV) can cause severe complications in immunocompromised individuals and congenitally infected children. Characterizing heterogeneous viral populations and their evolution by high-throughput sequencing of clinical specimens requires the accurate assembly of individual strains or sequence variants and suitable variant calling methods. However, the performance of most methods has not been assessed for populations composed of low divergent viral strains with large genomes, such as HCMV. In an extensive benchmarking study, we evaluated 15 assemblers and 6 variant callers on 10 lab-generated benchmark data sets created with two different library preparation protocols, to identify best practices and challenges for analyzing such data. Most assemblers, especially metaSPAdes and IVA, performed well across a range of metrics in recovering abundant strains. However, only one, Savage, recovered low abundant strains and in a highly fragmented manner. Two variant callers, LoFreq and VarScan2, excelled across all strain abundances. Both shared a large fraction of false positive variant calls, which were strongly enriched in T to G changes in a 'G.G' context. The magnitude of this context-dependent systematic error is linked to the experimental protocol. We provide all benchmarking data, results and the entire benchmarking workflow named QuasiModo, Quasispecies Metric determination on omics, under the GNU General Public License v3.0 (https://github.com/hzi-bifo/Quasimodo), to enable full reproducibility and further benchmarking on these and other data.
Affiliation(s)
- Zhi-Luo Deng
  - Department Computational Biology of Infection Research of the Helmholtz Centre for Infection Research
- Adrian Fritz
  - Department Computational Biology of Infection Research of the Helmholtz Centre for Infection Research
- Philipp C Münch
  - Department Computational Biology of Infection Research of the Helmholtz Centre for Infection Research and Max von Pettenkofer Institute in Ludwig Maximilian University of Munich
- Alice C McHardy
  - Department Computational Biology of Infection Research of the Helmholtz Centre for Infection Research