1
Child HT, Wierzbicki L, Joslin GR, Tennant RK. Comparative evaluation of soil DNA extraction kits for long read metagenomic sequencing. Access Microbiol 2024; 6:000868.v3. [PMID: 39346682 PMCID: PMC11432601 DOI: 10.1099/acmi.0.000868.v3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 09/12/2024] [Indexed: 10/01/2024]
Abstract
Metagenomics has been transformative in our understanding of the diversity and function of soil microbial communities. Applying long read sequencing to whole genome shotgun metagenomics has the potential to revolutionise soil microbial ecology through improved taxonomic classification, functional characterisation and metagenome assembly. However, optimisation of robust methods for long read metagenomics of environmental samples remains undeveloped. In this study, Oxford Nanopore sequencing using samples from five commercially available soil DNA extraction kits was compared across four soil types, in order to optimise read length and reproducibility for comparative long read soil metagenomics. Average extracted DNA lengths varied considerably between kits, but longer DNA fragments did not translate consistently into read lengths. Highly variable decreases in the length of resulting reads from some kits were associated with poor classification rate and low reproducibility in microbial communities identified between technical repeats. Replicate samples from other kits showed more consistent conversion of extracted DNA fragment size into read length and resulted in more congruous microbial community representation. Furthermore, extraction kits showed significant differences in the community representation and structure they identified across all soil types. Overall, the QIAGEN DNeasy PowerSoil Pro Kit displayed the best suitability for reproducible long-read WGS metagenomic sequencing, although further optimisation of DNA purification and library preparation may enable translation of higher molecular weight DNA from other kits into longer read lengths. These findings provide a novel insight into the importance of optimising DNA extraction for achieving replicable results from long read metagenomic sequencing of environmental samples.
Affiliation(s)
- Harry T. Child
  - Geography, Faculty of Environment, Science and Economy, University of Exeter, Amory Building, Rennes Drive, Exeter, Devon, EX4 4RJ, UK
- Lucy Wierzbicki
  - Geography, Faculty of Environment, Science and Economy, University of Exeter, Amory Building, Rennes Drive, Exeter, Devon, EX4 4RJ, UK
- Gabrielle R. Joslin
  - Geography, Faculty of Environment, Science and Economy, University of Exeter, Amory Building, Rennes Drive, Exeter, Devon, EX4 4RJ, UK
- Richard K. Tennant
  - Geography, Faculty of Environment, Science and Economy, University of Exeter, Amory Building, Rennes Drive, Exeter, Devon, EX4 4RJ, UK
2
Dommann J, Kerbl-Knapp J, Albertos Torres D, Egli A, Keiser J, Schneeberger PHH. A novel barcoded nanopore sequencing workflow of high-quality, full-length bacterial 16S amplicons for taxonomic annotation of bacterial isolates and complex microbial communities. mSystems 2024:e0085924. [PMID: 39254034 DOI: 10.1128/msystems.00859-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2024] [Accepted: 08/19/2024] [Indexed: 09/11/2024]
Abstract
Due to recent improvements, Nanopore sequencing has become a promising method for experiments relying on amplicon sequencing. We describe a flexible workflow to generate and annotate high-quality, full-length 16S rDNA amplicons. We evaluated it for two applications, namely, (i) identification of bacterial isolates and (ii) species-level profiling of microbial communities. We assessed the identification of single bacterial isolates by sequencing, using a set of barcoded full-length 16S rRNA gene primer pairs (pair A), on 47 isolates encompassing multiple genera and compared those results with matrix-assisted laser desorption ionization-time of flight mass spectrometry (MALDI-TOF MS)-based identification. Species-level community profiling was tested with two sets of barcoded full-length 16S primer pairs (A and B) and compared to the results obtained with shotgun Illumina sequencing using 27 stool samples. We developed a Nextflow pipeline to retain high-quality reads and taxonomically annotate them. We found high agreement between our workflow and MALDI-TOF data for isolate identification (positive predictive value = 0.90, Cramér's V = 0.857, and Theil's U = 0.316). For species-level community profiling, we found strong correlations (rs > 0.6) of alpha diversity indices between the two primer sets and Illumina sequencing. At the community level, we found significant but small differences when comparing sequencing techniques. Finally, we found a moderate to strong correlation when comparing the relative abundances of individual species (average rs = 0.6 and 0.533 for primers A and B). 
Despite identified shortcomings, the proposed workflow enabled accurate identification of single bacterial isolates and prominent features in microbial communities, making it a worthwhile alternative to MALDI-TOF MS and Illumina sequencing.
IMPORTANCE: A quick, robust, simple, and cost-effective method to identify bacterial isolates and communities in each sample is indispensable in the fields of microbiology and infection biology. Recent technological advances in Oxford Nanopore Technologies sequencing make this technique an attractive option considering the adaptability, portability, and cost-effectiveness of the platform, even with small sequencing batches. Here, we validated a flexible workflow to identify bacterial isolates and characterize bacterial communities using the Oxford Nanopore Technologies sequencing platform combined with the most recent v14 chemistry kits. For bacterial isolates, we compared our nanopore-based approach to matrix-assisted laser desorption ionization-time of flight mass spectrometry-based identification. For species-level profiling of complex bacterial communities, we compared our nanopore-based approach to Illumina shotgun sequencing. For reproducibility purposes, we wrapped the code used to process the sequencing data into a ready-to-use and self-contained Nextflow pipeline.
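The rank correlations (rs) quoted in this abstract are Spearman coefficients. As a minimal sketch of how such a value is obtained, the Python below ranks two abundance vectors (averaging tied ranks) and takes the Pearson correlation of the ranks. The function names and the toy abundances are illustrative assumptions, not taken from the authors' Nextflow pipeline.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def spearman(x, y):
    """Spearman's rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical relative abundances of four species from two methods.
nanopore_16s = [0.40, 0.30, 0.20, 0.10]
illumina_shotgun = [0.35, 0.25, 0.30, 0.10]
rs = spearman(nanopore_16s, illumina_shotgun)
```

In practice a library routine such as `scipy.stats.spearmanr` would normally be used; the hand-written version only makes the ranking step explicit.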
Affiliation(s)
- Julian Dommann
  - Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Jakob Kerbl-Knapp
  - Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Diana Albertos Torres
  - Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
  - Clinical Bacteriology and Mycology, University Hospital Basel, Basel, Switzerland
  - Applied Microbiology Research, Department of Biomedicine, University of Basel, Basel, Switzerland
- Adrian Egli
  - Institute of Medical Microbiology, University of Zurich, Zurich, Switzerland
  - Clinical Bacteriology and Mycology, University Hospital Basel, Basel, Switzerland
  - Applied Microbiology Research, Department of Biomedicine, University of Basel, Basel, Switzerland
- Jennifer Keiser
  - Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Pierre H H Schneeberger
  - Department of Medical Parasitology and Infection Biology, Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
3
Van Uffelen A, Posadas A, Roosens NHC, Marchal K, De Keersmaecker SCJ, Vanneste K. Benchmarking bacterial taxonomic classification using nanopore metagenomics data of several mock communities. Sci Data 2024; 11:864. [PMID: 39127718 PMCID: PMC11316826 DOI: 10.1038/s41597-024-03672-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 07/22/2024] [Indexed: 08/12/2024]
Abstract
Taxonomic classification is crucial in identifying organisms within diverse microbial communities when using metagenomics shotgun sequencing. While second-generation Illumina sequencing still dominates, third-generation nanopore sequencing promises improved classification through longer reads. However, extensive benchmarking studies on nanopore data are lacking. We systematically evaluated performance of bacterial taxonomic classification for metagenomics nanopore sequencing data for several commonly used classifiers, using standardized reference sequence databases, on the largest collection of publicly available data for defined mock communities thus far (nine samples), representing different research domains and application scopes. Our results categorize classifiers into three categories: low precision/high recall; medium precision/medium recall, and high precision/medium recall. Most fall into the first group, although precision can be improved without excessively penalizing recall with suitable abundance filtering. No definitive 'best' classifier emerges, and classifier selection depends on application scope and practical requirements. Although few classifiers designed for long reads exist, they generally exhibit better performance. Our comprehensive benchmarking provides concrete recommendations, supported by publicly available code for reassessment and fine-tuning by other scientists.
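The abstract's point that abundance filtering can raise precision without heavily penalizing recall can be illustrated with a toy mock community. The sketch below is an assumption-laden example rather than the authors' published code: the read counts, the 1% threshold, and the function names are all hypothetical.

```python
def precision_recall(predicted, truth):
    """Precision and recall of a predicted taxon set against a known mock community."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall

def filter_by_abundance(read_counts, min_fraction):
    """Keep taxa whose share of classified reads is at least min_fraction."""
    total = sum(read_counts.values())
    if total == 0:
        return set()
    return {taxon for taxon, count in read_counts.items()
            if count / total >= min_fraction}

# Hypothetical classifier output (reads per species) for a three-species mock community.
read_counts = {
    "Escherichia coli": 5000,
    "Bacillus subtilis": 3000,
    "Listeria monocytogenes": 1900,
    "Shigella flexneri": 60,   # low-abundance false positive
    "Bacillus cereus": 40,     # low-abundance false positive
}
mock_truth = {"Escherichia coli", "Bacillus subtilis", "Listeria monocytogenes"}

unfiltered = precision_recall(read_counts.keys(), mock_truth)
filtered = precision_recall(filter_by_abundance(read_counts, 0.01), mock_truth)
```

In this toy case the filter removes only the two spurious low-abundance calls, so precision rises from 0.6 to 1.0 while recall stays at 1.0. With real data a threshold set too high would also discard genuine rare taxa and lower recall.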
Affiliation(s)
- Alexander Van Uffelen
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Andrés Posadas
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Nancy H C Roosens
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
- Kathleen Marchal
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Department of Genetics, University of Pretoria, Pretoria, South Africa
- Kevin Vanneste
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
4
Tian Q, Zhang P, Zhai Y, Wang Y, Zou Q. Application and Comparison of Machine Learning and Database-Based Methods in Taxonomic Classification of High-Throughput Sequencing Data. Genome Biol Evol 2024; 16:evae102. [PMID: 38748485 PMCID: PMC11135637 DOI: 10.1093/gbe/evae102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/12/2024] [Indexed: 05/30/2024]
Abstract
The advent of high-throughput sequencing technologies has not only revolutionized the field of bioinformatics but has also heightened the demand for efficient taxonomic classification. Despite technological advancements, efficiently processing and analyzing the deluge of sequencing data for precise taxonomic classification remains a formidable challenge. Existing classification approaches primarily fall into two categories, database-based methods and machine learning methods, each presenting its own set of challenges and advantages. On this basis, the aim of our study was to conduct a comparative analysis between these two methods while also investigating the merits of integrating multiple database-based methods. Through an in-depth comparative study, we evaluated the performance of both methodological categories in taxonomic classification by utilizing simulated data sets. Our analysis revealed that database-based methods excel in classification accuracy when backed by a rich and comprehensive reference database. Conversely, while machine learning methods show superior performance in scenarios where reference sequences are sparse or lacking, they generally show inferior performance compared with database methods under most conditions. Moreover, our study confirms that integrating multiple database-based methods does, in fact, enhance classification accuracy. These findings shed new light on the taxonomic classification of high-throughput sequencing data and bear substantial implications for the future development of computational biology. For those interested in further exploring our methods, the source code of this study is publicly available on https://github.com/LoadStar822/Genome-Classifier-Performance-Evaluator. Additionally, a dedicated webpage showcasing our collected database, data sets, and various classification software can be found at http://lab.malab.cn/~tqz/project/taxonomic/.
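The abstract reports that integrating multiple database-based methods improves accuracy but does not say how the integration is done. One generic way to combine per-read calls is a majority vote, sketched below. This is an illustrative assumption, not the scheme implemented in the authors' repository; the function name, the `min_votes` parameter, and the example labels are hypothetical.

```python
from collections import Counter

def majority_vote(calls, min_votes=2):
    """Combine per-read taxon calls from several classifiers.

    calls: one label per classifier, with None meaning "unclassified".
    Returns the most frequent label if it has at least min_votes votes
    and is not tied for first place; otherwise returns None.
    """
    votes = Counter(call for call in calls if call is not None)
    if not votes:
        return None
    top_count = max(votes.values())
    leaders = [taxon for taxon, count in votes.items() if count == top_count]
    if top_count < min_votes or len(leaders) > 1:
        return None
    return leaders[0]

# Hypothetical calls for single reads from three database-based tools.
agreed = majority_vote(["Escherichia coli", "Escherichia coli", "Salmonella enterica"])
no_consensus = majority_vote(["Escherichia coli", "Salmonella enterica", None])
```

Requiring agreement trades sensitivity for accuracy: reads on which the tools disagree are left unclassified rather than assigned to a possibly wrong taxon.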
Affiliation(s)
- Qinzhong Tian
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
- Pinglu Zhang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
- Yixiao Zhai
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
- Yansu Wang
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324003, China
5
Gand M, Navickaite I, Bartsch LJ, Grützke J, Overballe-Petersen S, Rasmussen A, Otani S, Michelacci V, Matamoros BR, González-Zorn B, Brouwer MSM, Di Marcantonio L, Bloemen B, Vanneste K, Roosens NHCJ, AbuOun M, De Keersmaecker SCJ. Towards facilitated interpretation of shotgun metagenomics long-read sequencing data analyzed with KMA for the detection of bacterial pathogens and their antimicrobial resistance genes. Front Microbiol 2024; 15:1336532. [PMID: 38659981 PMCID: PMC11042533 DOI: 10.3389/fmicb.2024.1336532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 02/29/2024] [Indexed: 04/26/2024]
Abstract
Metagenomic sequencing is a promising method that has the potential to revolutionize the world of pathogen detection and antimicrobial resistance (AMR) surveillance in food-producing environments. However, the analysis of the huge amount of data obtained requires performant bioinformatics tools and databases, with intuitive and straightforward interpretation. In this study, based on long-read metagenomics data of chicken fecal samples with a spike-in mock community, we proposed confidence levels for taxonomic identification and AMR gene detection, with interpretation guidelines, to help with the analysis of the output data generated by KMA, a popular k-mer read alignment tool. Additionally, we demonstrated that the completeness and diversity of the genomes present in the reference databases are key parameters for accurate and easy interpretation of the sequencing data. Finally, we explored whether KMA, in a two-step procedure, can be used to link the detected AMR genes to their bacterial host chromosome, both detected within the same long-reads. The confidence levels were successfully tested on 28 metagenomics datasets which were obtained with sequencing of real and spiked samples from fecal (chicken, pig, and buffalo) or food (minced beef and food enzyme products) origin. The methodology proposed in this study will facilitate the analysis of metagenomics sequencing datasets for KMA users. Ultimately, this will contribute to improvements in the rapid diagnosis and surveillance of pathogens and AMR genes in food-producing environments, as prioritized by the EU.
Affiliation(s)
- Mathieu Gand
  - Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Indre Navickaite
  - Department of Bacteriology, Animal and Plant Health Agency, Weybridge, United Kingdom
- Lee-Julia Bartsch
  - Department of Biological Safety, German Federal Institute for Risk Assessment, Berlin, Germany
- Josephine Grützke
  - Department of Biological Safety, German Federal Institute for Risk Assessment, Berlin, Germany
- Astrid Rasmussen
  - Bacterial Reference Center, Statens Serum Institute, Copenhagen, Denmark
- Saria Otani
  - National Food Institute, Technical University of Denmark, Kongens Lyngby, Denmark
- Valeria Michelacci
  - Department of Food Safety, Nutrition and Veterinary Public Health, Istituto Superiore di Sanità, Rome, Italy
- Bruno González-Zorn
  - Department of Animal Health, Complutense University of Madrid, Madrid, Spain
- Michael S. M. Brouwer
  - Wageningen Bioveterinary Research Part of Wageningen University and Research, Lelystad, Netherlands
- Lisa Di Marcantonio
  - Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise “G. Caporale”, Teramo, Italy
- Bram Bloemen
  - Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Kevin Vanneste
  - Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Manal AbuOun
  - Department of Bacteriology, Animal and Plant Health Agency, Weybridge, United Kingdom
6
Pinto Y, Chakraborty M, Jain N, Bhatt AS. Phage-inclusive profiling of human gut microbiomes with Phanta. Nat Biotechnol 2024; 42:651-662. [PMID: 37231259 DOI: 10.1038/s41587-023-01799-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 08/04/2022] [Accepted: 04/20/2023] [Indexed: 05/27/2023]
Abstract
Due to technical limitations, most gut microbiome studies have focused on prokaryotes, overlooking viruses. Phanta, a virome-inclusive gut microbiome profiling tool, overcomes the limitations of assembly-based viral profiling methods by using customized k-mer-based classification tools and incorporating recently published catalogs of gut viral genomes. Phanta's optimizations consider the small genome size of viruses, sequence homology with prokaryotes and interactions with other gut microbes. Extensive testing of Phanta on simulated data demonstrates that it quickly and accurately quantifies prokaryotes and viruses. When applied to 245 fecal metagenomes from healthy adults, Phanta identifies ~200 viral species per sample, ~5× more than standard assembly-based methods. We observe a ~2:1 ratio between DNA viruses and bacteria, with higher interindividual variability of the gut virome compared to the gut bacteriome. In another cohort, we observe that Phanta performs equally well on bulk versus virus-enriched metagenomes, making it possible to study prokaryotes and viruses in a single experiment, with a single analysis.
Affiliation(s)
- Yishay Pinto
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
- Navami Jain
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
- Ami S Bhatt
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
7
Chorlton SD. Ten common issues with reference sequence databases and how to mitigate them. Frontiers in Bioinformatics 2024; 4:1278228. [PMID: 38560517 PMCID: PMC10978663 DOI: 10.3389/fbinf.2024.1278228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 03/05/2024] [Indexed: 04/04/2024]
Abstract
Metagenomic sequencing has revolutionized our understanding of microbiology. While metagenomic tools and approaches have been extensively evaluated and benchmarked, far less attention has been given to the reference sequence database used in metagenomic classification. Issues with reference sequence databases are pervasive. Database contamination is the most recognized issue in the literature; however, it remains relatively unmitigated in most analyses. Other common issues with reference sequence databases include taxonomic errors, inappropriate inclusion and exclusion criteria, and sequence content errors. This review covers ten common issues with reference sequence databases and the potential downstream consequences of these issues. Mitigation measures are discussed for each issue, including bioinformatic tools and database curation strategies. Together, these strategies present a path towards more accurate, reproducible and translatable metagenomic sequencing.
8
García-Serquén AL, Chumbe-Nolasco LD, Navarrete AA, Girón-Aguilar RC, Gutiérrez-Reynoso DL. Traditional potato tillage systems in the Peruvian Andes impact bacterial diversity, evenness, community composition, and functions in soil microbiomes. Sci Rep 2024; 14:3963. [PMID: 38368478 PMCID: PMC10874408 DOI: 10.1038/s41598-024-54652-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 02/15/2024] [Indexed: 02/19/2024]
Abstract
The soil microbiome, a crucial component of agricultural ecosystems, plays a pivotal role in crop production and ecosystem functioning. However, its response to traditional tillage systems in potato cultivation in the Peruvian highlands is still far from understood. Here, ecological and functional aspects of the bacterial community were analyzed based on soil samples from two traditional tillage systems: 'chiwa' (minimal tillage) and 'barbecho' (full tillage), in the Huanuco region of the Peruvian central Andes. Similar soil bacterial community composition was shown for minimal tillage system, but it was heterogeneous for full tillage system. This soil bacterial community composition under full tillage system may be attributed to stochastic, and a more dynamic environment within this tillage system. 'Chiwa' and 'barbecho' soils harbored distinct bacterial genera into their communities, indicating their potential as bioindicators of traditional tillage effects. Functional analysis revealed common metabolic pathways in both tillage systems, with differences in anaerobic pathways in 'chiwa' and more diverse pathways in 'barbecho'. These findings open the possibilities to explore microbial bioindicators for minimal and full tillage systems, which are in relationship with healthy soil, and they can be used to propose adequate tillage systems for the sowing of potatoes in Peru.
Affiliation(s)
- Aura L García-Serquén
  - Laboratorio de Biología Molecular y Genómica, Dirección de Recursos Genéticos y Biotecnología, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, 15024, Lima, Peru
- Lenin D Chumbe-Nolasco
  - Laboratorio de Biología Molecular y Genómica, Dirección de Recursos Genéticos y Biotecnología, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, 15024, Lima, Peru
- Acacio Aparecido Navarrete
  - Graduate Program in Environmental Sciences, Brazil University (UB), Estrada Projetada F1, Fazenda Santa Rita, Fernandópolis, São Paulo, 15613-899, Brazil
- R Carolina Girón-Aguilar
  - Laboratorio de Biología Molecular y Genómica, Dirección de Recursos Genéticos y Biotecnología, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, 15024, Lima, Peru
- Dina L Gutiérrez-Reynoso
  - Laboratorio de Biología Molecular y Genómica, Dirección de Recursos Genéticos y Biotecnología, Instituto Nacional de Innovación Agraria (INIA), Av. La Molina 1981, 15024, Lima, Peru
9
Sharko FS, Mazloum A, Krotova AO, Byadovskaya OP, Prokhvatilova LB, Chvala IA, Zolotikov UE, Kozlova AD, Krylova AS, Grosfeld EV, Prokopenko AV, Korzhenkov AA, Patrushev MV, Namsaraev ZB, Sprygin AV, Toshchakov SV. Metagenomic profiling of viral and microbial communities from the pox lesions of lumpy skin disease virus and sheeppox virus-infected hosts. Front Vet Sci 2024; 11:1321202. [PMID: 38420205 PMCID: PMC10899707 DOI: 10.3389/fvets.2024.1321202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Accepted: 01/23/2024] [Indexed: 03/02/2024]
Abstract
Introduction: It has been recognized that capripoxvirus infections have a strong cutaneous tropism with the manifestation of skin lesions in the form of nodules and scabs in the respective hosts, followed by necrosis and sloughing off. Considering that the skin microbiota is a complex community of commensal bacteria, fungi and viruses that are influenced by infections leading to pathological states, there is no evidence on how the skin microbiome is affected during capripoxvirus pathogenesis. Methods: In this study, shotgun metagenomic sequencing was used to investigate the microbiome in pox lesions from hosts infected with lumpy skin disease virus and sheep pox virus. Results: The analysis revealed a high degree of variability in bacterial community structures across affected skin samples, indicating the importance of specific commensal microorganisms colonizing individual hosts. The most common and abundant bacteria found in scab samples were Fusobacterium necrophorum, Streptococcus dysgalactiae, Helcococcus ovis and Trueperella pyogenes, irrespective of host. Bacterial reads belonging to the genera Moraxella, Mannheimia, Corynebacterium, Staphylococcus and Micrococcus were identified. Discussion: This study is the first to investigate capripox virus-associated changes in the skin microbiome using whole-genome metagenomic profiling. The findings will provide a basis for further investigation into capripoxvirus pathogenesis. In addition, this study highlights the challenge of selecting an optimal bioinformatics approach for the analysis of metagenomic data in clinical and veterinary practice. For example, direct classification of reads using a k-mer-based algorithm resulted in a significant number of systematic false positives, which may be attributed to the peculiarities of the algorithm and database selection. On the contrary, the process of de novo assembly requires a large number of target reads from the symbiotic microbial community. In this work, the obtained sequencing data were processed by three different approaches, including direct classification of reads based on k-mers, mapping of reads to a marker gene database, and de novo assembly and binning of metagenomic contigs. The advantages and disadvantages of these techniques and their practicality in veterinary settings are discussed in relation to the results obtained.
Affiliation(s)
- Fedor S. Sharko
  - National Research Center “Kurchatov Institute”, Moscow, Russia
- Ali Mazloum
  - Federal Center for Animal Health FGBI ARRIAH, Vladimir, Russia
- Ilya A. Chvala
  - Federal Center for Animal Health FGBI ARRIAH, Vladimir, Russia
- Erika V. Grosfeld
  - National Research Center “Kurchatov Institute”, Moscow, Russia
  - Moscow Institute of Physics and Technology, National Research University, Dolgoprudny, Russia
10
Verma B, Parkinson J. HiTaxon: a hierarchical ensemble framework for taxonomic classification of short reads. Bioinformatics Advances 2024; 4:vbae016. [PMID: 38371920 PMCID: PMC10873905 DOI: 10.1093/bioadv/vbae016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Revised: 02/16/2024] [Accepted: 02/16/2024] [Indexed: 02/20/2024]
Abstract
Motivation: Whole microbiome DNA and RNA sequencing (metagenomics and metatranscriptomics) are pivotal to determining the functional roles of microbial communities. A key challenge in analyzing these complex datasets, typically composed of tens of millions of short reads, is accurately classifying reads to their taxa of origin. While still performing worse relative to reference-based short-read tools in species classification, ML algorithms have shown promising results in taxonomic classification at higher ranks. A recent approach exploited to enhance the performance of ML tools, which can be translated to reference-dependent classifiers, has been to integrate the hierarchical structure of taxonomy within the tool's predictive algorithm. Results: Here, we introduce HiTaxon, an end-to-end hierarchical ensemble framework for taxonomic classification. HiTaxon facilitates data collection and processing, reference database construction and optional training of ML models to streamline ensemble creation. We show that databases created by HiTaxon improve the species-level performance of reference-dependent classifiers, while reducing their computational overhead. In addition, through exploring hierarchical methods for HiTaxon, we highlight that our custom approach to hierarchical ensembling improves species-level classification relative to traditional strategies. Finally, we demonstrate the improved performance of our hierarchical ensembles over current state-of-the-art classifiers in species classification using datasets comprised of either simulated or experimentally derived reads. Availability and implementation: HiTaxon is available at: https://github.com/ParkinsonLab/HiTaxon.
Collapse
Affiliation(s)
- Bhavish Verma
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
- John Parkinson
  - Program in Molecular Medicine, Hospital for Sick Children, Toronto, ON M5G 0A4, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON M5S 1A8, Canada
  - Department of Biochemistry, University of Toronto, Toronto, ON M5S 1A8, Canada
|
11
|
Pereira-Marques J, Ferreira RM, Figueiredo C. A metatranscriptomics strategy for efficient characterization of the microbiome in human tissues with low microbial biomass. Gut Microbes 2024; 16:2323235. [PMID: 38425025 PMCID: PMC10913719 DOI: 10.1080/19490976.2024.2323235] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 02/21/2024] [Indexed: 03/02/2024] Open
Abstract
The high background of host RNA poses a major challenge to metatranscriptome analysis of human samples. Hence, metatranscriptomics has been mainly applied to microbe-rich samples, while its application in human tissues with a low ratio of microbial to host cells has yet to be explored. Since there is no computational workflow specifically designed for the taxonomic and functional analysis of this type of sample, we propose an effective metatranscriptomics strategy to accurately characterize the microbiome in human tissues with a low ratio of microbial to host content. We experimentally generated synthetic samples with well-characterized bacterial and host cell compositions, mimicking human samples with high and low microbial loads. These synthetic samples were used for optimizing and establishing the workflow in a controlled setting. Our results show that integrating the taxonomic analysis of optimized Kraken 2/Bracken with the functional analysis of HUMAnN 3 in samples with low microbial content enables the accurate identification of a large number of microbial species with a low false-positive rate, while improving the detection of microbial functions. The effectiveness of our metatranscriptomics workflow was demonstrated in synthetic samples, simulated datasets, and, most importantly, human gastric tissue specimens, thus providing a proof of concept for its applicability to mucosal tissues of the gastrointestinal tract. The use of an accurate and reliable metatranscriptomics approach for human tissues with low microbial content will expand our understanding of the functional activity of the mucosal microbiome, uncovering critical interactions between the microbiome and the host in health and disease.
Affiliation(s)
- Joana Pereira-Marques
  - i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Ipatimup – Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
- Rui M. Ferreira
  - i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Ipatimup – Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
- Ceu Figueiredo
  - i3S – Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Porto, Portugal
  - Ipatimup – Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
  - Department of Pathology, Faculty of Medicine of the University of Porto, Porto, Portugal
|
12
|
Kim N, Kim CY, Ma J, Yang S, Park DJ, Ha SJ, Belenky P, Lee I. MRGM: an enhanced catalog of mouse gut microbial genomes substantially broadening taxonomic and functional landscapes. Gut Microbes 2024; 16:2393791. [PMID: 39230075 PMCID: PMC11376411 DOI: 10.1080/19490976.2024.2393791] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2024] [Revised: 08/12/2024] [Accepted: 08/13/2024] [Indexed: 09/05/2024] Open
Abstract
Mouse gut microbiome research is pivotal for understanding the human gut microbiome, providing insights into disease modeling, host-microbe interactions, and the dietary influence on the gut microbiome. To enhance the translational value of mouse gut microbiome studies, we need detailed and high-quality catalogs of mouse gut microbial genomes. We introduce the Mouse Reference Gut Microbiome (MRGM), a comprehensive catalog with 42,245 non-redundant mouse gut bacterial genomes across 1,524 species. MRGM marks a 40% increase in the known taxonomic diversity of mouse gut microbes, capturing previously underrepresented lineages through refined genome quality assessment techniques. MRGM not only broadens the taxonomic landscape but also enriches the functional landscape of the mouse gut microbiome. Using deep learning, we have elevated the Gene Ontology annotation rate for mouse gut microbial proteins from 3.2% with orthology to 60%, marking an over 18-fold increase. MRGM supports both DNA- and marker-based taxonomic profiling by providing custom databases, surpassing previous catalogs in performance. Finally, taxonomic and functional comparisons between human and mouse gut microbiota reveal diet-driven divergences in their taxonomic composition and functional enrichment. Overall, our study highlights the value of high-quality microbial genome catalogs in advancing our understanding of the co-evolution between gut microbes and their host.
Affiliation(s)
- Nayeon Kim
  - Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Chan Yeong Kim
  - Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Junyeong Ma
  - Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Sunmo Yang
  - Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Dong Jin Park
  - Department of Biochemistry, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Sang-Jun Ha
  - Department of Biochemistry, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
- Peter Belenky
  - Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, USA
- Insuk Lee
  - Department of Biotechnology, College of Life Science & Biotechnology, Yonsei University, Seoul, Republic of Korea
  - POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea
|
13
|
Zadjelovic V, Wright RJ, Borsetto C, Quartey J, Cairns TN, Langille MGI, Wellington EMH, Christie-Oleza JA. Microbial hitchhikers harbouring antimicrobial-resistance genes in the riverine plastisphere. MICROBIOME 2023; 11:225. [PMID: 37908022 PMCID: PMC10619285 DOI: 10.1186/s40168-023-01662-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/02/2023] [Accepted: 09/04/2023] [Indexed: 11/02/2023]
Abstract
BACKGROUND: The widespread nature of plastic pollution has given rise to wide scientific and social concern regarding the capacity of these materials to serve as vectors for pathogenic bacteria and reservoirs for Antimicrobial Resistance Genes (ARG). In- and ex-situ incubations were used to characterise the riverine plastisphere taxonomically and functionally in order to determine whether antibiotics within the water influenced the ARG profiles in these microbiomes and how these compared to those on natural surfaces such as wood and their planktonic counterparts. RESULTS: We show that plastics support a taxonomically distinct microbiome containing potential pathogens and ARGs. While the plastisphere was similar to those biofilms that grew on wood, they were distinct from the surrounding water microbiome. Hence, whilst potential opportunistic pathogens (i.e. Pseudomonas aeruginosa, Acinetobacter and Aeromonas) and ARG subtypes (i.e. those that confer resistance to macrolides/lincosamides, rifamycin, sulfonamides, disinfecting agents and glycopeptides) were predominant in all surface-related microbiomes, especially on weathered plastics, a completely different set of potential pathogens (i.e. Escherichia, Salmonella, Klebsiella and Streptococcus) and ARGs (i.e. aminoglycosides, tetracycline, aminocoumarin, fluoroquinolones, nitroimidazole, oxazolidinone and fosfomycin) dominated in the planktonic compartment. Our genome-centric analysis allowed the assembly of 215 Metagenome Assembled Genomes (MAGs), linking ARGs and other virulence-related genes to their host. Interestingly, a MAG belonging to Escherichia, which clearly predominated in water, harboured more ARGs and virulence factors than any other MAG, emphasising the potential virulent nature of these pathogenic-related groups. Finally, ex-situ incubations using environmentally-relevant concentrations of antibiotics increased the prevalence of their corresponding ARGs, but different riverine compartments, including plastispheres, were affected differently by each antibiotic. CONCLUSIONS: Our results provide insights into the capacity of the riverine plastisphere to harbour a distinct set of potentially pathogenic bacteria and function as a reservoir of ARGs. The environmental impact that plastics pose if they act as a reservoir for either pathogenic bacteria or ARGs is aggravated by the persistence of plastics in the environment due to their recalcitrance and buoyancy. Nevertheless, the high similarities with microbiomes growing on natural co-occurring materials, and the even more worrisome microbiome observed in the surrounding water, highlight the urgent need to integrate the analysis of all environmental compartments when assessing risks and exposure to pathogens and ARGs in anthropogenically-impacted ecosystems.
Affiliation(s)
- Vinko Zadjelovic
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
  - Present address: Centro de Bioinnovación de Antofagasta (CBIA), Facultad de Ciencias del Mar y Recursos Biológicos, Universidad de Antofagasta, 1271155, Antofagasta, Chile
- Robyn J Wright
  - Department of Pharmacology, Faculty of Medicine, Dalhousie University, Halifax, Canada
- Chiara Borsetto
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
- Jeannelle Quartey
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
- Tyler N Cairns
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
- Morgan G I Langille
  - Department of Pharmacology, Faculty of Medicine, Dalhousie University, Halifax, Canada
- Joseph A Christie-Oleza
  - School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK
  - Department of Biology, University of the Balearic Islands, 07122, Palma, Spain
|