1
Van Uffelen A, Posadas A, Roosens NHC, Marchal K, De Keersmaecker SCJ, Vanneste K. Benchmarking bacterial taxonomic classification using nanopore metagenomics data of several mock communities. Sci Data 2024; 11:864. [PMID: 39127718 DOI: 10.1038/s41597-024-03672-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 07/22/2024] [Indexed: 08/12/2024] [Open access]
Abstract
Taxonomic classification is crucial in identifying organisms within diverse microbial communities when using metagenomics shotgun sequencing. While second-generation Illumina sequencing still dominates, third-generation nanopore sequencing promises improved classification through longer reads. However, extensive benchmarking studies on nanopore data are lacking. We systematically evaluated the performance of bacterial taxonomic classification for metagenomics nanopore sequencing data for several commonly used classifiers, using standardized reference sequence databases, on the largest collection of publicly available data for defined mock communities thus far (nine samples), representing different research domains and application scopes. Our results group classifiers into three categories: low precision/high recall, medium precision/medium recall, and high precision/medium recall. Most fall into the first group, although precision can be improved without excessively penalizing recall with suitable abundance filtering. No definitive 'best' classifier emerges, and classifier selection depends on application scope and practical requirements. Although few classifiers designed for long reads exist, they generally exhibit better performance. Our comprehensive benchmarking provides concrete recommendations, supported by publicly available code for reassessment and fine-tuning by other scientists.
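The trade-off described in this abstract (abundance filtering raising precision while leaving recall largely intact) can be illustrated with a minimal sketch. The species names, read counts, and the 1% cutoff below are hypothetical and are not taken from the paper, and the function names are illustrative rather than part of the authors' published code.

```python
def filter_by_abundance(read_counts, min_fraction):
    """Return the taxa whose share of all classified reads is at least min_fraction."""
    total = sum(read_counts.values())
    if total == 0:
        return set()
    return {taxon for taxon, count in read_counts.items()
            if count / total >= min_fraction}


def precision_recall(predicted, expected):
    """Presence/absence precision and recall of predicted taxa against a known mock community."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical mock community (ground truth) and classifier output (reads per species).
expected = {"Escherichia coli", "Bacillus subtilis", "Listeria monocytogenes"}
read_counts = {
    "Escherichia coli": 5000,
    "Bacillus subtilis": 3000,
    "Listeria monocytogenes": 1900,
    "Shigella flexneri": 90,   # low-abundance false positive
    "Bacillus cereus": 10,     # low-abundance false positive
}

unfiltered = precision_recall(set(read_counts), expected)
filtered = precision_recall(filter_by_abundance(read_counts, 0.01), expected)

print("unfiltered precision/recall:", unfiltered)
print("filtered precision/recall:  ", filtered)
```

In this toy case the two false positives fall below the cutoff, so precision rises while recall is unchanged. With real data, a cutoff set too high would also remove genuine low-abundance members and lower recall.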
Affiliation(s)
- Alexander Van Uffelen
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Andrés Posadas
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Nancy H C Roosens
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
- Kathleen Marchal
  - Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
  - Department of Genetics, University of Pretoria, Pretoria, South Africa
- Kevin Vanneste
  - Transversal activities in Applied Genomics, Sciensano, Brussels, Belgium
2
Achudhan AB, Kannan P, Gupta A, Saleena LM. A Review of Web-Based Metagenomics Platforms for Analysing Next-Generation Sequence Data. Biochem Genet 2024; 62:621-632. [PMID: 37507643 DOI: 10.1007/s10528-023-10467-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2023] [Accepted: 07/18/2023] [Indexed: 07/30/2023]
Abstract
Metagenomics has now evolved into a promising technology for understanding the microbial population in the environment. Through metagenomics, a number of extreme and complex environments have been explored for their microbial populations. Using this technology, researchers have identified novel genes and their potential characteristics, which have robust applications in food, pharmaceutical, scientific research, and other biotechnological fields. A sequencing platform can provide the sequences of microbial populations in any given environment. These sequences need to be analysed computationally to derive meaningful information. It is often presumed that only bioinformaticians with extensive computational skills can process sequencing data through to the downstream end. However, numerous open-source software packages and online servers for analysing metagenomic data have been developed for biologists with limited computational skills. This review focuses on bioinformatics tools such as Galaxy, CSI-NGS portal, ANASTASIA and SHAMAN, EBI Metagenomics, IDseq, and MG-RAST for analysing metagenomic data.
Affiliation(s)
- Arunmozhi Bharathi Achudhan
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Priya Kannan
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Annapurna Gupta
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
- Lilly M Saleena
  - Department of Biotechnology, School of Bioengineering, College of Engineering and Technology, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu, India
3
Arıkan M, Muth T. Integrated multi-omics analyses of microbial communities: a review of the current state and future directions. Mol Omics 2023; 19:607-623. [PMID: 37417894 DOI: 10.1039/d3mo00089c] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/08/2023]
Abstract
Integrated multi-omics analyses of microbiomes have become increasingly common in recent years as the emerging omics technologies provide an unprecedented opportunity to better understand the structural and functional properties of microbial communities. Consequently, there is a growing need for and interest in the concepts, approaches, considerations, and available tools for investigating diverse environmental and host-associated microbial communities in an integrative manner. In this review, we first provide a general overview of each omics analysis type, including a brief history, typical workflow, primary applications, strengths, and limitations. Then, we inform on both experimental design and bioinformatics analysis considerations in integrated multi-omics analyses, elaborate on the current approaches and commonly used tools, and highlight the current challenges. Finally, we discuss the expected key advances, emerging trends, potential implications on various fields from human health to biotechnology, and future directions.
Affiliation(s)
- Muzaffer Arıkan
  - Regenerative and Restorative Medicine Research Center (REMER), Research Institute for Health Sciences and Technologies (SABITA), Istanbul Medipol University, Istanbul, Turkey
  - Department of Medical Biology, Faculty of Medicine, Istanbul Medipol University, Istanbul, Turkey
- Thilo Muth
  - Section eScience (S.3), Federal Institute for Materials Research and Testing (BAM), Berlin, Germany
4
González-Plaza JJ, Furlan C, Rijavec T, Lapanje A, Barros R, Tamayo-Ramos JA, Suarez-Diez M. Advances in experimental and computational methodologies for the study of microbial-surface interactions at different omics levels. Front Microbiol 2022; 13:1006946. [PMID: 36519168 PMCID: PMC9744117 DOI: 10.3389/fmicb.2022.1006946] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/29/2022] [Accepted: 11/02/2022] [Indexed: 08/31/2023] [Open access]
Abstract
The study of the biological response of microbial cells interacting with natural and synthetic interfaces has acquired a new dimension with the development and constant progress of advanced omics technologies. New methods allow the isolation and analysis of nucleic acids, proteins and metabolites from complex samples, of interest in diverse research areas, such as materials sciences, biomedical sciences, forensic sciences, biotechnology and archeology, among others. The study of the bacterial recognition and response to surface contact or the diagnosis and evolution of ancient pathogens contained in archeological tissues require, in many cases, the availability of specialized methods and tools. The current review describes advances in in vitro and in silico approaches to tackle existing challenges (e.g., low-quality sample, low amount, presence of inhibitors, chelators, etc.) in the isolation of high-quality samples and in the analysis of microbial cells at genomic, transcriptomic, proteomic and metabolomic levels, when present in complex interfaces. From the experimental point of view, tailored manual and automatized methodologies, commercial and in-house developed protocols, are described. The computational level focuses on the discussion of novel tools and approaches designed to solve associated issues, such as sample contamination, low quality reads, low coverage, etc. Finally, approaches to obtain a systems level understanding of these complex interactions by integrating multi omics datasets are presented.
Affiliation(s)
- Juan José González-Plaza
  - International Research Centre in Critical Raw Materials-ICCRAM, University of Burgos, Burgos, Spain
- Cristina Furlan
  - Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
- Tomaž Rijavec
  - Department of Environmental Sciences, Jožef Stefan Institute, Ljubljana, Slovenia
- Aleš Lapanje
  - Department of Environmental Sciences, Jožef Stefan Institute, Ljubljana, Slovenia
- Rocío Barros
  - International Research Centre in Critical Raw Materials-ICCRAM, University of Burgos, Burgos, Spain
- Maria Suarez-Diez
  - Laboratory of Systems and Synthetic Biology, Wageningen University and Research, Wageningen, Netherlands
5
Establishment and Validation of a New Analysis Strategy for the Study of Plant Endophytic Microorganisms. Int J Mol Sci 2022; 23:ijms232214223. [PMID: 36430699 PMCID: PMC9697482 DOI: 10.3390/ijms232214223] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/17/2022] [Revised: 11/11/2022] [Accepted: 11/14/2022] [Indexed: 11/19/2022] [Open access]
Abstract
Amplicon sequencing of bacterial or fungal marker sequences is currently the main method for the study of endophytic microorganisms in plants. However, it cannot capture all types of microorganisms in a sample (bacteria, fungi, protozoa, etc.), nor can it compare the relative content between endophytic microorganisms and plants or between different types of endophytes. Therefore, it is necessary to develop a better analysis strategy for endophytic microorganism investigation. In this study, a new analysis strategy was developed to obtain endophytic microbiome information from plant transcriptome data. Results showed that the new strategy can obtain the composition of microbial communities, the relative content between plants and endophytic microorganisms, and the relative content between different types of endophytic microorganisms from plant transcriptome data. Compared with the amplicon sequencing method, the new strategy yields more endophytic microorganisms and more relative content information, which can greatly broaden the research scope and reduce experimental cost. Furthermore, the advantages and effectiveness of the new strategy were verified with different analyses of the microbial composition, correlation analysis, an inoculant content test, and a repeatability test.
6
Boix-Amorós A, Monaco H, Sambataro E, Clemente JC. Novel technologies to characterize and engineer the microbiome in inflammatory bowel disease. Gut Microbes 2022; 14:2107866. [PMID: 36104776 PMCID: PMC9481095 DOI: 10.1080/19490976.2022.2107866] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 02/04/2023] [Open access]
Abstract
We present an overview of recent experimental and computational advances in technology used to characterize the microbiome, with a focus on how these developments improve our understanding of inflammatory bowel disease (IBD). Specifically, we present studies that make use of flow cytometry and metabolomics assays to provide a functional characterization of microbial communities. We also describe computational methods for strain-level resolution, temporal series, mycobiome and virome data, co-occurrence networks, and compositional data analysis. In addition, we review novel techniques to therapeutically manipulate the microbiome in IBD. We discuss the benefits and drawbacks of these technologies to increase awareness of specific biases, and to facilitate a more rigorous interpretation of results and their potential clinical application. Finally, we present future lines of research to better characterize the relation between microbial communities and IBD pathogenesis and progression.
Affiliation(s)
- Alba Boix-Amorós
  - Department of Genetics and Genomic Sciences, Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Hilary Monaco
  - Department of Genetics and Genomic Sciences, Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Elisa Sambataro
  - Department of Biological Sciences, CUNY Hunter College, New York, NY, USA
- Jose C. Clemente
  - Department of Genetics and Genomic Sciences, Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
  - Contact: Jose C. Clemente, Department of Genetics and Genomic Sciences, Precision Immunology Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
7
Rodrigues KF, Yong WTL, Bhuiyan MSA, Siddiquee S, Shah MD, Venmathi Maran BA. Current Understanding on the Genetic Basis of Key Metabolic Disorders: A Review. Biology 2022; 11:biology11091308. [PMID: 36138787 PMCID: PMC9495729 DOI: 10.3390/biology11091308] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/27/2022] [Revised: 08/27/2022] [Accepted: 08/29/2022] [Indexed: 12/02/2022]
Abstract
Simple Summary: Metabolic disorders (MD) are a challenge to healthcare systems; the emergence of the modern socio-economic system has led to a profound change in lifestyles in terms of dietary habits, exercise regimens, and behavior, all of which complement the genetic factors associated with MD. Diabetes Mellitus and Familial hypercholesterolemia are two of the 14 most widely researched MD, as they pose the greatest challenge to the public healthcare system and have an impact on productivity and the economy. Research findings have led to the development of new therapeutic molecules for the mitigation of MD as well as the invention of experimental strategies, which target the genes themselves via gene editing and RNA interference. Although these approaches may herald the emergence of a new toolbox to treat MD, the current therapeutic approaches still heavily depend on substrate reduction, dietary restrictions based on genetic factors, exercise, and the maintenance of good mental health. The development of orphan drugs for the less common MD such as Krabbe, Farber, Fabry, and Gaucher diseases, remains in its infancy, owing to the lack of investment in research and development, and this has driven the development of personalized therapeutics based on gene silencing and related technologies.

Abstract: Advances in data acquisition via high resolution genomic, transcriptomic, proteomic and metabolomic platforms have driven the discovery of the underlying factors associated with metabolic disorders (MD) and led to interventions that target the underlying genetic causes as well as lifestyle changes and dietary regulation. The review focuses on fourteen of the most widely studied inherited MD, which are familial hypercholesterolemia, Gaucher disease, Hunter syndrome, Krabbe disease, Maple syrup urine disease, Metachromatic leukodystrophy, Mitochondrial encephalopathy lactic acidosis stroke-like episodes (MELAS), Niemann-Pick disease, Phenylketonuria (PKU), Porphyria, Tay-Sachs disease, Wilson’s disease, Familial hypertriglyceridemia (F-HTG) and Galactosemia based on genome wide association studies, epigenetic factors, transcript regulation, post-translational genetic modifications and biomarker discovery through metabolomic studies. We will delve into the current approaches being undertaken to analyze metadata using bioinformatic approaches and the emerging interventions using genome editing platforms as applied to animal models.
Affiliation(s)
- Kenneth Francis Rodrigues
  - Biotechnology Research Institute, Universiti Malaysia Sabah, Kota Kinabalu 88400, Malaysia
  - Correspondence: (K.F.R.); (B.A.V.M.); Tel.: +60-16-2096905 (B.A.V.M.)
- Wilson Thau Lym Yong
  - Biotechnology Research Institute, Universiti Malaysia Sabah, Kota Kinabalu 88400, Malaysia
- Muhammad Dawood Shah
  - Borneo Marine Research Institute, Universiti Malaysia Sabah, Kota Kinabalu 88400, Malaysia
- Balu Alagar Venmathi Maran
  - Borneo Marine Research Institute, Universiti Malaysia Sabah, Kota Kinabalu 88400, Malaysia
  - Correspondence: (K.F.R.); (B.A.V.M.); Tel.: +60-16-2096905 (B.A.V.M.)
8
Selenium Metabolism and Selenoproteins in Prokaryotes: A Bioinformatics Perspective. Biomolecules 2022; 12:biom12070917. [PMID: 35883471 PMCID: PMC9312934 DOI: 10.3390/biom12070917] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/31/2022] [Revised: 06/23/2022] [Accepted: 06/27/2022] [Indexed: 01/25/2023] [Open access]
Abstract
Selenium (Se) is an important trace element that mainly occurs in the form of selenocysteine in selected proteins. In prokaryotes, Se is also required for the synthesis of selenouridine and Se-containing cofactor. A large number of selenoprotein families have been identified in diverse prokaryotic organisms, most of which are thought to be involved in various redox reactions. In the last decade or two, computational prediction of selenoprotein genes and comparative genomics of Se metabolic pathways and selenoproteomes have arisen, providing new insights into the metabolism and function of Se and their evolutionary trends in bacteria and archaea. This review aims to offer an overview of recent advances in bioinformatics analysis of Se utilization in prokaryotes. We describe current computational strategies for the identification of selenoprotein genes and generate the most comprehensive list of prokaryotic selenoproteins reported to date. Furthermore, we highlight the latest research progress in comparative genomics and metagenomics of Se utilization in prokaryotes, which demonstrates the divergent and dynamic evolutionary patterns of different Se metabolic pathways, selenoprotein families, and selenoproteomes in sequenced organisms and environmental samples. Overall, bioinformatics analyses of Se utilization, function, and evolution may contribute to a systematic understanding of how this micronutrient is used in nature.
9
Efficient and Quality-Optimized Metagenomic Pipeline Designed for Taxonomic Classification in Routine Microbiological Clinical Tests. Microorganisms 2022; 10:microorganisms10040711. [PMID: 35456762 PMCID: PMC9026403 DOI: 10.3390/microorganisms10040711] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/27/2021] [Revised: 03/09/2022] [Accepted: 03/23/2022] [Indexed: 01/26/2023] [Open access]
Abstract
Metagenomics analysis is now routinely used for clinical diagnosis in several diseases, and we need confidence in interpreting metagenomics analysis of microbiota. Particularly from the side of clinical microbiology, we consider that it would be a major milestone to further advance microbiota studies with an innovative and significant approach consisting of processing steps and quality assessment for interpreting metagenomics data used for diagnosis. Here, we propose a methodology for taxon identification and abundance assessment of microbial shotgun sequencing data that is well suited to clinical settings. Quality-control processing steps have been developed in order (i) to avoid low-quality reads and sequences, (ii) to optimize abundance thresholds and profiles, (iii) to combine classifiers and reference databases for the best classification of species and abundance profiles for both prokaryotic and eukaryotic sequences, and (iv) to introduce an external positive control. We find that the best strategy is to use a pipeline composed of a combination of different but complementary classifiers such as Kraken2/Bracken and Kaiju. Such improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.
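As a rough illustration of combining two classifiers with an abundance threshold, the sketch below keeps only species that both profiles report above a cutoff and averages their relative abundances. The intersection rule, the 1% cutoff, the function names, and the read counts are all assumptions made for illustration; the abstract does not specify how the authors' pipeline merges Kraken2/Bracken and Kaiju results, and no tool output is parsed here.

```python
def relative_abundance(read_counts):
    """Convert per-species read counts into fractions of all classified reads."""
    total = sum(read_counts.values())
    if total == 0:
        return {}
    return {taxon: count / total for taxon, count in read_counts.items()}


def consensus_profile(counts_a, counts_b, min_fraction=0.01):
    """Species reported at >= min_fraction by BOTH profiles, with their mean relative abundance.

    Requiring agreement is only one possible combination rule (an assumption here).
    """
    rel_a = relative_abundance(counts_a)
    rel_b = relative_abundance(counts_b)
    shared = [taxon for taxon in rel_a
              if taxon in rel_b
              and rel_a[taxon] >= min_fraction
              and rel_b[taxon] >= min_fraction]
    return {taxon: (rel_a[taxon] + rel_b[taxon]) / 2 for taxon in sorted(shared)}


# Hypothetical per-species read counts from two different classifiers on the same sample.
classifier_a = {"Staphylococcus aureus": 800, "Klebsiella pneumoniae": 195, "Cutibacterium acnes": 5}
classifier_b = {"Staphylococcus aureus": 700, "Klebsiella pneumoniae": 290, "Pseudomonas putida": 10}

consensus = consensus_profile(classifier_a, classifier_b, min_fraction=0.01)
for taxon, fraction in consensus.items():
    print(f"{taxon}: {fraction:.4f}")
```

A union of the two call sets would favour sensitivity instead. Which rule is appropriate depends on whether false positives or missed organisms are more costly in the clinical context.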
10
Abstract
Microbial ecology is the study of microorganisms present in nature. It particularly focuses on microbial interactions with any biota and with surrounding environments. Microbial ecology is entering its golden age with innovative multi-omics methods triggered by next-generation sequencing technologies. However, the extraction of ecologically relevant information from ever-increasing omics data remains one of the most challenging tasks in microbial ecology. This special issue includes 11 review articles that provide an overview of the state of the art of omics-based approaches in the field of microbial ecology, with particular emphasis on the interpretation of omics data, environmental pollution tracking, interactions in microbiomes, and viral ecology.