1
Kane Y, Tendu A, Li R, Chen Y, Mastriani E, Lan J, Catherine Hughes A, Berthet N, Wong G. Viral diversity in wild and urban rodents of Yunnan Province, China. Emerg Microbes Infect 2024; 13:2290842. [PMID: 38047395 PMCID: PMC10829829 DOI: 10.1080/22221751.2023.2290842] [Received: 07/30/2023] [Accepted: 11/29/2023] [Indexed: 12/05/2023]
Abstract
Rodents represent over 40% of known mammal species and are found in various terrestrial habitats. They are significant reservoirs for zoonotic viruses, including harmful pathogens such as arenaviruses and hantaviruses, yet knowledge of their hosts and distributions is limited. Therefore, characterizing the virome profile in these animals is invaluable for outbreak preparedness, especially in potential hotspots of mammal diversity. This study included 681 organs from 124 rodents and one Chinese tree shrew collected from Yunnan Province, China, during 2020-2021. Metagenomic analysis revealed unique features of mammalian viruses in rodent organs across habitats with varying human disturbances. R. tanezumi in locations with high anthropogenic disturbance exhibited the highest mammal viral diversity, with spleen and lung samples showing the highest diversities for these viruses at the organ level. Mammal viral diversity for both commensal and non-commensal rats was identified to positively correlate with landscape disturbance. Some virus families were associated with particular organs or host species, suggesting tropism for these pathogens. Notably, known and novel viral species that are likely to infect humans were identified. R. tanezumi was identified as a reservoir and carrier for various zoonotic viruses, including porcine bocavirus, hantavirus, cardiovirus, and lyssavirus. These findings highlight the influence of rodent community composition and anthropogenic activities on diverse virome profiles, with R. tanezumi as an important reservoir for zoonotic viruses.
Affiliation(s)
- Yakhouba Kane
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
  - University of Chinese Academy of Sciences, Beijing, People’s Republic of China
- Alexander Tendu
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
  - University of Chinese Academy of Sciences, Beijing, People’s Republic of China
- Ruiya Li
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
  - University of Chinese Academy of Sciences, Beijing, People’s Republic of China
- Yanhua Chen
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
  - Landscape Ecology Group, Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, People’s Republic of China
- Emilio Mastriani
  - Centre for Microbes, Development, and Health, and Unit of Discovery and Molecular Characterization of Pathogens, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
- Jiaming Lan
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
- Alice Catherine Hughes
  - Landscape Ecology Group, Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, People’s Republic of China
  - School of Biological Sciences, University of Hong Kong, Hong Kong SAR, People’s Republic of China
- Nicolas Berthet
  - Centre for Microbes, Development, and Health, and Unit of Discovery and Molecular Characterization of Pathogens, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
  - Institut Pasteur, Unité Environnement et Risque Infectieux, Cellule d’Intervention Biologique d’Urgence, Paris, France
  - Institut Pasteur, Université Paris-cite, Unité Epidémiologie et Physiopathologie des Virus Oncogènes, Paris, France
- Gary Wong
  - Viral Hemorrhagic Fevers Research Unit, CAS Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, People’s Republic of China
2
Maccaro JJ, Figueroa LL, McFrederick QS. From pollen to putrid: Comparative metagenomics reveals how microbiomes support dietary specialization in vulture bees. Mol Ecol 2024; 33:e17421. [PMID: 38828760 DOI: 10.1111/mec.17421] [Received: 01/21/2024] [Revised: 05/12/2024] [Accepted: 05/20/2024] [Indexed: 06/05/2024]
Abstract
For most animals, the microbiome is key for nutrition and pathogen defence, and is often shaped by diet. Corbiculate bees, including honey bees, bumble bees, and stingless bees, share a core microbiome that has been shaped, at least in part, by the challenges associated with pollen digestion. However, three species of stingless bees deviate from the general rule of bees obtaining their protein exclusively from pollen (obligate pollinivores) and instead consume carrion as their sole protein source (obligate necrophages) or consume both pollen and carrion (facultative necrophages). These three life histories can provide missing insights into microbiome evolution associated with extreme dietary transitions. Here, we investigate, via shotgun metagenomics, the functionality of the microbiome across three bee diet types: obligate pollinivory, obligate necrophagy, and facultative necrophagy. We find distinct differences in microbiome composition and gene functional profiles between the diet types. Obligate necrophages and pollinivores have more specialized microbes, whereas facultative necrophages have a diversity of environmental microbes associated with several dietary niches. Our study suggests that necrophagous bee microbiomes may have evolved to overcome cellular stress and microbial competition associated with carrion. We hypothesize that the microbiome evolved social phenotypes, such as biofilms, that protect the bees from opportunistic pathogens present on carcasses, allowing them to overcome novel nutritional challenges. Whether specific microbes enabled diet shifts or diet shifts occurred first and microbial evolution followed requires further research to disentangle. Nonetheless, we find that necrophagous microbiomes, vertebrate and invertebrate alike, have functional commonalities regardless of their taxonomy.
Affiliation(s)
- Jessica J Maccaro
  - Department of Entomology, University of California Riverside, Riverside, California, USA
- Laura L Figueroa
  - Department of Environmental Conservation, University of Massachusetts Amherst, Amherst, Massachusetts, USA
- Quinn S McFrederick
  - Department of Entomology, University of California Riverside, Riverside, California, USA
3
Espinoza JL, Phillips A, Prentice MB, Tan GS, Kamath PL, Lloyd KG, Dupont CL. Unveiling the microbial realm with VEBA 2.0: a modular bioinformatics suite for end-to-end genome-resolved prokaryotic, (micro)eukaryotic and viral multi-omics from either short- or long-read sequencing. Nucleic Acids Res 2024:gkae528. [PMID: 38909293 DOI: 10.1093/nar/gkae528] [Received: 03/08/2024] [Revised: 05/21/2024] [Accepted: 06/10/2024] [Indexed: 06/24/2024]
Abstract
The microbiome is a complex community of microorganisms, encompassing prokaryotic (bacterial and archaeal), eukaryotic, and viral entities. This microbial ensemble plays a pivotal role in influencing the health and productivity of diverse ecosystems while shaping the web of life. However, many software suites developed to study microbiomes analyze only the prokaryotic community and provide limited to no support for viruses and microeukaryotes. Previously, we introduced the Viral Eukaryotic Bacterial Archaeal (VEBA) open-source software suite to address this critical gap in microbiome research by extending genome-resolved analysis beyond prokaryotes to encompass the understudied realms of eukaryotes and viruses. Here we present VEBA 2.0 with key updates including a comprehensive clustered microeukaryotic protein database, rapid genome/protein-level clustering, bioprospecting, non-coding/organelle gene modeling, genome-resolved taxonomic/pathway profiling, long-read support, and containerization. We demonstrate VEBA's versatile application through the analysis of diverse case studies including marine water, Siberian permafrost, and white-tailed deer lung tissues with the latter showcasing how to identify integrated viruses. VEBA represents a crucial advancement in microbiome research, offering a powerful and accessible software suite that bridges the gap between genomics and biotechnological solutions.
Affiliation(s)
- Josh L Espinoza
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Allan Phillips
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Melanie B Prentice
  - School of Food and Agriculture, University of Maine, Orono, ME 04469, USA
- Gene S Tan
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Pauline L Kamath
  - School of Food and Agriculture, University of Maine, Orono, ME 04469, USA
  - Maine Center for Genetics in the Environment, University of Maine, Orono, ME 04469, USA
- Karen G Lloyd
  - Microbiology Department, University of Tennessee, Knoxville, TN 37917, USA
- Chris L Dupont
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
4
Bouras G, Judd LM, Edwards RA, Vreugde S, Stinear TP, Wick RR. How low can you go? Short-read polishing of Oxford Nanopore bacterial genome assemblies. Microb Genom 2024; 10. [PMID: 38833287 DOI: 10.1099/mgen.0.001254] [Indexed: 06/06/2024]
Abstract
It is now possible to assemble near-perfect bacterial genomes using Oxford Nanopore Technologies (ONT) long reads, but short-read polishing is usually required for perfection. However, the effect of short-read depth on polishing performance is not well understood. Here, we introduce Pypolca (with default and careful parameters) and Polypolish v0.6.0 (with a new careful parameter). We then show that: (1) all polishers other than Pypolca-careful, Polypolish-default and Polypolish-careful commonly introduce false-positive errors at low read depth; (2) most of the benefit of short-read polishing occurs by 25× depth; (3) Polypolish-careful almost never introduces false-positive errors at any depth; and (4) Pypolca-careful is the single most effective polisher. Overall, we recommend the following polishing strategies: Polypolish-careful alone when depth is very low (<5×), Polypolish-careful and Pypolca-careful when depth is low (5-25×), and Polypolish-default and Pypolca-careful when depth is sufficient (>25×).
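The depth-based recommendations in this abstract amount to a simple decision rule. As a minimal sketch only: the function name `recommend_polishers` is hypothetical, and it is an assumption that depths of exactly 5× and 25× belong to the "low" band, as the stated ranges suggest. The function returns labels for the recommended configurations; it does not call Polypolish or Pypolca.

```python
def recommend_polishers(depth: float) -> list[str]:
    """Return the polisher configurations recommended for a short-read depth.

    Encodes the three depth bands quoted in the abstract:
      <5x    -> Polypolish-careful alone
      5-25x  -> Polypolish-careful and Pypolca-careful
      >25x   -> Polypolish-default and Pypolca-careful
    The handling of the exact boundaries (5 and 25) is an assumption.
    """
    if depth < 0:
        raise ValueError("read depth cannot be negative")
    if depth < 5:
        return ["Polypolish-careful"]
    if depth <= 25:
        return ["Polypolish-careful", "Pypolca-careful"]
    return ["Polypolish-default", "Pypolca-careful"]


if __name__ == "__main__":
    for depth in (3, 12, 60):
        print(f"{depth}x: {', '.join(recommend_polishers(depth))}")
```

The returned strings are labels for the strategies named in the abstract, so a caller would still need to map them onto the actual command-line invocations of each tool.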
Affiliation(s)
- George Bouras
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia
  - The Department of Surgery - Otolaryngology Head and Neck Surgery, University of Adelaide and the Basil Hetzel Institute for Translational Health Research, Central Adelaide Local Health Network, South Australia, Australia
- Louise M Judd
  - Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Robert A Edwards
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, Australia
- Sarah Vreugde
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, Australia
  - The Department of Surgery - Otolaryngology Head and Neck Surgery, University of Adelaide and the Basil Hetzel Institute for Translational Health Research, Central Adelaide Local Health Network, South Australia, Australia
- Timothy P Stinear
  - Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
- Ryan R Wick
  - Department of Microbiology and Immunology, University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Melbourne, Australia
5
Wattanasombat S, Tongjai S. Easing genomic surveillance: A comprehensive performance evaluation of long-read assemblers across multi-strain mixture data of HIV-1 and Other pathogenic viruses for constructing a user-friendly bioinformatic pipeline. F1000Res 2024; 13:556. [PMID: 38984017 PMCID: PMC11231628 DOI: 10.12688/f1000research.149577.1] [Accepted: 05/14/2024] [Indexed: 07/11/2024]
Abstract
Background: Determining the appropriate computational requirements and software performance is essential for efficient genomic surveillance. The lack of standardized benchmarking complicates software selection, especially with limited resources.
Methods: We developed a containerized benchmarking pipeline to evaluate seven long-read assemblers (Canu, GoldRush, MetaFlye, Strainline, HaploDMF, iGDA, and RVHaplo) for viral haplotype reconstruction, using both simulated and experimental Oxford Nanopore sequencing data of HIV-1 and other viruses. Benchmarking was conducted on three computational systems to assess each assembler's performance, utilizing QUAST and BLASTN for quality assessment.
Results: Our findings show that assembler choice significantly impacts assembly time, with CPU and memory usage having minimal effect. Assembler selection also influences the size of the contigs, with a minimum read length of 2,000 nucleotides required for quality assembly. A 4,000-nucleotide read length improves quality further. Canu was efficient among de novo assemblers but not suitable for multi-strain mixtures, while GoldRush produced only consensus assemblies. Strainline and MetaFlye were suitable for metagenomic sequencing data, with Strainline requiring high memory and MetaFlye operable on low-specification machines. Among reference-based assemblers, iGDA had high error rates, RVHaplo showed the best runtime and accuracy but became ineffective with similar sequences, and HaploDMF, utilizing machine learning, had fewer errors with a slightly longer runtime.
Conclusions: The HIV-64148 pipeline, containerized using Docker, facilitates easy deployment and offers flexibility to select from a range of assemblers to match computational systems or study requirements. This tool aids in genome assembly and provides valuable information on HIV-1 sequences, enhancing viral evolution monitoring and understanding.
Affiliation(s)
- Sara Wattanasombat
  - Department of Microbiology, Faculty of Medicine, Chiang Mai University, Chiang Mai, 50200, Thailand
- Siripong Tongjai
  - Department of Microbiology, Faculty of Medicine, Chiang Mai University, Chiang Mai, 50200, Thailand
6
Dindhoria K, Kumar R, Bhargava B, Kumar R. Metagenomic assembled genomes indicated the potential application of hypersaline microbiome for plant growth promotion and stress alleviation in salinized soils. mSystems 2024; 9:e0105023. [PMID: 38377278 PMCID: PMC10949518 DOI: 10.1128/msystems.01050-23] [Received: 10/03/2023] [Accepted: 01/19/2024] [Indexed: 02/22/2024]
Abstract
Climate change is causing unpredictable seasonal variations globally. As the Earth's surface temperature continues to rise, water evaporates faster, creating a problem of soil salinization, especially in arid and semi-arid regions. The accumulation of salt degrades soil quality, impairs plant growth, and reduces agricultural yields. Salt-tolerant, plant-growth-promoting microorganisms may offer a solution, enhancing crop productivity and soil fertility in salinized areas. In the current study, genome-resolved metagenomic analysis was performed to investigate the salt-tolerating and plant growth-promoting potential of two hypersaline ecosystems, Sambhar Lake and Drang Mine. The samples were co-assembled independently by the Megahit, MetaSpades, and IDBA-UD tools. A total of 67 metagenomic assembled genomes (MAGs) were reconstructed following the binning process, including 15 from the Megahit, 26 from the MetaSpades, and 26 from the IDBA-UD assemblies. Compared with the other assemblers, the MAGs obtained by MetaSpades were of superior quality, with a completeness range of 12.95%-96.56% and a contamination range of 0%-8.65%. The medium and high-quality MAGs from MetaSpades, upon functional annotation, revealed properties such as salt tolerance (91.3%), heavy metal tolerance (95.6%), exopolysaccharide (95.6%), and antioxidant (60.86%) biosynthesis. Several plant growth-promoting attributes, including phosphate solubilization and indole-3-acetic acid (IAA) production, were consistently identified across all obtained MAGs. Conversely, characteristics such as iron acquisition and potassium solubilization were observed in a substantial majority, specifically 91.3%, of the MAGs. The present study indicates that hypersaline microflora can be used as bio-fertilizing agents for agricultural practices in salinized areas by alleviating prevalent stresses.
IMPORTANCE The strategic implementation of metagenomic assembled genomes (MAGs) in exploring the properties and harnessing microorganisms from ecosystems like hypersaline niches has transformative potential in agriculture. This approach promises to redefine our comprehension of microbial diversity and its ecosystem roles. Recovery and decoding of MAGs unlock genetic resources, enabling the development of new solutions for agricultural challenges. Enhanced understanding of these microbial communities can lead to more efficient nutrient cycling, pest control, and soil health maintenance. Consequently, traditional agricultural practices can be improved, resulting in increased yields, reduced environmental impacts, and heightened sustainability. MAGs offer a promising avenue for sustainable agriculture, bridging the gap between cutting-edge genomics and practical field applications.
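The abstract refers to medium- and high-quality MAGs and reports completeness and contamination ranges, but it does not say which cutoffs were applied. As a hedged illustration, the sketch below uses the commonly cited MIMAG-style thresholds; that choice is an assumption rather than the authors' stated method. The function name `mag_quality` and the example values are hypothetical.

```python
def mag_quality(completeness: float, contamination: float) -> str:
    """Classify a MAG from completeness and contamination percentages.

    Thresholds follow commonly cited MIMAG-style cutoffs (an assumption here):
      high:   completeness > 90 and contamination < 5
      medium: completeness >= 50 and contamination < 10
      low:    everything else
    The full MIMAG high-quality tier also requires rRNA and tRNA genes,
    which this simplified sketch does not check.
    """
    if not (0.0 <= completeness <= 100.0) or contamination < 0.0:
        raise ValueError("completeness must be 0-100 and contamination >= 0")
    if completeness > 90.0 and contamination < 5.0:
        return "high"
    if completeness >= 50.0 and contamination < 10.0:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Hypothetical (completeness, contamination) pairs, not values from the study.
    for pair in [(95.0, 2.0), (60.0, 8.65), (12.95, 0.0)]:
        print(pair, mag_quality(*pair))
```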
Affiliation(s)
- Kiran Dindhoria
  - Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
- Raghawendra Kumar
  - Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Bhavya Bhargava
  - Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
- Rakshak Kumar
  - Biotechnology Division, CSIR-Institute of Himalayan Bioresource Technology, Palampur, Himachal Pradesh, India
  - Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
7
Song J, Dong X, Lan Y, Lu Y, Liu X, Kang X, Huang Z, Yue B, Liu Y, Ma W, Zhang L, Yan H, He M, Fan Z, Guo T. Interpretation of vaginal metagenomic characteristics in different types of vaginitis. mSystems 2024; 9:e0137723. [PMID: 38364107 PMCID: PMC10949516 DOI: 10.1128/msystems.01377-23] [Received: 12/20/2023] [Accepted: 01/22/2024] [Indexed: 02/18/2024]
Abstract
Although vaginitis is closely related to vaginal microecology in females, the precise composition and functional potential of different types of vaginitis remain unclear. Here, metagenomic sequencing was applied to analyze the vaginal flora in patients with various forms of vaginitis, including cases with a clue cell proportion ranging from 1% to 20% (Clue1_20), bacterial vaginitis (BV), vulvovaginal candidiasis (VVC), and BV combined with VVC (VVC_BV). Our results identified Prevotella as an important biomarker between BV and Clue1_20. Moreover, a gradual decrease was observed in the relative abundance of shikimic acid metabolism associated with bacteria producing indole as well as a decline in the abundance of Gardnerella vaginalis in patients with BV, Clue1_20, and healthy women. Interestingly, the vaginal flora of patients in the VVC_BV group exhibited structural similarities to that of the VVC group, and its potential functional characteristics resembled those of the BV and VVC groups. Finally, Lactobacillus crispatus was found in high abundance in healthy samples, greatly contributing to the stability of the vaginal environment. For the further study of L. crispatus, we isolated five strains of L. crispatus from healthy samples and evaluated their capacity to inhibit G. vaginalis biofilms and produce lactic acid in vitro to select the potential probiotic candidate for improving vaginitis in future clinical studies. Overall, we successfully identified bacterial biomarkers of different vaginitis and characterized the dynamic shifts in vaginal flora between patients with BV and healthy females. This research advances our understanding and holds great promise in enhancing clinical approaches for the treatment of vaginitis.
IMPORTANCE Vaginitis is one of the most common gynecological diseases, mostly caused by infections of pathogens such as Candida albicans and Gardnerella vaginalis. In recent years, it has been found that the stability of the vaginal flora plays an important role in vaginitis. Furthermore, abundant lactic acid-producing Lactobacillus species in the vagina, such as Lactobacillus crispatus, provide a healthy acidic environment. The metabolites of Lactobacillus can inhibit the colonization of pathogens. Here, we collected the vaginal samples of patients with bacterial vaginitis (BV), vulvovaginal candidiasis (VVC), and BV combined with VVC to discover the differences and relationships among the different kinds of vaginitis by metagenomic sequencing. Furthermore, because of the importance of L. crispatus in promoting vaginal health, we isolated multiple strains from vaginal samples of healthy females and chose the most promising strain with potential probiotic benefits to provide clinical implications for treatment strategies.
Affiliation(s)
- Jiarong Song
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Xue Dong
  - Department of Gynecology and Obstetrics, West China Second Hospital, Sichuan University, Chengdu, China
- Yue Lan
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Yunwei Lu
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Xu Liu
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Xuena Kang
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Zhonglu Huang
  - Meishan Women and Children’s Hospital, Meishan, Sichuan, China
- Bisong Yue
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Yu Liu
  - Institute of Blood Transfusion, Chinese Academy of Medical Sciences, Chengdu, Sichuan, China
- Wenjin Ma
  - Chenghua District Maternal and Child Health Hospital, Chengdu, Sichuan, China
- Libo Zhang
  - Renshou County People’s Hospital, Renshou, Sichuan, China
- Haijun Yan
  - Meishan Traditional Chinese Medicine Hospital, Meishan, Sichuan, China
- Miao He
  - Institute of Blood Transfusion, Chinese Academy of Medical Sciences, Chengdu, Sichuan, China
- Zhenxin Fan
  - Key Laboratory of Bioresources and Ecoenvironment (Ministry of Education), College of Life Sciences, Sichuan University, Chengdu, China
- Tao Guo
  - Department of Gynecology and Obstetrics, West China Second Hospital, Sichuan University, Chengdu, China
8
Espinoza JL, Phillips A, Prentice MB, Tan GS, Kamath PL, Lloyd KG, Dupont CL. Unveiling the Microbial Realm with VEBA 2.0: A modular bioinformatics suite for end-to-end genome-resolved prokaryotic, (micro)eukaryotic, and viral multi-omics from either short- or long-read sequencing. bioRxiv 2024:2024.03.08.583560. [PMID: 38559265 PMCID: PMC10979853 DOI: 10.1101/2024.03.08.583560] [Indexed: 04/04/2024]
Abstract
The microbiome is a complex community of microorganisms, encompassing prokaryotic (bacterial and archaeal), eukaryotic, and viral entities. This microbial ensemble plays a pivotal role in influencing the health and productivity of diverse ecosystems while shaping the web of life. However, many software suites developed to study microbiomes analyze only the prokaryotic community and provide limited to no support for viruses and microeukaryotes. Previously, we introduced the Viral Eukaryotic Bacterial Archaeal (VEBA) open-source software suite to address this critical gap in microbiome research by extending genome-resolved analysis beyond prokaryotes to encompass the understudied realms of eukaryotes and viruses. Here we present VEBA 2.0 with key updates including a comprehensive clustered microeukaryotic protein database, rapid genome/protein-level clustering, bioprospecting, non-coding/organelle gene modeling, genome-resolved taxonomic/pathway profiling, long-read support, and containerization. We demonstrate VEBA's versatile application through the analysis of diverse case studies including marine water, Siberian permafrost, and white-tailed deer lung tissues with the latter showcasing how to identify integrated viruses. VEBA represents a crucial advancement in microbiome research, offering a powerful and accessible platform that bridges the gap between genomics and biotechnological solutions.
Affiliation(s)
- Josh L. Espinoza
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Allan Phillips
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Gene S. Tan
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
- Pauline L. Kamath
  - School of Food and Agriculture, University of Maine, Orono, ME 04469, USA
- Karen G. Lloyd
  - Microbiology Department, University of Tennessee, Knoxville, TN 37917, USA
- Chris L. Dupont
  - Department of Environment and Sustainability, J. Craig Venter Institute, La Jolla, CA 92037, USA
  - Department of Genomic Medicine and Infectious Diseases, J. Craig Venter Institute, La Jolla, CA 92037, USA
9
Matchado MS, Rühlemann M, Reitmeier S, Kacprowski T, Frost F, Haller D, Baumbach J, List M. On the limits of 16S rRNA gene-based metagenome prediction and functional profiling. Microb Genom 2024; 10:001203. [PMID: 38421266 PMCID: PMC10926695 DOI: 10.1099/mgen.0.001203] [Received: 11/24/2023] [Accepted: 02/05/2024] [Indexed: 03/02/2024]
Abstract
Molecular profiling techniques such as metagenomics, metatranscriptomics or metabolomics offer important insights into the functional diversity of the microbiome. In contrast, 16S rRNA gene sequencing, a widespread and cost-effective technique to measure microbial diversity, only allows for indirect estimation of microbial function. To mitigate this, tools such as PICRUSt2, Tax4Fun2, PanFP and MetGEM infer functional profiles from 16S rRNA gene sequencing data using different algorithms. Prior studies have cast doubts on the quality of these predictions, motivating us to systematically evaluate these tools using matched 16S rRNA gene sequencing, metagenomic datasets, and simulated data. Our contribution is threefold: (i) using simulated data, we investigate if technical biases could explain the discordance between inferred and expected results; (ii) considering human cohorts for type two diabetes, colorectal cancer and obesity, we test if health-related differential abundance measures of functional categories are concordant between 16S rRNA gene-inferred and metagenome-derived profiles and; (iii) since 16S rRNA gene copy number is an important confounder in functional profiles inference, we investigate if a customised copy number normalisation with the rrnDB database could improve the results. Our results show that 16S rRNA gene-based functional inference tools generally do not have the necessary sensitivity to delineate health-related functional changes in the microbiome and should thus be used with care. Furthermore, we outline important differences in the individual tools tested and offer recommendations for tool selection.
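The copy number confounder mentioned in point (iii) can be illustrated with a small worked example. This is a generic sketch of copy number normalisation, not the customised procedure evaluated in the paper. The function name, the taxon labels, and the copy numbers are hypothetical; in practice the copy numbers would be looked up in a resource such as rrnDB.

```python
def normalise_by_copy_number(counts, copy_numbers, default_copies=1.0):
    """Convert 16S read counts into copy-number-corrected relative abundances.

    counts:         mapping of taxon -> 16S read count
    copy_numbers:   mapping of taxon -> 16S rRNA gene copies per genome
    default_copies: fallback for taxa without a known copy number (assumption)
    """
    # Divide each count by the taxon's copy number to estimate genome counts.
    adjusted = {
        taxon: count / copy_numbers.get(taxon, default_copies)
        for taxon, count in counts.items()
    }
    total = sum(adjusted.values())
    if total == 0:
        return {taxon: 0.0 for taxon in adjusted}
    # Rescale so the corrected abundances sum to 1.
    return {taxon: value / total for taxon, value in adjusted.items()}


if __name__ == "__main__":
    # Hypothetical example: taxon_A carries 7 copies of the gene, taxon_B only 1.
    reads = {"taxon_A": 70, "taxon_B": 30}
    copies = {"taxon_A": 7, "taxon_B": 1}
    print(normalise_by_copy_number(reads, copies))
```

In this made-up case taxon_A accounts for 70% of the reads but only 25% of the estimated genomes once its seven gene copies are taken into account, which is the kind of distortion the normalisation is meant to reduce.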
Collapse
Affiliation(s)
- Monica Steffi Matchado
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Malte Rühlemann
  - Institute of Clinical Molecular Biology, Kiel University, Kiel, Germany
- Sandra Reitmeier
  - ZIEL - Institute for Food & Health, Core Facility Microbiome, Technical University of Munich, Freising, Germany
- Tim Kacprowski
  - Division Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of Technische Universität Braunschweig and Hannover Medical School, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
  - Department of Computational Biology of Infection Research, Helmholtz Centre for Infection Research (HZI), Braunschweig, Germany
- Fabian Frost
  - Department of Medicine A, University Medicine Greifswald, Greifswald, Germany
- Dirk Haller
  - ZIEL - Institute for Food & Health, Core Facility Microbiome, Technical University of Munich, Freising, Germany
  - Chair of Nutrition and Immunology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Jan Baumbach
  - Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
  - Institute of Mathematics and Computer Science, University of Southern Denmark, Odense, Denmark
- Markus List
  - Data Science in Systems Biology, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
10
Ospino MC, Engel K, Ruiz-Navas S, Binns WJ, Doxey AC, Neufeld JD. Evaluation of multiple displacement amplification for metagenomic analysis of low biomass samples. ISME COMMUNICATIONS 2024; 4:ycae024. [PMID: 38500705 PMCID: PMC10945365 DOI: 10.1093/ismeco/ycae024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 02/05/2024] [Accepted: 02/12/2024] [Indexed: 03/20/2024]
Abstract
Combining multiple displacement amplification (MDA) with metagenomics enables the analysis of samples with extremely low DNA concentrations, making them suitable for high-throughput sequencing. Although amplification bias and nonspecific amplification have been reported from MDA-amplified samples, the impact of MDA on metagenomic datasets is not well understood. We compared three MDA methods (i.e. bulk MDA, emulsion MDA, and primase MDA) for metagenomic analysis of two DNA template concentrations (approx. 1 and 100 pg) derived from a microbial community standard ("mock community") and two low biomass environmental samples (i.e. borehole fluid and groundwater). We assessed the impact of MDA on metagenome-based community composition, assembly quality, functional profiles, and binning. We found amplification bias against high GC content genomes but relatively low nonspecific amplification such as chimeras, artifacts, or contamination for all MDA methods. We observed MDA-associated representational bias for microbial community profiles, especially for low-input DNA and with the primase MDA method. Nevertheless, similar taxa were represented in MDA-amplified libraries to those of unamplified samples. The MDA libraries were highly fragmented, but similar functional profiles to the unamplified libraries were obtained for bulk MDA and emulsion MDA at higher DNA input and across these MDA libraries for the groundwater sample. Medium to low-quality bins were possible for the high input bulk MDA metagenomes for the simplest microbial communities (borehole fluid and mock community). Although MDA-based amplification should be avoided, it can still reveal meaningful taxonomic and functional information from samples with extremely low DNA concentration where direct metagenomics is otherwise impossible.
Affiliation(s)
- Katja Engel
  - Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Santiago Ruiz-Navas
  - Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- W Jeffrey Binns
  - Safety and Technical Research, Nuclear Waste Management Organization of Canada, Toronto, Ontario M4T 2S3, Canada
- Andrew C Doxey
  - Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
- Josh D Neufeld
  - Department of Biology, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
11
Puente-Sánchez F, Hoetzinger M, Buck M, Bertilsson S. Exploring environmental intra-species diversity through non-redundant pangenome assemblies. Mol Ecol Resour 2023; 23:1724-1736. [PMID: 37382302 DOI: 10.1111/1755-0998.13826] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/07/2022] [Revised: 05/24/2023] [Accepted: 06/15/2023] [Indexed: 06/30/2023]
Abstract
At the genome level, microorganisms are highly adaptable both in terms of allele and gene composition. Such heritable traits emerge in response to different environmental niches and can have a profound influence on microbial community dynamics. As a consequence, any individual genome or population will contain merely a fraction of the total genetic diversity of any operationally defined "species", whose ecological potential can thus be fully understood only by studying all of their genomes and the genes therein. This concept, known as the pangenome, is valuable for studying microbial ecology and evolution, as it partitions genomes into core regions (present in all the genomes from a species, and responsible for housekeeping and species-level niche adaptation among others) and accessory regions (present only in some, and responsible for intra-species differentiation). Here we present SuperPang, an algorithm producing pangenome assemblies from a set of input genomes of varying quality, including metagenome-assembled genomes (MAGs). SuperPang runs in linear time and its results are complete, non-redundant, preserve gene ordering and contain both coding and non-coding regions. Our approach provides a modular view of the pangenome, identifying operons and genomic islands, and allowing their prevalence in different populations to be tracked. We illustrate this by analysing intra-species diversity in Polynucleobacter, a bacterial genus ubiquitous in freshwater ecosystems, characterized by their streamlined genomes and their ecological versatility. We show how SuperPang facilitates the simultaneous analysis of allelic and gene content variation under different environmental pressures, allowing us to study the drivers of microbial diversification at unprecedented resolution.
Affiliation(s)
- Fernando Puente-Sánchez
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Matthias Hoetzinger
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Moritz Buck
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Stefan Bertilsson
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
12
Akter S, Rahman MS, Ali H, Minch B, Mehzabin K, Siddique MM, Galib SM, Yesmin F, Azmuda N, Adnan N, Hasan NA, Rahman SR, Moniruzzaman M, Ahmed MF. Phylogenetic diversity and functional potential of the microbial communities along the Bay of Bengal coast. Sci Rep 2023; 13:15976. [PMID: 37749192 PMCID: PMC10520010 DOI: 10.1038/s41598-023-43306-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Accepted: 09/21/2023] [Indexed: 09/27/2023]
Abstract
The Bay of Bengal, the world's largest bay, is bordered by populous countries and rich in resources like fisheries, oil, gas, and minerals, while also hosting diverse marine ecosystems such as coral reefs, mangroves, and seagrass beds; regrettably, its microbial diversity and ecological significance have received limited research attention. Here, we present amplicon (16S and 18S) profiling and shotgun metagenomics data regarding microbial communities from BoB's eastern coast, viz., Saint Martin and Cox's Bazar, Bangladesh. From the 16S barcoding data, Proteobacteria appeared to be the dominant phylum in both locations, with Alteromonas, Methylophaga, Anaerospora, Marivita, and Vibrio dominating in Cox's Bazar and Pseudoalteromonas, Nautella, Marinomonas, Vibrio, and Alteromonas dominating the Saint Martin site. From the 18S barcoding data, Ochrophyta, Chlorophyta, and Protalveolata appeared among the most abundant eukaryotic divisions in both locations, with significantly higher abundance of Choanoflagellida, Florideophycidae, and Dinoflagellata in Cox's Bazar. The shotgun sequencing data reveals that in both locations, Alteromonas is the most prevalent bacterial genus, closely paralleling the dominance observed in the metabarcoding data, with Methylophaga in Cox's Bazar and Vibrio in Saint Martin. Functional annotations revealed that the microbial communities in these samples harbor genes for biofilm formation, quorum sensing, xenobiotics degradation, antimicrobial resistance, and a variety of other processes. Together, these results provide the first molecular insight into the functional and phylogenetic diversity of microbes along the BoB coast of Bangladesh. This baseline understanding of microbial community structure and functional potential will be critical for assessing impacts of climate change, pollution, and other anthropogenic disturbances on this ecologically and economically vital bay.
Affiliation(s)
- Salma Akter
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- M Shaminur Rahman
  - Department of Microbiology, Jashore University of Science and Technology, Jashore, Bangladesh
- Hazrat Ali
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Benjamin Minch
  - Department of Marine Biology and Ecology, Rosenstiel School of Marine, Atmospheric, and Earth Science, University of Miami, Coral Gables, FL, USA
- Kaniz Mehzabin
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Md Moradul Siddique
  - Department of Computer Science and Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
- Syed Md Galib
  - Department of Computer Science and Engineering, Jashore University of Science and Technology, Jashore, Bangladesh
- Farida Yesmin
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Nafisa Azmuda
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Nihad Adnan
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
- Nur A Hasan
  - University of Maryland, College Park, MD, USA
- Mohammad Moniruzzaman
  - Department of Marine Biology and Ecology, Rosenstiel School of Marine, Atmospheric, and Earth Science, University of Miami, Coral Gables, FL, USA
- Md Firoz Ahmed
  - Department of Microbiology, Jahangirnagar University, Savar, Dhaka, Bangladesh
13
Arikawa K, Hosokawa M. Uncultured prokaryotic genomes in the spotlight: An examination of publicly available data from metagenomics and single-cell genomics. Comput Struct Biotechnol J 2023; 21:4508-4518. [PMID: 37771751 PMCID: PMC10523443 DOI: 10.1016/j.csbj.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 09/10/2023] [Accepted: 09/10/2023] [Indexed: 09/30/2023]
Abstract
Owing to the ineffectiveness of traditional culture techniques for the vast majority of microbial species, culture-independent analyses utilizing next-generation sequencing and bioinformatics have become essential for gaining insight into microbial ecology and function. This mini-review focuses on two essential methods for obtaining genetic information from uncultured prokaryotes: metagenomics and single-cell genomics. We analyzed the registration status of uncultured prokaryotic genome data from major public databases and assessed the advantages and limitations of both methods. Metagenomics generates a significant quantity of sequence data and multiple prokaryotic genomes using straightforward experimental procedures. However, in ecosystems with high microbial diversity, such as soil, most genes are presented as brief, disconnected contigs that lack any association of highly conserved genes and mobile genetic elements with individual species genomes. Although technically more challenging, single-cell genomics offers valuable insights into complex ecosystems by providing strain-resolved genomes, addressing issues in metagenomics. Recent technological advancements, such as long-read sequencing, machine learning algorithms, and in silico protein structure prediction, in combination with vast genomic data, have the potential to overcome the current technical challenges and facilitate a deeper understanding of uncultured microbial ecosystems and microbial dark matter genes and proteins. In light of this, it is imperative that continued innovation in both methods and technologies take place to create high-quality reference genome databases that will support future microbial research and industrial applications.
Affiliation(s)
- Koji Arikawa
  - Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
  - bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Masahito Hosokawa
  - Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
  - bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
  - Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
  - Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
  - Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
14
Baker JL. Illuminating the oral microbiome and its host interactions: recent advancements in omics and bioinformatics technologies in the context of oral microbiome research. FEMS Microbiol Rev 2023; 47:fuad051. [PMID: 37667515 PMCID: PMC10503653 DOI: 10.1093/femsre/fuad051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 08/02/2023] [Accepted: 09/01/2023] [Indexed: 09/06/2023]
Abstract
The oral microbiota has an enormous impact on human health, with oral dysbiosis now linked to many oral and systemic diseases. Recent advancements in sequencing, mass spectrometry, bioinformatics, computational biology, and machine learning are revolutionizing oral microbiome research, enabling analysis at an unprecedented scale and level of resolution using omics approaches. This review offers a comprehensive perspective on the current state-of-the-art tools available to perform genomics, metagenomics, phylogenomics, pangenomics, transcriptomics, proteomics, metabolomics, lipidomics, and multi-omics analysis on (all) microbiomes, and then provides examples of how the techniques have been applied to research of the oral microbiome specifically. Key findings of these studies and remaining challenges for the field are highlighted. Although the methods discussed here are placed in the context of their contributions to oral microbiome research specifically, they are pertinent to the study of any microbiome, and the intended audience of this review includes researchers who would simply like an introduction to microbial omics and/or an update on the latest omics methods. Continued research of the oral microbiota using omics approaches is crucial and will lead to dramatic improvements in human health, longevity, and quality of life.
Affiliation(s)
- Jonathon L Baker
  - Department of Oral Rehabilitation & Biosciences, School of Dentistry, Oregon Health & Science University, 3181 Sam Jackson Park Road, Portland, OR 97202, United States
  - Genomic Medicine Group, J. Craig Venter Institute, La Jolla, CA 92037, United States
  - Department of Pediatrics, UC San Diego School of Medicine, La Jolla, CA 92093, United States
15
Kumar R, Yadav G, Kuddus M, Ashraf GM, Singh R. Unlocking the microbial studies through computational approaches: how far have we reached? ENVIRONMENTAL SCIENCE AND POLLUTION RESEARCH INTERNATIONAL 2023; 30:48929-48947. [PMID: 36920617 PMCID: PMC10016191 DOI: 10.1007/s11356-023-26220-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2022] [Accepted: 02/24/2023] [Indexed: 04/16/2023]
Abstract
The metagenomics approach accelerated the study of genetic information from uncultured microbes and complex microbial communities. In silico research also facilitated an understanding of protein-DNA interactions, protein-protein interactions, docking between proteins and phyto/biochemicals for drug design, and modeling of the 3D structure of proteins. These in silico approaches provided insight into analyzing pathogenic and nonpathogenic strains that helped in the identification of probable genes for vaccines and antimicrobial agents and in comparing whole-genome sequences to study microbial evolution. Artificial intelligence, more precisely machine learning (ML) and deep learning (DL), has proven to be a promising approach in the field of microbiology to handle, analyze, and utilize the large data sets that are generated through nucleic acid sequencing and proteomics. This enabled the understanding of the functional and taxonomic diversity of microorganisms. ML and DL have been used in the prediction and forecasting of diseases and applied to trace environmental contaminants and environmental quality. This review presents an in-depth analysis of the recent application of in silico approaches in microbial genomics, proteomics, functional diversity, vaccine development, and drug design.
Affiliation(s)
- Rajnish Kumar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India
  - Department of Veterinary Medicine and Surgery, College of Veterinary Medicine, University of Missouri, Columbia, MO, USA
- Garima Yadav
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India
- Mohammed Kuddus
  - Department of Biochemistry, College of Medicine, University of Hail, Hail, Saudi Arabia
- Ghulam Md Ashraf
  - Department of Medical Laboratory Sciences, College of Health Sciences, and Sharjah Institute for Medical Research, University of Sharjah, Sharjah, 27272, United Arab Emirates
- Rachana Singh
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Lucknow, Uttar Pradesh, India
16
Rohrer SD, Jiménez-Uzcátegui G, Parker PG, Chubiz LM. Composition and function of the Galapagos penguin gut microbiome vary with age, location, and a putative bacterial pathogen. Sci Rep 2023; 13:5358. [PMID: 37005428 PMCID: PMC10067942 DOI: 10.1038/s41598-023-31826-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Accepted: 03/17/2023] [Indexed: 04/04/2023]
Abstract
Microbial colonization plays a direct role in host health. Understanding the ecology of the resident microbial community for a given host species is thus an important step for detecting population vulnerabilities like disease. However, the idea of integrating microbiome research into conservation is still relatively new, and wild birds have received less attention in this field than mammals or domesticated animals. Here we examine the composition and function of the gut microbiome of the endangered Galapagos penguin (Spheniscus mendiculus) with the goals of characterizing the normal microbial community and resistome, identifying likely pathogens, and testing hypotheses of structuring forces for this community based on demographics, location, and infection status. We collected fecal samples from wild penguins in 2018 and performed 16S rRNA gene sequencing and whole genome sequencing (WGS) on extracted DNA. 16S sequencing revealed that the bacterial phyla Fusobacteria, Epsilonbacteraeota, Firmicutes, and Proteobacteria dominate the community. Functional pathways were computed from WGS data, showing genetic functional potential primarily focused on metabolism: amino acid metabolism, carbohydrate metabolism, and energy metabolism are the most well-represented functional groups. WGS samples were each screened for antimicrobial resistance, characterizing a resistome made up of nine antibiotic resistance genes. Samples were screened for potential enteric pathogens using virulence factors as indicators; Clostridium perfringens was revealed as a likely pathogen. Overall, three factors appear to be shaping the alpha and beta diversity of the microbial community: penguin developmental stage, sampling location, and C. perfringens. We found that juvenile penguins have significantly lower alpha diversity than adults based on three metrics, as well as significantly different beta diversity. Location effects are minimal, but one site has significantly lower Shannon diversity than the other primary sites. Finally, when samples were grouped by C. perfringens virulence factors, we found dramatic changes in beta diversity based on operational taxonomic units, protein families, and functional pathways. This study provides a baseline microbiome for an endangered species, implicates both penguin age and the presence of a potential bacterial pathogen as primary factors associated with microbial community variance, and reveals widespread antibiotic resistance genes across the population.
Affiliation(s)
- Sage D Rohrer
  - Department of Biology and Whitney R. Harris World Ecology Center, University of Missouri-St. Louis, One University Blvd., St. Louis, MO, 63121, USA
- Patricia G Parker
  - Department of Biology and Whitney R. Harris World Ecology Center, University of Missouri-St. Louis, One University Blvd., St. Louis, MO, 63121, USA
  - WildCare Institute, Saint Louis Zoo, One Government Drive, St. Louis, MO, 63110, USA
- Lon M Chubiz
  - Department of Biology and Whitney R. Harris World Ecology Center, University of Missouri-St. Louis, One University Blvd., St. Louis, MO, 63121, USA
17
Zhang Z, Zhang L, Zhang G, Zhao Z, Wang H, Ju F. Deduplication Improves Cost-Efficiency and Yields of De Novo Assembly and Binning of Shotgun Metagenomes in Microbiome Research. Microbiol Spectr 2023; 11:e0428222. [PMID: 36744896 PMCID: PMC10101064 DOI: 10.1128/spectrum.04282-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 01/18/2023] [Indexed: 02/07/2023]
Abstract
In the last decade, metagenomics has revolutionized the study of microbial communities. However, the presence of artificial duplicate reads, arising mainly from the preparation of metagenomic DNA sequencing libraries, and their impacts on metagenomic assembly and binning have not previously been brought to attention. Here, we explicitly investigated the effects of duplicate reads on metagenomic assemblies and binning based on analyses of five groups of representative metagenomes with distinct microbiome complexities. Our results showed that deduplication considerably increased the binning yields (by 3.5% to 80%) for most of the metagenomic data sets examined thanks to the improved contig length and coverage profiling of metagenome-assembled contigs, whereas it slightly decreased the binning yields of metagenomes with low complexity (e.g., human gut metagenomes). Specifically, 411 versus 397, 331 versus 317, 104 versus 88, and 9 versus 5 metagenome-assembled genomes (MAGs) were recovered from MEGAHIT assemblies of bioreactor sludge, surface water, lake sediment, and forest soil metagenomes, respectively. Notably, deduplication significantly reduced the computational costs of the metagenomic assembly, including the elapsed time (9.0% to 29.9%) and the maximum memory requirement (4.3% to 37.1%). Collectively, we recommend the removal of duplicate reads in metagenomes with high complexity before assembly and binning analyses, for example, the forest soil metagenomes examined in this study.
IMPORTANCE: Duplicated reads in shotgun metagenomes are usually considered technical artifacts. Their presence in metagenomes would theoretically not only introduce bias into the quantitative analysis but also result in mistakes in the coverage profile, leading to adverse effects on or even failures in metagenomic assembly and binning, as the widely used metagenome assemblers and binners all need coverage information for graph partitioning and assembly binning, respectively. However, this issue was seldom noticed, and its impacts on downstream essential bioinformatic procedures (e.g., assembly and binning) remained unclear. In this study, we comprehensively evaluated for the first time the implications of duplicate reads for the de novo assembly and binning of real metagenomic data sets by comparing the assembly qualities, binning yields, and requirements for computational resources with and without the removal of duplicate reads. It was revealed that deduplication considerably increased the binning yields of metagenomes with high complexity and significantly reduced the computational costs, including the elapsed time and the maximum memory requirement, for most of the metagenomes studied. These results provide empirical references for more cost-efficient metagenomic analyses in microbiome research.
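To make the notion of duplicate-read removal concrete, here is a minimal sketch that drops read pairs whose forward and reverse sequences exactly match a pair already seen. It is an assumption-laden illustration only: the abstract does not name the deduplication tool or criteria the authors used, and the function name and toy reads below are hypothetical.

```python
def deduplicate_pairs(read_pairs):
    """Keep the first occurrence of each (forward, reverse) sequence pair.

    read_pairs is an iterable of (name, forward_seq, reverse_seq) tuples.
    Only exact sequence matches count as duplicates in this sketch.
    """
    seen = set()
    unique = []
    for name, forward_seq, reverse_seq in read_pairs:
        key = (forward_seq, reverse_seq)
        if key in seen:
            continue  # an identical pair was already kept
        seen.add(key)
        unique.append((name, forward_seq, reverse_seq))
    return unique


# Toy input: r2 duplicates r1 exactly; r3 shares only the forward read.
pairs = [
    ("r1", "ACGT", "TTGA"),
    ("r2", "ACGT", "TTGA"),
    ("r3", "ACGT", "GGCC"),
]
kept = deduplicate_pairs(pairs)
print([name for name, _, _ in kept])
```

Keying on both mates keeps pairs that share one read but differ in the other, so inserts of different length that start at the same position are not collapsed.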
Affiliation(s)
- Zhiguo Zhang
  - College of Environmental and Resources Sciences, Zhejiang University, Hangzhou, Zhejiang Province, China
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
- Lu Zhang
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
- Guoqing Zhang
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
- Ze Zhao
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
- Hui Wang
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
- Feng Ju
  - Research Center for Industries of the Future, Key Laboratory of Coastal Environment and Resources of Zhejiang Province, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Center of Synthetic Biology and Integrated Bioengineering, School of Engineering, Westlake University, Hangzhou, Zhejiang Province, China
  - Institute of Advanced Technology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, China
  - Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang Province, China
18
Arumugam K, Bessarab I, Haryono MAS, Williams RBH. Recovery and Analysis of Long-Read Metagenome-Assembled Genomes. Methods Mol Biol 2023; 2649:235-259. [PMID: 37258866 DOI: 10.1007/978-1-0716-3072-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
The development of long-read nucleic acid sequencing is beginning to have a substantial impact on the conduct of metagenome analysis, particularly in relation to the problem of recovering the genomes of member species of complex microbial communities. Here we outline bioinformatics workflows for the recovery and characterization of complete genomes from long-read metagenome data, along with some complementary procedures for comparing cognate draft genomes and gene quality obtained from short-read and long-read sequencing.
Affiliation(s)
- Krithika Arumugam
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Irina Bessarab
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
- Mindia A S Haryono
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
- Rohan B H Williams
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
19
Vuong P, Wise MJ, Whiteley AS, Kaur P. Ten simple rules for investigating (meta)genomic data from environmental ecosystems. PLoS Comput Biol 2022; 18:e1010675. [PMID: 36480496 PMCID: PMC9731419 DOI: 10.1371/journal.pcbi.1010675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Affiliation(s)
- Paton Vuong
  - UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
- Michael J. Wise
  - School of Physics, Mathematics and Computing, University of Western Australia, Perth, Australia
  - The Marshall Centre of Infectious Diseases, School of Biological Sciences, The University of Western Australia, Perth, Australia
- Andrew S. Whiteley
  - Centre for Environment & Life Sciences, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Floreat, Australia
- Parwinder Kaur
  - UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
20
Critical Assessment of Short-Read Assemblers for the Metagenomic Identification of Foodborne and Waterborne Pathogens Using Simulated Bacterial Communities. Microorganisms 2022; 10:2416. [PMID: 36557669 PMCID: PMC9784204 DOI: 10.3390/microorganisms10122416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 11/30/2022] [Accepted: 12/03/2022] [Indexed: 12/12/2022]
Abstract
Metagenomics offers the highest level of strain discrimination of bacterial pathogens from complex food and water microbiota. With the rapid evolution of assembly algorithms, defining an optimal assembler based on performance in the metagenomic identification of foodborne and waterborne pathogens is warranted. We aimed to benchmark short-read assemblers for the metagenomic identification of foodborne and waterborne pathogens using simulated bacterial communities. Bacterial communities on fresh spinach and in surface water were simulated by generating paired-end short reads of Illumina HiSeq, MiSeq, and NovaSeq at different sequencing depths. Multidrug-resistant Salmonella Indiana SI43 and Pseudomonas aeruginosa PAO1 were included in the simulated communities on fresh spinach and in surface water, respectively. ABySS, IDBA-UD, MaSuRCA, MEGAHIT, metaSPAdes, and Ray Meta were benchmarked in terms of assembly quality and the identification of plasmids, virulence genes, Salmonella pathogenicity islands, antimicrobial resistance genes, and chromosomal point mutations, as well as serotyping, multilocus sequence typing, and whole-genome phylogeny. Overall, MEGAHIT, metaSPAdes, and Ray Meta were more effective for metagenomic identification. We did not obtain an optimal assembler when using the extracted reads classified as Salmonella or P. aeruginosa for downstream genomic analyses, but the extracted reads showed consistent phylogenetic topology with the reference genome when they were aligned with Salmonella or P. aeruginosa strains. In most cases, HiSeq, MiSeq, and NovaSeq were comparable at the same sequencing depth, while higher sequencing depths generally led to more accurate results. As assembly algorithms advance and mature, the evaluation of assemblers should be a continuous process.
21
Lai S, Pan S, Sun C, Coelho LP, Chen WH, Zhao XM. metaMIC: reference-free misassembly identification and correction of de novo metagenomic assemblies. Genome Biol 2022; 23:242. [PMID: 36376928 PMCID: PMC9661791 DOI: 10.1186/s13059-022-02810-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/10/2021] [Accepted: 11/01/2022] [Indexed: 11/16/2022]
Abstract
Evaluating the quality of metagenomic assemblies is important for constructing reliable metagenome-assembled genomes and downstream analyses. Here, we present metaMIC (https://github.com/ZhaoXM-Lab/metaMIC), a machine learning-based tool for identifying and correcting misassemblies in metagenomic assemblies. Benchmarking results on both simulated and real datasets demonstrate that metaMIC outperforms existing tools when identifying misassembled contigs. Furthermore, metaMIC is able to localize the misassembly breakpoints, and the correction of misassemblies by splitting at misassembly breakpoints can improve downstream scaffolding and binning results.
Affiliation(s)
- Senying Lai
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Shaojun Pan
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- Chuqing Sun
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Luis Pedro Coelho
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
- Wei-Hua Chen
  - Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei, China
  - College of Life Science, Henan Normal University, Xinxiang, Henan, China
- Xing-Ming Zhao
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
  - MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Fudan University, Shanghai, China
  - State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China
  - Research Institute of Intelligent Complex Systems, Fudan University, Shanghai, China
  - International Human Phenome Institutes (Shanghai), Shanghai, China
  - Zhangjiang Fudan International Innovation Center, Shanghai, China
22
Wu Z, Wang Y, Zeng J, Zhou Y. Constructing metagenome-assembled genomes for almost all components in a real bacterial consortium for binning benchmarking. BMC Genomics 2022; 23:746. [DOI: 10.1186/s12864-022-08967-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2022] [Accepted: 10/25/2022] [Indexed: 11/11/2022]
Abstract
Background
Many binning approaches have been developed for recovering metagenome-assembled genomes (MAGs), and they are evaluated by two main strategies. Comparison to known genomes prevails over the alternative strategy, which relies on single-copy genes. However, there is still no dataset in which all genomes are known for a real (not simulated) bacterial consortium.
Results
Here, we continue investigating the real bacterial consortium F1RT, which we previously enriched and sequenced, because its low complexity makes it highly likely that all MAGs can be recovered. The improved F1RT metagenome, reassembled here with metaSPAdes, utilizes about 98.62% of reads, and a series of analyses of the remaining reads suggests that F1RT is very unlikely to contain other low-abundance organisms, demonstrating that almost all MAGs are successfully assembled. Then, 4 isolates are obtained and individually sequenced. Based on the 4 isolate genomes and the entire metagenome, an elaborate pipeline is developed in-house to construct all F1RT MAGs. A series of assessments demonstrates the high reliability of this reconstruction. Our findings further show that this dataset has several properties that are challenging for binning and is thus suitable for comparing currently available advanced binning tools or benchmarking novel binners. Using this dataset, 8 advanced binning algorithms are assessed, giving useful insights for developing novel approaches. In addition, compared with our previous study, two novel MAGs termed FC8 and FC9 are discovered here, and 7 MAGs are solidly recovered for species without any available genomes.
Conclusion
To our knowledge, this is the first dataset constructed with almost all known MAGs for a non-simulated consortium. We hope that this dataset will be used routinely to complement mock datasets for evaluating binning methods, and thereby facilitate future binning and metagenomic studies.
23
Hempel CA, Wright N, Harvie J, Hleap JS, Adamowicz S, Steinke D. Metagenomics versus total RNA sequencing: most accurate data-processing tools, microbial identification accuracy and perspectives for ecological assessments. Nucleic Acids Res 2022; 50:9279-9293. [PMID: 35979944 PMCID: PMC9458450 DOI: 10.1093/nar/gkac689] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/03/2022] [Revised: 07/05/2022] [Accepted: 07/29/2022] [Indexed: 12/24/2022]
Abstract
Metagenomics and total RNA sequencing (total RNA-Seq) have the potential to improve the taxonomic identification of diverse microbial communities, which could allow for the incorporation of microbes into routine ecological assessments. However, these target-PCR-free techniques require more testing and optimization. In this study, we processed metagenomics and total RNA-Seq data from a commercially available microbial mock community using 672 data-processing workflows, identified the most accurate data-processing tools, and compared their microbial identification accuracy at equal and increasing sequencing depths. The accuracy of data-processing tools varied substantially among replicates. Total RNA-Seq was more accurate than metagenomics at equal sequencing depths and even at sequencing depths almost one order of magnitude lower than those of metagenomics. We show that while data-processing tools require further exploration, total RNA-Seq might be a favorable alternative to metagenomics for target-PCR-free taxonomic identifications of microbial communities and might enable a substantial reduction in sequencing costs while maintaining accuracy. This could be particularly advantageous for routine ecological assessments, which require cost-effective yet accurate methods, and might allow for the incorporation of microbes into ecological assessments.
Affiliation(s)
- Christopher A Hempel
  - To whom correspondence should be addressed. Tel: +1 519 824 4120; Fax: +1 519 824 5703
- Natalie Wright
  - Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
- Julia Harvie
  - Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
- Jose S Hleap
  - SHARCNET, University of Guelph, Guelph, ON N1G 2W1, Canada
- Sarah J Adamowicz
  - Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
- Dirk Steinke
  - Department of Integrative Biology, University of Guelph, Guelph, ON N1G 2W1, Canada
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, ON N1G 2W1, Canada
24
Li J, Yang F, Xiao M, Li A. Advances and challenges in cataloging the human gut virome. Cell Host Microbe 2022; 30:908-916. [PMID: 35834962 DOI: 10.1016/j.chom.2022.06.003] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/16/2022] [Revised: 06/02/2022] [Accepted: 06/07/2022] [Indexed: 11/17/2022]
Abstract
The human gut virome, which is often referred to as the "dark matter" of the gut microbiome, remains understudied. A better understanding of the composition and variations of the gut virome across populations is critical for exploring its impact on diseases and health. A series of advances in the characterization of the human gut virome have unveiled the high genetic diversity and various functional potentials of gut viruses. Here, we summarize the recently available human gut virome databases and discuss their features, procedures, and challenges, with the intention of providing a reference for researchers choosing a profiling database. We also propose a "best practice" for cataloging the viral population.
Affiliation(s)
- Junhua Li
  - BGI-Shenzhen, Shenzhen 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI-Shenzhen, Shenzhen 518083, China
- Minfeng Xiao
  - BGI-Shenzhen, Shenzhen 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI-Shenzhen, Shenzhen 518083, China
- Aixin Li
  - BGI-Shenzhen, Shenzhen 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI-Shenzhen, Shenzhen 518083, China
25
Goussarov G, Mysara M, Vandamme P, Van Houdt R. Introduction to the principles and methods underlying the recovery of metagenome-assembled genomes from metagenomic data. Microbiologyopen 2022; 11:e1298. [PMID: 35765182 PMCID: PMC9179125 DOI: 10.1002/mbo3.1298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/18/2022]
Abstract
The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often seen as common knowledge and often not detailed or fragmented. Therefore, we reviewed these to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome‐assembled genomes, integrating DNA sequencing, assembly, binning, identification and annotation.
Affiliation(s)
- Gleb Goussarov
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
  - Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Mohamed Mysara
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
- Peter Vandamme
  - Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
- Rob Van Houdt
  - Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
26
Yu KHO, Fang X, Yao H, Ng B, Leung TK, Wang LL, Lin CH, Chan ASW, Leung WK, Leung SY, Ho JWK. Evaluation of Experimental Protocols for Shotgun Whole-Genome Metagenomic Discovery of Antibiotic Resistance Genes. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:1313-1321. [PMID: 32750872 DOI: 10.1109/tcbb.2020.3004063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]
Abstract
Shotgun metagenomics has enabled the discovery of antibiotic resistance genes (ARGs). Although there have been numerous studies benchmarking the bioinformatics methods for shotgun metagenomic data analysis, there has not yet been a study that systematically evaluates the performance of different experimental protocols on metagenomic species profiling and ARG detection. In this study, we generated 35 whole-genome shotgun metagenomic sequencing data sets for five samples (three human stool samples and two microbial standards) using seven experimental protocols (KAPA or Flex kits at 50 ng, 10 ng, or 5 ng input amounts; XT kit at 1 ng input amount). Using this comprehensive resource, we evaluated the seven protocols in terms of robust detection of ARGs and microbial abundance estimation at various sequencing depths. We found that the data generated by the seven protocols are largely similar. The inter-protocol variability is significantly smaller than the variability between samples or sequencing depths. We found that a sequencing depth of more than 30M is suitable for human stool samples. A higher input amount (50 ng) is generally favorable for the KAPA and Flex kits. This systematic benchmarking study sheds light on the impact of sequencing depth, experimental protocol, and DNA input amount on ARG detection in human stool samples.
27
Oh HS, Min U, Jang H, Kim N, Lim J, Chalita M, Chun J. Proposal of a health gut microbiome index based on a meta-analysis of Korean and global population datasets. J Microbiol 2022; 60:533-549. [PMID: 35362897 DOI: 10.1007/s12275-022-1526-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/08/2021] [Revised: 01/03/2022] [Accepted: 01/26/2022] [Indexed: 02/08/2023]
Abstract
The disruption of the human gut microbiota has been linked to host health conditions, including various diseases. However, no reliable index for measuring and predicting a healthy microbiome is currently available. Here, the sequencing data of 1,663 Koreans were obtained from three independent studies. Furthermore, we pooled 3,490 samples from public databases and analyzed a total of 5,153 fecal samples. First, we analyzed Korean gut microbiome covariates to determine the influence of lifestyle on variation in the gut microbiota. Next, patterns of microbiota variations across geographical locations and disease statuses were confirmed using a global cohort and disease data. Based on comprehensive comparative analysis, we were able to define three enterotypes among Korean cohorts, namely, Prevotella type, Bacteroides type, and outlier type. By a thorough categorization of dysbiosis and the evaluation of microbial characteristics using multiple datasets, we identified a wide spectrum of accuracy levels in classifying health and disease states. Using the observed microbiome patterns, we devised an index named the gut microbiome index (GMI) that could consistently predict health conditions from human gut microbiome data. Compared to ecological metrics, the microbial marker index, and machine learning approaches, GMI distinguished between healthy and non-healthy individuals with higher accuracy across various datasets. Thus, this study proposes a potential index to measure the health status of the gut microbiome that is verified with multiethnic data from various diseases, and we expect this model to facilitate further clinical application of gut microbiota data in the future.
Affiliation(s)
- Hyun-Seok Oh
  - ChunLab Inc., Seoul, 06194, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
- Uigi Min
  - ChunLab Inc., Seoul, 06194, Republic of Korea
- Hyejin Jang
  - ChunLab Inc., Seoul, 06194, Republic of Korea
- Namil Kim
  - ChunLab Inc., Seoul, 06194, Republic of Korea
- Jongsik Chun
  - ChunLab Inc., Seoul, 06194, Republic of Korea
  - Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, 08826, Republic of Korea
  - School of Biological Sciences, Seoul National University, Seoul, 08826, Republic of Korea
28
Zayed AA, Wainaina JM, Dominguez-Huerta G, Pelletier E, Guo J, Mohssen M, Tian F, Pratama AA, Bolduc B, Zablocki O, Cronin D, Solden L, Delage E, Alberti A, Aury JM, Carradec Q, da Silva C, Labadie K, Poulain J, Ruscheweyh HJ, Salazar G, Shatoff E, Tara Oceans Coordinators, Bundschuh R, Fredrick K, Kubatko LS, Chaffron S, Culley AI, Sunagawa S, Kuhn JH, Wincker P, Sullivan MB. Cryptic and abundant marine viruses at the evolutionary origins of Earth's RNA virome. Science 2022; 376:156-162. [PMID: 35389782 PMCID: PMC10990476 DOI: 10.1126/science.abm5847] [Citation(s) in RCA: 98] [Impact Index Per Article: 49.0] [Indexed: 01/01/2023]
Abstract
Whereas DNA viruses are known to be abundant, diverse, and commonly key ecosystem players, RNA viruses are insufficiently studied outside disease settings. In this study, we analyzed ≈28 terabases of Global Ocean RNA sequences to expand Earth's RNA virus catalogs and their taxonomy, investigate their evolutionary origins, and assess their marine biogeography from pole to pole. Using new approaches to optimize discovery and classification, we identified RNA viruses that necessitate substantive revisions of taxonomy (doubling phyla and adding >50% new classes) and evolutionary understanding. "Species"-rank abundance determination revealed that viruses of the new phyla "Taraviricota," a missing link in early RNA virus evolution, and "Arctiviricota" are widespread and dominant in the oceans. These efforts provide foundational knowledge critical to integrating RNA viruses into ecological and epidemiological models.
Affiliation(s)
- Ahmed A. Zayed
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- James M. Wainaina
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Guillermo Dominguez-Huerta
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Eric Pelletier
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Jiarong Guo
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Mohamed Mohssen
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Biophysics Graduate Program, Ohio State University, Columbus, OH 43210, USA
- Funing Tian
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Akbar Adjie Pratama
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
- Benjamin Bolduc
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Olivier Zablocki
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Dylan Cronin
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
- Lindsey Solden
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
- Erwan Delage
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
  - Nantes Université, CNRS UMR 6004, LS2N, F-44000 Nantes, France
- Adriana Alberti
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Jean-Marc Aury
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Quentin Carradec
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Corinne da Silva
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Karine Labadie
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Julie Poulain
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Hans-Joachim Ruscheweyh
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Guillem Salazar
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Elan Shatoff
  - Department of Physics, Ohio State University, Columbus, OH 43210, USA
- Ralf Bundschuh
  - The Interdisciplinary Biophysics Graduate Program, Ohio State University, Columbus, OH 43210, USA
  - Department of Physics, Ohio State University, Columbus, OH 43210, USA
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH 43210, USA
  - Division of Hematology, Department of Internal Medicine, Ohio State University, Columbus, OH 43210, USA
- Kurt Fredrick
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
- Laura S. Kubatko
  - Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus, OH 43210, USA
  - Department of Statistics, Ohio State University, Columbus, OH 43210, USA
- Samuel Chaffron
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
  - Nantes Université, CNRS UMR 6004, LS2N, F-44000 Nantes, France
- Alexander I. Culley
  - Département de Biochimie, Microbiologie et Bio-informatique, Université Laval, Québec, Québec G1V 0A6, Canada
- Shinichi Sunagawa
  - Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zurich, Zurich, Switzerland
- Jens H. Kuhn
  - Integrated Research Facility at Fort Detrick, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Fort Detrick, Frederick, MD 21702, USA
- Patrick Wincker
  - Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, 91000 Evry, France
  - Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara Oceans GOSEE, 75016 Paris, France
- Matthew B. Sullivan
  - Department of Microbiology, Ohio State University, Columbus, OH 43210, USA
  - EMERGE Biology Integration Institute, Ohio State University, Columbus, OH 43210, USA
  - Center of Microbiome Science, Ohio State University, Columbus, OH 43210, USA
  - The Interdisciplinary Biophysics Graduate Program, Ohio State University, Columbus, OH 43210, USA
  - Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus, OH 43210, USA
  - Department of Civil, Environmental, and Geodetic Engineering, Ohio State University, Columbus, OH 43210, USA
29
Zhou Y, Liu M, Yang J. Recovering metagenome-assembled genomes from shotgun metagenomic sequencing data: methods, applications, challenges, and opportunities. Microbiol Res 2022; 260:127023. [DOI: 10.1016/j.micres.2022.127023] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/18/2021] [Revised: 03/07/2022] [Accepted: 04/05/2022] [Indexed: 12/12/2022]
30
Li QC, Wang B, Zeng YH, Cai ZH, Zhou J. The Microbial Mechanisms of a Novel Photosensitive Material (Treated Rape Pollen) in Anti-Biofilm Process under Marine Environment. Int J Mol Sci 2022; 23:3837. [PMID: 35409199 PMCID: PMC8998240 DOI: 10.3390/ijms23073837] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/20/2022] [Revised: 03/18/2022] [Accepted: 03/24/2022] [Indexed: 02/01/2023]
Abstract
Marine biofouling is a worldwide problem in coastal areas and affects the maritime industry primarily by attachment of fouling organisms to solid immersed surfaces. Biofilm formation by microbes is the main cause of biofouling. Currently, application of antibacterial materials is an important strategy for preventing bacterial colonization and biofilm formation. A natural three-dimensional carbon skeleton material, TRP (treated rape pollen), attracted our attention owing to its visible-light-driven photocatalytic disinfection property. Based on this, we hypothesized that TRP, which is eco-friendly, would show antifouling performance and could be used for marine antifouling. We then assessed its physiochemical characteristics, oxidant potential, and antifouling ability. The results showed that TRP had excellent photosensitivity and oxidant ability, as well as strong anti-bacterial colonization capability under light-driven conditions. Confocal laser scanning microscopy showed that TRP could disperse pre-established biofilms on stainless steel surfaces in natural seawater. The biodiversity and taxonomic composition of biofilms were significantly altered by TRP (p < 0.05). Moreover, metagenomics analysis showed that functional classes involved in the antioxidant system, environmental stress, glucose−lipid metabolism, and membrane-associated functions were changed after TRP exposure. Co-occurrence model analysis further revealed that TRP markedly increased the complexity of the biofilm microbial network under light irradiation. Taken together, these results demonstrate that TRP with light irradiation can inhibit bacterial colonization and prevent initial biofilm formation. Thus, TRP is a potential nature-based green material for marine antifouling.
Affiliation(s)
- Qing-Chao Li
  - Shenzhen Public Platform for Screening and Application of Marine Microbial Resources, Institute for Ocean Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Bo Wang
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Yan-Hua Zeng
  - Shenzhen Public Platform for Screening and Application of Marine Microbial Resources, Institute for Ocean Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Zhong-Hua Cai
  - Shenzhen Public Platform for Screening and Application of Marine Microbial Resources, Institute for Ocean Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
- Jin Zhou
  - Shenzhen Public Platform for Screening and Application of Marine Microbial Resources, Institute for Ocean Engineering, Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
  - Corresponding author
31
Characteristics changes on Applications of Antibiotics and Current Approaches to Enhance Productivity with Soil Microbiome. J Pure Appl Microbiol 2022. [DOI: 10.22207/jpam.16.1.61] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
Abstract
The contamination of soil with antibiotics is regarded as a major problem today and is predicted to attract more attention in the near future. Meanwhile, antibiotic consumption driven by human activity is increasing all around the world. Our review of the literature revealed the role of the soil microbiome and how antibiotic resistance genes arise. The structure of antibiotics is influenced by natural factors, such as biotic and abiotic pressures, which vary among soils. The management of the soil microbiome and studies of its expression are therefore reviewed. The assessment of antibiotic resistance genes with the help of next-generation sequencing provides a clear understanding of the genome and transcriptome of bacterial genes, so the interaction of the microbiome with soil can also be better understood. The review also covers protocols for analyzing microbiota composition and for metagenomic analysis with next-generation sequencing, which helps in understanding the diverse microflora present in soil and their functions. Finally, recent progress in bioinformatics software, workflows, and applications for analyzing metagenomic data is summarized.
|
32
|
Ventolero MF, Wang S, Hu H, Li X. Computational analyses of bacterial strains from shotgun reads. Brief Bioinform 2022; 23:6524011. [PMID: 35136954 DOI: 10.1093/bib/bbac013] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/04/2021] [Revised: 01/10/2022] [Accepted: 01/11/2022] [Indexed: 12/21/2022] Open
Abstract
Shotgun sequencing is routinely employed to study bacteria in microbial communities. With the vast amount of shotgun sequencing reads generated in a metagenomic project, it is crucial to determine the microbial composition at the strain level. This study investigated 20 computational tools that attempt to infer bacterial strain genomes from shotgun reads. For the first time, we discussed the methodology behind these tools. We also systematically evaluated six novel-strain-targeting tools on the same datasets and found that BHap, mixtureS and StrainFinder performed better than other tools. Because the performance of the best tools is still suboptimal, we discussed future directions that may address the limitations.
Affiliation(s)
- Saidi Wang
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Haiyan Hu
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
- Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, FL 32816, USA
- Xiaoman Li
- Burnett School of Biomedical Science, University of Central Florida, Orlando, FL 32816, USA
|
33
|
Vuong P, Wise MJ, Whiteley AS, Kaur P. Small investments with big returns: environmental genomic bioprospecting of microbial life. Crit Rev Microbiol 2022; 48:641-655. [PMID: 35100064 DOI: 10.1080/1040841x.2021.2011833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Microorganisms and their natural products are major drivers of ecological processes and industrial applications. Microbial bioprospecting has been critical for the advancement in various fields such as pharmaceuticals, sustainable industries, food security and bioremediation. Next generation sequencing has been paramount in the exploration of diverse environmental microbiomes. It presents a culture-independent approach to investigating hitherto uncultured taxa, resulting in the creation of massive sequence databases, which are available in the public domain. Genome mining searches available (meta)genomic data for target biosynthetic genes, and combined with the large-scale public data, this in-silico bioprospecting method presents an efficient and extensive way to uncover microbial bioproducts. Bioinformatic tools have progressed to a stage where we can recover genomes from the environment; these metagenome-assembled genomes present a way to understand the metabolic capacity of microorganisms in a physiological and ecological context. Environmental sampling has been extensive across various ecological settings, including microbiomes with unique physicochemical properties that could influence the discovery of novel functions and metabolic pathways. Although in-silico methods cannot completely substitute in-vitro studies, the contextual information they provide is invaluable for understanding the ecological and taxonomic distribution of microbial genotypes and for forming effective strategies for future microbial bioprospecting efforts.
Affiliation(s)
- Paton Vuong
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
- Michael J Wise
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, Australia
- Andrew S Whiteley
- Centre for Environment & Life Sciences, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Floreat, Australia
- Parwinder Kaur
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
|
34
|
Bornemann TLV, Adam PS, Probst AJ. Reconstruction of Archaeal Genomes from Short-Read Metagenomes. Methods Mol Biol 2022; 2522:487-527. [PMID: 36125772 DOI: 10.1007/978-1-0716-2445-6_33] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
As the majority of biological diversity remains unexplored and uncultured, investigating it requires culture-independent approaches. Archaea in particular suffer from a multitude of issues that make their culturing problematic, from them being frequently members of the rare biosphere, to low growth rates, to them thriving under very specific and often extreme environmental and community conditions that are difficult to replicate. OMICs techniques are state of the art approaches that allow direct high-throughput investigations of environmental samples at all levels from nucleic acids to proteins, lipids, and secondary metabolites. Metagenomics, as the foundation for other OMICs techniques, facilitates the identification and functional characterization of the microbial community members and can be combined with other methods to provide insights into the microbial activities, both on the RNA and protein levels. In this chapter, we provide a step-by-step workflow for the recovery of archaeal genomes from metagenomes, starting from raw short-read sequences. This workflow can be applied to recover bacterial genomes as well.
Affiliation(s)
- Till L V Bornemann
- Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Essen, Germany
- Panagiotis S Adam
- Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Essen, Germany
- Alexander J Probst
- Environmental Microbiology and Biotechnology, Faculty of Chemistry, University of Duisburg-Essen, Essen, Germany
- Centre of Water and Environmental Research (ZWU), University of Duisburg-Essen, Essen, Germany
|
35
|
Blakeley-Ruiz JA, Kleiner M. Considerations for Constructing a Protein Sequence Database for Metaproteomics. Comput Struct Biotechnol J 2022; 20:937-952. [PMID: 35242286 PMCID: PMC8861567 DOI: 10.1016/j.csbj.2022.01.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/26/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 12/14/2022] Open
Abstract
Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies.
Affiliation(s)
- J. Alfredo Blakeley-Ruiz
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
- Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
|
36
|
Fabiańska I, Borutzki S, Richter B, Tran HQ, Neubert A, Mayer D. LABRADOR-A Computational Workflow for Virus Detection in High-Throughput Sequencing Data. Viruses 2021; 13:v13122541. [PMID: 34960810 PMCID: PMC8704571 DOI: 10.3390/v13122541] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Revised: 12/13/2021] [Accepted: 12/16/2021] [Indexed: 11/16/2022] Open
Abstract
High-throughput sequencing (HTS) allows detection of known and unknown viruses in samples of broad origin. This makes HTS a perfect technology to determine whether or not biological products, such as vaccines, are free from adventitious agents, which could support or replace extensive testing using various in vitro and in vivo assays. Due to bioinformatics complexities, there is a need for standardized and reliable methods to manage HTS-generated data in this field. Thus, we developed LABRADOR, an analysis pipeline for adventitious virus detection. The pipeline consists of several third-party programs and is divided into two major parts: (i) direct read classification based on the comparison of characteristic profiles between reads and sequences deposited in the database, supported by alignment to the best matching reference sequence, and (ii) de novo assembly of contigs and their classification on nucleotide and amino acid levels. To meet the requirements published in guidelines for biologicals’ safety, we generated a custom nucleotide database with viral sequences. We tested our pipeline on publicly available HTS datasets and showed that LABRADOR can reliably detect viruses in mixtures of model viruses, vaccines and clinical samples.
|
37
|
Song S, Ma L, Xu X, Shi H, Li X, Liu Y, Hao P. Rapid screening and identification of viral pathogens in metagenomic data. BMC Med Genomics 2021; 14:289. [PMID: 34903237 PMCID: PMC8668262 DOI: 10.1186/s12920-021-01138-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/07/2021] [Accepted: 11/16/2021] [Indexed: 11/17/2022] Open
Abstract
BACKGROUND Virus screening and viral genome reconstruction are urgent and crucial for the rapid identification of viral pathogens, i.e., tracing the source and understanding the pathogenesis when a viral outbreak occurs. Next-generation sequencing (NGS) provides an efficient and unbiased way to identify viral pathogens in host-associated and environmental samples without prior knowledge. Despite the availability of software, data analysis still requires human operations. A mature pipeline is urgently needed when thousands of viral pathogen and viral genome reconstruction samples need to be rapidly identified. RESULTS In this paper, we present a rapid and accurate workflow to screen metagenomics sequencing data for viral pathogens and other compositions, as well as enable a reference-based assembler to reconstruct viral genomes. Moreover, we tested our workflow on several metagenomics datasets, including a SARS-CoV-2 patient sample with NGS data, pangolins tissues with NGS data, Middle East Respiratory Syndrome (MERS)-infected cells with NGS data, etc. Our workflow demonstrated high accuracy and efficiency when identifying target viruses from large scale NGS metagenomics data. Our workflow was flexible when working with a broad range of NGS datasets from small (kb) to large (100 Gb). This took from a few minutes to a few hours to complete each task. At the same time, our workflow automatically generates reports that incorporate visualized feedback (e.g., metagenomics data quality statistics, host and viral sequence compositions, details about each of the identified viral pathogens and their coverages, and reassembled viral pathogen sequences based on their closest references). CONCLUSIONS Overall, our system enabled the rapid screening and identification of viral pathogens from metagenomics data, providing an important piece to support viral pathogen research during a pandemic. 
The visualized report contains information from raw sequence quality to a reconstructed viral sequence, which allows non-professional people to screen their samples for viruses by themselves (Additional file 1).
Affiliation(s)
- Shiyang Song
- Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Shanghai, 200031, China
- Liangxiao Ma
- Bio-Med Big Data Center, Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, 20031, China
- Xintian Xu
- Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Shanghai, 200031, China
- Han Shi
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
- Xuan Li
- Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, 200032, China
- Yuanhua Liu
- Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Shanghai, 200031, China
- Pei Hao
- Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Center for Biosafety Mega-Science, Chinese Academy of Sciences, Shanghai, 200031, China
|
38
|
Fuentes-Trillo A, Monzó C, Manzano I, Santiso-Bellón C, Andrade JDSRD, Gozalbo-Rovira R, García-García AB, Rodríguez-Díaz J, Chaves FJ. Benchmarking different approaches for Norovirus genome assembly in metagenome samples. BMC Genomics 2021; 22:849. [PMID: 34819031 PMCID: PMC8611953 DOI: 10.1186/s12864-021-08067-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/09/2021] [Accepted: 10/10/2021] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Genome assembly of viruses with high mutation rates, such as Norovirus and other RNA viruses, or from metagenome samples, poses a challenge for the scientific community due to the coexistence of several viral quasispecies and strains. Furthermore, there is no standard method for obtaining whole-genome sequences in non-related patients. After polyA RNA isolation and sequencing in eight patients with acute gastroenteritis, we evaluated two de Bruijn graph assemblers (SPAdes and MEGAHIT), combined with four different and common pre-assembly strategies, and compared those yielding whole genome Norovirus contigs. RESULTS Reference-genome guided strategies with both host and target virus did not present any advantages compared to the assembly of non-filtered data in the case of SPAdes, and in the case of MEGAHIT, only host genome filtering presented improvements. MEGAHIT performed better than SPAdes in most samples, reaching complete genome sequences in most of them for all the strategies employed. Read binning with CD-HIT improved assembly when paired with different analysis strategies, and more notably in the case of SPAdes. CONCLUSIONS Not all metagenome assemblies are equal and the choice in the workflow depends on the species studied and the prior steps to analysis. We may need different approaches even for samples treated equally due to the presence of high intra host variability. We tested and compared different workflows for the accurate assembly of Norovirus genomes and established their assembly capacities for this purpose.
Affiliation(s)
- Azahara Fuentes-Trillo
- Unit of Genomics and Diabetes. Research Foundation of Valencia University Clinical Hospital- INCLIVA, Valencia, Spain
- Carolina Monzó
- Unit of Genomics and Diabetes. Research Foundation of Valencia University Clinical Hospital- INCLIVA, Valencia, Spain
- Iris Manzano
- Unit of Genomics and Diabetes. Research Foundation of Valencia University Clinical Hospital- INCLIVA, Valencia, Spain
- Ana-Bárbara García-García
- Unit of Genomics and Diabetes. Research Foundation of Valencia University Clinical Hospital- INCLIVA, Valencia, Spain
- Spanish Biomedical Research Network in Diabetes and Associated Metabolic Disorders (CIBERDEM), Madrid, Spain
- Jesús Rodríguez-Díaz
- Department of Microbiology, School of Medicine, University of Valencia, Valencia, Spain
- Felipe Javier Chaves
- Unit of Genomics and Diabetes. Research Foundation of Valencia University Clinical Hospital- INCLIVA, Valencia, Spain
- Spanish Biomedical Research Network in Diabetes and Associated Metabolic Disorders (CIBERDEM), Madrid, Spain
- Sequencing Multiplex S.L., Valencia, Spain
|
39
|
Long-read metagenomics of soil communities reveals phylum-specific secondary metabolite dynamics. Commun Biol 2021; 4:1302. [PMID: 34795375 PMCID: PMC8602731 DOI: 10.1038/s42003-021-02809-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/29/2021] [Accepted: 10/25/2021] [Indexed: 01/04/2023] Open
Abstract
Microbial biosynthetic gene clusters (BGCs) encoding secondary metabolites are thought to impact a plethora of biologically mediated environmental processes, yet their discovery and functional characterization in natural microbiomes remains challenging. Here we describe deep long-read sequencing and assembly of metagenomes from biological soil crusts, a group of soil communities that are rich in BGCs. Taking advantage of the unusually long assemblies produced by this approach, we recovered nearly 3,000 BGCs for analysis, including 712 full-length BGCs. Functional exploration through metatranscriptome analysis of a 3-day wetting experiment uncovered phylum-specific BGC expression upon activation from dormancy, elucidating distinct roles and complex phylogenetic and temporal dynamics in wetting processes. For example, a pronounced increase in BGC transcription occurs at night primarily in cyanobacteria, implicating BGCs in nutrient scavenging roles and niche competition. Taken together, our results demonstrate that long-read metagenomic sequencing combined with metatranscriptomic analysis provides a direct view into the functional dynamics of BGCs in environmental processes and suggests a central role of secondary metabolites in maintaining phylogenetically conserved niches within biocrusts.
|
40
|
Abstract
Reconstructing microbial genomes from metagenomic short-read data can be challenging due to the unknown and uneven complexity of microbial communities. This complexity encompasses highly diverse populations, which often includes strain variants. Reconstructing high-quality genomes is a crucial part of the metagenomic workflow, as subsequent ecological and metabolic inferences depend on their accuracy, quality, and completeness. In contrast to microbial communities in other ecosystems, there has been no systematic assessment of genome-centric metagenomic workflows for drinking water microbiomes. In this study, we assessed the performance of a combination of assembly and binning strategies for time series drinking water metagenomes that were collected over 6 months. The goal of this study was to identify the combination of assembly and binning approaches that result in high-quality and -quantity metagenome-assembled genomes (MAGs), representing most of the sequenced metagenome. Our findings suggest that the metaSPAdes coassembly strategies had the best performance, as they resulted in larger and less fragmented assemblies, with at least 85% of the sequence data mapping to contigs greater than 1 kbp. Furthermore, a combination of metaSPAdes coassembly strategies and MetaBAT2 produced the highest number of medium-quality MAGs while capturing at least 70% of the metagenomes based on read recruitment. Utilizing different assembly/binning approaches also assists in the reconstruction of unique MAGs from closely related species that would have otherwise collapsed into a single MAG using a single workflow. Overall, our study suggests that leveraging multiple binning approaches with different metaSPAdes coassembly strategies may be required to maximize the recovery of good-quality MAGs. IMPORTANCE Drinking water contains phylogenetic diverse groups of bacteria, archaea, and eukarya that affect the esthetic quality of water, water infrastructure, and public health. 
Taxonomic, metabolic, and ecological inferences of the drinking water microbiome depend on the accuracy, quality, and completeness of genomes that are reconstructed through the application of genome-resolved metagenomics. Using time series metagenomic data, we present reproducible genome-centric metagenomic workflows that result in high-quality and -quantity genomes, which more accurately signifies the sequenced drinking water microbiome. These genome-centric metagenomic workflows will allow for improved taxonomic and functional potential analysis that offers enhanced insights into the stability and dynamics of drinking water microbial communities.
|
41
|
Kayani MUR, Huang W, Feng R, Chen L. Genome-resolved metagenomics using environmental and clinical samples. Brief Bioinform 2021; 22:bbab030. [PMID: 33758906 PMCID: PMC8425419 DOI: 10.1093/bib/bbab030] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 09/21/2020] [Revised: 11/29/2020] [Accepted: 01/20/2021] [Indexed: 12/25/2022] Open
Abstract
Recent advances in high-throughput sequencing technologies and computational methods have added a new dimension to metagenomic data analysis i.e. genome-resolved metagenomics. In general terms, it refers to the recovery of draft or high-quality microbial genomes and their taxonomic classification and functional annotation. In recent years, several studies have utilized the genome-resolved metagenome analysis approach and identified previously unknown microbial species from human and environmental metagenomes. In this review, we describe genome-resolved metagenome analysis as a series of four necessary steps: (i) preprocessing of the sequencing reads, (ii) de novo metagenome assembly, (iii) genome binning and (iv) taxonomic and functional analysis of the recovered genomes. For each of these four steps, we discuss the most commonly used tools and the currently available pipelines to guide the scientific community in the recovery and subsequent analyses of genomes from any metagenome sample. Furthermore, we also discuss the tools required for validation of assembly quality as well as for improving quality of the recovered genomes. We also highlight the currently available pipelines that can be used to automate the whole analysis without having advanced bioinformatics knowledge. Finally, we will highlight the most widely adapted and actively maintained tools and pipelines that can be helpful to the scientific community in decision making before they commence the analysis.
Affiliation(s)
- Masood ur Rehman Kayani
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
- Wanqiu Huang
- Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 200,000, China
- Ru Feng
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
- Lei Chen
- Center for Microbiota and Immunological Diseases, Shanghai General Hospital, Shanghai Institute of Immunology, Shanghai Jiao Tong University, School of Medicine, Shanghai 2,000,025, China
|
42
|
Haro-Moreno JM, López-Pérez M, Rodriguez-Valera F. Enhanced Recovery of Microbial Genes and Genomes From a Marine Water Column Using Long-Read Metagenomics. Front Microbiol 2021; 12:708782. [PMID: 34512586 PMCID: PMC8430335 DOI: 10.3389/fmicb.2021.708782] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/14/2021] [Accepted: 07/30/2021] [Indexed: 12/12/2022] Open
Abstract
Third-generation sequencing has penetrated little in metagenomics due to the high error rate and dependence for assembly on short-read designed bioinformatics. However, second-generation sequencing metagenomics (mostly Illumina) suffers from limitations, particularly in the assembly of microbes with high microdiversity and retrieval of the flexible (adaptive) fraction of prokaryotic genomes. Here, we have used a third-generation technique to study the metagenome of a well-known marine sample from the mixed epipelagic water column of the winter Mediterranean. We have compared PacBio Sequel II with the classical approach using Illumina Nextseq short reads followed by assembly to study the metagenome. Long reads allow for efficient direct retrieval of complete genes avoiding the bias of the assembly step. Besides, the application of long reads on metagenomic assembly allows for the reconstruction of much more complete metagenome-assembled genomes (MAGs), particularly from microbes with high microdiversity such as Pelagibacterales. The flexible genome of reconstructed MAGs was much more complete containing many adaptive genes (some with biotechnological potential). PacBio Sequel II CCS appears particularly suitable for cellular metagenomics due to its low error rate. For most applications of metagenomics, from community structure analysis to ecosystem functioning, long reads should be applied whenever possible. Specifically, for in silico screening of biotechnologically useful genes, or population genomics, long-read metagenomics appears presently as a very fruitful approach and can be analyzed from raw reads before a computationally demanding (and potentially artifactual) assembly step.
Affiliation(s)
- Jose M. Haro-Moreno
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, Alicante, Spain
- Mario López-Pérez
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, Alicante, Spain
- Francisco Rodriguez-Valera
- Evolutionary Genomics Group, División de Microbiología, Universidad Miguel Hernández, Alicante, Spain
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
43
|
Milani C, Lugli GA, Fontana F, Mancabelli L, Alessandri G, Longhi G, Anzalone R, Viappiani A, Turroni F, van Sinderen D, Ventura M. METAnnotatorX2: a Comprehensive Tool for Deep and Shallow Metagenomic Data Set Analyses. mSystems 2021; 6:e0058321. [PMID: 34184911 PMCID: PMC8269244 DOI: 10.1128/msystems.00583-21] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 05/10/2021] [Accepted: 06/10/2021] [Indexed: 12/04/2022] Open
Abstract
The use of bioinformatic tools for read-based taxonomic and functional analyses of metagenomic data sets, including their assembly and management, is rather fragmentary due to the absence of an accepted gold standard. Moreover, most currently available software tools need input of millions of reads and rely on approximations in data analysis in order to reduce computing times. These issues result in suboptimal results in terms of accuracy, sensitivity, and specificity when used either for the reconstruction of taxonomic or functional profiles through read analysis or analysis of genomes reconstructed by metagenomic assembly. Moreover, the recent introduction of novel DNA sequencing technologies that generate long reads, such as Nanopore and PacBio, represent a valuable data resource that still suffers from a lack of dedicated tools to perform integrated hybrid analysis alongside short read data. In order to overcome these limitations, here we describe a comprehensive bioinformatic platform, METAnnotatorX2, aimed at providing an optimized user-friendly resource which maximizes output quality, while also allowing user-specific adaptation of the pipeline and straightforward integrated analysis of both short and long read data. To further improve performance quality and accuracy of taxonomic assignment of reads and contigs, custom preprocessed and taxonomically revised genomic databases for viruses, prokaryotes, and various eukaryotes were developed. The performance of METAnnotatorX2 was tested by analysis of artificial data sets encompassing viral, archaeal, bacterial, and eukaryotic (fungal) sequence reads that simulate different biological matrices. Moreover, real biological samples were employed to validate in silico results. 
IMPORTANCE We developed a novel tool, i.e., METAnnotatorX2, that includes a number of new advanced features for analysis of deep and shallow metagenomic data sets and is accompanied by (regularly updated) customized databases for archaea, bacteria, fungi, protists, and viruses. Both software and databases were developed so as to maximize sensitivity and specificity while including support for shallow metagenomic data sets. Through extensive tests performed on Illumina and Nanopore artificial data sets, we demonstrated the high performance of the software to not only extract taxonomic and functional information from sequence reads but also to assemble and process genomes from metagenomic data. The robustness of these functionalities was validated using "real-life" data sets obtained from Illumina and Nanopore sequencing of biological samples. Furthermore, the performance of METAnnotatorX2 was compared to other available software tools for analysis of shotgun metagenomics data.
Affiliation(s)
- Christian Milani
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
- Gabriele Andrea Lugli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Federico Fontana
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- GenProbio srl, Parma, Italy
- Leonardo Mancabelli
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Giulia Alessandri
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Giulia Longhi
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- GenProbio srl, Parma, Italy
- Francesca Turroni
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
- Douwe van Sinderen
- APC Microbiome Ireland and School of Microbiology, Bioscience Institute, National University of Ireland, Cork, Ireland
- Marco Ventura
- Laboratory of Probiogenomics, Department of Chemistry, Life Sciences, and Environmental Sustainability, University of Parma, Parma, Italy
- Microbiome Research Hub, University of Parma, Parma, Italy
|
44
|
Martínez Arbas S, Busi SB, Queirós P, de Nies L, Herold M, May P, Wilmes P, Muller EEL, Narayanasamy S. Challenges, Strategies, and Perspectives for Reference-Independent Longitudinal Multi-Omic Microbiome Studies. Front Genet 2021; 12:666244. [PMID: 34194470 PMCID: PMC8236828 DOI: 10.3389/fgene.2021.666244] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/09/2021] [Accepted: 04/30/2021] [Indexed: 12/21/2022]
Abstract
In recent years, multi-omic studies have enabled resolving community structure and interrogating community function of microbial communities. Simultaneous generation of metagenomic, metatranscriptomic, metaproteomic, and (meta)metabolomic data is more feasible than ever before, enabling in-depth assessment of community structure, function, and phenotype. This has resulted in a multitude of multi-omic microbiome datasets and in the development of innovative methods to integrate and interrogate them. Specifically, the application of reference-independent approaches provides opportunities in identifying novel organisms and functions. At present, most of these large-scale multi-omic datasets stem from spatial sampling (e.g., water/soil microbiomes at several depths, microbiomes in/on different parts of the human anatomy) or case-control studies (e.g., cohorts of human microbiomes). We believe that longitudinal multi-omic microbiome datasets are the logical next step in microbiome studies due to their characteristic advantages in providing a better understanding of community dynamics, including observation of trends, inference of causality, and, ultimately, prediction of community behavior. Furthermore, the acquisition of complementary host-derived omics, environmental measurements, and suitable metadata will further enhance these advantages of longitudinal data, which will serve as the basis to resolve drivers of community structure and function and to understand the biotic and abiotic factors governing communities and specific populations. Carefully set up future experiments hold great potential to further unveil ecological mechanisms related to evolution, microbe-microbe interactions, or microbe-host interactions. In this article, we discuss the challenges, emerging strategies, and best practices applicable to longitudinal microbiome studies, ranging from sampling, biomolecular extraction, systematic multi-omic measurements, and reference-independent data integration to modeling and validation.
Affiliation(s)
- Susana Martínez Arbas
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Susheel Bhanu Busi
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Pedro Queirós
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Laura de Nies
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Malte Herold
  - Department of Environmental Research and Innovation, Luxembourg Institute of Science and Technology, Belvaux, Luxembourg
- Patrick May
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Paul Wilmes
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
  - Department of Life Sciences and Medicine, Faculty of Science, Technology and Medicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
- Emilie E. L. Muller
  - Université de Strasbourg, UMR 7156 CNRS, Génétique Moléculaire, Génomique, Microbiologie, Strasbourg, France
- Shaman Narayanasamy
  - Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
|
45
|
Metagenomic Assembly: Reconstructing Genomes from Metagenomes. Methods Mol Biol 2021. [PMID: 33961222 DOI: 10.1007/978-1-0716-1099-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2023]
Abstract
Assembly of metagenomic sequence data into microbial genomes is of critical importance for disentangling community complexity and unraveling the functional capacity of microorganisms. The rapid development of sequencing technology and novel assembly algorithms have made it possible to reliably reconstruct hundreds to thousands of microbial genomes from raw sequencing reads through metagenomic assembly. In this chapter, we introduce a routinely used metagenomic assembly workflow including read quality filtering, assembly, contig/scaffold binning, and postassembly check for genome completeness and contamination. We also describe a case study to reconstruct near-complete microbial genomes from metagenomes using our workflow.
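The post-assembly check for genome completeness and contamination mentioned in this abstract can be illustrated with a small sketch. It is not taken from the chapter: the `BinStats` type, the `quality_tier` function, and the default thresholds (which follow commonly cited draft-genome conventions and ignore additional criteria such as rRNA and tRNA content) are assumptions made here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BinStats:
    """Hypothetical summary of one genome bin after quality assessment."""
    name: str
    completeness: float   # percent, 0-100
    contamination: float  # percent, 0-100


def quality_tier(stats, high=(90.0, 5.0), medium=(50.0, 10.0)):
    """Assign a coarse quality tier from completeness and contamination.

    The (completeness, contamination) thresholds are assumed defaults,
    not values stated in the abstract; adjust them to the study's criteria.
    """
    if stats.completeness > high[0] and stats.contamination < high[1]:
        return "high"
    if stats.completeness >= medium[0] and stats.contamination < medium[1]:
        return "medium"
    return "low"


# Illustrative bins with made-up numbers, not real results.
bins = [
    BinStats("bin_1", 95.2, 1.1),
    BinStats("bin_2", 72.0, 4.0),
    BinStats("bin_3", 40.0, 2.0),
]
tiers = {b.name: quality_tier(b) for b in bins}
```

In practice the completeness and contamination estimates would come from a dedicated assessment tool; the sketch only shows how such estimates might be turned into a keep-or-discard decision.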
|
46
|
Recovering prokaryotic genomes from host-associated, short-read shotgun metagenomic sequencing data. Nat Protoc 2021; 16:2520-2541. [PMID: 33864056 DOI: 10.1038/s41596-021-00508-2] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 07/24/2020] [Accepted: 01/12/2021] [Indexed: 02/02/2023]
Abstract
Recovering genomes from shotgun metagenomic sequence data allows detailed taxonomic and functional characterization of individual species or strains in a microbial community. Retrieving these metagenome-assembled genomes (MAGs) involves seven stages. First, low-quality bases, along with adapter and host sequences, are removed. Second, overlapping sequences are assembled to create longer contiguous fragments. Third, these fragments are clustered based on sequence composition and abundance. Fourth, these sequence clusters, or bins, undergo rounds of quality assessment and refinement to yield MAGs. The optional fifth stage is dereplication of MAGs to select representatives. Next, each MAG is taxonomically classified. The optional seventh stage is assessing the fraction of diversity that has been recovered. The output of this protocol is draft genomes, which can provide invaluable clues about uncultured organisms. This protocol takes ~1 week to run, depending on computational resources available, and requires prior experience with high-performance computing, shell script programming and Python.
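The seven stages listed in this abstract, two of which are optional, can be written down as an ordered plan. The sketch below is only a reading aid: the stage descriptions paraphrase the abstract, while the `Stage` type and the `plan` function are hypothetical names introduced here and are not part of the published protocol.

```python
from typing import NamedTuple


class Stage(NamedTuple):
    number: int
    description: str
    optional: bool


# Stage descriptions paraphrased from the abstract; no tools are implied.
STAGES = [
    Stage(1, "remove low-quality bases, adapter and host sequences", False),
    Stage(2, "assemble overlapping sequences into longer contiguous fragments", False),
    Stage(3, "cluster fragments by sequence composition and abundance", False),
    Stage(4, "assess and refine bins to yield MAGs", False),
    Stage(5, "dereplicate MAGs to select representatives", True),
    Stage(6, "taxonomically classify each MAG", False),
    Stage(7, "assess the fraction of diversity recovered", True),
]


def plan(include_optional=True):
    """Return the stages to run, in order, optionally skipping optional ones."""
    return [s for s in STAGES if include_optional or not s.optional]


for stage in plan(include_optional=False):
    print(f"{stage.number}. {stage.description}")
```

Calling `plan(include_optional=False)` keeps only the five mandatory stages, which mirrors how the abstract marks dereplication and the diversity assessment as optional.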
|
47
|
Werbin ZR, Hackos B, Lopez-Nava J, Dietze MC, Bhatnagar JM. The National Ecological Observatory Network's soil metagenomes: assembly and basic analysis. F1000Res 2021; 10:299. [PMID: 35707452 PMCID: PMC9178279 DOI: 10.12688/f1000research.51494.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 02/08/2022] [Indexed: 11/20/2022]
Abstract
The largest dataset of soil metagenomes has recently been released by the National Ecological Observatory Network (NEON), which performs annual shotgun sequencing of soils at 47 sites across the United States. NEON serves as a valuable educational resource, thanks to its open data and programming tutorials, but there is currently no introductory tutorial for accessing and analyzing the soil shotgun metagenomic dataset. Here, we describe methods for processing raw soil metagenome sequencing reads using a bioinformatics pipeline tailored to the high complexity and diversity of the soil microbiome. We describe the rationale, necessary resources, and implementation of steps such as cleaning raw reads, taxonomic classification, assembly into contigs or genomes, annotation of predicted genes using custom protein databases, and exporting data for downstream analysis. The workflow presented here aims to increase the accessibility of NEON's shotgun metagenome data, which can provide important clues about soil microbial communities and their ecological roles.
Affiliation(s)
- Zoey R. Werbin
  - Department of Biology, Boston University, Boston, MA, 02215, USA
- Briana Hackos
  - Department of Mathematics, University of Colorado, Boulder, Boulder, CO, 80309, USA
- Jorge Lopez-Nava
  - Department of Mathematics, Swarthmore College, Swarthmore, PA 19081, USA
- Michael C. Dietze
  - Department of Earth & Environment, Boston University, Boston, MA, 02215, USA
|
48
|
Gao B, Chi L, Zhu Y, Shi X, Tu P, Li B, Yin J, Gao N, Shen W, Schnabl B. An Introduction to Next Generation Sequencing Bioinformatic Analysis in Gut Microbiome Studies. Biomolecules 2021; 11:530. [PMID: 33918473 PMCID: PMC8066849 DOI: 10.3390/biom11040530] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 01/27/2021] [Revised: 03/28/2021] [Accepted: 03/29/2021] [Indexed: 12/12/2022]
Abstract
The gut microbiome is a microbial ecosystem which expresses 100 times more genes than the human host and plays an essential role in human health and disease pathogenesis. Since most intestinal microbial species are difficult to culture, next generation sequencing technologies have been widely applied to study the gut microbiome, including 16S rRNA, 18S rRNA, internal transcribed spacer (ITS) sequencing, shotgun metagenomic sequencing, metatranscriptomic sequencing and viromic sequencing. Various software tools have been developed to analyze the different types of sequencing data. In this review, we summarize commonly used computational tools for gut microbiome data analysis, which have extended our understanding of the gut microbiome in health and disease.
Affiliation(s)
- Bei Gao
  - Department of Marine Science, School of Marine Sciences, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Liang Chi
  - Metaorganism Immunity Section, Laboratory of Immune Systems Biology, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
- Yixin Zhu
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
- Xiaochun Shi
  - Department of Environmental Ecological Engineering, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Pengcheng Tu
  - Department of Food Science and Nutrition, College of Biosystems Engineering and Food Science, Zhejiang University, Hangzhou 310058, China
- Bing Li
  - Suzhou Industrial Park Environmental Law Enforcement Brigade (Environmental Monitoring Station), Suzhou 215021, China
- Jun Yin
  - Department of Hydrometeorology, School of Hydrology and Water Resources, Nanjing University of Information Science and Technology, Nanjing 210044, China
- Nan Gao
  - Department of Biotechnology, School of Biological and Pharmaceutical Engineering, Nanjing Tech University, Nanjing 211816, China
- Weishou Shen
  - Department of Environmental Ecological Engineering, School of Environmental Science and Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
  - Jiangsu Key Laboratory of Atmospheric Environment Monitoring and Pollution Control, Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing 210044, China
- Bernd Schnabl
  - Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA
  - Department of Medicine, VA San Diego Healthcare System, San Diego, CA 92161, USA
|
49
|
Nayfach S, Roux S, Seshadri R, Udwary D, Varghese N, Schulz F, Wu D, Paez-Espino D, Chen IM, Huntemann M, Palaniappan K, Ladau J, Mukherjee S, Reddy TBK, Nielsen T, Kirton E, Faria JP, Edirisinghe JN, Henry CS, Jungbluth SP, Chivian D, Dehal P, Wood-Charlson EM, Arkin AP, Tringe SG, Visel A, Woyke T, Mouncey NJ, Ivanova NN, Kyrpides NC, Eloe-Fadrosh EA. A genomic catalog of Earth's microbiomes. Nat Biotechnol 2021; 39:499-509. [PMID: 33169036 PMCID: PMC8041624 DOI: 10.1038/s41587-020-0718-6] [Citation(s) in RCA: 336] [Impact Index Per Article: 112.0] [Received: 12/24/2019] [Accepted: 09/28/2020] [Indexed: 01/02/2023]
Abstract
The reconstruction of bacterial and archaeal genomes from shotgun metagenomes has enabled insights into the ecology and evolution of environmental and host-associated microbiomes. Here we applied this approach to >10,000 metagenomes collected from diverse habitats covering all of Earth's continents and oceans, including metagenomes from human and animal hosts, engineered environments, and natural and agricultural soils, to capture extant microbial, metabolic and functional potential. This comprehensive catalog includes 52,515 metagenome-assembled genomes representing 12,556 novel candidate species-level operational taxonomic units spanning 135 phyla. The catalog expands the known phylogenetic diversity of bacteria and archaea by 44% and is broadly available for streamlined comparative analyses, interactive exploration, metabolic modeling and bulk download. We demonstrate the utility of this collection for understanding secondary-metabolite biosynthetic potential and for resolving thousands of new host linkages to uncultivated viruses. This resource underscores the value of genome-centric approaches for revealing genomic properties of uncultivated microorganisms that affect ecosystem processes.
Affiliation(s)
- Simon Roux
  - DOE Joint Genome Institute, Berkeley, CA, USA
- Dongying Wu
  - DOE Joint Genome Institute, Berkeley, CA, USA
- I-Min Chen
  - DOE Joint Genome Institute, Berkeley, CA, USA
- T B K Reddy
  - DOE Joint Genome Institute, Berkeley, CA, USA
- Sean P Jungbluth
  - DOE Joint Genome Institute, Berkeley, CA, USA
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Dylan Chivian
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Paramvir Dehal
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Adam P Arkin
  - Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Axel Visel
  - DOE Joint Genome Institute, Berkeley, CA, USA
- Tanja Woyke
  - DOE Joint Genome Institute, Berkeley, CA, USA
|
50
|
Thornton CN, Tanner WD, VanDerslice JA, Brazelton WJ. Localized effect of treated wastewater effluent on the resistome of an urban watershed. Gigascience 2020; 9:5992824. [PMID: 33215210 PMCID: PMC7677451 DOI: 10.1093/gigascience/giaa125] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 04/26/2020] [Revised: 07/14/2020] [Indexed: 11/14/2022]
Abstract
Background: Wastewater treatment is an essential tool for maintaining water quality in urban environments. While the treatment of wastewater can remove most bacterial cells, some will inevitably survive treatment to be released into natural environments. Previous studies have investigated antibiotic resistance within wastewater treatment plants, but few studies have explored how a river’s complete set of antibiotic resistance genes (the “resistome”) is affected by the release of treated effluent into surface waters.
Results: Here we used high-throughput, deep metagenomic sequencing to investigate the effect of treated wastewater effluent on the resistome of an urban river and the downstream distribution of effluent-associated antibiotic resistance genes and mobile genetic elements. Treated effluent release was found to be associated with increased abundance and diversity of antibiotic resistance genes and mobile genetic elements. The impact of wastewater discharge on the river’s resistome diminished with increasing distance from effluent discharge points. The resistome at river locations that were not immediately downstream from any wastewater discharge points was dominated by a single integron carrying genes associated with resistance to sulfonamides and quaternary ammonium compounds.
Conclusions: Our study documents variations in the resistome of an urban watershed from headwaters to a major confluence in an urban center. Greater abundances and diversity of antibiotic resistance genes are associated with human fecal contamination in river surface water, but the fecal contamination effect seems to be localized, with little measurable effect in downstream waters. The diverse composition of antibiotic resistance genes throughout the watershed suggests the influence of multiple environmental and biological factors.
Affiliation(s)
- Christopher N Thornton
  - School of Biological Sciences, University of Utah, 257 South 1400 East, Rm. 201, 84112, Salt Lake City, UT, USA
- Windy D Tanner
  - Department of Family and Preventive Medicine, University of Utah, 257 South 1400 East, Rm. 201, 84112, Salt Lake City, UT, USA
- James A VanDerslice
  - Department of Family and Preventive Medicine, University of Utah, 257 South 1400 East, Rm. 201, 84112, Salt Lake City, UT, USA
- William J Brazelton
  - School of Biological Sciences, University of Utah, 257 South 1400 East, Rm. 201, 84112, Salt Lake City, UT, USA
|