1
Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Genome-resolved metagenomics: a game changer for microbiome medicine. Exp Mol Med 2024:10.1038/s12276-024-01262-7. [PMID: 38945961 DOI: 10.1038/s12276-024-01262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/06/2024] [Accepted: 03/25/2024] [Indexed: 07/02/2024]
Abstract
Recent substantial evidence implicating commensal bacteria in human diseases has given rise to a new domain in biomedical research: microbiome medicine. This emerging field aims to understand and leverage the human microbiota and derivative molecules for disease prevention and treatment. Despite the complex and hierarchical organization of this ecosystem, most research over the years has relied on 16S amplicon sequencing, a legacy of bacterial phylogeny and taxonomy. Although advanced sequencing technologies have enabled cost-effective analysis of entire microbiota, translating the relatively short nucleotide information into the functional and taxonomic organization of the microbiome has posed challenges until recently. In the last decade, genome-resolved metagenomics, which aims to reconstruct microbial genomes directly from whole-metagenome sequencing data, has made significant strides and continues to unveil the mysteries of various human-associated microbial communities. There has been a rapid increase in the volume of whole metagenome sequencing data and in the compilation of novel metagenome-assembled genomes and protein sequences in public depositories. This review provides an overview of the capabilities and methods of genome-resolved metagenomics for studying the human microbiome, with a focus on investigating the prokaryotic microbiota of the human gut. Just as decoding the human genome and its variations marked the beginning of the genomic medicine era, unraveling the genomes of commensal microbes and their sequence variations is ushering us into the era of microbiome medicine. Genome-resolved metagenomics stands as a pivotal tool in this transition and can accelerate our journey toward achieving these scientific and medical milestones.
Affiliation(s)
- Nayeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Junyeong Ma
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Wonjong Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Jungyeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Peter Belenky
  - Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA.
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea.
  - POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea.

2
Fernando CM, Breaker RR. Bioinformatic prediction of proteins relevant to functions of the bacterial OLE ribonucleoprotein complex. mSphere 2024; 9:e0015924. [PMID: 38771028 DOI: 10.1128/msphere.00159-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Accepted: 04/19/2024] [Indexed: 05/22/2024]
Abstract
OLE (ornate, large, extremophilic) RNAs are members of a noncoding RNA class present in many Gram-positive, extremophilic bacteria. The large size, complex structure, and extensive sequence conservation of OLE RNAs are characteristics consistent with the hypothesis that they likely function as ribozymes. The OLE RNA representative from Halalkalibacterium halodurans is known to localize to the phospholipid membrane and requires at least three essential protein partners: OapA, OapB, and OapC. However, the precise biochemical functions of this unusual ribonucleoprotein (RNP) complex remain unknown. Genetic disruption of OLE RNA or its partners revealed that the complex is beneficial under diverse stress conditions. To search for additional links between OLE RNA and other cellular components, we used phylogenetic profiling to identify proteins that are either correlated or anticorrelated with the presence of OLE RNA in various bacterial species. This analysis revealed strong correlations between the essential protein-binding partners of OLE RNA and organisms that carry the ole gene. Similarly, proteins involved in sporulation are correlated, suggesting a potential role for the OLE RNP complex in spore formation. Intriguingly, the Mg2+ transporter MpfA is strongly anticorrelated with OLE RNA. Evidence indicates that MpfA is structurally related to OapA and therefore MpfA may serve as a functional replacement for some contributions otherwise performed by the OLE RNP complex in species that lack this device. Indeed, OLE RNAs might represent an ancient RNA class that enabled primitive organisms to sense and respond to major cellular stresses.
IMPORTANCE OLE (ornate, large, extremophilic) RNAs were first reported nearly 20 years ago, and they represent one of the largest and most intricately folded noncoding RNA classes whose biochemical function remains to be established. Other RNAs with similar size, structural complexity, and extent of sequence conservation have proven to catalyze chemical transformations. Therefore, we speculate that OLE RNAs likewise operate as ribozymes and that they might catalyze a fundamental reaction that has persisted since the RNA World era, a time before the emergence of proteins in evolution. To seek additional clues regarding the function of OLE RNA, we undertook a computational effort to identify potential protein components of the OLE ribonucleoprotein (RNP) complex or other proteins that have functional links to this device. This analysis revealed known protein partners and several additional proteins that might be physically or functionally linked to the OLE RNP complex. Finally, we identified a Mg2+ transporter protein, MpfA, that strongly anticorrelates with the OLE RNP complex. This latter result suggests that MpfA might perform at least some functions that are like those carried out by the OLE RNP complex.
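As an illustration of the phylogenetic profiling idea described in this abstract, the sketch below scores how strongly two presence/absence profiles across genomes are correlated or anticorrelated. It uses the phi coefficient as the score, which is an assumption made here for illustration; the abstract does not state which statistic the authors used. The function name `phi_coefficient` and the toy profiles are hypothetical.

```python
from math import sqrt


def phi_coefficient(profile_a, profile_b):
    """Phi correlation between two binary presence/absence profiles.

    Each profile has one entry per genome: 1 if the feature is present,
    0 if absent. The result lies in [-1, 1]; values near -1 indicate
    anticorrelation. The choice of statistic is an illustrative assumption.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    pairs = list(zip(profile_a, profile_b))
    n11 = sum(1 for a, b in pairs if a and b)
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    n00 = sum(1 for a, b in pairs if not a and not b)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One profile is constant, so the correlation is undefined; report 0.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical toy profiles over six genomes (not data from the paper).
ole_rna = [1, 1, 1, 0, 0, 0]
partner_protein = [1, 1, 1, 0, 0, 0]      # co-occurs with the RNA
replacement_protein = [0, 0, 0, 1, 1, 1]  # found only where the RNA is absent

print(phi_coefficient(ole_rna, partner_protein))
print(phi_coefficient(ole_rna, replacement_protein))
```

In this toy setting a perfectly co-occurring protein scores at the positive extreme and a mutually exclusive one at the negative extreme, mirroring the correlated partners and the anticorrelated MpfA pattern the abstract reports.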
Affiliation(s)
- Chrishan M Fernando
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
- Ronald R Breaker
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, Connecticut, USA
  - Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, Connecticut, USA
  - Howard Hughes Medical Institute, Yale University, New Haven, Connecticut, USA

3
Sapoval N, Liu Y, Curry KD, Kille B, Huang W, Kokroko N, Nute MG, Tyshaieva A, Dilthey A, Molloy EK, Treangen TJ. Lightweight taxonomic profiling of long-read sequenced metagenomes with Lemur and Magnet. bioRxiv 2024:2024.06.01.596961. [PMID: 38895276 PMCID: PMC11185576 DOI: 10.1101/2024.06.01.596961] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Taxonomic profiling is a ubiquitous task in the analysis of clinical and environmental microbiomes. The advent of long-read sequencing of microbiomes necessitates the development of new taxonomic profilers tailored to long-read shotgun metagenomic datasets. Here, we introduce Lemur and Magnet, a pair of tools optimized for lightweight and accurate taxonomic profiling from long-read shotgun metagenomic datasets. Lemur is a marker-gene based method that leverages an EM algorithm to reduce false positive calls while preserving true positives; Magnet makes detailed presence/absence calls for bacterial genomes based on whole-genome read mapping. The tools work in sequence: Lemur estimates abundances conservatively, and Magnet operates on the genomes of identified organisms to filter out likely false positive taxa. The result is an increase in precision of as much as 70%, which far exceeds competing methods. By operating only on marker genes, Lemur is a comparatively lightweight software. We demonstrate that it can run in minutes to hours on a laptop with 32 GB of RAM, even for large inputs - a crucial feature given the portability of long-read sequencing machines. Furthermore, the marker gene database used by Lemur is only 4 GB and contains information from over 300,000 RefSeq genomes. The reference is available at https://zenodo.org/records/10802546, and the software is open-source and available at https://github.com/treangenlab/lemur.
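To make the role of an EM algorithm in abundance estimation more concrete, here is a minimal toy sketch. It is not Lemur's implementation; the function `em_abundances`, the likelihood matrix, and the read counts are all hypothetical and chosen only to show how EM can shrink the abundance of a taxon supported solely by ambiguous reads.

```python
def em_abundances(read_likelihoods, n_taxa, n_iter=100):
    """Estimate relative abundances from per-read taxon likelihoods with EM.

    read_likelihoods: one list per read, giving a non-negative likelihood
    that the read came from each taxon (0 means incompatible).
    This is a generic mixture-model EM, written here as an assumption-laden
    toy and not as the algorithm of any specific tool.
    """
    abundances = [1.0 / n_taxa] * n_taxa
    for _ in range(n_iter):
        totals = [0.0] * n_taxa
        # E-step: split each read across taxa in proportion to
        # current abundance times likelihood.
        for lik in read_likelihoods:
            weights = [abundances[t] * lik[t] for t in range(n_taxa)]
            norm = sum(weights)
            if norm == 0:
                continue  # read is incompatible with every taxon
            for t in range(n_taxa):
                totals[t] += weights[t] / norm
        # M-step: renormalize the expected read counts.
        total = sum(totals)
        if total == 0:
            return abundances
        abundances = [x / total for x in totals]
    return abundances


# Hypothetical reads: taxon 2 is supported only by reads that also fit taxon 0.
reads = (
    [[1.0, 0.0, 0.0]] * 6    # unique to taxon 0
    + [[0.0, 1.0, 0.0]] * 2  # unique to taxon 1
    + [[1.0, 0.0, 1.0]] * 4  # ambiguous between taxon 0 and taxon 2
)
print(em_abundances(reads, n_taxa=3))
```

Because taxon 2 has no uniquely assigned reads in this toy input, each iteration reassigns more of the ambiguous reads to taxon 0, which is one way an EM step can suppress a likely false positive while keeping well-supported taxa.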
Affiliation(s)
- Nicolae Sapoval
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Yunxi Liu
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Kristen D. Curry
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Bryce Kille
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Wenyu Huang
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Natalie Kokroko
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Michael G. Nute
  - Department of Computer Science, Rice University, Houston, TX 77005, USA
- Alona Tyshaieva
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Dusseldorf, Düsseldorf, Germany
- Alexander Dilthey
  - Institute of Medical Microbiology and Hospital Hygiene, Heinrich Heine University Dusseldorf, Düsseldorf, Germany
- Erin K. Molloy
  - Department of Computer Science, University of Maryland, College Park, MD 20742, USA
- Todd J. Treangen
  - Department of Computer Science, Rice University, Houston, TX 77005, USA

4
Wang YC, Mao Y, Fu HM, Wang J, Weng X, Liu ZH, Xu XW, Yan P, Fang F, Guo JS, Shen Y, Chen YP. New insights into functional divergence and adaptive evolution of uncultured bacteria in anammox community by complete genome-centric analysis. Sci Total Environ 2024; 924:171530. [PMID: 38453092 DOI: 10.1016/j.scitotenv.2024.171530] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 11/13/2023] [Accepted: 03/04/2024] [Indexed: 03/09/2024]
Abstract
Anaerobic ammonium-oxidation (anammox) bacteria play a crucial role in global nitrogen cycling and wastewater nitrogen removal, but they share symbiotic relationships with various other microorganisms. Functional divergence and adaptive evolution of uncultured bacteria in anammox communities remain underexplored. Although shotgun metagenomics based on short reads has been widely used in anammox research, metagenome-assembled genomes (MAGs) are often discontinuous and highly contaminated, which limits in-depth analyses of anammox communities. Here, for the first time, we performed Pacific Biosciences high-fidelity (HiFi) long-read sequencing on an anammox granule sludge sample from a lab-scale bioreactor, and obtained 30 accurate and complete metagenome-assembled genomes (cMAGs). These cMAGs were obtained by selecting high-quality circular contigs from initial assemblies of long reads generated by HiFi sequencing, eliminating the need for Illumina short reads, binning, and reassembly. One new anammox species affiliated with Candidatus Jettenia and three species affiliated with novel families were found in this anammox community. cMAG-centric analysis revealed functional divergence in general and nitrogen metabolism among the anammox community members, and they might adopt a cross-feeding strategy in organic matter, cofactors, and vitamins. Furthermore, we identified 63 mobile genetic elements (MGEs) and 50 putative horizontal gene transfer (HGT) events within these cMAGs. The results suggest that HGT events and MGEs related to phage and integration or excision, particularly transposons containing tnpA in anammox bacteria, might play important roles in the adaptive evolution of this anammox community. The cMAGs generated in the present study could be used to establish a comprehensive database for anammox bacteria and associated microorganisms. These findings highlight the advantages of HiFi sequencing for studies of complex mixed cultures and advance the understanding of anammox communities.
Affiliation(s)
- Yi-Cheng Wang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Yanping Mao
  - College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518071, Guangdong, China
- Hui-Min Fu
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China; National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
- Jin Wang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Xun Weng
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Zi-Hao Liu
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Xiao-Wei Xu
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Peng Yan
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Fang Fang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Jin-Song Guo
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Yu Shen
  - National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
- You-Peng Chen
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China.

5
Feng X, Li H. Evaluating and improving the representation of bacterial contents in long-read metagenome assemblies. Genome Biol 2024; 25:92. [PMID: 38605401 PMCID: PMC11007910 DOI: 10.1186/s13059-024-03234-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 03/29/2024] [Indexed: 04/13/2024]
Abstract
BACKGROUND In the metagenomic assembly of a microbial community, abundant species are often thought to assemble well given their deeper sequencing coverage. This conjecture is rarely tested or evaluated in practice. We often do not know how many abundant species are missing and do not have an approach to recover them.
RESULTS Here, we propose k-mer based and 16S rRNA based methods to measure the completeness of metagenome assembly. We show that even with PacBio high-fidelity (HiFi) reads, abundant species are often not assembled, as high strain diversity may lead to fragmented contigs. We develop a novel reference-free algorithm to recover abundant metagenome-assembled genomes (MAGs) by identifying circular assembly subgraphs. Complemented with a reference-free genome binning heuristic based on dimension reduction, the proposed method rescues many abundant species that would be missing with existing methods and produces results competitive with state-of-the-art binners in terms of the total number of near-complete genome bins.
CONCLUSIONS Our work emphasizes the importance of metagenome completeness, which has often been overlooked. Our algorithm generates more circular MAGs and moves a step closer to the complete representation of microbial communities.
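The notion of a k-mer based completeness measure can be sketched as follows: count k-mers in the reads, keep those seen often enough to be plausibly real and abundant, and report the fraction that also appears in the assembly. This is a simplified illustration under stated assumptions, not the authors' method; the function names, the `min_count` threshold, and the toy sequences are hypothetical, and reverse complements are ignored for brevity.

```python
from collections import Counter


def kmers(seq, k):
    """Yield every k-mer of a sequence (forward strand only, for simplicity)."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_completeness(reads, contigs, k=5, min_count=2):
    """Fraction of abundant read k-mers that also occur in the assembly.

    A k-mer is treated as 'abundant' if it appears at least min_count times
    across the reads. Both k and min_count are illustrative assumptions.
    """
    counts = Counter(km for read in reads for km in kmers(read, k))
    abundant = {km for km, c in counts.items() if c >= min_count}
    if not abundant:
        return 0.0
    assembled = {km for contig in contigs for km in kmers(contig, k)}
    return len(abundant & assembled) / len(abundant)


# Hypothetical toy data: two "species" are sequenced, only one is assembled.
reads = ["ACGTACGTAC"] * 2 + ["TTTTTGGGGG"] * 2
contigs = ["ACGTACGTAC"]
print(kmer_completeness(reads, contigs))
```

A value well below 1 in such a check would indicate that abundant sequence content present in the reads is missing from the contigs, which is the kind of gap the abstract argues is often overlooked.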
Affiliation(s)
- Xiaowen Feng
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, USA
  - Department of Biomedical Informatics, Harvard Medical School, Boston, USA
- Heng Li
  - Department of Data Sciences, Dana-Farber Cancer Institute, Boston, USA.
  - Department of Biomedical Informatics, Harvard Medical School, Boston, USA.

6
Eisenhofer R, Nesme J, Santos-Bay L, Koziol A, Sørensen SJ, Alberdi A, Aizpurua O. A comparison of short-read, HiFi long-read, and hybrid strategies for genome-resolved metagenomics. Microbiol Spectr 2024; 12:e0359023. [PMID: 38451230 PMCID: PMC10986573 DOI: 10.1128/spectrum.03590-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 02/11/2024] [Indexed: 03/08/2024]
Abstract
Shotgun metagenomics enables the reconstruction of complex microbial communities at a high level of detail. Such an approach can be conducted using both short-read and long-read sequencing data, as well as a combination of both. To assess the pros and cons of these different approaches, we used 22 fecal DNA extracts collected weekly for 11 weeks from two lab mice to study seven performance metrics over four combinations of sequencing depth and technology: (i) 20 Gbp of Illumina short-read data, (ii) 40 Gbp of short-read data, (iii) 20 Gbp of PacBio HiFi long-read data, and (iv) 40 Gbp of hybrid (20 Gbp of short-read + 20 Gbp of long-read) data. No strategy was best for all metrics; instead, each one excelled across different metrics. The long-read approach yielded the best assembly statistics, with the highest N50 and lowest number of contigs. The 40 Gbp short-read approach yielded the highest number of refined bins. Finally, the hybrid approach yielded the longest assemblies and the highest mapping rate to the bacterial genomes. Our results suggest that while long-read sequencing significantly improves the quality of reconstructed bacterial genomes, it is more expensive and requires deeper sequencing than short-read approaches to recover a comparable number of reconstructed genomes. The optimal strategy is study-specific and depends on how researchers assess the trade-off between the quantity and quality of recovered genomes.
IMPORTANCE Mice are an important model organism for understanding the gut microbiome. When studying these gut microbiomes using DNA techniques, researchers can choose from technologies that use short or long DNA reads. In this study, we perform an extensive benchmark between short- and long-read DNA sequencing for studying mouse gut microbiomes. We find that no one approach was best for all metrics and provide information that can help guide researchers in planning their experiments.
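Since the abstract compares assemblies by N50, a short sketch of how that statistic is commonly computed may help: sort contig lengths from longest to shortest and return the length at which the running total first reaches half of the assembly size. The function name `n50` and the two toy assemblies are hypothetical and do not come from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths.

    N50 is the length L such that contigs of length >= L together
    contain at least half of the total assembled bases.
    """
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("N50 is undefined for an empty assembly")


# Hypothetical assemblies of the same total size (300 bases each).
fragmented = [100, 80, 60, 40, 20]
contiguous = [250, 50]
print(n50(fragmented), n50(contiguous))
```

With equal total length, the assembly made of fewer, longer contigs gets the higher N50, which is why the statistic is usually read together with the contig count, as in the abstract.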
Affiliation(s)
- Raphael Eisenhofer
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Joseph Nesme
  - Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Luisa Santos-Bay
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Adam Koziol
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Søren Johannes Sørensen
  - Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Antton Alberdi
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- Ostaizka Aizpurua
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark

7
Cook R, Telatin A, Hsieh SY, Newberry F, Tariq MA, Baker DJ, Carding SR, Adriaenssens EM. Nanopore and Illumina sequencing reveal different viral populations from human gut samples. Microb Genom 2024; 10. [PMID: 38683195 DOI: 10.1099/mgen.0.001236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024]
Abstract
The advent of viral metagenomics, or viromics, has improved our knowledge and understanding of global viral diversity. High-throughput sequencing technologies enable explorations of the ecological roles, contributions to host metabolism, and the influence of viruses in various environments, including the human intestinal microbiome. However, bacterial metagenomic studies frequently have the advantage. The adoption of advanced technologies like long-read sequencing has the potential to be transformative in refining viromics and metagenomics. Here, we examined the effectiveness of long-read and hybrid sequencing by comparing Illumina short-read and Oxford Nanopore Technology (ONT) long-read sequencing technologies and different assembly strategies on recovering viral genomes from human faecal samples. Our findings showed that if a single sequencing technology is to be chosen for virome analysis, Illumina is preferable due to its superior ability to recover fully resolved viral genomes and minimise erroneous genomes. While ONT assemblies were effective in recovering viral diversity, the challenges related to input requirements and the necessity for amplification made it less ideal as a standalone solution. However, using a combined, hybrid approach enabled a more authentic representation of viral diversity to be obtained within samples.
Affiliation(s)
- Ryan Cook
  - Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
- Fiona Newberry
  - Department of Biosciences, Nottingham Trent University, Nottingham, NG11 8NS, UK
- Mohammad A Tariq
  - Faculty of Health and Life Sciences, University of Northumbria, Newcastle upon Tyne, NE1 8ST, UK
- Dave J Baker
  - Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
- Simon R Carding
  - Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
  - Norwich Medical School, University of East Anglia, Norwich, NR4 7TJ, UK

8
Liu X, Liu Y, Liu J, Zhang H, Shan C, Guo Y, Gong X, Cui M, Li X, Tang M. Correlation between the gut microbiome and neurodegenerative diseases: a review of metagenomics evidence. Neural Regen Res 2024; 19:833-845. [PMID: 37843219 PMCID: PMC10664138 DOI: 10.4103/1673-5374.382223] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/23/2023] [Revised: 04/19/2023] [Accepted: 06/17/2023] [Indexed: 10/17/2023]
Abstract
A growing body of evidence suggests that the gut microbiota contributes to the development of neurodegenerative diseases via the microbiota-gut-brain axis. As a contributing factor, microbiota dysbiosis always occurs in pathological changes of neurodegenerative diseases, such as Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis. High-throughput sequencing technology has helped to reveal that the bidirectional communication between the central nervous system and the enteric nervous system is facilitated by the microbiota's diverse microorganisms, and for both neuroimmune and neuroendocrine systems. Here, we summarize the bioinformatics analysis and wet-biology validation for the gut metagenomics in neurodegenerative diseases, with an emphasis on multi-omics studies and the gut virome. The pathogen-associated signaling biomarkers for identifying brain disorders and potential therapeutic targets are also elucidated. Finally, we discuss the role of diet, prebiotics, probiotics, postbiotics and exercise interventions in remodeling the microbiome and reducing the symptoms of neurodegenerative diseases.
Affiliation(s)
- Xiaoyan Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yi Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
  - Institute of Animal Husbandry, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu Province, China
- Junlin Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Hantao Zhang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Chaofan Shan
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yinglu Guo
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Xun Gong
  - Department of Rheumatology & Immunology, Affiliated Hospital of Jiangsu University, Zhenjiang, Jiangsu Province, China
- Mengmeng Cui
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Xiubin Li
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China

9
Qiu Z, Yuan L, Lian CA, Lin B, Chen J, Mu R, Qiao X, Zhang L, Xu Z, Fan L, Zhang Y, Wang S, Li J, Cao H, Li B, Chen B, Song C, Liu Y, Shi L, Tian Y, Ni J, Zhang T, Zhou J, Zhuang WQ, Yu K. BASALT refines binning from metagenomic data and increases resolution of genome-resolved metagenomic analysis. Nat Commun 2024; 15:2179. [PMID: 38467684 PMCID: PMC10928208 DOI: 10.1038/s41467-024-46539-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Accepted: 03/01/2024] [Indexed: 03/13/2024]
Abstract
Metagenomic binning is an essential technique for genome-resolved characterization of uncultured microorganisms in various ecosystems but is hampered by the low efficiency of binning tools in adequately recovering metagenome-assembled genomes (MAGs). Here, we introduce BASALT (Binning Across a Series of Assemblies Toolkit) for binning and refinement of short- and long-read sequencing data. BASALT employs multiple binners with multiple thresholds to produce initial bins, then utilizes neural networks to identify core sequences to remove redundant bins and refine non-redundant bins. Using the same assemblies generated from Critical Assessment of Metagenome Interpretation (CAMI) datasets, BASALT produces up to twice as many MAGs as VAMB, DASTool, or metaWRAP. Processing assemblies from a lake sediment dataset, BASALT produces ~30% more MAGs than metaWRAP, including 21 unique class-level prokaryotic lineages. Functional annotations reveal that BASALT can retrieve 47.6% more non-redundant open reading frames than metaWRAP. These results highlight BASALT's robust handling of metagenomic sequencing data.
Collapse
Affiliation(s)
- Zhiguang Qiu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
| | - Li Yuan
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
| | - Chun-Ang Lian
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
| | - Bin Lin
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
| | - Jie Chen
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
| | - Rong Mu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
| | - Xuejiao Qiao
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
| | - Liyu Zhang
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
| | - Zheng Xu
- Southern University of Sciences and Technology Yantian Hospital, Shenzhen, China
- Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China
| | - Lu Fan
- Department of Ocean Science and Engineering, Southern University of Science and Technology (SUSTech), Shenzhen, China
| | - Yunzeng Zhang
- Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, China
| | - Shanquan Wang
- Environmental Microbiomics Research Center, School of Environmental Science and Engineering, Sun Yat-Sen University, Guangzhou, China
| | - Junyi Li
- School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen), Shenzhen, Guangdong, China
| | - Huiluo Cao
- Department of Microbiology, University of Hong Kong, Hong Kong, China
| | - Bing Li
- Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
| | - Baowei Chen
- Guangdong Provincial Key Laboratory of Marine Resources and Coastal Engineering, School of Marine Sciences, Sun Yat-sen University, Zhuhai, China
| | - Chi Song
- Institute of Herbgenomics, Chengdu University of Traditional Chinese Medicine, Chengdu, China
- Wuhan Benagen Technology Co., Ltd, Wuhan, China
| | - Yongxin Liu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Lili Shi
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- State Key Laboratory of Chemical Oncogenomics, School of Chemical Biology and Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, China
| | - Yonghong Tian
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China
- School of Electronic and Computer Engineering, Peking University, Shenzhen, China
- Peng Cheng Laboratory, Shenzhen, China
| | - Jinren Ni
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China
- College of Environmental Sciences and Engineering, Key Laboratory of Water and Sediment Sciences, Ministry of Education, Peking University, Beijing, China
| | - Tong Zhang
- Department of Civil Engineering, University of Hong Kong, Hong Kong, China
| | - Jizhong Zhou
- Institute for Environmental Genomics, University of Oklahoma, Norman, OK, USA
| | - Wei-Qin Zhuang
- Department of Civil and Environmental Engineering, Faculty of Engineering, University of Auckland, Auckland, New Zealand
| | - Ke Yu
- Eco-environment and Resource Efficiency Research Laboratory, School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, China.
- AI for Science (AI4S)-Preferred Program, Peking University, Shenzhen, China.
|
10
|
Hui X, Yang J, Sun J, Liu F, Pan W. MCSS: microbial community simulator based on structure. Front Microbiol 2024; 15:1358257. [PMID: 38516019 PMCID: PMC10956353 DOI: 10.3389/fmicb.2024.1358257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024] Open
Abstract
De novo assembly plays a pivotal role in metagenomic analysis, and the incorporation of third-generation sequencing technology can significantly improve the integrity and accuracy of assembly results. Recently, with advancements in sequencing technology (Hi-Fi, ultra-long), several long-read-based bioinformatic tools have been developed. However, validating the performance and reliability of these tools remains a crucial concern. To address this gap, we present MCSS (microbial community simulator based on structure), which generates simulated microbial communities and sequencing datasets based on the structural attributes of real microbiome communities. The evaluation results indicate that it can generate simulated communities that exhibit both diversity and similarity to actual community structures. Additionally, MCSS generates synthetic PacBio Hi-Fi and Oxford Nanopore Technologies (ONT) long reads for the species within the simulated community. This tool provides a valuable resource for benchmarking and refining metagenomic analysis methods. Code available at: https://github.com/panlab-bio/mcss.
Affiliation(s)
- Xingqi Hui
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou, China
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
| | - Jinbao Yang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
- College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Jinhuan Sun
- Key Laboratory of Plant Molecular Physiology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Botany, Chinese Academy of Sciences, Beijing, China
| | - Fang Liu
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou, China
- National Key Laboratory of Cotton Bio-Breeding and Integrated Utilization, Institute of Cotton Research, Chinese Academy of Agricultural Sciences (ICR, CAAS), Anyang, China
| | - Weihua Pan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
|
11
|
Liang H, Huang J, Tao Y, Klümper U, Berendonk TU, Zhou K, Xia Y, Yang Y, Yu Y, Yu K, Lin L, Li X, Li B. Investigating the antibiotic resistance genes and their potential risks in the megacity water environment: A case study of Shenzhen Bay Basin, China. J Hazard Mater 2024; 465:133536. [PMID: 38242018 DOI: 10.1016/j.jhazmat.2024.133536] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Revised: 01/13/2024] [Accepted: 01/13/2024] [Indexed: 01/21/2024]
Abstract
Antibiotic resistance genes (ARGs) constitute emerging pollutants and pose serious risks to public health. Anthropogenic activities are recognized as the main driver of ARG dissemination in coastal regions. However, the distribution and dissemination of ARGs in Shenzhen Bay Basin, a typical megacity water environment, have been poorly investigated. Here, we comprehensively profiled ARGs in Shenzhen Bay Basin using metagenomic approaches, and estimated their associated health risks. ARG profiles varied greatly among different sampling locations with total abundance ranging from 2.79 × 10⁻² (Shenzhen Bay sediment) to 1.04 (hospital sewage) copies per 16S rRNA gene copy, and 45.4% of them were located on plasmid-like sequences. Sewage treatment plant effluent and the corresponding tributary rivers were identified as the main sources of ARG contamination in Shenzhen Bay. Mobilizable plasmids and complete integrons carrying various ARGs probably participated in the dissemination of ARGs in Shenzhen Bay Basin. Additionally, 19 subtypes were assigned as high-risk ARGs (Rank I), and numerous ARGs were identified in potential human-associated pathogens, such as Burkholderiaceae, Rhodocyclaceae, Vibrionaceae, Pseudomonadaceae, and Aeromonadaceae. Overall, Shenzhen Bay represented a higher level of ARG risk than the ocean environment based on quantitative risk assessment. This study deepened our understanding of the ARGs and the associated risks in the megacity water environment.
Affiliation(s)
- Hebin Liang
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - Jin Huang
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - Yi Tao
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - Uli Klümper
- Institute for Hydrobiology, Technische Universität Dresden, Dresden 01217, Germany
| | - Thomas U Berendonk
- Institute for Hydrobiology, Technische Universität Dresden, Dresden 01217, Germany
| | - Kai Zhou
- Shenzhen Institute of Respiratory Disease, Shenzhen People's Hospital (the First Affiliated Hospital, Southern University of Science and Technology; the Second Clinical Medical College, Jinan University), Shenzhen 518020, China
| | - Yu Xia
- School of Environmental Science and Engineering, College of Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| | - Ying Yang
- School of Marine Sciences, Sun Yat-Sen University, Zhuhai 519082, China; Southern Marine Science and Engineering Guangdong Laboratory, Zhuhai 519000, China
| | - Yang Yu
- National Risk Assessment Laboratory for Antimicrobial Resistance of Animal Original Bacteria, South China Agricultural University, Guangzhou 510642, China; Guangdong Provincial Key Laboratory of Veterinary Pharmaceutics Development and Safety Evaluation, South China Agricultural University, Guangzhou 510642, China
| | - Ke Yu
- School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen 518055, China
| | - Lin Lin
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - Xiaoyan Li
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China
| | - Bing Li
- State Environmental Protection Key Laboratory of Microorganism Application and Risk Control, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen 518055, China.
|
12
|
Hosokawa M, Nishikawa Y. Tools for microbial single-cell genomics for obtaining uncultured microbial genomes. Biophys Rev 2024; 16:69-77. [PMID: 38495448 PMCID: PMC10937852 DOI: 10.1007/s12551-023-01124-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/14/2023] [Accepted: 08/23/2023] [Indexed: 03/19/2024] Open
Abstract
The advent of next-generation sequencing technologies has facilitated the acquisition of large amounts of DNA sequence data at a relatively low cost, leading to numerous breakthroughs in decoding microbial genomes. Among the various genome sequencing activities, metagenomic analysis, which entails the direct analysis of uncultured microbial DNA, has had a profound impact on microbiome research and has emerged as an indispensable technology in this field. Despite its valuable contributions, metagenomic analysis is a "bulk analysis" technique that analyzes samples containing a wide diversity of microbes, such as bacteria, yielding information that is averaged across the entire microbial population. In order to gain a deeper understanding of the heterogeneous nature of the microbial world, there is a growing need for single-cell analysis, similar to its use in human cell biology. With this paradigm shift in mind, comprehensive single-cell genomics technology has become a much-anticipated innovation that is now poised to revolutionize microbiome research. It has the potential to enable the discovery of differences at the strain level and to facilitate a more comprehensive examination of microbial ecosystems. In this review, we summarize the current state-of-the-art in microbial single-cell genomics, highlighting the potential impact of this technology on our understanding of the microbial world. The successful implementation of this technology is expected to have a profound impact in the field, leading to new discoveries and insights into the diversity and evolution of microbes.
Affiliation(s)
- Masahito Hosokawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-Cho, Shinjuku-Ku, Tokyo, 162-8480 Japan
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
- Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- bitBiome, Inc., 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
| | - Yohei Nishikawa
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
|
13
|
Qi W, Xue MY, Jia MH, Zhang S, Yan Q, Sun HZ. - Invited Review - Understanding the functionality of the rumen microbiota: searching for better opportunities for rumen microbial manipulation. Anim Biosci 2024; 37:370-384. [PMID: 38186256 PMCID: PMC10838668 DOI: 10.5713/ab.23.0308] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 11/03/2023] [Indexed: 01/09/2024] Open
Abstract
Rumen microbiota play a central role in the digestive process of ruminants. They have a remarkable ability to break down complex plant fibers and proteins, converting them into essential organic compounds that provide animals with energy and nutrition. Research on rumen microbiota not only contributes to improving animal production performance and enhancing feed utilization efficiency but also holds the potential to reduce methane emissions and environmental impact. Nevertheless, studies on rumen microbiota face numerous challenges, including complexity, difficulties in cultivation, and obstacles in functional analysis. This review provides an overview of microbial species involved in the degradation of macromolecules, the fermentation processes, and methane production in the rumen, all based on cultivation methods. Additionally, the review introduces the applications, advantages, and limitations of emerging omics technologies such as metagenomics, metatranscriptomics, metaproteomics, and metabolomics, in investigating the functionality of rumen microbiota. Finally, the article offers a forward-looking perspective on the new horizons and technologies in the field of rumen microbiota functional research. These emerging technologies, with continuous refinement and mutual complementation, have deepened our understanding of rumen microbiota functionality, thereby enabling effective manipulation of the rumen microbial community.
Affiliation(s)
- Wenlingli Qi
- Key Laboratory of Dairy Cow Genetic Improvement and Milk Quality Research of Zhejiang Province, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ming-Yuan Xue
- Key Laboratory of Dairy Cow Genetic Improvement and Milk Quality Research of Zhejiang Province, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
| | - Ming-Hui Jia
- Key Laboratory of Dairy Cow Genetic Improvement and Milk Quality Research of Zhejiang Province, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
| | - Shuxian Zhang
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China
| | - Qiongxian Yan
- CAS Key Laboratory of Agro-Ecological Processes in Subtropical Region, Hunan Provincial Key Laboratory of Animal Nutritional Physiology and Metabolic Process, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha 410125, China
| | - Hui-Zeng Sun
- Key Laboratory of Dairy Cow Genetic Improvement and Milk Quality Research of Zhejiang Province, College of Animal Sciences, Zhejiang University, Hangzhou 310058, China
|
14
|
Cook R, Brown N, Rihtman B, Michniewski S, Redgwell T, Clokie M, Stekel DJ, Chen Y, Scanlan DJ, Hobman JL, Nelson A, Jones MA, Smith D, Millard A. The long and short of it: benchmarking viromics using Illumina, Nanopore and PacBio sequencing technologies. Microb Genom 2024; 10:001198. [PMID: 38376377 PMCID: PMC10926689 DOI: 10.1099/mgen.0.001198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Accepted: 01/25/2024] [Indexed: 02/21/2024] Open
Abstract
Viral metagenomics has fuelled a rapid change in our understanding of global viral diversity and ecology. Long-read sequencing and hybrid assembly approaches that combine long- and short-read technologies are now being widely implemented in bacterial genomics and metagenomics. However, the use of long-read sequencing to investigate viral communities is still in its infancy. While Nanopore and PacBio technologies have been applied to viral metagenomics, it is not known to what extent different technologies will impact the reconstruction of the viral community. Thus, we constructed a mock bacteriophage community of previously sequenced phage genomes and sequenced them using Illumina, Nanopore and PacBio sequencing technologies and tested a number of different assembly approaches. When using a single sequencing technology, Illumina assemblies were the best at recovering phage genomes. Nanopore- and PacBio-only assemblies performed poorly in comparison to Illumina in both genome recovery and error rates, which both varied with the assembler used. The best Nanopore assembly had errors that manifested as SNPs and INDELs at frequencies 41 and 157 % higher than found in Illumina-only assemblies, respectively. The best PacBio assemblies had SNPs and INDELs at frequencies 12 and 78 % higher than found in Illumina-only assemblies, respectively. Despite high read coverage, long-read-only assemblies recovered a maximum of one complete genome from any assembly, unless reads were down-sampled prior to assembly. Overall, the best approach was assembly by a combination of Illumina and Nanopore reads, which reduced error rates to levels comparable with short-read-only assemblies. When using a single technology, Illumina only was the best approach. The differences in genome recovery and error rates between technology and assembler had downstream impacts on gene prediction, viral prediction, and subsequent estimates of diversity within a sample. These findings will provide a starting point for others in the choice of reads and assembly algorithms for the analysis of viromes.
Affiliation(s)
- Ryan Cook
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, College Road, Loughborough, Leicestershire, LE12 5RD, UK
| | - Nathan Brown
- Centre for Phage Research, Dept Genetics and Genome Biology, University of Leicester, University Road, Leicester, Leicestershire, LE1 7RH, UK
| | - Branko Rihtman
- School of Life Sciences, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK
| | - Slawomir Michniewski
- Warwick Medical School, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK
| | - Tamsin Redgwell
- COPSAC, Copenhagen Prospective Studies on Asthma in Childhood, Herlev and Gentofte Hospital, University of Copenhagen, Ledreborg Alle 34, 2820, Gentofte, Denmark
| | - Martha Clokie
- Centre for Phage Research, Dept Genetics and Genome Biology, University of Leicester, University Road, Leicester, Leicestershire, LE1 7RH, UK
| | - Dov J. Stekel
- School of Biosciences, University of Nottingham, Sutton Bonington Campus, College Road, Loughborough, Leicestershire, LE12 5RD, UK
- Department of Mathematics and Applied Mathematics, University of Johannesburg, Rossmore 2029, South Africa
| | - Yin Chen
- School of Life Sciences, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK
| | - David J. Scanlan
- School of Life Sciences, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK
| | - Jon L. Hobman
- School of Biosciences, University of Nottingham, Sutton Bonington Campus, College Road, Loughborough, Leicestershire, LE12 5RD, UK
| | - Andrew Nelson
- Faculty of Health and Life Sciences, University of Northumbria, Newcastle upon Tyne, NE1 8ST, UK
| | - Michael A. Jones
- School of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, College Road, Loughborough, Leicestershire, LE12 5RD, UK
| | - Darren Smith
- Faculty of Health and Life Sciences, University of Northumbria, Newcastle upon Tyne, NE1 8ST, UK
| | - Andrew Millard
- Centre for Phage Research, Dept Genetics and Genome Biology, University of Leicester, University Road, Leicester, Leicestershire, LE1 7RH, UK
|
15
|
Cerk K, Ugalde‐Salas P, Nedjad CG, Lecomte M, Muller C, Sherman DJ, Hildebrand F, Labarthe S, Frioux C. Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing. Microb Biotechnol 2024; 17:e14396. [PMID: 38243750 PMCID: PMC10832553 DOI: 10.1111/1751-7915.14396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 11/27/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024] Open
Abstract
Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.
Affiliation(s)
- Klara Cerk
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | | | - Chabname Ghassemi Nedjad
- Inria, University of Bordeaux, INRAE, Talence, France
- University of Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France
| | - Maxime Lecomte
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE STLO, University of Rennes, Rennes, France
| | | | | | - Falk Hildebrand
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | - Simon Labarthe
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE, University of Bordeaux, BIOGECO, UMR 1202, Cestas, France
|
16
|
McGuinness AJ, Stinson LF, Snelson M, Loughman A, Stringer A, Hannan AJ, Cowan CSM, Jama HA, Caparros-Martin JA, West ML, Wardill HR. From hype to hope: Considerations in conducting robust microbiome science. Brain Behav Immun 2024; 115:120-130. [PMID: 37806533 DOI: 10.1016/j.bbi.2023.09.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 08/14/2023] [Accepted: 09/30/2023] [Indexed: 10/10/2023] Open
Abstract
Microbiome science has been one of the most exciting and rapidly evolving research fields in the past two decades. Breakthroughs in technologies including DNA sequencing have meant that the trillions of microbes (particularly bacteria) inhabiting human biological niches (particularly the gut) can be profiled and analysed in exquisite detail. This microbiome profiling has profound impacts across many fields of research, especially biomedical science, with implications for how we understand and ultimately treat a wide range of human disorders. However, like many great scientific frontiers in human history, the pioneering nature of microbiome research comes with a multitude of challenges and potential pitfalls. These include the reproducibility and robustness of microbiome science, especially in its applications to human health outcomes. In this article, we address the enormous promise of microbiome science and its many challenges, proposing constructive solutions to enhance the reproducibility and robustness of research in this nascent field. The optimisation of microbiome science spans research design, implementation and analysis, and we discuss specific aspects such as the importance of ecological principles and functionality, challenges with microbiome-modulating therapies and the consideration of confounding, alternative options for microbiome sequencing, and the potential of machine learning and computational science to advance the field. The power of microbiome science promises to revolutionise our understanding of many diseases and provide new approaches to prevention, early diagnosis, and treatment.
Affiliation(s)
- Amelia J McGuinness
- Deakin University, Geelong, Australia, the Institute for Mental and Physical Health and Clinical Translation (IMPACT), School of Medicine and Barwon Health, Geelong, Australia
| | - Lisa F Stinson
- School of Molecular Sciences, The University of Western Australia, Perth, WA, Australia
| | - Matthew Snelson
- Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Clayton, VIC, Australia.
| | - Amy Loughman
- Deakin University, Geelong, Australia, the Institute for Mental and Physical Health and Clinical Translation (IMPACT), School of Medicine and Barwon Health, Geelong, Australia
| | - Andrea Stringer
- Clinical and Health Sciences, University of South Australia, Adelaide, South Australia, Australia
| | - Anthony J Hannan
- The Florey Institute of Neuroscience and Mental Health, University of Melbourne, Parkville, Australia
| | | | - Hamdi A Jama
- Hypertension Research Laboratory, School of Biological Sciences, Faculty of Science, Monash University, Clayton, VIC, Australia
| | | | - Madeline L West
- Deakin University, Geelong, Australia, the Institute for Mental and Physical Health and Clinical Translation (IMPACT), School of Medicine and Barwon Health, Geelong, Australia
| | - Hannah R Wardill
- Supportive Oncology Research Group, Precision Medicine (Cancer), South Australian Health and Medical Research Institute (SAHMRI), University of Adelaide, Adelaide, South Australia, Australia
|
17
|
Arikawa K, Hosokawa M. Uncultured prokaryotic genomes in the spotlight: An examination of publicly available data from metagenomics and single-cell genomics. Comput Struct Biotechnol J 2023; 21:4508-4518. [PMID: 37771751 PMCID: PMC10523443 DOI: 10.1016/j.csbj.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 09/10/2023] [Accepted: 09/10/2023] [Indexed: 09/30/2023] Open
Abstract
Owing to the ineffectiveness of traditional culture techniques for the vast majority of microbial species, culture-independent analyses utilizing next-generation sequencing and bioinformatics have become essential for gaining insight into microbial ecology and function. This mini-review focuses on two essential methods for obtaining genetic information from uncultured prokaryotes: metagenomics and single-cell genomics. We analyzed the registration status of uncultured prokaryotic genome data from major public databases and assessed the advantages and limitations of both methods. Metagenomics generates a significant quantity of sequence data and multiple prokaryotic genomes using straightforward experimental procedures. However, in ecosystems with high microbial diversity, such as soil, most genes are recovered as short, disconnected contigs, and highly conserved genes and mobile genetic elements cannot be associated with individual species genomes. Although technically more challenging, single-cell genomics offers valuable insights into complex ecosystems by providing strain-resolved genomes, addressing these issues in metagenomics. Recent technological advancements, such as long-read sequencing, machine learning algorithms, and in silico protein structure prediction, in combination with vast genomic data, have the potential to overcome the current technical challenges and facilitate a deeper understanding of uncultured microbial ecosystems and microbial dark matter genes and proteins. In light of this, it is imperative that continued innovation in both methods and technologies take place to create high-quality reference genome databases that will support future microbial research and industrial applications.
Affiliation(s)
- Koji Arikawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
| | - Masahito Hosokawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
|
18
|
Pessi IS, Popin RV, Durieu B, Lara Y, Tytgat B, Savaglia V, Roncero-Ramos B, Hultman J, Verleyen E, Vyverman W, Wilmotte A. Novel diversity of polar Cyanobacteria revealed by genome-resolved metagenomics. Microb Genom 2023; 9:mgen001056. [PMID: 37417735 PMCID: PMC10438808 DOI: 10.1099/mgen.0.001056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2023] [Accepted: 05/30/2023] [Indexed: 07/08/2023] Open
Abstract
Benthic microbial mats dominated by Cyanobacteria are important features of polar lakes. Although culture-independent studies have provided important insights into the diversity of polar Cyanobacteria, only a handful of genomes have been sequenced to date. Here, we applied a genome-resolved metagenomics approach to data obtained from Arctic, sub-Antarctic and Antarctic microbial mats. We recovered 37 metagenome-assembled genomes (MAGs) of Cyanobacteria representing 17 distinct species, most of which are only distantly related to genomes that have been sequenced so far. These include (i) lineages that are common in polar microbial mats such as the filamentous taxa Pseudanabaena, Leptolyngbya, Microcoleus/Tychonema and Phormidium; (ii) the less common taxa Crinalium and Chamaesiphon; (iii) an enigmatic Chroococcales lineage only distantly related to Microcystis; and (iv) an early branching lineage in the order Gloeobacterales that is distributed across the cold biosphere, for which we propose the name Candidatus Sivonenia alaskensis. Our results show that genome-resolved metagenomics is a powerful tool for expanding our understanding of the diversity of Cyanobacteria, especially in understudied remote and extreme environments.
Affiliation(s)
- Igor S. Pessi
  - Department of Microbiology, University of Helsinki, Helsinki, Finland
  - Helsinki Institute of Sustainability Science (HELSUS), Helsinki, Finland
- Rafael V. Popin
  - Department of Microbiology, University of Helsinki, Helsinki, Finland
- Benoit Durieu
  - InBioS – Centre for Protein Engineering, University of Liège, Liège, Belgium
- Yannick Lara
  - Early Life Traces & Evolution-Astrobiology, UR-Astrobiology, University of Liège, Liège, Belgium
- Bjorn Tytgat
  - Laboratory of Protistology & Aquatic Ecology, Ghent University, Ghent, Belgium
- Valentina Savaglia
  - InBioS – Centre for Protein Engineering, University of Liège, Liège, Belgium
  - Laboratory of Protistology & Aquatic Ecology, Ghent University, Ghent, Belgium
- Beatriz Roncero-Ramos
  - InBioS – Centre for Protein Engineering, University of Liège, Liège, Belgium
  - Department of Plant Biology and Ecology, University of Sevilla, Sevilla, Spain
- Jenni Hultman
  - Department of Microbiology, University of Helsinki, Helsinki, Finland
  - Helsinki Institute of Sustainability Science (HELSUS), Helsinki, Finland
  - Natural Resources Institute Finland (LUKE), Helsinki, Finland
- Elie Verleyen
  - Laboratory of Protistology & Aquatic Ecology, Ghent University, Ghent, Belgium
- Wim Vyverman
  - Laboratory of Protistology & Aquatic Ecology, Ghent University, Ghent, Belgium
- Annick Wilmotte
  - InBioS – Centre for Protein Engineering, University of Liège, Liège, Belgium
19
De León LF, Silva B, Avilés-Rodríguez KJ, Buitrago-Rosas D. Harnessing the omics revolution to address the global biodiversity crisis. Curr Opin Biotechnol 2023; 80:102901. [PMID: 36773576 DOI: 10.1016/j.copbio.2023.102901] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2022] [Revised: 01/10/2023] [Accepted: 01/18/2023] [Indexed: 02/12/2023]
Abstract
Human disturbances are altering global biodiversity in unprecedented ways. We identify three fundamental challenges underpinning our understanding of global biodiversity (namely discovery, loss, and preservation), and discuss how the omics revolution (e.g. genomics, transcriptomics, proteomics, metabolomics, and meta-omics) can help address these challenges. We also discuss how omics tools can illuminate the major drivers of biodiversity loss, including invasive species, pollution, urbanization, overexploitation, and climate change, with a special focus on highly diverse tropical environments. Although omics tools are transforming the traditional toolkit of biodiversity research, their application to addressing the current biodiversity crisis remains limited and may not suffice to offset current rates of biodiversity loss. Despite technical and logistical challenges, omics tools need to be fully integrated into global biodiversity research, and better strategies are needed to improve their translation into biodiversity policy and practice. It is also important to recognize that although the omics revolution can be considered the biologist's dream, socioeconomic disparity limits their application in biodiversity research.
Affiliation(s)
- Luis F De León
  - Department of Biology, University of Massachusetts Boston, Boston, MA 02125, USA
- Bruna Silva
  - Department of Biology, University of Massachusetts Boston, Boston, MA 02125, USA
- Kevin J Avilés-Rodríguez
  - Department of Biology, University of Massachusetts Boston, Boston, MA 02125, USA
  - Department of Biology, Fordham University, Bronx, NY, USA
20
Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing. Microbiol Spectr 2023; 11:e0332822. [PMID: 36475839 PMCID: PMC9927493 DOI: 10.1128/spectrum.03328-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity and deficiencies in key functional elements. Here, by applying long-read sequencing technology to sediment samples from a Tibetan saline lake, we comprehensively analyzed the performance of high-fidelity (HiFi) reads and the possibility of integration with short-read next-generation sequencing (NGS) data. In total, 207 full-length nonredundant 16S rRNA gene sequences and 19 full-length nonredundant 18S rRNA genes were directly obtained from HiFi reads, which greatly surpassed the retrieval performance of NGS technology. We carried out a cross-sectional comparison among multiple assembly strategies, referred to as 'NGS', 'Hybrid (NGS+HiFi)', and 'HiFi'. Two MAGs and 29 viruses with circular genomes were reconstructed using HiFi reads alone, indicating the great power of the 'HiFi' approach to assemble high-quality microbial genomes. Among the 3 strategies, the 'Hybrid' approach produced the highest number of medium/high-quality MAGs and viral genomes, while the ratio of MAGs containing 16S rRNA genes was significantly improved in the 'HiFi' assembly results. Overall, our study provides a practical metagenomic resolution for analyzing complex environmental samples by taking advantage of both the short-read and HiFi long-read sequencing methods to extract the maximum amount of information, including data on prokaryotes, eukaryotes, and viruses, via the 'Hybrid' approach.
IMPORTANCE To expand the understanding of microbial dark matter in the environment, we did the first comparative evaluation of multiple assembly strategies based on high-throughput short-read and HiFi data from lake sediments metagenomic sequencing. The results demonstrated great improvement of the 'Hybrid' assembly method (short-read next-generation sequencing data plus HiFi data) in the recovery of medium/high-quality MAGs and viral genomes. Further analysis showed that HiFi data is important to retrieve the complete circular prokaryotic and viral genomes. Meanwhile, hundreds of full-length 16S/18S rRNA genes were assembled directly from HiFi data, which facilitated the species composition studies of complex environmental samples, especially for understanding micro-eukaryotes. Therefore, the application of the latest HiFi long-read sequencing could greatly improve the metagenomic assembly integrity and promote environmental microbiome research.
21
Huang Y, Jiang P, Liang Z, Chen R, Yue Z, Xie X, Guan C, Fang X. Assembly and analytical validation of a metagenomic reference catalog of human gut microbiota based on co-barcoding sequencing. Front Microbiol 2023; 14:1145315. [PMID: 37213501 PMCID: PMC10196144 DOI: 10.3389/fmicb.2023.1145315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2023] [Accepted: 04/06/2023] [Indexed: 05/23/2023] Open
Abstract
Human gut microbiota is associated with human health and disease, and is known to have the second-largest genome in the human body. The microbiota genome is important for their functions and metabolites; however, accurate genomic access to the microbiota of the human gut is hindered due to the difficulty of cultivating and the shortcomings of sequencing technology. Therefore, we applied the stLFR library construction method to assemble the microbiota genomes and demonstrated that assembly property outperformed standard metagenome sequencing. Using the assembled genomes as references, SNP, INDEL, and HGT gene analyses were performed. The results demonstrated significant differences in the number of SNPs and INDELs among different individuals. The individual displayed a unique species variation spectrum, and the similarity of strains within individuals decreased over time. In addition, the coverage depth analysis of the stLFR method shows that a sequencing depth of 60X is sufficient for SNP calling. HGT analysis revealed that the genes involved in replication, recombination and repair, mobilome prophages, and transposons were the most transferred genes among different bacterial species in individuals. A preliminary framework for human gut microbiome studies was established using the stLFR library construction method.
Affiliation(s)
- Yufen Huang
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
  - BGI-Shenzhen, Shenzhen, China
- Zhen Yue
  - BGI-Sanya, BGI-Shenzhen, Sanya, China
- Changge Guan
  - BGI-Sanya, BGI-Shenzhen, Sanya, China
  - Correspondence: Changge Guan
- Xiaodong Fang
  - BGI Genomics, BGI-Shenzhen, Shenzhen, China
  - BGI-Shenzhen, Shenzhen, China
  - State Key Laboratory of Dampness Syndrome of Chinese Medicine, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou, China
  - Correspondence: Xiaodong Fang
22
Fedarko MW, Kolmogorov M, Pevzner PA. Analyzing rare mutations in metagenomes assembled using long and accurate reads. Genome Res 2022; 32:2119-2133. [PMID: 36418060 PMCID: PMC9808630 DOI: 10.1101/gr.276917.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 11/16/2022] [Indexed: 11/25/2022]
Abstract
The advent of long and accurate "HiFi" reads has greatly improved our ability to generate complete metagenome-assembled genomes (MAGs), enabling "complete metagenomics" studies that were nearly impossible to conduct with short reads. In particular, HiFi reads simplify the identification and phasing of mutations in MAGs: It is increasingly feasible to distinguish between positions that are prone to mutations and positions that rarely ever mutate, and to identify co-occurring groups of mutations. However, the problems of identifying rare mutations in MAGs, estimating the false-discovery rate (FDR) of these identifications, and phasing identified mutations remain open in the context of HiFi data. We present strainFlye, a pipeline for the FDR-controlled identification and analysis of rare mutations in MAGs assembled using HiFi reads. We show that deep HiFi sequencing has the potential to reveal and phase tens of thousands of rare mutations in a single MAG, identify hotspots and coldspots of these mutations, and detail MAGs' growth dynamics.
Affiliation(s)
- Marcus W. Fedarko
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
- Mikhail Kolmogorov
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
  - UC Santa Cruz Genomics Institute, Santa Cruz, California 95064, USA
- Pavel A. Pevzner
  - Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
  - Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA