51
Jin H, Quan K, He Q, Kwok LY, Ma T, Li Y, Zhao F, You L, Zhang H, Sun Z. A high-quality genome compendium of the human gut microbiome of Inner Mongolians. Nat Microbiol 2023; 8:150-161. [PMID: 36604505 DOI: 10.1038/s41564-022-01270-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 10/13/2022] [Indexed: 01/07/2023]
Abstract
Metagenome-based resources have revealed the diversity and function of the human gut microbiome, but further understanding is limited by insufficient genome quality and a lack of samples from typically understudied populations. Here we used hybrid long-read PromethION and short-read HiSeq sequencing to characterize the faecal microbiota of 60 Inner Mongolian individuals (n = 180 samples over three time points) who were part of a probiotic yogurt intervention trial. We present the Inner Mongolian Gut Genome catalogue, comprising 802 closed and 5,927 high-quality metagenome-assembled genomes. This approach achieved high genome continuity and substantially increased the resolution of genomic elements, including ribosomal RNA operons, metabolic gene clusters, prophages and insertion sequences. Particularly, we report the ribosomal RNA operon copy numbers for uncultured species, over 12,000 previously undescribed gut prophages and the distribution of insertion sequence elements across gut bacteria. Overall, these data provide a high-quality, large-scale resource for studying the human gut microbiota.
Affiliation(s)
- Hao Jin, Keyu Quan, Qiuwen He, Lai-Yu Kwok, Teng Ma, Yalin Li, Feiyan Zhao, Lijun You, Heping Zhang, Zhihong Sun
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
- Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
- Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
52
Long-Read Metagenome-Assembled Genomes Improve Identification of Novel Complete Biosynthetic Gene Clusters in a Complex Microbial Activated Sludge Ecosystem. mSystems 2022; 7:e0063222. [PMID: 36445112 PMCID: PMC9765116 DOI: 10.1128/msystems.00632-22] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/03/2022] Open
Abstract
Microorganisms produce a wide variety of secondary/specialized metabolites (SMs), the majority of which are yet to be discovered. These natural products play multiple roles in microbiomes and are important for microbial competition, communication, and success in the environment. SMs have been our major source of antibiotics and are used in a range of biotechnological applications. In silico mining for biosynthetic gene clusters (BGCs) encoding the production of SMs is commonly used to assess the genetic potential of organisms. However, as BGCs span tens to over 200 kb, identifying complete BGCs requires genome data that has minimal assembly gaps within the BGCs, a prerequisite that was previously only met by individually sequenced genomes. Here, we assess the performance of the currently available genome mining platform antiSMASH on 1,080 high-quality metagenome-assembled bacterial genomes (HQ MAGs) previously produced from wastewater treatment plants (WWTPs) using a combination of long-read (Oxford Nanopore) and short-read (Illumina) sequencing technologies. More than 4,200 different BGCs were identified, with 88% of these being complete. Sequence similarity clustering of the BGCs implies that the majority of this biosynthetic potential likely encodes novel compounds, and few BGCs are shared between genera. We identify BGCs in abundant and functionally relevant genera in WWTPs, suggesting a role of secondary metabolism in this ecosystem. We find that the assembly of HQ MAGs using long-read sequencing is vital to explore the genetic potential for SM production among the uncultured members of microbial communities.
IMPORTANCE Cataloguing secondary metabolite (SM) potential using genome mining of metagenomic data has become the method of choice in bioprospecting for novel compounds. However, accurate biosynthetic gene cluster (BGC) detection requires unfragmented genomic assemblies, which have been technically difficult to obtain from metagenomes until very recently with new long-read technologies. Here, we determined the biosynthetic potential of activated sludge (AS), the microbial community used in resource recovery and wastewater treatment, by mining high-quality metagenome-assembled genomes generated from long-read data. We found over 4,000 BGCs, including BGCs in abundant process-critical bacteria, with no similarity to the BGCs of characterized products. We show how long-read MAGs are required to confidently assemble complete BGCs, and we determined that the AS BGCs from different studies have very little overlap, suggesting that AS is a rich source of biosynthetic potential and new bioactive compounds.
53
Tadrent N, Dedeine F, Hervé V. SnakeMAGs: a simple, efficient, flexible and scalable workflow to reconstruct prokaryotic genomes from metagenomes. F1000Res 2022; 11:1522. [PMID: 36875992 PMCID: PMC9978240 DOI: 10.12688/f1000research.128091.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/23/2023] [Indexed: 03/02/2023] Open
Abstract
Background: Over the last decade, we have observed in microbial ecology a transition from gene-centric to genome-centric analyses. Indeed, the advent of metagenomics combined with binning methods, single-cell genome sequencing as well as high-throughput cultivation methods have contributed to the continuing and exponential increase of available prokaryotic genomes, which in turn has favored the exploration of microbial metabolisms. In the case of metagenomics, data processing, from raw reads to genome reconstruction, involves various steps and software which can represent a major technical obstacle. Methods: To overcome this challenge, we developed SnakeMAGs, a simple workflow that can process Illumina data, from raw reads to metagenome-assembled genomes (MAGs) classification and relative abundance estimate. It integrates state-of-the-art bioinformatic tools to sequentially perform: quality control of the reads (illumina-utils, Trimmomatic), host sequence removal (optional step, using Bowtie2), assembly (MEGAHIT), binning (MetaBAT2), quality filtering of the bins (CheckM, GUNC), classification of the MAGs (GTDB-Tk) and estimate of their relative abundance (CoverM). Developed with the popular Snakemake workflow management system, it can be deployed on various architectures, from single to multicore and from workstation to computer clusters and grids. It is also flexible since users can easily change parameters and/or add new rules. Results: Using termite gut metagenomic datasets, we showed that SnakeMAGs is slower but allowed the recovery of more MAGs encompassing more diverse phyla compared to another similar workflow named ATLAS. Importantly, these additional MAGs showed no significant difference compared to the other ones in terms of completeness, contamination, genome size nor relative abundance. Conclusions: Overall, it should make the reconstruction of MAGs more accessible to microbiologists. 
SnakeMAGs as well as test files and an extended tutorial are available at https://github.com/Nachida08/SnakeMAGs.
Affiliation(s)
- Nachida Tadrent
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Franck Dedeine
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Vincent Hervé
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Université Paris-Saclay, INRAE, AgroParisTech, UMR SayFood, Palaiseau, 91120, France
54
Tadrent N, Dedeine F, Hervé V. SnakeMAGs: a simple, efficient, flexible and scalable workflow to reconstruct prokaryotic genomes from metagenomes. F1000Res 2022; 11:1522. [PMID: 36875992 PMCID: PMC9978240 DOI: 10.12688/f1000research.128091.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/01/2022] [Indexed: 01/05/2024] Open
Abstract
Background: Over the last decade, we have observed in microbial ecology a transition from gene-centric to genome-centric analyses. Indeed, the advent of metagenomics combined with binning methods, single-cell genome sequencing as well as high-throughput cultivation methods have contributed to the continuing and exponential increase of available prokaryotic genomes, which in turn has favored the exploration of microbial metabolisms. In the case of metagenomics, data processing, from raw reads to genome reconstruction, involves various steps and software which can represent a major technical obstacle. Methods: To overcome this challenge, we developed SnakeMAGs, a simple workflow that can process Illumina data, from raw reads to metagenome-assembled genomes (MAGs) classification and relative abundance estimate. It integrates state-of-the-art bioinformatic tools to sequentially perform: quality control of the reads (illumina-utils, Trimmomatic), host sequence removal (optional step, using Bowtie2), assembly (MEGAHIT), binning (MetaBAT2), quality filtering of the bins (CheckM), classification of the MAGs (GTDB-Tk) and estimate of their relative abundance (CoverM). Developed with the popular Snakemake workflow management system, it can be deployed on various architectures, from single to multicore and from workstation to computer clusters and grids. It is also flexible since users can easily change parameters and/or add new rules. Results: Using termite gut metagenomic datasets, we showed that SnakeMAGs is slower but allowed the recovery of more MAGs encompassing more diverse phyla compared to another similar workflow named ATLAS. Conclusions: Overall, it should make the reconstruction of MAGs more accessible to microbiologists. SnakeMAGs as well as test files and an extended tutorial are available at https://github.com/Nachida08/SnakeMAGs.
Affiliation(s)
- Nachida Tadrent
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Franck Dedeine
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Vincent Hervé
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Université Paris-Saclay, INRAE, AgroParisTech, UMR SayFood, Palaiseau, 91120, France
55
Vuong P, Wise MJ, Whiteley AS, Kaur P. Ten simple rules for investigating (meta)genomic data from environmental ecosystems. PLoS Comput Biol 2022; 18:e1010675. [PMID: 36480496 PMCID: PMC9731419 DOI: 10.1371/journal.pcbi.1010675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022] Open
Affiliation(s)
- Paton Vuong
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
- Michael J. Wise
- School of Physics, Mathematics and Computing, University of Western Australia, Perth, Australia
- The Marshall Centre of Infectious Diseases, School of Biological Sciences, The University of Western Australia, Perth, Australia
- Andrew S. Whiteley
- Centre for Environment & Life Sciences, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Floreat, Australia
- Parwinder Kaur
- UWA School of Agriculture & Environment, University of Western Australia, Perth, Australia
56
Liu L, Yang Y, Deng Y, Zhang T. Nanopore long-read-only metagenomics enables complete and high-quality genome reconstruction from mock and complex metagenomes. Microbiome 2022; 10:209. [PMID: 36457010 PMCID: PMC9716684 DOI: 10.1186/s40168-022-01415-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/07/2022] [Accepted: 11/07/2022] [Indexed: 05/31/2023]
Abstract
Background: Accurate and comprehensive genome-resolved metagenomic analyses depend largely on reconstructing reference-quality (complete and high-quality) genomes from diverse microbiomes. Nanopore long reads have made it increasingly feasible to close gaps in draft genomes; however, improving genome quality still requires extensive and time-consuming polishing with high-accuracy short reads. Results: Here, we introduce NanoPhase, an open-source tool to reconstruct reference-quality genomes from complex metagenomes using only Nanopore long reads. Using Kit 9 and Q20+ chemistries, we first evaluated the feasibility of NanoPhase using a ZymoBIOMICS gut microbiome standard (including 21 strains), then sequenced the complex activated sludge microbiome and reconstructed 275 MAGs with a median completeness of ~90%. NanoPhase improved MAG contiguity (median MAG N50: 735 Kb, 44-86X that of conventional short-read-based methods) while maintaining high accuracy, allowing for a full and accurate investigation of target microbiomes. Additionally, leveraging these high-contiguity reference-quality genomes, we identified 165 prophages within 111 MAGs, 5 of them active, indicating that prophages are a neglected source of genetic diversity within microbial populations and an influence on microbial composition in the activated sludge microbiome. Conclusions: Our results demonstrate that NanoPhase enables reference-quality genome reconstruction from complex metagenomes using only Nanopore long reads. Furthermore, besides the 16S rRNA genes and biosynthetic gene clusters, the generated high-accuracy and high-contiguity MAGs improved the host identification of critical mobile genetic elements, e.g., prophages, serving as a genomic blueprint to investigate the microbial potential and ecology in the activated sludge ecosystem.
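The contiguity metric quoted in this abstract (N50) can be illustrated with a short sketch. N50 is the contig length L such that contigs of length L or longer together cover at least half of the assembly. The function name and the toy contig lengths below are illustrative assumptions; they are not taken from NanoPhase or from the study's data.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly given its contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with hypothetical contig lengths (not data from the study).
print(n50([100, 200, 300, 400, 500]))
```

A more fragmented assembly of the same total size yields a smaller N50, which is why the metric is commonly used to compare long-read and short-read MAGs.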
Affiliation(s)
- Lei Liu, Yu Yang, Yu Deng, Tong Zhang
- Environmental Microbiome Engineering and Biotechnology Laboratory, Center for Environmental Engineering Research, Department of Civil Engineering, The University of Hong Kong, Hong Kong SAR, China
57
D’aes J, Fraiture MA, Bogaerts B, De Keersmaecker SCJ, Roosens NHCJ, Vanneste K. Metagenomic Characterization of Multiple Genetically Modified Bacillus Contaminations in Commercial Microbial Fermentation Products. Life (Basel) 2022; 12:1971. [PMID: 36556336 PMCID: PMC9781105 DOI: 10.3390/life12121971] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/28/2022] [Revised: 11/21/2022] [Accepted: 11/23/2022] [Indexed: 11/27/2022]
Abstract
Genetically modified microorganisms (GMM) are frequently employed for manufacturing microbial fermentation products such as food enzymes or vitamins. Although the fermentation product is required to be pure, GMM contaminations have repeatedly been reported in numerous commercial microbial fermentation product types, leading to several rapid alerts at the European level. The aim of this study was to investigate the added value of shotgun metagenomic high-throughput sequencing to confirm and extend the results of classical analysis methods for the genomic characterization of unauthorized GMM. By combining short- and long-read metagenomic sequencing, two transgenic constructs were characterized, with insertions of alpha-amylase genes originating from B. amyloliquefaciens and B. licheniformis, respectively, and a transgenic construct with a protease gene insertion originating from B. velezensis, which were all present in all four investigated samples. Additionally, the samples were contaminated with up to three unculturable Bacillus strains, carrying genetic modifications that may hamper their ability to sporulate. Moreover, several samples contained viable Bacillus strains. Altogether, these contaminations constitute a considerable load of antimicrobial resistance genes that may represent a potential public health risk. In conclusion, our study showcases the added value of metagenomics to investigate the quality and safety of complex commercial microbial fermentation products.
58
Zhang Y, Jiang F, Yang B, Wang S, Wang H, Wang A, Xu D, Fan W. Improved microbial genomes and gene catalog of the chicken gut from metagenomic sequencing of high-fidelity long reads. Gigascience 2022; 11:6833030. [PMID: 36399059 PMCID: PMC9673493 DOI: 10.1093/gigascience/giac116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 09/27/2022] [Accepted: 10/30/2022] [Indexed: 11/19/2022] Open
Abstract
Background: Due to the importance of chicken production and the remarkable influence of the gut microbiota on host health and growth, tens of thousands of metagenome-assembled genomes (MAGs) have been constructed for the chicken gut microbiome. However, due to the limitations of short-read sequencing and assembly technologies, most of these MAGs are far from complete, are of lower quality, and include contaminant reads. Results: We generated 332 Gb of high-fidelity (HiFi) long reads from the 5 chicken intestinal compartments and assembled 461 and 337 microbial genomes, of which 53% and 55% are circular, at the strain and species levels, respectively. For the assembled microbial genomes, approximately 95% were regarded as complete according to the “RNA complete” criteria, which require at least 1 full-length ribosomal RNA (rRNA) operon encoding all 3 types of rRNA (16S, 23S, and 5S) and at least 18 copies of full-length transfer RNA genes. In comparison with the short-read-derived chicken MAGs, 384 (83% of 461) and 89 (26% of 337) strain-level and species-level genomes in this study are novel, with no matches to previously reported sequences. At the gene level, one-third of the 2.5 million genes in the HiFi-derived gene catalog are novel and cannot be matched to the short-read-derived gene catalog. Moreover, the HiFi-derived genomes have much higher continuity and completeness, as well as lower contamination; the HiFi-derived gene catalog has a much higher ratio of complete gene structures. The dominant phylum in our HiFi-assembled genomes was Firmicutes (82.5%), and the foregut was highly enriched in 5 genera: Ligilactobacillus, Limosilactobacillus, Lactobacillus, Weissella, and Enterococcus, all of which belong to the order Lactobacillales. Using GTDB-Tk, all 337 species-level genomes were successfully classified at the order level; however, 2, 35, and 189 genomes could not be classified into any known family, genus, and species, respectively. Among these incompletely classified genomes, 9 and 49 may belong to novel genera and species, respectively, because their 16S rRNA genes have identities lower than 95% and 97% to any known 16S rRNA genes. Conclusions: HiFi sequencing not only produced metagenome assemblies and gene structures with markedly improved quality but also recovered a substantial portion of novel genomes and genes that were missed in previous short-read-based metagenome studies. The novel genomes and species obtained in this study will facilitate gut microbiome and host–microbiota interaction studies, thereby contributing to the sustainable development of poultry resources.
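The two rules stated in this abstract, the "RNA complete" criterion and the 16S rRNA identity cut-offs for novelty, can be sketched as simple checks. This is a minimal illustration, not the authors' pipeline: the class and function names are hypothetical, "18 copies" is read here as a plain count of full-length tRNA genes, and the novelty function collapses the two cut-offs into one decision, whereas the abstract applies them separately to genomes left unclassified at the genus and species ranks.

```python
from dataclasses import dataclass, field

REQUIRED_RRNA_TYPES = frozenset({"16S", "23S", "5S"})
MIN_TRNA_GENES = 18  # threshold quoted in the abstract


@dataclass
class RnaAnnotation:
    """Hypothetical summary of the RNA genes annotated in one genome."""
    # Each operon is the set of full-length rRNA types it encodes.
    rrna_operons: list = field(default_factory=list)
    full_length_trna_count: int = 0


def is_rna_complete(annotation):
    """True if one operon carries 16S, 23S and 5S and tRNA count is >= 18."""
    has_full_operon = any(
        REQUIRED_RRNA_TYPES <= set(operon) for operon in annotation.rrna_operons
    )
    return has_full_operon and annotation.full_length_trna_count >= MIN_TRNA_GENES


def novelty_from_16s(best_identity_percent):
    """Simplified call from the best 16S identity to any known 16S gene."""
    if best_identity_percent < 95.0:
        return "candidate novel genus"
    if best_identity_percent < 97.0:
        return "candidate novel species"
    return None  # no novelty suggested by 16S identity alone


genome = RnaAnnotation(rrna_operons=[{"16S", "23S", "5S"}], full_length_trna_count=20)
print(is_rna_complete(genome), novelty_from_16s(96.2))
```

Requiring all three rRNA types within a single operon, rather than anywhere in the genome, follows the abstract's wording ("1 full-length ribosomal RNA operon encoding all 3 types").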
Affiliation(s)
- Yan Zhang, Fan Jiang, Boyuan Yang, Sen Wang, Hengchao Wang, Anqi Wang, Dong Xu, Wei Fan
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
59
Kato S, Masuda S, Shibata A, Shirasu K, Ohkuma M. Insights into ecological roles of uncultivated bacteria in Katase hot spring sediment from long-read metagenomics. Front Microbiol 2022; 13:1045931. [PMID: 36406403 PMCID: PMC9671151 DOI: 10.3389/fmicb.2022.1045931] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/16/2022] [Accepted: 10/11/2022] [Indexed: 08/11/2023] Open
Abstract
Diverse yet-uncultivated bacteria and archaea, i.e., microbial dark matter, are present in terrestrial hot spring environments. Numerous metagenome-assembled genomes (MAGs) of these uncultivated prokaryotes have been obtained by short-read metagenomics and reported so far, suggesting their metabolic potential. However, more reliable MAGs, i.e., circularized complete MAGs (cMAGs), have rarely been reported from hot spring environments. Here, we report 61 high-quality (HQ) MAGs, including 14 cMAGs, of diverse uncultivated bacteria and archaea retrieved from hot spring sediment (52°C, pH 7.2) by highly accurate long-read sequencing using PacBio Sequel II. The HQ MAGs were affiliated with one archaeal and 13 bacterial phyla. Notably, nine of the 14 cMAGs were the first reported cMAGs for the family- to class-level clades to which they belonged. The genome information suggests that the bacteria represented by MAGs play a significant role in the biogeochemical cycling of carbon, nitrogen, iron, and sulfur at this site. In particular, the genome analysis of six HQ MAGs, including two cMAGs, of Armatimonadota, whose members are frequently abundant in hot spring environments, predicts that they are aerobic, moderately thermophilic chemoorganoheterotrophs that potentially oxidize and/or reduce iron. This prediction is consistent with the environmental conditions where they were detected. Our results expand the knowledge regarding the ecological potential of uncultivated bacteria in moderately-high-temperature environments.
Affiliation(s)
- Shingo Kato
- Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
- Sachiko Masuda
- Plant Immunity Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Arisa Shibata
- Plant Immunity Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Ken Shirasu
- Plant Immunity Research Group, RIKEN Center for Sustainable Resource Science, Yokohama, Japan
- Moriya Ohkuma
- Japan Collection of Microorganisms, RIKEN BioResource Research Center, Tsukuba, Japan
60
Kim CY, Ma J, Lee I. HiFi metagenomic sequencing enables assembly of accurate and complete genomes from human gut microbiota. Nat Commun 2022; 13:6367. [PMID: 36289209 PMCID: PMC9606305 DOI: 10.1038/s41467-022-34149-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 06/21/2022] [Accepted: 10/12/2022] [Indexed: 12/25/2022] Open
Abstract
Advances in metagenomic assembly have led to the discovery of genomes belonging to uncultured microorganisms. Metagenome-assembled genomes (MAGs) often suffer from fragmentation and chimerism. Recently, 20 complete MAGs (cMAGs) have been assembled from Oxford Nanopore long-read sequencing of 13 human fecal samples, but with low nucleotide accuracy. Here, we report 102 cMAGs obtained by Pacific Biosciences (PacBio) high-accuracy long-read (HiFi) metagenomic sequencing of five human fecal samples, whose initial circular contigs were selected for complete prokaryotic genomes using our bioinformatics workflow. Nucleotide accuracy of the final cMAGs was as high as that of Illumina sequencing. The cMAGs could exceed 6 Mbp and included complete genomes of diverse taxa, including entirely uncultured RF39 and TANB77 orders. Moreover, cMAGs revealed that regions hard to assemble by short-read sequencing comprised mostly genomic islands and rRNAs. HiFi metagenomic sequencing will facilitate cataloging accurate and complete genomes from complex microbial communities, including uncultured species.
Collapse
Affiliation(s)
- Chan Yeong Kim
- grid.15444.300000 0004 0470 5454Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea ,grid.4709.a0000 0004 0495 846XPresent Address: Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Junyeong Ma
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea (grid.15444.30, 0000 0004 0470 5454)
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea (grid.15444.30, 0000 0004 0470 5454); POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673 Republic of Korea (grid.49100.3c, 0000 0001 0742 4007)
| |
|
61
|
Srinivas M, O’Sullivan O, Cotter PD, van Sinderen D, Kenny JG. The Application of Metagenomics to Study Microbial Communities and Develop Desirable Traits in Fermented Foods. Foods 2022; 11:3297. [PMCID: PMC9601669 DOI: 10.3390/foods11203297] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022] Open
Abstract
The microbial communities present within fermented foods are diverse and dynamic, producing a variety of metabolites responsible for the fermentation processes, imparting characteristic organoleptic qualities and health-promoting traits, and maintaining microbiological safety of fermented foods. In this context, it is crucial to study these microbial communities to characterise fermented foods and the production processes involved. High Throughput Sequencing (HTS)-based methods such as metagenomics enable microbial community studies through amplicon and shotgun sequencing approaches. As the field constantly develops, sequencing technologies are becoming more accessible, affordable and accurate with a further shift from short read to long read sequencing being observed. Metagenomics is enjoying wide-spread application in fermented food studies and in recent years is also being employed in concert with synthetic biology techniques to help tackle problems with the large amounts of waste generated in the food sector. This review presents an introduction to current sequencing technologies and the benefits of their application in fermented foods.
Affiliation(s)
- Meghana Srinivas
- Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
- APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
- School of Microbiology, University College Cork, T12 CY82 Cork, Ireland
| | - Orla O’Sullivan
- Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
- APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
- VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
| | - Paul D. Cotter
- Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
- APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
- VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
| | - Douwe van Sinderen
- APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
- School of Microbiology, University College Cork, T12 CY82 Cork, Ireland
| | - John G. Kenny
- Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
- APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
- VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
| |
|
62
|
Genome-centric analysis of short and long read metagenomes reveals uncharacterized microbiome diversity in Southeast Asians. Nat Commun 2022; 13:6044. [PMID: 36229545 PMCID: PMC9561172 DOI: 10.1038/s41467-022-33782-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 04/27/2022] [Accepted: 09/27/2022] [Indexed: 12/24/2022] Open
Abstract
Despite extensive efforts to address it, the vastness of uncharacterized 'dark matter' microbial genetic diversity can impact short-read sequencing based metagenomic studies. Population-specific biases in genomic reference databases can further compound this problem. Leveraging advances in hybrid assembly (using short and long reads) and Hi-C technologies in a cross-sectional survey, we deeply characterized 109 gut microbiomes from three ethnicities in Singapore to comprehensively reconstruct 4497 medium and high-quality metagenome assembled genomes, 1708 of which were missing in short-read only analysis and with >28× N50 improvement. Species-level clustering identified 70 (>10% of total) novel gut species out of 685, improved reference genomes for 363 species (53% of total), and discovered 3413 strains unique to these populations. Among the top 10 most abundant gut bacteria in our study, one of the species and >80% of strains were unrepresented in existing databases. Annotation of biosynthetic gene clusters (BGCs) uncovered more than 27,000 BGCs with a large fraction (36-88%) unrepresented in current databases, and with several unique clusters predicted to produce bacteriocins that could significantly alter microbiome community structure. These results reveal significant uncharacterized gut microbial diversity in Southeast Asian populations and highlight the utility of hybrid metagenomic references for bioprospecting and disease-focused studies.
|
63
|
Neri U, Wolf YI, Roux S, Camargo AP, Lee B, Kazlauskas D, Chen IM, Ivanova N, Zeigler Allen L, Paez-Espino D, Bryant DA, Bhaya D, Krupovic M, Dolja VV, Kyrpides NC, Koonin EV, Gophna U. Expansion of the global RNA virome reveals diverse clades of bacteriophages. Cell 2022; 185:4023-4037.e18. [PMID: 36174579 DOI: 10.1016/j.cell.2022.08.023] [Citation(s) in RCA: 73] [Impact Index Per Article: 36.5] [Received: 02/15/2022] [Revised: 05/16/2022] [Accepted: 08/24/2022] [Indexed: 01/26/2023]
Abstract
High-throughput RNA sequencing offers broad opportunities to explore the Earth RNA virome. Mining 5,150 diverse metatranscriptomes uncovered >2.5 million RNA virus contigs. Analysis of >330,000 RNA-dependent RNA polymerases (RdRPs) shows that this expansion corresponds to a 5-fold increase of the known RNA virus diversity. Gene content analysis revealed multiple protein domains previously not found in RNA viruses and implicated in virus-host interactions. Extended RdRP phylogeny supports the monophyly of the five established phyla and reveals two putative additional bacteriophage phyla and numerous putative additional classes and orders. The dramatically expanded phylum Lenarviricota, consisting of bacterial and related eukaryotic viruses, now accounts for a third of the RNA virome. Identification of CRISPR spacer matches and bacteriolytic proteins suggests that subsets of picobirnaviruses and partitiviruses, previously associated with eukaryotes, infect prokaryotic hosts.
Affiliation(s)
- Uri Neri
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv 6997801, Israel.
| | - Yuri I Wolf
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
| | - Simon Roux
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Antonio Pedro Camargo
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Benjamin Lee
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
| | - Darius Kazlauskas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Saulėtekio av. 7, Vilnius 10257, Lithuania
| | - I Min Chen
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Natalia Ivanova
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Lisa Zeigler Allen
- Microbial and Environmental Genomics, J. Craig Venter Institute, La Jolla, CA, USA; Marine Biology Research Division, Scripps Institution of Oceanography, La Jolla, CA, USA
| | - David Paez-Espino
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Donald A Bryant
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA 16802, USA
| | - Devaki Bhaya
- Department of Plant Biology, Carnegie Institution for Science, Stanford, CA 94305, USA
| | - Mart Krupovic
- Institut Pasteur, Université Paris Cité, CNRS UMR 6047, Archaeal Virology Unit, 75015 Paris, France
| | - Valerian V Dolja
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA; Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA.
| | - Nikos C Kyrpides
- Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.
| | - Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
| | - Uri Gophna
- The Shmunis School of Biomedicine and Cancer Research, Tel Aviv University, Tel Aviv 6997801, Israel.
| |
|
64
|
Pu L, Shamir R. 3CAC: improving the classification of phages and plasmids in metagenomic assemblies using assembly graphs. Bioinformatics 2022; 38:ii56-ii61. [PMID: 36124804 DOI: 10.1093/bioinformatics/btac468] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/25/2022] Open
Abstract
MOTIVATION: Bacteriophages and plasmids usually coexist with their host bacteria in microbial communities and play important roles in microbial evolution. Accurately identifying sequence contigs as phages, plasmids and bacterial chromosomes in mixed metagenomic assemblies is critical for further unraveling their functions. Many classification tools have been developed for identifying either phages or plasmids in metagenomic assemblies. However, only two classifiers, PPR-Meta and viralVerify, were proposed to simultaneously identify phages and plasmids in mixed metagenomic assemblies. Due to the very high fraction of chromosome contigs in the assemblies, both tools achieve high precision in the classification of chromosomes but perform poorly in classifying phages and plasmids. Short contigs in these assemblies are often wrongly classified or classified as uncertain. RESULTS: Here we present 3CAC, a new three-class classifier that improves the precision of phage and plasmid classification. 3CAC starts with an initial three-class classification generated by existing classifiers and improves the classification of short contigs and contigs with low confidence classification by using proximity in the assembly graph. Evaluation on simulated metagenomes and on real human gut microbiome samples showed that 3CAC outperformed PPR-Meta and viralVerify in both precision and recall, and increased F1-score by 10-60 percentage points. AVAILABILITY AND IMPLEMENTATION: The 3CAC software is available on https://github.com/Shamir-Lab/3CAC. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Lianrong Pu
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| | - Ron Shamir
- The Blavatnik School of Computer Science, Tel Aviv University, Tel Aviv, 69978, Israel
| |
|
65
|
Long-Read-Resolved, Ecosystem-Wide Exploration of Nucleotide and Structural Microdiversity of Lake Bacterioplankton Genomes. mSystems 2022; 7:e0043322. [PMID: 35938717 PMCID: PMC9426551 DOI: 10.1128/msystems.00433-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2022] Open
Abstract
Reconstruction of metagenome-assembled genomes (MAGs) has become a fundamental approach in microbial ecology. However, a MAG is hardly complete and overlooks genomic microdiversity because metagenomic assembly fails to resolve microvariants among closely related genotypes. Aiming at understanding the universal factors that drive or constrain prokaryotic genome diversification, we performed an ecosystem-wide high-resolution metagenomic exploration of microdiversity by combining spatiotemporal (2 depths × 12 months) sampling from a pelagic freshwater system, high-quality MAG reconstruction using long- and short-read metagenomic sequences, and profiling of single nucleotide variants (SNVs) and structural variants (SVs) through mapping of short and long reads to the MAGs, respectively. We reconstructed 575 MAGs, including 29 circular assemblies, providing high-quality reference genomes of freshwater bacterioplankton. Read mapping against these MAGs identified 100 to 101,781 SNVs/Mb and 0 to 305 insertions, 0 to 467 deletions, 0 to 41 duplications, and 0 to 6 inversions for each MAG. Nonsynonymous SNVs were accumulated in genes potentially involved in cell surface structural modification to evade phage recognition. Most (80.2%) deletions overlapped with a gene coding region, and genes of prokaryotic defense systems were most frequently (>8% of the genes) overlapped with a deletion. Some such deletions exhibited a monthly shift in their allele frequency, suggesting a rapid turnover of genotypes in response to phage predation. MAGs with extremely low microdiversity were either rare or opportunistic bloomers, suggesting that population persistency is key to their genomic diversification. The results concluded that prokaryotic genomic diversification is driven primarily by viral load and constrained by a population bottleneck.
IMPORTANCE Identifying intraspecies genomic diversity (microdiversity) is crucial to understanding microbial ecology and evolution. However, microdiversity among environmental assemblages is not well investigated, because most microbes are difficult to culture. In this study, we performed cultivation-independent exploration of bacterial genomic microdiversity in a lake ecosystem using a combination of short- and long-read metagenomic analyses. The results revealed the broad spectrum of genomic microdiversity among the diverse bacterial species in the ecosystem, which has been overlooked by conventional approaches. Our ecosystem-wide exploration further allowed comparative analysis among the genomes and genes and revealed factors behind microbial genomic diversification, namely, that diversification is driven primarily by resistance against viral infection and constrained by the population size.
|
66
|
Sereika M, Kirkegaard RH, Karst SM, Michaelsen TY, Sørensen EA, Wollenberg RD, Albertsen M. Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. Nat Methods 2022; 19:823-826. [PMID: 35789207 PMCID: PMC9262707 DOI: 10.1038/s41592-022-01539-7] [Citation(s) in RCA: 127] [Impact Index Per Article: 63.5] [Received: 11/10/2021] [Accepted: 05/24/2022] [Indexed: 12/26/2022]
Abstract
Long-read Oxford Nanopore sequencing has democratized microbial genome sequencing and enables the recovery of highly contiguous microbial genomes from isolates or metagenomes. However, to obtain near-finished genomes it has been necessary to include short-read polishing to correct insertions and deletions derived from homopolymer regions. Here, we show that Oxford Nanopore R10.4 can be used to generate near-finished microbial genomes from isolates or metagenomes without short-read or reference polishing.
Affiliation(s)
- Mantas Sereika
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark
| | - Rasmus Hansen Kirkegaard
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark.,Joint Microbiome Facility, University of Vienna, Vienna, Austria
| | - Mads Albertsen
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark.
| |
|
67
|
Haryono MAS, Law YY, Arumugam K, Liew LCW, Nguyen TQN, Drautz-Moses DI, Schuster SC, Wuertz S, Williams RBH. Recovery of High Quality Metagenome-Assembled Genomes From Full-Scale Activated Sludge Microbial Communities in a Tropical Climate Using Longitudinal Metagenome Sampling. Front Microbiol 2022; 13:869135. [PMID: 35756038 PMCID: PMC9230771 DOI: 10.3389/fmicb.2022.869135] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/03/2022] [Accepted: 05/05/2022] [Indexed: 01/23/2023] Open
Abstract
The analysis of metagenome data based on the recovery of draft genomes (so called metagenome-assembled genomes, or MAG) has assumed an increasingly central role in microbiome research in recent years. Microbial communities underpinning the operation of wastewater treatment plants are particularly challenging targets for MAG analysis due to their high ecological complexity, and remain important, albeit understudied, microbial communities that play a key role in mediating interactions between human and natural ecosystems. Here we consider strategies for recovery of MAG sequence from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multi-sample co-assembly and a recently published MAG recovery workflow using deep learning. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, showing that co-assembly may offer advantages over single-sample assembly in the case of metagenome data obtained from closely sampled longitudinal study designs. Around 1,000 MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics, however only 58 MAG formally meet the MIMAG criteria for being high quality draft genomes. These findings carry broader implications for performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.
Affiliation(s)
- Mindia A S Haryono
- Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
| | - Ying Yu Law
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Krithika Arumugam
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Larry C-W Liew
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Thi Quynh Ngoc Nguyen
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Daniela I Drautz-Moses
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
| | - Stephan C Schuster
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore.,School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Stefan Wuertz
- Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore.,School of Civil and Environmental Engineering, Nanyang Technological University, Singapore, Singapore
| | - Rohan B H Williams
- Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
| |
|
68
|
Arabinoxylan and Pectin Metabolism in Crohn’s Disease Microbiota: An In Silico Study. Int J Mol Sci 2022; 23:7093. [PMID: 35806099 PMCID: PMC9266297 DOI: 10.3390/ijms23137093] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/23/2022] [Revised: 06/20/2022] [Accepted: 06/22/2022] [Indexed: 12/03/2022] Open
Abstract
Inflammatory bowel disease is a chronic disorder including ulcerative colitis and Crohn’s disease (CD). Gut dysbiosis is often associated with CD, and metagenomics allows a better understanding of the microbial communities involved. The objective of this study was to reconstruct in silico carbohydrate metabolic capabilities from metagenome-assembled genomes (MAGs) obtained from healthy and CD individuals. This computational method was developed as a means to aid rationally designed prebiotic interventions to rebalance CD dysbiosis, with a focus on metabolism of emergent prebiotics derived from arabinoxylan and pectin. Up to 1196 and 1577 MAGs were recovered from CD and healthy people, respectively. MAGs of Akkermansia muciniphila, Barnesiella viscericola DSM 18177 and Paraprevotella xylaniphila YIT 11841 showed a wide range of unique and specific enzymes acting on arabinoxylan and pectin. These glycosidases were also found in MAGs recovered from CD patients. Interestingly, these arabinoxylan and pectin degraders are predicted to exhibit metabolic interactions with other gut microbes reduced in CD. Thus, administration of arabinoxylan and pectin may ameliorate dysbiosis in CD by promoting species with key metabolic functions, capable of cross-feeding other beneficial species. These computational methods may be of special interest for the rational design of prebiotic ingredients targeting CD.
|
69
|
Smith SE, Huang W, Tiamani K, Unterer M, Khan Mirzaei M, Deng L. Emerging technologies in the study of the virome. Curr Opin Virol 2022; 54:101231. [DOI: 10.1016/j.coviro.2022.101231] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/18/2022] [Revised: 04/16/2022] [Accepted: 04/19/2022] [Indexed: 11/03/2022]
|
70
|
Goussarov G, Mysara M, Vandamme P, Van Houdt R. Introduction to the principles and methods underlying the recovery of metagenome-assembled genomes from metagenomic data. Microbiologyopen 2022; 11:e1298. [PMID: 35765182 PMCID: PMC9179125 DOI: 10.1002/mbo3.1298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/18/2022] Open
Abstract
The rise of metagenomics offers a leap forward for understanding the genetic diversity of microorganisms in many different complex environments by providing a platform that can identify potentially unlimited numbers of known and novel microorganisms. As such, it is impossible to imagine new major initiatives without metagenomics. Nevertheless, it represents a relatively new discipline with various levels of complexity and demands on bioinformatics. The underlying principles and methods used in metagenomics are often seen as common knowledge and often not detailed or fragmented. Therefore, we reviewed these to guide microbiologists in taking the first steps into metagenomics. We specifically focus on a workflow aimed at reconstructing individual genomes, that is, metagenome‐assembled genomes, integrating DNA sequencing, assembly, binning, identification and annotation.
Affiliation(s)
- Gleb Goussarov
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium.,Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
| | - Mohamed Mysara
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
| | - Peter Vandamme
- Laboratory of Microbiology and BCCM/LMG Bacteria Collection, Faculty of Sciences, Ghent University, Ghent, Belgium
| | - Rob Van Houdt
- Microbiology Unit, Belgian Nuclear Research Centre (SCK CEN), Mol, Belgium
| |
|
71
|
Feng X, Cheng H, Portik D, Li H. Metagenome assembly of high-fidelity long reads with hifiasm-meta. Nat Methods 2022; 19:671-674. [PMID: 35534630 DOI: 10.1038/s41592-022-01478-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 21.0] [Received: 10/15/2021] [Accepted: 03/28/2022] [Indexed: 12/26/2022]
Abstract
De novo assembly of metagenome samples is a common approach to the study of microbial communities. Current metagenome assemblers developed for short sequence reads or noisy long reads were not optimized for accurate long reads. We thus developed hifiasm-meta, a metagenome assembler that exploits the high accuracy of recent data. Evaluated on seven empirical datasets, hifiasm-meta reconstructed tens to hundreds of complete circular bacterial genomes per dataset, consistently outperforming other metagenome assemblers.
Affiliation(s)
- Xiaowen Feng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.,Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| |
|
72
|
Recovery of Metagenome-Assembled Genomes from a Human Fecal Sample with Pacific Biosciences High-Fidelity Sequencing. Microbiol Resour Announc 2022; 11:e0025022. [PMID: 35532226 PMCID: PMC9202402 DOI: 10.1128/mra.00250-22] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022] Open
Abstract
Here, we report the recovery of 89 metagenome-assembled genomes (MAGs) derived from a human fecal sample subjected to Pacific Biosciences (PacBio) high-fidelity (HiFi) sequencing. A total of 9 MAGs consisted of complete circular contigs, and 45 MAGs were high-quality draft genomes according to the minimum information about a metagenome-assembled genome (MIMAG) standards.
|
73
|
Ko KKK, Chng KR, Nagarajan N. Metagenomics-enabled microbial surveillance. Nat Microbiol 2022; 7:486-496. [PMID: 35365786 DOI: 10.1038/s41564-022-01089-w] [Citation(s) in RCA: 54] [Impact Index Per Article: 27.0] [Received: 08/29/2021] [Accepted: 02/22/2022] [Indexed: 12/13/2022]
Abstract
Lessons learnt from the COVID-19 pandemic include increased awareness of the potential for zoonoses and emerging infectious diseases that can adversely affect human health. Although emergent viruses are currently in the spotlight, we must not forget the ongoing toll of morbidity and mortality owing to antimicrobial resistance in bacterial pathogens and to vector-borne, foodborne and waterborne diseases. Population growth, planetary change, international travel and medical tourism all contribute to the increasing frequency of infectious disease outbreaks. Surveillance is therefore of crucial importance, but the diversity of microbial pathogens, coupled with resource-intensive methods, compromises our ability to scale-up such efforts. Innovative technologies that are both easy to use and able to simultaneously identify diverse microorganisms (viral, bacterial or fungal) with precision are necessary to enable informed public health decisions. Metagenomics-enabled surveillance methods offer the opportunity to improve detection of both known and yet-to-emerge pathogens.
Affiliation(s)
- Karrie K K Ko
- Laboratory of Metagenomic Technologies and Microbial Systems, Genome Institute of Singapore, Singapore, Singapore.,Department of Microbiology, Singapore General Hospital, Singapore, Singapore.,Department of Molecular Pathology, Singapore General Hospital, Singapore, Singapore.,Duke-NUS Medical School, Singapore, Singapore.,Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Kern Rei Chng
- Laboratory of Metagenomic Technologies and Microbial Systems, Genome Institute of Singapore, Singapore, Singapore.,National Centre for Food Science, Singapore Food Agency, Singapore, Singapore
| | - Niranjan Nagarajan
- Laboratory of Metagenomic Technologies and Microbial Systems, Genome Institute of Singapore, Singapore, Singapore.,Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore.
| |
|
74
|
Cuscó A, Pérez D, Viñes J, Fàbregas N, Francino O. Novel canine high-quality metagenome-assembled genomes, prophages and host-associated plasmids provided by long-read metagenomics together with Hi-C proximity ligation. Microb Genom 2022; 8. [PMID: 35298370 PMCID: PMC9176287 DOI: 10.1099/mgen.0.000802] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/18/2022] Open
Abstract
The human gut microbiome has been extensively studied, yet the canine gut microbiome is still largely unknown. The availability of high-quality genomes is essential in the fields of veterinary medicine and nutrition to unravel the biological role of key microbial members in the canine gut environment. Our aim was to evaluate nanopore long-read metagenomics and Hi-C (high-throughput chromosome conformation capture) proximity ligation to provide high-quality metagenome-assembled genomes (HQ MAGs) of the canine gut environment. By combining nanopore long-read metagenomics and Hi-C proximity ligation, we retrieved 27 HQ MAGs and 7 medium-quality MAGs of a faecal sample of a healthy dog. Canine MAGs (CanMAGs) improved genome contiguity of representatives from the animal and human MAG catalogues – short-read MAGs from public datasets – for the species they represented: they were more contiguous with complete ribosomal operons and at least 18 canonical tRNAs. Both canine-specific bacterial species and gut generalists inhabit the dog’s gastrointestinal environment. Most of them belonged to Firmicutes, followed by Bacteroidota and Proteobacteria. We also assembled one Actinobacteriota and one Fusobacteriota MAG. CanMAGs harboured antimicrobial-resistance genes (ARGs) and prophages and were linked to plasmids. ARGs conferring resistance to tetracycline were most predominant within CanMAGs, followed by lincosamide and macrolide ones. At the functional level, carbohydrate transport and metabolism was the most variable within the CanMAGs, and mobilome function was abundant in some MAGs. Specifically, we assigned the mobilome functions and the associated mobile genetic elements to the bacterial host. The CanMAGs harboured 50 bacteriophages, providing novel bacterial-host information for eight viral clusters, and Hi-C proximity ligation data linked the six potential plasmids to their bacterial host. Long-read metagenomics and Hi-C proximity ligation are likely to become a comprehensive approach to HQ MAG discovery and assignment of extra-chromosomal elements to their bacterial host. This will provide essential information for studying the canine gut microbiome in veterinary medicine and animal nutrition.
Affiliation(s)
- Anna Cuscó
- Vetgenomics, Edificio Eureka, Parc de Recerca UAB, Barcelona, Spain; Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, PR China
| | - Daniel Pérez
- Molecular Genetics Veterinary Service (SVGM), Veterinary School, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Joaquim Viñes
- Vetgenomics, Edificio Eureka, Parc de Recerca UAB, Barcelona, Spain; Molecular Genetics Veterinary Service (SVGM), Veterinary School, Universitat Autònoma de Barcelona, Barcelona, Spain
| | - Norma Fàbregas
- Vetgenomics, Edificio Eureka, Parc de Recerca UAB, Barcelona, Spain
| | - Olga Francino
- Molecular Genetics Veterinary Service (SVGM), Veterinary School, Universitat Autònoma de Barcelona, Barcelona, Spain
| |
|
75
|
viralFlye: assembling viruses and identifying their hosts from long-read metagenomics data. Genome Biol 2022; 23:57. [PMID: 35189932 PMCID: PMC8862349 DOI: 10.1186/s13059-021-02566-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/29/2021] [Accepted: 12/03/2021] [Indexed: 11/10/2022] Open
Abstract
Although the use of long-read sequencing improves the contiguity of assembled viral genomes compared to short-read methods, assembling complex viral communities remains an open problem. We describe the viralFlye tool for identification and analysis of metagenome-assembled viruses in long-read assemblies. We show it significantly improves viral assemblies and demonstrate that long-reads result in a much larger array of predicted virus-host associations as compared to short-read assemblies. We demonstrate that the identification of novel CRISPR arrays in bacterial genomes from a newly assembled metagenomic sample provides information for predicting novel hosts for novel viruses.
|
76
|
Ivanova V, Chernevskaya E, Vasiluev P, Ivanov A, Tolstoganov I, Shafranskaya D, Ulyantsev V, Korobeynikov A, Razin SV, Beloborodova N, Ulianov SV, Tyakht A. Hi-C Metagenomics in the ICU: Exploring Clinically Relevant Features of Gut Microbiome in Chronically Critically Ill Patients. Front Microbiol 2022; 12:770323. [PMID: 35185811 PMCID: PMC8851603 DOI: 10.3389/fmicb.2021.770323] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/03/2021] [Accepted: 11/25/2021] [Indexed: 01/02/2023] Open
Abstract
Gut microbiome in critically ill patients shows profound dysbiosis. The most vulnerable is the subgroup of chronically critically ill (CCI) patients – those suffering from long-term dependence on support systems in intensive care units. It is important to investigate their microbiome as a potential reservoir of opportunistic taxa causing co-infections and a morbidity factor. We explored dynamics of microbiome composition in the CCI patients by combining “shotgun” metagenomics with chromosome conformation capture (Hi-C). Stool samples were collected at 2 time points from 2 patients with severe brain injury with different outcomes within a 1–2-week interval. The metagenome-assembled genomes (MAGs) were reconstructed based on the Hi-C data using a novel hicSPAdes method (along with the bin3c method for comparison), as well as independently of the Hi-C using MetaBAT2. The resistomes of the samples were derived using a novel assembly graph-based approach. Links of bacteria to antibiotic resistance genes, plasmids and viruses were analyzed using Hi-C-based networks. The gut community structure was enriched in opportunistic microorganisms. The binning using hicSPAdes was superior to the conventional WGS-based binning as well as to the bin3c in terms of the number, completeness and contamination of the reconstructed MAGs. Using Klebsiella pneumoniae as an example, we showed how chromosome conformation capture can aid comparative genomic analysis of clinically important pathogens. Diverse associations of resistome with antimicrobial therapy from the level of assembly graphs to gene content were discovered. Analysis of Hi-C networks suggested multiple “host-plasmid” and “host-phage” links. Hi-C metagenomics is a promising technique for investigating clinical microbiome samples. It provides a community composition profile with increased details on bacterial gene content and mobile genetic elements compared to conventional metagenomics. 
The ability of Hi-C binning to encompass the MAG’s plasmid content facilitates metagenomic evaluation of virulence and drug resistance dynamics in clinically relevant opportunistic pathogens. These findings will help to identify the targets for developing cost-effective and rapid tests for assessing microbiome-related health risks.
Affiliation(s)
- Valeriia Ivanova
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
| | - Ekaterina Chernevskaya
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow, Russia
| | - Petr Vasiluev
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- Research Centre for Medical Genetics, Moscow, Russia
| | - Artem Ivanov
- Computer Technologies Laboratory, ITMO University, Saint Petersburg, Russia
| | - Ivan Tolstoganov
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Saint Petersburg, Russia
| | - Daria Shafranskaya
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Saint Petersburg, Russia
| | - Vladimir Ulyantsev
- Computer Technologies Laboratory, ITMO University, Saint Petersburg, Russia
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Saint Petersburg, Russia
| | - Sergey V. Razin
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Natalia Beloborodova
- Federal Research and Clinical Center of Intensive Care Medicine and Rehabilitology, Moscow, Russia
| | - Sergey V. Ulianov
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- Faculty of Biology, Lomonosov Moscow State University, Moscow, Russia
| | - Alexander Tyakht
- Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology Russian Academy of Sciences, Moscow, Russia
- *Correspondence: Alexander Tyakht
| |
|
77
|
|
78
|
Fournier P, Pellan L, Barroso-Bergadà D, Bohan DA, Candresse T, Delmotte F, Dufour MC, Lauvergeat V, Le Marrec C, Marais A, Martins G, Masneuf-Pomarède I, Rey P, Sherman D, This P, Frioux C, Labarthe S, Vacher C. The functional microbiome of grapevine throughout plant evolutionary history and lifetime. ADV ECOL RES 2022. [DOI: 10.1016/bs.aecr.2022.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
|
79
|
Fedarko MW, Kolmogorov M, Pevzner PA. Analyzing rare mutations in metagenomes assembled using long and accurate reads. Genome Res 2022; 32:2119-2133. [PMID: 36418060 PMCID: PMC9808630 DOI: 10.1101/gr.276917.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 11/16/2022] [Indexed: 11/25/2022]
Abstract
The advent of long and accurate "HiFi" reads has greatly improved our ability to generate complete metagenome-assembled genomes (MAGs), enabling "complete metagenomics" studies that were nearly impossible to conduct with short reads. In particular, HiFi reads simplify the identification and phasing of mutations in MAGs: It is increasingly feasible to distinguish between positions that are prone to mutations and positions that rarely ever mutate, and to identify co-occurring groups of mutations. However, the problems of identifying rare mutations in MAGs, estimating the false-discovery rate (FDR) of these identifications, and phasing identified mutations remain open in the context of HiFi data. We present strainFlye, a pipeline for the FDR-controlled identification and analysis of rare mutations in MAGs assembled using HiFi reads. We show that deep HiFi sequencing has the potential to reveal and phase tens of thousands of rare mutations in a single MAG, identify hotspots and coldspots of these mutations, and detail MAGs' growth dynamics.
Affiliation(s)
- Marcus W. Fedarko
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
| | - Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA; UC Santa Cruz Genomics Institute, Santa Cruz, California 95064, USA
| | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
| |
|
80
|
Bzikadze AV, Mikheenko A, Pevzner PA. Fast and accurate mapping of long reads to complete genome assemblies with VerityMap. Genome Res 2022; 32:2107-2118. [PMID: 36379716 PMCID: PMC9808623 DOI: 10.1101/gr.276871.122] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/26/2022] [Accepted: 11/09/2022] [Indexed: 11/16/2022]
Abstract
Recent advancements in long-read sequencing have enabled the telomere-to-telomere (complete) assembly of a human genome and are now contributing to the haplotype-resolved complete assemblies of multiple human genomes. Because the accuracy of read mapping tools deteriorates in highly repetitive regions, there is a need to develop accurate, error-exposing (detecting potential assembly errors), and diploid-aware (distinguishing different haplotypes) tools for read mapping in complete assemblies. We describe the first accurate, error-exposing, and partially diploid-aware VerityMap tool for long-read mapping to complete assemblies.
Affiliation(s)
- Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego, California 92093, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Saint Petersburg State University, Saint Petersburg, 199034, Russia
| | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California, San Diego, California 92093, USA
| |
|
81
|
Blakeley-Ruiz JA, Kleiner M. Considerations for Constructing a Protein Sequence Database for Metaproteomics. Comput Struct Biotechnol J 2022; 20:937-952. [PMID: 35242286 PMCID: PMC8861567 DOI: 10.1016/j.csbj.2022.01.018] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 08/26/2021] [Revised: 01/14/2022] [Accepted: 01/18/2022] [Indexed: 12/14/2022] Open
Abstract
Mass spectrometry-based metaproteomics has emerged as a prominent technique for interrogating the functions of specific organisms in microbial communities, in addition to total community function. Identifying proteins by mass spectrometry requires matching mass spectra of fragmented peptide ions to a database of protein sequences corresponding to the proteins in the sample. This sequence database determines which protein sequences can be identified from the measurement, and as such the taxonomic and functional information that can be inferred from a metaproteomics measurement. Thus, the construction of the protein sequence database directly impacts the outcome of any metaproteomics study. Several factors, such as source of sequence information and database curation, need to be considered during database construction to maximize accurate protein identifications traceable to the species of origin. In this review, we provide an overview of existing strategies for database construction and the relevant studies that have sought to test and validate these strategies. Based on this review of the literature and our experience we provide a decision tree and best practices for choosing and implementing database construction strategies.
Affiliation(s)
- J. Alfredo Blakeley-Ruiz
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Center for Gastrointestinal Biology and Disease, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
| | - Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA
- Corresponding authors at: Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC, USA.
| |
|