1
Valentin-Alvarado LE, Appler KE, De Anda V, Schoelmerich MC, West-Roberts J, Kivenson V, Crits-Christoph A, Ly L, Sachdeva R, Greening C, Savage DF, Baker BJ, Banfield JF. Asgard archaea modulate potential methanogenesis substrates in wetland soil. Nat Commun 2024; 15:6384. [PMID: 39085194 DOI: 10.1038/s41467-024-49872-z] [Received: 05/02/2024] [Accepted: 06/20/2024] [Indexed: 08/02/2024] Open
Abstract
The roles of Asgard archaea in eukaryogenesis and marine biogeochemical cycles are well studied, yet their contributions in soil ecosystems remain unknown. Of particular interest are Asgard archaeal contributions to methane cycling in wetland soils. To investigate this, we reconstructed two complete genomes for soil-associated Atabeyarchaeia, a new Asgard lineage, and a complete genome of Freyarchaeia, and predicted their metabolism in situ. Metatranscriptomics reveals expression of genes for [NiFe]-hydrogenases, pyruvate oxidation and carbon fixation via the Wood-Ljungdahl pathway. Also expressed are genes encoding enzymes for amino acid metabolism, anaerobic aldehyde oxidation, hydrogen peroxide detoxification and carbohydrate breakdown to acetate and formate. Overall, soil-associated Asgard archaea are predicted to include non-methanogenic acetogens, highlighting their potential role in carbon cycling in terrestrial environments.
Affiliation(s)
- Luis E Valentin-Alvarado
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Kathryn E Appler
  - Department of Marine Science, University of Texas at Austin; Marine Science Institute, Port Aransas, TX, USA
- Valerie De Anda
  - Department of Marine Science, University of Texas at Austin; Marine Science Institute, Port Aransas, TX, USA
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX, USA
- Marie C Schoelmerich
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
  - Department of Environmental Systems Sciences; ETH Zürich, Zürich, Switzerland
- Jacob West-Roberts
  - Environmental Science, Policy and Management, University of California, Berkeley, CA, USA
- Veronika Kivenson
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
- Alexander Crits-Christoph
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
  - Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
  - Cultivarium, Watertown, MA, USA
- Lynn Ly
  - Oxford Nanopore Technologies Inc, New York, NY, USA
- Rohan Sachdeva
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
- Chris Greening
  - Department of Microbiology, Biomedicine Discovery Institute; Monash University, Clayton, VIC, Australia
  - Securing Antarctica's Environmental Future, Monash University, Clayton, VIC, Australia
- David F Savage
  - Innovative Genomics Institute, University of California, Berkeley, California, USA
  - Howard Hughes Medical Institute, University of California, Berkeley, California, USA
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, USA
- Brett J Baker
  - Department of Marine Science, University of Texas at Austin; Marine Science Institute, Port Aransas, TX, USA.
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX, USA.
- Jillian F Banfield
  - Innovative Genomics Institute, University of California, Berkeley, California, USA.
  - Environmental Science, Policy and Management, University of California, Berkeley, CA, USA.
  - Department of Microbiology, Biomedicine Discovery Institute; Monash University, Clayton, VIC, Australia.
  - Earth and Planetary Science, University of California, Berkeley, CA, USA.
2
de Matos JP, Ribeiro DF, da Silva AK, de Paula CH, Cordeiro IF, Lemes CGDC, Sanchez AB, Rocha LCM, Garcia CCM, Almeida NF, Alves RM, de Abreu VAC, Varani AM, Moreira LM. Diversity and potential functional role of phyllosphere-associated actinomycetota isolated from cupuassu (Theobroma grandiflorum) leaves: implications for ecosystem dynamics and plant defense strategies. Mol Genet Genomics 2024; 299:73. [PMID: 39066857 DOI: 10.1007/s00438-024-02162-1] [Received: 11/07/2023] [Accepted: 06/25/2024] [Indexed: 07/30/2024]
Abstract
Exploring the intricate relationships between plants and their resident microorganisms is crucial not only for developing new methods to improve disease resistance and crop yields but also for understanding their co-evolutionary dynamics. Our research delves into the role of the phyllosphere-associated microbiome, especially Actinomycetota species, in enhancing pathogen resistance in Theobroma grandiflorum, or cupuassu, an agriculturally valuable Amazonian fruit tree vulnerable to witches' broom disease caused by Moniliophthora perniciosa. While breeding resistant cupuassu genotypes is a possible solution, the capacity of the Actinomycetota phylum to produce beneficial metabolites offers an alternative approach yet to be explored in this context. Utilizing advanced long-read sequencing and metagenomic analysis, we examined Actinomycetota from the phyllosphere of a disease-resistant cupuassu genotype, identifying 11 Metagenome-Assembled Genomes across eight genera. Our comparative genomic analysis uncovered 54 Biosynthetic Gene Clusters related to antitumor, antimicrobial, and plant growth-promoting activities, alongside cutinases and type VII secretion system-associated genes. These results indicate the potential of phyllosphere-associated Actinomycetota in cupuassu for inducing resistance or antagonism against pathogens. By integrating our genomic discoveries with the existing knowledge of cupuassu's defense mechanisms, we developed a model hypothesizing the synergistic or antagonistic interactions between plant and identified Actinomycetota during plant-pathogen interactions. This model offers a framework for understanding the intricate dynamics of microbial influence on plant health. In conclusion, this study underscores the significance of the phyllosphere microbiome, particularly Actinomycetota, in the broader context of harnessing microbial interactions for plant health. These findings offer valuable insights for enhancing agricultural productivity and sustainability.
Affiliation(s)
- Jéssica Pereira de Matos
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Dilson Fagundes Ribeiro
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Ana Karla da Silva
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Camila Henriques de Paula
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Isabella Ferreira Cordeiro
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Angélica Bianchini Sanchez
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Camila Carrião Machado Garcia
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
  - Departamento de Ciências Biológicas, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil
- Nalvo F Almeida
  - Faculdade de Computação, Universidade Federal de Mato Grosso do Sul, Campo Grande, MS, Brazil
- Alessandro M Varani
  - Departamento de Biotecnologia Agropecuária e Ambiental, Faculdade de Ciências Agrárias e Veterinárias, Universidade Estadual Paulista (UNESP), Jaboticabal, SP, Brazil.
- Leandro Marcio Moreira
  - Núcleo de Pesquisas em Ciências Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil.
  - Departamento de Ciências Biológicas, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto, Ouro Preto, MG, 35400-000, Brazil.
3
Sakurai R, Fukuda Y, Tada C. Circular metagenome-assembled genome of Candidatus Cloacimonadota recovered from anaerobic digestion sludge. Microbiol Resour Announc 2024; 13:e0040324. [PMID: 38916296 PMCID: PMC11256810 DOI: 10.1128/mra.00403-24] [Received: 04/17/2024] [Accepted: 06/06/2024] [Indexed: 06/26/2024] Open
Abstract
This study reports a circular metagenome-assembled genome (cMAG) of Candidatus Cloacimonadota recovered from a mesophilic full-scale food waste treatment plant. The cMAG spans 2,298,113 bp, with 980× coverage and 1 contig.
Affiliation(s)
- Riku Sakurai
  - Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
  - Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo, Japan
- Yasuhiro Fukuda
  - Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
- Chika Tada
  - Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
4
Shaw J, Gounot JS, Chen H, Nagarajan N, Yu YW. Floria: fast and accurate strain haplotyping in metagenomes. Bioinformatics 2024; 40:i30-i38. [PMID: 38940183 PMCID: PMC11211831 DOI: 10.1093/bioinformatics/btae252] [Indexed: 06/29/2024] Open
Abstract
SUMMARY: Shotgun metagenomics allows for direct analysis of microbial community genetics, but scalable computational methods for the recovery of bacterial strain genomes from microbiomes remain a key challenge. We introduce Floria, a novel method designed for rapid and accurate recovery of strain haplotypes from short and long-read metagenome sequencing data, based on minimum error correction (MEC) read clustering and a strain-preserving network flow model. Floria can function as a standalone haplotyping method, outputting alleles and reads that co-occur on the same strain, as well as an end-to-end read-to-assembly pipeline (Floria-PL) for strain-level assembly. Benchmarking evaluations on synthetic metagenomes show that Floria is >3× faster and recovers 21% more strain content than base-level assembly methods (Strainberry) while being over an order of magnitude faster when only phasing is required. Applying Floria to a set of 109 deeply sequenced nanopore metagenomes took <20 min on average per sample and identified several species that have consistent strain heterogeneity. Applying Floria's short-read haplotyping to a longitudinal gut metagenomics dataset revealed a dynamic multi-strain Anaerostipes hadrus community with frequent strain loss and emergence events over 636 days. With Floria, accurate haplotyping of metagenomic datasets takes mere minutes on standard workstations, paving the way for extensive strain-level metagenomic analyses. AVAILABILITY AND IMPLEMENTATION: Floria is available at https://github.com/bluenote-1577/floria, and the Floria-PL pipeline is available at https://github.com/jsgounot/Floria_analysis_workflow along with code for reproducing the benchmarks.
Affiliation(s)
- Jim Shaw
  - Department of Mathematics, University of Toronto, Toronto, Ontario, M5S 2E4, Canada
- Jean-Sebastien Gounot
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Hanrong Chen
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Niranjan Nagarajan
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
  - Yong Loo Lin School of Medicine, National University of Singapore, Singapore, 117597, Republic of Singapore
- Yun William Yu
  - Department of Mathematics, University of Toronto, Toronto, Ontario, M5S 2E4, Canada
  - Ray and Stephanie Lane Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, 15213, United States
5
Wang YC, Fu HM, Shen Y, Wang J, Wang N, Chen YP, Yan P. Biosynthetic potential of uncultured anammox community bacteria revealed through multi-omics analysis. BIORESOURCE TECHNOLOGY 2024; 401:130740. [PMID: 38677385 DOI: 10.1016/j.biortech.2024.130740] [Received: 01/29/2024] [Revised: 03/11/2024] [Accepted: 04/24/2024] [Indexed: 04/29/2024]
Abstract
Microbial secondary metabolites (SMs) and their derivatives have been widely used in medicine, agriculture, and energy. Growing needs for renewable energy and the challenges posed by antibiotic resistance, cancer, and pesticides emphasize the crucial hunt for new SMs. Anaerobic ammonium-oxidation (anammox) systems harbor many uncultured or underexplored bacteria, representing potential resources for discovering novel SMs. Leveraging HiFi long-read metagenomic sequencing, 1,040 biosynthetic gene clusters (BGCs) were unearthed from the anammox microbiome with 58% being complete and showcasing rich diversity. Most of them showed distant relations to known BGCs, implying novelty. Members of the underexplored lineages (Chloroflexota and Planctomycetota) and Proteobacteria contained lots of BGCs, showcasing substantial biosynthetic potential. Metaproteomic results indicated that Planctomycetota members harbored the most active BGCs, particularly those involved in producing potential biofuel-ladderane. Overall, these findings underscore that anammox microbiomes could serve as valuable resources for mining novel BGCs and discovering new SMs for practical application.
Affiliation(s)
- Yi-Cheng Wang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Hui-Min Fu
  - National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
- Yu Shen
  - National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
- Jin Wang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Nuo Wang
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- You-Peng Chen
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
- Peng Yan
  - Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China.
6
Tschitschko B, Esti M, Philippi M, Kidane AT, Littmann S, Kitzinger K, Speth DR, Li S, Kraberg A, Tienken D, Marchant HK, Kartal B, Milucka J, Mohr W, Kuypers MMM. Rhizobia-diatom symbiosis fixes missing nitrogen in the ocean. Nature 2024; 630:899-904. [PMID: 38723661 PMCID: PMC11208148 DOI: 10.1038/s41586-024-07495-w] [Received: 11/16/2023] [Accepted: 04/30/2024] [Indexed: 06/21/2024]
Abstract
Nitrogen (N2) fixation in oligotrophic surface waters is the main source of new nitrogen to the ocean [1] and has a key role in fuelling the biological carbon pump [2]. Oceanic N2 fixation has been attributed almost exclusively to cyanobacteria, even though genes encoding nitrogenase, the enzyme that fixes N2 into ammonia, are widespread among marine bacteria and archaea [3-5]. Little is known about these non-cyanobacterial N2 fixers, and direct proof that they can fix nitrogen in the ocean has so far been lacking. Here we report the discovery of a non-cyanobacterial N2-fixing symbiont, 'Candidatus Tectiglobus diatomicola', which provides its diatom host with fixed nitrogen in return for photosynthetic carbon. The N2-fixing symbiont belongs to the order Rhizobiales and its association with a unicellular diatom expands the known hosts for this order beyond the well-known N2-fixing rhizobia-legume symbioses on land [6]. Our results show that the rhizobia-diatom symbioses can contribute as much fixed nitrogen as can cyanobacterial N2 fixers in the tropical North Atlantic, and that they might be responsible for N2 fixation in the vast regions of the ocean in which cyanobacteria are too rare to account for the measured rates.
Affiliation(s)
- Bernhard Tschitschko
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - Department of Microbiology, University of Innsbruck, Innsbruck, Austria
- Mertcan Esti
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Miriam Philippi
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
- Abiel T Kidane
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Sten Littmann
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Katharina Kitzinger
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - Centre for Microbiology and Environmental Systems Science, Division of Microbial Ecology, University of Vienna, Vienna, Austria
- Daan R Speth
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - Centre for Microbiology and Environmental Systems Science, Division of Microbial Ecology, University of Vienna, Vienna, Austria
- Shengjie Li
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Alexandra Kraberg
  - Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany
- Daniela Tienken
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Hannah K Marchant
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - MARUM - Centre for Marine Environmental Sciences, University of Bremen, Bremen, Germany
- Boran Kartal
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
  - School of Science, Constructor University, Bremen, Germany
- Jana Milucka
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
- Wiebke Mohr
  - Max Planck Institute for Marine Microbiology, Bremen, Germany
7
Agustinho DP, Fu Y, Menon VK, Metcalf GA, Treangen TJ, Sedlazeck FJ. Unveiling microbial diversity: harnessing long-read sequencing technology. Nat Methods 2024; 21:954-966. [PMID: 38689099 DOI: 10.1038/s41592-024-02262-1] [Received: 09/08/2022] [Accepted: 03/29/2024] [Indexed: 05/02/2024]
Abstract
Long-read sequencing has recently transformed metagenomics, enhancing strain-level pathogen characterization, enabling accurate and complete metagenome-assembled genomes, and improving microbiome taxonomic classification and profiling. These advancements are not only due to improvements in sequencing accuracy, but also happening across rapidly changing analysis methods. In this Review, we explore long-read sequencing's profound impact on metagenomics, focusing on computational pipelines for genome assembly, taxonomic characterization and variant detection, to summarize recent advancements in the field and provide an overview of available analytical methods to fully leverage long reads. We provide insights into the advantages and disadvantages of long reads over short reads and their evolution from the early days of long-read sequencing to their recent impact on metagenomics and clinical diagnostics. We further point out remaining challenges for the field such as the integration of methylation signals in sub-strain analysis and the lack of benchmarks.
Affiliation(s)
- Daniel P Agustinho
  - Human Genome Sequencing center, Baylor College of Medicine, Houston, TX, USA
- Yilei Fu
  - Department of Computer Science, Rice University, Houston, TX, USA
- Vipin K Menon
  - Human Genome Sequencing center, Baylor College of Medicine, Houston, TX, USA
  - Senior research project manager, Human Genetics, Genentech, South San Francisco, CA, USA
- Ginger A Metcalf
  - Human Genome Sequencing center, Baylor College of Medicine, Houston, TX, USA
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, TX, USA
  - Department of Bioengineering, Rice University, Houston, TX, USA
- Fritz J Sedlazeck
  - Human Genome Sequencing center, Baylor College of Medicine, Houston, TX, USA.
  - Department of Computer Science, Rice University, Houston, TX, USA.
8
Huang W, Ding Y, Fan S, Liu W, Chen H, Segar S, Compton SG, Yu H. A high-quality chromosome-level genome assembly of Ficus hirta. Sci Data 2024; 11:526. [PMID: 38778063 PMCID: PMC11111794 DOI: 10.1038/s41597-024-03376-z] [Received: 11/30/2023] [Accepted: 05/14/2024] [Indexed: 05/25/2024] Open
Abstract
Ficus species (Moraceae) play pivotal roles in tropical and subtropical ecosystems. Thriving across diverse habitats, from rainforests to deserts, they harbor a multitude of mutualistic and antagonistic interactions with insects, nematodes, and pathogens. Despite their ecological significance, knowledge about the genomic background of Ficus remains limited. In this study, we report a chromosome-level reference genome of F. hirta, with a total size of 297.27 Mb, containing 28,625 protein-coding genes and 44.67% repeat sequences. These findings illuminate the genetic basis of Ficus responses to environmental challenges, offering valuable genomic resources for understanding genome size, adaptive evolution, and co-evolution with natural enemies and mutualists within the genus.
Affiliation(s)
- Weicheng Huang
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China
- Yamei Ding
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China
  - State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, the Chinese Academy of Sciences, Guangzhou, 510650, China
- Songle Fan
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China
- Wanzhen Liu
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China
  - State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, the Chinese Academy of Sciences, Guangzhou, 510650, China
- Hongfeng Chen
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China
  - State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, the Chinese Academy of Sciences, Guangzhou, 510650, China
- Simon Segar
  - Department of Crop and Environment Sciences, Harper Adams University, Newport, Shropshire, TF10 8NB, UK
- Hui Yu
  - Plant Resources Conservation and Sustainable Utilization, the Chinese Academy of Sciences, Guangzhou, 510650, China.
  - State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, the Chinese Academy of Sciences, Guangzhou, 510650, China.
  - State Key Laboratory of Plant Diversity and Specialty Crops, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, Guangdong, 510650, China.
9
Chen R, Meng S, Wang A, Jiang F, Yuan L, Lei L, Wang H, Fan W. The genomes of seven economic Caesalpinioideae trees provide insights into polyploidization history and secondary metabolite biosynthesis. PLANT COMMUNICATIONS 2024:100944. [PMID: 38733080 DOI: 10.1016/j.xplc.2024.100944] [Received: 01/24/2024] [Revised: 03/29/2024] [Accepted: 05/08/2024] [Indexed: 05/13/2024]
Abstract
The Caesalpinioideae subfamily contains many well-known trees that are important for economic sustainability and human health, but a lack of genomic resources has hindered their breeding and utilization. Here, we present chromosome-level reference genomes for the two food and industrial trees Gleditsia sinensis (921 Mb) and Biancaea sappan (872 Mb), the three shade and ornamental trees Albizia julibrissin (705 Mb), Delonix regia (580 Mb), and Acacia confusa (566 Mb), and the two pioneer and hedgerow trees Leucaena leucocephala (1338 Mb) and Mimosa bimucronata (641 Mb). Phylogenetic inference shows that the mimosoid clade has a much higher evolutionary rate than the other clades of Caesalpinioideae. Macrosynteny comparison suggests that the fusion and breakage of an unstable chromosome are responsible for the difference in basic chromosome number (13 or 14) for Caesalpinioideae. After an ancient whole-genome duplication (WGD) shared by all Caesalpinioideae species (CWGD, ∼72.0 million years ago [MYA]), there were two recent successive WGD events, LWGD-1 (16.2-19.5 MYA) and LWGD-2 (7.1-9.5 MYA), in L. leucocephala. Thereafter, ∼40% gene loss and genome-size contraction have occurred during the diploidization process in L. leucocephala. To investigate secondary metabolites, we identified all gene copies involved in mimosine metabolism in these species and found that the abundance of mimosine biosynthesis genes in L. leucocephala largely explains its high mimosine production. We also identified the set of all potential genes involved in triterpenoid saponin biosynthesis in G. sinensis, which is more complete than that based on previous transcriptome-derived unigenes. Our results and genomic resources will facilitate biological studies of Caesalpinioideae and promote the utilization of valuable secondary metabolites.
Affiliation(s)
- Rong Chen
  - College of Agronomy, Qingdao Agricultural University, Qingdao 266109, China; Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Sihan Meng
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Anqi Wang
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Fan Jiang
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Lihua Yuan
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China
- Lihong Lei
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; State Key Laboratory of Crop Stress Adaptation and Improvement, School of Life Sciences, Henan University, Kaifeng 475004, China
- Hengchao Wang
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Wei Fan
  - Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China.
10
Wang YC, Mao Y, Fu HM, Wang J, Weng X, Liu ZH, Xu XW, Yan P, Fang F, Guo JS, Shen Y, Chen YP. New insights into functional divergence and adaptive evolution of uncultured bacteria in anammox community by complete genome-centric analysis. THE SCIENCE OF THE TOTAL ENVIRONMENT 2024; 924:171530. [PMID: 38453092 DOI: 10.1016/j.scitotenv.2024.171530] [Received: 09/26/2023] [Revised: 11/13/2023] [Accepted: 03/04/2024] [Indexed: 03/09/2024]
Abstract
Anaerobic ammonium-oxidation (anammox) bacteria play a crucial role in global nitrogen cycling and wastewater nitrogen removal, but they share symbiotic relationships with various other microorganisms. Functional divergence and adaptive evolution of uncultured bacteria in anammox communities remain underexplored. Although shotgun metagenomics based on short reads has been widely used in anammox research, metagenome-assembled genomes (MAGs) are often discontinuous and highly contaminated, which limits in-depth analyses of anammox communities. Here, for the first time, we performed Pacific Biosciences high-fidelity (HiFi) long-read sequencing on an anammox granular sludge sample from a lab-scale bioreactor and obtained 30 accurate and complete metagenome-assembled genomes (cMAGs). These cMAGs were obtained by selecting high-quality circular contigs from initial assemblies of long reads generated by HiFi sequencing, eliminating the need for Illumina short reads, binning, and reassembly. One new anammox species affiliated with Candidatus Jettenia and three species affiliated with novel families were found in this anammox community. cMAG-centric analysis revealed functional divergence in general and nitrogen metabolism among the anammox community members, and they might adopt a cross-feeding strategy for organic matter, cofactors, and vitamins. Furthermore, we identified 63 mobile genetic elements (MGEs) and 50 putative horizontal gene transfer (HGT) events within these cMAGs. The results suggest that HGT events and MGEs related to phage and integration or excision, particularly transposons containing tnpA in anammox bacteria, might play important roles in the adaptive evolution of this anammox community. The cMAGs generated in the present study could be used to establish a comprehensive database for anammox bacteria and associated microorganisms. These findings highlight the advantages of HiFi sequencing for studies of complex mixed cultures and advance the understanding of anammox communities.
Affiliation(s)
- Yi-Cheng Wang
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Yanping Mao
- College of Chemistry and Environmental Engineering, Shenzhen University, Shenzhen 518071, Guangdong, China
| | - Hui-Min Fu
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China; National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
| | - Jin Wang
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Xun Weng
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Zi-Hao Liu
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Xiao-Wei Xu
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Peng Yan
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Fang Fang
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Jin-Song Guo
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China
| | - Yu Shen
- National Research Base of Intelligent Manufacturing Service, Chongqing Technology and Business University, Chongqing 400067, China
| | - You-Peng Chen
- Key Laboratory of the Three Gorges Reservoir Region's Eco-Environments of MOE, Chongqing University, Chongqing 400045, China.
| |
|
11
|
Zhang J, Liu Q, Dai L, Zhang Z, Wang Y. Pan-Genome Analysis of Wolbachia, Endosymbiont of Diaphorina citri, Reveals Independent Origin in Asia and North America. Int J Mol Sci 2024; 25:4851. [PMID: 38732070 PMCID: PMC11084931 DOI: 10.3390/ijms25094851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2024] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/13/2024] Open
Abstract
Wolbachia, a group of Gram-negative symbiotic bacteria, infects nematodes and a wide range of arthropods. Diaphorina citri Kuwayama, the vector of Candidatus Liberibacter asiaticus (CLas) that causes citrus greening disease, is naturally infected with Wolbachia (wDi). However, the interaction between wDi and D. citri remains poorly understood. In this study, we performed a pan-genome analysis using 65 wDi genomes to gain a comprehensive understanding of wDi. Based on average nucleotide identity (ANI) analysis, we classified the wDi strains into Asia and North America strains. The ANI analysis, principal coordinates analysis (PCoA), and phylogenetic tree analysis supported the conclusion that D. citri in Florida did not originate from China. Furthermore, we found that a significant number of core genes were associated with metabolic pathways. Pathways such as thiamine metabolism, type I secretion system, biotin transport, and phospholipid transport were highly conserved across all analyzed wDi genomes. The variation analysis between Asia and North America wDi identified 39,625 single-nucleotide polymorphisms (SNPs), 2153 indels, 10 inversions, 29 translocations, 65 duplications, 10 SV-based insertions, and 4 SV-based deletions. The SV-based insertions and deletions involved genes encoding transposase, phage tail tube protein, ankyrin repeat (ANK) protein, and group II intron-encoded protein. Pan-genome analysis of wDi contributes to our understanding of the geographical population of wDi, the origin of its host D. citri, and the interaction between wDi and its host, thus facilitating the development of strategies to control the insects and huanglongbing (HLB).
Affiliation(s)
- Jiahui Zhang
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, College of Plant Protection, Hunan Agricultural University, Changsha 410128, China; (J.Z.); (Q.L.); (L.D.)
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Institute of Plant Protection and Microbiology, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
| | - Qian Liu
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, College of Plant Protection, Hunan Agricultural University, Changsha 410128, China; (J.Z.); (Q.L.); (L.D.)
| | - Liangying Dai
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, College of Plant Protection, Hunan Agricultural University, Changsha 410128, China; (J.Z.); (Q.L.); (L.D.)
| | - Zhijun Zhang
- State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Institute of Plant Protection and Microbiology, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
| | - Yunsheng Wang
- Hunan Provincial Key Laboratory for Biology and Control of Plant Diseases and Insect Pests, College of Plant Protection, Hunan Agricultural University, Changsha 410128, China; (J.Z.); (Q.L.); (L.D.)
| |
|
12
|
Li H, Durbin R. Genome assembly in the telomere-to-telomere era. Nat Rev Genet 2024:10.1038/s41576-024-00718-w. [PMID: 38649458 DOI: 10.1038/s41576-024-00718-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/27/2024] [Indexed: 04/25/2024]
Abstract
Genome sequences largely determine the biology and encode the history of an organism, and de novo assembly - the process of reconstructing the genome sequence of an organism from sequencing reads - has been a central problem in bioinformatics for four decades. Until recently, genomes were typically assembled into fragments of a few megabases at best, but now technological advances in long-read sequencing enable the near-complete assembly of each chromosome - also known as telomere-to-telomere assembly - for many organisms. Here, we review recent progress on assembly algorithms and protocols, with a focus on how to derive near-telomere-to-telomere assemblies. We also discuss the additional developments that will be required to resolve remaining assembly gaps and to assemble non-diploid genomes.
Affiliation(s)
- Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
| | - Richard Durbin
- Department of Genetics, Cambridge University, Cambridge, UK.
| |
|
13
|
Xu Q, Zhang H, Vandenkoornhuyse P, Guo S, Kuzyakov Y, Shen Q, Ling N. Carbon starvation raises capacities in bacterial antibiotic resistance and viral auxiliary carbon metabolism in soils. Proc Natl Acad Sci U S A 2024; 121:e2318160121. [PMID: 38598339 PMCID: PMC11032446 DOI: 10.1073/pnas.2318160121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Accepted: 03/12/2024] [Indexed: 04/12/2024] Open
Abstract
Organic carbon availability in soil is crucial for shaping microbial communities, yet uncertainties persist concerning microbial adaptations to carbon levels and the ensuing ecological and evolutionary consequences. We investigated organic carbon metabolism, antibiotic resistance, and virus-host interactions in soils subjected to 40 years of chemical and organic fertilization that led to contrasting carbon availability: carbon-poor and carbon-rich soils, respectively. Carbon-poor soils drove the enrichment of putative genes involved in organic matter decomposition and exhibited specialization in utilizing complex organic compounds, reflecting scramble competition. This specialization confers a competitive advantage on microbial communities in carbon-poor soils but reduces their buffering capacity in terms of organic carbon metabolism, making them more vulnerable to environmental fluctuations. Additionally, in carbon-poor soils, viral auxiliary metabolic genes linked to organic carbon metabolism increased host competitiveness and environmental adaptability through a strategy akin to "piggyback the winner." Furthermore, putative antibiotic resistance genes, particularly in low-abundance drug categories, were enriched in carbon-poor soils as an evolutionary consequence of chemical warfare (i.e., interference competition). This raises concerns about the potential dissemination of antibiotic resistance from conventional agriculture that relies on chemical-only fertilization. Consequently, carbon starvation resulting from long-term chemical-only fertilization increases microbial adaptations to competition, underscoring the importance of implementing sustainable agricultural practices to mitigate the emergence and spread of antimicrobial resistance and to increase soil carbon storage.
Affiliation(s)
- Qicheng Xu
- Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing Agricultural University, Nanjing 210095, China
- CNRS, UMR 6553 EcoBio, Université de Rennes, Rennes Cedex 35042, France
| | - He Zhang
- Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing Agricultural University, Nanjing 210095, China
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
| | | | - Shiwei Guo
- Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing Agricultural University, Nanjing 210095, China
| | - Yakov Kuzyakov
- Department of Soil Science of Temperate Ecosystems, University of Göttingen, Göttingen 37077, Germany
- Department of Agricultural Soil Science, University of Göttingen, Göttingen 37077, Germany
| | - Qirong Shen
- Jiangsu Provincial Key Lab for Solid Organic Waste Utilization, Nanjing Agricultural University, Nanjing 210095, China
| | - Ning Ling
- State Key Laboratory of Herbage Improvement and Grassland Agro-Ecosystems, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou 730020, China
| |
|
14
|
Sakurai R, Fukuda Y, Tada C. Circular metagenome-assembled genome of Candidatus Patescibacteria recovered from anaerobic digestion sludge. Microbiol Resour Announc 2024; 13:e0008324. [PMID: 38526092 PMCID: PMC11008200 DOI: 10.1128/mra.00083-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2024] [Accepted: 03/15/2024] [Indexed: 03/26/2024] Open
Abstract
A single-contig, circular metagenome-assembled genome (cMAG) of Candidatus (Ca.) Patescibacteria was reconstructed from a mesophilic full-scale food waste treatment plant in Japan. The genome is of small size and lacks fundamental biosynthetic pathways. Taxonomic analysis using the Genome Taxonomy Database revealed that this cMAG belonged to the genus JAEZRQ01 (Ca. Parcubacteria).
Affiliation(s)
- Riku Sakurai
- Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
- Japan Society for the Promotion of Science, Chiyoda-ku, Tokyo, Japan
| | - Yasuhiro Fukuda
- Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
| | - Chika Tada
- Laboratory of Sustainable Animal Environment, Graduate School of Agricultural Science, Tohoku University, Osaki, Miyagi, Japan
| |
|
15
|
Feng X, Li H. Evaluating and improving the representation of bacterial contents in long-read metagenome assemblies. Genome Biol 2024; 25:92. [PMID: 38605401 PMCID: PMC11007910 DOI: 10.1186/s13059-024-03234-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Accepted: 03/29/2024] [Indexed: 04/13/2024] Open
Abstract
BACKGROUND: In the metagenomic assembly of a microbial community, abundant species are often thought to assemble well given their deeper sequencing coverage. This conjecture is rarely tested or evaluated in practice. We often do not know how many abundant species are missing and do not have an approach to recover them. RESULTS: Here, we propose k-mer-based and 16S RNA-based methods to measure the completeness of metagenome assembly. We show that even with PacBio high-fidelity (HiFi) reads, abundant species are often not assembled, as high strain diversity may lead to fragmented contigs. We develop a novel reference-free algorithm to recover abundant metagenome-assembled genomes (MAGs) by identifying circular assembly subgraphs. Complemented with a reference-free genome binning heuristic based on dimension reduction, the proposed method rescues many abundant species that would be missing with existing methods and produces results competitive with state-of-the-art binners in terms of the total number of near-complete genome bins. CONCLUSIONS: Our work emphasizes the importance of metagenome completeness, which has often been overlooked. Our algorithm generates more circular MAGs and moves a step closer to the complete representation of microbial communities.
Affiliation(s)
- Xiaowen Feng
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, USA.
- Department of Biomedical Informatics, Harvard Medical School, Boston, USA.
| |
|
16
|
Eisenhofer R, Nesme J, Santos-Bay L, Koziol A, Sørensen SJ, Alberdi A, Aizpurua O. A comparison of short-read, HiFi long-read, and hybrid strategies for genome-resolved metagenomics. Microbiol Spectr 2024; 12:e0359023. [PMID: 38451230 PMCID: PMC10986573 DOI: 10.1128/spectrum.03590-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/07/2023] [Accepted: 02/11/2024] [Indexed: 03/08/2024] Open
Abstract
Shotgun metagenomics enables the reconstruction of complex microbial communities at a high level of detail. Such an approach can be conducted using short-read sequencing data, long-read sequencing data, or a combination of both. To assess the pros and cons of these different approaches, we used 22 fecal DNA extracts collected weekly for 11 weeks from two lab mice to study seven performance metrics over four combinations of sequencing depth and technology: (i) 20 Gbp of Illumina short-read data, (ii) 40 Gbp of short-read data, (iii) 20 Gbp of PacBio HiFi long-read data, and (iv) 40 Gbp of hybrid (20 Gbp of short-read + 20 Gbp of long-read) data. No strategy was best for all metrics; instead, each one excelled across different metrics. The long-read approach yielded the best assembly statistics, with the highest N50 and lowest number of contigs. The 40 Gbp short-read approach yielded the highest number of refined bins. Finally, the hybrid approach yielded the longest assemblies and the highest mapping rate to the bacterial genomes. Our results suggest that while long-read sequencing significantly improves the quality of reconstructed bacterial genomes, it is more expensive and requires deeper sequencing than short-read approaches to recover a comparable number of reconstructed genomes. The optimal strategy is study-specific and depends on how researchers assess the trade-off between the quantity and quality of recovered genomes. IMPORTANCE: Mice are an important model organism for understanding the gut microbiome. When studying these gut microbiomes using DNA techniques, researchers can choose from technologies that use short or long DNA reads. In this study, we perform an extensive benchmark between short- and long-read DNA sequencing for studying mouse gut microbiomes. We find that no one approach was best for all metrics and provide information that can help guide researchers in planning their experiments.
Affiliation(s)
- Raphael Eisenhofer
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Joseph Nesme
- Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Luisa Santos-Bay
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Adam Koziol
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Søren Johannes Sørensen
- Section of Microbiology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Antton Alberdi
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Ostaizka Aizpurua
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| |
|
17
|
Cook R, Telatin A, Hsieh SY, Newberry F, Tariq MA, Baker DJ, Carding SR, Adriaenssens EM. Nanopore and Illumina sequencing reveal different viral populations from human gut samples. Microb Genom 2024; 10. [PMID: 38683195 DOI: 10.1099/mgen.0.001236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2024] Open
Abstract
The advent of viral metagenomics, or viromics, has improved our knowledge and understanding of global viral diversity. High-throughput sequencing technologies enable explorations of the ecological roles, contributions to host metabolism, and the influence of viruses in various environments, including the human intestinal microbiome. However, bacterial metagenomic studies frequently have the advantage. The adoption of advanced technologies like long-read sequencing has the potential to be transformative in refining viromics and metagenomics. Here, we examined the effectiveness of long-read and hybrid sequencing by comparing Illumina short-read and Oxford Nanopore Technology (ONT) long-read sequencing technologies and different assembly strategies on recovering viral genomes from human faecal samples. Our findings showed that if a single sequencing technology is to be chosen for virome analysis, Illumina is preferable due to its superior ability to recover fully resolved viral genomes and minimise erroneous genomes. While ONT assemblies were effective in recovering viral diversity, the challenges related to input requirements and the necessity for amplification made it less ideal as a standalone solution. However, using a combined, hybrid approach enabled a more authentic representation of viral diversity to be obtained within samples.
Affiliation(s)
- Ryan Cook
- Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| | | | | | - Fiona Newberry
- Department of Biosciences, Nottingham Trent University, Nottingham, NG11 8NS, UK
| | - Mohammad A Tariq
- Faculty of Health and Life Sciences, University of Northumbria, Newcastle upon Tyne, NE1 8ST, UK
| | - Dave J Baker
- Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
| | - Simon R Carding
- Quadram Institute Bioscience, Norwich, NR4 7UQ, UK
- Norwich Medical School, University of East Anglia, Norwich, NR4 7TJ, UK
| | | |
|
18
|
Yu W, Luo H, Yang J, Zhang S, Jiang H, Zhao X, Hui X, Sun D, Li L, Wei XQ, Lonardi S, Pan W. Comprehensive assessment of 11 de novo HiFi assemblers on complex eukaryotic genomes and metagenomes. Genome Res 2024; 34:326-340. [PMID: 38428994 PMCID: PMC10984382 DOI: 10.1101/gr.278232.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 01/23/2024] [Indexed: 03/03/2024]
Abstract
Pacific Biosciences (PacBio) HiFi sequencing technology generates long reads (>10 kbp) with very high accuracy (<0.01% sequencing error). Although several de novo assembly tools are available for HiFi reads, there are no comprehensive studies evaluating these assemblers. We evaluated the performance of 11 de novo HiFi assemblers on (1) real data for three eukaryotic genomes; (2) 34 synthetic data sets with different ploidy, sequencing coverage levels, heterozygosity rates, and sequencing error rates; (3) one real metagenomic data set; and (4) five synthetic metagenomic data sets with different composition abundance and heterozygosity rates. The 11 assemblers were evaluated using the Quality Assessment Tool (QUAST) and Benchmarking Universal Single-Copy Orthologs (BUSCO). We also used several additional criteria, namely, completion rate, single-copy completion rate, duplicated completion rate, average proportion of largest category, average distance difference, quality value, run-time, and memory utilization. Results show that hifiasm and hifiasm-meta should be the first choice for assembling eukaryotic genomes and metagenomes with HiFi data. We performed a comprehensive benchmarking study of commonly used assemblers on complex eukaryotic genomes and metagenomes. Our study will help the research community choose the most appropriate assembler for their data and identify possible improvements in assembly algorithms.
Affiliation(s)
- Wenjuan Yu
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
| | - Haohui Luo
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
| | - Jinbao Yang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Shengchen Zhang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
| | - Heling Jiang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
| | - Xianjia Zhao
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- School of Agricultural Sciences, Zhengzhou University, Zhengzhou, Henan 450001, China
| | - Xingqi Hui
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- School of Agricultural Sciences, Zhengzhou University, Zhengzhou, Henan 450001, China
| | - Da Sun
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
| | - Liang Li
- Fruit Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian 350002, China
| | - Xiu-Qing Wei
- Fruit Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian 350002, China
| | - Stefano Lonardi
- Department of Computer Science and Engineering, University of California, Riverside, California 92521, USA
| | - Weihua Pan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
| |
|
19
|
Hui X, Yang J, Sun J, Liu F, Pan W. MCSS: microbial community simulator based on structure. Front Microbiol 2024; 15:1358257. [PMID: 38516019 PMCID: PMC10956353 DOI: 10.3389/fmicb.2024.1358257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2023] [Accepted: 02/20/2024] [Indexed: 03/23/2024] Open
Abstract
De novo assembly plays a pivotal role in metagenomic analysis, and the incorporation of third-generation sequencing technology can significantly improve the integrity and accuracy of assembly results. Recently, with advancements in sequencing technology (HiFi, ultra-long), several long-read-based bioinformatic tools have been developed. However, validating the performance and reliability of these tools remains a crucial concern. To address this gap, we present MCSS (microbial community simulator based on structure), which can generate simulated microbial communities and sequencing datasets based on the structural attributes of real microbial communities. The evaluation results indicate that it can generate simulated communities that exhibit both diversity and similarity to actual community structures. Additionally, MCSS generates synthetic PacBio HiFi and Oxford Nanopore Technologies (ONT) long reads for the species within the simulated community. This tool provides a valuable resource for benchmarking and refining metagenomic analysis methods. Code available at: https://github.com/panlab-bio/mcss.
Affiliation(s)
- Xingqi Hui
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou, China
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
| | - Jinbao Yang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
- College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Jinhuan Sun
- Key Laboratory of Plant Molecular Physiology, CAS Center for Excellence in Molecular Plant Sciences, Institute of Botany, Chinese Academy of Sciences, Beijing, China
| | - Fang Liu
- Zhengzhou Research Base, State Key Laboratory of Cotton Biology, School of Agricultural Sciences, Zhengzhou University, Zhengzhou, China
- National Key Laboratory of Cotton Bio-Breeding and Integrated Utilization, Institute of Cotton Research, Chinese Academy of Agricultural Sciences (ICR, CAAS), Anyang, China
| | - Weihua Pan
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences (ICR, CAAS), Shenzhen, China
| |
|
20
|
Wang J, Zhang Q, Tung J, Zhang X, Liu D, Deng Y, Tian Z, Chen H, Wang T, Yin W, Li B, Lai Z, Dinesh-Kumar SP, Baker B, Li F. High-quality assembled and annotated genomes of Nicotiana tabacum and Nicotiana benthamiana reveal chromosome evolution and changes in defense arsenals. Mol Plant 2024; 17:423-437. [PMID: 38273657 DOI: 10.1016/j.molp.2024.01.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 01/08/2024] [Accepted: 01/21/2024] [Indexed: 01/27/2024]
Abstract
Nicotiana tabacum and Nicotiana benthamiana are widely used models in plant biology research. However, genomic studies of these species have lagged. Here we report the chromosome-level reference genome assemblies for N. benthamiana and N. tabacum with an estimated 99.5% and 99.8% completeness, respectively. Sensitive transcription start and termination site sequencing methods were developed and used for accurate gene annotation in N. tabacum. Comparative analyses revealed evidence for the parental origins and chromosome structural changes, leading to hybrid genome formation of each species. Interestingly, the antiviral silencing genes RDR1, RDR6, DCL2, DCL3, and AGO2 were lost from one or both subgenomes in N. benthamiana, while both homeologs were kept in N. tabacum. Furthermore, the N. benthamiana genome encodes fewer immune receptors and signaling components than that of N. tabacum. These findings uncover possible reasons underlying the hypersusceptible nature of N. benthamiana. We developed the user-friendly Nicomics (http://lifenglab.hzau.edu.cn/Nicomics/) web server to facilitate better use of Nicotiana genomic resources as well as gene structure and expression analyses.
Affiliation(s)
- Jubin Wang
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China; The Key Laboratory of Horticultural Plant Genetic and Improvement of Jiangxi Province, Institute of Biological Resources, Jiangxi Academy of Sciences, Nanchang 330299, China
| | - Qingling Zhang
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Institute of Vegetables and Flowers, Jiangxi Academy of Agricultural Sciences, Nanchang 330200, China
| | - Jeffrey Tung
- Plant Gene Expression Center, Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94706, USA
| | - Xi Zhang
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Dan Liu
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Yingtian Deng
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Zhendong Tian
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China
| | - Huilan Chen
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Taotao Wang
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Weixiao Yin
- College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
| | - Bo Li
- College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China
| | - Zhibing Lai
- College of Life Science and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China
| | - Savithramma P Dinesh-Kumar
- Department of Plant Biology and The Genome Center, College of Biological Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Barbara Baker
- Plant Gene Expression Center, Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, CA 94706, USA.
| | - Feng Li
- National Key Laboratory for Germplasm Innovation and Utilization for Fruit and Vegetable Horticultural Crops, College of Horticulture and Forestry Sciences, Huazhong Agricultural University, Wuhan, Hubei 430070, China; Hubei Hongshan Laboratory, Wuhan, Hubei 430070, China.
|
21
|
He Y, Zhang K, Shi Y, Lin H, Huang X, Lu X, Wang Z, Li W, Feng X, Shi T, Chen Q, Wang J, Tang Y, Chapman MA, Germ M, Luthar Z, Kreft I, Janovská D, Meglič V, Woo SH, Quinet M, Fernie AR, Liu X, Zhou M. Genomic insight into the origin, domestication, dispersal, diversification and human selection of Tartary buckwheat. Genome Biol 2024; 25:61. [PMID: 38414075 PMCID: PMC10898187 DOI: 10.1186/s13059-024-03203-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Accepted: 02/21/2024] [Indexed: 02/29/2024] Open
Abstract
BACKGROUND Tartary buckwheat, Fagopyrum tataricum, is a pseudocereal crop with worldwide distribution and high nutritional value. However, the origin and domestication history of this crop remain to be elucidated. RESULTS Here, by analyzing the population genomics of 567 accessions collected worldwide and reviewing historical documents, we find that Tartary buckwheat originated in the Himalayan region and then spread southwest possibly along with the migration of the Yi people, a minority in Southwestern China that has a long history of planting Tartary buckwheat. Along with the expansion of the Mongol Empire, Tartary buckwheat dispersed to Europe and ultimately to the rest of the world. The different natural growth environments resulted in adaptation, especially significant differences in salt tolerance between northern and southern Chinese Tartary buckwheat populations. By scanning for selective sweeps and using a genome-wide association study, we identify genes responsible for Tartary buckwheat domestication and differentiation, which we then experimentally validate. Comparative genomics and QTL analysis further shed light on the genetic foundation of the easily dehulled trait in a particular variety that was artificially selected by the Wa people, a minority group in Southwestern China known for cultivating Tartary buckwheat specifically for steaming as a staple food to prevent lysine deficiency. CONCLUSIONS This study provides both comprehensive insights into the origin and domestication of, and a foundation for molecular breeding for, Tartary buckwheat.
Affiliation(s)
- Yuqi He
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Kaixuan Zhang
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Yaliang Shi
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Hao Lin
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Xu Huang
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Xiang Lu
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Zhirong Wang
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Wei Li
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Xibo Feng
- Tibet Key Experiments of Crop Cultivation and Farming/College of Plant Science, Tibet Agriculture and Animal Husbandry University, Linzhi, 860000, China
| | - Taoxiong Shi
- Research Center of Buckwheat Industry Technology, Guizhou Normal University, Guiyang, 550001, China
| | - Qingfu Chen
- Research Center of Buckwheat Industry Technology, Guizhou Normal University, Guiyang, 550001, China
| | - Junzhen Wang
- Xichang Institute of Agricultural Science, Liangshan Yi People Autonomous Prefecture, Liangshan, Sichuan, 615000, China
| | - Yu Tang
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China
| | - Mark A Chapman
- Biological Sciences, University of Southampton, Life Sciences Building 85, Highfield Campus, Southampton, SO17 1BJ, UK
| | - Mateja Germ
- Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000, Ljubljana, Slovenia
| | - Zlata Luthar
- Biotechnical Faculty, University of Ljubljana, Jamnikarjeva 101, SI-1000, Ljubljana, Slovenia
| | - Ivan Kreft
- Nutrition Institute, Koprska Ulica 98, SI-1000, Ljubljana, Slovenia
| | - Dagmar Janovská
- Gene Bank, Crop Research Institute, Drnovská 507, Prague 6, Czech Republic
| | - Vladimir Meglič
- Agricultural Institute of Slovenia, Hacquetova ulica 17, SI-1000, Ljubljana, Slovenia
| | - Sun-Hee Woo
- Department of Crop Science, Chungbuk National University, Cheong-ju, Republic of Korea
| | - Muriel Quinet
- Groupe de Recherche en Physiologie Végétale (GRPV), Earth and Life Institute-Agronomy (ELI-A), Université catholique de Louvain, Croix du Sud 45, boîte L7.07.13, B-1348, Louvain-la-Neuve, Belgium
| | - Alisdair R Fernie
- Department of Molecular Physiology, Max-Planck-Institute of Molecular Plant Physiology, 14476, Potsdam, Germany
| | - Xu Liu
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
| | - Meiliang Zhou
- State Key Laboratory of Crop Gene Resources and Breeding, Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100081, China.
|
22
|
Vancaester E, Blaxter ML. MarkerScan: Separation and assembly of cobionts sequenced alongside target species in biodiversity genomics projects. Wellcome Open Res 2024; 9:33. [PMID: 38617467 PMCID: PMC11016177 DOI: 10.12688/wellcomeopenres.20730.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/18/2023] [Indexed: 04/16/2024] Open
Abstract
Contamination of public databases by mislabelled sequences has been highlighted for many years and the avalanche of novel sequencing data now being deposited has the potential to make databases difficult to use effectively. It is therefore crucial that sequencing projects and database curators perform pre-submission checks to remove obvious contamination and avoid propagating erroneous taxonomic relationships. However, it is important also to recognise that biological contamination of a target sample with unexpected species' DNA can also lead to the discovery of fascinating biological phenomena through the identification of environmental organisms or endosymbionts. Here, we present a novel, integrated method for detection and generation of high-quality genomes of all non-target genomes co-sequenced in eukaryotic genome sequencing projects. After performing taxonomic profiling of an assembly from the raw data, and leveraging the identity of small rRNA sequences discovered therein as markers, a targeted classification approach retrieves and assembles high-quality genomes. The genomes of these cobionts are then not only removed from the target species' genome but also available for further interrogation. Source code is available from https://github.com/CobiontID/MarkerScan. MarkerScan is written in Python and is deployed as a Docker container.
Affiliation(s)
| | - Mark L. Blaxter
- Tree of Life, Wellcome Sanger Institute, Hinxton, England, UK
|
23
|
Hosokawa M, Nishikawa Y. Tools for microbial single-cell genomics for obtaining uncultured microbial genomes. Biophys Rev 2024; 16:69-77. [PMID: 38495448 PMCID: PMC10937852 DOI: 10.1007/s12551-023-01124-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/14/2023] [Accepted: 08/23/2023] [Indexed: 03/19/2024] Open
Abstract
The advent of next-generation sequencing technologies has facilitated the acquisition of large amounts of DNA sequence data at a relatively low cost, leading to numerous breakthroughs in decoding microbial genomes. Among the various genome sequencing activities, metagenomic analysis, which entails the direct analysis of uncultured microbial DNA, has had a profound impact on microbiome research and has emerged as an indispensable technology in this field. Despite its valuable contributions, metagenomic analysis is a "bulk analysis" technique that analyzes samples containing a wide diversity of microbes, such as bacteria, yielding information that is averaged across the entire microbial population. In order to gain a deeper understanding of the heterogeneous nature of the microbial world, there is a growing need for single-cell analysis, similar to its use in human cell biology. With this paradigm shift in mind, comprehensive single-cell genomics technology has become a much-anticipated innovation that is now poised to revolutionize microbiome research. It has the potential to enable the discovery of differences at the strain level and to facilitate a more comprehensive examination of microbial ecosystems. In this review, we summarize the current state-of-the-art in microbial single-cell genomics, highlighting the potential impact of this technology on our understanding of the microbial world. The successful implementation of this technology is expected to have a profound impact in the field, leading to new discoveries and insights into the diversity and evolution of microbes.
Affiliation(s)
- Masahito Hosokawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-Cho, Shinjuku-Ku, Tokyo, 162-8480 Japan
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
- Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- bitBiome, Inc., 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
| | - Yohei Nishikawa
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-Ku, Tokyo, 169-8555 Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-Cho, Shinjuku-Ku, Tokyo, 162-0041 Japan
|
24
|
Kim C, Pongpanich M, Porntaveetus T. Unraveling metagenomics through long-read sequencing: a comprehensive review. J Transl Med 2024; 22:111. [PMID: 38282030 PMCID: PMC10823668 DOI: 10.1186/s12967-024-04917-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Accepted: 01/21/2024] [Indexed: 01/30/2024] Open
Abstract
The study of microbial communities has undergone significant advancements, starting from the initial use of 16S rRNA sequencing to the adoption of shotgun metagenomics. However, a new era has emerged with the advent of long-read sequencing (LRS), which offers substantial improvements over its predecessor, short-read sequencing (SRS). LRS produces reads that are several kilobases long, enabling researchers to obtain more complete and contiguous genomic information, characterize structural variations, and study epigenetic modifications. The current leaders in LRS technologies are Pacific Biosciences (PacBio) and Oxford Nanopore Technologies (ONT), each offering a distinct set of advantages. This review covers the workflow of long-read metagenomics sequencing, including sample preparation (sample collection, sample extraction, and library preparation), sequencing, processing (quality control, assembly, and binning), and analysis (taxonomic annotation and functional annotation). Each section provides a concise outline of the key concept of the methodology, presenting the original concept as well as how it is challenged or modified in the context of LRS. Additionally, the section introduces a range of tools that are compatible with LRS and can be utilized to execute the LRS process. This review aims to present the workflow of metagenomics, highlight the transformative impact of LRS, and provide researchers with a selection of tools suitable for this task.
Affiliation(s)
- Chankyung Kim
- Center of Excellence in Genomics and Precision Dentistry, Department of Physiology, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand
- Graduate Program in Bioinformatics and Computational Biology, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
| | - Monnat Pongpanich
- Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Center of Excellence for Cancer and Inflammation, Chulalongkorn University, Bangkok, Thailand
| | - Thantrira Porntaveetus
- Center of Excellence in Genomics and Precision Dentistry, Department of Physiology, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand.
- Graduate Program in Geriatric and Special Patients Care, Faculty of Dentistry, Chulalongkorn University, Bangkok, Thailand.
|
25
|
Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, Quince C. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat Biotechnol 2024:10.1038/s41587-023-01983-6. [PMID: 38168989 DOI: 10.1038/s41587-023-01983-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2023] [Accepted: 09/08/2023] [Indexed: 01/05/2024]
Abstract
We introduce metaMDBG, a metagenomics assembler for PacBio HiFi reads. MetaMDBG combines a de Bruijn graph assembly in a minimizer space with an iterative assembly over sequences of minimizers to address variations in genome coverage depth and an abundance-based filtering strategy to simplify strain complexity. For complex communities, we obtained up to twice as many high-quality circularized prokaryotic metagenome-assembled genomes as existing methods and had better recovery of viruses and plasmids.
Affiliation(s)
- Gaëtan Benoit
- Organisms and Ecosystems, Earlham Institute, Norwich, UK
| | | | - Robert James
- Gut Microbes and Health, Quadram Institute, Norwich, UK
| | - Adam M Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| | - Rayan Chikhi
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
| | - Christopher Quince
- Organisms and Ecosystems, Earlham Institute, Norwich, UK.
- Gut Microbes and Health, Quadram Institute, Norwich, UK.
- School of Biological Sciences, University of East Anglia, Norwich, UK.
- Warwick Medical School, University of Warwick, Coventry, UK.
|
26
|
Cerk K, Ugalde‐Salas P, Nedjad CG, Lecomte M, Muller C, Sherman DJ, Hildebrand F, Labarthe S, Frioux C. Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing. Microb Biotechnol 2024; 17:e14396. [PMID: 38243750 PMCID: PMC10832553 DOI: 10.1111/1751-7915.14396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 11/27/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024] Open
Abstract
Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.
Affiliation(s)
- Klara Cerk
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | | | - Chabname Ghassemi Nedjad
- Inria, University of Bordeaux, INRAE, Talence, France
- University of Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France
| | - Maxime Lecomte
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE STLO, University of Rennes, Rennes, France
| | | | | | - Falk Hildebrand
- Quadram Institute Bioscience, Norwich, UK
- Earlham Institute, Norwich, UK
| | - Simon Labarthe
- Inria, University of Bordeaux, INRAE, Talence, France
- INRAE, University of Bordeaux, BIOGECO, UMR 1202, Cestas, France
|
27
|
Jang Y, Kang JS, Bae EH, Lee J. Metagenome-assembled genomes of the GU0601 sample (the Han River, South Korea). Microbiol Resour Announc 2023; 12:e0068823. [PMID: 37982653 PMCID: PMC10720407 DOI: 10.1128/mra.00688-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 10/22/2023] [Indexed: 11/21/2023] Open
Abstract
We generated metagenome sequences of the GU0601 sample collected from the Han River and constructed metagenome-assembled genomes (MAGs) to identify their bacterial composition. We identified six MAGs belonging to Alphaproteobacteria, Cyanobacteria, and Flavobacteria.
Affiliation(s)
- YeongJun Jang
- Department of Oceanography, Kyungpook National University, Daegu, South Korea
| | - Jae-Shin Kang
- Biodiversity Research and Cooperation Division, National Institute of Biological Resources, Incheon, South Korea
| | - Eun Hee Bae
- Climate Change and Environmental Biology Research Division, National Institute of Biological Resources, Incheon, South Korea
| | - JunMo Lee
- Department of Oceanography, Kyungpook National University, Daegu, South Korea
- Kyungpook Institute of Oceanography, Kyungpook National University, Daegu, South Korea
|
28
|
Kang X, Xu J, Luo X, Schönhuth A. Hybrid-hybrid correction of errors in long reads with HERO. Genome Biol 2023; 24:275. [PMID: 38041098 PMCID: PMC10690975 DOI: 10.1186/s13059-023-03112-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Accepted: 11/16/2023] [Indexed: 12/03/2023] Open
Abstract
Although generally superior, hybrid approaches for correcting errors in third-generation sequencing (TGS) reads, using next-generation sequencing (NGS) reads, mistake haplotype-specific variants for errors in polyploid and mixed samples. We suggest HERO, as the first "hybrid-hybrid" approach, to make use of both de Bruijn graphs and overlap graphs for optimal catering to the particular strengths of NGS and TGS reads. Extensive benchmarking experiments demonstrate that HERO improves indel and mismatch error rates by on average 65% (27–95%) and 20% (4–61%). Using HERO prior to genome assembly significantly improves the assemblies in the majority of the relevant categories.
Affiliation(s)
- Xiongbin Kang
- College of Biology, Hunan University, Changsha, China
- Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany
| | - Jialu Xu
- College of Biology, Hunan University, Changsha, China
| | - Xiao Luo
- College of Biology, Hunan University, Changsha, China.
| | - Alexander Schönhuth
- Genome Data Science, Faculty of Technology, Bielefeld University, Bielefeld, Germany.
|
29
|
Schaerer L, Putman L, Bigcraft I, Byrne E, Kulas D, Zolghadr A, Aloba S, Ong R, Shonnard D, Techtmann S. Coexistence of specialist and generalist species within mixed plastic derivative-utilizing microbial communities. Microbiome 2023; 11:224. [PMID: 37838714 PMCID: PMC10576394 DOI: 10.1186/s40168-023-01645-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Accepted: 08/09/2023] [Indexed: 10/16/2023]
Abstract
BACKGROUND Plastic-degrading microbial isolates offer great potential to degrade, transform, and upcycle plastic waste. Tandem chemical and biological processing of plastic wastes has been shown to substantially increase the rates of plastic degradation; however, the focus of this work has been almost entirely on microbial isolates (either bioengineered or naturally occurring). We propose that a microbial community has even greater potential for plastic upcycling. A microbial community has greater metabolic diversity to process mixed plastic waste streams and has built-in functional redundancy for optimal resilience. RESULTS Here, we used two plastic-derivative degrading communities as a model system to investigate the roles of specialist and generalist species within the microbial communities. These communities were grown on five plastic-derived substrates: pyrolysis treated high-density polyethylene, chemically deconstructed polyethylene terephthalate, disodium terephthalate, terephthalamide, and ethylene glycol. Short-read metagenomic and metatranscriptomic sequencing were performed to evaluate activity of microorganisms in each treatment. Long-read metagenomic sequencing was performed to obtain high-quality metagenome assembled genomes and evaluate division of labor. CONCLUSIONS Data presented here show that the communities are primarily dominated by Rhodococcus generalists and lower abundance specialists for each of the plastic-derived substrates investigated here, supporting previous research that generalist species dominate batch culture. Additionally, division of labor may be present between Hydrogenophaga terephthalate degrading specialists and lower abundance protocatechuate degrading specialists. Video Abstract.
Affiliation(s)
- Laura Schaerer
- Department of Biological Sciences, Michigan Technological University, 740 Dow ESE Building, 1400 Townsend Drive, Houghton, MI, 49931, USA
| | - Lindsay Putman
- Department of Biological Sciences, Michigan Technological University, 740 Dow ESE Building, 1400 Townsend Drive, Houghton, MI, 49931, USA
| | - Isaac Bigcraft
- Department of Biological Sciences, Michigan Technological University, 740 Dow ESE Building, 1400 Townsend Drive, Houghton, MI, 49931, USA
| | - Emma Byrne
- Department of Biological Sciences, Michigan Technological University, 740 Dow ESE Building, 1400 Townsend Drive, Houghton, MI, 49931, USA
| | - Daniel Kulas
- Department of Chemical Engineering, Michigan Technological University, Houghton, MI, USA
| | - Ali Zolghadr
- Department of Chemical Engineering, Michigan Technological University, Houghton, MI, USA
| | - Sulihat Aloba
- Department of Chemical Engineering, Michigan Technological University, Houghton, MI, USA
| | - Rebecca Ong
- Department of Chemical Engineering, Michigan Technological University, Houghton, MI, USA
| | - David Shonnard
- Department of Chemical Engineering, Michigan Technological University, Houghton, MI, USA
| | - Stephen Techtmann
- Department of Biological Sciences, Michigan Technological University, 740 Dow ESE Building, 1400 Townsend Drive, Houghton, MI, 49931, USA.
|
30
|
Aizpurua O, Dunn RR, Hansen LH, Gilbert MTP, Alberdi A. Field and laboratory guidelines for reliable bioinformatic and statistical analysis of bacterial shotgun metagenomic data. Crit Rev Biotechnol 2023:1-19. [PMID: 37731336 DOI: 10.1080/07388551.2023.2254933] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/24/2023] [Accepted: 06/27/2023] [Indexed: 09/22/2023]
Abstract
Shotgun metagenomics is an increasingly cost-effective approach for profiling environmental and host-associated microbial communities. However, due to the complexity of both microbiomes and the molecular techniques required to analyze them, the reliability and representativeness of the results are contingent upon the field, laboratory, and bioinformatic procedures employed. Here, we consider 15 field and laboratory issues that critically impact downstream bioinformatic and statistical data processing, as well as result interpretation, in bacterial shotgun metagenomic studies. The issues we consider encompass intrinsic properties of samples, study design, and laboratory-processing strategies. We identify the links of field and laboratory steps with downstream analytical procedures, explain the means for detecting potential pitfalls, and propose mitigation measures to overcome or minimize their impact in metagenomic studies. We anticipate that our guidelines will assist data scientists in appropriately processing and interpreting their data, while aiding field and laboratory researchers to implement strategies for improving the quality of the generated results.
Affiliation(s)
- Ostaizka Aizpurua
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Robert R Dunn
- Department of Applied Ecology, North Carolina State University, Raleigh, NC, USA
| | - Lars H Hansen
- Department of Plant and Environmental Sciences, University of Copenhagen, Frederiksberg, Denmark
| | - M T P Gilbert
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
- University Museum, NTNU, Trondheim, Norway
| | - Antton Alberdi
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
|
31
|
Arikawa K, Hosokawa M. Uncultured prokaryotic genomes in the spotlight: An examination of publicly available data from metagenomics and single-cell genomics. Comput Struct Biotechnol J 2023; 21:4508-4518. [PMID: 37771751 PMCID: PMC10523443 DOI: 10.1016/j.csbj.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 09/10/2023] [Accepted: 09/10/2023] [Indexed: 09/30/2023] Open
Abstract
Owing to the ineffectiveness of traditional culture techniques for the vast majority of microbial species, culture-independent analyses utilizing next-generation sequencing and bioinformatics have become essential for gaining insight into microbial ecology and function. This mini-review focuses on two essential methods for obtaining genetic information from uncultured prokaryotes, metagenomics and single-cell genomics. We analyzed the registration status of uncultured prokaryotic genome data from major public databases and assessed the advantages and limitations of both the methods. Metagenomics generates a significant quantity of sequence data and multiple prokaryotic genomes using straightforward experimental procedures. However, in ecosystems with high microbial diversity, such as soil, most genes are presented as brief, disconnected contigs, and lack association of highly conserved genes and mobile genetic elements with individual species genomes. Although technically more challenging, single-cell genomics offers valuable insights into complex ecosystems by providing strain-resolved genomes, addressing issues in metagenomics. Recent technological advancements, such as long-read sequencing, machine learning algorithms, and in silico protein structure prediction, in combination with vast genomic data, have the potential to overcome the current technical challenges and facilitate a deeper understanding of uncultured microbial ecosystems and microbial dark matter genes and proteins. In light of this, it is imperative that continued innovation in both methods and technologies take place to create high-quality reference genome databases that will support future microbial research and industrial applications.
Affiliation(s)
- Koji Arikawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
| | - Masahito Hosokawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| |
|
32
|
Li H, Durbin R. Genome assembly in the telomere-to-telomere era. ARXIV 2023:arXiv:2308.07877v1. [PMID: 37645045 PMCID: PMC10462168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
De novo assembly is the process of reconstructing the genome sequence of an organism from sequencing reads. Genome sequences are essential to biology, and assembly has been a central problem in bioinformatics for four decades. Until recently, genomes were typically assembled into fragments of a few megabases at best but technological advances in long-read sequencing now enable near complete chromosome-level assembly, also known as telomere-to-telomere assembly, for many organisms. Here we review recent progress on assembly algorithms and protocols. We focus on how to derive near telomere-to-telomere assemblies and discuss potential future developments.
Affiliation(s)
- Heng Li
- Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
| | - Richard Durbin
- Department of Genetics, Cambridge University, Cambridge, UK
| |
|
33
|
Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, Quince C. Efficient High-Quality Metagenome Assembly from Long Accurate Reads using Minimizer-space de Bruijn Graphs. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.07.07.548136. [PMID: 37786716 PMCID: PMC10541625 DOI: 10.1101/2023.07.07.548136] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/04/2023]
Abstract
We introduce a novel metagenomics assembler for high-accuracy long reads. Our approach, implemented as metaMDBG, combines highly efficient de Bruijn graph assembly in minimizer space, with both a multi-k' approach for dealing with variations in genome coverage depth and an abundance-based filtering strategy for simplifying strain complexity. The resulting algorithm is more efficient than the state-of-the-art but with better assembly results. metaMDBG was 1.5 to 12 times faster than competing assemblers and requires between one-tenth and one-thirtieth of the memory across a range of data sets. We obtained up to twice as many high-quality circularised prokaryotic metagenome assembled genomes (MAGs) on the most complex communities, and a better recovery of viruses and plasmids. metaMDBG performs particularly well for abundant organisms whilst being robust to the presence of strain diversity. The result is that for the first time it is possible to efficiently reconstruct the majority of complex communities by abundance as near-complete MAGs.
Affiliation(s)
- Gaëtan Benoit
- Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK
| | | | - Robert James
- Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK
| | - Adam M. Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| | - Rayan Chikhi
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
| | - Christopher Quince
- Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK
- Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK
- Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
| |
|
34
|
Pan S, Zhao XM, Coelho LP. SemiBin2: self-supervised contrastive learning leads to better MAGs for short- and long-read sequencing. Bioinformatics 2023; 39:i21-i29. [PMID: 37387171 PMCID: PMC10311329 DOI: 10.1093/bioinformatics/btad209] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION Metagenomic binning methods to reconstruct metagenome-assembled genomes (MAGs) from environmental samples have been widely used in large-scale metagenomic studies. The recently proposed semi-supervised binning method, SemiBin, achieved state-of-the-art binning results in several environments. However, this required annotating contigs, a computationally costly and potentially biased process. RESULTS We propose SemiBin2, which uses self-supervised learning to learn feature embeddings from the contigs. In simulated and real datasets, we show that self-supervised learning achieves better results than the semi-supervised learning used in SemiBin1 and that SemiBin2 outperforms other state-of-the-art binners. Compared to SemiBin1, SemiBin2 can reconstruct 8.3-21.5% more high-quality bins and requires only 25% of the running time and 11% of peak memory usage in real short-read sequencing samples. To extend SemiBin2 to long-read data, we also propose an ensemble-based DBSCAN clustering algorithm, resulting in 13.1-26.3% more high-quality genomes than the second-best binner for long-read data. AVAILABILITY AND IMPLEMENTATION SemiBin2 is available as open source software at https://github.com/BigDataBiology/SemiBin/ and the analysis scripts used in the study can be found at https://github.com/BigDataBiology/SemiBin2_benchmark.
Affiliation(s)
- Shaojun Pan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Shanghai 200433, China
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Shanghai 200433, China
- MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- Zhangjiang Fudan International Innovation Center, Shanghai 201203, China
| | - Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Shanghai 200433, China
| |
|
35
|
Jia L, Wu Y, Dong Y, Chen J, Chen WH, Zhao XM. A survey on computational strategies for genome-resolved gut metagenomics. Brief Bioinform 2023; 24:7145904. [PMID: 37114640 DOI: 10.1093/bib/bbad162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 03/20/2023] [Accepted: 04/04/2023] [Indexed: 04/29/2023] Open
Abstract
Recovering high-quality metagenome-assembled genomes (HQ-MAGs) is critical for exploring microbial compositions and microbe-phenotype associations. However, multiple sequencing platforms and computational tools for this purpose may confuse researchers and thus call for extensive evaluation. Here, we systematically evaluated a total of 40 combinations of popular computational tools and sequencing platforms (i.e. strategies), involving eight assemblers, eight metagenomic binners and four sequencing technologies, including short-, long-read and metaHiC sequencing. We identified the best tools for the individual tasks (e.g. the assembly and binning) and combinations (e.g. generating more HQ-MAGs) depending on the availability of the sequencing data. We found that the combination of the hybrid assemblies and metaHiC-based binning performed best, followed by the hybrid and long-read assemblies. More importantly, both long-read and metaHiC sequencings link more mobile elements and antibiotic resistance genes to bacterial hosts and improve the quality of public human gut reference genomes with 32% (34/105) HQ-MAGs that were either of better quality than those in the Unified Human Gastrointestinal Genome catalog version 2 or novel.
Affiliation(s)
- Longhao Jia
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Yingjian Wu
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
| | - Yanqi Dong
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Jingchao Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
| | - Wei-Hua Chen
- Key Laboratory of Molecular Biophysics of the Ministry of Education, Hubei Key Laboratory of Bioinformatics and Molecular-imaging, Center for Artificial Intelligence Biology, Department of Bioinformatics and Systems Biology, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, Hubei, China
- Institution of Medical Artificial Intelligence, Binzhou Medical University, Yantai 264003, China
| | - Xing-Ming Zhao
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, Shanghai 200433, China
- MOE Frontiers Center for Brain Science, Fudan University, Shanghai 200433, China
- State Key Laboratory of Medical Neurobiology, Institutes of Brain Science, Fudan University, Shanghai, China
| |
|
36
|
Eco-evolutionary implications of helminth microbiomes. J Helminthol 2023; 97:e22. [PMID: 36790127 DOI: 10.1017/s0022149x23000056] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
Abstract
The evolution of helminth parasites has long been seen as an interplay between host resistance to infection and the parasite's capacity to bypass such resistance. However, there has recently been an increasing appreciation of the role of symbiotic microbes in the interaction of helminth parasites and their hosts. It is now clear that helminths have a different microbiome from the organisms they parasitize, and sometimes amid large variability, components of the microbiome are shared among different life stages or among populations of the parasite. Helminths have been shown to acquire microbes from their parent generations (vertical transmission) and from their surroundings (horizontal transmission). In this latter case, natural selection has been strongly linked to the fact that helminth-associated microbiota is not simply a random assemblage of the pool of microbes available from their organismal hosts or environments. Indeed, some helminth parasites and specific microbial taxa have evolved complex ecological relationships, ranging from obligate mutualism to reproductive manipulation of the helminth by associated microbes. However, our understanding is still very elementary regarding the net effect of all microbiome components in the eco-evolution of helminths and their interaction with hosts. In this non-exhaustive review, we focus on the bacterial microbiome associated with helminths (as opposed to the microbiome of their hosts) and highlight relevant concepts and key findings in bacterial transmission, ecological associations, and taxonomic and functional diversity of the bacteriome. We integrate the microbiome dimension in a discussion of the evolution of helminth parasites and identify fundamental knowledge gaps, finally suggesting research avenues for understanding the eco-evolutionary impacts of the microbiome in host-parasite interactions in light of new technological developments.
|
37
|
Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing. Microbiol Spectr 2023; 11:e0332822. [PMID: 36475839 PMCID: PMC9927493 DOI: 10.1128/spectrum.03328-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity and deficiencies in key functional elements. Here, by applying long-read sequencing technology to sediment samples from a Tibetan saline lake, we comprehensively analyzed the performance of high-fidelity (HiFi) reads and the possibility of integration with short-read next-generation sequencing (NGS) data. In total, 207 full-length nonredundant 16S rRNA gene sequences and 19 full-length nonredundant 18S rRNA genes were directly obtained from HiFi reads, which greatly surpassed the retrieval performance of NGS technology. We carried out a cross-sectional comparison among multiple assembly strategies, referred to as 'NGS', 'Hybrid (NGS+HiFi)', and 'HiFi'. Two MAGs and 29 viruses with circular genomes were reconstructed using HiFi reads alone, indicating the great power of the 'HiFi' approach to assemble high-quality microbial genomes. Among the 3 strategies, the 'Hybrid' approach produced the highest number of medium/high-quality MAGs and viral genomes, while the ratio of MAGs containing 16S rRNA genes was significantly improved in the 'HiFi' assembly results. Overall, our study provides a practical metagenomic resolution for analyzing complex environmental samples by taking advantage of both the short-read and HiFi long-read sequencing methods to extract the maximum amount of information, including data on prokaryotes, eukaryotes, and viruses, via the 'Hybrid' approach. IMPORTANCE To expand the understanding of microbial dark matter in the environment, we did the first comparative evaluation of multiple assembly strategies based on high-throughput short-read and HiFi data from lake sediments metagenomic sequencing. The results demonstrated great improvement of the 'Hybrid' assembly method (short-read next-generation sequencing data plus HiFi data) in the recovery of medium/high-quality MAGs and viral genomes. Further analysis showed that HiFi data is important to retrieve the complete circular prokaryotic and viral genomes. Meanwhile, hundreds of full-length 16S/18S rRNA genes were assembled directly from HiFi data, which facilitated the species composition studies of complex environmental samples, especially for understanding micro-eukaryotes. Therefore, the application of the latest HiFi long-read sequencing could greatly improve the metagenomic assembly integrity and promote environmental microbiome research.
|
38
|
Phylogenomic analysis of Wolbachia genomes from the Darwin Tree of Life biodiversity genomics project. PLoS Biol 2023; 21:e3001972. [PMID: 36689552 PMCID: PMC9894559 DOI: 10.1371/journal.pbio.3001972] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 09/30/2022] [Revised: 02/02/2023] [Accepted: 12/19/2022] [Indexed: 01/24/2023] Open
Abstract
The Darwin Tree of Life (DToL) project aims to sequence all described terrestrial and aquatic eukaryotic species found in Britain and Ireland. Reference genome sequences are generated from single individuals for each target species. In addition to the target genome, sequenced samples often contain genetic material from microbiomes, endosymbionts, parasites, and other cobionts. Wolbachia endosymbiotic bacteria are found in a diversity of terrestrial arthropods and nematodes, with supergroups A and B the most common in insects. We identified and assembled 110 complete Wolbachia genomes from 93 host species spanning 92 families by filtering data from 368 insect species generated by the DToL project. From 15 infected species, we assembled more than one Wolbachia genome, including cases where individuals carried simultaneous supergroup A and B infections. Different insect orders had distinct patterns of infection, with Lepidopteran hosts mostly infected with supergroup B, while infections in Diptera and Hymenoptera were dominated by A-type Wolbachia. Other than these large-scale order-level associations, host and Wolbachia phylogenies revealed no (or very limited) cophylogeny. This points to the occurrence of frequent host switching events, including between insect orders, in the evolutionary history of the Wolbachia pandemic. While supergroup A and B genomes had distinct GC% and GC skew, and B genomes had a larger core gene set and tended to be longer, it was the abundance of copies of bacteriophage WO that was a strong determinant of Wolbachia genome size. Mining raw genome data generated for reference genome assemblies is a robust way of identifying and analysing cobiont genomes and giving greater ecological context for their hosts.
|
39
|
Jiang F, Li Q, Wang S, Shen T, Wang H, Wang A, Xu D, Yuan L, Lei L, Chen R, Yang B, Deng Y, Fan W. Recovery of metagenome-assembled microbial genomes from a full-scale biogas plant of food waste by pacific biosciences high-fidelity sequencing. Front Microbiol 2023; 13:1095497. [PMID: 36699587 PMCID: PMC9869026 DOI: 10.3389/fmicb.2022.1095497] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/11/2022] [Accepted: 12/20/2022] [Indexed: 01/12/2023] Open
Abstract
Background Anaerobic digestion (AD) is important in treating food waste, and thousands of metagenome-assembled genomes (MAGs) have been constructed for the microbiome in AD. However, due to the limitations of the short-read sequencing and assembly technologies, most of these MAGs are grouped from hundreds of short contigs by binning algorithms, and errors are easily introduced. Results In this study, we constructed a total of 60 non-redundant microbial genomes from 64.5 Gb of PacBio high-fidelity (HiFi) long reads, generated from the digestate samples of a full-scale biogas plant fed with food waste. Of the 60 microbial genomes, all genomes have at least one copy of rRNA operons (16S, 23S, and 5S rRNA), 54 have ≥18 types of standard tRNA genes, and 39 are circular complete genomes. In comparison with the published short-read derived MAGs for AD, we found 23 genomes with average nucleotide identity less than 95% to any known MAGs. Besides, our HiFi-derived genomes have much higher average contig N50 size, slightly higher average genome size and lower contamination. GTDB-Tk classification of these genomes revealed two genomes belonging to a novel genus and four genomes belonging to novel species, since their 16S rRNA genes have identities lower than 95% and 97% to any known 16S rRNA genes, respectively. Microbial community analysis based on these assembled genomes reveals the most predominant phylum was Thermotogae (70.5%), followed by Euryarchaeota (6.1%), and Bacteroidetes (4.7%), and the most predominant bacterial and archaeal genera were Defluviitoga (69.1%) and Methanothrix (5.4%), respectively. Analysis of the full-length 16S rRNA genes identified from the HiFi reads gave similar microbial compositions to those derived from the 60 assembled genomes.
Conclusion High-fidelity sequencing not only generated microbial genomes with obviously improved quality but also recovered a substantial portion of novel genomes missed in previous short-read based studies, and the novel genomes will deepen our understanding of the microbial composition in AD of food waste.
Affiliation(s)
- Fan Jiang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Qiang Li
- Biogas Institute of Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China; Key Laboratory of Development and Application of Rural Renewable Energy, Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China
| | - Sen Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Ting Shen
- Biogas Institute of Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China; Key Laboratory of Development and Application of Rural Renewable Energy, Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China
| | - Hengchao Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Anqi Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Dong Xu
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Lihua Yuan
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Lihong Lei
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Rong Chen
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Boyuan Yang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China
| | - Yu Deng
- Biogas Institute of Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China; Key Laboratory of Development and Application of Rural Renewable Energy, Ministry of Agriculture and Rural Affairs, Chengdu, Sichuan, China; *Correspondence: Yu Deng; Wei Fan
| | - Wei Fan
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, China; *Correspondence: Yu Deng; Wei Fan
| |
|
40
|
Affiliation(s)
- Mads Albertsen
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark.
| |
|
41
|
Jin H, Quan K, He Q, Kwok LY, Ma T, Li Y, Zhao F, You L, Zhang H, Sun Z. A high-quality genome compendium of the human gut microbiome of Inner Mongolians. Nat Microbiol 2023; 8:150-161. [PMID: 36604505 DOI: 10.1038/s41564-022-01270-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2021] [Accepted: 10/13/2022] [Indexed: 01/07/2023]
Abstract
Metagenome-based resources have revealed the diversity and function of the human gut microbiome, but further understanding is limited by insufficient genome quality and a lack of samples from typically understudied populations. Here we used hybrid long-read PromethION and short-read HiSeq sequencing to characterize the faecal microbiota of 60 Inner Mongolian individuals (n = 180 samples over three time points) who were part of a probiotic yogurt intervention trial. We present the Inner Mongolian Gut Genome catalogue, comprising 802 closed and 5,927 high-quality metagenome-assembled genomes. This approach achieved high genome continuity and substantially increased the resolution of genomic elements, including ribosomal RNA operons, metabolic gene clusters, prophages and insertion sequences. Particularly, we report the ribosomal RNA operon copy numbers for uncultured species, over 12,000 previously undescribed gut prophages and the distribution of insertion sequence elements across gut bacteria. Overall, these data provide a high-quality, large-scale resource for studying the human gut microbiota.
Collapse
Affiliation(s)
- Hao Jin
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Keyu Quan
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Qiuwen He
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Lai-Yu Kwok
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Teng Ma
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Yalin Li
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Feiyan Zhao
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Lijun You
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Heping Zhang
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| | - Zhihong Sun
- Inner Mongolia Key Laboratory of Dairy Biotechnology and Engineering, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Products Processing, Ministry of Agriculture and Rural Affairs, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China; Key Laboratory of Dairy Biotechnology and Engineering, Ministry of Education, Inner Mongolia Agricultural University, Hohhot, Inner Mongolia, China
| |
|
42
|
Long-Read Sequencing Improves Recovery of Picoeukaryotic Genomes and Zooplankton Marker Genes from Marine Metagenomes. mSystems 2022; 7:e0059522. [PMID: 36448813 PMCID: PMC9765425 DOI: 10.1128/msystems.00595-22] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/05/2022] Open
Abstract
Long-read sequencing offers the potential to improve metagenome assemblies and provide more robust assessments of microbial community composition and function than short-read sequencing. We applied Pacific Biosciences (PacBio) CCS (circular consensus sequencing) HiFi shotgun sequencing to 14 marine water column samples and compared the results with those for short-read metagenomes from the corresponding environmental DNA samples. We found that long-read metagenomes varied widely in quality and biological information. The community compositions of the corresponding long- and short-read metagenomes were frequently dissimilar, suggesting higher stochasticity and/or bias associated with PacBio sequencing. Long reads provided few improvements to the assembly qualities, gene annotations, and prokaryotic metagenome-assembled genome (MAG) binning results. However, only long reads produced high-quality eukaryotic MAGs and contigs containing complete zooplankton marker gene sequences. These results suggest that high-quality long-read metagenomes can improve marine community composition analyses and provide important insight into eukaryotic phyto- and zooplankton genetics, but the benefits may be outweighed by the inconsistent data quality. IMPORTANCE Ocean microbes provide critical ecosystem services, but most remain uncultivated. Their communities can be studied through shotgun metagenomic sequencing and bioinformatic analyses, including binning draft microbial genomes. However, most sequencing to date has been done using short-read technology, which rarely yields genome sequences of key microbes like SAR11. Long-read sequencing can improve metagenome assemblies but is hampered by technological shortcomings and high costs. In this study, we compared long- and short-read sequencing of marine metagenomes. We found a wide range of long-read metagenome qualities and minimal improvements to microbiome analyses. However, long reads generated draft genomes of eukaryotic algal species and provided full-length marker gene sequences of zooplankton species, including krill and copepods. These results suggest that long-read sequencing can provide greater genetic insight into the wide diversity of eukaryotic phyto- and zooplankton that interact as part of and with the marine microbiome.
|
43
|
Tadrent N, Dedeine F, Hervé V. SnakeMAGs: a simple, efficient, flexible and scalable workflow to reconstruct prokaryotic genomes from metagenomes. F1000Res 2022; 11:1522. [PMID: 36875992 PMCID: PMC9978240 DOI: 10.12688/f1000research.128091.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/23/2023] [Indexed: 03/02/2023] Open
Abstract
Background: Over the last decade, we have observed in microbial ecology a transition from gene-centric to genome-centric analyses. Indeed, the advent of metagenomics combined with binning methods, single-cell genome sequencing as well as high-throughput cultivation methods have contributed to the continuing and exponential increase of available prokaryotic genomes, which in turn has favored the exploration of microbial metabolisms. In the case of metagenomics, data processing, from raw reads to genome reconstruction, involves various steps and software which can represent a major technical obstacle. Methods: To overcome this challenge, we developed SnakeMAGs, a simple workflow that can process Illumina data, from raw reads to metagenome-assembled genomes (MAGs) classification and relative abundance estimate. It integrates state-of-the-art bioinformatic tools to sequentially perform: quality control of the reads (illumina-utils, Trimmomatic), host sequence removal (optional step, using Bowtie2), assembly (MEGAHIT), binning (MetaBAT2), quality filtering of the bins (CheckM, GUNC), classification of the MAGs (GTDB-Tk) and estimate of their relative abundance (CoverM). Developed with the popular Snakemake workflow management system, it can be deployed on various architectures, from single to multicore and from workstation to computer clusters and grids. It is also flexible since users can easily change parameters and/or add new rules. Results: Using termite gut metagenomic datasets, we showed that SnakeMAGs is slower but allowed the recovery of more MAGs encompassing more diverse phyla compared to another similar workflow named ATLAS. Importantly, these additional MAGs showed no significant difference compared to the other ones in terms of completeness, contamination, genome size nor relative abundance. Conclusions: Overall, it should make the reconstruction of MAGs more accessible to microbiologists. 
SnakeMAGs as well as test files and an extended tutorial are available at https://github.com/Nachida08/SnakeMAGs.
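The sequence of steps listed in the Methods can be written down as an ordered stage list. The following is only an illustrative sketch: the `Stage` type, the `STAGES` list and the `plan_stages` function are hypothetical names introduced here, not SnakeMAGs rule names or configuration keys. The step and tool names are taken from the abstract above.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    """One pipeline step and the tools the abstract associates with it (hypothetical type)."""
    name: str
    tools: List[str]
    optional: bool = False


# Order and tool names follow the abstract; nothing here mirrors the real Snakefile.
STAGES: List[Stage] = [
    Stage("quality control of the reads", ["illumina-utils", "Trimmomatic"]),
    Stage("host sequence removal", ["Bowtie2"], optional=True),
    Stage("assembly", ["MEGAHIT"]),
    Stage("binning", ["MetaBAT2"]),
    Stage("quality filtering of the bins", ["CheckM", "GUNC"]),
    Stage("classification of the MAGs", ["GTDB-Tk"]),
    Stage("relative abundance estimate", ["CoverM"]),
]


def plan_stages(remove_host: bool) -> List[Stage]:
    """Return the stages to run, skipping optional ones unless requested."""
    return [s for s in STAGES if remove_host or not s.optional]


if __name__ == "__main__":
    for i, stage in enumerate(plan_stages(remove_host=False), start=1):
        print(f"{i}. {stage.name}: {', '.join(stage.tools)}")
```

Calling `plan_stages(remove_host=True)` keeps the optional Bowtie2 step in second position; with `remove_host=False` it is dropped and the remaining order is unchanged.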
Affiliation(s)
- Nachida Tadrent
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
| | - Franck Dedeine
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
| | - Vincent Hervé
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Université Paris-Saclay, INRAE, AgroParisTech, UMR SayFood, Palaiseau, 91120, France
| |
|
44
|
Tadrent N, Dedeine F, Hervé V. SnakeMAGs: a simple, efficient, flexible and scalable workflow to reconstruct prokaryotic genomes from metagenomes. F1000Res 2022; 11:1522. [PMID: 36875992 PMCID: PMC9978240 DOI: 10.12688/f1000research.128091.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/01/2022] [Indexed: 01/05/2024] Open
Abstract
Background: Over the last decade, we have observed in microbial ecology a transition from gene-centric to genome-centric analyses. Indeed, the advent of metagenomics combined with binning methods, single-cell genome sequencing as well as high-throughput cultivation methods have contributed to the continuing and exponential increase of available prokaryotic genomes, which in turn has favored the exploration of microbial metabolisms. In the case of metagenomics, data processing, from raw reads to genome reconstruction, involves various steps and software which can represent a major technical obstacle. Methods: To overcome this challenge, we developed SnakeMAGs, a simple workflow that can process Illumina data, from raw reads to metagenome-assembled genomes (MAGs) classification and relative abundance estimate. It integrates state-of-the-art bioinformatic tools to sequentially perform: quality control of the reads (illumina-utils, Trimmomatic), host sequence removal (optional step, using Bowtie2), assembly (MEGAHIT), binning (MetaBAT2), quality filtering of the bins (CheckM), classification of the MAGs (GTDB-Tk) and estimate of their relative abundance (CoverM). Developed with the popular Snakemake workflow management system, it can be deployed on various architectures, from single to multicore and from workstation to computer clusters and grids. It is also flexible since users can easily change parameters and/or add new rules. Results: Using termite gut metagenomic datasets, we showed that SnakeMAGs is slower but allowed the recovery of more MAGs encompassing more diverse phyla compared to another similar workflow named ATLAS. Conclusions: Overall, it should make the reconstruction of MAGs more accessible to microbiologists. SnakeMAGs as well as test files and an extended tutorial are available at https://github.com/Nachida08/SnakeMAGs.
Affiliation(s)
- Nachida Tadrent
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
| | - Franck Dedeine
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
| | - Vincent Hervé
- Institut de Recherche sur la Biologie de l'Insecte, UMR 7261, CNRS-Université de Tours, Tours, 37200, France
- Université Paris-Saclay, INRAE, AgroParisTech, UMR SayFood, Palaiseau, 91120, France
| |
|
45
|
Zhang Y, Jiang F, Yang B, Wang S, Wang H, Wang A, Xu D, Fan W. Improved microbial genomes and gene catalog of the chicken gut from metagenomic sequencing of high-fidelity long reads. Gigascience 2022; 11:6833030. [PMID: 36399059 PMCID: PMC9673493 DOI: 10.1093/gigascience/giac116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2022] [Revised: 09/27/2022] [Accepted: 10/30/2022] [Indexed: 11/19/2022] Open
Abstract
Background: Due to the importance of chicken production and the remarkable influence of the gut microbiota on host health and growth, tens of thousands of metagenome-assembled genomes (MAGs) have been constructed for the chicken gut microbiome. However, due to the limitations of short-read sequencing and assembly technologies, most of these MAGs are far from complete, are of lower quality, and include contaminant reads. Results: We generated 332 Gb of high-fidelity (HiFi) long reads from the 5 chicken intestinal compartments and assembled 461 and 337 microbial genomes, of which 53% and 55% are circular, at the strain and species levels, respectively. For the assembled microbial genomes, approximately 95% were regarded as complete according to the “RNA complete” criteria, which require at least 1 full-length ribosomal RNA (rRNA) operon encoding all 3 types of rRNA (16S, 23S, and 5S) and at least 18 copies of full-length transfer RNA genes. In comparison with the short-read-derived chicken MAGs, 384 (83% of 461) and 89 (26% of 337) strain-level and species-level genomes in this study are novel, with no matches to previously reported sequences. At the gene level, one-third of the 2.5 million genes in the HiFi-derived gene catalog are novel and cannot be matched to the short-read-derived gene catalog. Moreover, the HiFi-derived genomes have much higher continuity and completeness, as well as lower contamination; the HiFi-derived gene catalog has a much higher ratio of complete gene structures. The dominant phylum in our HiFi-assembled genomes was Firmicutes (82.5%), and the foregut was highly enriched in 5 genera: Ligilactobacillus, Limosilactobacillus, Lactobacillus, Weissella, and Enterococcus, all of which belong to the order Lactobacillales. Using GTDB-Tk, all 337 species-level genomes were successfully classified at the order level; however, 2, 35, and 189 genomes could not be classified into any known family, genus, and species, respectively. Among these incompletely classified genomes, 9 and 49 may belong to novel genera and species, respectively, because their 16S rRNA genes have identities lower than 95% and 97% to any known 16S rRNA genes. Conclusions: HiFi sequencing not only produced metagenome assemblies and gene structures with markedly improved quality but also recovered a substantial portion of novel genomes and genes that were missed in previous short-read-based metagenome studies. The novel genomes and species obtained in this study will facilitate gut microbiome and host–microbiota interaction studies, thereby contributing to the sustainable development of poultry resources.
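The two numeric rules quoted in this abstract (the "RNA complete" criteria and the 16S rRNA identity cut-offs) can be expressed as simple predicates. This is a minimal sketch under stated assumptions: the `RnaGeneCounts` record and both function names are hypothetical, and the code assumes that counts of complete rRNA operons and full-length tRNA genes are already available from some annotation step that the abstract does not describe.

```python
from dataclasses import dataclass


@dataclass
class RnaGeneCounts:
    """Per-genome RNA gene counts (hypothetical record, not from the paper)."""
    complete_rrna_operons: int  # operons encoding 16S, 23S and 5S together
    full_length_trnas: int      # copies of full-length tRNA genes


# Thresholds as stated in the abstract.
MIN_RRNA_OPERONS = 1
MIN_TRNA_COPIES = 18
GENUS_IDENTITY_PCT = 95.0
SPECIES_IDENTITY_PCT = 97.0


def is_rna_complete(counts: RnaGeneCounts) -> bool:
    """Apply the 'RNA complete' rule: >=1 full rRNA operon and >=18 tRNA copies."""
    return (counts.complete_rrna_operons >= MIN_RRNA_OPERONS
            and counts.full_length_trnas >= MIN_TRNA_COPIES)


def novelty_from_16s(best_identity_pct: float) -> str:
    """Label a genome by its best 16S rRNA identity to any known sequence."""
    if best_identity_pct < GENUS_IDENTITY_PCT:
        return "candidate novel genus"
    if best_identity_pct < SPECIES_IDENTITY_PCT:
        return "candidate novel species"
    return "within known species range"


if __name__ == "__main__":
    print(is_rna_complete(RnaGeneCounts(complete_rrna_operons=1, full_length_trnas=18)))
    print(novelty_from_16s(96.2))
    # Percentages reported in the abstract, recomputed from the counts.
    print(round(100 * 384 / 461), round(100 * 89 / 337))
```

The last line recomputes the reported novelty fractions from the stated counts (384 of 461 strain-level and 89 of 337 species-level genomes).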
Affiliation(s)
- Yan Zhang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Fan Jiang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Boyuan Yang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Sen Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Hengchao Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Anqi Wang
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Dong Xu
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| | - Wei Fan
- Guangdong Laboratory for Lingnan Modern Agriculture (Shenzhen Branch), Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong, 518120, China
| |
|
46
|
Kim CY, Ma J, Lee I. HiFi metagenomic sequencing enables assembly of accurate and complete genomes from human gut microbiota. Nat Commun 2022; 13:6367. [PMID: 36289209 PMCID: PMC9606305 DOI: 10.1038/s41467-022-34149-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 06/21/2022] [Accepted: 10/12/2022] [Indexed: 12/25/2022] Open
Abstract
Advances in metagenomic assembly have led to the discovery of genomes belonging to uncultured microorganisms. Metagenome-assembled genomes (MAGs) often suffer from fragmentation and chimerism. Recently, 20 complete MAGs (cMAGs) have been assembled from Oxford Nanopore long-read sequencing of 13 human fecal samples, but with low nucleotide accuracy. Here, we report 102 cMAGs obtained by Pacific Biosciences (PacBio) high-accuracy long-read (HiFi) metagenomic sequencing of five human fecal samples, whose initial circular contigs were selected for complete prokaryotic genomes using our bioinformatics workflow. Nucleotide accuracy of the final cMAGs was as high as that of Illumina sequencing. The cMAGs could exceed 6 Mbp and included complete genomes of diverse taxa, including entirely uncultured RF39 and TANB77 orders. Moreover, cMAGs revealed that regions hard to assemble by short-read sequencing comprised mostly genomic islands and rRNAs. HiFi metagenomic sequencing will facilitate cataloging accurate and complete genomes from complex microbial communities, including uncultured species.
Affiliation(s)
- Chan Yeong Kim
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea; Present Address: Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Junyeong Ma
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea
| | - Insuk Lee
- Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722 Republic of Korea; POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673 Republic of Korea
| |
|
47
|
Sereika M, Kirkegaard RH, Karst SM, Michaelsen TY, Sørensen EA, Wollenberg RD, Albertsen M. Oxford Nanopore R10.4 long-read sequencing enables the generation of near-finished bacterial genomes from pure cultures and metagenomes without short-read or reference polishing. Nat Methods 2022; 19:823-826. [PMID: 35789207 PMCID: PMC9262707 DOI: 10.1038/s41592-022-01539-7] [Citation(s) in RCA: 136] [Impact Index Per Article: 68.0] [Received: 11/10/2021] [Accepted: 05/24/2022] [Indexed: 12/26/2022]
Abstract
Long-read Oxford Nanopore sequencing has democratized microbial genome sequencing and enables the recovery of highly contiguous microbial genomes from isolates or metagenomes. However, to obtain near-finished genomes it has been necessary to include short-read polishing to correct insertions and deletions derived from homopolymer regions. Here, we show that Oxford Nanopore R10.4 can be used to generate near-finished microbial genomes from isolates or metagenomes without short-read or reference polishing.
Affiliation(s)
- Mantas Sereika
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark
| | - Rasmus Hansen Kirkegaard
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark; Joint Microbiome Facility, University of Vienna, Vienna, Austria
| | - Mads Albertsen
- Center for Microbial Communities, Aalborg University, Aalborg, Denmark.
| |
|
48
|
Fedarko MW, Kolmogorov M, Pevzner PA. Analyzing rare mutations in metagenomes assembled using long and accurate reads. Genome Res 2022; 32:2119-2133. [PMID: 36418060 PMCID: PMC9808630 DOI: 10.1101/gr.276917.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Accepted: 11/16/2022] [Indexed: 11/25/2022]
Abstract
The advent of long and accurate "HiFi" reads has greatly improved our ability to generate complete metagenome-assembled genomes (MAGs), enabling "complete metagenomics" studies that were nearly impossible to conduct with short reads. In particular, HiFi reads simplify the identification and phasing of mutations in MAGs: It is increasingly feasible to distinguish between positions that are prone to mutations and positions that rarely ever mutate, and to identify co-occurring groups of mutations. However, the problems of identifying rare mutations in MAGs, estimating the false-discovery rate (FDR) of these identifications, and phasing identified mutations remain open in the context of HiFi data. We present strainFlye, a pipeline for the FDR-controlled identification and analysis of rare mutations in MAGs assembled using HiFi reads. We show that deep HiFi sequencing has the potential to reveal and phase tens of thousands of rare mutations in a single MAG, identify hotspots and coldspots of these mutations, and detail MAGs' growth dynamics.
Affiliation(s)
- Marcus W. Fedarko
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
| | - Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA; UC Santa Cruz Genomics Institute, Santa Cruz, California 95064, USA
| | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA; Center for Microbiome Innovation, University of California San Diego, La Jolla, California 92093, USA
| |
|