1
Yamada A, Nishi Y, Noguchi M, Watanabe K, Oshiro M, Sakai K, Tashiro Y. Isolated hair bacteria reveal different isolation possibilities under various conditions. J Biosci Bioeng 2024; 138:290-300. [PMID: 39033053 DOI: 10.1016/j.jbiosc.2024.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Revised: 06/05/2024] [Accepted: 06/07/2024] [Indexed: 07/23/2024]
Abstract
Microorganisms are assumed to inhabit various environments and organisms, including the human body. The presence of more than 700 bacterial species on scalp hair has been reported through rRNA gene amplicon analysis. However, the biological properties and functions of bacteria on scalp hair (hair bacteria) are poorly understood because few hair bacteria have been isolated in previous studies. This study aimed to isolate hair bacteria using standard media under 24 different conditions (including medium components, component concentrations, gelling agents, and atmospheric environments). Furthermore, we evaluated the possibility of isolating strains under these isolation conditions and examined the carbon metabolic ability of several predominantly isolated strains. A total of 63 bacterial species belonging to 27 genera were isolated from hair under the 24 isolation conditions. The predominant bacterial species isolated from human hair in this study showed carbon metabolic capabilities different from those of the reference strains. In addition, isolation possibility was newly proposed to systematically evaluate the number of isolation conditions that could cultivate a bacterial species. Based on isolation possibility, the isolates were categorized into groups with a high number of isolation conditions (e.g., ≥25%; such as Staphylococcus) and those with a low number (e.g., ≤25%; such as Brachybacterium). These findings indicate that human hair harbors both microorganisms that are easily isolated and microorganisms that are difficult to isolate.
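The abstract does not give a formula for isolation possibility. As a minimal sketch, assuming it is the percentage of the 24 isolation conditions under which a species was recovered, the grouping could be computed as follows. The function names are hypothetical, and treating exactly 25% as "high" is an assumption, since the abstract lists 25% under both groups.

```python
TOTAL_CONDITIONS = 24  # number of isolation conditions stated in the abstract


def isolation_possibility(n_conditions_isolated, total=TOTAL_CONDITIONS):
    """Percentage of conditions under which a species was isolated (assumed definition)."""
    if not 0 <= n_conditions_isolated <= total:
        raise ValueError("count must lie between 0 and the total number of conditions")
    return 100.0 * n_conditions_isolated / total


def classify(possibility_percent, threshold=25.0):
    """Assign the 'high' or 'low' group; the boundary case is assigned to 'high' by assumption."""
    return "high" if possibility_percent >= threshold else "low"


# Hypothetical counts for illustration only; these are not data from the study.
example_counts = {"species_A": 18, "species_B": 2}
groups = {name: classify(isolation_possibility(n)) for name, n in example_counts.items()}
```

With these hypothetical counts, species_A falls in the high group (75%) and species_B in the low group (about 8%).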
Affiliation(s)
- Azusa Yamada
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Yuri Nishi
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Mei Noguchi
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Kota Watanabe
- Department of Fermentation Science, Faculty of Applied Biosciences, Tokyo University of Agriculture, 1-1-1 Sakuragaoka, Setagaya-ku, Tokyo 156-8502, Japan
- Mugihito Oshiro
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Kenji Sakai
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan; Center for International Education and Research of Agriculture, Faculty of Agriculture, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Yukihiro Tashiro
- Division of Systems Bioengineering, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan; Center for International Education and Research of Agriculture, Faculty of Agriculture, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan.
2
Kutuzova S, Nielsen M, Piera P, Nissen JN, Rasmussen S. Taxometer: Improving taxonomic classification of metagenomics contigs. Nat Commun 2024; 15:8357. [PMID: 39333501 PMCID: PMC11437175 DOI: 10.1038/s41467-024-52771-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 09/20/2024] [Indexed: 09/29/2024] Open
Abstract
For taxonomy-based classification of metagenomics assembled contigs, current methods use sequence similarity to identify their most likely taxonomy. However, in the related field of metagenomic binning, contigs are routinely clustered using information from both the contig sequences and their abundance. We introduce Taxometer, a neural network-based method that improves the annotations and estimates the quality of any taxonomic classifier using contig abundance profiles and tetra-nucleotide frequencies. We apply Taxometer to five short-read CAMI2 datasets and find that it increases the average share of correct species-level contig annotations of the MMSeqs2 tool from 66.6% to 86.2%. Additionally, it reduces the share of wrong species-level annotations in the CAMI2 Rhizosphere dataset by an average of two-fold for Metabuli, Centrifuge, and Kraken2. Furthermore, we use Taxometer for benchmarking taxonomic classifiers on two complex long-read metagenomics data sets where ground truth is not known. Taxometer is available as open-source software and can enhance any taxonomic annotation of metagenomic contigs.
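Tetra-nucleotide frequencies, one of the two input features named in the abstract, can be illustrated with a short sketch. This is a generic computation over all 256 four-letter words, not Taxometer's own code; in particular, it does not merge reverse-complement k-mers, which some tools do, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

# All 256 possible 4-mers over the DNA alphabet, in a fixed lexicographic order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetranucleotide_frequencies(seq):
    """Return the normalized 4-mer frequency vector of a contig.

    Windows containing characters other than A, C, G, T (for example N) are ignored.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + 4] for i in range(len(seq) - 3))
    total = sum(counts[k] for k in KMERS)
    if total == 0:
        return [0.0] * len(KMERS)
    return [counts[k] / total for k in KMERS]


freqs = tetranucleotide_frequencies("ACGTACGT")
```

The resulting vector has a fixed length regardless of contig length, which is what makes it usable as a per-contig feature alongside abundance profiles.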
Affiliation(s)
- Svetlana Kutuzova
- Department of Computer Science, University of Copenhagen, Universitetsparken 1, Copenhagen, 2100, Denmark
- The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark
- Mads Nielsen
- Department of Computer Science, University of Copenhagen, Universitetsparken 1, Copenhagen, 2100, Denmark
- Pau Piera
- The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark
- Jakob Nybo Nissen
- The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark.
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark.
- Simon Rasmussen
- The Novo Nordisk Foundation Center for Protein Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark.
- The Novo Nordisk Foundation Center for Basic Metabolic Research, University of Copenhagen, Blegdamsvej 3A, Copenhagen, 2200, Denmark.
- The Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute of MIT and Harvard, Cambridge, 02142, MA, USA.
3
Kazantseva E, Donmez A, Frolova M, Pop M, Kolmogorov M. Strainy: phasing and assembly of strain haplotypes from long-read metagenome sequencing. Nat Methods 2024:10.1038/s41592-024-02424-1. [PMID: 39327484 DOI: 10.1038/s41592-024-02424-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Accepted: 08/22/2024] [Indexed: 09/28/2024]
Abstract
Bacterial species in microbial communities are often represented by mixtures of strains, distinguished by small variations in their genomes. Short-read approaches can be used to detect small-scale variation between strains but fail to phase these variants into contiguous haplotypes. Long-read metagenome assemblers can generate contiguous bacterial chromosomes but often suppress strain-level variation in favor of species-level consensus. Here we present Strainy, an algorithm for strain-level metagenome assembly and phasing from Nanopore and PacBio reads. Strainy takes a de novo metagenomic assembly as input and identifies strain variants, which are then phased and assembled into contiguous haplotypes. Using simulated and mock Nanopore and PacBio metagenome data, we show that Strainy assembles accurate and complete strain haplotypes, outperforming current Nanopore-based methods and comparable with PacBio-based algorithms in completeness and accuracy. We then use Strainy to assemble strain haplotypes of a complex environmental metagenome, revealing distinct strain distribution and mutational patterns in bacterial species.
Affiliation(s)
- Ekaterina Kazantseva
- Bioinformatics and Systems Biology Program, ITMO University, St. Petersburg, Russia
- Ataberk Donmez
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Department of Computer Science, University of Maryland, College Park, MD, USA
- Maria Frolova
- Functional Genomics of Prokaryotes Laboratory, Institute of Cell Biophysics, RAS, Pushchino, Russia
- Mihai Pop
- Department of Computer Science, University of Maryland, College Park, MD, USA.
- Mikhail Kolmogorov
- Cancer Data Science Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA.
4
Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, Quince C. High-quality metagenome assembly from long accurate reads with metaMDBG. Nat Biotechnol 2024; 42:1378-1383. [PMID: 38168989 PMCID: PMC11392814 DOI: 10.1038/s41587-023-01983-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 02/17/2023] [Accepted: 09/08/2023] [Indexed: 01/05/2024]
Abstract
We introduce metaMDBG, a metagenomics assembler for PacBio HiFi reads. MetaMDBG combines a de Bruijn graph assembly in a minimizer space with an iterative assembly over sequences of minimizers to address variations in genome coverage depth and an abundance-based filtering strategy to simplify strain complexity. For complex communities, we obtained up to twice as many high-quality circularized prokaryotic metagenome-assembled genomes as existing methods and had better recovery of viruses and plasmids.
Affiliation(s)
- Gaëtan Benoit
- Organisms and Ecosystems, Earlham Institute, Norwich, UK
- Robert James
- Gut Microbes and Health, Quadram Institute, Norwich, UK
- Adam M Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
- Rayan Chikhi
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
- Christopher Quince
- Organisms and Ecosystems, Earlham Institute, Norwich, UK.
- Gut Microbes and Health, Quadram Institute, Norwich, UK.
- School of Biological Sciences, University of East Anglia, Norwich, UK.
- Warwick Medical School, University of Warwick, Coventry, UK.
5
Cook LSJ, Briscoe AG, Fonseca VG, Boenigk J, Woodward G, Bass D. Microbial, holobiont, and Tree of Life eDNA/eRNA for enhanced ecological assessment. Trends Microbiol 2024:S0966-842X(24)00173-2. [PMID: 39164135 DOI: 10.1016/j.tim.2024.07.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 07/07/2024] [Accepted: 07/08/2024] [Indexed: 08/22/2024]
Abstract
Microbial environmental DNA and RNA (collectively 'eNA') originate from a diverse and abundant array of microbes present in environmental samples. These eNA signals, largely representing whole organisms, serve as a powerful complement to signals derived from fragments or remnants of larger organisms. Integrating microbial data into the toolbox of ecosystem assessments and biotic indices therefore has the potential to transform how we use eNA data to understand biodiversity dynamics and ecosystem functions, and to inform the next generation of environmental monitoring. Incorporating holobiont and Tree of Life approaches into eNA analyses offers further holistic insight into the range of ecological interactions between microbes and other organisms, paving the way for advancing our understanding of, and ultimately manipulating ecosystem properties pertinent to environmental management, conservation, wildlife health, and food production.
Affiliation(s)
- Lauren S J Cook
- Centre for Environment, Fisheries and Aquaculture Science, Barrack Road, Weymouth, Dorset DT4 8UB, UK; Science, The Natural History Museum, Cromwell Road, London SW7 5BD, UK; Royal Holloway University of London, Egham, Surrey TW20 0EX, UK
- Andrew G Briscoe
- Science, The Natural History Museum, Cromwell Road, London SW7 5BD, UK; NatureMetrics, Surrey Research Park, Guildford GU2 7HJ, UK
- Vera G Fonseca
- Centre for Environment, Fisheries and Aquaculture Science, Barrack Road, Weymouth, Dorset DT4 8UB, UK
- Jens Boenigk
- Department of Biodiversity, University of Duisburg-Essen, 45141 Essen, Universitätsstraße 5, Germany
- Guy Woodward
- Georgina Mace Centre for the Living Planet, Department of Life Sciences, Imperial College London, Silwood Park Campus, Ascot, Berkshire SL5 7PY, UK
- David Bass
- Centre for Environment, Fisheries and Aquaculture Science, Barrack Road, Weymouth, Dorset DT4 8UB, UK; Science, The Natural History Museum, Cromwell Road, London SW7 5BD, UK; Biosciences, University of Exeter, Stocker Road, Exeter EX4 4QD, UK.
6
Shaw J, Yu YW. Fairy: fast approximate coverage for multi-sample metagenomic binning. Microbiome 2024; 12:151. [PMID: 39143609 PMCID: PMC11323348 DOI: 10.1186/s40168-024-01861-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2024] [Accepted: 06/20/2024] [Indexed: 08/16/2024]
Abstract
BACKGROUND: Metagenomic binning, the clustering of assembled contigs that belong to the same genome, is a crucial step for recovering metagenome-assembled genomes (MAGs). Contigs are linked by exploiting consistent signatures along a genome, such as read coverage patterns. Using coverage from multiple samples leads to higher-quality MAGs; however, standard pipelines require all-to-all read alignments for multiple samples to compute coverage, becoming a key computational bottleneck. RESULTS: We present fairy (https://github.com/bluenote-1577/fairy), an approximate coverage calculation method for metagenomic binning. Fairy is a fast k-mer-based alignment-free method. For multi-sample binning, fairy can be >250× faster than read alignment and is accurate enough for binning. Fairy is compatible with several existing binners on host and non-host-associated datasets. Using MetaBAT2, fairy recovers 98.5% of MAGs with >50% completeness and <5% contamination relative to alignment with BWA. Notably, multi-sample binning with fairy is always better than single-sample binning using BWA (>1.5× more >50% complete MAGs on average) while still being faster. For a public sediment metagenome project, we demonstrate that multi-sample binning recovers higher quality Asgard archaea MAGs than single-sample binning and that fairy's results are indistinguishable from read alignment. CONCLUSIONS: Fairy is a new tool for approximately and quickly calculating multi-sample coverage for binning, resolving a computational bottleneck for metagenomics.
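The MAG quality cut-off quoted in the abstract (>50% completeness and <5% contamination) and the "recovered relative to a baseline" comparison can be sketched as below. This is an illustrative calculation only, not part of fairy; the function names are hypothetical, and the sketch assumes each bin is already summarized as a (completeness, contamination) pair in percent, for example from a bin-quality tool.

```python
def passes_quality(completeness, contamination,
                   min_completeness=50.0, max_contamination=5.0):
    """True if a bin meets the >50% completeness and <5% contamination cut-off."""
    return completeness > min_completeness and contamination < max_contamination


def recovery_fraction(test_bins, baseline_bins):
    """Ratio of passing bins from one method to passing bins from a baseline method."""
    n_test = sum(passes_quality(c, x) for c, x in test_bins)
    n_base = sum(passes_quality(c, x) for c, x in baseline_bins)
    if n_base == 0:
        raise ValueError("baseline has no bins passing the quality cut-off")
    return n_test / n_base


# Hypothetical (completeness, contamination) pairs; not data from the paper.
approx_bins = [(90.0, 1.0), (40.0, 1.0)]
aligned_bins = [(90.0, 1.0), (60.0, 2.0)]
fraction = recovery_fraction(approx_bins, aligned_bins)
```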
Affiliation(s)
- Jim Shaw
- Department of Mathematics, University of Toronto, Toronto, Canada.
- Yun William Yu
- Department of Mathematics, University of Toronto, Toronto, Canada.
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, USA.
7
Mallawaarachchi V, Wickramarachchi A, Xue H, Papudeshi B, Grigson SR, Bouras G, Prahl RE, Kaphle A, Verich A, Talamantes-Becerra B, Dinsdale EA, Edwards RA. Solving genomic puzzles: computational methods for metagenomic binning. Brief Bioinform 2024; 25:bbae372. [PMID: 39082646 PMCID: PMC11289683 DOI: 10.1093/bib/bbae372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 06/05/2024] [Accepted: 07/15/2024] [Indexed: 08/03/2024] Open
Abstract
Metagenomics involves the study of genetic material obtained directly from communities of microorganisms living in natural environments. The field of metagenomics has provided valuable insights into the structure, diversity and ecology of microbial communities. Once an environmental sample is sequenced and processed, metagenomic binning clusters the sequences into bins representing different taxonomic groups such as species, genera, or higher levels. Several computational tools have been developed to automate the process of metagenomic binning. These tools have enabled the recovery of novel draft genomes of microorganisms allowing us to study their behaviors and functions within microbial communities. This review classifies and analyzes different approaches of metagenomic binning and different refinement, visualization, and evaluation techniques used by these methods. Furthermore, the review highlights the current challenges and areas of improvement present within the field of research.
Affiliation(s)
- Vijini Mallawaarachchi
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Anuradha Wickramarachchi
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Hansheng Xue
- School of Computing, National University of Singapore, Singapore 119077, Singapore
- Bhavya Papudeshi
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Susanna R Grigson
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- George Bouras
- Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, SA 5005, Australia
- The Department of Surgery—Otolaryngology Head and Neck Surgery, University of Adelaide and the Basil Hetzel Institute for Translational Health Research, Central Adelaide Local Health Network, Adelaide, SA 5011, Australia
- Rosa E Prahl
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Anubhav Kaphle
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Andrey Verich
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- The Kirby Institute, The University of New South Wales, Randwick, Sydney, NSW 2052, Australia
- Berenice Talamantes-Becerra
- Australian e-Health Research Centre, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Westmead, NSW 2145, Australia
- Elizabeth A Dinsdale
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
- Robert A Edwards
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, SA 5042, Australia
8
Curry KD, Yu FB, Vance SE, Segarra S, Bhaya D, Chikhi R, Rocha EPC, Treangen TJ. Reference-free structural variant detection in microbiomes via long-read co-assembly graphs. Bioinformatics 2024; 40:i58-i67. [PMID: 38940156 PMCID: PMC11211843 DOI: 10.1093/bioinformatics/btae224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024] Open
Abstract
MOTIVATION: The study of bacterial genome dynamics is vital for understanding the mechanisms underlying microbial adaptation, growth, and their impact on host phenotype. Structural variants (SVs), genomic alterations of 50 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) by encompassing all metagenomic samples in a series (time or other metric) into a single co-assembly graph. The log fold change in graph coverage between successive samples is then calculated to call SVs that are thriving or declining. RESULTS: We show that rhea outperforms existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes, particularly as the simulated reads diverge from reference genomes and an increase in strain diversity is incorporated. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between successive time and temperature samples, suggesting host advantage. Our approach leverages previous work in assembly graph structural and coverage patterns to provide versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial gene flux. AVAILABILITY AND IMPLEMENTATION: rhea is open source and available at: https://github.com/treangenlab/rhea.
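The log fold change in coverage between successive samples, which the abstract names as the signal for calling thriving or declining SVs, can be illustrated with a small sketch. This is not rhea's implementation: the base-2 logarithm, the pseudocount that guards against zero coverage, and the calling threshold are all assumptions made for illustration, and the function names are hypothetical.

```python
import math


def log_fold_changes(coverages, pseudocount=1.0):
    """Log2 fold change in coverage of one graph element between successive samples."""
    return [
        math.log2((later + pseudocount) / (earlier + pseudocount))
        for earlier, later in zip(coverages, coverages[1:])
    ]


def label_change(lfc, threshold=1.0):
    """Label a change as thriving, declining, or stable (threshold is an assumption)."""
    if lfc >= threshold:
        return "thriving"
    if lfc <= -threshold:
        return "declining"
    return "stable"


# Hypothetical coverage of one graph element across three successive samples.
series = [1.0, 3.0, 1.0]
changes = log_fold_changes(series)
labels = [label_change(x) for x in changes]
```

For a series of n samples the function returns n - 1 values, one per pair of successive samples.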
Affiliation(s)
- Kristen D Curry
- Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States
- Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France
- Summer E Vance
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, CA 94720, United States
- Santiago Segarra
- Department of Electrical and Computer Engineering, Rice University, Houston, TX 77005, United States
- Devaki Bhaya
- Carnegie Institution for Science, Department of Plant Biology, Stanford, CA 94305, United States
- Rayan Chikhi
- Department of Computational Biology, Institut Pasteur, Université Paris Cité, Paris 75015, France
- Eduardo P C Rocha
- Department of Genomes and Genetics, Microbial Evolutionary Genomics, Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Paris 75015, France
- Todd J Treangen
- Department of Computer Science, Rice University, 6100 Main St., Houston, TX 77005, United States
9
Cooley NP, Wright ES. Many purported pseudogenes in bacterial genomes are bona fide genes. BMC Genomics 2024; 25:365. [PMID: 38622536 PMCID: PMC11017572 DOI: 10.1186/s12864-024-10137-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 02/17/2024] [Indexed: 04/17/2024] Open
Abstract
BACKGROUND: Microbial genomes are largely composed of protein coding sequences, yet some genomes contain many pseudogenes caused by frameshifts or internal stop codons. These pseudogenes are believed to result from gene degradation during evolution but could also be technical artifacts of genome sequencing or assembly. RESULTS: Using a combination of observational and experimental data, we show that many putative pseudogenes are attributable to errors that are incorporated into genomes during assembly. Within 126,564 publicly available genomes, we observed that nearly identical genomes often substantially differed in pseudogene counts. Causal inference implicated assembler, sequencing platform, and coverage as likely causative factors. Reassembly of genomes from raw reads confirmed that each variable affects the number of putative pseudogenes in an assembly. Furthermore, simulated sequencing reads corroborated our observations that the quality and quantity of raw data can significantly impact the number of pseudogenes in an assembler-dependent fashion. The number of unexpected pseudogenes due to internal stops was highly correlated (R² = 0.96) with average nucleotide identity to the ground truth genome, implying relative pseudogene counts can be used as a proxy for overall assembly correctness. Applying our method to assemblies in RefSeq resulted in rejection of 3.6% of assemblies due to significantly elevated pseudogene counts. Reassembly from real reads obtained from high coverage genomes showed considerable variability in spurious pseudogenes beyond that observed with simulated reads, reinforcing the finding that high coverage is necessary to mitigate assembly errors. CONCLUSIONS: Collectively, these results demonstrate that many pseudogenes in microbial genome assemblies are actually genes. Our results suggest that high read coverage is required for correct assembly and that an inflated number of pseudogenes due to internal stops indicates poor overall assembly quality.
Affiliation(s)
- Nicholas P Cooley
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA
- Erik S Wright
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
- Center for Evolutionary Biology and Medicine, Pittsburgh, PA, USA.
10
Logares R. Decoding populations in the ocean microbiome. Microbiome 2024; 12:67. [PMID: 38561814 PMCID: PMC10983722 DOI: 10.1186/s40168-024-01778-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2023] [Accepted: 02/12/2024] [Indexed: 04/04/2024]
Abstract
Understanding the characteristics and structure of populations is fundamental to comprehending ecosystem processes and evolutionary adaptations. While the study of animal and plant populations has spanned a few centuries, microbial populations have been under scientific scrutiny for a considerably shorter period. In the ocean, analyzing the genetic composition of microbial populations and their adaptations to multiple niches can yield important insights into ecosystem function and the microbiome's response to global change. However, microbial populations have remained elusive to the scientific community due to the challenges associated with isolating microorganisms in the laboratory. Today, advancements in large-scale metagenomics and metatranscriptomics facilitate the investigation of populations from many uncultured microbial species directly from their habitats. The knowledge acquired thus far reveals substantial genetic diversity among various microbial species, showcasing distinct patterns of population differentiation and adaptations, and highlighting the significant role of selection in structuring populations. In the coming years, population genomics is expected to significantly increase our understanding of the architecture and functioning of the ocean microbiome, providing insights into its vulnerability or resilience in the face of ongoing global change.
Affiliation(s)
- Ramiro Logares
- Institute of Marine Sciences (ICM), CSIC, Barcelona, Catalonia, 08003, Spain.
11
Schiano-Lomoriello D, Abicca I, Contento L, Gabrielli F, Alfonsi C, Di Pietro F, Papa FT, Ballesteros-Sánchez A, Sánchez-González JM, Rocha-De-Lossada C, Mazzotta C, Giannaccare G, Bonzano C, Borroni D. Infectious Keratitis: Characterization of Microbial Diversity through Species Richness and Shannon Diversity Index. Biomolecules 2024; 14:389. [PMID: 38672407 PMCID: PMC11048652 DOI: 10.3390/biom14040389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2024] [Revised: 03/21/2024] [Accepted: 03/22/2024] [Indexed: 04/28/2024] Open
Abstract
Purpose: To characterize microbial keratitis diversity utilizing species richness and Shannon Diversity Index. Methods: Corneal impression membrane was used to collect samples. All swabs were processed and analyzed by Biolab Laboratory (level V-SSN Excellence: ISO 9001:2015), Biolab Srl (Ascoli Piceno, Italy). DNA extraction, library preparation, and sequencing were performed in all samples. After sequencing, low-quality and polyclonal sequences were filtered out by the Ion software. We then employed Kraken2 for microbial community analysis in keratitis samples. Nuclease-free water and all the reagents included in the experiment were used as a negative control. The primary outcome was the reduction in bacterial DNA (microbial load) at T1, expressed as a percentage of the baseline value (T0). Richness and Shannon alpha diversity metrics, along with Bray-Curtis beta diversity values, were calculated using the phyloseq package in R. Principal coordinate analysis was also conducted to interpret these metrics. Results: 19 samples were included in the study. Species richness varied widely, with the highest recorded value surpassing 800 species. Most of the samples displayed richness values ranging broadly from under 200 to around 600, indicating considerable variability in species count among the keratitis samples. Conclusions: Both typical and atypical bacterial phyla were significantly present in keratitis infections, underlining the complexity of the disease's microbial etiology.
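The three diversity metrics named in the abstract have standard definitions that a short sketch can make concrete. The study computed them with the phyloseq package in R; the Python functions below are an independent illustration of the textbook formulas, not the authors' pipeline, and the function names are hypothetical. Shannon diversity is taken with the natural logarithm, H = -Σ pᵢ ln pᵢ, which is an assumption about the log base.

```python
import math


def richness(counts):
    """Number of taxa with a non-zero count in one sample."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon diversity index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples over the same ordered taxa."""
    if len(a) != len(b):
        raise ValueError("both samples must list the same taxa in the same order")
    denominator = sum(a) + sum(b)
    if denominator == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / denominator


# Hypothetical read counts per taxon for two samples; not data from the study.
sample_1 = [10, 0, 5]
sample_2 = [0, 10, 5]
```

Richness and Shannon describe a single sample (alpha diversity), while Bray-Curtis compares two samples (beta diversity) and ranges from 0 for identical composition to 1 for no shared taxa.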
Affiliation(s)
- Irene Abicca
- I.R.C.C.S.-G.B. Bietti Foundation, 00198 Rome, Italy; (D.S.-L.); (I.A.); (L.C.)
- Laura Contento
- I.R.C.C.S.-G.B. Bietti Foundation, 00198 Rome, Italy; (D.S.-L.); (I.A.); (L.C.)
- Federico Gabrielli
- Biolab SRL, Laboratorio di Genetica e Genomica Molecolare, Largo degli Aranci, 9, 63100 Ascoli Piceno, Italy; (F.G.); (C.A.); (F.D.P.); (F.T.P.)
- Cinzia Alfonsi
- Biolab SRL, Laboratorio di Genetica e Genomica Molecolare, Largo degli Aranci, 9, 63100 Ascoli Piceno, Italy; (F.G.); (C.A.); (F.D.P.); (F.T.P.)
- Fabio Di Pietro
- Biolab SRL, Laboratorio di Genetica e Genomica Molecolare, Largo degli Aranci, 9, 63100 Ascoli Piceno, Italy; (F.G.); (C.A.); (F.D.P.); (F.T.P.)
- Filomena Tiziana Papa
- Biolab SRL, Laboratorio di Genetica e Genomica Molecolare, Largo degli Aranci, 9, 63100 Ascoli Piceno, Italy; (F.G.); (C.A.); (F.D.P.); (F.T.P.)
- Antonio Ballesteros-Sánchez
- Department of Physics of Condensed Matter, Optics Area, University of Seville, 41004 Seville, Spain; (A.B.-S.)
- Department of Ophthalmology, Clínica Novovisión, 30008 Murcia, Spain
- José-María Sánchez-González
- Department of Physics of Condensed Matter, Optics Area, University of Seville, 41004 Seville, Spain; (A.B.-S.)
- Carlos Rocha-De-Lossada
- Regional University Hospital of Malaga, Hospital Civil Square, 29009 Malaga, Spain;
- Department of Surgery, Ophthalmology Area, University of Seville, 41009 Seville, Spain
- Giuseppe Giannaccare
- Eye Clinic, Department of Surgical Sciences, University of Cagliari, 09121 Cagliari, Italy;
- Chiara Bonzano
- DiNOGMI, University of Genoa and IRCCS San Martino Polyclinic Hospital, 16132 Genoa, Italy;
| | - Davide Borroni
- Department of Ophthalmology, Riga Stradins University, LV-1007 Riga, Latvia
- Eyemetagenomics Ltd., 71-75, Shelton Street, Covent Garden, London WC2H 9JQ, UK
| |
Collapse
|
12
|
Curry KD, Yu FB, Vance SE, Segarra S, Bhaya D, Chikhi R, Rocha EP, Treangen TJ. Reference-free Structural Variant Detection in Microbiomes via Long-read Coassembly Graphs. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.25.577285. [PMID: 38352454 PMCID: PMC10862772 DOI: 10.1101/2024.01.25.577285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Bacterial genome dynamics are vital for understanding the mechanisms underlying microbial adaptation, growth, and their broader impact on host phenotype. Structural variants (SVs), genomic alterations of 10 base pairs or more, play a pivotal role in driving evolutionary processes and maintaining genomic heterogeneity within bacterial populations. While SV detection in isolate genomes is relatively straightforward, metagenomes present broader challenges due to the absence of clear reference genomes and the presence of mixed strains. In response, our proposed method, rhea, forgoes reference genomes and metagenome-assembled genomes (MAGs) and instead uses a single metagenome coassembly graph constructed from all samples in a series. The log fold change in graph coverage between subsequent samples is then calculated to call SVs that are thriving or declining throughout the series. We show that rhea outperforms existing methods for SV and horizontal gene transfer (HGT) detection in two simulated mock metagenomes; the advantage is particularly noticeable as the simulated reads diverge from reference genomes and strain diversity increases. We additionally demonstrate use cases for rhea on series metagenomic data of environmental and fermented food microbiomes to detect specific sequence alterations between subsequent time and temperature samples, suggesting host advantage. Our approach leverages raw read patterns rather than references or MAGs to include all sequencing reads in the analysis, and thus provides versatility in studying SVs across diverse and poorly characterized microbial communities for more comprehensive insights into microbial genome dynamics.
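The coverage comparison described in this abstract can be sketched in a few lines of Python. This is only an illustration of a log fold change between consecutive samples, not rhea's implementation: the node names, coverage values, pseudocount, base-2 logarithm, and threshold are all assumptions, and the graph structure itself is omitted.

```python
import math


def log_fold_changes(coverage, pseudocount=1.0):
    """Log2 fold change in coverage between consecutive samples, per graph node.

    coverage maps a node identifier to its coverage in each sample, in series
    order. The pseudocount (an assumption) avoids division by zero.
    """
    return {
        node: [
            math.log2((later + pseudocount) / (earlier + pseudocount))
            for earlier, later in zip(series, series[1:])
        ]
        for node, series in coverage.items()
    }


def call_changes(lfc, threshold=1.0):
    """Label each step as thriving, declining, or stable (hypothetical cutoff)."""
    def label(x):
        if x >= threshold:
            return "thriving"
        if x <= -threshold:
            return "declining"
        return "stable"

    return {node: [label(x) for x in changes] for node, changes in lfc.items()}


# Hypothetical coverage of three graph nodes across a three-sample series.
coverage = {
    "node_a": [3.0, 7.0, 15.0],
    "node_b": [15.0, 7.0, 3.0],
    "node_c": [7.0, 7.0, 7.0],
}

lfc = log_fold_changes(coverage)
print(lfc)
print(call_changes(lfc))
```

Working on ratios rather than raw differences makes a doubling from 3 to 7 (after the pseudocount) comparable to a doubling from 7 to 15, which is why a log scale suits series with uneven depth.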
Affiliation(s)
- Kristen D. Curry
  - Rice University, Department of Computer Science, Houston, TX 77005, United States
  - Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Microbial Evolutionary Genomics, 75015 Paris, France
- Summer E. Vance
  - University of California, Berkeley, Department of Environmental Science, Policy, and Management, Berkeley, CA 94720, United States
- Santiago Segarra
  - Rice University, Department of Electrical and Computer Engineering, Houston, TX 77005, United States
- Devaki Bhaya
  - Carnegie Institution for Science, Department of Plant Biology, Stanford, CA 94305, United States
- Rayan Chikhi
  - Institut Pasteur, Université Paris Cité, Sequence Bioinformatics unit, 75015 Paris, France
- Eduardo P.C. Rocha
  - Institut Pasteur, Université Paris Cité, CNRS, UMR3525, Microbial Evolutionary Genomics, 75015 Paris, France
- Todd J. Treangen
  - Rice University, Department of Computer Science, Houston, TX 77005, United States
|
13
|
Bucci L, Ghiotto G, Zampieri G, Raga R, Favaro L, Treu L, Campanaro S. Adaptation of Anaerobic Digestion Microbial Communities to High Ammonium Levels: Insights from Strain-Resolved Metagenomics. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2024; 58:580-590. [PMID: 38114447 PMCID: PMC10785762 DOI: 10.1021/acs.est.3c07737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/29/2023] [Accepted: 11/30/2023] [Indexed: 12/21/2023]
Abstract
Ammonia release from proteinaceous feedstocks represents the main inhibitor of the anaerobic digestion (AD) process, which can result in a decreased biomethane yield or even complete failure of the process. The present study focused on the adaptation of mesophilic AD communities to a stepwise increase in the concentration of ammonium chloride in synthetic medium with casein used as the carbon source. An adaptation process occurring over more than 20 months allowed batch reactors to reach up to 20 g of NH4+ N/L without collapsing into acidification or ceasing methane production. To decipher the microbial dynamics occurring during the adaptation and determine the genes most exposed to selective pressure, a combination of biochemical and metagenomics analyses was performed, reconstructing the strains of key species and tracking them over time. Subsequently, the adaptive metabolic mechanisms were delineated by following the single nucleotide variants (SNVs) characterizing the strains and prioritizing the associated genes according to their function. An in-depth exploration of the archaeon Methanoculleus bourgensis vb3066 and the putative syntrophic acetate-oxidizing bacterium Acetomicrobium sp. ma133 identified positively selected SNVs on genes involved in stress adaptation. The intraspecies diversity, with multiple coexisting strains in a temporal succession pattern, reveals an additional level of diversity within the microbial community beyond the species level.
Affiliation(s)
- Luca Bucci
  - Department of Biology (DIBIO), University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Gabriele Ghiotto
  - Department of Biology (DIBIO), University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Guido Zampieri
  - Department of Biology (DIBIO), University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Roberto Raga
  - Department of Civil, Environmental and Architectural Engineering (ICEA), University of Padova, Via Marzolo 9, 35131 Padova, Italy
- Lorenzo Favaro
  - Department of Agronomy Food Natural Resources Animals and Environment (DAFNAE), University of Padova, Campus Agripolis, Viale dell’Università 16, 35020 Legnaro, Italy
- Laura Treu
  - Department of Biology (DIBIO), University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Stefano Campanaro
  - Department of Biology (DIBIO), University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
|
14
|
Cerk K, Ugalde‐Salas P, Nedjad CG, Lecomte M, Muller C, Sherman DJ, Hildebrand F, Labarthe S, Frioux C. Community-scale models of microbiomes: Articulating metabolic modelling and metagenome sequencing. Microb Biotechnol 2024; 17:e14396. [PMID: 38243750 PMCID: PMC10832553 DOI: 10.1111/1751-7915.14396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2023] [Revised: 11/27/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024] Open
Abstract
Building models is essential for understanding the functions and dynamics of microbial communities. Metabolic models built on genome-scale metabolic network reconstructions (GENREs) are especially relevant as a means to decipher the complex interactions occurring among species. Model reconstruction increasingly relies on metagenomics, which permits direct characterisation of naturally occurring communities that may contain organisms that cannot be isolated or cultured. In this review, we provide an overview of the field of metabolic modelling and its increasing reliance on and synergy with metagenomics and bioinformatics. We survey the means of assigning functions and reconstructing metabolic networks from (meta-)genomes, and present the variety and mathematical fundamentals of metabolic models that foster the understanding of microbial dynamics. We emphasise the characterisation of interactions and the scaling of model construction to large communities, two important bottlenecks in the applicability of these models. We give an overview of the current state of the art in metagenome sequencing and bioinformatics analysis, focusing on the reconstruction of genomes in microbial communities. Metagenomics benefits tremendously from third-generation sequencing, and we discuss the opportunities of long-read sequencing, strain-level characterisation and eukaryotic metagenomics. We aim at providing algorithmic and mathematical support, together with tool and application resources, that permit bridging the gap between metagenomics and metabolic modelling.
Affiliation(s)
- Klara Cerk
  - Quadram Institute Bioscience, Norwich, UK
  - Earlham Institute, Norwich, UK
- Chabname Ghassemi Nedjad
  - Inria, University of Bordeaux, INRAE, Talence, France
  - University of Bordeaux, CNRS, Bordeaux INP, LaBRI, UMR 5800, Talence, France
- Maxime Lecomte
  - Inria, University of Bordeaux, INRAE, Talence, France
  - INRAE STLO, University of Rennes, Rennes, France
- Falk Hildebrand
  - Quadram Institute Bioscience, Norwich, UK
  - Earlham Institute, Norwich, UK
- Simon Labarthe
  - Inria, University of Bordeaux, INRAE, Talence, France
  - INRAE, University of Bordeaux, BIOGECO, UMR 1202, Cestas, France
|
15
|
Ghiotto G, Zampieri G, Campanaro S, Treu L. Strain-resolved metagenomics approaches applied to biogas upgrading. ENVIRONMENTAL RESEARCH 2024; 240:117414. [PMID: 37852461 DOI: 10.1016/j.envres.2023.117414] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/01/2023] [Revised: 10/09/2023] [Accepted: 10/13/2023] [Indexed: 10/20/2023]
Abstract
Genetic heterogeneity is a common trait in microbial populations, caused by de novo mutations and changes in variant frequencies over time. Microbes can thus differ genetically within the same species and acquire different phenotypes. For instance, performance and stability of anaerobic reactors are linked to the composition of the microbiome involved in the digestion process and to the environmental parameters imposing selective pressure on the metagenome, shaping its evolution. Changes at the strain level have the potential to determine variations in microbial functions, and their characterization could provide new insight into ecological and evolutionary processes driving anaerobic digestion. In this work, single nucleotide variant dynamics were studied in two time-course biogas upgrading experiments, testing alternative carbon sources and the response to exogenous hydrogen addition. A cumulative total of 76,229 and 64,289 high-confidence single nucleotide variants were discerned in the experiments related to carbon substrate availability and hydrogen addition, respectively. By combining complementary bioinformatic approaches, the study reconstructed the precise strain count (two for both hydrogenotrophic archaea) and tracked their abundance over time, while also characterizing tens of genes under strong selection. Results in the dominant archaea revealed the presence of nearly 100 variants within genes encoding enzymes involved in hydrogenotrophic methanogenesis. In the bacterial counterparts, 119 mutations were identified across 23 genes associated with the Wood-Ljungdahl pathway, suggesting a possible impact on the syntrophic acetate-oxidation process. Strain replacement events took place in both experiments, confirming the trends suggested by the variant trajectories and providing a comprehensive understanding of the biogas upgrading microbiome at the strain level. Overall, this resolution level allowed us to reveal fine-scale evolutionary mechanisms, functional dynamics, and strain-level metabolic variation that could contribute to the selection of key species actively involved in the carbon dioxide fixation process.
Affiliation(s)
- Gabriele Ghiotto
  - Department of Biology, University of Padua, Via U. Bassi 58/b, 35131, Padova, Italy
- Guido Zampieri
  - Department of Biology, University of Padua, Via U. Bassi 58/b, 35131, Padova, Italy
- Stefano Campanaro
  - Department of Biology, University of Padua, Via U. Bassi 58/b, 35131, Padova, Italy
- Laura Treu
  - Department of Biology, University of Padua, Via U. Bassi 58/b, 35131, Padova, Italy
|
16
|
Fu S, Zhang Y, Wang R, Qiu Z, Song W, Yang Q, Shen L. A novel culture-enriched metagenomic sequencing strategy effectively guarantee the microbial safety of drinking water by uncovering the low abundance pathogens. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2023; 345:118737. [PMID: 37657296 DOI: 10.1016/j.jenvman.2023.118737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 07/21/2023] [Accepted: 07/31/2023] [Indexed: 09/03/2023]
Abstract
Assessing the presence of waterborne pathogens and antibiotic resistance genes (ARGs) is crucial for managing the environmental quality of drinking water sources. However, detecting low abundance pathogens in such settings is challenging. In this study, a workflow was developed to enrich for broad spectrum pathogens from drinking water samples. A mock community was used to evaluate the effectiveness of various enrichment broths in detecting low-abundance pathogens. Monthly metagenomic surveillance was conducted in a drinking water source from May to September 2021, and water samples were subjected to five enrichment procedures for 6 h to recover the majority of waterborne bacterial pathogens. Oxford Nanopore Technology (ONT) was used for metagenomic sequencing of enriched samples to obtain high-quality pathogen genomes. The results showed that selective enrichment significantly increased the proportions of targeted bacterial pathogens. Compared to direct metagenomic sequencing of untreated water samples, targeted enrichment followed by ONT sequencing significantly improved the detection of waterborne pathogens and the quality of metagenome-assembled genomes (MAGs). Eighty-six high-quality MAGs, including 70 pathogen MAGs, were obtained from ONT sequencing, while only 12 MAGs representing 10 species were obtained from direct metagenomic sequencing of untreated water samples. In addition, ONT sequencing improved the recovery of mobile genetic elements and the accuracy of phylogenetic analysis. This study highlights the urgent need for efficient methodologies to detect and manage microbial risks in drinking water sources. The developed workflow provides a cost-effective approach for environmental management of drinking water sources with microbial risks. The study also uncovered pathogens that were not detected by traditional methods, thereby advancing microbial risk management of drinking water sources.
Affiliation(s)
- Songzhe Fu
  - Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, 710069, China; Key Laboratory of Environment Controlled Aquaculture (Dalian Ocean University), Ministry of Education, 116023, China
- Yixiang Zhang
  - CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences, Shanghai, China; University of Chinese Academy of Sciences, Shanghai, China
- Rui Wang
  - Key Laboratory of Environment Controlled Aquaculture (Dalian Ocean University), Ministry of Education, 116023, China
- Zhiguang Qiu
  - School of Environment and Energy, Shenzhen Graduate School, Peking University, Shenzhen, 518055, China
- Weizhi Song
  - School of Life Sciences and State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, SAR, Hong Kong, China
- Qian Yang
  - Center for Microbial Ecology and Technology, Ghent University, Ghent, Belgium
- Lixin Shen
  - Key Laboratory of Resource Biology and Biotechnology in Western China, Ministry of Education, Northwest University, Xi'an, 710069, China
|
17
|
Ventolero M, Wang S, Hu H, Li X. Are the predicted known bacterial strains in a sample really present? A case study. PLoS One 2023; 18:e0291964. [PMID: 37831725 PMCID: PMC10575510 DOI: 10.1371/journal.pone.0291964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Accepted: 09/10/2023] [Indexed: 10/15/2023] Open
Abstract
With mutations constantly accumulating in bacterial genomes, it is unclear whether the previously identified bacterial strains are really present in an extant sample. To address this question, we conducted a case study on the known strains of the bacterial species S. aureus and S. epidermidis in 68 atopic dermatitis shotgun metagenomic samples. We evaluated the likelihood of the presence of all sixteen known strains predicted in the original study and by two popular tools in this study. We found that even with the same tool, only two known strains were predicted by both the original study and this study. Moreover, none of the sixteen known strains was likely present in these 68 samples. Our study thus indicates the limitation of known-strain-based studies, especially those on rapidly evolving bacterial species. It implies the unlikely presence of the previously identified known strains in a current environmental sample. It also calls for de novo bacterial strain identification directly from shotgun metagenomic reads.
Affiliation(s)
- Minerva Ventolero
  - Burnett School of Biomedical Science, College of Medicine, University of Central Florida, Orlando, Florida, United States of America
- Saidi Wang
  - Department of Computer Science, University of Central Florida, Orlando, Florida, United States of America
- Haiyan Hu
  - Department of Computer Science, Genomics and Bioinformatics Cluster, University of Central Florida, Orlando, Florida, United States of America
- Xiaoman Li
  - Burnett School of Biomedical Science, College of Medicine, University of Central Florida, Orlando, Florida, United States of America
|
18
|
Zhu X, Zhao L, Huang L, Yang W, Wang L, Yu R. cgMSI: pathogen detection within species from nanopore metagenomic sequencing data. BMC Bioinformatics 2023; 24:387. [PMID: 37821827 PMCID: PMC10568937 DOI: 10.1186/s12859-023-05512-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 10/02/2023] [Indexed: 10/13/2023] Open
Abstract
BACKGROUND: Metagenomic sequencing is an unbiased approach that can potentially detect all known and unidentified strains in pathogen detection. Recently, nanopore sequencing has been emerging as a highly promising tool for rapid pathogen detection due to its fast turnaround time. However, identifying pathogens within species is nontrivial for nanopore sequencing data due to the high sequencing error rate. RESULTS: We developed the core gene alleles metagenome strain identification (cgMSI) tool, which uses a two-stage maximum a posteriori probability estimation method to detect pathogens at strain level from nanopore metagenomic sequencing data at low computational cost. The cgMSI tool can accurately identify strains and estimate relative abundance at 1× coverage. CONCLUSIONS: We developed cgMSI for nanopore metagenomic pathogen detection within species. cgMSI is available at https://github.com/ZHU-XU-xmu/cgMSI .
Affiliation(s)
- Xu Zhu
  - School of Informatics, Xiamen University, Xiamen, Fujian, China
- Lili Zhao
  - Women and Children's Hospital, School of Medicine, Xiamen University, Xiamen, Fujian, China
- Lihong Huang
  - Computer Management Center, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Liansheng Wang
  - School of Informatics, Xiamen University, Xiamen, Fujian, China
  - National Institute for Data Science in Health and Medicine, Informatics, Xiamen University, Xiamen, Fujian, China
- Rongshan Yu
  - School of Informatics, Xiamen University, Xiamen, Fujian, China
  - National Institute for Data Science in Health and Medicine, Informatics, Xiamen University, Xiamen, Fujian, China
|
19
|
Mallawaarachchi V, Roach MJ, Decewicz P, Papudeshi B, Giles SK, Grigson SR, Bouras G, Hesse RD, Inglis LK, Hutton ALK, Dinsdale EA, Edwards RA. Phables: from fragmented assemblies to high-quality bacteriophage genomes. Bioinformatics 2023; 39:btad586. [PMID: 37738590 PMCID: PMC10563150 DOI: 10.1093/bioinformatics/btad586] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/04/2023] [Revised: 07/14/2023] [Accepted: 09/19/2023] [Indexed: 09/24/2023] Open
Abstract
MOTIVATION: Microbial communities have a profound impact on both human health and various environments. Viruses infecting bacteria, known as bacteriophages or phages, play a key role in modulating bacterial communities within environments. High-quality phage genome sequences are essential for advancing our understanding of phage biology, enabling comparative genomics studies and developing phage-based diagnostic tools. Most available viral identification tools consider individual sequences to determine whether they are of viral origin. As a result of challenges in viral assembly, fragmentation of genomes can occur, and existing tools may recover incomplete genome fragments. Therefore, the identification and characterization of novel phage genomes remain a challenge, leading to the need for improved approaches for phage genome recovery. RESULTS: We introduce Phables, a new computational method to resolve phage genomes from fragmented viral metagenome assemblies. Phables identifies phage-like components in the assembly graph, models each component as a flow network, and uses graph algorithms and flow decomposition techniques to identify genomic paths. Experimental results of viral metagenomic samples obtained from different environments show that Phables recovers on average over 49% more high-quality phage genomes compared to existing viral identification tools. Furthermore, Phables can resolve variant phage genomes with over 99% average nucleotide identity, a distinction that existing tools are unable to make. AVAILABILITY AND IMPLEMENTATION: Phables is available on GitHub at https://github.com/Vini2/phables.
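The idea of decomposing a flow network into genomic paths can be illustrated with a small Python sketch. This is a simple greedy decomposition used as a stand-in, not the algorithm Phables implements; the graph, node names, and flow values are hypothetical, and the code assumes an acyclic graph whose flow is conserved at every internal node.

```python
def decompose_flow(flow, source, sink):
    """Greedily split an edge flow into weighted source-to-sink paths.

    flow maps a directed edge (u, v) to a non-negative flow value. Assumes the
    graph is acyclic and that flow is conserved at every internal node.
    """
    if source == sink:
        raise ValueError("source and sink must differ")
    remaining = dict(flow)
    paths = []
    while True:
        path = [source]
        node = source
        # Walk along edges that still carry flow until the sink is reached.
        while node != sink:
            nxt = next(
                (v for (u, v), f in remaining.items() if u == node and f > 0),
                None,
            )
            if nxt is None:
                break
            path.append(nxt)
            node = nxt
        if node != sink:
            break  # no flow left to route
        edges = list(zip(path, path[1:]))
        bottleneck = min(remaining[e] for e in edges)
        for e in edges:
            remaining[e] -= bottleneck
        paths.append((path, bottleneck))
    return paths


# Hypothetical component: a shared segment "a" followed by two variant
# branches "b" and "c", with flow standing in for read coverage.
flow = {
    ("s", "a"): 5,
    ("a", "b"): 3,
    ("a", "c"): 2,
    ("b", "t"): 3,
    ("c", "t"): 2,
}

for path, weight in decompose_flow(flow, "s", "t"):
    print(path, weight)
```

In this toy case the two recovered paths share the segment "a" but differ in one branch, which is the kind of situation where treating each sequence in isolation would yield fragments rather than two full-length variants.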
Affiliation(s)
- Vijini Mallawaarachchi
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Michael J Roach
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Przemyslaw Decewicz
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
  - Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw 02-096, Poland
- Bhavya Papudeshi
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Sarah K Giles
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Susanna R Grigson
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- George Bouras
  - Adelaide Medical School, Faculty of Health and Medical Sciences, The University of Adelaide, Adelaide, South Australia 5005, Australia
  - The Department of Surgery - Otolaryngology Head and Neck Surgery, Central Adelaide Local Health Network, Adelaide, South Australia 5000, Australia
- Ryan D Hesse
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Laura K Inglis
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Abbey L K Hutton
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Elizabeth A Dinsdale
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
- Robert A Edwards
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, South Australia 5042, Australia
|
20
|
Puente-Sánchez F, Hoetzinger M, Buck M, Bertilsson S. Exploring environmental intra-species diversity through non-redundant pangenome assemblies. Mol Ecol Resour 2023; 23:1724-1736. [PMID: 37382302 DOI: 10.1111/1755-0998.13826] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/07/2022] [Revised: 05/24/2023] [Accepted: 06/15/2023] [Indexed: 06/30/2023]
Abstract
At the genome level, microorganisms are highly adaptable both in terms of allele and gene composition. Such heritable traits emerge in response to different environmental niches and can have a profound influence on microbial community dynamics. As a consequence, any individual genome or population will contain merely a fraction of the total genetic diversity of any operationally defined "species", whose ecological potential can thus be fully understood only by studying all of their genomes and the genes therein. This concept, known as the pangenome, is valuable for studying microbial ecology and evolution, as it partitions genomes into core regions (present in all the genomes from a species, and responsible for housekeeping and species-level niche adaptation, among others) and accessory regions (present only in some, and responsible for intra-species differentiation). Here we present SuperPang, an algorithm producing pangenome assemblies from a set of input genomes of varying quality, including metagenome-assembled genomes (MAGs). SuperPang runs in linear time and its results are complete, non-redundant, preserve gene ordering and contain both coding and non-coding regions. Our approach provides a modular view of the pangenome, identifying operons and genomic islands, and allowing their prevalence to be tracked in different populations. We illustrate this by analysing intra-species diversity in Polynucleobacter, a bacterial genus ubiquitous in freshwater ecosystems, characterized by streamlined genomes and ecological versatility. We show how SuperPang facilitates the simultaneous analysis of allelic and gene content variation under different environmental pressures, allowing us to study the drivers of microbial diversification at unprecedented resolution.
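The core versus accessory distinction described in this abstract can be shown with a short Python sketch at the level of gene sets. SuperPang itself works on genome sequences and is not reproduced here; the genome names and gene identifiers below are hypothetical, and the strict "present in all genomes" rule is an assumption that would need relaxing for incomplete MAGs.

```python
def partition_pangenome(genomes):
    """Split the pangenome into core and accessory gene sets.

    genomes maps a genome name to the gene (family) identifiers it contains.
    Core genes occur in every genome; accessory genes occur in only some.
    """
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan - core


def prevalence(genomes):
    """Fraction of genomes in which each gene occurs."""
    n = len(genomes)
    counts = {}
    for genes in genomes.values():
        for gene in set(genes):
            counts[gene] = counts.get(gene, 0) + 1
    return {gene: count / n for gene, count in counts.items()}


# Hypothetical gene content of three genomes from one species.
genomes = {
    "genome_1": {"dnaA", "gyrB", "opsin"},
    "genome_2": {"dnaA", "gyrB"},
    "genome_3": {"dnaA", "gyrB", "nifH"},
}

core, accessory = partition_pangenome(genomes)
print(sorted(core))
print(sorted(accessory))
print(prevalence(genomes))
```

Because MAGs are often incomplete, a gene missing from one genome may reflect a gap in the assembly rather than true absence, so a prevalence cutoff below 100% is a common practical alternative to the strict intersection used here.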
Affiliation(s)
- Fernando Puente-Sánchez
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Matthias Hoetzinger
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Moritz Buck
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
- Stefan Bertilsson
  - Department of Aquatic Sciences and Assessment, Swedish University of Agricultural Sciences, Uppsala, Sweden
|
21
|
Seong HJ, Kim JJ, Sul WJ. ACR: metagenome-assembled prokaryotic and eukaryotic genome refinement tool. Brief Bioinform 2023; 24:bbad381. [PMID: 37889119 DOI: 10.1093/bib/bbad381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 09/16/2023] [Accepted: 10/03/2023] [Indexed: 10/28/2023] Open
Abstract
Microbial genome recovery from metagenomes can further explain microbial ecosystem structures, functions and dynamics. Thus, this study developed the Additional Clustering Refiner (ACR) to enhance the recovery of high-purity prokaryotic and eukaryotic metagenome-assembled genomes (MAGs). ACR refines low-quality MAGs by subjecting them to iterative k-means clustering based on contig abundance and by increasing bin purity through validated universal marker genes. ACR's effectiveness was evaluated on synthetic and real-world metagenomic datasets, including short- and long-read sequences. The results demonstrated improved MAG purity and a significant increase in high- and medium-quality MAG recovery rates. In addition, ACR integrates with various binning algorithms, augmenting their strengths without modifying core features. Furthermore, its compatibility with multiple sequencing technologies expands its applicability. By efficiently recovering high-quality prokaryotic and eukaryotic genomes, ACR is a promising tool for deepening our understanding of microbial communities through genome-centric metagenomics.
Affiliation(s)
- Hoon Je Seong
  - Korean Medicine Data Division, Korea Institute of Oriental Medicine, Daejeon, Republic of Korea
- Jin Ju Kim
  - Department of Systems Biotechnology, Chung-Ang University, Anseong, Republic of Korea
- Woo Jun Sul
  - Department of Systems Biotechnology, Chung-Ang University, Anseong, Republic of Korea
|
22
|
Mallawaarachchi V, Roach MJ, Decewicz P, Papudeshi B, Giles SK, Grigson SR, Bouras G, Hesse RD, Inglis LK, Hutton ALK, Dinsdale EA, Edwards RA. Phables: from fragmented assemblies to high-quality bacteriophage genomes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.04.04.535632. [PMID: 37066369 PMCID: PMC10104058 DOI: 10.1101/2023.04.04.535632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2023]
Abstract
Microbial communities influence both human health and different environments. Viruses infecting bacteria, known as bacteriophages or phages, play a key role in modulating bacterial communities within environments. High-quality phage genome sequences are essential for advancing our understanding of phage biology, enabling comparative genomics studies, and developing phage-based diagnostic tools. Most available viral identification tools consider individual sequences to determine whether they are of viral origin. As a result of the challenges in viral assembly, fragmentation of genomes can occur, leading to the need for new approaches in viral identification. Therefore, the identification and characterisation of novel phages remain a challenge. We introduce Phables, a new computational method to resolve phage genomes from fragmented viral metagenome assemblies. Phables identifies phage-like components in the assembly graph, models each component as a flow network, and uses graph algorithms and flow decomposition techniques to identify genomic paths. Experimental results of viral metagenomic samples obtained from different environments show that Phables recovers on average over 49% more high-quality phage genomes compared to existing viral identification tools. Furthermore, Phables can resolve variant phage genomes with over 99% average nucleotide identity, a distinction that existing tools are unable to make. Phables is available on GitHub at https://github.com/Vini2/phables.
Affiliation(s)
- Vijini Mallawaarachchi
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Michael J Roach
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Przemyslaw Decewicz
- Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, Warsaw 02-096, Poland
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Bhavya Papudeshi
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Sarah K Giles
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Susanna R Grigson
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - George Bouras
- Adelaide Medical School, The University of Adelaide, North Tce, Adelaide, SA, 5000, Australia
| | - Ryan D Hesse
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Laura K Inglis
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Abbey L K Hutton
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Elizabeth A Dinsdale
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
| | - Robert A Edwards
- Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Bedford Park, Adelaide, SA, 5042, Australia
|
23
|
Breusing C, Xiao Y, Russell SL, Corbett-Detig RB, Li S, Sun J, Chen C, Lan Y, Qian PY, Beinart RA. Ecological differences among hydrothermal vent symbioses may drive contrasting patterns of symbiont population differentiation. mSystems 2023; 8:e0028423. [PMID: 37493648 PMCID: PMC10469979 DOI: 10.1128/msystems.00284-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 06/13/2023] [Indexed: 07/27/2023] Open
Abstract
The intra-host composition of horizontally transmitted microbial symbionts can vary across host populations due to interactive effects of host genetics, environmental, and geographic factors. While adaptation to local habitat conditions can drive geographic subdivision of symbiont strains, it is unknown how differences in ecological characteristics among host-symbiont associations influence the genomic structure of symbiont populations. To address this question, we sequenced metagenomes of different populations of the deep-sea mussel Bathymodiolus septemdierum, which are common at Western Pacific deep-sea hydrothermal vents and show characteristic patterns of niche partitioning with sympatric gastropod symbioses. Bathymodiolus septemdierum lives in close symbiotic relationship with sulfur-oxidizing chemosynthetic bacteria but supplements its symbiotrophic diet through filter-feeding, enabling it to occupy ecological niches with little exposure to geochemical reductants. Our analyses indicate that symbiont populations associated with B. septemdierum show structuring by geographic location, but that the dominant symbiont strain is uncorrelated with vent site. These patterns are in contrast to co-occurring Alviniconcha and Ifremeria gastropod symbioses that exhibit greater symbiont nutritional dependence and occupy habitats with higher spatial variability in environmental conditions. Our results suggest that relative habitat homogeneity combined with sufficient symbiont dispersal and genomic mixing might promote persistence of similar symbiont strains across geographic locations, while mixotrophy might decrease selective pressures on the host to affiliate with locally adapted symbiont strains. Overall, these data contribute to our understanding of the potential mechanisms influencing symbiont population structure across a spectrum of marine microbial symbioses that occupy contrasting ecological niches. 
IMPORTANCE Beneficial relationships between animals and microbial organisms (symbionts) are ubiquitous in nature. In the ocean, microbial symbionts are typically acquired from the environment and their composition across geographic locations is often shaped by adaptation to local habitat conditions. However, it is currently unknown how generalizable these patterns are across symbiotic systems that have contrasting ecological characteristics. To address this question, we compared symbiont population structure between deep-sea hydrothermal vent mussels and co-occurring but ecologically distinct snail species. Our analyses show that mussel symbiont populations are less partitioned by geography and do not demonstrate evidence for environmental adaptation. We posit that the mussel's mixotrophic feeding mode may lower its need to affiliate with locally adapted symbiont strains, while microhabitat stability and symbiont genomic mixing likely favors persistence of symbiont strains across geographic locations. Altogether, these findings further our understanding of the mechanisms shaping symbiont population structure in marine environmentally transmitted symbioses.
Affiliation(s)
- Corinna Breusing
- Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, USA
| | - Yao Xiao
- Department of Ocean Science, Division of Life Science and Hong Kong Branch of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), The Hong Kong University of Science and Technology, Hong Kong, China
- The Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Nansha, Guangzhou, China
| | - Shelbi L. Russell
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, USA
| | - Russell B. Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, California, USA
| | - Sixuan Li
- Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, USA
| | - Jin Sun
- Institute of Evolution & Marine Biodiversity, Ocean University of China, Qingdao, China
| | - Chong Chen
- X-STAR, Japan Agency for Marine-Earth Science and Technology (JAMSTEC), Yokosuka, Japan
| | - Yi Lan
- Department of Ocean Science, Division of Life Science and Hong Kong Branch of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), The Hong Kong University of Science and Technology, Hong Kong, China
- The Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Nansha, Guangzhou, China
| | - Pei-Yuan Qian
- Department of Ocean Science, Division of Life Science and Hong Kong Branch of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), The Hong Kong University of Science and Technology, Hong Kong, China
- The Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), Nansha, Guangzhou, China
| | - Roxanne A. Beinart
- Graduate School of Oceanography, University of Rhode Island, Narragansett, Rhode Island, USA
|
24
|
Benoit G, Raguideau S, James R, Phillippy AM, Chikhi R, Quince C. Efficient High-Quality Metagenome Assembly from Long Accurate Reads using Minimizer-space de Bruijn Graphs. bioRxiv 2023:2023.07.07.548136. [PMID: 37786716 PMCID: PMC10541625 DOI: 10.1101/2023.07.07.548136] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 10/04/2023]
Abstract
We introduce a novel metagenomics assembler for high-accuracy long reads. Our approach, implemented as metaMDBG, combines highly efficient de Bruijn graph assembly in minimizer space, with both a multi-k' approach for dealing with variations in genome coverage depth and an abundance-based filtering strategy for simplifying strain complexity. The resulting algorithm is more efficient than the state-of-the-art but with better assembly results. metaMDBG was 1.5 to 12 times faster than competing assemblers and requires between one-tenth and one-thirtieth of the memory across a range of data sets. We obtained up to twice as many high-quality circularised prokaryotic metagenome assembled genomes (MAGs) on the most complex communities, and a better recovery of viruses and plasmids. metaMDBG performs particularly well for abundant organisms whilst being robust to the presence of strain diversity. The result is that for the first time it is possible to efficiently reconstruct the majority of complex communities by abundance as near-complete MAGs.
Affiliation(s)
- Gaëtan Benoit
- Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK
| | - Robert James
- Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK
| | - Adam M. Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| | - Rayan Chikhi
- Sequence Bioinformatics, Department of Computational Biology, Institut Pasteur, Paris, France
| | - Christopher Quince
- Organisms and Ecosystems, Earlham Institute, Norwich, NR4 7UZ, UK
- Gut Microbes and Health, Quadram Institute, Norwich, NR4 7UQ, UK
- Warwick Medical School, University of Warwick, Coventry, CV4 7AL, UK
|
25
|
Arnau V, Díaz-Villanueva W, Mifsut Benet J, Villasante P, Beamud B, Mompó P, Sanjuan R, González-Candelas F, Domingo-Calap P, Džunková M. Inference of the Life Cycle of Environmental Phages from Genomic Signature Distances to Their Hosts. Viruses 2023; 15:v15051196. [PMID: 37243281 DOI: 10.3390/v15051196] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/14/2023] [Revised: 05/17/2023] [Accepted: 05/17/2023] [Indexed: 05/28/2023] Open
Abstract
The environmental impact of uncultured phages is shaped by their preferred life cycle (lytic or lysogenic). However, our ability to predict it is very limited. We aimed to discriminate between lytic and lysogenic phages by comparing the similarity of their genomic signatures to those of their hosts, reflecting their co-evolution. We tested two approaches: (1) similarities of tetramer relative frequencies, (2) alignment-free comparisons based on exact k = 14 oligonucleotide matches. First, we explored 5126 reference bacterial host strains and 284 associated phages and found an approximate threshold for distinguishing lysogenic and lytic phages using both oligonucleotide-based methods. The analysis of 6482 plasmids revealed the potential for horizontal gene transfer between different host genera and, in some cases, distant bacterial taxa. Subsequently, we experimentally analyzed combinations of 138 Klebsiella pneumoniae strains and their 41 phages and found that the phages with the largest number of interactions with these strains in the laboratory had the shortest genomic distances to K. pneumoniae. We then applied our methods to 24 single-cells from a hot spring biofilm containing 41 uncultured phage-host pairs, and the results were compatible with the lysogenic life cycle of phages detected in this environment. In conclusion, oligonucleotide-based genome analysis methods can be used for predictions of (1) life cycles of environmental phages, (2) phages with the broadest host range in culture collections, and (3) potential horizontal gene transfer by plasmids.
Affiliation(s)
- Vicente Arnau
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
| | - Wladimiro Díaz-Villanueva
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
| | - Jorge Mifsut Benet
- Department of Space, Earth and Environment, Chalmers University of Technology, 41296 Gothenburg, Sweden
| | - Beatriz Beamud
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
| | - Paula Mompó
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
| | - Rafael Sanjuan
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
| | - Fernando González-Candelas
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
- Foundation for the Promotion of Sanitary and Biomedical Research of the Valencian Community (FISABIO), 46020 Valencia, Spain
- CIBER in Epidemiology and Public Health (CIBEResp), 28029 Madrid, Spain
| | - Pilar Domingo-Calap
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
| | - Mária Džunková
- Institute for Integrative Systems Biology, University of Valencia and Consejo Superior de Investigaciones Científicas (CSIC), 46980 Valencia, Spain
|
26
|
Wolff R, Shoemaker W, Garud N. Ecological Stability Emerges at the Level of Strains in the Human Gut Microbiome. mBio 2023; 14:e0250222. [PMID: 36809109 PMCID: PMC10127601 DOI: 10.1128/mbio.02502-22] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Accepted: 01/13/2023] [Indexed: 02/23/2023] Open
Abstract
The human gut microbiome harbors substantial ecological diversity at the species level as well as at the strain level within species. In healthy hosts, species abundance fluctuations in the microbiome are thought to be stable, and these fluctuations can be described by macroecological laws. However, it is less clear how strain abundances change over time. An open question is whether individual strains behave like species themselves, exhibiting stability and following the macroecological relationships known to hold at the species level, or whether strains have different dynamics, perhaps due to the relatively close phylogenetic relatedness of cocolonizing lineages. Here, we analyze the daily dynamics of intraspecific genetic variation in the gut microbiomes of four healthy, densely longitudinally sampled hosts. First, we find that the overall genetic diversity of a large majority of species is stationary over time despite short-term fluctuations. Next, we show that fluctuations in abundances in approximately 80% of strains analyzed can be predicted with a stochastic logistic model (SLM), an ecological model of a population experiencing environmental fluctuations around a fixed carrying capacity, which has previously been shown to capture statistical properties of species abundance fluctuations. The success of this model indicates that strain abundances typically fluctuate around a fixed carrying capacity, suggesting that most strains are dynamically stable. Finally, we find that the strain abundances follow several empirical macroecological laws known to hold at the species level. Together, our results suggest that macroecological properties of the human gut microbiome, including its stability, emerge at the level of strains.
IMPORTANCE To date, there has been an intense focus on the ecological dynamics of the human gut microbiome at the species level. However, there is considerable genetic diversity within species at the strain level, and these intraspecific differences can have important phenotypic effects on the host, impacting the ability to digest certain foods and metabolize drugs. Thus, to fully understand how the gut microbiome operates in times of health and sickness, its ecological dynamics may need to be quantified at the level of strains. Here, we show that a large majority of strains maintain stable abundances for periods of months to years, exhibiting fluctuations in abundance that can be well described by macroecological laws known to hold at the species level, while a smaller percentage of strains undergo rapid, directional changes in abundance. Overall, our work indicates that strains are an important unit of ecological organization in the human gut microbiome.
Affiliation(s)
- Richard Wolff
- Department of Ecology and Evolutionary Biology, UCLA, Los Angeles, California, USA
| | - William Shoemaker
- Department of Ecology and Evolutionary Biology, UCLA, Los Angeles, California, USA
| | - Nandita Garud
- Department of Ecology and Evolutionary Biology, UCLA, Los Angeles, California, USA
- Department of Human Genetics, UCLA, Los Angeles, California, USA
|
27
|
Anderson BD, Bisanz JE. Challenges and opportunities of strain diversity in gut microbiome research. Front Microbiol 2023; 14:1117122. [PMID: 36876113 PMCID: PMC9981649 DOI: 10.3389/fmicb.2023.1117122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Accepted: 01/24/2023] [Indexed: 02/19/2023] Open
Abstract
Just because two things are related does not mean they are the same. In analyzing microbiome data, we are often limited to species-level analyses, and even with the ability to resolve strains, we lack comprehensive databases and understanding of the importance of strain-level variation outside of a limited number of model organisms. The bacterial genome is highly plastic, with gene gain and loss occurring at rates comparable to or higher than those of de novo mutations. As such, the conserved portion of the genome is often a fraction of the pangenome, which gives rise to significant phenotypic variation, particularly in traits that are important in host-microbe interactions. In this review, we discuss the mechanisms that give rise to strain variation and methods that can be used to study it. We identify that while strain diversity can act as a major barrier in interpreting and generalizing microbiome data, it can also be a powerful tool for mechanistic research. We then highlight recent examples demonstrating the importance of strain variation in colonization, virulence, and xenobiotic metabolism. Moving past taxonomy and the species concept will be crucial for future mechanistic research to understand microbiome structure and function.
Affiliation(s)
- Benjamin D. Anderson
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States
| | - Jordan E. Bisanz
- Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States
- The Penn State Microbiome Center, Huck Institutes of the Life Sciences, University Park, PA, United States
|
28
|
Zhao C, Shi ZJ, Pollard KS. Pitfalls of genotyping microbial communities with rapidly growing genome collections. Cell Syst 2023; 14:160-176.e3. [PMID: 36657438 PMCID: PMC9957970 DOI: 10.1016/j.cels.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/15/2022] [Accepted: 12/19/2022] [Indexed: 01/20/2023]
Abstract
Detecting genetic variants in metagenomic data is a priority for understanding the evolution, ecology, and functional characteristics of microbial communities. Many tools that perform this metagenotyping rely on aligning reads of unknown origin to a database of sequences from many species before calling variants. In this synthesis, we investigate how databases of increasingly diverse and closely related species have pushed the limits of current alignment algorithms, thereby degrading the performance of metagenotyping tools. We identify multi-mapping reads as a prevalent source of errors and illustrate a trade-off between retaining correct alignments versus limiting incorrect alignments, many of which map reads to the wrong species. Then we evaluate several actionable mitigation strategies and review emerging methods showing promise to further improve metagenotyping in response to the rapid growth in genome collections. Our results have implications beyond metagenotyping to the many tools in microbial genomics that depend upon accurate read mapping.
Affiliation(s)
- Chunyu Zhao
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Zhou Jason Shi
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
| | - Katherine S Pollard
- Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA
|
29
|
The Current State of Nanopore Sequencing. Methods Mol Biol 2023; 2632:3-14. [PMID: 36781717 DOI: 10.1007/978-1-0716-2996-3_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/15/2023]
Abstract
Nanopore sensing is a disruptive, revolutionary way in which to sequence nucleic acids, including both native DNA and RNA molecules. The technology was first commercialized with the MinION™ sequencer from Oxford Nanopore Technologies™ in 2015; this review article looks at the current state of nanopore sequencing as of June 2022. Covering the unique characteristics of the technology and how it functions, we then go on to look at the ability of the platform to deliver sequencing at all scales, from personal to high-throughput devices, before looking at how the scientific community is applying the technology around the world to answer their biological questions.
|
30
|
Chen P, Sun Z, Wang J, Liu X, Bai Y, Chen J, Liu A, Qiao F, Chen Y, Yuan C, Sha J, Zhang J, Xu LQ, Li J. Portable nanopore-sequencing technology: Trends in development and applications. Front Microbiol 2023; 14:1043967. [PMID: 36819021 PMCID: PMC9929578 DOI: 10.3389/fmicb.2023.1043967] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 09/14/2022] [Accepted: 01/03/2023] [Indexed: 02/04/2023] Open
Abstract
Sequencing technology is the most commonly used technology in molecular biology research and an essential pillar for the development and applications of molecular biology. Since 1977, when the first generation of sequencing technology opened the door to interpreting the genetic code, sequencing technology has been developing for three generations. It has applications in all aspects of life and scientific research, such as disease diagnosis, drug target discovery, pathological research, species protection, and SARS-CoV-2 detection. However, the first- and second-generation sequencing technology relied on fluorescence detection systems and DNA polymerization enzyme systems, which increased the cost of sequencing technology and limited its scope of applications. The third-generation sequencing technology performs PCR-free and single-molecule sequencing, but it still depends on the fluorescence detection device. To break through these limitations, researchers have made arduous efforts to develop a new advanced portable sequencing technology represented by nanopore sequencing. Nanopore technology has the advantages of small size and convenient portability, independent of biochemical reagents, and direct reading using physical methods. This paper reviews the research and development process of nanopore sequencing technology (NST) from the laboratory to commercially viable tools; discusses the main types of nanopore sequencing technologies and their various applications in solving a wide range of real-world problems. In addition, the paper collates the analysis tools necessary for performing different processing tasks in nanopore sequencing. Finally, we highlight the challenges of NST and its future research and application directions.
Affiliation(s)
- Pin Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
| | - Zepeng Sun
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
| | - Jiawei Wang
- School of Computer Science and Technology, Southeast University, Nanjing, China
| | - Xinlong Liu
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
| | - Yun Bai
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
| | - Jiang Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
| | - Anna Liu
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
| | - Feng Qiao
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China
| | - Yang Chen
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China
| | - Chenyan Yuan
- Clinical Laboratory, Southeast University Zhongda Hospital, Nanjing, China
| | - Jingjie Sha
- School of Mechanical Engineering, Southeast University, Nanjing, China
| | - Jinghui Zhang
- School of Computer Science and Technology, Southeast University, Nanjing, China
| | - Li-Qun Xu
- China Mobile (Chengdu) Industrial Research Institute, Chengdu, China (corresponding author)
| | - Jian Li
- Key Laboratory of DGHD, MOE, School of Life Science and Technology, Southeast University, Nanjing, China (corresponding author)
|
31
|
Vineis JH, Bulseco AN, Bowen JL. Microbial chemolithoautotrophs are abundant in salt marsh sediment following long-term experimental nitrate enrichment. FEMS Microbiol Lett 2023; 370:fnad082. [PMID: 37541957 DOI: 10.1093/femsle/fnad082] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/31/2023] [Revised: 07/13/2023] [Accepted: 07/25/2023] [Indexed: 08/06/2023] Open
Abstract
Long-term anthropogenic nitrate (NO₃⁻) enrichment is a serious threat to many coastal systems. Nitrate reduction coupled with the oxidation of reduced forms of sulfur is conducted by chemolithoautotrophic microbial populations in a process that decreases nitrogen (N) pollution. However, little is known about the diversity and distribution of microbes capable of carbon fixation within salt marsh sediment and how they respond to long-term NO₃⁻ loading. We used genome-resolved metagenomics to characterize the distribution, phylogenetic relationships, and adaptations important to microbial communities within NO₃⁻-enriched sediment. We found NO₃⁻-reducing sulfur oxidizers became dominant members of the microbial community throughout the top 25 cm of the sediment following long-term NO₃⁻ enrichment. We also found that most of the chemolithoautotrophic genomes recovered contained striking metabolic versatility, including the potential for complete denitrification and evidence of mixotrophy. Phylogenetic reconstruction indicated that similar carbon fixation strategies and metabolic versatility can be found in several phylogenetic groups, but the genomes recovered here represent novel organisms. Our results suggest that the role of chemolithoautotrophy within NO₃⁻-enriched salt marsh sediments may be quantitatively more important for retaining carbon and filtering NO₃⁻ than previously indicated and further inquiry is needed to explicitly measure their contribution to carbon turnover and removal of N pollution.
Affiliation(s)
- Joseph H Vineis
- Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
| | - Ashley N Bulseco
- Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
| | - Jennifer L Bowen
- Department of Marine and Environmental Sciences, Marine Science Center, Northeastern University, 30 Nahant Road, Nahant, MA 01908, United States
|
32
|
Borroni D, Bonzano C, Sánchez-González JM, Rachwani-Anil R, Zamorano-Martín F, Pereza-Nieves J, Traverso CE, García Lorente M, Rodríguez-Calvo-de-Mora M, Esposito A, Godin F, Rocha-de-Lossada C. Shotgun metagenomic sequencing in culture negative microbial keratitis. Eur J Ophthalmol 2023:11206721221149077. [PMID: 36617769 DOI: 10.1177/11206721221149077] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 01/10/2023]
Abstract
PURPOSE: To evaluate the microbiota of culture negative Corneal Impression Membrane (CIM) microbial keratitis samples with the use of shotgun metagenomics analysis. METHODS: DNA of microbial keratitis samples was collected with CIM and extracted using the MasterPure™ Complete DNA and RNA Purification Kit (Epicentre). DNA was fragmented by sonication into fragments of 300 to 400 base pairs (bp) using Bioruptor® (Diagenode, Belgium) and then used as a template for library preparation. DNA libraries were sequenced on Illumina® HiSeq2500. The resulting reads were quality controlled, trimmed and mapped against the human reference genome. The unmapped reads were taxonomically classified using the Kraken software. RESULTS: 18 microbial keratitis samples were included in the study. Brevundimonas diminuta was found in 5 samples while 6 samples showed the presence of viral infections. Cutibacterium acnes, Staphylococcus aureus, Moraxella lacunata and Pseudomonas alcaligenes were also identified as the putative cause of the infection in 7 samples. CONCLUSIONS: Shotgun sequencing can be used as a diagnostic tool in microbial keratitis samples. This diagnostic method expands the available tests to diagnose eye infections and could be clinically significant in culture negative samples.
Affiliation(s)
- Davide Borroni
- Department of Ophthalmology, Riga Stradins University, Riga, Latvia
| | - Chiara Bonzano
- DiNOGMI, University of Genoa and IRCCS San Martino Polyclinic Hospital, Genoa, Italy
| | | | | | | | | | - Carlo Enrico Traverso
- DiNOGMI, University of Genoa and IRCCS San Martino Polyclinic Hospital, Genoa, Italy
| | | | | | - Alfonso Esposito
- 18470International Centre for Genetic Engineering and Biotechnology, Trieste, Italy
| | - Fernando Godin
- Department of Ophthalmology, Universidad El Bosque, Bogotá, Colombia
| | - Carlos Rocha-de-Lossada
- Qvision, Opththalmology Department, VITHAS Almería Hospital, Almería, Spain.,Ophthalmology Department, VITHAS Málaga, Málaga, Spain.,Hospital Regional Universitario de Málaga, Plaza del Hospital Civil, Málaga, Spain.,Departamento de Cirugía, Universidad de Sevilla, Área de Oftalmología, Doctor Fedriani, Seville, Spain
| |
Collapse
|
33
|
Herold M, Hock L, Penny C, Walczak C, Djabi F, Cauchie HM, Ragimbeau C. Metagenomic Strain-Typing Combined with Isolate Sequencing Provides Increased Resolution of the Genetic Diversity of Campylobacter jejuni Carriage in Wild Birds. Microorganisms 2023; 11:121. [PMID: 36677413 PMCID: PMC9860660 DOI: 10.3390/microorganisms11010121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 12/28/2022] [Accepted: 12/29/2022] [Indexed: 01/05/2023] Open
Abstract
As the world's leading cause of human gastro-enteritis, the food- and waterborne pathogen Campylobacter needs to be intensively monitored through a One Health approach. Particularly, wild birds have been hypothesized to contribute to the spread of human clinical recurring C. jejuni genotypes across several countries. A major concern in studying epidemiological dynamics is resolving the large genomic diversity of strains circulating in the environment and various reservoirs, challenging to achieve with isolation techniques. Here, we applied a passive-filtration method to obtain isolates and in parallel recovered genotypes from metagenomic sequencing data from associated filter sweeps. For genotyping mixed strains, a reference-based computational workflow to predict allelic profiles of nine extended-MLST loci was utilized. We validated the pipeline by sequencing artificial mixtures of C. jejuni strains and observed the highest prediction accuracy when including obtained isolates as references. By analyzing metagenomic samples, we were able to detect over 20% additional genetic diversity and observed an over 50% increase in the potential to connect genotypes across wild-bird samples. With an optimized filtration method and a computational approach for genotyping strain mixtures, we provide the foundation for future studies assessing C. jejuni diversity in environmental and clinical settings at improved throughput and resolution.
Affiliation(s)
- Malte Herold
  - Environmental Research and Innovation (ERIN) Department, Luxembourg Institute of Science and Technology (LIST), 41 rue du Brill, L-4422 Belvaux, Luxembourg
  - Epidemiology and Microbial Genomics, Laboratoire National de Santé (LNS), 1 rue Louis Rech, L-3555 Dudelange, Luxembourg
  - Correspondence:
- Louise Hock
  - Environmental Research and Innovation (ERIN) Department, Luxembourg Institute of Science and Technology (LIST), 41 rue du Brill, L-4422 Belvaux, Luxembourg
- Christian Penny
  - Environmental Research and Innovation (ERIN) Department, Luxembourg Institute of Science and Technology (LIST), 41 rue du Brill, L-4422 Belvaux, Luxembourg
- Cécile Walczak
  - Environmental Research and Innovation (ERIN) Department, Luxembourg Institute of Science and Technology (LIST), 41 rue du Brill, L-4422 Belvaux, Luxembourg
- Fatu Djabi
  - Epidemiology and Microbial Genomics, Laboratoire National de Santé (LNS), 1 rue Louis Rech, L-3555 Dudelange, Luxembourg
- Henry-Michel Cauchie
  - Environmental Research and Innovation (ERIN) Department, Luxembourg Institute of Science and Technology (LIST), 41 rue du Brill, L-4422 Belvaux, Luxembourg
- Catherine Ragimbeau
  - Epidemiology and Microbial Genomics, Laboratoire National de Santé (LNS), 1 rue Louis Rech, L-3555 Dudelange, Luxembourg

34
Arumugam K, Bessarab I, Haryono MAS, Williams RBH. Recovery and Analysis of Long-Read Metagenome-Assembled Genomes. Methods Mol Biol 2023; 2649:235-259. [PMID: 37258866 DOI: 10.1007/978-1-0716-3072-3_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
The development of long-read nucleic acid sequencing is beginning to make very substantive impact on the conduct of metagenome analysis, particularly in relation to the problem of recovering the genomes of member species of complex microbial communities. Here we outline bioinformatics workflows for the recovery and characterization of complete genomes from long-read metagenome data and some complementary procedures for comparison of cognate draft genomes and gene quality obtained from short-read sequencing and long-read sequencing.
Affiliation(s)
- Krithika Arumugam
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Irina Bessarab
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
- Mindia A S Haryono
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
- Rohan B H Williams
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore.

35
Ma S, Li H. Statistical and Computational Methods for Microbial Strain Analysis. Methods Mol Biol 2023; 2629:231-245. [PMID: 36929080 DOI: 10.1007/978-1-0716-2986-4_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023]
Abstract
Microbial strains are interpreted as a lineage derived from a recent ancestor that have not experienced "too many" recombination events and can be successfully retrieved with culture-independent techniques using metagenomic sequencing. Such a strain variability has been increasingly shown to display additional phenotypic heterogeneities that affect host health, such as virulence, transmissibility, and antibiotics resistance. New statistical and computational methods have recently been developed to track the strains in samples based on shotgun metagenomics data either based on reference genome sequences or Metagenome-assembled genomes (MAGs). In this paper, we review some recent statistical methods for strain identifications based on frequency counts at a set of single nucleotide variants (SNVs) within a set of single-copy marker genes. These methods differ in terms of whether reference genome sequences are needed, how SNVs are called, what methods of deconvolution are used and whether the methods can be applied to multiple samples. We conclude our review with areas that require further research.
Affiliation(s)
- Siyuan Ma
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA.

36
Martin S, Ayling M, Patrono L, Caccamo M, Murcia P, Leggett RM. Capturing variation in metagenomic assembly graphs with MetaCortex. Bioinformatics 2023; 39:6986127. [PMID: 36722204 PMCID: PMC9889960 DOI: 10.1093/bioinformatics/btad020] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2022] [Revised: 11/10/2022] [Accepted: 01/11/2023] [Indexed: 01/13/2023] Open
Abstract
MOTIVATION: The assembly of contiguous sequence from metagenomic samples presents a particular challenge, due to the presence of multiple species, often closely related, at varying levels of abundance. Capturing diversity within species, for example, viral haplotypes, or bacterial strain-level diversity, is even more challenging. RESULTS: We present MetaCortex, a metagenome assembler that captures intra-species diversity by searching for signatures of local variation along assembled sequences in the underlying assembly graph and outputting these sequences in sequence graph format. We show that MetaCortex produces accurate assemblies with higher genome coverage and contiguity than other popular metagenomic assemblers on mock viral communities with high levels of strain-level diversity and on simulated communities containing simulated strains. AVAILABILITY AND IMPLEMENTATION: Source code is freely available to download from https://github.com/SR-Martin/metacortex, is implemented in C and supported on MacOS and Linux. The version used for the results presented in this article is available at doi.org/10.5281/zenodo.7273627. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pablo Murcia
  - MRC-University of Glasgow Centre for Virus Research, Glasgow G61 1QH, UK

37
Salazar VW, Shaban B, Quiroga MDM, Turnbull R, Tescari E, Rossetto Marcelino V, Verbruggen H, Lê Cao KA. Metaphor-A workflow for streamlined assembly and binning of metagenomes. Gigascience 2023; 12:giad055. [PMID: 37522759 PMCID: PMC10388702 DOI: 10.1093/gigascience/giad055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 06/05/2023] [Accepted: 07/04/2023] [Indexed: 08/01/2023] Open
Abstract
Recent advances in bioinformatics and high-throughput sequencing have enabled the large-scale recovery of genomes from metagenomes. This has the potential to bring important insights as researchers can bypass cultivation and analyze genomes sourced directly from environmental samples. There are, however, technical challenges associated with this process, most notably the complexity of computational workflows required to process metagenomic data, which include dozens of bioinformatics software tools, each with their own set of customizable parameters that affect the final output of the workflow. At the core of these workflows are the processes of assembly (combining the short input reads into longer, contiguous fragments, or contigs) and binning (clustering these contigs into individual genome bins). The limitations of assembly and binning algorithms also pose different challenges depending on the selected strategy to execute them. Both of these processes can be done for each sample separately or by pooling together multiple samples to leverage information from a combination of samples. Here we present Metaphor, a fully automated workflow for genome-resolved metagenomics (GRM). Metaphor differs from existing GRM workflows by offering flexible approaches for the assembly and binning of the input data and by combining multiple binning algorithms with a bin refinement step to achieve high-quality genome bins. Moreover, Metaphor generates reports to evaluate the performance of the workflow. We showcase the functionality of Metaphor on different synthetic datasets and the impact of available assembly and binning strategies on the final results.
Affiliation(s)
- Vinícius W Salazar
  - Melbourne Integrative Genomics, School of Mathematics & Statistics, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
- Babak Shaban
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Maria del Mar Quiroga
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Robert Turnbull
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Edoardo Tescari
  - Melbourne Data Analytics Platform (MDAP), University of Melbourne, Carlton, VIC 3053, Victoria, Australia
- Vanessa Rossetto Marcelino
  - Department of Molecular and Translational Sciences, Monash University, Clayton, VIC 3168, Victoria, Australia
  - Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton, VIC 3168, Victoria, Australia
  - School of BioSciences, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
  - Department of Microbiology and Immunology, The University of Melbourne at the Peter Doherty Institute for Infection and Immunity, Parkville, VIC 3052, Victoria, Australia
- Heroen Verbruggen
  - School of BioSciences, University of Melbourne, Parkville, VIC 3052, Victoria, Australia
- Kim-Anh Lê Cao
  - Melbourne Integrative Genomics, School of Mathematics & Statistics, University of Melbourne, Parkville, VIC 3052, Victoria, Australia

38
Mallawaarachchi V, Lin Y. Accurate Binning of Metagenomic Contigs Using Composition, Coverage, and Assembly Graphs. J Comput Biol 2022; 29:1357-1376. [DOI: 10.1089/cmb.2022.0262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Affiliation(s)
- Vijini Mallawaarachchi
  - School of Computing, College of Engineering and Computer Science, Australian National University, Canberra, Australia
- Yu Lin
  - School of Computing, College of Engineering and Computer Science, Australian National University, Canberra, Australia

39
Brunner FS, Brown MR, Bassano I, Denise H, Khalifa MS, Wade MJ, van Aerle R, Kevill JL, Jones DL, Farkas K, Jeffries AR, Cairns E, Wierzbicki C, Paterson S. City-wide wastewater genomic surveillance through the successive emergence of SARS-CoV-2 Alpha and Delta variants. Water Research 2022; 226:119306. [PMID: 36369689 PMCID: PMC9614697 DOI: 10.1016/j.watres.2022.119306] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/14/2022] [Revised: 10/13/2022] [Accepted: 10/26/2022] [Indexed: 05/08/2023]
Abstract
Genomic surveillance of SARS-CoV-2 has provided a critical evidence base for public health decisions throughout the pandemic. Sequencing data from clinical cases has helped to understand disease transmission and the spread of novel variants. Genomic wastewater surveillance can offer important, complementary information by providing frequency estimates of all variants circulating in a population without sampling biases. Here we show that genomic SARS-CoV-2 wastewater surveillance can detect fine-scale differences within urban centres, specifically within the city of Liverpool, UK, during the emergence of Alpha and Delta variants between November 2020 and June 2021. Furthermore, wastewater and clinical sequencing match well in the estimated timing of new variant rises and the first detection of a new variant in a given area may occur in either clinical or wastewater samples. The study's main limitation was sample quality when infection prevalence was low in spring 2021, resulting in a lower resolution of the rise of the Delta variant compared to the rise of the Alpha variant in the previous winter. The correspondence between wastewater and clinical variant frequencies demonstrates the reliability of wastewater surveillance. However, discrepancies in the first detection of the Alpha variant between the two approaches highlight that wastewater monitoring can also capture missing information, possibly resulting from asymptomatic cases or communities less engaged with testing programmes, as found by a simultaneous surge testing effort across the city.
Affiliation(s)
- F S Brunner
  - Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, L69 7ZB, UK
- M R Brown
  - Environmental Monitoring for Health Protection, UK Health Security Agency, Nobel House, London SW1P 3HX, UK; School of Engineering, Newcastle University, Newcastle-upon-Tyne NE1 7RU, UK
- I Bassano
  - Environmental Monitoring for Health Protection, UK Health Security Agency, Nobel House, London SW1P 3HX, UK; Department of Infectious Disease, Imperial College London, London SW7 2AZ, UK
- H Denise
  - Environmental Monitoring for Health Protection, UK Health Security Agency, Nobel House, London SW1P 3HX, UK
- M S Khalifa
  - Environmental Monitoring for Health Protection, UK Health Security Agency, Nobel House, London SW1P 3HX, UK; Division of Biosciences, College of Health, Medicine and Life Sciences, Brunel University, London, UB8 3PH, UK
- M J Wade
  - Environmental Monitoring for Health Protection, UK Health Security Agency, Nobel House, London SW1P 3HX, UK; School of Engineering, Newcastle University, Newcastle-upon-Tyne NE1 7RU, UK
- R van Aerle
  - International Centre of Excellence for Aquatic Animal Health, Centre for Environment, Fisheries & Aquaculture Science, Dorset, DT4 8UB, UK
- J L Kevill
  - International Centre of Excellence for Aquatic Animal Health, Centre for Environment, Fisheries & Aquaculture Science, Dorset, DT4 8UB, UK
- D L Jones
  - International Centre of Excellence for Aquatic Animal Health, Centre for Environment, Fisheries & Aquaculture Science, Dorset, DT4 8UB, UK; Food Futures Institute, Murdoch University, Murdoch WA 6105, Australia
- K Farkas
  - Centre for Environmental Biotechnology, School of Natural Sciences, Bangor University, Bangor, Gwynedd LL57 2UW, UK
- A R Jeffries
  - Biosciences, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, UK
- E Cairns
  - Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, L69 7ZB, UK
- C Wierzbicki
  - Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, L69 7ZB, UK
- S Paterson
  - Institute of Infection, Veterinary and Ecological Sciences, University of Liverpool, L69 7ZB, UK.

40
Hu H, Tan Y, Li C, Chen J, Kou Y, Xu ZZ, Liu Y, Tan Y, Dai L. StrainPanDA: Linked reconstruction of strain composition and gene content profiles via pangenome-based decomposition of metagenomic data. iMeta 2022; 1:e41. [PMID: 38868710 PMCID: PMC10989911 DOI: 10.1002/imt2.41] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/15/2022] [Revised: 05/20/2022] [Accepted: 06/28/2022] [Indexed: 06/14/2024]
Abstract
Microbial strains of variable functional capacities coexist in microbiomes. Current bioinformatics methods of strain analysis cannot provide the direct linkage between strain composition and their gene contents from metagenomic data. Here we present Strain-level Pangenome Decomposition Analysis (StrainPanDA), a novel method that uses the pangenome coverage profile of multiple metagenomic samples to simultaneously reconstruct the composition and gene content variation of coexisting strains in microbial communities. We systematically validate the accuracy and robustness of StrainPanDA using synthetic data sets. To demonstrate the power of gene-centric strain profiling, we then apply StrainPanDA to analyze the gut microbiome samples of infants, as well as patients treated with fecal microbiota transplantation. We show that the linked reconstruction of strain composition and gene content profiles is critical for understanding the relationship between microbial adaptation and strain-specific functions (e.g., nutrient utilization and pathogenicity). Finally, StrainPanDA has minimal requirements for computing resources and can be scaled to process multiple species in a community in parallel. In short, StrainPanDA can be applied to metagenomic data sets to detect the association between molecular functions and microbial/host phenotypes to formulate testable hypotheses and gain novel biological insights at the strain or subspecies level.
Affiliation(s)
- Han Hu
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
  - Bioinformatics Department, Xbiome, Scientific Research Building, Tsinghua High‐Tech Park, Shenzhen, China
- Yuxiang Tan
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Chenhao Li
  - Center for Computational and Integrative Biology, Massachusetts General Hospital and Harvard Medical School, Richard B. Simches Research Center, Boston, Massachusetts, USA
- Junyu Chen
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yan Kou
  - Bioinformatics Department, Xbiome, Scientific Research Building, Tsinghua High‐Tech Park, Shenzhen, China
- Zhenjiang Zech Xu
  - Department of Food Science and Technology, State Key Laboratory of Food Science and Technology, Nanchang University, Nanchang, China
- Yang‐Yu Liu
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, Massachusetts, USA
- Yan Tan
  - Bioinformatics Department, Xbiome, Scientific Research Building, Tsinghua High‐Tech Park, Shenzhen, China
- Lei Dai
  - CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China

41
Purushothaman S, Meola M, Egli A. Combination of Whole Genome Sequencing and Metagenomics for Microbiological Diagnostics. Int J Mol Sci 2022; 23:9834. [PMID: 36077231 PMCID: PMC9456280 DOI: 10.3390/ijms23179834] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/04/2022] [Revised: 08/24/2022] [Accepted: 08/26/2022] [Indexed: 12/21/2022] Open
Abstract
Whole genome sequencing (WGS) provides the highest resolution for genome-based species identification and can provide insight into the antimicrobial resistance and virulence potential of a single microbiological isolate during the diagnostic process. In contrast, metagenomic sequencing allows the analysis of DNA segments from multiple microorganisms within a community, either using an amplicon- or shotgun-based approach. However, WGS and shotgun metagenomic data are rarely combined, although such an approach may generate additive or synergistic information, critical for, e.g., patient management, infection control, and pathogen surveillance. To produce a combined workflow with actionable outputs, we need to understand the pre-to-post analytical process of both technologies. This will require specific databases storing interlinked sequencing and metadata, and also involves customized bioinformatic analytical pipelines. This review article will provide an overview of the critical steps and potential clinical application of combining WGS and metagenomics together for microbiological diagnosis.
Affiliation(s)
- Srinithi Purushothaman
  - Applied Microbiology Research, Department of Biomedicine, University of Basel, 4031 Basel, Switzerland
  - Institute of Medical Microbiology, University of Zurich, 8006 Zurich, Switzerland
- Marco Meola
  - Applied Microbiology Research, Department of Biomedicine, University of Basel, 4031 Basel, Switzerland
  - Institute of Medical Microbiology, University of Zurich, 8006 Zurich, Switzerland
  - Swiss Institute of Bioinformatics, University of Basel, 4031 Basel, Switzerland
- Adrian Egli
  - Applied Microbiology Research, Department of Biomedicine, University of Basel, 4031 Basel, Switzerland
  - Institute of Medical Microbiology, University of Zurich, 8006 Zurich, Switzerland
  - Clinical Bacteriology and Mycology, University Hospital Basel, 4031 Basel, Switzerland

42
A revisit to universal single-copy genes in bacterial genomes. Sci Rep 2022; 12:14550. [PMID: 36008577 PMCID: PMC9411617 DOI: 10.1038/s41598-022-18762-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/28/2022] [Accepted: 08/18/2022] [Indexed: 11/08/2022] Open
Abstract
Universal single-copy genes (USCGs) are widely used for species classification and taxonomic profiling. Despite many studies on USCGs, our understanding of USCGs in bacterial genomes might be out of date, especially how different the USCGs are in different studies, how well a set of USCGs can distinguish two bacterial species, whether USCGs can separate different strains of a bacterial species, to name a few. To fill the void, we studied USCGs in the most updated complete bacterial genomes. We showed that different USCG sets are quite different while coming from highly similar functional categories. We also found that although USCGs occur once in almost all bacterial genomes, each USCG does occur multiple times in certain genomes. We demonstrated that USCGs are reliable markers to distinguish different species while they cannot distinguish different strains of most bacterial species. Our study sheds new light on the usage and limitations of USCGs, which will facilitate their applications in evolutionary, phylogenomic, and metagenomic studies.

43
Haryono MAS, Law YY, Arumugam K, Liew LCW, Nguyen TQN, Drautz-Moses DI, Schuster SC, Wuertz S, Williams RBH. Recovery of High Quality Metagenome-Assembled Genomes From Full-Scale Activated Sludge Microbial Communities in a Tropical Climate Using Longitudinal Metagenome Sampling. Front Microbiol 2022; 13:869135. [PMID: 35756038 PMCID: PMC9230771 DOI: 10.3389/fmicb.2022.869135] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/03/2022] [Accepted: 05/05/2022] [Indexed: 01/23/2023] Open
Abstract
The analysis of metagenome data based on the recovery of draft genomes (so-called metagenome-assembled genomes, or MAGs) has assumed an increasingly central role in microbiome research in recent years. Microbial communities underpinning the operation of wastewater treatment plants are particularly challenging targets for MAG analysis due to their high ecological complexity, and remain important, albeit understudied, microbial communities that play a key role in mediating interactions between human and natural ecosystems. Here we consider strategies for recovery of MAG sequence from time series metagenome surveys of full-scale activated sludge microbial communities. We generate MAG catalogs from this set of data using several different strategies, including the use of multiple individual sample assemblies, two variations on multi-sample co-assembly and a recently published MAG recovery workflow using deep learning. We obtain a total of just under 9,100 draft genomes, which collapse to around 3,100 non-redundant genomic clusters. We examine the strengths and weaknesses of these approaches in relation to MAG yield and quality, showing that co-assembly may offer advantages over single-sample assembly in the case of metagenome data obtained from closely sampled longitudinal study designs. Around 1,000 MAGs were candidates for being considered high quality, based on single-copy marker gene occurrence statistics; however, only 58 MAGs formally meet the MIMAG criteria for being high quality draft genomes. These findings carry broader implications for performing genome-resolved metagenomics on highly complex communities, the design and implementation of genome recoverability strategies, MAG decontamination and the search for better binning methodology.
Affiliation(s)
- Mindia A S Haryono
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore
- Ying Yu Law
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Krithika Arumugam
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Larry C-W Liew
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Thi Quynh Ngoc Nguyen
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Daniela I Drautz-Moses
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
- Stephan C Schuster
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
  - School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
- Stefan Wuertz
  - Singapore Centre for Environmental Life Sciences Engineering, Nanyang Technological University, Singapore, Singapore
  - School of Civil and Environmental Engineering, Nanyang Technological University, Singapore, Singapore
- Rohan B H Williams
  - Singapore Centre for Environmental Life Sciences Engineering, National University of Singapore, Singapore, Singapore

44
Spang A, Mahendrarajah TA, Offre P, Stairs CW. Evolving Perspective on the Origin and Diversification of Cellular Life and the Virosphere. Genome Biol Evol 2022; 14:evac034. [PMID: 35218347 PMCID: PMC9169541 DOI: 10.1093/gbe/evac034] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Accepted: 02/18/2022] [Indexed: 11/14/2022] Open
Abstract
The tree of life (TOL) is a powerful framework to depict the evolutionary history of cellular organisms through time, from our microbial origins to the diversification of multicellular eukaryotes that shape the visible biosphere today. During the past decades, our perception of the TOL has fundamentally changed, in part, due to profound methodological advances, which allowed a more objective approach to studying organismal and viral diversity and led to the discovery of major new branches in the TOL as well as viral lineages. Phylogenetic and comparative genomics analyses of these data have, among others, revolutionized our understanding of the deep roots and diversity of microbial life, the origin of the eukaryotic cell, eukaryotic diversity, as well as the origin, and diversification of viruses. In this review, we provide an overview of some of the recent discoveries on the evolutionary history of cellular organisms and their viruses and discuss a variety of complementary techniques that we consider crucial for making further progress in our understanding of the TOL and its interconnection with the virosphere.
Affiliation(s)
- Anja Spang
  - Department of Marine Microbiology and Biogeochemistry, NIOZ, Royal Netherlands Institute for Sea Research, Utrecht University, Den Burg, The Netherlands
  - Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Tara A Mahendrarajah
  - Department of Marine Microbiology and Biogeochemistry, NIOZ, Royal Netherlands Institute for Sea Research, Utrecht University, Den Burg, The Netherlands
- Pierre Offre
  - Department of Marine Microbiology and Biogeochemistry, NIOZ, Royal Netherlands Institute for Sea Research, Utrecht University, Den Burg, The Netherlands
- Courtney W Stairs
  - Department of Biology, Microbiology research group, Lund University, Lund, Sweden

45
Jin S, Wetzel D, Schirmer M. Deciphering mechanisms and implications of bacterial translocation in human health and disease. Curr Opin Microbiol 2022; 67:102147. [PMID: 35461008 DOI: 10.1016/j.mib.2022.102147] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 11/23/2021] [Revised: 02/28/2022] [Accepted: 03/03/2022] [Indexed: 12/12/2022]
Abstract
Significant increases in potential microbial translocation, especially along the oral-gut axis, have been identified in many immune-related and inflammatory diseases, such as inflammatory bowel disease, colorectal cancer, rheumatoid arthritis, and liver cirrhosis, for which we currently have no cure or long-term treatment options. Recent advances in computational and experimental omics approaches now enable strain tracking, functional profiling, and strain isolation in unprecedented detail, which has the potential to elucidate the causes and consequences of microbial translocation. In this review, we discuss current evidence for the detection of bacterial translocation, examine different translocation axes with a primary focus on the oral-gut axis, and outline currently known translocation mechanisms and how they adversely affect the host in disease. Finally, we conclude with an overview of state-of-the-art computational and experimental tools for strain tracking and highlight the required next steps to elucidate the role of bacterial translocation in human health.
Affiliation(s)
- Shen Jin
- ZIEL - Institute for Food and Health, Technical University of Munich, Gregor-Mendel-Str. 2, 85354 Freising, Germany
- Daniela Wetzel
- ZIEL - Institute for Food and Health, Technical University of Munich, Gregor-Mendel-Str. 2, 85354 Freising, Germany
- Melanie Schirmer
- ZIEL - Institute for Food and Health, Technical University of Munich, Gregor-Mendel-Str. 2, 85354 Freising, Germany
|
46
|
Strain identification and quantitative analysis in microbial communities. J Mol Biol 2022; 434:167582. [DOI: 10.1016/j.jmb.2022.167582] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/01/2022] [Revised: 03/31/2022] [Accepted: 04/03/2022] [Indexed: 12/14/2022]
|
47
|
Podlesny D, Arze C, Dörner E, Verma S, Dutta S, Walter J, Fricke WF. Metagenomic strain detection with SameStr: identification of a persisting core gut microbiota transferable by fecal transplantation. Microbiome 2022; 10:53. [PMID: 35337386 PMCID: PMC8951724 DOI: 10.1186/s40168-022-01251-w] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 10/27/2021] [Accepted: 02/24/2022] [Indexed: 05/13/2023]
Abstract
BACKGROUND The understanding of how microbiomes assemble, function, and evolve requires metagenomic tools that can resolve microbiota compositions at the strain level. However, the identification and tracking of microbial strains in fecal metagenomes is challenging and available tools variably classify subspecies lineages, which affects their applicability to infer microbial persistence and transfer. RESULTS We introduce SameStr, a bioinformatic tool that identifies shared strains in metagenomes by determining single-nucleotide variants (SNV) in species-specific marker genes, which are compared based on a maximum variant profile similarity. We validated SameStr on mock strain populations, available human fecal metagenomes from healthy individuals and newly generated data from recurrent Clostridioides difficile infection (rCDI) patients treated with fecal microbiota transplantation (FMT). SameStr demonstrated enhanced sensitivity to detect shared dominant and subdominant strains in related samples (where strain persistence or transfer would be expected) when compared to other tools, while being robust against false-positive shared strain calls between unrelated samples (where neither strain persistence nor transfer would be expected). We applied SameStr to identify strains that are stably maintained in fecal microbiomes of healthy adults over time (strain persistence) and that successfully engraft in rCDI patients after FMT (strain engraftment). Taxonomy-dependent strain persistence and engraftment frequencies were positively correlated, indicating that a specific core microbiota of intestinal species is adapted to be competitive both in healthy microbiomes and during post-FMT microbiome assembly. We explored other use cases for strain-level microbiota profiling, as a metagenomics quality control measure and to identify individuals based on the persisting core gut microbiota. 
CONCLUSION SameStr provides for a robust identification of shared strains in metagenomic sequence data with sufficient specificity and sensitivity to examine strain persistence, transfer, and engraftment in human fecal microbiomes. Our findings identify a persisting healthy adult core gut microbiota, which should be further studied to shed light on microbiota contributions to chronic diseases.
Affiliation(s)
- Daniel Podlesny
- Department of Microbiome Research and Applied Bioinformatics, University of Hohenheim, Stuttgart, Germany
- Cesar Arze
- Department of Microbiome Research and Applied Bioinformatics, University of Hohenheim, Stuttgart, Germany
- Current address: Ring Therapeutics, Cambridge, MA, USA
- Elisabeth Dörner
- Department of Microbiome Research and Applied Bioinformatics, University of Hohenheim, Stuttgart, Germany
- Sandeep Verma
- Division of Gastroenterology, Sinai Hospital of Baltimore, Baltimore, MD, USA
- Sudhir Dutta
- Division of Gastroenterology, Sinai Hospital of Baltimore, Baltimore, MD, USA
- Jens Walter
- APC Microbiome Ireland, School of Microbiology, and Department of Medicine, University College Cork, Cork, Ireland
- W Florian Fricke
- Department of Microbiome Research and Applied Bioinformatics, University of Hohenheim, Stuttgart, Germany
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD, USA
|
48
|
van Dijk LR, Walker BJ, Straub TJ, Worby CJ, Grote A, Schreiber HL, Anyansi C, Pickering AJ, Hultgren SJ, Manson AL, Abeel T, Earl AM. StrainGE: a toolkit to track and characterize low-abundance strains in complex microbial communities. Genome Biol 2022; 23:74. [PMID: 35255937 PMCID: PMC8900328 DOI: 10.1186/s13059-022-02630-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 05/28/2021] [Accepted: 02/09/2022] [Indexed: 01/21/2023]
Abstract
Human-associated microbial communities comprise not only complex mixtures of bacterial species, but also mixtures of conspecific strains, the implications of which are mostly unknown since strain level dynamics are underexplored due to the difficulties of studying them. We introduce the Strain Genome Explorer (StrainGE) toolkit, which deconvolves strain mixtures and characterizes component strains at the nucleotide level from short-read metagenomic sequencing with higher sensitivity and resolution than other tools. StrainGE is able to identify strains at 0.1x coverage and detect variants for multiple conspecific strains within a sample from coverages as low as 0.5x.
Affiliation(s)
- Lucas R. van Dijk
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE, The Netherlands
- Bruce J. Walker
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Applied Invention, Cambridge, MA, USA
- Timothy J. Straub
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
- Colin J. Worby
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Alexandra Grote
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Henry L. Schreiber
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Center for Women’s Infectious Disease Research (CWIDR), Washington University School of Medicine, St. Louis, MO 63110, USA
- Christine Anyansi
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE, The Netherlands
- Amy J. Pickering
- Department of Civil and Environmental Engineering, University of California, Berkeley, Berkeley, CA 94720, USA
- Stuart B. Levy Center for Integrated Management of Antimicrobial Resistance (Levy CIMAR), Tufts University, Boston, MA, USA
- Scott J. Hultgren
- Department of Molecular Microbiology, Washington University School of Medicine, St. Louis, MO 63110, USA
- Center for Women’s Infectious Disease Research (CWIDR), Washington University School of Medicine, St. Louis, MO 63110, USA
- Abigail L. Manson
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Thomas Abeel
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
- Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE, The Netherlands
- Ashlee M. Earl
- Infectious Disease & Microbiome Program, Broad Institute, 415 Main Street, Cambridge, MA 02142, USA
|
49
|
Setubal JC. Metagenome-assembled genomes: concepts, analogies, and challenges. Biophys Rev 2022; 13:905-909. [PMID: 35059016 DOI: 10.1007/s12551-021-00865-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 09/10/2021] [Accepted: 10/21/2021] [Indexed: 12/25/2022]
Abstract
Metagenome-assembled genomes (MAGs) are microbial genomes reconstructed from metagenome data. In the last few years, many thousands of MAGs have been reported in the literature, for a variety of environments and host-associated microbiota, including humans. MAGs have helped us better understand microbial populations and their interactions with the environment where they live; moreover most MAGs belong to novel species, therefore helping to decrease the so-called microbial dark matter. However, questions about the biological reality of MAGs have not, in general, been properly addressed. In this review, I define the notions of hypothetical MAGs and conserved hypothetical MAGs. These notions should help with the understanding of the biological reality of MAGs, their worldwide occurrence, and the efforts to improve MAG recovery processes.
Affiliation(s)
- João C Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, SP 05508-000 São Paulo, Brazil
|
50
|
|