1
Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Genome-resolved metagenomics: a game changer for microbiome medicine. Exp Mol Med 2024; 56:1501-1512. [PMID: 38945961 PMCID: PMC11297344 DOI: 10.1038/s12276-024-01262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/06/2024] [Accepted: 03/25/2024] [Indexed: 07/02/2024] Open
Abstract
Recent substantial evidence implicating commensal bacteria in human diseases has given rise to a new domain in biomedical research: microbiome medicine. This emerging field aims to understand and leverage the human microbiota and derivative molecules for disease prevention and treatment. Despite the complex and hierarchical organization of this ecosystem, most research over the years has relied on 16S amplicon sequencing, a legacy of bacterial phylogeny and taxonomy. Although advanced sequencing technologies have enabled cost-effective analysis of entire microbiota, translating the relatively short nucleotide information into the functional and taxonomic organization of the microbiome has posed challenges until recently. In the last decade, genome-resolved metagenomics, which aims to reconstruct microbial genomes directly from whole-metagenome sequencing data, has made significant strides and continues to unveil the mysteries of various human-associated microbial communities. There has been a rapid increase in the volume of whole metagenome sequencing data and in the compilation of novel metagenome-assembled genomes and protein sequences in public depositories. This review provides an overview of the capabilities and methods of genome-resolved metagenomics for studying the human microbiome, with a focus on investigating the prokaryotic microbiota of the human gut. Just as decoding the human genome and its variations marked the beginning of the genomic medicine era, unraveling the genomes of commensal microbes and their sequence variations is ushering us into the era of microbiome medicine. Genome-resolved metagenomics stands as a pivotal tool in this transition and can accelerate our journey toward achieving these scientific and medical milestones.
Affiliation(s)
- Nayeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Junyeong Ma
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Wonjong Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Jungyeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Peter Belenky
  - Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
  - POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea
2
Pinto Y, Bhatt AS. Sequencing-based analysis of microbiomes. Nat Rev Genet 2024:10.1038/s41576-024-00746-6. [PMID: 38918544 DOI: 10.1038/s41576-024-00746-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/15/2024] [Indexed: 06/27/2024]
Abstract
Microbiomes occupy a range of niches and, in addition to having diverse compositions, they have varied functional roles that have an impact on agriculture, environmental sciences, and human health and disease. The study of microbiomes has been facilitated by recent technological and analytical advances, such as cheaper and higher-throughput DNA and RNA sequencing, improved long-read sequencing and innovative computational analysis methods. These advances are providing a deeper understanding of microbiomes at the genomic, transcriptional and translational level, generating insights into their function and composition at resolutions beyond the species level.
Affiliation(s)
- Yishay Pinto
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
- Ami S Bhatt
  - Department of Genetics, Stanford University, Stanford, CA, USA
  - Department of Medicine, Divisions of Hematology and Blood & Marrow Transplantation, Stanford University, Stanford, CA, USA
3
Straub TJ, Lombardo MJ, Bryant JA, Diao L, Lodise TP, Freedberg DE, Wortman JR, Litcofsky KD, Hasson BR, McGovern BH, Ford CB, Henn MR. Impact of a Purified Microbiome Therapeutic on Abundance of Antimicrobial Resistance Genes in Patients With Recurrent Clostridioides difficile Infection. Clin Infect Dis 2024; 78:833-841. [PMID: 37823484 PMCID: PMC11006105 DOI: 10.1093/cid/ciad636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 09/25/2023] [Accepted: 10/11/2023] [Indexed: 10/13/2023] Open
Abstract
BACKGROUND: The gastrointestinal microbiota is an important line of defense against colonization with antimicrobial resistant (AR) bacteria. In this post hoc analysis of the phase 3 ECOSPOR III trial, we assessed the impact of a microbiota-based oral therapeutic (fecal microbiota spores, live; VOWST Oral Spores [VOS], formerly SER-109; Seres Therapeutics), compared with placebo, on AR gene (ARG) abundance in patients with recurrent Clostridioides difficile infection (rCDI). METHODS: Adults with rCDI were randomized to receive VOS or placebo orally for 3 days following standard-of-care antibiotics. ARG and taxonomic profiles were generated using whole metagenomic sequencing of stool at baseline and weeks 1, 2, 8, and 24 posttreatment. RESULTS: Baseline (n = 151) and serial posttreatment stool samples collected through 24 weeks (total N = 472) from 182 patients (59.9% female; mean age: 65.5 years) in ECOSPOR III, as well as 68 stool samples obtained at a single time point from a healthy cohort, were analyzed. Baseline ARG abundance was similar between arms and significantly elevated versus the healthy cohort. By week 1, there was a greater decline in ARG abundance in VOS versus placebo (P = .003), in association with a marked decline of Proteobacteria and repletion of spore-forming Firmicutes, as compared with baseline. We observed that the abundances of Proteobacteria and non-spore-forming Firmicutes were associated with ARG abundance, whereas spore-forming Firmicutes abundance was negatively associated. CONCLUSIONS: This proof-of-concept analysis suggests that microbiome remodeling with Firmicutes spores may be a potential novel approach to reduce ARG colonization in the gastrointestinal tract.
Affiliation(s)
- Liyang Diao
  - Seres Therapeutics, Cambridge, Massachusetts, USA
- Thomas P Lodise
  - Albany College of Pharmacy and Health Sciences, Albany, New York, USA
- Daniel E Freedberg
  - Division of Digestive and Liver Diseases, Columbia University Irving Medical Center–New York Presbyterian Hospital, New York, New York, USA
4
Shu HY, Zhao L, Jia Y, Liu FF, Chen J, Chang CM, Jin T, Yang J, Shu WS. CyanoStrainChip: A Novel DNA Microarray Tool for High-Throughput Detection of Environmental Cyanobacteria at the Strain Level. Environ Sci Technol 2024; 58:5024-5034. [PMID: 38454313 PMCID: PMC10956431 DOI: 10.1021/acs.est.3c11096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 02/26/2024] [Accepted: 02/26/2024] [Indexed: 03/09/2024]
Abstract
Detecting cyanobacteria in the environment is important because of their crucial roles in ecosystems and because they can form blooms with the potential to harm humans and nonhuman entities. However, the most widely used methods for high-throughput detection of environmental cyanobacteria, such as 16S rRNA sequencing, typically provide above-species-level resolution, thereby disregarding intraspecific variation. To address this, we developed a novel DNA microarray tool, termed the CyanoStrainChip, that enables comprehensive strain-level profiling of environmental cyanobacteria. The CyanoStrainChip was designed to target 1277 strains; nearly all major groups of cyanobacteria are included by implementing 43,666 genome-wide, strain-specific probes. It demonstrated strong specificity in in vitro mock community experiments. The high correlation (Pearson's R > 0.97) between probe fluorescence intensities and the corresponding DNA amounts (ranging from 1 to 100 ng) indicated excellent quantitative capability. Consistent cyanobacterial profiles of field samples were observed by both the CyanoStrainChip and next-generation sequencing methods. Furthermore, CyanoStrainChip analysis of surface water samples in Lake Chaohu uncovered a high intraspecific variation of abundance change within the genus Microcystis between different severity levels of cyanobacterial blooms, highlighting two toxic Microcystis strains that are of critical concern for the suppression of harmful blooms in Lake Chaohu. Overall, these results suggest a potential for CyanoStrainChip as a valuable tool for cyanobacterial ecological research and harmful bloom monitoring to supplement existing techniques.
Affiliation(s)
- Hao-Yue Shu
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
  - School of Food and Drug, Shenzhen Polytechnic, Shenzhen 518081, PR China
- Liang Zhao
  - Institute of Ecological Science, Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou 510006, PR China
- Yanyan Jia
  - School of Ecology, Sun Yat-sen University, Shenzhen 518107, PR China
- Fei-Fei Liu
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
- Jiang Chen
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
- Chih-Min Chang
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
- Tao Jin
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
  - One Health Biotechnology (Suzhou) Co., Ltd., Suzhou 215009, PR China
- Jian Yang
  - School of Food and Drug, Shenzhen Polytechnic, Shenzhen 518081, PR China
- Wen-Sheng Shu
  - Guangdong Magigene Biotechnology Co., Ltd., Shenzhen 518081, PR China
  - Institute of Ecological Science, Guangzhou Key Laboratory of Subtropical Biodiversity and Biomonitoring, Guangdong Provincial Key Laboratory of Biotechnology for Plant Development, School of Life Sciences, South China Normal University, Guangzhou 510006, PR China
5
Boyte ME, Benkowski A, Pane M, Shehata HR. Probiotic and postbiotic analytical methods: a perspective of available enumeration techniques. Front Microbiol 2023; 14:1304621. [PMID: 38192285 PMCID: PMC10773886 DOI: 10.3389/fmicb.2023.1304621] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 09/29/2023] [Accepted: 11/20/2023] [Indexed: 01/10/2024] Open
Abstract
Probiotics are the largest non-herbal/traditional dietary supplements category worldwide. To be effective, a probiotic strain must be delivered viable at an adequate dose proven to deliver a health benefit. The objective of this article is to provide an overview of the various technologies available for probiotic enumeration, including a general description of each technology, their advantages and limitations, and their potential for the future of the probiotics industry. The current "gold standard" for analytical quantification of probiotics in the probiotic industry is the Plate Count method (PC). PC measures the bacterial cell's ability to proliferate into detectable colonies; thus PC relies on cultivability as a measure of viability. Although viability has widely been measured by cultivability, there has been agreement that the definition of viability is not limited to cultivability. For example, bacterial cells may exist in a state known as viable but not culturable (VBNC), where the cells lose cultivability but can maintain some of the characteristics of viable cells as well as probiotic properties. This led to questioning the association between viability and cultivability and the accuracy of PC in enumerating all the viable cells in probiotic products. PC has always been an estimate of the number of viable cells and not a true cell count. Additionally, newer probiotic categories such as Next Generation Probiotics (NGPs) are difficult to culture in routine laboratories, as NGPs are often strict anaerobes with extreme sensitivity to atmospheric oxygen. Thus, accurate quantification using culture-based techniques will be complicated. Another emerging category of biotics is postbiotics, which are inanimate microorganisms, also often referred to as tyndallized or heat-killed bacteria. Culture-dependent methods are not suitable for these products, and alternative methods are needed for their quantification. Different methodologies provide a more complete picture of a heterogeneous bacterial population, whereas PC focuses exclusively on the eventual multiplication of the cells. Alternative culture-independent techniques including real-time PCR, digital PCR and flow cytometry are discussed. These methods can measure viability beyond cultivability (i.e., by measuring cellular enzymatic activity, membrane integrity or membrane potential), and depending on how they are designed they can achieve strain-specific enumeration.
Affiliation(s)
- Marie-Eve Boyte
  - NutraPharma Consulting Services Inc., Sainte-Anne-des-Plaines, QC, Canada
- Marco Pane
  - Probiotical Research s.r.l., Novara, Italy
6
Shi ZJ, Nayfach S, Pollard KS. Maast: genotyping thousands of microbial strains efficiently. Genome Biol 2023; 24:186. [PMID: 37563669 PMCID: PMC10416524 DOI: 10.1186/s13059-023-03030-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 07/31/2023] [Indexed: 08/12/2023] Open
Abstract
Existing single nucleotide polymorphism (SNP) genotyping algorithms do not scale for species with thousands of sequenced strains, nor do they account for conspecific redundancy. Here we present a bioinformatics tool, Maast, which empowers population genetic meta-analysis of microbes at an unrivaled scale. Maast implements a novel algorithm to heuristically identify a minimal set of diverse conspecific genomes, then constructs a reliable SNP panel for each species, and enables rapid and accurate genotyping using a hybrid of whole-genome alignment and k-mer exact matching. We demonstrate Maast's utility by genotyping thousands of Helicobacter pylori strains and tracking SARS-CoV-2 diversification.
Affiliation(s)
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
- Stephen Nayfach
  - Joint Genome Institute, Department of Energy, Walnut Creek, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA
7
Shi ZJ, Nayfach S, Pollard KS. Identifying species-specific k-mers for fast and accurate metagenotyping with Maast and GT-Pro. STAR Protoc 2023; 4:101964. [PMID: 36856771 PMCID: PMC10037184 DOI: 10.1016/j.xpro.2022.101964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/19/2022] [Accepted: 12/08/2022] [Indexed: 01/22/2023] Open
Abstract
Genotyping single-nucleotide polymorphisms (SNPs) in microbiomes enables strain-level quantification. In this protocol, we describe a computational pipeline that performs fast and accurate SNP genotyping using metagenomic data. We first demonstrate how to use Maast to catalog SNPs from microbial genomes. Then we use GT-Pro to extract unique SNP-covering k-mers, optimize a data structure for storing these k-mers, and finally perform metagenotyping. For proof of concept, the protocol leverages public whole-genome sequences to metagenotype a synthetic community. For complete details on the use and execution of this protocol, please refer to Shi et al. (2022a) and Shi et al. (2022b).
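The idea of metagenotyping by exact matching of SNP-covering k-mers can be illustrated with a minimal Python sketch. This is not the Maast or GT-Pro implementation: the function names (`snp_kmers`, `build_index`, `genotype`), the toy reference, and the plain dictionary index are all assumptions made for illustration. The sketch only checks k-mer uniqueness within the SNP panel itself and ignores reverse complements, whereas a real pipeline would also exclude k-mers found elsewhere in the genome collection and would use an optimized data structure.

```python
from collections import Counter


def snp_kmers(ref, pos, alt, k):
    """Yield (allele, kmer) for every k-mer window that covers the SNP at `pos`."""
    for allele in (ref[pos], alt):
        seq = ref[:pos] + allele + ref[pos + 1:]
        first = max(0, pos - k + 1)
        last = min(pos, len(seq) - k)
        for start in range(first, last + 1):
            yield allele, seq[start:start + k]


def build_index(ref, snps, k):
    """Map each k-mer to (pos, allele); drop k-mers shared by different alleles or sites."""
    index, ambiguous = {}, set()
    for pos, alt in snps:
        for allele, kmer in snp_kmers(ref, pos, alt, k):
            key = (pos, allele)
            if kmer in index and index[kmer] != key:
                ambiguous.add(kmer)
            index[kmer] = key
    for kmer in ambiguous:
        del index[kmer]
    return index


def genotype(reads, index, k):
    """Count exact k-mer hits per (pos, allele) across all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            hit = index.get(read[i:i + k])
            if hit is not None:
                counts[hit] += 1
    return counts


if __name__ == "__main__":
    reference = "ACGTACGGTCAAT"          # hypothetical toy reference
    panel = [(6, "T")]                   # one SNP: G>T at 0-based position 6
    idx = build_index(reference, panel, k=5)
    reads = ["TACGGTC", "ACTGTCA", "CCCCCCC"]
    print(genotype(reads, idx, k=5))
```

Because lookups are exact dictionary matches, no read alignment is needed; the per-allele counts could then be turned into allele frequencies per sample.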
Affiliation(s)
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA
- Stephen Nayfach
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA; Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Berkeley, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA; University of California San Francisco, Department of Epidemiology and Biostatistics, San Francisco, CA, USA
8
Anusha P, Ragavendran C, Kamaraj C, Sangeetha K, Thesai AS, Natarajan D, Malafaia G. Eco-friendly bioremediation of pollutants from contaminated sewage wastewater using special reference bacterial strain of Bacillus cereus SDN1 and their genotoxicological assessment in Allium cepa. Sci Total Environ 2023; 863:160935. [PMID: 36527898 DOI: 10.1016/j.scitotenv.2022.160935] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/08/2022] [Revised: 12/08/2022] [Accepted: 12/11/2022] [Indexed: 06/17/2023]
Abstract
The present study aimed to assess the ability of the native bacterium Bacillus cereus SDN1 to clean up contaminated or polluted water. The isolated bacterium was identified by its morphological and biochemical characteristics, and the identification was then confirmed at the genus level. Furthermore, the isolated B. cereus (NCBI accession No: MW828583) was identified genomically by PCR amplification of 16S rDNA using a universal primer. The rDNA sequence was analyzed phylogenetically to determine the taxonomic and evolutionary profile of the isolate previously identified as Bacillus sp. Sewage wastewater was then treated with B. cereus and the bacterial consortium. After 15 days of treatment, the following pollutants or chemicals were reduced: total hardness particle removal varied from 63.33 % to 67.55 %, calcium removal varied from 90 % to 93.33 %, and the total nitrate decrease ranged from 37.77 % to 22.22 %. Electrical conductivity ranged from 1809 mS/cm to 2500 mS/cm, and pH values ranged from 6.5 to 8.95. The in-situ remediation results suggested that B. cereus has a noticeable remediation efficiency for suspended particles. A root tip test was also used to investigate the genotoxicity of treated and untreated sewage-contaminated waters on onion (Allium cepa) root cells. The highest chromosomal aberrations and mitotic inhibition were found in roots exposed to untreated contaminated sewage water; the abnormalities observed included disorganized chromosomes, sticky chains, disturbed metaphase, chromosomal displacement in anaphase, abnormal telophase, spindle disturbances, and binucleate cells. The study can thus be applied as a biomarker to detect the genotoxic impacts of sewage water pollution on biota. Furthermore, based on an identified bacterial consortium, this work offers a low-cost and eco-favorable method for treating household effluents.
Affiliation(s)
- Ponniah Anusha
  - Department of Science and Humanities, Kongunadu College of Engineering and Technology, Tholurpatti, Trichy 621 215, Tamil Nadu, India
- Chinnasamy Ragavendran
  - Department of Conservative Dentistry and Endodontics, Saveetha Dental College, and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Chennai 600 077, India
- Chinnaperumal Kamaraj
  - Interdisciplinary Institute of Indian System of Medicine (IIISM), SRM Institute of Science and Technology (SRMIST), Kattankulathur, Chennai 603 203, Tamil Nadu, India
- Kanagaraj Sangeetha
  - Natural Drug Research Laboratory, Department of Biotechnology, School of Biosciences, Periyar University, Salem, Tamil Nadu, India
- Devarajan Natarajan
  - Natural Drug Research Laboratory, Department of Biotechnology, School of Biosciences, Periyar University, Salem, Tamil Nadu, India
- Guilherme Malafaia
  - Laboratory of Toxicology Applied to the Environment, Goiano Federal Institute, Urutaí, GO, Brazil; Post-Graduation Program in Conservation of Cerrado Natural Resources, Goiano Federal Institute, Urutaí, GO, Brazil; Post-Graduation Program in Ecology, Conservation, and Biodiversity, Federal University of Uberlândia, Uberlândia, MG, Brazil; Post-Graduation Program in Biotechnology and Biodiversity, Federal University of Goiás, Goiânia, GO, Brazil
9
Zhao C, Shi ZJ, Pollard KS. Pitfalls of genotyping microbial communities with rapidly growing genome collections. Cell Syst 2023; 14:160-176.e3. [PMID: 36657438 PMCID: PMC9957970 DOI: 10.1016/j.cels.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/15/2022] [Accepted: 12/19/2022] [Indexed: 01/20/2023]
Abstract
Detecting genetic variants in metagenomic data is a priority for understanding the evolution, ecology, and functional characteristics of microbial communities. Many tools that perform this metagenotyping rely on aligning reads of unknown origin to a database of sequences from many species before calling variants. In this synthesis, we investigate how databases of increasingly diverse and closely related species have pushed the limits of current alignment algorithms, thereby degrading the performance of metagenotyping tools. We identify multi-mapping reads as a prevalent source of errors and illustrate a trade-off between retaining correct alignments versus limiting incorrect alignments, many of which map reads to the wrong species. Then we evaluate several actionable mitigation strategies and review emerging methods showing promise to further improve metagenotyping in response to the rapid growth in genome collections. Our results have implications beyond metagenotyping to the many tools in microbial genomics that depend upon accurate read mapping.
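The trade-off described above, between retaining correct alignments and limiting incorrect ones, can be made concrete with a small Python sketch of a multi-mapping filter. It is an illustrative assumption, not a method taken from the paper: the function name `filter_multimapped`, the `(read_id, species, score)` tuple layout, and the `min_gap` threshold are hypothetical. A read is kept only if its best alignment score beats the runner-up by at least `min_gap`.

```python
from collections import defaultdict


def filter_multimapped(alignments, min_gap=1):
    """Return {read_id: species} for reads with an unambiguous best alignment.

    `alignments` is an iterable of (read_id, species, score) tuples. A read is
    kept when it has a single alignment, or when its top score exceeds the
    second-best score by at least `min_gap`.
    """
    by_read = defaultdict(list)
    for read_id, species, score in alignments:
        by_read[read_id].append((score, species))

    kept = {}
    for read_id, hits in by_read.items():
        hits.sort(reverse=True)  # best score first
        if len(hits) == 1 or hits[0][0] - hits[1][0] >= min_gap:
            kept[read_id] = hits[0][1]
    return kept


if __name__ == "__main__":
    toy = [
        ("r1", "A", 60), ("r1", "B", 60),  # tie between two species: ambiguous
        ("r2", "A", 58), ("r2", "B", 40),  # clear winner
        ("r3", "C", 30),                   # single alignment
    ]
    print(filter_multimapped(toy, min_gap=1))
    print(filter_multimapped(toy, min_gap=20))
```

Raising `min_gap` discards more reads that might be assigned to the wrong species, at the cost of also discarding some correctly placed reads, which is the kind of trade-off the abstract refers to.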
Affiliation(s)
- Chunyu Zhao
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA
10
Ma S, Li H. Statistical and Computational Methods for Microbial Strain Analysis. Methods Mol Biol 2023; 2629:231-245. [PMID: 36929080 DOI: 10.1007/978-1-0716-2986-4_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/17/2023]
Abstract
Microbial strains are interpreted as lineages derived from a recent ancestor that have not experienced "too many" recombination events and that can be successfully retrieved with culture-independent techniques using metagenomic sequencing. Such strain variability has been increasingly shown to display additional phenotypic heterogeneities that affect host health, such as virulence, transmissibility, and antibiotic resistance. New statistical and computational methods have recently been developed to track strains in samples from shotgun metagenomics data, based on either reference genome sequences or metagenome-assembled genomes (MAGs). In this paper, we review some recent statistical methods for strain identification based on frequency counts at a set of single nucleotide variants (SNVs) within a set of single-copy marker genes. These methods differ in terms of whether reference genome sequences are needed, how SNVs are called, what methods of deconvolution are used and whether the methods can be applied to multiple samples. We conclude our review with areas that require further research.
Affiliation(s)
- Siyuan Ma
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
- Hongzhe Li
  - Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA
11
Zhao C, Goldman M, Smith BJ, Pollard KS. Genotyping Microbial Communities with MIDAS2: From Metagenomic Reads to Allele Tables. Curr Protoc 2022; 2:e604. [PMID: 36469554 PMCID: PMC9907011 DOI: 10.1002/cpz1.604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
Abstract
The Metagenomic Intra-Species Diversity Analysis System 2 (MIDAS2) is a scalable pipeline that identifies single nucleotide variants and gene copy number variants in metagenomes using comprehensive reference databases built from public microbial genome collections (metagenotyping). MIDAS2 is the first metagenotyping tool with functionality to control metagenomic read mapping filters and to customize the reference database to the microbial community, features that improve the precision and recall of detected variants. In this article we present four basic protocols for the most common use cases of MIDAS2, along with supporting protocols for installation and use. In addition, we provide in-depth guidance on adjusting command line parameters, editing the reference database, optimizing hardware utilization, and understanding the metagenotyping results. All the steps of metagenotyping, from raw sequencing reads to population genetic analysis, are demonstrated with example data in two downloadable sequencing libraries of single-end metagenomic reads representing a mixture of multiple bacterial species. This set of protocols empowers users to accurately genotype hundreds of species in thousands of samples, providing rich genetic data for studying the evolution and strain-level ecology of microbial communities. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC.
- Basic Protocol 1: Species prescreening
- Basic Protocol 2: Download MIDAS reference database
- Basic Protocol 3: Population single nucleotide variant calling
- Basic Protocol 4: Pan-genome copy number variant calling
- Support Protocol 1: Installing MIDAS2
- Support Protocol 2: Command line inputs
- Support Protocol 3: Metagenotyping with a custom collection of genomes
- Support Protocol 4: Metagenotyping with advanced parameters
Affiliation(s)
- Chunyu Zhao
  - Data Science, Chan Zuckerberg Biohub, San Francisco, California
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - These authors contributed equally to this work
- Miriam Goldman
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Biomedical Informatics, University of California San Francisco, San Francisco, California
  - These authors contributed equally to this work
- Byron J. Smith
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
- Katherine S. Pollard
  - Data Science, Chan Zuckerberg Biohub, San Francisco, California
  - Data Science and Biotechnology, Gladstone Institutes, San Francisco, California
  - Epidemiology and Biostatistics, University of California San Francisco, San Francisco, California
12
Editorial: Artificial Intelligence, machine learning and the changing landscape of molecular biology. J Mol Biol 2022; 434:167712. [PMID: 35777464 DOI: 10.1016/j.jmb.2022.167712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]