1. Kim N, Ma J, Kim W, Kim J, Belenky P, Lee I. Genome-resolved metagenomics: a game changer for microbiome medicine. Exp Mol Med 2024:10.1038/s12276-024-01262-7. [PMID: 38945961 DOI: 10.1038/s12276-024-01262-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2023] [Revised: 03/06/2024] [Accepted: 03/25/2024] [Indexed: 07/02/2024] Open
Abstract
Recent substantial evidence implicating commensal bacteria in human diseases has given rise to a new domain in biomedical research: microbiome medicine. This emerging field aims to understand and leverage the human microbiota and derivative molecules for disease prevention and treatment. Despite the complex and hierarchical organization of this ecosystem, most research over the years has relied on 16S amplicon sequencing, a legacy of bacterial phylogeny and taxonomy. Although advanced sequencing technologies have enabled cost-effective analysis of entire microbiota, translating the relatively short nucleotide information into the functional and taxonomic organization of the microbiome has posed challenges until recently. In the last decade, genome-resolved metagenomics, which aims to reconstruct microbial genomes directly from whole-metagenome sequencing data, has made significant strides and continues to unveil the mysteries of various human-associated microbial communities. There has been a rapid increase in the volume of whole metagenome sequencing data and in the compilation of novel metagenome-assembled genomes and protein sequences in public depositories. This review provides an overview of the capabilities and methods of genome-resolved metagenomics for studying the human microbiome, with a focus on investigating the prokaryotic microbiota of the human gut. Just as decoding the human genome and its variations marked the beginning of the genomic medicine era, unraveling the genomes of commensal microbes and their sequence variations is ushering us into the era of microbiome medicine. Genome-resolved metagenomics stands as a pivotal tool in this transition and can accelerate our journey toward achieving these scientific and medical milestones.
Affiliation(s)
- Nayeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Junyeong Ma
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Wonjong Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Jungyeon Kim
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea
- Peter Belenky
  - Department of Molecular Microbiology and Immunology, Brown University, Providence, RI, 02912, USA.
- Insuk Lee
  - Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul, 03722, Republic of Korea.
  - POSTECH Biotech Center, Pohang University of Science and Technology (POSTECH), Pohang, 37673, Republic of Korea.
2. Wang S, Jiang Y, Che L, Wang RH, Li SC. Enhancing insights into diseases through horizontal gene transfer event detection from gut microbiome. Nucleic Acids Res 2024:gkae515. [PMID: 38884260 DOI: 10.1093/nar/gkae515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Revised: 04/23/2024] [Accepted: 06/04/2024] [Indexed: 06/18/2024] Open
Abstract
Horizontal gene transfer (HGT) phenomena pervade the gut microbiome and significantly impact human health. Yet, no current method can accurately identify complete HGT events, including the transferred sequence and the associated deletion and insertion breakpoints, from shotgun metagenomic data. Here, we develop LocalHGT, which facilitates the reliable and swift detection of complete HGT events from shotgun metagenomic data, delivering an accuracy of 99.4% (verified by Nanopore data) across 200 gut microbiome samples, and achieving an average F1 score of 0.99 on 100 simulated datasets. LocalHGT enables a systematic characterization of HGT events within the human gut microbiome across 2098 samples, revealing that multiple recipient genome sites can become targets of a transferred sequence, that microhomology is enriched in HGT breakpoint junctions (P-value = 3.3e-58), and that HGTs can function as host-specific fingerprints, as indicated by the significantly higher HGT similarity of intra-personal temporal samples than of inter-personal samples (P-value = 4.3e-303). Crucially, HGTs showed potential contributions to colorectal cancer (CRC) and acute diarrhoea, as evidenced by the enrichment of the butyrate metabolism pathway (P-value = 3.8e-17) and the shigellosis pathway (P-value = 5.9e-13) in the respective associated HGTs. Furthermore, differential HGTs demonstrated promise as biomarkers for predicting various diseases. Integrating HGTs into a CRC prediction model achieved an AUC of 0.87.
Affiliation(s)
- Shuai Wang
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Yiqi Jiang
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Lijia Che
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Ruo Han Wang
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
- Shuai Cheng Li
  - City University of Hong Kong Shenzhen Research Institute, Shenzhen, China
  - Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong
3. Wang D, Yang X, Ren Z, Hu B, Zhao H, Yang K, Shi P, Zhang Z, Feng Q, Nawenja CV, Obanda V, Robert K, Nalikka B, Waruhiu CN, Ochola GO, Onyuok SO, Ochieng H, Li B, Zhu Y, Si H, Yin J, Kristiansen K, Jin X, Xu X, Xiao M, Agwanda B, Ommeh S, Li J, Shi ZL. Substantial viral diversity in bats and rodents from East Africa: insights into evolution, recombination, and cocirculation. Microbiome 2024; 12:72. [PMID: 38600530 PMCID: PMC11005217 DOI: 10.1186/s40168-024-01782-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 02/26/2024] [Indexed: 04/12/2024]
Abstract
BACKGROUND Zoonotic viruses cause substantial public health and socioeconomic problems worldwide. Understanding how viruses evolve and spread within and among wildlife species is a critical step when aiming for proactive identification of viral threats to prevent future pandemics. Despite the many proposed factors influencing viral diversity, the genomic diversity and structure of viral communities in East Africa are largely unknown. RESULTS Using 38.3 Tb of metatranscriptomic data obtained via ultradeep sequencing, we screened vertebrate-associated viromes from 844 bats and 250 rodents from Kenya and Uganda collected from the wild. The 251 vertebrate-associated viral genomes of bats (212) and rodents (39) revealed the vast diversity, host-related variability, and high geographic specificity of viruses in East Africa. Among the surveyed viral families, Coronaviridae and Circoviridae showed low host specificity, high conservation of replication-associated proteins, high divergence among viral entry proteins, and frequent recombination. Despite major dispersal limitations, recurrent mutations, cocirculation, and occasional gene flow contribute to the high local diversity of viral genomes. CONCLUSIONS The present study not only shows the landscape of bat and rodent viromes in this zoonotic hotspot but also reveals genomic signatures driven by the evolution and dispersal of the viral community, laying solid groundwork for future proactive surveillance of emerging zoonotic pathogens in wildlife.
Affiliation(s)
- Daxi Wang
  - BGI Research, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Xinglou Yang
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan, China
  - Hubei Jiangxia Lab, Wuhan, 430071, China
- Zirui Ren
  - BGI Research, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Ben Hu
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan, China
- Hailong Zhao
  - BGI Research, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Kaixin Yang
  - BGI Research, Shenzhen, 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Peibo Shi
  - BGI Research, Shenzhen, 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Zhipeng Zhang
  - BGI Research, Shenzhen, 518083, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Qikai Feng
  - BGI Research, Shenzhen, 518083, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China
- Carol Vannesa Nawenja
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - University of Chinese Academy of Sciences, Beijing, China
- Vincent Obanda
  - Veterinary Services Department, Kenya Wildlife Service, Nairobi, Kenya
- Kityo Robert
  - Department of Zoology, Entomology and Fisheries Sciences, School of BioSciences, Makerere University, Kampala, Uganda
- Betty Nalikka
  - Department of Zoology, Entomology and Fisheries Sciences, School of BioSciences, Makerere University, Kampala, Uganda
- Cecilia Njeri Waruhiu
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Griphin Ochieng Ochola
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Mammalogy Section, National Museums of Kenya, Nairobi, Kenya
- Samson Omondi Onyuok
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Mammalogy Section, National Museums of Kenya, Nairobi, Kenya
- Harold Ochieng
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
  - Mammalogy Section, National Museums of Kenya, Nairobi, Kenya
- Bei Li
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Yan Zhu
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Haorui Si
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China
- Karsten Kristiansen
  - Laboratory of Genomics and Molecular Biomedicine, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Xin Jin
  - BGI Research, Shenzhen, 518083, China
- Xun Xu
  - BGI Research, Shenzhen, 518083, China
- Minfeng Xiao
  - BGI Research, Shenzhen, 518083, China.
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China.
- Bernard Agwanda
  - Mammalogy Section, National Museums of Kenya, Nairobi, Kenya.
- Sheila Ommeh
  - Center for Animal Science, Queensland Alliance for Agriculture & Food Innovation, The University of Queensland, St Lucia, QLD, 4072, Australia.
- Junhua Li
  - BGI Research, Shenzhen, 518083, China.
  - Shenzhen Key Laboratory of Unknown Pathogen Identification, BGI Research, Shenzhen, 518083, China.
- Zheng-Li Shi
  - CAS Key Laboratory of Special Pathogens and Biosafety, Wuhan Institute of Virology, Chinese Academy of Sciences, Wuhan, China.
  - Sino-Africa Joint Research Center, Chinese Academy of Sciences, Wuhan, China.
4. Zahavi L, Lavon A, Reicher L, Shoer S, Godneva A, Leviatan S, Rein M, Weissbrod O, Weinberger A, Segal E. Bacterial SNPs in the human gut microbiome associate with host BMI. Nat Med 2023; 29:2785-2792. [PMID: 37919437 PMCID: PMC10999242 DOI: 10.1038/s41591-023-02599-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2022] [Accepted: 09/19/2023] [Indexed: 11/04/2023]
Abstract
Genome-wide association studies (GWASs) have provided numerous associations between human single-nucleotide polymorphisms (SNPs) and health traits. Likewise, metagenome-wide association studies (MWASs) between bacterial SNPs and human traits can suggest mechanistic links, but very few such studies have been done thus far. In this study, we devised an MWAS framework to detect SNPs and associate them with host phenotypes systematically. We recruited and obtained gut metagenomic samples from a cohort of 7,190 healthy individuals and discovered 1,358 statistically significant associations between a bacterial SNP and host body mass index (BMI), from which we distilled 40 independent associations. Most of these associations were unexplained by diet, medications or physical exercise, and 17 replicated in a geographically independent cohort. We uncovered BMI-associated SNPs in 27 bacterial species, and 12 of them showed no association by standard relative abundance analysis. We revealed a BMI association of an SNP in a potentially inflammatory pathway of Bilophila wadsworthia as well as of a group of SNPs in a region coding for energy metabolism functions in a Faecalibacterium prausnitzii genome. Our results demonstrate the importance of considering nucleotide-level diversity in microbiome studies and pave the way toward improved understanding of interpersonal microbiome differences and their potential health implications.
Affiliation(s)
- Liron Zahavi
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Amit Lavon
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Lee Reicher
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
  - Lis Maternity and Women's Hospital, Tel Aviv Sourasky Medical Center, Tel Aviv University (affiliated with Sackler Faculty of Medicine), Tel Aviv, Israel
- Saar Shoer
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Anastasia Godneva
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Sigal Leviatan
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Michal Rein
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Adina Weinberger
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Eran Segal
  - Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot, Israel.
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel.
5. Cai J, Auster A, Cho S, Lai Z. Dissecting the human gut microbiome to better decipher drug liability: A once-forgotten organ takes center stage. J Adv Res 2023; 52:171-201. [PMID: 37419381 PMCID: PMC10555929 DOI: 10.1016/j.jare.2023.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 05/25/2023] [Accepted: 07/03/2023] [Indexed: 07/09/2023] Open
Abstract
BACKGROUND The gut microbiome is a diverse system within the gastrointestinal tract composed of trillions of microorganisms (gut microbiota), along with their genomes. Accumulated evidence has revealed the significance of the gut microbiome in human health and disease. Due to its ability to alter drug/xenobiotic pharmacokinetics and therapeutic outcomes, this once-forgotten "metabolic organ" is receiving increasing attention. In parallel with the growing microbiome-driven studies, traditional analytical techniques and technologies have also evolved, allowing researchers to gain a deeper understanding of the functional and mechanistic effects of the gut microbiome. AIM OF REVIEW From a drug development perspective, microbial drug metabolism is becoming increasingly critical as new modalities (e.g., degradation peptides) with potential microbial metabolism implications emerge. The pharmaceutical industry thus has a pressing need to stay up-to-date with, and continue pursuing, research efforts investigating the clinical impact of the gut microbiome on drug actions whilst integrating advances in analytical technology and gut microbiome models. Our review aims to practically address this need by comprehensively introducing the latest innovations in microbial drug metabolism research, including strengths and limitations, to aid in mechanistically dissecting the impact of the gut microbiome on drug metabolism and therapeutic impact, and to develop informed strategies to address microbiome-related drug liability and minimize clinical risk. KEY SCIENTIFIC CONCEPTS OF REVIEW We present comprehensive mechanisms and co-contributing factors by which the gut microbiome influences drug therapeutic outcomes. We highlight in vitro, in vivo, and in silico models for elucidating the mechanistic role and clinical impact of the gut microbiome on drugs in combination with high-throughput, functionally oriented, and physiologically relevant techniques. Integrating pharmaceutical knowledge and insight, we provide practical suggestions to pharmaceutical scientists for when, why, how, and what is next in microbial studies for improved drug efficacy and safety, and ultimately, support precision medicine formulation for personalized and efficacious therapies.
Affiliation(s)
- Jingwei Cai
  - Drug Metabolism & Pharmacokinetics, Genentech Inc., South San Francisco, CA 94080, USA.
- Alexis Auster
  - Drug Metabolism & Pharmacokinetics, Genentech Inc., South San Francisco, CA 94080, USA
- Sungjoon Cho
  - Drug Metabolism & Pharmacokinetics, Genentech Inc., South San Francisco, CA 94080, USA
- Zijuan Lai
  - Drug Metabolism & Pharmacokinetics, Genentech Inc., South San Francisco, CA 94080, USA
6. Shi ZJ, Nayfach S, Pollard KS. Maast: genotyping thousands of microbial strains efficiently. Genome Biol 2023; 24:186. [PMID: 37563669 PMCID: PMC10416524 DOI: 10.1186/s13059-023-03030-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2022] [Accepted: 07/31/2023] [Indexed: 08/12/2023] Open
Abstract
Existing single nucleotide polymorphism (SNP) genotyping algorithms do not scale for species with thousands of sequenced strains, nor do they account for conspecific redundancy. Here we present a bioinformatics tool, Maast, which empowers population genetic meta-analysis of microbes at an unrivaled scale. Maast implements a novel algorithm to heuristically identify a minimal set of diverse conspecific genomes, then constructs a reliable SNP panel for each species, and enables rapid and accurate genotyping using a hybrid of whole-genome alignment and k-mer exact matching. We demonstrate Maast's utility by genotyping thousands of Helicobacter pylori strains and tracking SARS-CoV-2 diversification.
Affiliation(s)
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA
- Stephen Nayfach
  - Joint Genome Institute, Department of Energy, Walnut Creek, CA, USA
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA.
  - Gladstone Institutes of Data Science and Biotechnology, San Francisco, CA, USA.
  - Department of Epidemiology and Biostatistics, University of California San Francisco, San Francisco, CA, USA.
7. Briscoe L, Halperin E, Garud NR. SNV-FEAST: microbial source tracking with single nucleotide variants. Genome Biol 2023; 24:101. [PMID: 37121994 PMCID: PMC10150486 DOI: 10.1186/s13059-023-02927-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 04/06/2023] [Indexed: 05/02/2023] Open
Abstract
Elucidating the sources of a microbiome can provide insight into the ecological dynamics responsible for the formation of these communities. Source tracking approaches to date leverage species abundance information; however, single nucleotide variants (SNVs) may be more informative because of their high specificity to certain sources. To overcome the computational burden of utilizing all SNVs for a given sample, we introduce a novel method to identify signature SNVs for source tracking. Signature SNVs used as input into a previously designed source tracking algorithm, FEAST, can more accurately estimate contributions than species and provide novel insights, demonstrated in three case studies.
Affiliation(s)
- Leah Briscoe
  - Bioinformatics Interdepartmental Program, University of California Los Angeles, Los Angeles, CA, USA.
- Eran Halperin
  - Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Department of Anesthesiology and Perioperative Medicine, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA
  - Institute of Precision Health, University of California Los Angeles, Los Angeles, CA, USA
- Nandita R Garud
  - Department of Human Genetics, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA.
  - Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, CA, USA.
8. Shi ZJ, Nayfach S, Pollard KS. Identifying species-specific k-mers for fast and accurate metagenotyping with Maast and GT-Pro. STAR Protoc 2023; 4:101964. [PMID: 36856771 PMCID: PMC10037184 DOI: 10.1016/j.xpro.2022.101964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 11/19/2022] [Accepted: 12/08/2022] [Indexed: 01/22/2023] Open
Abstract
Genotyping single-nucleotide polymorphisms (SNPs) in microbiomes enables strain-level quantification. In this protocol, we describe a computational pipeline that performs fast and accurate SNP genotyping using metagenomic data. We first demonstrate how to use Maast to catalog SNPs from microbial genomes. Then we use GT-Pro to extract unique SNP-covering k-mers, optimize a data structure for storing these k-mers, and finally perform metagenotyping. For proof of concept, the protocol leverages public whole-genome sequences to metagenotype a synthetic community. For complete details on the use and execution of this protocol, please refer to Shi et al. (2022a) and Shi et al. (2022b).
Affiliation(s)
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA
- Stephen Nayfach
  - Department of Energy, Joint Genome Institute, Walnut Creek, CA, USA; Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Berkeley, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institutes, Data Science and Biotechnology, San Francisco, CA, USA; University of California San Francisco, Department of Epidemiology and Biostatistics, San Francisco, CA, USA.
9. Zhao C, Shi ZJ, Pollard KS. Pitfalls of genotyping microbial communities with rapidly growing genome collections. Cell Syst 2023; 14:160-176.e3. [PMID: 36657438 PMCID: PMC9957970 DOI: 10.1016/j.cels.2022.12.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/30/2022] [Revised: 10/15/2022] [Accepted: 12/19/2022] [Indexed: 01/20/2023]
Abstract
Detecting genetic variants in metagenomic data is a priority for understanding the evolution, ecology, and functional characteristics of microbial communities. Many tools that perform this metagenotyping rely on aligning reads of unknown origin to a database of sequences from many species before calling variants. In this synthesis, we investigate how databases of increasingly diverse and closely related species have pushed the limits of current alignment algorithms, thereby degrading the performance of metagenotyping tools. We identify multi-mapping reads as a prevalent source of errors and illustrate a trade-off between retaining correct alignments versus limiting incorrect alignments, many of which map reads to the wrong species. Then we evaluate several actionable mitigation strategies and review emerging methods showing promise to further improve metagenotyping in response to the rapid growth in genome collections. Our results have implications beyond metagenotyping to the many tools in microbial genomics that depend upon accurate read mapping.
Affiliation(s)
- Chunyu Zhao
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Zhou Jason Shi
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
- Katherine S Pollard
  - Chan Zuckerberg Biohub, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Epidemiology & Biostatistics, University of California, San Francisco, San Francisco, CA, USA.
10. Zhang N, Kandalai S, Zhou X, Hossain F, Zheng Q. Applying multi-omics toward tumor microbiome research. iMeta 2023; 2:e73. [PMID: 38868335 PMCID: PMC10989946 DOI: 10.1002/imt2.73] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 09/08/2022] [Revised: 10/30/2022] [Accepted: 11/28/2022] [Indexed: 06/14/2024]
Abstract
Rather than a "short-term tenant," the tumor microbiome has been shown to play a vital role as a "permanent resident," affecting carcinogenesis, cancer development, metastasis, and cancer therapies. As the tumor microbiome has great potential to become a target for the early diagnosis and treatment of cancer, recent research on the relevance of the tumor microbiota has attracted wide attention from various scientific fields, resulting in remarkable progress that benefits from the development of interdisciplinary technologies. However, many challenges remain in this emerging area, such as the low biomass of intratumoral bacteria and the unculturable character of some microbial species. Due to the complexity of tumor microbiome research (e.g., the heterogeneity of the tumor microenvironment), new methods with high spatial and temporal resolution are urgently needed. Among these developing methods, multi-omics technologies (combinations of genomics, transcriptomics, proteomics, and metabolomics) are powerful approaches that can facilitate the understanding of the tumor microbiome on different levels of the central dogma. Therefore, multi-omics (especially single-cell omics) will have an enormous impact on future studies of the interplay between microbes and the tumor microenvironment. In this review, we have systematically summarized the advances in multi-omics and their existing and potential applications in tumor microbiome research, thus providing an omics toolbox for investigators to reference in the future.
Affiliation(s)
- Nan Zhang
  - Department of Radiation Oncology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
  - Center for Cancer Metabolism, Ohio State University Comprehensive Cancer Center - James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio, USA
- Shruthi Kandalai
  - Department of Radiation Oncology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
  - Center for Cancer Metabolism, Ohio State University Comprehensive Cancer Center - James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio, USA
- Xiaozhuang Zhou
  - Department of Radiation Oncology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
  - Center for Cancer Metabolism, Ohio State University Comprehensive Cancer Center - James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio, USA
- Farzana Hossain
  - Department of Radiation Oncology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
  - Center for Cancer Metabolism, Ohio State University Comprehensive Cancer Center - James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio, USA
- Qingfei Zheng
  - Department of Radiation Oncology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
  - Center for Cancer Metabolism, Ohio State University Comprehensive Cancer Center - James Cancer Hospital and Solove Research Institute, The Ohio State University, Columbus, Ohio, USA
  - Department of Biological Chemistry and Pharmacology, College of Medicine, The Ohio State University, Columbus, Ohio, USA
11
Anderson BD, Bisanz JE. Challenges and opportunities of strain diversity in gut microbiome research. Front Microbiol 2023; 14:1117122. [PMID: 36876113 PMCID: PMC9981649 DOI: 10.3389/fmicb.2023.1117122] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/06/2022] [Accepted: 01/24/2023] [Indexed: 02/19/2023]
Abstract
Just because two things are related does not mean they are the same. In analyzing microbiome data, we are often limited to species-level analyses, and even with the ability to resolve strains, we lack comprehensive databases and understanding of the importance of strain-level variation outside of a limited number of model organisms. The bacterial genome is highly plastic, with gene gain and loss occurring at rates comparable to or higher than those of de novo mutations. As such, the conserved portion of the genome is often only a fraction of the pangenome, which gives rise to significant phenotypic variation, particularly in traits that are important in host-microbe interactions. In this review, we discuss the mechanisms that give rise to strain variation and methods that can be used to study it. We identify that while strain diversity can act as a major barrier in interpreting and generalizing microbiome data, it can also be a powerful tool for mechanistic research. We then highlight recent examples demonstrating the importance of strain variation in colonization, virulence, and xenobiotic metabolism. Moving past taxonomy and the species concept will be crucial for future mechanistic research to understand microbiome structure and function.
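The point about the conserved genome being only a fraction of the pangenome can be made concrete with a small calculation. The sketch below is illustrative only: the strain names and gene labels are hypothetical, and real analyses would derive gene presence from annotated genomes rather than hand-written sets.

```python
def core_and_pangenome(strain_genes):
    """Return (core, pan): genes shared by every strain, and genes found in any strain."""
    gene_sets = [set(genes) for genes in strain_genes.values()]
    core = set.intersection(*gene_sets)
    pan = set.union(*gene_sets)
    return core, pan


# Hypothetical gene content for three strains of one species.
strains = {
    "strain_A": {"dnaA", "gyrB", "rpoB", "capsule_1", "bsh"},
    "strain_B": {"dnaA", "gyrB", "rpoB", "capsule_2"},
    "strain_C": {"dnaA", "gyrB", "rpoB", "bsh", "phage_int"},
}

core, pan = core_and_pangenome(strains)
core_fraction = len(core) / len(pan)  # share of the pangenome present in all strains
print(sorted(core), len(pan), round(core_fraction, 3))
```

In this toy case only three of the seven genes are shared by every strain, so a species-level label would hide the variable genes (here the capsule, bile salt hydrolase, and phage entries) that differ between strains.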
Affiliation(s)
- Benjamin D Anderson
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States
- Jordan E Bisanz
  - Department of Biochemistry and Molecular Biology, Pennsylvania State University, University Park, PA, United States
  - The Penn State Microbiome Center, Huck Institutes of the Life Sciences, University Park, PA, United States
12
Fu W, Zhao C, Xue W, Li C. An investigation of the influence of microstructure surface topography on the imaging mechanism to explore super-resolution microstructure. Sci Rep 2022; 12:13651. [PMID: 35953698 PMCID: PMC9372069 DOI: 10.1038/s41598-022-17209-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2022] [Accepted: 07/21/2022] [Indexed: 11/09/2022]
Abstract
Vision-based precision measurement is limited by the optical resolution. Although various super-resolution algorithms have been developed, measurement precision and accuracy are difficult to guarantee. To achieve nanoscale resolution measurement, a super-resolution microstructure concept is proposed, based on the idea that a strong mathematical mapping relationship may exist between microstructure surface topography features and the corresponding image pixel intensities. In this work, a series of microgrooves are ultra-precision machined, and their surface topographies and images are measured. A mapping relationship model is established to analyze the effect of the microgroove surface topography on the imaging mechanism. The results show that the surface roughness and surface defects of the microgroove have significant effects on predicting the imaging mechanism. The optimized machining parameters are determined afterward. This paper demonstrates a feasible approach to support the design and manufacture of super-resolution microstructures, which have essential applications in precision positioning measurement.
Affiliation(s)
- Wenpeng Fu
  - School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China
- Chenyang Zhao
  - School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China
- Wen Xue
  - School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China
- Changlin Li
  - School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China
13
Smith BJ, Li X, Shi ZJ, Abate A, Pollard KS. Scalable Microbial Strain Inference in Metagenomic Data Using StrainFacts. Front Bioinform 2022; 2:867386. [PMID: 36304283 PMCID: PMC9580935 DOI: 10.3389/fbinf.2022.867386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2022] [Accepted: 04/14/2022] [Indexed: 11/25/2022]
Abstract
While genome databases are nearing a complete catalog of species commonly inhabiting the human gut, their representation of intraspecific diversity is lacking for all but the most abundant and frequently studied taxa. Statistical deconvolution of allele frequencies from shotgun metagenomic data into strain genotypes and relative abundances is a promising approach, but existing methods are limited by computational scalability. Here we introduce StrainFacts, a method for strain deconvolution that enables inference across tens of thousands of metagenomes. We harness a “fuzzy” genotype approximation that makes the underlying graphical model fully differentiable, unlike existing methods. This allows parameter estimates to be optimized with gradient-based methods, speeding up model fitting by two orders of magnitude. A GPU implementation provides additional scalability. Extensive simulations show that StrainFacts can perform strain inference on thousands of metagenomes and has comparable accuracy to more computationally intensive tools. We further validate our strain inferences using single-cell genomic sequencing from a human stool sample. Applying StrainFacts to a collection of more than 10,000 publicly available human stool metagenomes, we quantify patterns of strain diversity, biogeography, and linkage-disequilibrium that agree with and expand on what is known based on existing reference genomes. StrainFacts paves the way for large-scale biogeography and population genetic studies of microbiomes using metagenomic data.
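The core idea in this abstract, relaxing discrete genotypes to continuous "fuzzy" values so that the model can be fitted by gradient-based optimization, can be sketched in a few dozen lines. The code below is a toy illustration under stated assumptions, not StrainFacts itself: it swaps in a plain squared-error loss on allele frequencies, uses hand-derived gradients with a simple backtracking step, and all function names (`expected_freq`, `fit`) and the example data are hypothetical. The published tool's likelihood, priors, and optimizer are not reproduced here.

```python
import math
import random


def softmax(row):
    """Map unconstrained logits to strain abundances that sum to 1."""
    top = max(row)
    exps = [math.exp(v - top) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(v):
    """Map a logit to a 'fuzzy' genotype between 0 and 1 (numerically stable form)."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def expected_freq(pi, gamma):
    """Expected allele frequency: freq[s][g] = sum_k pi[s][k] * gamma[k][g]."""
    n_sites = len(gamma[0])
    return [[sum(p * g_row[g] for p, g_row in zip(pi_s, gamma))
             for g in range(n_sites)] for pi_s in pi]


def squared_loss(a, b, freq):
    """Squared error between predicted and observed frequencies (a simplification)."""
    pred = expected_freq([softmax(r) for r in a],
                         [[sigmoid(v) for v in r] for r in b])
    return 0.5 * sum((p - f) ** 2
                     for p_row, f_row in zip(pred, freq)
                     for p, f in zip(p_row, f_row))


def fit(freq, n_strains, steps=200, lr=1.0, seed=0):
    """Gradient descent on abundance logits `a` and genotype logits `b`."""
    rng = random.Random(seed)
    n_samples, n_sites = len(freq), len(freq[0])
    a = [[rng.gauss(0, 1) for _ in range(n_strains)] for _ in range(n_samples)]
    b = [[rng.gauss(0, 1) for _ in range(n_sites)] for _ in range(n_strains)]
    history = [squared_loss(a, b, freq)]
    for _ in range(steps):
        pi = [softmax(r) for r in a]
        gamma = [[sigmoid(v) for v in r] for r in b]
        pred = expected_freq(pi, gamma)
        resid = [[p - f for p, f in zip(pr, fr)] for pr, fr in zip(pred, freq)]
        # Chain rule through the linear mixing step.
        d_pi = [[sum(resid[s][g] * gamma[k][g] for g in range(n_sites))
                 for k in range(n_strains)] for s in range(n_samples)]
        d_gamma = [[sum(pi[s][k] * resid[s][g] for s in range(n_samples))
                    for g in range(n_sites)] for k in range(n_strains)]
        # Chain rule through the softmax and sigmoid reparameterizations.
        grad_a = [[pi[s][k] * (d_pi[s][k]
                               - sum(d_pi[s][j] * pi[s][j] for j in range(n_strains)))
                   for k in range(n_strains)] for s in range(n_samples)]
        grad_b = [[d_gamma[k][g] * gamma[k][g] * (1.0 - gamma[k][g])
                   for g in range(n_sites)] for k in range(n_strains)]
        # Backtracking: halve the step until the loss does not increase.
        step = lr
        while step > 1e-12:
            a_new = [[v - step * dv for v, dv in zip(r, dr)] for r, dr in zip(a, grad_a)]
            b_new = [[v - step * dv for v, dv in zip(r, dr)] for r, dr in zip(b, grad_b)]
            if squared_loss(a_new, b_new, freq) <= history[-1]:
                a, b = a_new, b_new
                break
            step *= 0.5
        history.append(squared_loss(a, b, freq))
    return [softmax(r) for r in a], [[sigmoid(v) for v in r] for r in b], history


# Hypothetical data: 3 samples, 2 strains, 4 biallelic sites.
true_pi = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]
true_gamma = [[0, 1, 1, 0], [1, 1, 0, 0]]
freq = expected_freq(true_pi, true_gamma)
pi_hat, gamma_hat, history = fit(freq, n_strains=2)
print(history[0], history[-1])
```

Because every operation is differentiable, the same structure carries over to automatic differentiation frameworks and GPUs, which is the scalability argument the abstract makes. Note that a factorization of this kind is only identifiable up to relabeling of strains, so recovered strains may appear in a different order from the simulated ones.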
Affiliation(s)
- Byron J. Smith
  - The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
- Xiangpeng Li
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, United States
- Zhou Jason Shi
  - The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
  - Chan-Zuckerberg Biohub, San Francisco, CA, United States
- Adam Abate
  - Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, United States
  - Chan-Zuckerberg Biohub, San Francisco, CA, United States
- Katherine S. Pollard
  - The Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, United States
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, San Francisco, CA, United States
  - Chan-Zuckerberg Biohub, San Francisco, CA, United States
  - *Correspondence: Katherine S. Pollard
14
Miller BF, Huang F, Atta L, Sahoo A, Fan J. Reference-free cell type deconvolution of multi-cellular pixel-resolution spatially resolved transcriptomics data. Nat Commun 2022; 13:2339. [PMID: 35487922 PMCID: PMC9055051 DOI: 10.1038/s41467-022-30033-z] [Citation(s) in RCA: 37] [Impact Index Per Article: 18.5] [Received: 06/16/2021] [Accepted: 04/12/2022] [Indexed: 12/12/2022]
Abstract
Recent technological advancements have enabled spatially resolved transcriptomic profiling but at multi-cellular pixel resolution, thereby hindering the identification of cell-type-specific spatial patterns and gene expression variation. To address this challenge, we develop STdeconvolve as a reference-free approach to deconvolve underlying cell types comprising such multi-cellular pixel resolution spatial transcriptomics (ST) datasets. Using simulated as well as real ST datasets from diverse spatial transcriptomics technologies comprising a variety of spatial resolutions such as Spatial Transcriptomics, 10X Visium, DBiT-seq, and Slide-seq, we show that STdeconvolve can effectively recover cell-type transcriptional profiles and their proportional representation within pixels without reliance on external single-cell transcriptomics references. STdeconvolve provides comparable performance to existing reference-based methods when suitable single-cell references are available, as well as potentially superior performance when suitable single-cell references are not available. STdeconvolve is available as an open-source R software package with the source code available at https://github.com/JEFworks-Lab/STdeconvolve. Identifying cell-type-specific spatial patterns in ST data is critical for understanding tissue organization but current methods rely on external references. Here the authors develop a reference-free method to effectively recover cell-type transcriptional profiles and proportions.
Affiliation(s)
- Brendan F Miller
  - Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, 21211, United States
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, United States
- Feiyang Huang
  - Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, 21211, United States
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, United States
- Lyla Atta
  - Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, 21211, United States
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, United States
- Arpan Sahoo
  - Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, 21211, United States
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, United States
- Jean Fan
  - Center for Computational Biology, Whiting School of Engineering, Johns Hopkins University, Baltimore, MD, 21211, United States
  - Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD, 21218, United States
  - Department of Computer Science, Johns Hopkins University, Baltimore, MD, 21218, United States
15
Shalon N, Relman DA, Yaffe E. Precise genotyping of circular mobile elements from metagenomic data uncovers human-associated plasmids with recent common ancestors. Genome Res 2022; 32:986-1003. [PMID: 35414589 PMCID: PMC9104695 DOI: 10.1101/gr.275894.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2021] [Accepted: 04/01/2022] [Indexed: 11/25/2022]
Abstract
Mobile genetic elements with circular genomes play a key role in the evolution of microbial communities. Their circular genomes correspond to circular walks in metagenome graphs, and yet, assemblies derived from natural microbial communities produce graphs riddled with spurious cycles, complicating the accurate reconstruction of circular genomes. We present DomCycle, an algorithm that reconstructs likely circular genomes based on the identification of so-called 'dominant' graph cycles. In the implementation we leverage paired reads to bridge assembly gaps and scrutinize cycles through a nucleotide-level analysis, making the approach robust to misassembly artifacts. We validated the approach using simulated and real sequencing data. Application of DomCycle to 32 publicly available DNA shotgun sequence data sets from diverse natural environments led to the reconstruction of hundreds of circular mobile genomes. Clustering revealed 20 highly prevalent and cryptic plasmids that have clonal population structures with recent common ancestors. This method facilitates the study of microbial communities that evolve through horizontal gene transfer.
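To illustrate why spurious cycles are a problem and how coverage can help separate them from a genuine circular element, the sketch below enumerates simple cycles in a tiny coverage-weighted directed graph and ranks them. The abstract does not define "dominant", so the score used here (lowest coverage on the cycle divided by the highest coverage on any other edge touching the cycle) is an illustrative assumption and not DomCycle's actual criterion. The graph, the coverage values (assumed positive), and the `min_ratio` threshold are likewise hypothetical.

```python
def simple_cycles(coverage):
    """Enumerate simple directed cycles, each reported once from its smallest node."""
    adj = {}
    for u, v in coverage:
        adj.setdefault(u, []).append(v)
    cycles = []

    def dfs(start, node, path, seen):
        for nxt in adj.get(node, []):
            if nxt == start:
                cycles.append(path[:])
            elif nxt > start and nxt not in seen:
                seen.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, seen)
                path.pop()
                seen.remove(nxt)

    for start in sorted(adj):
        dfs(start, start, [start], {start})
    return cycles


def cycle_edges(cycle):
    """Edges traversed by a cycle given as an ordered node list."""
    return [(cycle[i], cycle[(i + 1) % len(cycle)]) for i in range(len(cycle))]


def dominance_ratio(cycle, coverage):
    """Hypothetical score: weakest edge on the cycle vs. strongest edge leaving or entering it."""
    inside = set(cycle_edges(cycle))
    nodes = set(cycle)
    internal = min(coverage[e] for e in inside)
    external = [c for (u, v), c in coverage.items()
                if (u, v) not in inside and (u in nodes or v in nodes)]
    return internal / max(external) if external else float("inf")


# Hypothetical assembly graph: edge -> read coverage.
coverage = {
    ("A", "B"): 50, ("B", "C"): 48, ("C", "A"): 52,  # high-coverage circular path
    ("C", "D"): 3, ("D", "B"): 2,                    # low-coverage spurious links
}

min_ratio = 5.0  # illustrative threshold, not taken from the paper
dominant = [c for c in simple_cycles(coverage)
            if dominance_ratio(c, coverage) >= min_ratio]
print(dominant)
```

Here the graph contains two simple cycles, but only the A-B-C cycle has consistently high coverage relative to the edges around it; the cycle through D depends on two weakly supported links and is filtered out.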
16
Strain identification and quantitative analysis in microbial communities. J Mol Biol 2022; 434:167582. [DOI: 10.1016/j.jmb.2022.167582] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/01/2022] [Revised: 03/31/2022] [Accepted: 04/03/2022] [Indexed: 12/14/2022]