1
Oles RE, Terrazas MC, Loomis LR, Hsu CY, Tribelhorn C, Ferre PB, Ea A, Bryant M, Young J, Carrow HC, Sandborn WJ, Dulai P, Sivagnanam M, Pride D, Knight R, Chu H. Pangenome comparison of Bacteroides fragilis genomospecies unveil genetic diversity and ecological insights. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.12.20.572674. [PMID: 38187556 PMCID: PMC10769428 DOI: 10.1101/2023.12.20.572674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Bacteroides fragilis is a Gram-negative commensal bacterium commonly found in the human colon that differentiates into two genomospecies termed division I and II. We leverage a comprehensive collection of 694 B. fragilis whole genome sequences and report differential gene abundance to further support the recent proposal that divisions I and II represent separate species. In division I strains, we identify an increased abundance of genes related to complex carbohydrate degradation, colonization, and host niche occupancy, confirming the role of division I strains as gut commensals. In contrast, division II strains display an increased prevalence of plant cell wall degradation genes and exhibit a distinct geographic distribution, primarily originating from Asian countries, suggesting dietary influences. Notably, division II strains have an increased abundance of genes linked to virulence, survival in toxic conditions, and antimicrobial resistance, consistent with a higher incidence of these strains in bloodstream infections. This study provides new evidence supporting a recent proposal for classifying divisions I and II B. fragilis strains as distinct species, and our comparative genomic analysis reveals their niche-specific roles.
Affiliation(s)
- Renee E Oles
  - Department of Pathology, University of California, San Diego, La Jolla, CA
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
- Luke R Loomis
  - Department of Pathology, University of California, San Diego, La Jolla, CA
- Chia-Yun Hsu
  - Department of Pathology, University of California, San Diego, La Jolla, CA
- Caitlin Tribelhorn
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
- Pedro Belda Ferre
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
- Allison Ea
  - Department of Pathology, University of California, San Diego, La Jolla, CA
- MacKenzie Bryant
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
- Jocelyn Young
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
  - Rady Children's Hospital, San Diego, CA, United States
- Hannah C Carrow
  - Department of Pathology, University of California, San Diego, La Jolla, CA
- William J Sandborn
  - Division of Gastroenterology, University of California, San Diego, La Jolla, CA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA
- Parambir Dulai
  - Division of Gastroenterology, University of California, San Diego, La Jolla, CA
  - Division of Gastroenterology, Northwestern University, Chicago, Illinois
- Mamata Sivagnanam
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
  - Rady Children's Hospital, San Diego, CA, United States
- David Pride
  - Department of Pathology, University of California, San Diego, La Jolla, CA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA
  - Center for Innovative Phage Applications and Therapeutics (IPATH), University of California, San Diego, La Jolla, CA
  - Center of Advanced Laboratory Medicine (CALM), University of California, San Diego, La Jolla, CA
- Rob Knight
  - Department of Pediatrics, School of Medicine, University of California, La Jolla, CA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA
  - Shu Chien-Gene Lay Department of Bioengineering, University of California San Diego, La Jolla, CA
  - Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA
  - Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, CA
- Hiutung Chu
  - Department of Pathology, University of California, San Diego, La Jolla, CA
  - Center for Microbiome Innovation, University of California, San Diego, La Jolla, CA
  - Chiba University-UC San Diego Center for Mucosal Immunology, Allergy and Vaccines (cMAV), University of California, San Diego, La Jolla, CA
2
Thurimella K, Mohamed AMT, Graham DB, Owens RM, La Rosa SL, Plichta DR, Bacallado S, Xavier RJ. Protein Language Models Uncover Carbohydrate-Active Enzyme Function in Metagenomics. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.10.23.563620. [PMID: 37961379 PMCID: PMC10634757 DOI: 10.1101/2023.10.23.563620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
In metagenomics, the pool of uncharacterized microbial enzymes presents a challenge for functional annotation. Among these, carbohydrate-active enzymes (CAZymes) stand out due to their pivotal roles in various biological processes related to host health and nutrition. Here, we present CAZyLingua, the first tool that harnesses protein language model embeddings to build a deep learning framework that facilitates the annotation of CAZymes in metagenomic datasets. Our benchmarking results showed on average a higher F1 score (reflecting an average of precision and recall) on the annotated genomes of Bacteroides thetaiotaomicron, Eggerthella lenta and Ruminococcus gnavus compared to the traditional sequence homology-based method in dbCAN2. We applied our tool to a paired mother/infant longitudinal dataset and revealed unannotated CAZymes linked to microbial development during infancy. When applied to metagenomic datasets derived from patients affected by fibrosis-prone diseases such as Crohn's disease and IgG4-related disease, CAZyLingua uncovered CAZymes associated with disease and healthy states. In each of these metagenomic catalogs, CAZyLingua discovered new annotations that were previously overlooked by traditional sequence homology tools. Overall, the deep learning model CAZyLingua can be applied in combination with existing tools to unravel intricate CAZyme evolutionary profiles and patterns, contributing to a more comprehensive understanding of microbial metabolic dynamics.
Affiliation(s)
- Kumar Thurimella
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
  - School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Ahmed M. T. Mohamed
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Daniel B. Graham
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Róisín M. Owens
  - Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge, UK
- Sabina Leanti La Rosa
  - Faculty of Chemistry, Biotechnology and Food Science, Norwegian University of Life Sciences, Ås, Norway
- Damian R. Plichta
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA
- Sergio Bacallado
  - Department of Pure Mathematics and Mathematical Statistics, University of Cambridge, Cambridge, UK
- Ramnik J. Xavier
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Center for Computational and Integrative Biology and Department of Molecular Biology, Massachusetts General Hospital, Harvard Medical School, Boston, MA, USA