1
Cheskis S, Akerman A, Levy A. Deciphering bacterial protein functions with innovative computational methods. Trends Microbiol 2024:S0966-842X(24)00316-0. [PMID: 39736484 DOI: 10.1016/j.tim.2024.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Revised: 11/28/2024] [Accepted: 11/29/2024] [Indexed: 01/01/2025]
Abstract
Bacteria colonize every niche on Earth and play key roles in many environmental and host-associated processes. The sequencing revolution revealed the remarkable bacterial genetic and proteomic diversity and the genomic content of cultured and uncultured bacteria. However, deciphering functions of novel proteins remains a high barrier, often preventing the deep understanding of microbial life and its interaction with the surrounding environment. In recent years, exciting new bioinformatic tools, many of which are based on machine learning, facilitate the challenging task of gene and protein function discovery in the era of big genomics data, leading to the generation of testable hypotheses for bacterial protein functions. The new tools allow prediction of protein structures and interactions and allow sensitive and efficient sequence- and structure-based searching and clustering. Here, we summarize some of these recent tools which revolutionize modern microbiology research, along with examples for their usage, emphasizing the user-friendly, web-based ones. Adoption of these capabilities by experimentalists and computational biologists could save resources and accelerate microbiology research.
Affiliation(s)
- Shani Cheskis
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Avital Akerman
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel
- Asaf Levy
  - Department of Plant Pathology and Microbiology, Institute of Environmental Science, The Faculty of Agriculture, Food, and Environment, The Hebrew University of Jerusalem, Rehovot, Israel

2
Unzueta-Martínez A, Girguis PR. Taxonomic diversity and functional potential of microbial communities in oyster calcifying fluid. Appl Environ Microbiol 2024:e0109424. [PMID: 39665561 DOI: 10.1128/aem.01094-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Accepted: 10/30/2024] [Indexed: 12/13/2024] Open
Abstract
Creating and maintaining an appropriate chemical environment is essential for biomineralization, the process by which organisms precipitate minerals to form their shells or skeletons, yet the mechanisms involved in maintaining calcifying fluid chemistry are not fully defined. In particular, the role of microorganisms in facilitating or hindering animal biomineralization is poorly understood. Here, we investigated the taxonomic diversity and functional potential of microbial communities inhabiting oyster calcifying fluid. We used shotgun metagenomics to survey calcifying fluid microbial communities from three different oyster harvesting sites. There was a striking consistency in taxonomic composition across the three collection sites. We also observed archaea and viruses that had not been previously identified in oyster calcifying fluid. Furthermore, we identified microbial energy-conserving metabolisms that could influence the host's calcification, including genes involved in sulfate reduction and denitrification that are thought to play pivotal roles in inorganic carbon chemistry and calcification in microbial biofilms. These findings provide new insights into the taxonomy and functional capacity of oyster calcifying fluid microbiomes, highlighting their potential contributions to shell biomineralization, and contribute to a deeper understanding of the interplay between microbial ecology and biogeochemistry that could potentially bolster oyster calcification.
IMPORTANCE Previous research has underscored the influence of microbial metabolisms in carbonate deposition throughout the geological record. Despite the ecological importance of microbes to animals and inorganic carbon transformations, there have been limited studies characterizing the potential role of microbiomes in calcification by animals such as bivalves. Here, we use metagenomics to investigate the taxonomic diversity and functional potential of microbial communities in calcifying fluids from oysters collected at three different locations. We show a diverse microbial community that includes bacteria, archaea, and viruses, and we discuss their functional potential to influence calcifying fluid chemistry via reactions like sulfate reduction and denitrification. We also report the presence of carbonic anhydrase and urease, both of which are critical in microbial biofilm calcification. Our findings have broader implications in understanding what regulates calcifying fluid chemistry and consequently the resilience of calcifying organisms to 21st century acidifying oceans.
Affiliation(s)
- Andrea Unzueta-Martínez
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
- Peter R Girguis
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA

3
Urban M, Cuzick A, Seager J, Nonavinakere N, Sahoo J, Sahu P, Iyer VL, Khamari L, Martinez MC, Hammond-Kosack KE. PHI-base - the multi-species pathogen-host interaction database in 2025. Nucleic Acids Res 2024:gkae1084. [PMID: 39588765 DOI: 10.1093/nar/gkae1084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2024] [Revised: 10/21/2024] [Accepted: 10/23/2024] [Indexed: 11/27/2024] Open
Abstract
The Pathogen-Host Interactions Database (PHI-base) has, since 2005, provided manually curated genes from fungal, bacterial and protist pathogens that have been experimentally verified to have important pathogenicity, virulence and/or effector functions during different types of interactions involving human, animal, plant, invertebrate and fungal hosts. PHI-base provides phenotypic annotation and genotypic information for both native and model host interactions, including gene alterations that do not alter the phenotype of the interaction. In this article, we describe major updates to PHI-base. The latest version of PHI-base, 4.17, contains a 19% increase in genes and a 23% increase in interactions relative to version 4.12 (released September 2022). We also describe the unification of data in PHI-base 4 with the data curated from a new curation workflow (PHI-Canto), which forms the first complete release of PHI-base version 5.0. Additionally, we describe adding support for the Frictionless Data framework to PHI-base 4 datasets, new ways of sharing interaction data with the Ensembl database, an analysis of the conserved orthologous genes in PHI-base, and the increasing variety of research studies that make use of PHI-base. PHI-base version 4.17 is freely available at www.phi-base.org and PHI-base version 5.0 is freely available at phi5.phi-base.org.
Affiliation(s)
- Martin Urban
  - Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK
- Alayne Cuzick
  - Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK
- James Seager
  - Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK
- Nagashree Nonavinakere
  - Molecular Connections, Kandala Mansions, Kariappa Road, Basavanagudi, Bengaluru 560 004, India
- Jahobanta Sahoo
  - Molecular Connections, Kandala Mansions, Kariappa Road, Basavanagudi, Bengaluru 560 004, India
- Pallavi Sahu
  - Molecular Connections, Kandala Mansions, Kariappa Road, Basavanagudi, Bengaluru 560 004, India
- Vijay Laksmi Iyer
  - Molecular Connections, Kandala Mansions, Kariappa Road, Basavanagudi, Bengaluru 560 004, India
- Lokanath Khamari
  - Molecular Connections, Kandala Mansions, Kariappa Road, Basavanagudi, Bengaluru 560 004, India
- Manuel Carbajo Martinez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kim E Hammond-Kosack
  - Protecting Crops and the Environment, Rothamsted Research, Harpenden AL5 2JQ, UK

4
Litchman E, Villéger S, Zinger L, Auguet JC, Thuiller W, Munoz F, Kraft NJB, Philippot L, Violle C. Refocusing the microbial rare biosphere concept through a functional lens. Trends Ecol Evol 2024; 39:923-936. [PMID: 38987022 DOI: 10.1016/j.tree.2024.06.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 06/04/2024] [Accepted: 06/11/2024] [Indexed: 07/12/2024]
Abstract
The influential concept of the rare biosphere in microbial ecology has underscored the importance of taxa occurring at low abundances yet potentially playing key roles in communities and ecosystems. Here, we refocus the concept of rare biosphere through a functional trait-based lens and provide a framework to characterize microbial functional rarity, a combination of numerical scarcity across space or time and trait distinctiveness. We demonstrate how this novel interpretation of the rare biosphere, rooted in microbial functions, can enhance our mechanistic understanding of microbial community structure. It also sheds light on functionally distinct microbes, directing conservation efforts towards taxa harboring rare yet ecologically crucial functions.
Affiliation(s)
- Elena Litchman
  - Department of Global Ecology, Carnegie Institution for Science, Stanford, CA, USA; Kellogg Biological Station, Michigan State University, Hickory Corners, MI, USA
- Lucie Zinger
  - Institut de Biologie de l'École Normale Supérieure (IBENS), École Normale Supérieure, CNRS, INSERM, PSL Université Paris, Paris, France; Centre de Recherche sur la Biodiversité et l'Environnement (CRBE), UMR 5300, CNRS, Institut de Recherche pour le Développement (IRD), Toulouse INP, Université Toulouse 3 Paul Sabatier, Toulouse, France
- Wilfried Thuiller
  - Université Grenoble Alpes, Université Savoie Mont Blanc, CNRS, LECA, F-38000 Grenoble, France
- François Munoz
  - Université Grenoble Alpes, CNRS, LIPhy, F-38000 Grenoble, France
- Nathan J B Kraft
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, Los Angeles, CA, USA
- Laurent Philippot
  - Université Bourgogne Franche-Comté, INRAE, Institut Agro Dijon, Agroecology, Dijon, France
- Cyrille Violle
  - CEFE, Université Montpellier, CNRS, IRD, EPHE, Montpellier, France

5
Eren AM, Banfield JF. Modern microbiology: Embracing complexity through integration across scales. Cell 2024; 187:5151-5170. [PMID: 39303684 PMCID: PMC11450119 DOI: 10.1016/j.cell.2024.08.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2024] [Revised: 08/14/2024] [Accepted: 08/14/2024] [Indexed: 09/22/2024]
Abstract
Microbes were the only form of life on Earth for most of its history, and they still account for the vast majority of life's diversity. They convert rocks to soil, produce much of the oxygen we breathe, remediate our sewage, and sustain agriculture. Microbes are vital to planetary health as they maintain biogeochemical cycles that produce and consume major greenhouse gases and support large food webs. Modern microbiologists analyze nucleic acids, proteins, and metabolites; leverage sophisticated genetic tools, software, and bioinformatic algorithms; and process and integrate complex and heterogeneous datasets so that microbial systems may be harnessed to address contemporary challenges in health, the environment, and basic science. Here, we consider an inevitably incomplete list of emergent themes in our discipline and highlight those that we recognize as the archetypes of its modern era that aim to address the most pressing problems of the 21st century.
Affiliation(s)
- A Murat Eren
  - Helmholtz Institute for Functional Marine Biodiversity, 26129 Oldenburg, Germany; Alfred Wegener Institute, Helmholtz Centre for Polar and Marine Research, Bremerhaven, Germany; Institute for Chemistry and Biology of the Marine Environment, University of Oldenburg, Oldenburg, Germany; Marine Biological Laboratory, Woods Hole, MA, USA; Max Planck Institute for Marine Microbiology, Bremen, Germany
- Jillian F Banfield
  - Department of Earth and Planetary Sciences, University of California, Berkeley, Berkeley, CA, USA; Earth and Environmental Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA, USA; Innovative Genomics Institute, University of California, Berkeley, Berkeley, CA, USA; Biomedicine Discovery Institute, Monash University, Clayton, VIC, Australia; Department of Environmental Science, Policy, and Management, University of California, Berkeley, Berkeley, CA, USA

6
Gerasimova Y, Ali H, Nadeem U. Challenges for pathologists in implementing clinical microbiome diagnostic testing. J Pathol Clin Res 2024; 10:e70002. [PMID: 39289163 PMCID: PMC11407905 DOI: 10.1002/2056-4538.70002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 08/11/2024] [Accepted: 08/26/2024] [Indexed: 09/19/2024]
Abstract
Recent research has established that the microbiome plays potential roles in the pathogenesis of numerous chronic diseases, including carcinomas. This discovery has led to significant interest in clinical microbiome testing among physicians, translational investigators, and the lay public. As novel, inexpensive methodologies to interrogate the microbiota become available, research labs and commercial vendors have offered microbial assays. However, these tests still have not infiltrated the clinical laboratory space. Here, we provide an overview of the challenges of implementing microbiome testing in clinical pathology. We discuss challenges associated with preanalytical and analytic sample handling and collection that can influence results, choosing the appropriate testing methodology for the clinical context, establishing reference ranges, interpreting the data generated by testing and its value in making patient care decisions, regulation, and cost considerations of testing. Additionally, we suggest potential solutions for these problems to expedite the establishment of microbiome testing in the clinical laboratory.
Affiliation(s)
- Yulia Gerasimova
  - Department of Infectious Diseases, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Haroon Ali
  - Department of Medicine, Woodland Heights Medical Center, Lufkin, TX, USA
- Urooba Nadeem
  - Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA

7
Bizzotto E, Fraulini S, Zampieri G, Orellana E, Treu L, Campanaro S. MICROPHERRET: MICRObial PHEnotypic tRait ClassifieR using Machine lEarning Techniques. Environmental Microbiome 2024; 19:58. [PMID: 39113074 PMCID: PMC11308548 DOI: 10.1186/s40793-024-00600-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 07/24/2024] [Indexed: 08/10/2024]
Abstract
BACKGROUND In recent years, there has been a rapid increase in the number of microbial genomes reconstructed through shotgun sequencing, and obtained by newly developed approaches including metagenomic binning and single-cell sequencing. However, our ability to functionally characterize these genomes by experimental assays is orders of magnitude less efficient. Consequently, there is a pressing need for the development of swift and automated strategies for the functional classification of microbial genomes.
RESULTS The present work leverages a suite of supervised machine learning algorithms to establish a range of 86 metabolic and other ecological functions, such as methanotrophy and plastic degradation, starting from widely obtainable microbial genome annotations. Tests performed on independent datasets demonstrated robust performance across complete, fragmented, and incomplete genomes above a 70% completeness level for most of the considered functions. Application of the algorithms to the Biogas Microbiome database yielded predictions broadly consistent with current biological knowledge and correctly detecting functionally-related nuances of archaeal genomes. Finally, a case study focused on acetoclastic methanogenesis demonstrated how the developed machine learning models can be refined or expanded with models describing novel functions of interest.
CONCLUSIONS The resulting tool, MICROPHERRET, incorporates a total of 86 models, one for each tested functional class, and can be applied to high-quality microbial genomes as well as to low-quality genomes derived from metagenomics and single-cell sequencing. MICROPHERRET can thus aid in understanding the functional role of newly generated genomes within their micro-ecological context.
Affiliation(s)
- Edoardo Bizzotto
  - Department of Biology, University of Padova, Padova, 35131, Italy
- Sofia Fraulini
  - Department of Biology, University of Padova, Padova, 35131, Italy
- Guido Zampieri
  - Department of Biology, University of Padova, Padova, 35131, Italy
- Esteban Orellana
  - Department of Biology, University of Padova, Padova, 35131, Italy
- Laura Treu
  - Department of Biology, University of Padova, Padova, 35131, Italy

8
Huang YY, Price MN, Hung A, Gal-Oz O, Tripathi S, Smith CW, Ho D, Carion H, Deutschbauer AM, Arkin AP. Barcoded overexpression screens in gut Bacteroidales identify genes with roles in carbon utilization and stress resistance. Nat Commun 2024; 15:6618. [PMID: 39103350 DOI: 10.1038/s41467-024-50124-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 06/28/2024] [Indexed: 08/07/2024] Open
Abstract
A mechanistic understanding of host-microbe interactions in the gut microbiome is hindered by poorly annotated bacterial genomes. While functional genomics can generate large gene-to-phenotype datasets to accelerate functional discovery, their applications to study gut anaerobes have been limited. For instance, most gain-of-function screens of gut-derived genes have been performed in Escherichia coli and assayed in a small number of conditions. To address these challenges, we develop Barcoded Overexpression BActerial shotgun library sequencing (Boba-seq). We demonstrate the power of this approach by assaying genes from diverse gut Bacteroidales overexpressed in Bacteroides thetaiotaomicron. From hundreds of experiments, we identify new functions and phenotypes for 29 genes important for carbohydrate metabolism or tolerance to antibiotics or bile salts. Highlights include the discovery of a D-glucosamine kinase, a raffinose transporter, and several routes that increase tolerance to ceftriaxone and bile salts through lipid biosynthesis. This approach can be readily applied to develop screens in other strains and additional phenotypic assays.
Affiliation(s)
- Yolanda Y Huang
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, NY, USA
- Morgan N Price
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Allison Hung
  - Department of Molecular and Cell Biology, University of California-Berkeley, Berkeley, CA, USA
- Omree Gal-Oz
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Surya Tripathi
  - Department of Plant and Microbial Biology, University of California-Berkeley, Berkeley, CA, USA
- Christopher W Smith
  - Department of Microbiology and Immunology, University at Buffalo, State University of New York, Buffalo, NY, USA
- Davian Ho
  - Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA
- Héloïse Carion
  - Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA
- Adam M Deutschbauer
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Plant and Microbial Biology, University of California-Berkeley, Berkeley, CA, USA
- Adam P Arkin
  - Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Bioengineering, University of California-Berkeley, Berkeley, CA, USA

9
Vakirlis N, Kupczok A. Large-scale investigation of species-specific orphan genes in the human gut microbiome elucidates their evolutionary origins. Genome Res 2024; 34:888-903. [PMID: 38977308 PMCID: PMC11293555 DOI: 10.1101/gr.278977.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 06/12/2024] [Indexed: 07/10/2024]
Abstract
Species-specific genes, also known as orphans, are ubiquitous across life's domains. In prokaryotes, species-specific orphan genes (SSOGs) are mostly thought to originate in external elements such as viruses followed by horizontal gene transfer, whereas the scenario of native origination, through rapid divergence or de novo, is mostly dismissed. However, quantitative evidence supporting either scenario is lacking. Here, we systematically analyzed genomes from 4644 human gut microbiome species and identified more than 600,000 unique SSOGs, representing an average of 2.6% of a given species' pangenome. These sequences are mostly rare within each species yet show signs of purifying selection. Overall, SSOGs use optimal codons less frequently, and their proteins are more disordered than those of conserved genes (i.e., non-SSOGs). Importantly, across species, the GC content of SSOGs closely matches that of conserved ones. In contrast, the ∼5% of SSOGs that share similarity to known viral sequences have distinct characteristics, including lower GC content. Thus, SSOGs with similarity to viruses differ from the remaining SSOGs, contrasting an external origination scenario for most of them. By examining the orthologous genomic region in closely related species, we show that a small subset of SSOGs likely evolved natively de novo and find that these genes also differ in their properties from the remaining SSOGs. Our results challenge the notion that external elements are the dominant source of prokaryotic genetic novelty and will enable future studies into the biological role and relevance of species-specific genes in the human gut.
Affiliation(s)
- Nikolaos Vakirlis
  - Institute For Fundamental Biomedical Research, B.S.R.C. "Alexander Fleming," Vari 166 72, Greece
  - Institute for General Microbiology, Kiel University, 24118 Kiel, Germany
- Anne Kupczok
  - Bioinformatics Group, Wageningen University, 6700 PB Wageningen, The Netherlands

10
Kruk ME, Mehta S, Murray K, Higgins L, Do K, Johnson JE, Wagner R, Wendt CH, O’Connor JB, Harris JK, Laguna TA, Jagtap PD, Griffin TJ. An integrated metaproteomics workflow for studying host-microbe dynamics in bronchoalveolar lavage samples applied to cystic fibrosis disease. mSystems 2024; 9:e0092923. [PMID: 38934598 PMCID: PMC11264604 DOI: 10.1128/msystems.00929-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Airway microbiota are known to contribute to lung diseases, such as cystic fibrosis (CF), but their contributions to pathogenesis are still unclear. To improve our understanding of host-microbe interactions, we have developed an integrated analytical and bioinformatic mass spectrometry (MS)-based metaproteomics workflow to analyze clinical bronchoalveolar lavage (BAL) samples from people with airway disease. Proteins from BAL cellular pellets were processed and pooled together in groups categorized by disease status (CF vs. non-CF) and bacterial diversity, based on previously performed small subunit rRNA sequencing data. Proteins from each pooled sample group were digested and subjected to liquid chromatography tandem mass spectrometry (MS/MS). MS/MS spectra were matched to human and bacterial peptide sequences leveraging a bioinformatic workflow using a metagenomics-guided protein sequence database and rigorous evaluation. Label-free quantification revealed differentially abundant human peptides from proteins with known roles in CF, like neutrophil elastase and collagenase, and proteins with lesser-known roles in CF, including apolipoproteins. Differentially abundant bacterial peptides were identified from known CF pathogens (e.g., Pseudomonas), as well as other taxa with potentially novel roles in CF. We used this host-microbe peptide panel for targeted parallel-reaction monitoring validation, demonstrating for the first time an MS-based assay effective for quantifying host-microbe protein dynamics within BAL cells from individual CF patients. Our integrated bioinformatic and analytical workflow combining discovery, verification, and validation should prove useful for diverse studies to characterize microbial contributors in airway diseases. 
Furthermore, we describe a promising preliminary panel of differentially abundant microbe and host peptide sequences for further study as potential markers of host-microbe relationships in CF disease pathogenesis.
IMPORTANCE Identifying microbial pathogenic contributors and dysregulated human responses in airway disease, such as CF, is critical to understanding disease progression and developing more effective treatments. To this end, characterizing the proteins expressed from bacterial microbes and human host cells during disease progression can provide valuable new insights. We describe here a new method to confidently detect and monitor abundance changes of both microbe and host proteins from challenging BAL samples commonly collected from CF patients. Our method uses both state-of-the-art mass spectrometry-based instrumentation to detect proteins present in these samples and customized bioinformatic software tools to analyze the data and characterize detected proteins and their association with CF. We demonstrate the use of this method to characterize microbe and host proteins from individual BAL samples, paving the way for a new approach to understand molecular contributors to CF and other diseases of the airway.
Affiliation(s)
- Monica E. Kruk
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
- Subina Mehta
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
- Kevin Murray
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
  - Center for Metabolomics and Proteomics, University of Minnesota, Minneapolis, Minnesota, USA
- LeeAnn Higgins
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
  - Center for Metabolomics and Proteomics, University of Minnesota, Minneapolis, Minnesota, USA
- Katherine Do
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
- James E. Johnson
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Reid Wagner
  - Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota, USA
- Chris H. Wendt
  - Division of Pulmonary, Allergy, Critical Care and Sleep Medicine, Medical School, University of Minnesota, Minneapolis, Minnesota, USA
  - Minneapolis VA Health Care System, Minneapolis, Minnesota, USA
- John B. O’Connor
  - Department of Pediatrics, Division of Pulmonary and Sleep Medicine, Seattle Children’s Hospital, Seattle, Washington, USA
- J. Kirk Harris
  - Department of Pediatrics, University of Colorado School of Medicine, Aurora, Colorado, USA
- Theresa A. Laguna
  - Department of Pediatrics, Division of Pulmonary and Sleep Medicine, Seattle Children’s Hospital, Seattle, Washington, USA
  - Department of Pediatrics, University of Washington School of Medicine, Seattle, Washington, USA
- Pratik D. Jagtap
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA
- Timothy J. Griffin
  - Department of Biochemistry, Molecular Biology and Biophysics, University of Minnesota, Minneapolis, Minneapolis, Minnesota, USA

11
Lamkiewicz K, Barf LM, Sachse K, Hölzer M. RIBAP: a comprehensive bacterial core genome annotation pipeline for pangenome calculation beyond the species level. Genome Biol 2024; 25:170. [PMID: 38951884 PMCID: PMC11218241 DOI: 10.1186/s13059-024-03312-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 06/14/2024] [Indexed: 07/03/2024] Open
Abstract
Microbial pangenome analysis identifies present or absent genes in prokaryotic genomes. However, current tools are limited when analyzing species with higher sequence diversity or higher taxonomic orders such as genera or families. The Roary ILP Bacterial core Annotation Pipeline (RIBAP) uses an integer linear programming approach to refine gene clusters predicted by Roary for identifying core genes. RIBAP successfully handles the complexity and diversity of Chlamydia, Klebsiella, Brucella, and Enterococcus genomes, outperforming other established and recent pangenome tools for identifying all-encompassing core genes at the genus level. RIBAP is a freely available Nextflow pipeline at github.com/hoelzer-lab/ribap and zenodo.org/doi/10.5281/zenodo.10890871.
Affiliation(s)
- Kevin Lamkiewicz
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
| | - Lisa-Marie Barf
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
| | - Konrad Sachse
- RNA Bioinformatics and High-Throughput Analysis, Friedrich Schiller University Jena, Leutragraben 1, Jena, 07743, Germany
| | - Martin Hölzer
- Genome Competence Center (MF1), Robert Koch Institute, Berlin, 13353, Germany.
| |
|
12
|
Liu Z, Zhou Y, Wang H, Liu C, Wang L. Recent advances in understanding the fitness and survival mechanisms of Vibrio parahaemolyticus. Int J Food Microbiol 2024; 417:110691. [PMID: 38631283 DOI: 10.1016/j.ijfoodmicro.2024.110691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 03/14/2024] [Accepted: 04/02/2024] [Indexed: 04/19/2024]
Abstract
The presence of Vibrio parahaemolyticus (Vp) at different production stages of seafood has had negative impacts on both public health and the sustainability of the industry. To better investigate the fitness of Vp at the phenotypic level, a great number of studies have been conducted in recent years using plate counting methods. Meanwhile, with the increasing accessibility of next-generation sequencing and advances in analytical chemistry techniques, omics-oriented biotechnologies have further advanced our knowledge of the survival and virulence mechanisms of Vp at various molecular levels. These observations provide insights to guide the development of novel prevention and control strategies and benefit the monitoring and mitigation of food safety risks associated with Vp contamination. To capture these recent advances in a timely manner, this review first summarizes the most recent phenotypic-level studies and provides insights into the survival of Vp under important in vitro stresses and on aquatic products. After that, molecular survival mechanisms of Vp at the transcriptomic and proteomic levels are summarized and discussed. Looking forward, newer omics biotechnologies such as metabolomics and secretomics show great potential for confirming the cellular responses of Vp. Powerful data mining tools from the fields of machine learning and artificial intelligence, which can make better use of omics data and solve complex problems in its processing, analysis, and interpretation, will further improve our mechanistic understanding of Vp.
Affiliation(s)
- Zhuosheng Liu
- Department of Food Science and Technology, University of California Davis, Davis, CA 95618, USA
| | - Yi Zhou
- Department of Food Science and Technology, University of California Davis, Davis, CA 95618, USA
| | - Hongye Wang
- Department of Food Science and Technology, University of California Davis, Davis, CA 95618, USA
| | - Chengchu Liu
- University of Maryland Sea Grant Extension Program, UMES Center for Food Science and Technology, Princess Anne, MD, United States
| | - Luxin Wang
- Department of Food Science and Technology, University of California Davis, Davis, CA 95618, USA.
| |
|
13
|
Hamamsy T, Morton JT, Blackwell R, Berenberg D, Carriero N, Gligorijevic V, Strauss CEM, Leman JK, Cho K, Bonneau R. Protein remote homology detection and structural alignment using deep learning. Nat Biotechnol 2024; 42:975-985. [PMID: 37679542 PMCID: PMC11180608 DOI: 10.1038/s41587-023-01917-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/07/2022] [Accepted: 07/26/2023] [Indexed: 09/09/2023]
Abstract
Exploiting sequence-structure-function relationships in biotechnology requires improved methods for aligning proteins that have low sequence similarity to previously annotated proteins. We develop two deep learning methods to address this gap, TM-Vec and DeepBLAST. TM-Vec allows searching for structure-structure similarities in large sequence databases. It is trained to accurately predict TM-scores as a metric of structural similarity directly from sequence pairs without the need for intermediate computation or solution of structures. Once structurally similar proteins have been identified, DeepBLAST can structurally align proteins using only sequence information by identifying structurally homologous regions between proteins. It outperforms traditional sequence alignment methods and performs similarly to structure-based alignment methods. We show the merits of TM-Vec and DeepBLAST on a variety of datasets, including better identification of remotely homologous proteins compared with state-of-the-art sequence alignment and structure prediction methods.
Grants
- R35GM122515 National Science Foundation (NSF)
- IOS-1546218 National Science Foundation (NSF)
- R35 GM122515 NIGMS NIH HHS
- R01 DK103358 NIDDK NIH HHS
- CBET-1728858 National Science Foundation (NSF)
- R01 AI130945 NIAID NIH HHS
- This research was supported by NIH R01DK103358, the Simons Foundation, NSF IOS-1546218, R35GM122515, NSF CBET-1728858, NIH R01AI130945, to T.H. This research was supported by the intramural research program of the Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD) to J.T.M. This research was supported by the Flatiron Institute as part of the Simons Foundation to Robert Blackwell, J.K.L., and N.C. This research was supported by Los Alamos National Lab to C.S. This research was supported by the Samsung Advanced Institute of Technology (Next Generation Deep Learning: from pattern recognition to AI), Samsung Research (Improving Deep Learning using Latent Structure), and NSF Award 1922658 to K.C.
- Simons Foundation
- U.S. Department of Health & Human Services | NIH | Eunice Kennedy Shriver National Institute of Child Health and Human Development (NICHD)
Affiliation(s)
- Tymor Hamamsy
- Center for Data Science, New York University, New York, NY, USA
| | - James T Morton
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
- Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
| | - Robert Blackwell
- Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Daniel Berenberg
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA
- Prescient Design, New York, NY, USA
| | - Nicholas Carriero
- Scientific Computing Core, Flatiron Institute, Simons Foundation, New York, NY, USA
| | | | | | - Julia Koehler Leman
- Center for Computational Biology, Flatiron Institute, Simons Foundation, New York, NY, USA
| | - Kyunghyun Cho
- Center for Data Science, New York University, New York, NY, USA.
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.
- Prescient Design, New York, NY, USA.
- CIFAR, Toronto, Ontario, Canada.
| | - Richard Bonneau
- Center for Data Science, New York University, New York, NY, USA.
- Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, NY, USA.
- Prescient Design, New York, NY, USA.
- Department of Biology, New York University, New York, NY, USA.
| |
|
14
|
Roder T, Pimentel G, Fuchsmann P, Stern MT, von Ah U, Vergères G, Peischl S, Brynildsrud O, Bruggmann R, Bär C. Scoary2: rapid association of phenotypic multi-omics data with microbial pan-genomes. Genome Biol 2024; 25:93. [PMID: 38605417 PMCID: PMC11007987 DOI: 10.1186/s13059-024-03233-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 03/29/2024] [Indexed: 04/13/2024] Open
Abstract
Unraveling bacterial gene function drives progress in various areas, such as food production, pharmacology, and ecology. While omics technologies capture high-dimensional phenotypic data, linking them to genomic data is challenging, leaving 40-60% of bacterial genes undescribed. To address this bottleneck, we introduce Scoary2, an ultra-fast microbial genome-wide association studies (mGWAS) software. With its data exploration app and improved performance, Scoary2 is the first tool to enable the study of large phenotypic datasets using mGWAS. As proof of concept, we explore the metabolome of yogurts, each produced with a different Propionibacterium freudenreichii strain, and discover two genes affecting carnitine metabolism.
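As a rough illustration of the kind of gene-trait association that an mGWAS tests, the sketch below scores one gene's presence or absence against one binary trait with a two-sided Fisher's exact test. The function name, the 2x2 table layout and the toy counts are assumptions made for this example; it is not Scoary2's code and does not reflect how Scoary2 achieves its speed.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test on a 2x2 table (hypothetical helper).

    a: strains with the gene and with the trait
    b: strains with the gene, without the trait
    c: strains without the gene, with the trait
    d: strains without the gene and without the trait
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))


# Toy counts: the gene is present in exactly the 5 strains that show the trait
p_value = fisher_exact_two_sided(5, 0, 0, 5)
print(f"p = {p_value:.4f}")
```

In practice such a test would be repeated for every gene in the pan-genome and every trait, followed by multiple-testing correction, which is where the performance of a dedicated tool matters.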
Affiliation(s)
- Thomas Roder
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland
- Graduate School for Cellular and Biomedical Sciences, University of Bern, CH-3012, Bern, Switzerland
| | - Grégory Pimentel
- Methods development and analytics, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Pascal Fuchsmann
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Mireille Tena Stern
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Ueli von Ah
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Guy Vergères
- Food microbial systems, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| | - Stephan Peischl
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland
| | - Ola Brynildsrud
- Norwegian Institute of Public Health, Oslo and Norwegian University of Life Science, Ås, Norway
| | - Rémy Bruggmann
- Interfaculty Bioinformatics Unit and Swiss Institute of Bioinformatics, University of Bern, Bern, CH-3012, Switzerland.
| | - Cornelia Bär
- Methods development and analytics, Agroscope, Schwarzenburgstrasse 161, Bern, CH-3003, Switzerland
| |
|
15
|
Hwang Y, Cornman AL, Kellogg EH, Ovchinnikov S, Girguis PR. Genomic language model predicts protein co-regulation and function. Nat Commun 2024; 15:2880. [PMID: 38570504 PMCID: PMC10991518 DOI: 10.1038/s41467-024-46947-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 03/13/2024] [Indexed: 04/05/2024] Open
Abstract
Deciphering the relationship between a gene and its genomic context is fundamental to understanding and engineering biological systems. Machine learning has shown promise in learning latent relationships underlying the sequence-structure-function paradigm from massive protein sequence datasets. However, to date, limited attempts have been made in extending this continuum to include higher order genomic context information. Evolutionary processes dictate the specificity of genomic contexts in which a gene is found across phylogenetic distances, and these emergent genomic patterns can be leveraged to uncover functional relationships between gene products. Here, we train a genomic language model (gLM) on millions of metagenomic scaffolds to learn the latent functional and regulatory relationships between genes. gLM learns contextualized protein embeddings that capture the genomic context as well as the protein sequence itself, and encode biologically meaningful and functionally relevant information (e.g. enzymatic function, taxonomy). Our analysis of the attention patterns demonstrates that gLM is learning co-regulated functional modules (i.e. operons). Our findings illustrate that gLM's unsupervised deep learning of the metagenomic corpus is an effective and promising approach to encode functional semantics and regulatory syntax of genes in their genomic contexts and uncover complex relationships between genes in a genomic region.
Affiliation(s)
- Yunha Hwang
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA.
| | | | - Elizabeth H Kellogg
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
| | - Sergey Ovchinnikov
- John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, USA.
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Peter R Girguis
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA.
| |
|
16
|
Asnicar F, Thomas AM, Passerini A, Waldron L, Segata N. Machine learning for microbiologists. Nat Rev Microbiol 2024; 22:191-205. [PMID: 37968359 DOI: 10.1038/s41579-023-00984-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Accepted: 10/03/2023] [Indexed: 11/17/2023]
Abstract
Machine learning is increasingly important in microbiology where it is used for tasks such as predicting antibiotic resistance and associating human microbiome features with complex host diseases. The applications in microbiology are quickly expanding and the machine learning tools frequently used in basic and clinical research range from classification and regression to clustering and dimensionality reduction. In this Review, we examine the main machine learning concepts, tasks and applications that are relevant for experimental and clinical microbiologists. We provide the minimal toolbox for a microbiologist to be able to understand, interpret and use machine learning in their experimental and translational activities.
Affiliation(s)
- Francesco Asnicar
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrew Maltez Thomas
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy
| | - Andrea Passerini
- Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
| | - Levi Waldron
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Epidemiology and Biostatistics, City University of New York, New York, NY, USA.
| | - Nicola Segata
- Department of Cellular, Computational and Integrative Biology, University of Trento, Trento, Italy.
- Department of Experimental Oncology, European Institute of Oncology IRCCS, Milan, Italy.
| |
|
17
|
Wirbel J, Bhatt AS, Probst AJ. The journey to understand previously unknown microbial genes. Nature 2024; 626:267-269. [PMID: 38291331 DOI: 10.1038/d41586-024-00077-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]
|
18
|
Rodríguez Del Río Á, Giner-Lamia J, Cantalapiedra CP, Botas J, Deng Z, Hernández-Plaza A, Munar-Palmer M, Santamaría-Hernando S, Rodríguez-Herva JJ, Ruscheweyh HJ, Paoli L, Schmidt TSB, Sunagawa S, Bork P, López-Solanilla E, Coelho LP, Huerta-Cepas J. Functional and evolutionary significance of unknown genes from uncultivated taxa. Nature 2024; 626:377-384. [PMID: 38109938 PMCID: PMC10849945 DOI: 10.1038/s41586-023-06955-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Received: 02/01/2022] [Accepted: 12/08/2023] [Indexed: 12/20/2023]
Abstract
Many of the Earth's microbes remain uncultured and understudied, limiting our understanding of the functional and evolutionary aspects of their genetic material, which remain largely overlooked in most metagenomic studies [1]. Here we analysed 149,842 environmental genomes from multiple habitats [2-6] and compiled a curated catalogue of 404,085 functionally and evolutionarily significant novel (FESNov) gene families exclusive to uncultivated prokaryotic taxa. All FESNov families span multiple species, exhibit strong signals of purifying selection and qualify as new orthologous groups, thus nearly tripling the number of bacterial and archaeal gene families described to date. The FESNov catalogue is enriched in clade-specific traits, including 1,034 novel families that can distinguish entire uncultivated phyla, classes and orders, probably representing synapomorphies that facilitated their evolutionary divergence. Using genomic context analysis and structural alignments we predicted functional associations for 32.4% of FESNov families, including 4,349 high-confidence associations with important biological processes. These predictions provide a valuable hypothesis-driven framework that we used for experimental validation of a new gene family involved in cell motility and a novel set of antimicrobial peptides. We also demonstrate that the relative abundance profiles of novel families can discriminate between environments and clinical conditions, leading to the discovery of potentially new biomarkers associated with colorectal cancer. We expect this work to enhance future metagenomics studies and expand our knowledge of the genetic repertory of uncultivated organisms.
Affiliation(s)
- Álvaro Rodríguez Del Río
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Joaquín Giner-Lamia
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
- Departamento de Bioquímica Vegetal y Biología Molecular, Facultad de Biología, Instituto de Bioquímica Vegetal y Fotosíntesis (IBVF), Universidad de Sevilla-CSIC, Seville, Spain
| | - Carlos P Cantalapiedra
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Jorge Botas
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Ziqi Deng
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Ana Hernández-Plaza
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Martí Munar-Palmer
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - Saray Santamaría-Hernando
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
| | - José J Rodríguez-Herva
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
| | - Hans-Joachim Ruscheweyh
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
| | - Lucas Paoli
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
| | - Thomas S B Schmidt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Shinichi Sunagawa
- Department of Biology, Institute of Microbiology and Swiss Institute of Bioinformatics, ETH Zürich, Zürich, Switzerland
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Max Delbrück Centre for Molecular Medicine, Berlin, Germany
- Department of Bioinformatics, Biocenter, University of Würzburg, Würzburg, Germany
| | - Emilia López-Solanilla
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain
- Departamento de Biotecnología-Biología Vegetal, Escuela Técnica Superior de Ingeniería Agronómica, Alimentaria y de Biosistemas, Universidad Politécnica de Madrid (UPM), Madrid, Spain
| | - Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, China
- MOE Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, and MOE Frontiers Center for Brain Science, Shanghai, China
- Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, Queensland, Australia
| | - Jaime Huerta-Cepas
- Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM) - Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA-CSIC), Madrid, Spain.
| |
|
19
|
Duan N, Hand E, Pheko M, Sharma S, Emiola A. Structure-guided discovery of anti-CRISPR and anti-phage defense proteins. Nat Commun 2024; 15:649. [PMID: 38245560 PMCID: PMC10799925 DOI: 10.1038/s41467-024-45068-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Accepted: 01/12/2024] [Indexed: 01/22/2024] Open
Abstract
Bacteria use a variety of defense systems to protect themselves from phage infection. In turn, phages have evolved diverse counter-defense measures to overcome host defenses. Here, we use protein structural similarity and gene co-occurrence analyses to screen >66 million viral protein sequences and >330,000 metagenome-assembled genomes for the identification of anti-phage and counter-defense systems. We predict structures for ~300,000 proteins and perform large-scale, pairwise comparison to known anti-CRISPR (Acr) and anti-phage proteins to identify structural homologs that otherwise may not be uncovered using primary sequence search. This way, we identify a Bacteroidota phage Acr protein that inhibits Cas12a, and an Akkermansia muciniphila anti-phage defense protein, termed BxaP. Gene bxaP is found in loci encoding Bacteriophage Exclusion (BREX) and restriction-modification defense systems, but confers immunity independently. Our work highlights the advantage of combining protein structural features and gene co-localization information in studying host-phage interactions.
Affiliation(s)
- Ning Duan
- Microbial Therapeutics Unit, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, MD, USA
| | - Emily Hand
- Microbial Therapeutics Unit, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, MD, USA
| | - Mannuku Pheko
- Microbial Therapeutics Unit, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, MD, USA
| | - Shikha Sharma
- Microbial Therapeutics Unit, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, MD, USA
| | - Akintunde Emiola
- Microbial Therapeutics Unit, National Institute of Dental and Craniofacial Research, National Institutes of Health, Bethesda, MD, USA.
| |
|
20
|
Schmidt TSB, Fullam A, Ferretti P, Orakov A, Maistrenko OM, Ruscheweyh HJ, Letunic I, Duan Y, Van Rossum T, Sunagawa S, Mende DR, Finn RD, Kuhn M, Pedro Coelho L, Bork P. SPIRE: a Searchable, Planetary-scale mIcrobiome REsource. Nucleic Acids Res 2024; 52:D777-D783. [PMID: 37897342 PMCID: PMC10767986 DOI: 10.1093/nar/gkad943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/01/2023] [Accepted: 10/11/2023] [Indexed: 10/30/2023] Open
Abstract
Meta'omic data on microbial diversity and function accrue exponentially in public repositories, but derived information is often siloed according to data type, study or sampled microbial environment. Here we present SPIRE, a Searchable Planetary-scale mIcrobiome REsource that integrates various consistently processed metagenome-derived microbial data modalities across habitats, geography and phylogeny. SPIRE encompasses 99 146 metagenomic samples from 739 studies covering a wide array of microbial environments and augmented with manually-curated contextual data. Across a total metagenomic assembly of 16 Tbp, SPIRE comprises 35 billion predicted protein sequences and 1.16 million newly constructed metagenome-assembled genomes (MAGs) of medium or high quality. Beyond mapping to the high-quality genome reference provided by proGenomes3 (http://progenomes.embl.de), these novel MAGs form 92 134 novel species-level clusters, the majority of which are unclassified at species level using current tools. SPIRE enables taxonomic profiling of these species clusters via an updated, custom mOTUs database (https://motu-tool.org/) and includes several layers of functional annotation, as well as crosslinks to several (micro-)biological databases. The resource is accessible, searchable and browsable via http://spire.embl.de.
Affiliation(s)
- Thomas S B Schmidt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Anthony Fullam
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Pamela Ferretti
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Askarbek Orakov
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Oleksandr M Maistrenko
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Hans-Joachim Ruscheweyh
- Institute of Microbiology, Department of Biology and Swiss Institute of Bioinformatics, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
| | - Ivica Letunic
- Biobyte solutions GmbH, Bothestr. 142, 69117 Heidelberg, Germany
| | - Yiqian Duan
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
| | - Thea Van Rossum
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Shinichi Sunagawa
- Institute of Microbiology, Department of Biology and Swiss Institute of Bioinformatics, ETH Zurich, Vladimir-Prelog-Weg 4, 8093 Zurich, Switzerland
| | - Daniel R Mende
- Department of Medical Microbiology, Amsterdam University Medical Centers, Amsterdam, The Netherlands
| | - Robert D Finn
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, United Kingdom
| | - Michael Kuhn
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Luis Pedro Coelho
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Centre for Microbiome Research, School of Biomedical Sciences, Queensland University of Technology, Translational Research Institute, Woolloongabba, Queensland, Australia
| | - Peer Bork
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Department of Bioinformatics, Biozentrum, University of Würzburg, 97074 Würzburg, Germany
- Max Delbrück Centre for Molecular Medicine, 13125 Berlin, Germany
| |
|
21
|
Rich MH, Sharrock AV, Mulligan TS, Matthews F, Brown AS, Lee-Harwood HR, Williams EM, Copp JN, Little RF, Francis JJB, Horvat CN, Stevenson LJ, Owen JG, Saxena MT, Mumm JS, Ackerley DF. A metagenomic library cloning strategy that promotes high-level expression of captured genes to enable efficient functional screening. Cell Chem Biol 2023; 30:1680-1691.e6. [PMID: 37898120 PMCID: PMC10842177 DOI: 10.1016/j.chembiol.2023.10.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Revised: 06/17/2023] [Accepted: 10/02/2023] [Indexed: 10/30/2023]
Abstract
Functional screening of environmental DNA (eDNA) libraries is a potentially powerful approach to discover enzymatic "unknown unknowns", but is usually heavily biased toward the tiny subset of genes preferentially transcribed and translated by the screening strain. We have overcome this by preparing an eDNA library via partial digest with restriction enzyme FatI (cuts CATG), causing a substantial proportion of ATG start codons to be precisely aligned with strong plasmid-encoded promoter and ribosome-binding sequences. Whereas we were unable to select nitroreductases from standard metagenome libraries, our FatI strategy yielded 21 nitroreductases spanning eight different enzyme families, each conferring resistance to the nitro-antibiotic niclosamide and sensitivity to the nitro-prodrug metronidazole. We showed expression could be improved by co-expressing rare tRNAs and encoded proteins purified directly using an embedded His6-tag. In a transgenic zebrafish model of metronidazole-mediated targeted cell ablation, our lead MhqN-family nitroreductase proved ∼5-fold more effective than the canonical nitroreductase NfsB.
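The FatI idea can be illustrated with a few lines of Python: every CATG recognition site contains an ATG that starts one base after the C, so scanning a sequence for CATG lists the ATG triplets that sit inside FatI sites. The function name and the toy sequence are hypothetical, and the sketch does not model digest conditions, fragment sizes or vector design.

```python
def fati_embedded_atgs(seq: str) -> list[int]:
    """Return 0-based positions of each ATG that lies inside a CATG site."""
    seq = seq.upper()
    positions = []
    site = seq.find("CATG")
    while site != -1:
        positions.append(site + 1)  # the ATG starts one base after the C
        site = seq.find("CATG", site + 1)
    return positions


toy = "GGCATGAAATAACCCATGG"  # made-up sequence, for illustration only
print(fati_embedded_atgs(toy))
```

Only ATG triplets preceded by a C are found this way, which is consistent with the abstract's wording that a substantial proportion, rather than all, of start codons end up aligned with the plasmid's expression signals.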
Affiliation(s)
- Michelle H Rich
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Abigail V Sharrock
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Timothy S Mulligan
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Frazer Matthews
- Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Alistair S Brown
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Hannah R Lee-Harwood
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Elsie M Williams
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Janine N Copp
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Rory F Little
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Jenni J B Francis
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Claire N Horvat
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Luke J Stevenson
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Jeremy G Owen
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Meera T Saxena
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Jeff S Mumm
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA; Center for Nanomedicine, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - David F Ackerley
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand; Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand.
| |
|
22
|
Joerger AK, Albrecht C, Rothhammer V, Neuhaus K, Wagner A, Meyer B, Wostrack M. The Role of Gut and Oral Microbiota in the Formation and Rupture of Intracranial Aneurysms: A Literature Review. Int J Mol Sci 2023; 25:48. [PMID: 38203219 PMCID: PMC10779325 DOI: 10.3390/ijms25010048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 12/14/2023] [Accepted: 12/18/2023] [Indexed: 01/12/2024] Open
Abstract
In recent years, there has been a growing interest in the role of the microbiome in cardiovascular and cerebrovascular diseases. Emerging research highlights the potential role of the microbiome in intracranial aneurysm (IA) formation and rupture, particularly in relation to inflammation. In this review, we aim to explore the existing literature regarding the influence of the gut and oral microbiome on IA formation and rupture. In the first section, we provide background information, elucidating the connection between inflammation and aneurysm formation and presenting potential mechanisms of gut-brain interaction. Additionally, we explain the methods for microbiome analysis. The second section reviews existing studies that investigate the relationship between the gut and oral microbiome and IAs. We conclude with a prospective overview, highlighting the extent to which the microbiome is already therapeutically utilized in other fields. Furthermore, we address the challenges associated with the context of IAs that still need to be overcome.
Affiliation(s)
- Ann-Kathrin Joerger
- Department of Neurosurgery, Klinikum Rechts der Isar, Technical University, 81675 Munich, Germany; (A.-K.J.); (B.M.)
| | - Carolin Albrecht
- Department of Neurosurgery, Klinikum Rechts der Isar, Technical University, 81675 Munich, Germany; (A.-K.J.); (B.M.)
| | - Veit Rothhammer
- Department of Neurology, University Hospital Erlangen, Friedrich-Alexander University Erlangen Nuremberg, 91054 Erlangen, Germany;
| | - Klaus Neuhaus
- Core Facility Microbiom, ZIEL Institute for Food & Health, Technical University of Munich, 85354 Freising, Germany;
| | - Arthur Wagner
- Department of Neurosurgery, Klinikum Rechts der Isar, Technical University, 81675 Munich, Germany; (A.-K.J.); (B.M.)
| | - Bernhard Meyer
- Department of Neurosurgery, Klinikum Rechts der Isar, Technical University, 81675 Munich, Germany; (A.-K.J.); (B.M.)
| | - Maria Wostrack
- Department of Neurosurgery, Klinikum Rechts der Isar, Technical University, 81675 Munich, Germany; (A.-K.J.); (B.M.)
| |
|
23
|
Robinson SL. Structure-guided metagenome mining to tap microbial functional diversity. Curr Opin Microbiol 2023; 76:102382. [PMID: 37741262 DOI: 10.1016/j.mib.2023.102382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2023] [Revised: 05/21/2023] [Accepted: 08/22/2023] [Indexed: 09/25/2023]
Abstract
Scientists now have access to millions of accurate three-dimensional (3D) models of protein structures. How do we leverage 3D structural models to learn about microbial functions encoded in metagenomes? Here, we review recent developments using protein structural features to mine metagenomes from diverse environments ranging from the human gut to soil and ocean viromes. We compare 3D protein structural methods to characterize antibiotic resistance phenotypes, nutrient cycling, and host-drug-microbe interactions. Broadly, we encourage the scientific community to look beyond global sequence and structure alignments by considering fine-grained descriptors such as distance to ligand, active site, and tertiary interactions between amino acid residues scaling to microbiomes. Finally, we highlight structure-inspired approaches to chart new areas of microbial protein-coding sequence space.
Affiliation(s)
- Serina L Robinson
- Department of Environmental Microbiology, Eawag, Swiss Federal Institute of Aquatic Science and Technology, Ueberlandstrasse 133, 8600 Dübendorf, Switzerland.
| |
|
24
|
Meng L, Delmont TO, Gaïa M, Pelletier E, Fernàndez-Guerra A, Chaffron S, Neches RY, Wu J, Kaneko H, Endo H, Ogata H. Genomic adaptation of giant viruses in polar oceans. Nat Commun 2023; 14:6233. [PMID: 37828003 PMCID: PMC10570341 DOI: 10.1038/s41467-023-41910-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Accepted: 09/24/2023] [Indexed: 10/14/2023] Open
Abstract
Despite being perennially frigid, polar oceans form an ecosystem hosting high and unique biodiversity. Various organisms show different adaptive strategies in this habitat, but how viruses adapt to this environment is largely unknown. Viruses of phyla Nucleocytoviricota and Mirusviricota are groups of eukaryote-infecting large and giant DNA viruses with genomes encoding a variety of functions. Here, by leveraging the Global Ocean Eukaryotic Viral database, we investigate the biogeography and functional repertoire of these viruses at a global scale. We first confirm the existence of an ecological barrier that clearly separates polar and nonpolar viral communities, and then demonstrate that temperature drives dramatic changes in the virus-host network at the polar-nonpolar boundary. Ancestral niche reconstruction suggests that adaptation of these viruses to polar conditions has occurred repeatedly over the course of evolution, with polar-adapted viruses in the modern ocean being scattered across their phylogeny. Numerous viral genes are specifically associated with polar adaptation, although most of their homologues are not identified as polar-adaptive genes in eukaryotes. These results suggest that giant viruses adapt to cold environments by changing their functional repertoire, and this viral evolutionary strategy is distinct from the polar adaptation strategy of their hosts.
Affiliation(s)
- Lingjie Meng
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan
| | - Tom O Delmont
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, F-91057, Evry, France
- Research Federation for the study of Global Ocean systems ecology and evolution, FR2022/Tara GOsee, F-75016, Paris, France
| | - Morgan Gaïa
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, F-91057, Evry, France
- Research Federation for the study of Global Ocean systems ecology and evolution, FR2022/Tara GOsee, F-75016, Paris, France
| | - Eric Pelletier
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, F-91057, Evry, France
- Research Federation for the study of Global Ocean systems ecology and evolution, FR2022/Tara GOsee, F-75016, Paris, France
| | - Antonio Fernàndez-Guerra
- Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
| | - Samuel Chaffron
- Research Federation for the study of Global Ocean systems ecology and evolution, FR2022/Tara GOsee, F-75016, Paris, France
- Nantes Université, École Centrale Nantes, CNRS, LS2N, UMR 6004, F-44000, Nantes, France
| | - Russell Y Neches
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan
| | - Junyi Wu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan
| | - Hiroto Kaneko
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan
| | - Hisashi Endo
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan
| | - Hiroyuki Ogata
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, 611-0011, Japan.
| |
|
25
|
Aguirre-Sánchez JR, Quiñones B, Ortiz-Muñoz JA, Prieto-Alvarado R, Vega-López IF, Martínez-Urtaza J, Lee BG, Chaidez C. Comparative Genomic Analyses of Virulence and Antimicrobial Resistance in Citrobacter werkmanii, an Emerging Opportunistic Pathogen. Microorganisms 2023; 11:2114. [PMID: 37630674 PMCID: PMC10457828 DOI: 10.3390/microorganisms11082114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 08/11/2023] [Accepted: 08/13/2023] [Indexed: 08/27/2023] Open
Abstract
Citrobacter werkmanii is an emerging and opportunistic human pathogen found in developing countries and is a causative agent of wound, urinary tract, and blood infections. The present study conducted comparative genomic analyses of a C. werkmanii strain collection from diverse geographical locations and sources to identify the relevant virulence and antimicrobial resistance genes. Pangenome analyses divided the examined C. werkmanii strains into five distinct clades; the subsequent classification identified genes with functional roles in carbohydrate and general metabolism for the core genome and genes with a role in secretion, adherence, and the mobilome for the shell and cloud genomes. A maximum-likelihood phylogenetic tree with a heatmap, showing the virulence and antimicrobial genes' presence or absence, demonstrated the presence of genes with functional roles in secretion systems, adherence, enterobactin, and siderophore among the strains belonging to the different clades. C. werkmanii strains in clade V, predominantly from clinical sources, harbored genes implicated in type II and type Vb secretion systems as well as multidrug resistance to aminoglycoside, beta-lactamase, fluoroquinolone, phenicol, trimethoprim, macrolides, sulfonamide, and tetracycline. In summary, these comparative genomic analyses have demonstrated highly pathogenic and multidrug-resistant genetic profiles in C. werkmanii strains, indicating a virulence potential for this commensal and opportunistic human pathogen.
Affiliation(s)
- José R. Aguirre-Sánchez
- Laboratorio Nacional para la Investigación en Inocuidad Alimentaria, Centro de Investigación en Alimentación y Desarrollo A.C. (CIAD), Coordinación Regional Culiacán, Culiacan 80110, Mexico;
| | - Beatriz Quiñones
- Produce Safety and Microbiology Research Unit, Western Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA 94710, USA; (B.Q.); (B.G.L.)
| | - José A. Ortiz-Muñoz
- Parque de Innovación Tecnológica de la Universidad Autónoma de Sinaloa, Culiacan 80040, Mexico; (J.A.O.-M.); (R.P.-A.); (I.F.V.-L.)
| | - Rogelio Prieto-Alvarado
- Parque de Innovación Tecnológica de la Universidad Autónoma de Sinaloa, Culiacan 80040, Mexico; (J.A.O.-M.); (R.P.-A.); (I.F.V.-L.)
| | - Inés F. Vega-López
- Parque de Innovación Tecnológica de la Universidad Autónoma de Sinaloa, Culiacan 80040, Mexico; (J.A.O.-M.); (R.P.-A.); (I.F.V.-L.)
| | - Jaime Martínez-Urtaza
- Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, 08193 Bellaterra, Spain;
| | - Bertram G. Lee
- Produce Safety and Microbiology Research Unit, Western Regional Research Center, Agricultural Research Service, U.S. Department of Agriculture, Albany, CA 94710, USA; (B.Q.); (B.G.L.)
| | - Cristóbal Chaidez
- Laboratorio Nacional para la Investigación en Inocuidad Alimentaria, Centro de Investigación en Alimentación y Desarrollo A.C. (CIAD), Coordinación Regional Culiacán, Culiacan 80110, Mexico;
| |
|
26
|
Obiol A, López-Escardó D, Salomaki ED, Wiśniewska MM, Forn I, Sà E, Vaqué D, Kolísko M, Massana R. Gene expression dynamics of natural assemblages of heterotrophic flagellates during bacterivory. Microbiome 2023; 11:134. [PMID: 37322519 PMCID: PMC10268365 DOI: 10.1186/s40168-023-01571-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/15/2022] [Accepted: 05/12/2023] [Indexed: 06/17/2023]
Abstract
BACKGROUND: Marine heterotrophic flagellates (HF) are dominant bacterivores in the ocean, where they represent the trophic link between bacteria and higher trophic levels and participate in the recycling of inorganic nutrients for regenerated primary production. Studying their activity and function in the ecosystem is challenging since most of the HFs in the ocean are still uncultured. In the present work, we investigated gene expression of natural HF communities during bacterivory in four unamended seawater incubations. RESULTS: The most abundant species growing in our incubations belonged to the taxonomic groups MAST-4, MAST-7, Chrysophyceae, and Telonemia. Gene expression dynamics were similar between incubations and could be divided into three states based on microbial counts, each state displaying distinct expression patterns. The analysis of samples where HF growth was highest revealed some highly expressed genes that could be related to bacterivory. Using available genomic and transcriptomic references, we identified 25 species growing in our incubations and used those to compare the expression levels of these specific genes. CONCLUSIONS: Our results indicate that several peptidases, together with some glycoside hydrolases and glycosyltransferases, are more expressed in phagotrophic than in phototrophic species, and thus could be used to infer the process of bacterivory in natural assemblages.
Affiliation(s)
- Aleix Obiol
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain.
| | - David López-Escardó
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain
| | - Eric D Salomaki
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
| | - Monika M Wiśniewska
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Irene Forn
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain
| | - Elisabet Sà
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain
| | - Dolors Vaqué
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain
| | - Martin Kolísko
- Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České Budějovice, Czech Republic
- Faculty of Science, University of South Bohemia, České Budějovice, Czech Republic
| | - Ramon Massana
- Department of Marine Biology and Oceanography, Institut de Ciències del Mar (ICM-CSIC), Passeig Marítim de la Barceloneta 37-49, Barcelona, Catalonia, 08003, Spain.
| |
|
27
|
Gaïa M, Meng L, Pelletier E, Forterre P, Vanni C, Fernandez-Guerra A, Jaillon O, Wincker P, Ogata H, Krupovic M, Delmont TO. Mirusviruses link herpesviruses to giant viruses. Nature 2023; 616:783-789. [PMID: 37076623 PMCID: PMC10132985 DOI: 10.1038/s41586-023-05962-4] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 10/27/2022] [Accepted: 03/16/2023] [Indexed: 04/21/2023]
Abstract
DNA viruses have a major influence on the ecology and evolution of cellular organisms [1-4], but their overall diversity and evolutionary trajectories remain elusive [5]. Here we carried out a phylogeny-guided genome-resolved metagenomic survey of the sunlit oceans and discovered plankton-infecting relatives of herpesviruses that form a putative new phylum dubbed Mirusviricota. The virion morphogenesis module of this large monophyletic clade is typical of viruses from the realm Duplodnaviria [6], with multiple components strongly indicating a common ancestry with animal-infecting Herpesvirales. Yet, a substantial fraction of mirusvirus genes, including hallmark transcription machinery genes missing in herpesviruses, are closely related homologues of giant eukaryotic DNA viruses from another viral realm, Varidnaviria. These remarkable chimaeric attributes connecting Mirusviricota to herpesviruses and giant eukaryotic viruses are supported by more than 100 environmental mirusvirus genomes, including a near-complete contiguous genome of 432 kilobases. Moreover, mirusviruses are among the most abundant and active eukaryotic viruses characterized in the sunlit oceans, encoding a diverse array of functions used during the infection of microbial eukaryotes from pole to pole. The prevalence, functional activity, diversification and atypical chimaeric attributes of mirusviruses point to a lasting role of Mirusviricota in the ecology of marine ecosystems and in the evolution of eukaryotic DNA viruses.
Affiliation(s)
- Morgan Gaïa
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Paris, France
| | - Lingjie Meng
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan
| | - Eric Pelletier
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Paris, France
| | - Patrick Forterre
- Institut de Biologie Intégrative de la Cellule (I2BC), CNRS, Université Paris-Saclay, Gif sur Yvette, France
- Département de Microbiologie, Institut Pasteur, Paris, France
| | - Chiara Vanni
- MARUM Center for Marine Environmental Sciences, University of Bremen, Bremen, Germany
| | - Antonio Fernandez-Guerra
- Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
| | - Olivier Jaillon
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Paris, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Paris, France
| | - Hiroyuki Ogata
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan
| | - Mart Krupovic
- Institut Pasteur, Université Paris Cité, CNRS UMR6047, Archaeal Virology Unit, Paris, France
| | - Tom O Delmont
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ. Evry, Université Paris-Saclay, Evry, France.
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, Paris, France.
| |
|
28
|
Rich MH, Sharrock AV, Mulligan TS, Matthews F, Brown AS, Lee-Harwood HR, Williams EM, Copp JN, Little RF, Francis JJB, Horvat CN, Stevenson LJ, Owen JG, Saxena MT, Mumm JS, Ackerley DF. A metagenomic library cloning strategy that promotes high-level expression of captured genes to enable efficient functional screening. bioRxiv 2023:2023.03.24.534183. [PMID: 36993673 PMCID: PMC10055417 DOI: 10.1101/2023.03.24.534183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Functional screening of environmental DNA (eDNA) libraries is a potentially powerful approach to discover enzymatic "unknown unknowns", but is usually heavily biased toward the tiny subset of genes preferentially transcribed and translated by the screening strain. We have overcome this by preparing an eDNA library via partial digest with restriction enzyme FatI (cuts CATG), causing a substantial proportion of ATG start codons to be precisely aligned with strong plasmid-encoded promoter and ribosome-binding sequences. Whereas we were unable to select nitroreductases from standard metagenome libraries, our FatI strategy yielded 21 nitroreductases spanning eight different enzyme families, each conferring resistance to the nitro-antibiotic niclosamide and sensitivity to the nitro-prodrug metronidazole. We showed expression could be improved by co-expressing rare tRNAs and encoded proteins purified directly using an embedded His6-tag. In a transgenic zebrafish model of metronidazole-mediated targeted cell ablation, our lead MhqN-family nitroreductase proved ~5-fold more effective than the canonical nitroreductase NfsB.
Affiliation(s)
- Michelle H Rich
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Abigail V Sharrock
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Timothy S Mulligan
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Frazer Matthews
- Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Alistair S Brown
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Hannah R Lee-Harwood
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Elsie M Williams
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Current address: Burnet Institute, Melbourne, Victoria 3004, Australia
| | - Janine N Copp
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Current addresses: Michael Smith Laboratories, University of British Columbia, Vancouver BC V6T 1Z4, Canada; Abcellera Biologics Inc, Vancouver BC V5Y 0A1, Canada
| | - Rory F Little
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Current address: Leibniz Institute for Natural Product Research and Infection Biology, Hans Knöll Institute, 07745 Jena, Germany
| | - Jenni JB Francis
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Claire N Horvat
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Current address: Teva Pharmaceuticals, Sydney, New South Wales 2113, Australia
| | - Luke J Stevenson
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Jeremy G Owen
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Meera T Saxena
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - Jeff S Mumm
- Department of Ophthalmology, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Department of Genetic Medicine, McKusick-Nathans Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
- Center for Nanomedicine, Wilmer Eye Institute, Johns Hopkins University School of Medicine, Baltimore, MD 21287, USA
| | - David F Ackerley
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
- Maurice Wilkins Centre for Molecular Biodiscovery, Victoria University of Wellington, Wellington 6012, New Zealand
| |
|
29
|
Krinos AI, Cohen NR, Follows MJ, Alexander H. Reverse engineering environmental metatranscriptomes clarifies best practices for eukaryotic assembly. BMC Bioinformatics 2023; 24:74. [PMID: 36869298 PMCID: PMC9983209 DOI: 10.1186/s12859-022-05121-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Accepted: 12/21/2022] [Indexed: 03/05/2023] Open
Abstract
BACKGROUND: Diverse communities of microbial eukaryotes in the global ocean provide a variety of essential ecosystem services, from primary production and carbon flow through trophic transfer to cooperation via symbioses. Increasingly, these communities are being understood through the lens of omics tools, which enable high-throughput processing of diverse communities. Metatranscriptomics offers an understanding of near real-time gene expression in microbial eukaryotic communities, providing a window into community metabolic activity. RESULTS: Here we present a workflow for eukaryotic metatranscriptome assembly, and validate the ability of the pipeline to recapitulate real and manufactured eukaryotic community-level expression data. We also include an open-source tool for simulating environmental metatranscriptomes for testing and validation purposes. We reanalyze previously published metatranscriptomic datasets using our metatranscriptome analysis approach. CONCLUSION: We determined that a multi-assembler approach improves eukaryotic metatranscriptome assembly based on recapitulated taxonomic and functional annotations from an in-silico mock community. The systematic validation of metatranscriptome assembly and annotation methods provided here is a necessary step to assess the fidelity of our community composition measurements and functional content assignments from eukaryotic metatranscriptomes.
Affiliation(s)
- Arianna I Krinos
- MIT-WHOI Joint Program in Oceanography and Applied Ocean Science and Engineering, Cambridge and Woods Hole, MA, USA.
- Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA.
- Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
| | - Natalie R Cohen
- Skidaway Institute of Oceanography, University of Georgia, Savannah, GA, USA
| | - Michael J Follows
- Department of Earth, Atmospheric, and Planetary Science, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Harriet Alexander
- Department of Biology, Woods Hole Oceanographic Institution, Woods Hole, MA, USA.
| |
|
30
|
Rizos I, Debeljak P, Finet T, Klein D, Ayata SD, Not F, Bittner L. Beyond the limits of the unassigned protist microbiome: inferring large-scale spatio-temporal patterns of Syndiniales marine parasites. ISME Commun 2023; 3:16. [PMID: 36854980 PMCID: PMC9975217 DOI: 10.1038/s43705-022-00203-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/27/2022] [Revised: 11/15/2022] [Accepted: 11/16/2022] [Indexed: 03/02/2023]
Abstract
Marine protists are major components of the oceanic microbiome that remain largely unrepresented in culture collections and genomic reference databases. The exploration of this uncharted protist diversity in oceanic communities relies essentially on studying genetic markers from the environment as taxonomic barcodes. Here we report that across six large-scale spatio-temporal planktonic surveys, half of the genetic barcodes remain taxonomically unassigned at the genus level, preventing a fine ecological understanding of numerous protist lineages. Among them, parasitic Syndiniales (Dinoflagellata) appear as the least described protist group. We have developed a computational workflow, integrating diverse 18S rDNA gene metabarcoding datasets, in order to infer large-scale ecological patterns at 100% similarity of the genetic marker, overcoming the limitation of taxonomic assignment. From a spatial perspective, we identified 2171 unassigned clusters, i.e., Syndiniales sequences with 100% similarity, exclusively shared between the Tropical/Subtropical Ocean and the Mediterranean Sea among all Syndiniales orders and 25 ubiquitous clusters shared within all the studied marine regions. From a temporal perspective, over three time series, we highlighted 39 unassigned clusters that follow rhythmic patterns of recurrence and are the best indicators of the parasite community's variation. These clusters hold potential as ecosystem change indicators, mirroring their associated host community responses. Our results underline the importance of Syndiniales in structuring planktonic communities through space and time, raising questions regarding host-parasite association specificity and the trophic mode of persistent Syndiniales, while providing an innovative framework for prioritizing unassigned protist taxa for further description.
Affiliation(s)
- Iris Rizos
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France.
- Sorbonne Université, CNRS, AD2M-UMR7144 Station Biologique de Roscoff, 29680, Roscoff, France.
| | - Pavla Debeljak
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
| | - Thomas Finet
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
| | - Dylan Klein
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
| | - Sakina-Dorothée Ayata
- Sorbonne Université, Laboratoire d'Océanographie et du Climat: Expérimentation et Analyses Numériques (LOCEAN, SU/CNRS/IRD/MNHN), 75252, Paris Cedex 05, France
| | - Fabrice Not
- Sorbonne Université, CNRS, AD2M-UMR7144 Station Biologique de Roscoff, 29680, Roscoff, France
| | - Lucie Bittner
- Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d'Histoire Naturelle, CNRS, Sorbonne Université, EPHE, Université des Antilles, Paris, France
- Institut Universitaire de France, Paris, France
| |
|
31
|
Improved Assembly of Metagenome-Assembled Genomes and Viruses in Tibetan Saline Lake Sediment by HiFi Metagenomic Sequencing. Microbiol Spectr 2023; 11:e0332822. [PMID: 36475839 PMCID: PMC9927493 DOI: 10.1128/spectrum.03328-22] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022] Open
Abstract
With the development and reduced costs of high-throughput sequencing technology, environmental dark matter, such as novel metagenome-assembled genomes (MAGs) and viruses, is now being discovered easily. However, due to read length limitations, MAGs and viromes often suffer from genome discontinuity and deficiencies in key functional elements. Here, by applying long-read sequencing technology to sediment samples from a Tibetan saline lake, we comprehensively analyzed the performance of high-fidelity (HiFi) reads and the possibility of integration with short-read next-generation sequencing (NGS) data. In total, 207 full-length nonredundant 16S rRNA gene sequences and 19 full-length nonredundant 18S rRNA genes were directly obtained from HiFi reads, which greatly surpassed the retrieval performance of NGS technology. We carried out a cross-sectional comparison among multiple assembly strategies, referred to as 'NGS', 'Hybrid (NGS+HiFi)', and 'HiFi'. Two MAGs and 29 viruses with circular genomes were reconstructed using HiFi reads alone, indicating the great power of the 'HiFi' approach to assemble high-quality microbial genomes. Among the 3 strategies, the 'Hybrid' approach produced the highest number of medium/high-quality MAGs and viral genomes, while the ratio of MAGs containing 16S rRNA genes was significantly improved in the 'HiFi' assembly results. Overall, our study provides a practical metagenomic resolution for analyzing complex environmental samples by taking advantage of both the short-read and HiFi long-read sequencing methods to extract the maximum amount of information, including data on prokaryotes, eukaryotes, and viruses, via the 'Hybrid' approach.
IMPORTANCE To expand the understanding of microbial dark matter in the environment, we did the first comparative evaluation of multiple assembly strategies based on high-throughput short-read and HiFi data from metagenomic sequencing of lake sediments. The results demonstrated great improvement of the 'Hybrid' assembly method (short-read next-generation sequencing data plus HiFi data) in the recovery of medium/high-quality MAGs and viral genomes. Further analysis showed that HiFi data are important for retrieving complete circular prokaryotic and viral genomes. Meanwhile, hundreds of full-length 16S/18S rRNA genes were assembled directly from HiFi data, which facilitated the species composition studies of complex environmental samples, especially for understanding micro-eukaryotes. Therefore, the application of the latest HiFi long-read sequencing could greatly improve the metagenomic assembly integrity and promote environmental microbiome research.
|
32
|
Nguyen R, Sokhansanj BA, Polikar R, Rosen GL. Complet+: a computationally scalable method to improve completeness of large-scale protein sequence clustering. PeerJ 2023; 11:e14779. [PMID: 36785708 PMCID: PMC9921987 DOI: 10.7717/peerj.14779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 01/03/2023] [Indexed: 02/10/2023] Open
Abstract
A major challenge for clustering algorithms is to balance the trade-off between homogeneity, i.e., the degree to which an individual cluster includes only related sequences, and completeness, i.e., the degree to which related sequences are kept together rather than broken up into multiple clusters. Most algorithms are conservative in grouping sequences with other sequences. Remote homologs may fail to be clustered together and instead form unnecessarily distinct clusters. The resulting clusters have high homogeneity but completeness that is too low. We propose Complet+, a computationally scalable post-processing method to increase the completeness of clusters without an undue cost in homogeneity. Complet+ proves to effectively merge closely related clusters of proteins that have verified structural relationships in the SCOPe classification scheme, improving the completeness of clustering results at little cost to homogeneity. Applying Complet+ to clusters obtained using MMseqs2's clusterupdate increases the V-measure by 0.09 and 0.05 at the SCOPe superfamily and family levels, respectively. Complet+ also creates more biologically representative clusters, as shown by a substantial increase in Adjusted Mutual Information (AMI) and Adjusted Rand Index (ARI) metrics when comparing predicted clusters to biological classifications. Complet+ similarly improves clustering metrics when applied to other methods, such as CD-HIT and linclust. Finally, we show that Complet+ runtime scales linearly with respect to the number of clusters being post-processed on a COG dataset of over 3 million sequences. Code and supplementary information are available on GitHub: https://github.com/EESI/Complet-Plus.
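To make the homogeneity/completeness trade-off in this abstract concrete, the following minimal sketch computes both scores and the V-measure (their harmonic mean) from the standard entropy-based definitions. It is not the Complet+ code; the function names and the toy family labels are assumptions made only for illustration.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a labelling."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """H(target | given) in nats, computed from joint label counts."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum((c / n) * log(c / given_counts[g]) for (g, _), c in joint.items())


def v_measure(truth, predicted):
    """Return (homogeneity, completeness, V-measure) for a clustering."""
    h_truth = entropy(truth)
    h_pred = entropy(predicted)
    # Homogeneity: each cluster should contain members of a single true class.
    homogeneity = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, predicted) / h_truth
    # Completeness: each true class should sit inside a single cluster.
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(predicted, truth) / h_pred
    total = homogeneity + completeness
    v = 0.0 if total == 0 else 2 * homogeneity * completeness / total
    return homogeneity, completeness, v


# Hypothetical toy data: two protein families of four sequences each.
truth = ["famA"] * 4 + ["famB"] * 4
over_split = [0, 0, 1, 1, 2, 2, 3, 3]   # conservative clustering: each family split in two
merged = [0, 0, 0, 0, 1, 1, 1, 1]       # after merging the related clusters

split_scores = v_measure(truth, over_split)
merged_scores = v_measure(truth, merged)
print(split_scores, merged_scores)
```

In this toy case the over-split clustering is perfectly homogeneous but only half complete, and merging the related clusters raises completeness without lowering homogeneity, which is the kind of improvement a post-processing merge step aims for.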
Affiliation(s)
- Rachel Nguyen
- Drexel University, Philadelphia, United States of America
| | | | - Robi Polikar
- Rowan University, Glassboro, NJ, United States of America
| | - Gail L. Rosen
- Drexel University, Philadelphia, United States of America
| |
|
33
|
Abstract
Escherichia coli contain a high level of genetic diversity and are generally associated with the guts of warm-blooded animals but have also been isolated from secondary habitats outside hosts. We used E. coli isolates from previous in situ microcosm experiments conducted under actual beach conditions and performed population-level genomic analysis to identify accessory genes associated with survival within the beach sand environment. E. coli strains capable of surviving had been selected for by seeding isolates originating from sand, sewage, and gull waste (n = 528; 176 from each source) into sand, which was sealed in microcosm chambers and buried for 45 days in the backshore beach of Lake Michigan. In the current work, survival-associated genes were identified by comparing the pangenome of viable E. coli populations at the end of the microcosm experiment with the original isolate collection and identifying loci enriched in the output samples. We found that environmental survival was associated with a wide variety of genetic factors, with the majority corresponding to metabolic enzymes and transport proteins. Of the 414 unique functions identified, most were present across E. coli phylogroups, except B2, which is often associated with human pathogens. Gene modules that were enriched in surviving populations included a betaine biosynthesis pathway, which produces an osmoprotectant, and the GABA (gamma-aminobutyrate) biosynthesis pathway, which aids in pH homeostasis and nutrient use versatility. Overall, these results demonstrate that the genetic flexibility within this species allows for survival in the environment for extended periods.
IMPORTANCE Escherichia coli is commonly used as an indicator of recent fecal pollution in recreational water despite its known ability to survive in secondary environments, such as beach sand. These long-term survivors from sand reservoirs can be introduced into the water column through wave action or runoff during precipitation events, thereby impacting the perception of local water quality. Current beach monitoring methods cannot differentiate long-term environmental survivors from E. coli derived from recent fecal input, resulting in inaccurate monitoring results and unnecessary beach closures. This work identified the genetic factors that are associated with long-term survivors, providing insight into the mechanistic basis for E. coli accumulation in beach sand. A greater understanding of the intrinsic ability of E. coli to survive long-term and conditions that promote such survival will provide evidence of the limitations of beach water quality assessments using this indicator.
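The abstract does not state which statistic was used to call a locus "enriched" in the surviving populations. As one possible illustration only, the sketch below applies a one-sided hypergeometric test, treating survivors as a draw without replacement from the seeded collection, which is itself a simplifying assumption. The function name and every count other than the 528 seeded isolates are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(pop_total, pop_with_gene, sample_total, sample_with_gene):
    """One-sided p-value P(X >= sample_with_gene) under the hypergeometric null.

    pop_total        : isolates in the original collection
    pop_with_gene    : original isolates carrying the gene
    sample_total     : surviving isolates examined
    sample_with_gene : surviving isolates carrying the gene
    """
    upper = min(pop_with_gene, sample_total)
    denom = comb(pop_total, sample_total)
    # Sum the probabilities of observing at least as many carriers as seen.
    tail = sum(
        comb(pop_with_gene, k) * comb(pop_total - pop_with_gene, sample_total - k)
        for k in range(sample_with_gene, upper + 1)
    )
    return tail / denom


# 528 seeded isolates (from the abstract); the remaining counts are made up.
p_value = hypergeom_enrichment_p(pop_total=528, pop_with_gene=60,
                                 sample_total=50, sample_with_gene=20)
print(f"enrichment p-value: {p_value:.3g}")
```

A real analysis would also need multiple-testing correction across all accessory genes and some way to account for the phylogenetic structure of the isolates, neither of which is sketched here.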
|
34
|
de Crécy-Lagard V, Amorin de Hegedus R, Arighi C, Babor J, Bateman A, Blaby I, Blaby-Haas C, Bridge AJ, Burley SK, Cleveland S, Colwell LJ, Conesa A, Dallago C, Danchin A, de Waard A, Deutschbauer A, Dias R, Ding Y, Fang G, Friedberg I, Gerlt J, Goldford J, Gorelik M, Gyori BM, Henry C, Hutinet G, Jaroch M, Karp PD, Kondratova L, Lu Z, Marchler-Bauer A, Martin MJ, McWhite C, Moghe GD, Monaghan P, Morgat A, Mungall CJ, Natale DA, Nelson WC, O’Donoghue S, Orengo C, O’Toole KH, Radivojac P, Reed C, Roberts RJ, Rodionov D, Rodionova IA, Rudolf JD, Saleh L, Sheynkman G, Thibaud-Nissen F, Thomas PD, Uetz P, Vallenet D, Carter EW, Weigele PR, Wood V, Wood-Charlson EM, Xu J. A roadmap for the functional annotation of protein families: a community perspective. Database (Oxford) 2022; 2022:baac062. [PMID: 35961013 PMCID: PMC9374478 DOI: 10.1093/database/baac062] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/07/2022] [Revised: 06/28/2022] [Accepted: 08/03/2022] [Indexed: 12/23/2022]
Abstract
Over the last 25 years, biology has entered the genomic era and is becoming a science of 'big data'. Most interpretations of genomic analyses rely on accurate functional annotations of the proteins encoded by more than 500 000 genomes sequenced to date. By different estimates, only half the predicted sequenced proteins carry an accurate functional annotation, and this percentage varies drastically between different organismal lineages. Such a large gap in knowledge hampers all aspects of biological enterprise and, thereby, is standing in the way of genomic biology reaching its full potential. A brainstorming meeting to address this issue funded by the National Science Foundation was held during 3-4 February 2022. Bringing together data scientists, biocurators, computational biologists and experimentalists within the same venue allowed for a comprehensive assessment of the current state of functional annotations of protein families. Further, major issues that were obstructing the field were identified and discussed, which ultimately allowed for the proposal of solutions on how to move forward.
Affiliation(s)
- Valérie de Crécy-Lagard
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Cecilia Arighi
- Department of Computer and Information Sciences, University of Delaware, Newark, DE 19713, USA
| | - Jill Babor
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Ian Blaby
- US Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Crysten Blaby-Haas
- Biology Department, Brookhaven National Laboratory, Upton, NY 11973, USA
| | - Alan J Bridge
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Stephen K Burley
- RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
| | - Stacey Cleveland
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Lucy J Colwell
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Ana Conesa
- Spanish National Research Council, Institute for Integrative Systems Biology, Paterna, Valencia 46980, Spain
| | - Christian Dallago
- TUM (Technical University of Munich) Department of Informatics, Bioinformatics & Computational Biology, i12, Boltzmannstr. 3, Garching/Munich 85748, Germany
| | - Antoine Danchin
- School of Biomedical Sciences, Li KaShing Faculty of Medicine, The University of Hong Kong, 21 Sassoon Road, Pokfulam, SAR Hong Kong 999077, China
| | - Anita de Waard
- Research Collaboration Unit, Elsevier, Jericho, VT 05465, USA
| | - Adam Deutschbauer
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Raquel Dias
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Yousong Ding
- Department of Medicinal Chemistry, Center for Natural Products, Drug Discovery and Development, University of Florida, Gainesville, FL 32610, USA
| | - Gang Fang
- NYU-Shanghai, Shanghai 200120, China
| | - Iddo Friedberg
- Department of Veterinary Microbiology and Preventive Medicine, Iowa State University, Ames, IA 50011, USA
| | - John Gerlt
- Institute for Genomic Biology and Departments of Biochemistry and Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
| | - Joshua Goldford
- Physics of Living Systems, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Mark Gorelik
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Benjamin M Gyori
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA 02115, USA
| | - Christopher Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA
| | - Geoffrey Hutinet
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Marshall Jaroch
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | - Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA 94025, USA
| | | | - Zhiyong Lu
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Aron Marchler-Bauer
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Maria-Jesus Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Claire McWhite
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ 08540, USA
| | - Gaurav D Moghe
- Plant Biology Section, School of Integrative Plant Science, Cornell University, Ithaca, NY 14853, USA
| | - Paul Monaghan
- Department of Agricultural Education and Communication, University of Florida, Gainesville, FL 32611, USA
| | - Anne Morgat
- Swiss-Prot group, SIB Swiss Institute of Bioinformatics, Centre Medical Universitaire, Geneva 4 CH-1211, Switzerland
| | - Christopher J Mungall
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Darren A Natale
- Georgetown University Medical Center, Washington, DC 20007, USA
| | - William C Nelson
- Biological Sciences Division, Pacific Northwest National Laboratories, Richland, WA 99354, USA
| | - Seán O’Donoghue
- School of Biotechnology and Biomolecular Sciences, University of NSW, Sydney, NSW 2052, Australia
| | - Christine Orengo
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | | | - Predrag Radivojac
- Khoury College of Computer Sciences, Northeastern University, Boston, MA 02115, USA
| | - Colbie Reed
- Department of Microbiology and Cell Sciences, University of Florida, Gainesville, FL 32611, USA
| | | | - Dmitri Rodionov
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA 92037, USA
| | - Irina A Rodionova
- Department of Bioengineering, Division of Engineering, University of California at San Diego, La Jolla, CA 92093-0412, USA
| | - Jeffrey D Rudolf
- Department of Chemistry, University of Florida, Gainesville, FL 32611, USA
| | - Lana Saleh
- New England Biolabs, Ipswich, MA 01938, USA
| | - Gloria Sheynkman
- Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH), 8600 Rockville Pike, Bethesda, MD 20817, USA
| | - Paul D Thomas
- Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
| | - Peter Uetz
- Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| | - David Vallenet
- LABGeM, Génomique Métabolique, CEA, Genoscope, Institut François Jacob, Université d’Évry, Université Paris-Saclay, CNRS, Evry 91057, France
| | - Erica Watson Carter
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
| | | | - Valerie Wood
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Elisha M Wood-Charlson
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
| | - Jin Xu
- Department of Plant Pathology, University of Florida Citrus Research and Education Center, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
| |
|
35
|
Delmont TO, Gaia M, Hinsinger DD, Frémont P, Vanni C, Fernandez-Guerra A, Eren AM, Kourlaiev A, d'Agata L, Clayssen Q, Villar E, Labadie K, Cruaud C, Poulain J, Da Silva C, Wessner M, Noel B, Aury JM, de Vargas C, Bowler C, Karsenti E, Pelletier E, Wincker P, Jaillon O. Functional repertoire convergence of distantly related eukaryotic plankton lineages abundant in the sunlit ocean. Cell Genomics 2022; 2:100123. [PMID: 36778897 PMCID: PMC9903769 DOI: 10.1016/j.xgen.2022.100123] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 02/16/2021] [Revised: 12/10/2021] [Accepted: 04/04/2022] [Indexed: 12/20/2022]
Abstract
Marine planktonic eukaryotes play critical roles in global biogeochemical cycles and climate. However, their poor representation in culture collections limits our understanding of the evolutionary history and genomic underpinnings of planktonic ecosystems. Here, we used 280 billion Tara Oceans metagenomic reads from polar, temperate, and tropical sunlit oceans to reconstruct and manually curate more than 700 abundant and widespread eukaryotic environmental genomes ranging from 10 Mbp to 1.3 Gbp. This genomic resource covers a wide range of poorly characterized eukaryotic lineages that complement long-standing contributions from culture collections while better representing plankton in the upper layer of the oceans. We performed the first, to our knowledge, comprehensive genome-wide functional classification of abundant unicellular eukaryotic plankton, revealing four major groups connecting distantly related lineages. Neither trophic modes of plankton nor its vertical evolutionary history could completely explain the functional repertoire convergence of major eukaryotic lineages that coexisted within oceanic currents for millions of years.
Affiliation(s)
- Tom O. Delmont
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Morgan Gaia
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Damien D. Hinsinger
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Paul Frémont
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Chiara Vanni
- Microbial Genomics and Bioinformatics Research Group, Max Planck Institute for Marine Microbiology, Bremen, Germany
| | - Antonio Fernandez-Guerra
- Lundbeck Foundation GeoGenetics Centre, GLOBE Institute, University of Copenhagen, Copenhagen, Denmark
| | - A. Murat Eren
- Helmholtz Institute for Functional Marine Biodiversity at Oldenburg, Germany
| | - Artem Kourlaiev
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Leo d'Agata
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Quentin Clayssen
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Emilie Villar
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
| | - Karine Labadie
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Corinne Cruaud
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Julie Poulain
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Corinne Da Silva
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Marc Wessner
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Benjamin Noel
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Jean-Marc Aury
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Colomban de Vargas
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
- Sorbonne Université and CNRS, UMR 7144 (AD2M), ECOMAP, Station Biologique de Roscoff, Roscoff, France
| | - Chris Bowler
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
- Institut de Biologie de l’ENS, Département de Biologie, École Normale Supérieure, CNRS, INSERM, Université PSL, Paris, France
| | - Eric Karsenti
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
- Sorbonne Université and CNRS, UMR 7144 (AD2M), ECOMAP, Station Biologique de Roscoff, Roscoff, France
- Directors’ Research, European Molecular Biology Laboratory, Heidelberg, Germany
| | - Eric Pelletier
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| | - Olivier Jaillon
- Génomique Métabolique, Genoscope, Institut François-Jacob, CEA, CNRS, Université d'Evry, Université Paris-Saclay, 91057 Evry, France
- Research Federation for the Study of Global Ocean Systems Ecology and Evolution, FR2022/Tara GOSEE, 75016 Paris, France
| |
|