1
|
Paley S, Caspi R, O'Maille P, Karp PD. The Comparative Genome Dashboard. Front Microbiol 2024; 15:1447632. [PMID: 39144229 PMCID: PMC11322064 DOI: 10.3389/fmicb.2024.1447632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 07/15/2024] [Indexed: 08/16/2024] Open
Abstract
The Comparative Genome Dashboard is a web-based software tool for interactive exploration of the similarities and differences in gene functions between organisms. It provides a high-level graphical survey of cellular functions, and enables the user to drill down to examine subsystems of interest in greater detail. At its highest level the Comparative Dashboard contains panels for cellular systems such as biosynthesis, energy metabolism, transport, and response to stimulus. Each panel contains a set of bar graphs that plot the numbers of compounds or gene products for each organism across a set of subsystems of that panel. Users can interactively drill down to focus on subsystems of interest and see grids of compounds produced or consumed by each organism, specific GO term assignments, pathway diagrams, and links to more detailed comparison pages. For example, the dashboard enables users to compare the cofactors that a set of organisms can synthesize, the metal ions that they are able to transport, their DNA damage repair capabilities, their biofilm-formation genes, and their viral response proteins. The dashboard enables users to quickly perform comprehensive comparisons at varying levels of detail.
Affiliation(s)
- Suzanne Paley
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| | - Ron Caspi
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| | - Paul O'Maille
- Biosciences Division, SRI International, Menlo Park, CA, United States
| | - Peter D. Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| |
|
2
|
Paley S, Caspi R, O'Maille P, Karp PD. The Comparative Genome Dashboard. bioRxiv 2024:2024.06.11.598546. [PMID: 38915637 PMCID: PMC11195217 DOI: 10.1101/2024.06.11.598546] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024]
Abstract
The Comparative Genome Dashboard is a web-based software tool for interactive exploration of the similarities and differences in gene functions between organisms. It provides a high-level graphical survey of cellular functions, and enables the user to drill down to examine subsystems of interest in greater detail. At its highest level the Comparative Dashboard contains panels for cellular systems such as biosynthesis, energy metabolism, transport, and response to stimulus. Each panel contains a set of bar graphs that plot the numbers of compounds or gene products for each organism across a set of subsystems of that panel. Users can interactively drill down to focus on subsystems of interest and see grids of compounds produced or consumed by each organism, specific GO term assignments, pathway diagrams, and links to more detailed comparison pages. For example, the dashboard enables users to compare the cofactors that a set of organisms can synthesize, the metal ions that they are able to transport, their DNA damage repair capabilities, their biofilm-formation genes, and their viral response proteins. The dashboard enables users to quickly perform comprehensive comparisons at varying levels of detail.
Affiliation(s)
- Suzanne Paley
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| | - Ron Caspi
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| | - Paul O'Maille
- Biosciences Division, SRI International, Menlo Park, CA, United States
| | - Peter D Karp
- Bioinformatics Research Group, SRI International, Menlo Park, CA, United States
| |
|
3
|
Pascal Andreu V, Augustijn HE, Chen L, Zhernakova A, Fu J, Fischbach MA, Dodd D, Medema MH. gutSMASH predicts specialized primary metabolic pathways from the human gut microbiota. Nat Biotechnol 2023; 41:1416-1423. [PMID: 36782070 PMCID: PMC10423304 DOI: 10.1038/s41587-023-01675-1] [Citation(s) in RCA: 28] [Impact Index Per Article: 28.0] [Received: 03/02/2021] [Accepted: 01/10/2023] [Indexed: 02/15/2023]
Abstract
The gut microbiota produce hundreds of small molecules, many of which modulate host physiology. Although efforts have been made to identify biosynthetic genes for secondary metabolites, the chemical output of the gut microbiome consists predominantly of primary metabolites. Here we introduce the gutSMASH algorithm for identification of primary metabolic gene clusters, and we used it to systematically profile gut microbiome metabolism, identifying 19,890 gene clusters in 4,240 high-quality microbial genomes. We found marked differences in pathway distribution among phyla, reflecting distinct strategies for energy capture. These data explain taxonomic differences in short-chain fatty acid production and suggest a characteristic metabolic niche for each taxon. Analysis of 1,135 individuals from a Dutch population-based cohort shows that the level of microbiome-derived metabolites in plasma and feces is almost completely uncorrelated with the metagenomic abundance of corresponding metabolic genes, indicating a crucial role for pathway-specific gene regulation and metabolite flux. This work is a starting point for understanding differences in how bacterial taxa contribute to the chemistry of the microbiome.
Affiliation(s)
| | - Hannah E Augustijn
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Lianmin Chen
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Department of Pediatrics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Changzhou Medical Center, Nanjing Medical University, Changzhou, China
- Department of Cardiology, Nanjing Medical University, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Alexandra Zhernakova
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Jingyuan Fu
- Department of Genetics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
- Department of Pediatrics, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands
| | - Michael A Fischbach
- Department of Bioengineering, Stanford University, Stanford, CA, USA.
- Department of Microbiology and Immunology, Stanford University, Stanford, CA, USA.
- Chan Zuckerberg Biohub, San Francisco, CA, USA.
| | - Dylan Dodd
- Department of Microbiology and Immunology, Stanford University, Stanford, CA, USA.
- Department of Pathology, Stanford University, Stanford, CA, USA.
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
| |
|
4
|
Arikawa K, Hosokawa M. Uncultured prokaryotic genomes in the spotlight: An examination of publicly available data from metagenomics and single-cell genomics. Comput Struct Biotechnol J 2023; 21:4508-4518. [PMID: 37771751 PMCID: PMC10523443 DOI: 10.1016/j.csbj.2023.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 09/10/2023] [Accepted: 09/10/2023] [Indexed: 09/30/2023] Open
Abstract
Owing to the ineffectiveness of traditional culture techniques for the vast majority of microbial species, culture-independent analyses utilizing next-generation sequencing and bioinformatics have become essential for gaining insight into microbial ecology and function. This mini-review focuses on two essential methods for obtaining genetic information from uncultured prokaryotes: metagenomics and single-cell genomics. We analyzed the registration status of uncultured prokaryotic genome data from major public databases and assessed the advantages and limitations of both methods. Metagenomics generates a significant quantity of sequence data and multiple prokaryotic genomes using straightforward experimental procedures. However, in ecosystems with high microbial diversity, such as soil, most genes are presented as brief, disconnected contigs, and lack association of highly conserved genes and mobile genetic elements with individual species genomes. Although technically more challenging, single-cell genomics offers valuable insights into complex ecosystems by providing strain-resolved genomes, addressing issues in metagenomics. Recent technological advancements, such as long-read sequencing, machine learning algorithms, and in silico protein structure prediction, in combination with vast genomic data, have the potential to overcome the current technical challenges and facilitate a deeper understanding of uncultured microbial ecosystems and microbial dark matter genes and proteins. In light of this, it is imperative that continued innovation in both methods and technologies take place to create high-quality reference genome databases that will support future microbial research and industrial applications.
Affiliation(s)
- Koji Arikawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
| | - Masahito Hosokawa
- Department of Life Science and Medical Bioscience, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo 162-8480, Japan
- bitBiome, Inc., 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Research Organization for Nano and Life Innovation, Waseda University, 513 Wasedatsurumaki-cho, Shinjuku-ku, Tokyo 162-0041, Japan
- Institute for Advanced Research of Biosystem Dynamics, Waseda Research Institute for Science and Engineering, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Computational Bio Big-Data Open Innovation Laboratory, National Institute of Advanced Industrial Science and Technology, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| |
|
5
|
Metagenomes from Soils along an Agricultural Transect in Ulster County, New York. Microbiol Resour Announc 2023; 12:e0101522. [PMID: 36779724 PMCID: PMC10019288 DOI: 10.1128/mra.01015-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/14/2023] Open
Abstract
Many modern farming practices negatively impact ecosystems on the local and global scales. Here, we assessed the taxonomic structures of 48 soil microbial communities along an agricultural transect using 16S rRNA and internal transcribed spacer (ITS) amplicon sequencing. We further characterized the functional structures of a subsample of 12 microbiomes using whole-genome sequencing.
|
6
|
Wei W, Cao B, Xu D, Liu Y, Zhang X, Wang Y. Development and validation of a prognostic prediction model for iron metabolism-related genes in patients with pancreatic adenocarcinoma. Front Genet 2023; 13:1058062. [PMID: 36685915 PMCID: PMC9846079 DOI: 10.3389/fgene.2022.1058062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/30/2022] [Indexed: 01/05/2023] Open
Abstract
Background: Pancreatic adenocarcinoma (PAAD) is one of the most aggressive tumors of the digestive tract, with a low surgical resection rate and insensitivity to radiotherapy and chemotherapy. Existing evidence suggests that regulation of ferroptosis can induce PAAD cell death, inhibit tumor growth, and may synergistically improve the sensitivity of other antitumor drugs. However, there is little systematic research on iron metabolism-related genes in PAAD. In this study, a risk-score system of PAAD iron metabolism-related genes was designed, tested, and verified to be robust. Materials and Methods: The TCGA database was used to download the messenger RNA (mRNA) expression profiles and clinical characteristics of 177 PAAD patients. After identifying dysregulated iron metabolism-related genes between PAAD-related tissues and adjacent normal tissues, univariate Cox proportional hazards regression and the LASSO regression algorithm were used to establish a prognostic risk-score system and construct a nomogram to estimate 1-, 2-, and 3-year survival in PAAD patients. Finally, selected genes were validated by quantitative PCR (q-PCR). Results: A nine-gene iron metabolism-related risk-score system for PAAD was constructed and validated. The clinicopathological characteristics of age, histologic grade, pathologic stage, T stage, residual tumor, and primary therapy outcome were all worse in patients with a higher risk score. Further, immunohistochemistry results for SLC2A1, MBOAT2, XDH, CTSE, MOCOS, and ATP6V0A4 confirmed that patients with higher expression are more malignant. Then, a nomogram with the nine-gene risk-score system as a separate clinical factor was used to predict the 1-, 2-, and 3-year overall survival rate of PAAD patients. Results of q-PCR showed that 8 of the 9 genes screened were significantly up-regulated in at least one PAAD cell line, and one gene was significantly down-regulated in three PAAD cell lines.
Conclusion: To conclude, we generated a nine-gene system linked to iron metabolism as an independent indicator for predicting PAAD prognosis, therefore presenting a possible prognostic biomarker and potential treatment targets for PAAD.
Affiliation(s)
- Wenhan Wei
- Department of Gastroenterology, Affiliated Hangzhou First People’s Hospital, Zhejiang University School of Medicine, Hangzhou, China; China State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou, China
| | - Bin Cao
- Department of Pharmacy, First Affiliated Hospital, Huzhou University, Huzhou, China
| | - Dongchao Xu
- Department of Gastroenterology, Affiliated Hangzhou First People’s Hospital, Zhejiang University School of Medicine, Hangzhou, China; Hangzhou Institute of Digestive Diseases, Hangzhou, China; Key Laboratory of Integrated Traditional Chinese and Western Medicine for Biliary and Pancreatic Diseases of Zhejiang Province, Hangzhou, China
| | - Yusheng Liu
- China State Key Laboratory of CAD&CG, Zhejiang University, Hangzhou, China
| | - Xiaofeng Zhang
- Department of Gastroenterology, Affiliated Hangzhou First People’s Hospital, Zhejiang University School of Medicine, Hangzhou, China; Hangzhou Institute of Digestive Diseases, Hangzhou, China; Key Laboratory of Integrated Traditional Chinese and Western Medicine for Biliary and Pancreatic Diseases of Zhejiang Province, Hangzhou, China (*Correspondence: Xiaofeng Zhang; Yu Wang)
| | - Yu Wang
- Department of Gastroenterology, Affiliated Hangzhou First People’s Hospital, Zhejiang University School of Medicine, Hangzhou, China; Hangzhou Institute of Digestive Diseases, Hangzhou, China; Key Laboratory of Integrated Traditional Chinese and Western Medicine for Biliary and Pancreatic Diseases of Zhejiang Province, Hangzhou, China (*Correspondence: Xiaofeng Zhang; Yu Wang)
| |
|
7
|
Lin L, Lai Z, Yang H, Zhang J, Qi W, Xie F, Mao S. Genome-centric investigation of bile acid metabolizing microbiota of dairy cows and associated diet-induced functional implications. ISME J 2023; 17:172-184. [PMID: 36261508 PMCID: PMC9750977 DOI: 10.1038/s41396-022-01333-5] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 05/23/2022] [Revised: 10/03/2022] [Accepted: 10/07/2022] [Indexed: 11/05/2022]
Abstract
Although the importance of bile acid (BA)-related microbial strains and enzymes is increasingly recognized for monogastric animals, a lack of knowledge about BA metabolism in dairy cows limits functional applications aimed at the targeted modulation of microbe-host interactions for animal production and health. In the present study, 108 content samples from six intestinal regions of dairy cows were used for shotgun metagenomic sequencing. Overall, 372 high-quality metagenome-assembled genomes (MAGs) were involved in BA deconjugation, oxidation, and dehydroxylation pathways. Furthermore, the BA-metabolizing microbiome predominately occurred in the large intestine, resulting in the accumulation of secondary unconjugated BAs. Comparative genomic analysis revealed that the bile salt hydrolase (BSH)-carrying microbial populations managed with the selective environment of the dairy cow intestine by adopting numerous host mucin glycan-degrading abilities. A sequence similarity network analysis classified 439 BSH homologs into 12 clusters and identified different clusters with diverse evolution, taxonomy, signal peptides, and ecological niches. Our omics data further revealed that the strains of Firmicutes bacterium CAG-110 processed the increased abundance of BSHs from Cluster 1, coinciding with the changes in the colon cholic acid concentration after grain introduction, and were intricately related to intestinal inflammation. This study is the first to use a genome-centric approach and whole intestine-targeted metabolomics to reveal microbial BA metabolism and its diet-induced functional implications in dairy cows. These findings provide insight into the manipulation of intestinal microorganisms for improving host health.
Affiliation(s)
- Limei Lin
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Zheng Lai
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Huisheng Yang
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Jiyou Zhang
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Weibiao Qi
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Fei Xie
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| | - Shengyong Mao
- Ruminant Nutrition and Feed Engineering Technology Research Center, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China; Laboratory of Gastrointestinal Microbiology, Jiangsu Key Laboratory of Gastrointestinal Nutrition and Animal Health, National Center for International Research on Animal Gut Nutrition, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, 210095, China
| |
|
8
|
Richardson L, Allen B, Baldi G, Beracochea M, Bileschi M, Burdett T, Burgin J, Caballero-Pérez J, Cochrane G, Colwell L, Curtis T, Escobar-Zepeda A, Gurbich T, Kale V, Korobeynikov A, Raj S, Rogers A, Sakharova E, Sanchez S, Wilkinson D, Finn R. MGnify: the microbiome sequence data analysis resource in 2023. Nucleic Acids Res 2022; 51:D753-D759. [PMID: 36477304 PMCID: PMC9825492 DOI: 10.1093/nar/gkac1080] [Citation(s) in RCA: 64] [Impact Index Per Article: 32.0] [Received: 09/18/2022] [Revised: 10/19/2022] [Accepted: 11/01/2022] [Indexed: 12/12/2022] Open
Abstract
The MGnify platform (https://www.ebi.ac.uk/metagenomics) facilitates the assembly, analysis and archiving of microbiome-derived nucleic acid sequences. The platform provides access to taxonomic assignments and functional annotations for nearly half a million analyses covering metabarcoding, metatranscriptomic, and metagenomic datasets, which are derived from a wide range of different environments. Over the past 3 years, MGnify has not only grown in terms of the number of datasets contained but also increased the breadth of analyses provided, such as the analysis of long-read sequences. The MGnify protein database now exceeds 2.4 billion non-redundant sequences predicted from metagenomic assemblies. This collection is now organised into a relational database making it possible to understand the genomic context of the protein through navigation back to the source assembly and sample metadata, marking a major improvement. To extend beyond the functional annotations already provided in MGnify, we have applied deep learning-based annotation methods. The technology underlying MGnify's Application Programming Interface (API) and website has been upgraded, and we have enabled the ability to perform downstream analysis of the MGnify data through the introduction of a coupled Jupyter Lab environment.
Affiliation(s)
- Lorna Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Ben Allen
- School of Engineering, Newcastle University, Newcastle upon Tyne, UK
| | - Germana Baldi
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Martin Beracochea
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | | | - Tony Burdett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Juan Caballero-Pérez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Lucy J Colwell
- Google Research, Brain Team, Mountain View, CA, USA,Department of Chemistry, University of Cambridge, Cambridge, UK
| | - Tom Curtis
- School of Engineering, Newcastle University, Newcastle upon Tyne, UK
| | - Alejandra Escobar-Zepeda
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Tatiana A Gurbich
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Varsha Kale
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnology, St Petersburg State University, St Petersburg, Russia
| | - Shriya Raj
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Alexander B Rogers
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Ekaterina Sakharova
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Santiago Sanchez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, UK
| | | | - Robert D Finn
- To whom correspondence should be addressed. Tel: +44 1223 492679;
| |
|
9
|
Yücel H, Ekinci K. Carbohydrate active enzyme system in rumen fungi: a review. International Journal of Secondary Metabolite 2022. [DOI: 10.21448/ijsm.1075030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/04/2022] Open
Abstract
Hydrolysis and dehydration reactions of carbohydrates, which are used as energy raw materials by all living things in nature, are controlled by Carbohydrate Active Enzyme (CAZy) systems. These enzymes are also used in different industrial areas today. There are different types of microorganisms that have the CAZy system and are used in the industrial sector. Apart from current organisms, there are also rumen fungi within the group of candidate microorganisms with the CAZy system. It has been reported that the xylanase (EC 3.2.1.8 and EC 3.2.1.37) enzyme, a member of the glycoside hydrolase enzyme family obtained from Trichoderma sp. and used especially in areas such as the bread, paper, and feed industries, is synthesized in greater amounts in rumen fungi such as Orpinomyces sp. and Neocallimastix sp. Therefore, this study reviews Neocallimastix sp., Orpinomyces sp., Caecomyces sp., Piromyces sp., and Anaeromyces sp., registered in the CAZy and Mycocosm databases, for rumen fungi both to have CAZy enzyme activity and to be alternative microorganisms in industry. Furthermore, the CAZy enzyme activities of the strains are investigated. The review shows that Neocallimastix sp. and Orpinomyces sp. are considered candidate microorganisms.
Affiliation(s)
- Halit Yücel
- Kahramanmaraş Sütçü İmam University, Faculty of Agriculture
| | | |
|
10
|
Classification of the plant-associated lifestyle of Pseudomonas strains using genome properties and machine learning. Sci Rep 2022; 12:10857. [PMID: 35760985 PMCID: PMC9237127 DOI: 10.1038/s41598-022-14913-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/04/2021] [Accepted: 06/15/2022] [Indexed: 12/30/2022] Open
Abstract
The rhizosphere, the region of soil surrounding roots of plants, is colonized by a unique population of Plant Growth Promoting Rhizobacteria (PGPR). Many important PGPR as well as plant pathogens belong to the genus Pseudomonas. There is, however, uncertainty about the divide between beneficial and pathogenic strains, because genomic features previously thought to be distinguishing have limited power to separate these strains. Here we used the Genome Properties (GP) common biological pathways annotation system and Machine Learning (ML) to establish the relationship between the genome-wide GP composition and the plant-associated lifestyle of 91 Pseudomonas strains isolated from the rhizosphere and the phyllosphere, representing both plant-associated phenotypes. GP enrichment analysis, Random Forest model fitting, and feature selection revealed 28 discriminating features. A test set of 75 new strains confirmed the importance of the selected features for classification. The results suggest that GP annotations provide a promising computational tool to better classify the plant-associated lifestyle.
|
11
|
Beresford-Jones BS, Forster SC, Stares MD, Notley G, Viciani E, Browne HP, Boehmler DJ, Soderholm AT, Kumar N, Vervier K, Cross JR, Almeida A, Lawley TD, Pedicord VA. The Mouse Gastrointestinal Bacteria Catalogue enables translation between the mouse and human gut microbiotas via functional mapping. Cell Host Microbe 2021; 30:124-138.e8. [PMID: 34971560 PMCID: PMC8763404 DOI: 10.1016/j.chom.2021.12.003] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 08/05/2021] [Revised: 10/05/2021] [Accepted: 11/30/2021] [Indexed: 12/12/2022]
Abstract
Human health and disease have increasingly been shown to be impacted by the gut microbiota, and mouse models are essential for investigating these effects. However, the compositions of human and mouse gut microbiotas are distinct, limiting translation of microbiota research between these hosts. To address this, we constructed the Mouse Gastrointestinal Bacteria Catalogue (MGBC), a repository of 26,640 high-quality mouse microbiota-derived bacterial genomes. This catalog enables species-level analyses for mapping functions of interest and identifying functionally equivalent taxa between the microbiotas of humans and mice. We have complemented this with a publicly deposited collection of 223 bacterial isolates, including 62 previously uncultured species, to facilitate experimental investigation of individual commensal bacteria functions in vitro and in vivo. Together, these resources provide the ability to identify and test functionally equivalent members of the host-specific gut microbiotas of humans and mice and support the informed use of mouse models in human microbiota research.
Affiliation(s)
- Benjamin S Beresford-Jones
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, Cambridge, UK; Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, UK
| | | | - Mark D Stares
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - George Notley
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Elisa Viciani
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Hilary P Browne
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Daniel J Boehmler
- Donald B. and Catherine C. Marron Cancer Metabolism Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Amelia T Soderholm
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, Cambridge, UK; Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, UK
| | - Nitin Kumar
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Kevin Vervier
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Justin R Cross
- Donald B. and Catherine C. Marron Cancer Metabolism Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - Alexandre Almeida
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; European Bioinformatics Institute, EMBL-EBI, Wellcome Genome Campus, Hinxton, UK
| | - Trevor D Lawley
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
| | - Virginia A Pedicord
- Cambridge Institute of Therapeutic Immunology and Infectious Disease, Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, Cambridge, UK; Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, UK.
| |
|
12
|
Gómez-Pérez D, Kemen E. Predicting Lifestyle from Positive Selection Data and Genome Properties in Oomycetes. Pathogens 2021; 10:807. [PMID: 34202069 PMCID: PMC8308905 DOI: 10.3390/pathogens10070807] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/30/2021] [Revised: 06/19/2021] [Accepted: 06/21/2021] [Indexed: 11/30/2022] Open
Abstract
As evidenced in parasitism, host and niche shifts are a source of genomic and phenotypic diversification. Exemplary is a reduction in the core metabolism as parasites adapt to a particular host, while the accessory genome often maintains a high degree of diversification. However, selective pressures acting on the genome of organisms that have undergone recent lifestyle or host changes have not been fully investigated. Here, we developed a comparative genomics approach to study underlying adaptive trends in oomycetes, a eukaryotic phylum with a wide and diverse range of economically important plant and animal parasitic lifestyles. Our analysis reveals converging evolution on biological processes for oomycetes that have similar lifestyles. Moreover, we find that certain functions, in particular carbohydrate metabolism, transport, and signaling, are important for host and environmental adaptation in oomycetes. Given the high correlation between lifestyle and genome properties in our oomycete dataset, together with the known convergent evolution of fungal and oomycete genomes, we developed a model that predicts plant pathogenic lifestyles with high accuracy based on functional annotations. These insights into how selective pressures correlate with lifestyle may be crucial to better understand host/lifestyle shifts and their impact on the genome.
Affiliation(s)
| | - Eric Kemen
- Center for Plant Molecular Biology (ZMBP), University of Tübingen, 72074 Tübingen, Germany
| |
|
13
|
Xie F, Jin W, Si H, Yuan Y, Tao Y, Liu J, Wang X, Yang C, Li Q, Yan X, Lin L, Jiang Q, Zhang L, Guo C, Greening C, Heller R, Guan LL, Pope PB, Tan Z, Zhu W, Wang M, Qiu Q, Li Z, Mao S. An integrated gene catalog and over 10,000 metagenome-assembled genomes from the gastrointestinal microbiome of ruminants. MICROBIOME 2021; 9:137. [PMID: 34118976 PMCID: PMC8199421 DOI: 10.1186/s40168-021-01078-x] [Citation(s) in RCA: 98] [Impact Index Per Article: 32.7] [Received: 01/15/2021] [Accepted: 04/15/2021] [Indexed: 05/20/2023]
Abstract
BACKGROUND Gastrointestinal tract (GIT) microbiomes in ruminants play major roles in host health and thus animal production. However, we lack an integrated understanding of microbial community structure and function, as prior studies are predominantly biased towards the rumen. Therefore, to acquire a microbiota inventory of the discrete GIT compartments, we used shotgun metagenomics to profile the microbiota of 370 samples that represent 10 GIT regions of seven ruminant species. RESULTS Our analyses reconstructed a GIT microbial reference catalog with > 154 million nonredundant genes and identified 8745 uncultured candidate species from over 10,000 metagenome-assembled genomes. The integrated gene catalog across the GIT regions demonstrates spatial associations between the microbiome and physiological adaptations, and 8745 newly characterized genomes substantially expand the genomic landscape of ruminant microbiota, particularly those from the lower gut. This substantially expands the previously known set of endogenous microbial diversity and the taxonomic classification rate of the GIT microbiome. These candidate species encode hundreds of enzymes and novel biosynthetic gene clusters that improve our understanding of methane production and feed efficiency in ruminants. Overall, this study expands the characterization of the ruminant GIT microbiota at unprecedented spatial resolution and offers clues for improving ruminant livestock production in the future. CONCLUSIONS Having access to a comprehensive gene catalog and collections of microbial genomes provides the ability to efficiently perform genome-based analysis to achieve a detailed classification of GIT microbial ecosystem composition. Our study will bring unprecedented power in future association studies to investigate the impact of the GIT microbiota on ruminant health and production.
Affiliation(s)
- Fei Xie
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Wei Jin
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Huazhe Si
- College of Animal Science and Technology, Jilin Agricultural University, Changchun, China
| | - Yuan Yuan
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
| | - Ye Tao
- Shanghai BIOZERON Biotechnology Company Ltd, Shanghai, China
| | - Junhua Liu
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Xiaoxu Wang
- Department of Special Economic Animal Nutrition and Feed Science, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, China
| | - Chengjian Yang
- Buffalo Research Institute, Chinese Academy of Agricultural Sciences, Nanning, China
| | - Qiushuang Li
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
| | - Xiaoting Yan
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China
| | - Limei Lin
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Qian Jiang
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Lei Zhang
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Changzheng Guo
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Chris Greening
- Biomedicine Discovery Institute, Department of Microbiology, Monash University, Clayton, Australia
| | - Rasmus Heller
- Section for Computational and RNA Biology, Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Le Luo Guan
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, Canada
| | - Phillip B Pope
- Faculty of Biosciences, Norwegian University of Life Sciences, Aas, Norway
| | - Zhiliang Tan
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China
| | - Weiyun Zhu
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China
| | - Min Wang
- CAS Key Laboratory for Agro-Ecological Processes in Subtropical Region, Institute of Subtropical Agriculture, Chinese Academy of Sciences, Changsha, China.
| | - Qiang Qiu
- School of Ecology and Environment, Northwestern Polytechnical University, Xi'an, China.
| | - Zhipeng Li
- College of Animal Science and Technology, Jilin Agricultural University, Changchun, China.
- Department of Special Economic Animal Nutrition and Feed Science, Institute of Special Animal and Plant Sciences, Chinese Academy of Agricultural Sciences, Changchun, China.
| | - Shengyong Mao
- Laboratory of Gastrointestinal Microbiology, College of Animal Science and Technology, Nanjing Agricultural University, Nanjing, China.
| |
|
14
|
Meldal BHM, Pons C, Perfetto L, Del-Toro N, Wong E, Aloy P, Hermjakob H, Orchard S, Porras P. Analysing the yeast complexome-the Complex Portal rising to the challenge. Nucleic Acids Res 2021; 49:3156-3167. [PMID: 33677561 PMCID: PMC8034636 DOI: 10.1093/nar/gkab077] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/03/2020] [Revised: 01/22/2021] [Accepted: 01/27/2021] [Indexed: 02/06/2023] Open
Abstract
The EMBL-EBI Complex Portal is a knowledgebase of macromolecular complexes providing persistent stable identifiers. Entries are linked to literature evidence and provide details of complex membership, function, structure and complex-specific Gene Ontology annotations. Data are freely available and downloadable in HUPO-PSI community standards and missing entries can be requested for curation. In collaboration with Saccharomyces Genome Database and UniProt, the yeast complexome, a compendium of all known heteromeric assemblies from the model organism Saccharomyces cerevisiae, was curated. This expansion of knowledge and scope has led to a 50% increase in curated complexes compared to the previously published dataset, CYC2008. The yeast complexome is used as a reference resource for the analysis of complexes from large-scale experiments. Our analysis showed that genes coding for proteins in complexes tend to have more genetic interactions, are co-expressed with more genes, are more multifunctional, localize more often in the nucleus, and are more often involved in nucleic acid-related metabolic processes and processes where large machineries are the predominant functional drivers. A comparison to genetic interactions showed that about 40% of expanded co-complex pairs also have genetic interactions, suggesting strong functional links between complex members.
Affiliation(s)
- Birgit H M Meldal
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carles Pons
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, 08028 Barcelona, Catalonia, Spain
| | - Livia Perfetto
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Noemi Del-Toro
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Edith Wong
- Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305-5477, USA
| | - Patrick Aloy
- Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute for Science and Technology, 08028 Barcelona, Catalonia, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), 08010 Barcelona, Catalonia, Spain
| | - Henning Hermjakob
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sandra Orchard
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Pablo Porras
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
15
|
Zhu SY, Liu LL, Huang YQ, Li XW, Talukder M, Dai XY, Li YH, Li JL. In silico analysis of selenoprotein N (Gallus gallus): absence of EF-hand motif and the role of CUGS-helix domain in antioxidant protection. Metallomics 2021; 13:6132312. [PMID: 33693771 DOI: 10.1093/mtomcs/mfab004] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 11/13/2020] [Revised: 01/12/2021] [Accepted: 01/13/2021] [Indexed: 12/13/2022]
Abstract
Selenoprotein N (SEPN1) is critical to normal muscular physiology. Mutation of SEPN1 can cause congenital muscular disorders in humans. It is also central to the maturation and structure of skeletal muscle in chicken. However, human SEPN1 contains an EF-hand motif that is not found in chicken, and the biochemical and molecular characterization of chicken SEPN1 remains unclear. Hence, in this work, protein domains, transcription factors, and Ca2+ interactions of SEPN1 were analyzed in silico to assess the divergence and homology between chicken and human. The results showed that vertebrate SEPN1 evolved from a common ancestor. Human and chicken SEPN1 shared a conserved CUGS-helix domain that functions in antioxidant protection. SEPN1 might be a downstream target of the JNK pathway, and it could respond to multiple stresses. Human SEPN1, with its single EF-hand motif, might not bind Ca2+ in calcium homeostasis, and chicken SEPN1 had no predicted EF-hand motif, indicating that the EF-hand motif malfunctioned in chicken SEPN1.
Affiliation(s)
- Shi-Yong Zhu
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Li-Li Liu
- College of Pharmacy, Heilongjiang University of Chinese Medicine, Harbin 150040, P. R. China
| | - Yue-Qiang Huang
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Xiao-Wei Li
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Milton Talukder
- Department of Physiology and Pharmacology, Faculty of Animal Science and Veterinary Medicine, Patuakhali Science and Technology University, Barishal 8210, Bangladesh
| | - Xue-Yan Dai
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Yan-Hua Li
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| | - Jin-Long Li
- College of Veterinary Medicine, Northeast Agricultural University, Harbin 150030, P. R. China; Key Laboratory of the Provincial Education, Department of Heilongjiang for Common Animal Disease Prevention and Treatment, Northeast Agricultural University, Harbin 150030, P. R. China; Heilongjiang Key Laboratory for Laboratory Animals and Comparative Medicine, Northeast Agricultural University, Harbin 150030, P. R. China
| |
|
16
|
Li W, O’Neill KR, Haft DH, DiCuccio M, Chetvernin V, Badretdin A, Coulouris G, Chitsaz F, Derbyshire M, Durkin AS, Gonzales NR, Gwadz M, Lanczycki C, Song JS, Thanki N, Wang J, Yamashita R, Yang M, Zheng C, Marchler-Bauer A, Thibaud-Nissen F. RefSeq: expanding the Prokaryotic Genome Annotation Pipeline reach with protein family model curation. Nucleic Acids Res 2021; 49:D1020-D1028. [PMID: 33270901 PMCID: PMC7779008 DOI: 10.1093/nar/gkaa1105] [Citation(s) in RCA: 495] [Impact Index Per Article: 165.0] [Received: 09/15/2020] [Revised: 10/19/2020] [Accepted: 11/02/2020] [Indexed: 11/14/2022] Open
Abstract
The Reference Sequence (RefSeq) project at the National Center for Biotechnology Information (NCBI) contains nearly 200 000 bacterial and archaeal genomes and 150 million proteins with up-to-date annotation. Changes in the Prokaryotic Genome Annotation Pipeline (PGAP) since 2018 have resulted in a substantial reduction in spurious annotation. The hierarchical collection of protein family models (PFMs) used by PGAP as evidence for structural and functional annotation was expanded to over 35 000 protein profile hidden Markov models (HMMs), 12 300 BlastRules and 36 000 curated CDD architectures. As a result, >122 million or 79% of RefSeq proteins are now named based on a match to a curated PFM. Gene symbols, Enzyme Commission numbers or supporting publication attributes are available on over 40% of the PFMs and are inherited by the proteins and features they name, facilitating multi-genome analyses and connections to the literature. In adherence with the principles of FAIR (findable, accessible, interoperable, reusable), the PFMs are available in the Protein Family Models Entrez database to any user. Finally, the reference and representative genome set, a taxonomically diverse subset of RefSeq prokaryotic genomes, is now recalculated regularly and available for download and homology searches with BLAST. RefSeq is found at https://www.ncbi.nlm.nih.gov/refseq/.
Affiliation(s)
- Wenjun Li
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Kathleen R O’Neill
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Daniel H Haft
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Michael DiCuccio
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Vyacheslav Chetvernin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Azat Badretdin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - George Coulouris
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Farideh Chitsaz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Myra K Derbyshire
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - A Scott Durkin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Noreen R Gonzales
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Marc Gwadz
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Christopher J Lanczycki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - James S Song
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Narmada Thanki
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Jiyao Wang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Roxanne A Yamashita
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Mingzhang Yang
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Chanjuan Zheng
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Aron Marchler-Bauer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892-6511, USA
| |
|
17
|
Mitchell AL, Almeida A, Beracochea M, Boland M, Burgin J, Cochrane G, Crusoe MR, Kale V, Potter SC, Richardson LJ, Sakharova E, Scheremetjew M, Korobeynikov A, Shlemov A, Kunyavskaya O, Lapidus A, Finn RD. MGnify: the microbiome analysis resource in 2020. Nucleic Acids Res 2020; 48:D570-D578. [PMID: 31696235 PMCID: PMC7145632 DOI: 10.1093/nar/gkz1035] [Citation(s) in RCA: 197] [Impact Index Per Article: 49.3] [Received: 09/30/2019] [Accepted: 10/23/2019] [Indexed: 12/16/2022] Open
Abstract
MGnify (http://www.ebi.ac.uk/metagenomics) provides a free to use platform for the assembly, analysis and archiving of microbiome data derived from sequencing microbial populations that are present in particular environments. Over the past 2 years, MGnify (formerly EBI Metagenomics) has more than doubled the number of publicly available analysed datasets held within the resource. Recently, an updated approach to data analysis has been unveiled (version 5.0), replacing the previous single pipeline with multiple analysis pipelines that are tailored according to the input data, and that are formally described using the Common Workflow Language, enabling greater provenance, reusability, and reproducibility. MGnify's new analysis pipelines offer additional approaches for taxonomic assertions based on ribosomal internal transcribed spacer regions (ITS1/2) and expanded protein functional annotations. Biochemical pathways and systems predictions have also been added for assembled contigs. MGnify's growing focus on the assembly of metagenomic data has also seen the number of datasets it has assembled and analysed increase six-fold. The non-redundant protein database constructed from the proteins encoded by these assemblies now exceeds 1 billion sequences. Meanwhile, a newly developed contig viewer provides fine-grained visualisation of the assembled contigs and their enriched annotations.
Affiliation(s)
- Alex L Mitchell
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Alexandre Almeida
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
| | - Martin Beracochea
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Miguel Boland
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Josephine Burgin
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Guy Cochrane
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Michael R Crusoe
- Common Workflow Language, a project of the Software Freedom Conservancy, Inc. 137 Montague Street, Suite 380, Brooklyn, NY 11201-3548, USA
| | - Varsha Kale
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Simon C Potter
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Lorna J Richardson
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ekaterina Sakharova
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Maxim Scheremetjew
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Anton Korobeynikov
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
| | - Alex Shlemov
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
| | - Olga Kunyavskaya
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
| | - Alla Lapidus
- Center for Algorithmic Biotechnologies, Saint Petersburg State University, Russia
| | - Robert D Finn
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| |
|
18
|
Viruses with different genome types adopt a similar strategy to pack nucleic acids based on positively charged protein domains. Sci Rep 2020; 10:5470. [PMID: 32214181 PMCID: PMC7096446 DOI: 10.1038/s41598-020-62328-w] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 06/04/2019] [Accepted: 03/02/2020] [Indexed: 11/17/2022] Open
Abstract
Capsid proteins often present a positively charged arginine-rich sequence at their terminal regions, which has a fundamental role in genome packaging and particle stability for some icosahedral viruses. These sequences show little to no conservation and are structurally dynamic such that they cannot be easily detected by common sequence or structure comparisons. As a result, the occurrence and distribution of positively charged domains across the viral universe are unknown. Based on the net charge calculation of discrete protein segments, we identified proteins containing amino acid stretches with a notably high net charge (Q > +17), which are enriched in icosahedral viruses with a distinctive bias towards arginine over lysine. We used viral particle structural data to calculate the total electrostatic charge derived from the most positively charged protein segment of capsid proteins and correlated these values with genome charges arising from the phosphates of each nucleotide. We obtained a positive correlation (r = 0.91, p-value < 0.0001) for a group of 17 viral families, corresponding to 40% of all families with icosahedral structures described to date. These data indicated that unrelated viruses with diverse genome types adopt a common underlying mechanism for capsid assembly based on R-arms.
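The segment net-charge scan described in this abstract can be illustrated with a minimal sketch. The abstract does not state the segment length or the exact charge scheme, so everything here is an assumption for illustration: a hypothetical 30-residue sliding window, +1 for arginine and lysine, -1 for aspartate and glutamate, and all other residues (including histidine) treated as neutral. The function name `max_segment_charge` is likewise hypothetical and not taken from the paper.

```python
# Illustrative sketch only: window length and charge scheme are assumptions,
# not the parameters used in the cited study.
POSITIVE = frozenset("RK")   # arginine, lysine counted as +1
NEGATIVE = frozenset("DE")   # aspartate, glutamate counted as -1


def residue_charge(aa):
    """Return the assumed integer charge of a single residue."""
    if aa in POSITIVE:
        return 1
    if aa in NEGATIVE:
        return -1
    return 0


def max_segment_charge(sequence, window=30):
    """Return (best_charge, start_index) of the most positive window.

    Uses a running sum so each residue is added and removed once.
    If the sequence is shorter than the window, the whole sequence is used.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    window = min(window, len(seq))
    charges = [residue_charge(aa) for aa in seq]

    current = sum(charges[:window])
    best, best_start = current, 0
    for start in range(1, len(seq) - window + 1):
        # Slide the window one residue to the right.
        current += charges[start + window - 1] - charges[start - 1]
        if current > best:
            best, best_start = current, start
    return best, best_start


if __name__ == "__main__":
    toy = "A" * 10 + "R" * 20 + "A" * 10   # toy sequence, not a real capsid protein
    charge, start = max_segment_charge(toy, window=20)
    # Flag the protein if its most positive segment exceeds the Q > +17 cutoff.
    print(charge, start, charge > 17)
```

A running sum keeps the scan linear in sequence length, which matters when screening many proteins; a different window length or a pH-dependent charge model could be substituted without changing the structure.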
|
19
|
Arıkan M, Mitchell AL, Finn RD, Gürel F. Microbial composition of Kombucha determined using amplicon sequencing and shotgun metagenomics. J Food Sci 2020; 85:455-464. [PMID: 31957879 PMCID: PMC7027524 DOI: 10.1111/1750-3841.14992] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 08/30/2019] [Revised: 11/12/2019] [Accepted: 11/13/2019] [Indexed: 01/26/2023]
Abstract
Kombucha, a fermented tea generated from the co-culture of yeasts and bacteria, has gained worldwide popularity in recent years due to its potential benefits to human health. As a result, many studies have attempted to characterize both its biochemical properties and microbial composition. Here, we have applied a combination of whole metagenome sequencing (WMS) and amplicon (16S rRNA and Internal Transcribed Spacer 1 [ITS1]) sequencing to investigate the microbial communities of homemade Kombucha fermentations from day 3 to day 15. We identified the dominant bacterial genus as Komagataeibacter and dominant fungal genus as Zygosaccharomyces in all samples at all time points. Furthermore, we recovered three near-complete Komagataeibacter genomes and one Zygosaccharomyces bailii genome and then predicted their functional properties. Also, we determined the broad taxonomic and functional profile of plasmids found within the Kombucha microbial communities. Overall, this study provides a detailed description of the taxonomic and functional systems of the Kombucha microbial community. Based on this, we conjecture that functional complementarity enables metabolic cross-talk between Komagataeibacter species and Z. bailii, which helps establish the sustained, relatively low-diversity ecosystem in Kombucha.
Affiliation(s)
- Muzaffer Arıkan
- Regenerative and Restorative Medicine Research Center, Istanbul Medipol Univ., 34810 Istanbul, Turkey
| | - Alex L. Mitchell
- European Molecular Biology Laboratory, European Bioinformatics Inst. (EMBL‐EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Robert D. Finn
- European Molecular Biology Laboratory, European Bioinformatics Inst. (EMBL‐EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Filiz Gürel
- Molecular Biology and Genetics Dept., Faculty of Science, Istanbul Univ., 34134 Istanbul, Turkey
| |
|
20
|
Kim S, Kim MS, Jo S, Shin DH. GTP Preference of d-Glycero-α-d-manno-Heptose-1-Phosphate Guanylyltransferase from Yersinia pseudotuberculosis. Int J Mol Sci 2019; 21:ijms21010280. [PMID: 31906195 PMCID: PMC6981941 DOI: 10.3390/ijms21010280] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/25/2019] [Revised: 12/23/2019] [Accepted: 12/24/2019] [Indexed: 12/22/2022] Open
Abstract
d-glycero-α-d-manno-heptose-1-phosphate guanylyltransferase (HddC) is the fourth enzyme synthesizing a building component of lipopolysaccharide (LPS) of Gram-negative bacteria. Since HddC is a potential new target for antibiotic development, analysis of the structure-function relationship of the complex structure should better inform the design of inhibitory compounds. X-ray crystallography and biochemical experiments to elucidate the guanine preference were performed based on the multiple sequence alignment. The crystal structure of HddC from Yersinia pseudotuberculosis (YPT) complexed with guanosine 5′-(β-amino)-diphosphate (GMPPN) has been determined at 1.55 Å resolution. Meanwhile, the mutants showed reduced guanine affinity rather than acquiring noticeable pyrimidine affinity. The complex crystal structure revealed that GMPPN is docked in the catalytic site with the aid of Glu80, positioned on the conserved motif EXXPLGTGGA. In the HddC family, this motif is expected to recruit nucleotides by interacting with bases. The crystal structure shows that the oxygen atoms of Glu80 form two hydrogen bonds that play a critical role in the interaction with two nitrogen atoms of the guanine base of GMPPN. Interestingly, the binding of GMPPN induced the formation of an oxyanion hole-like conformation on the L(S/A/G)X(S/G) motif and consequently induced a conformational shift of the region around Ser55.
|
21
|
Bergstrand LH, Neufeld JD, Doxey AC. Pygenprop: a Python library for programmatic exploration and comparison of organism genome properties. Bioinformatics 2019; 35:5063-5065. [PMID: 31240307 DOI: 10.1093/bioinformatics/btz522] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/12/2019] [Revised: 05/27/2019] [Accepted: 06/20/2019] [Indexed: 11/12/2022]
Abstract
SUMMARY: A critical step in comparative genomics is the identification of differences in the presence/absence of encoded biochemical pathways among organisms. Our library, Pygenprop, facilitates these comparisons using data from the Genome Properties database. Pygenprop is written in Python and, unlike existing libraries, it is compatible with a variety of tools in the Python data science ecosystem, such as Jupyter Notebooks for interactive analyses and scikit-learn for machine learning. Pygenprop assigns YES, NO, or PARTIAL support for each property based on InterProScan annotations of open reading frames from an organism's genome. The library contains classes for representing the Genome Properties database as a whole and methods for detecting differences in property assignments between organisms. As the Genome Properties database grows, we anticipate widespread adoption of Pygenprop for routine genome analyses and integration within third-party bioinformatics software.
AVAILABILITY AND IMPLEMENTATION: Pygenprop is written in Python and is compatible with versions 3.6 or higher. Source code is available under Apache Licence Version 2 at https://github.com/Micromeda/pygenprop. The package can be installed from both PyPI (https://pypi.org/project/pygenprop) and Anaconda (https://anaconda.org/lbergstrand/pygenprop). Documentation is available on Read the Docs (http://pygenprop.rtfd.io/).
Affiliation(s)
| | - Josh D Neufeld
- Department of Biology, University of Waterloo, Waterloo, Canada
| | - Andrew C Doxey
- Department of Biology, University of Waterloo, Waterloo, Canada
| |
|
22
|
Combinatorial Avidity Selection of Mosaic Landscape Phages Targeted at Breast Cancer Cells-An Alternative Mechanism of Directed Molecular Evolution. Viruses 2019; 11:v11090785. [PMID: 31454976 PMCID: PMC6784196 DOI: 10.3390/v11090785] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/02/2019] [Revised: 08/19/2019] [Accepted: 08/22/2019] [Indexed: 02/08/2023]
Abstract
Low performance of actively targeted nanomedicines required revision of the traditional drug targeting paradigm and stimulated the development of novel phage-programmed, self-navigating drug delivery vehicles. In the proposed smart vehicles, targeting peptides, selected from phage libraries using traditional principles of affinity selection, are substituted for phage proteins discovered through combinatorial avidity selection. Here, we substantiate the potential of combinatorial avidity selection using landscape phage in the discovery of Short Linear Motifs (SLiMs) and their partner domains. We proved an algorithm for analysis of phage populations evolved through multistage screening of landscape phage libraries against the MDA-MB-231 breast cancer cell line. The suggested combinatorial avidity selection model proposes a multistage accumulation of Elementary Binding Units (EBU), or Core Motifs (CorMs), in landscape phage fusion peptides, serving as evolutionary initiators for formation of SLiMs. Combinatorial selection has the potential to harness directed molecular evolution to create novel smart materials with diverse novel, emergent properties.
|
23
|
Gregory K, Salvador LA, Akbar S, Adaikpoh BI, Stevens DC. Survey of Biosynthetic Gene Clusters from Sequenced Myxobacteria Reveals Unexplored Biosynthetic Potential. Microorganisms 2019; 7:E181. [PMID: 31238501 PMCID: PMC6616573 DOI: 10.3390/microorganisms7060181] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 05/24/2019] [Revised: 06/20/2019] [Accepted: 06/21/2019] [Indexed: 01/31/2023]
Abstract
Coinciding with the increase in sequenced bacteria, mining of bacterial genomes for biosynthetic gene clusters (BGCs) has become a critical component of natural product discovery. The order Myxococcales, a reputable source of biologically active secondary metabolites, spans three suborders, all of which include natural product-producing representatives. Using the BiG-SCAPE-CORASON platform to generate a sequence similarity network containing 994 BGCs from 36 sequenced myxobacteria deposited in the antiSMASH database, we present a total of 843 BGCs with similarity scores lower than 75% to characterized clusters within the MIBiG database. This survey describes the biosynthetic diversity of these BGCs and assesses the predicted chemical space yet to be discovered. Considering that this analysis includes a mere snapshot of myxobacteria, these untapped BGCs exemplify the potential for natural product discovery from myxobacteria.
Affiliation(s)
- Katherine Gregory
- Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, University, MS 38677, USA.
| | - Laura A Salvador
- Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, University, MS 38677, USA.
| | - Shukria Akbar
- Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, University, MS 38677, USA.
| | - Barbara I Adaikpoh
- Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, University, MS 38677, USA.
| | - D Cole Stevens
- Department of BioMolecular Sciences, School of Pharmacy, University of Mississippi, University, MS 38677, USA.
| |
|
24
|
Almeida A, Mitchell AL, Boland M, Forster SC, Gloor GB, Tarkowska A, Lawley TD, Finn RD. A new genomic blueprint of the human gut microbiota. Nature 2019; 568:499-504. [PMID: 30745586 PMCID: PMC6784870 DOI: 10.1038/s41586-019-0965-1] [Citation(s) in RCA: 723] [Impact Index Per Article: 144.6] [Received: 06/20/2018] [Accepted: 02/01/2019] [Indexed: 12/13/2022]
Abstract
The composition of the human gut microbiota is linked to health and disease, but knowledge of individual microbial species is needed to decipher their biological roles. Despite extensive culturing and sequencing efforts, the complete bacterial repertoire of the human gut microbiota remains undefined. Here we identify 1,952 uncultured candidate bacterial species by reconstructing 92,143 metagenome-assembled genomes from 11,850 human gut microbiomes. These uncultured genomes substantially expand the known species repertoire of the collective human gut microbiota, with a 281% increase in phylogenetic diversity. Although the newly identified species are less prevalent in well-studied populations compared to reference isolate genomes, they improve classification of understudied African and South American samples by more than 200%. These candidate species encode hundreds of newly identified biosynthetic gene clusters and possess a distinctive functional capacity that might explain their elusive nature. Our work expands the known diversity of uncultured gut bacteria, which provides unprecedented resolution for taxonomic and functional characterization of the intestinal microbiota. The known species repertoire of the collective human gut microbiota is substantially expanded with the discovery of 1,952 uncultured bacterial species that greatly improve classification of understudied African and South American samples.
Affiliation(s)
- Alexandre Almeida
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK; Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK.
| | - Alex L Mitchell
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Miguel Boland
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Samuel C Forster
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK; Centre for Innate Immunity and Infectious Diseases, Hudson Institute of Medical Research, Clayton, Victoria, Australia; Department of Molecular and Translational Sciences, Monash University, Clayton, Victoria, Australia
| | - Gregory B Gloor
- Department of Biochemistry, University of Western Ontario, London, Ontario, Canada
| | - Aleksandra Tarkowska
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK
| | - Trevor D Lawley
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, UK
| | - Robert D Finn
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, UK.
| |
|