1. M A Basher AR, Hallam SJ. Leveraging heterogeneous network embedding for metabolic pathway prediction. Bioinformatics 2021; 37:822-829. [PMID: 33305310 PMCID: PMC8098024 DOI: 10.1093/bioinformatics/btaa906] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/24/2020] [Revised: 10/03/2020] [Accepted: 10/08/2020] [Indexed: 01/27/2023] Open
Abstract
MOTIVATION: Metabolic pathway reconstruction from genomic sequence information is a key step in predicting regulatory and functional potential of cells at the individual, population and community levels of organization. Although the most common methods for metabolic pathway reconstruction are gene-centric, e.g. mapping annotated proteins onto known pathways using a reference database, pathway-centric methods based on heuristics or machine learning to infer pathway presence provide a powerful engine for hypothesis generation in biological systems. Such methods rely on rule sets or rich feature information that may not be known or readily accessible.
RESULTS: Here, we present pathway2vec, a software package consisting of six representational learning modules used to automatically generate features for pathway inference. Specifically, we build a three-layered network composed of compounds, enzymes and pathways, where nodes within a layer manifest inter-interactions and nodes between layers manifest betweenness interactions. This layered architecture captures relevant relationships used to learn a neural embedding-based low-dimensional space of metabolic features. We benchmark pathway2vec performance based on node-clustering, embedding visualization and pathway prediction using MetaCyc as a trusted source. In the pathway prediction task, results indicate that it is possible to leverage embeddings to improve prediction outcomes.
AVAILABILITY AND IMPLEMENTATION: The software package and installation instructions are published on http://github.com/pathway2vec.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
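As a rough illustration of the three-layered network this abstract describes, the sketch below builds a toy graph with compound, enzyme and pathway nodes and samples uniform random walks over it, one common way to produce input for a node-embedding model. The abstract does not say how pathway2vec samples or embeds the network, so the walk procedure, the node IDs (`C1`, `E1`, `P1`, ...) and the edge list are all invented for illustration.

```python
import random

# Hypothetical toy network; every ID below is invented for illustration.
NODES = {
    "C1": "compound", "C2": "compound",
    "E1": "enzyme", "E2": "enzyme",
    "P1": "pathway", "P2": "pathway",
}
EDGES = [
    ("C1", "C2"), ("E1", "E2"), ("P1", "P2"),  # links within a layer
    ("C1", "E1"), ("C2", "E2"),                # compound-enzyme links
    ("E1", "P1"), ("E2", "P2"),                # enzyme-pathway links
]


def build_adjacency(edges):
    """Return an undirected adjacency list for the given edge pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def random_walk(adj, start, length, rng):
    """Sample a uniform random walk containing `length` nodes."""
    walk = [start]
    while len(walk) < length:
        walk.append(rng.choice(adj[walk[-1]]))
    return walk


ADJ = build_adjacency(EDGES)
rng = random.Random(0)
# One walk per start node; a real pipeline would sample many more.
walks = [random_walk(ADJ, node, 5, rng) for node in sorted(NODES)]
for walk in walks:
    print(" -> ".join(f"{n}({NODES[n]})" for n in walk))
```

Each walk is a sequence of node IDs that crosses layers, so a skip-gram style model trained on such sequences could place compounds, enzymes and pathways in one shared embedding space.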
Affiliation(s)
- Abdur Rahman M A Basher
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
- Steven J Hallam
  - Graduate Program in Bioinformatics, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
  - Department of Microbiology & Immunology, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
  - Genome Science and Technology Program, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
  - Life Sciences Institute, University of British Columbia, Vancouver, BC V6T 1Z3, Canada
  - ECOSCOPE Training Program, University of British Columbia, Vancouver, BC V6T 1Z3, Canada

2. M A Basher AR, McLaughlin RJ, Hallam SJ. Metabolic pathway inference using multi-label classification with rich pathway features. PLoS Comput Biol 2020; 16:e1008174. [PMID: 33001968 PMCID: PMC7529316 DOI: 10.1371/journal.pcbi.1008174] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/02/2020] [Accepted: 07/21/2020] [Indexed: 12/15/2022] Open
Abstract
Metabolic inference from genomic sequence information is a necessary step in determining the capacity of cells to make a living in the world at different levels of biological organization. A common method for determining the metabolic potential encoded in genomes is to map conceptually translated open reading frames onto a database containing known product descriptions. Such gene-centric methods are limited in their capacity to predict pathway presence or absence and do not support standardized rule sets for automated and reproducible research. Pathway-centric methods based on defined rule sets or machine learning algorithms provide an adjunct or alternative inference method that supports hypothesis generation and testing of metabolic relationships within and between cells. Here, we present mlLGPR, multi-label based on logistic regression for pathway prediction, a software package that uses supervised multi-label classification and rich pathway features to infer metabolic networks in organismal and multi-organismal datasets. We evaluated mlLGPR performance using a corpus of 12 experimental datasets manifesting diverse multi-label properties, including manually curated organismal genomes, synthetic microbial communities and low complexity microbial communities. Resulting performance metrics equaled or exceeded previous reports for organismal genomes and identified specific challenges associated with feature engineering and training data for community-level metabolic inference.
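To make the idea of multi-label classification with logistic regression concrete, here is a minimal sketch that trains one independent binary logistic regression per pathway label (a strategy often called binary relevance). It is not the mlLGPR implementation: the abstract does not describe the package's features, regularization or training procedure, and the toy data, the +1/-1 feature coding and all function names are assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Linear score w.x + b for one example."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_logistic(X, y, lr=0.5, epochs=200):
    """Fit one binary logistic regression by batch gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(w, b, xi)) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def train_multilabel(X, Y):
    """Train one classifier per label column of Y."""
    return [train_logistic(X, [row[k] for row in Y]) for k in range(len(Y[0]))]


def predict_multilabel(models, x, threshold=0.5):
    """Return a 0/1 decision for every label."""
    return [int(sigmoid(score(w, b, x)) >= threshold) for w, b in models]


# Invented toy data: each row is a genome, each column an enzyme
# (+1 = annotated, -1 = not annotated). Each label column is a pathway.
X = [[1, 1, -1], [1, -1, -1], [-1, 1, 1], [-1, -1, 1], [1, 1, 1], [-1, -1, -1]]
Y = [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 0]]

models = train_multilabel(X, Y)
predictions = [predict_multilabel(models, x) for x in X]
print(predictions)
```

Because each label gets its own classifier, a genome can be assigned any number of pathways at once, which is the property that distinguishes multi-label from ordinary multi-class prediction.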
Affiliation(s)
- Abdur Rahman M A Basher
  - Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
- Ryan J McLaughlin
  - Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
- Steven J Hallam
  - Graduate Program in Bioinformatics, University of British Columbia, Genome Sciences Centre, 100-570 West 7th Avenue, Vancouver, British Columbia, Canada
  - Department of Microbiology & Immunology, University of British Columbia, 2552-2350 Health Sciences Mall, Vancouver, British Columbia, Canada
  - Genome Science and Technology Program, University of British Columbia, 2329 West Mall, Vancouver, BC, Canada
  - Life Sciences Institute, University of British Columbia, Vancouver, British Columbia, Canada
  - ECOSCOPE Training Program, University of British Columbia, Vancouver, British Columbia, Canada

3. Hiraoka S, Yang CC, Iwasaki W. Metagenomics and Bioinformatics in Microbial Ecology: Current Status and Beyond. Microbes Environ 2016; 31:204-12. [PMID: 27383682 PMCID: PMC5017796 DOI: 10.1264/jsme2.me16024] [Citation(s) in RCA: 44] [Impact Index Per Article: 5.5] [Indexed: 12/30/2022] Open
Abstract
Metagenomic approaches are now commonly used in microbial ecology to study microbial communities in more detail, including many strains that cannot be cultivated in the laboratory. Bioinformatic analyses make it possible to mine huge metagenomic datasets and discover general patterns that govern microbial ecosystems. However, the findings of typical metagenomic and bioinformatic analyses still do not completely describe the ecology and evolution of microbes in their environments. Most analyses still depend on straightforward sequence similarity searches against reference databases. We herein review the current state of metagenomics and bioinformatics in microbial ecology and discuss future directions for the field. New techniques will allow us to go beyond routine analyses and broaden our knowledge of microbial ecosystems. We need to enrich reference databases, promote platforms that enable meta- or comprehensive analyses of diverse metagenomic datasets, devise methods that utilize long-read sequence information, and develop more powerful bioinformatic methods to analyze data from diverse perspectives.
Affiliation(s)
- Satoshi Hiraoka
  - Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, the University of Tokyo

4. Krishnan S, Alden N, Lee K. Pathways and functions of gut microbiota metabolism impacting host physiology. Curr Opin Biotechnol 2015; 36:137-45. [PMID: 26340103 DOI: 10.1016/j.copbio.2015.08.015] [Citation(s) in RCA: 121] [Impact Index Per Article: 13.4] [Received: 07/13/2015] [Revised: 08/07/2015] [Accepted: 08/09/2015] [Indexed: 01/13/2023]
Abstract
The bacterial populations in the human intestine impact host physiological functions through their metabolic activity. In addition to performing essential catabolic and biotransformation functions, the gut microbiota produces bioactive small molecules that mediate interactions with the host and contribute to the neurohumoral axes connecting the intestine with other parts of the body. This review discusses recent progress in characterizing the metabolic products of the gut microbiota and their biological functions, focusing on studies that investigate the responsible bacterial pathways and cognate host receptors. Several key areas are highlighted for future development: context-based analysis targeting pathways; integration of analytical approaches; metabolic modeling; and synthetic systems for in vivo manipulation of microbiota functions. Prospectively, these developments could further our mechanistic understanding of host-microbiota interactions.
Affiliation(s)
- Smitha Krishnan
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA, United States
- Nicholas Alden
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA, United States
- Kyongbum Lee
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA, United States

5. Christley S, Cockrell C, An G. Computational Studies of the Intestinal Host-Microbiota Interactome. Computation (Basel, Switzerland) 2015; 3:2-28. [PMID: 34765258 PMCID: PMC8580329 DOI: 10.3390/computation3010002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
A large and growing body of research implicates aberrant immune response and compositional shifts of the intestinal microbiota in the pathogenesis of many intestinal disorders. The molecular and physical interaction between the host and the microbiota, known as the host-microbiota interactome, is one of the key drivers in the pathophysiology of many of these disorders. This host-microbiota interactome is a set of dynamic and complex processes, and needs to be treated as a distinct entity and subject for study. Disentangling this complex web of interactions will require novel approaches, using a combination of data-driven bioinformatics with knowledge-driven computational modeling. This review describes the computational approaches for investigating the host-microbiota interactome, with emphasis on the human intestinal tract and innate immunity, and highlights open challenges and existing gaps in the computation methodology for advancing our knowledge about this important facet of human health.
Affiliation(s)
- Scott Christley
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
- Chase Cockrell
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA
- Gary An
  - Department of Surgery, University of Chicago, 5841 South Maryland Avenue, Chicago, IL 60637, USA

6. Abram F. Systems-based approaches to unravel multi-species microbial community functioning. Comput Struct Biotechnol J 2014; 13:24-32. [PMID: 25750697 PMCID: PMC4348430 DOI: 10.1016/j.csbj.2014.11.009] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 09/30/2014] [Revised: 11/25/2014] [Accepted: 11/26/2014] [Indexed: 01/24/2023] Open
Abstract
Some of the most transformative discoveries promising to enable the resolution of this century's grand societal challenges will most likely arise from environmental science and particularly environmental microbiology and biotechnology. Understanding how microbes interact in situ, and how microbial communities respond to environmental changes remains an enormous challenge for science. Systems biology offers a powerful experimental strategy to tackle the exciting task of deciphering microbial interactions. In this framework, entire microbial communities are considered as metaorganisms and each level of biological information (DNA, RNA, proteins and metabolites) is investigated along with in situ environmental characteristics. In this way, systems biology can help unravel the interactions between the different parts of an ecosystem ultimately responsible for its emergent properties. Indeed each level of biological information provides a different level of characterisation of the microbial communities. Metagenomics, metatranscriptomics, metaproteomics, metabolomics and SIP-omics can be employed to investigate collectively microbial community structure, potential, function, activity and interactions. Omics approaches are enabled by high-throughput 21st century technologies and this review will discuss how their implementation has revolutionised our understanding of microbial communities.
Affiliation(s)
- Florence Abram
  - Functional Environmental Microbiology, School of Natural Sciences, National University of Ireland Galway, University Road, Galway, Ireland

7. Shafiei M, Dunn KA, Chipman H, Gu H, Bielawski JP. BiomeNet: a Bayesian model for inference of metabolic divergence among microbial communities. PLoS Comput Biol 2014; 10:e1003918. [PMID: 25412107 PMCID: PMC4238953 DOI: 10.1371/journal.pcbi.1003918] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 04/17/2014] [Accepted: 09/16/2014] [Indexed: 02/07/2023] Open
Abstract
Metagenomics yields enormous numbers of microbial sequences that can be assigned a metabolic function. Using such data to infer community-level metabolic divergence is hindered by the lack of a suitable statistical framework. Here, we describe a novel hierarchical Bayesian model, called BiomeNet (Bayesian inference of metabolic networks), for inferring differential prevalence of metabolic subnetworks among microbial communities. To infer the structure of community-level metabolic interactions, BiomeNet applies a mixed-membership modelling framework to enzyme abundance information. The basic idea is that the mixture components of the model (metabolic reactions, subnetworks, and networks) are shared across all groups (microbiome samples), but the mixture proportions vary from group to group. Through this framework, the model can capture nested structures within the data. BiomeNet is unique in modeling each metagenome sample as a mixture of complex metabolic systems (metabosystems). The metabosystems are composed of mixtures of tightly connected metabolic subnetworks. BiomeNet differs from other unsupervised methods by allowing researchers to discriminate groups of samples through the metabolic patterns it discovers in the data, and by providing a framework for interpreting them. We describe a collapsed Gibbs sampler for inference of the mixture weights under BiomeNet, and we use simulation to validate the inference algorithm. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among inflammatory bowel disease (IBD) patients. Based on the discriminatory subnetworks for this metabosystem, we inferred that the community is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and interfere with human uptake of an antioxidant connected to IBD. Because this metabosystem has a greater capacity to exploit host-associated glycans, we speculate that IBD-associated communities might arise from opportunist growth of bacteria that can circumvent the host's nutrient-based mechanism for bacterial partner selection.
Metagenomic studies of microbial communities yield enormous numbers of gene sequences that have a known enzymatic function, and thus have potential to contribute to community-level metabolic activities. Ecologically divergent microbial communities are presumed to differ in metabolic repertoire and function, but detecting such differences is challenging because the required analytical methodology is complex. Here, we present a novel Bayesian model suitable for this task. Our model, BiomeNet, does not assume that microbiome samples of a certain type are the same; rather, a sample is modeled as a unique mixture of complex metabolic systems referred to as “metabosystems”. The metabosystems are composed of mixtures of subnetworks, where subnetworks are mixtures of reactions related by function. Application of BiomeNet to human gut metagenomes revealed a metabosystem with greater prevalence among IBD patients. We inferred that this metabosystem is likely to be closely associated with the human gut epithelium, resistant to dietary interventions, and interfere with human uptake of an important antioxidant, possibly contributing to gut inflammation associated with IBD.
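The nested mixture structure described in this abstract (a sample is a mixture of metabosystems, a metabosystem is a mixture of subnetworks, a subnetwork is a mixture of reactions) can be pictured as a generative process. The sketch below only samples from such a hierarchy with fixed, invented probabilities. It does not perform the Bayesian inference or collapsed Gibbs sampling the authors describe, and every name and number in it is hypothetical.

```python
import random

# Invented mixture components, shared by all samples.
SUBNETWORKS = {
    "S1": {"R1": 0.7, "R2": 0.3},
    "S2": {"R3": 0.5, "R4": 0.5},
}
METABOSYSTEMS = {
    "M1": {"S1": 0.9, "S2": 0.1},
    "M2": {"S1": 0.2, "S2": 0.8},
}
# Invented mixture proportions, different for each sample.
SAMPLE_MIX = {
    "sampleA": {"M1": 0.8, "M2": 0.2},
    "sampleB": {"M1": 0.1, "M2": 0.9},
}


def draw(dist, rng):
    """Draw one key from a {key: probability} mapping."""
    keys = list(dist)
    return rng.choices(keys, weights=[dist[k] for k in keys])[0]


def generate_reactions(sample, n, rng):
    """Generate n reactions for a sample by walking down the hierarchy."""
    reactions = []
    for _ in range(n):
        system = draw(SAMPLE_MIX[sample], rng)
        subnetwork = draw(METABOSYSTEMS[system], rng)
        reactions.append(draw(SUBNETWORKS[subnetwork], rng))
    return reactions


rng = random.Random(1)
for sample in SAMPLE_MIX:
    drawn = generate_reactions(sample, 1000, rng)
    counts = {r: drawn.count(r) for r in sorted(set(drawn))}
    print(sample, counts)
```

The components are identical for both samples and only the top-level proportions differ, which is the sense in which mixture components are shared across groups while mixture proportions vary from group to group.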
Affiliation(s)
- Mahdi Shafiei
  - Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Katherine A. Dunn
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada
- Hugh Chipman
  - Department of Mathematics & Statistics, Acadia University, Wolfville, Nova Scotia, Canada
- Hong Gu
  - Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
- Joseph P. Bielawski
  - Department of Mathematics & Statistics, Dalhousie University, Halifax, Nova Scotia, Canada
  - Department of Biology, Dalhousie University, Halifax, Nova Scotia, Canada

8. Sridharan GV, Choi K, Klemashevich C, Wu C, Prabakaran D, Pan LB, Steinmeyer S, Mueller C, Yousofshahi M, Alaniz RC, Lee K, Jayaraman A. Prediction and quantification of bioactive microbiota metabolites in the mouse gut. Nat Commun 2014; 5:5492. [DOI: 10.1038/ncomms6492] [Citation(s) in RCA: 164] [Impact Index Per Article: 16.4] [Received: 07/14/2014] [Accepted: 10/07/2014] [Indexed: 12/22/2022] Open

9. Rhoads DD, Sintchenko V, Rauch CA, Pantanowitz L. Clinical microbiology informatics. Clin Microbiol Rev 2014; 27:1025-47. [PMID: 25278581 PMCID: PMC4187636 DOI: 10.1128/cmr.00049-14] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Indexed: 12/17/2022] Open
Abstract
The clinical microbiology laboratory has responsibilities ranging from characterizing the causative agent in a patient's infection to helping detect global disease outbreaks. All of these processes are increasingly becoming partnered more intimately with informatics. Effective application of informatics tools can increase the accuracy, timeliness, and completeness of microbiology testing while decreasing the laboratory workload, which can lead to optimized laboratory workflow and decreased costs. Informatics is poised to be increasingly relevant in clinical microbiology, with the advent of total laboratory automation, complex instrument interfaces, electronic health records, clinical decision support tools, and the clinical implementation of microbial genome sequencing. This review discusses the diverse informatics aspects that are relevant to the clinical microbiology laboratory, including the following: the microbiology laboratory information system, decision support tools, expert systems, instrument interfaces, total laboratory automation, telemicrobiology, automated image analysis, nucleic acid sequence databases, electronic reporting of infectious agents to public health agencies, and disease outbreak surveillance. The breadth and utility of informatics tools used in clinical microbiology have made them indispensable to contemporary clinical and laboratory practice. Continued advances in technology and development of these informatics tools will further improve patient and public health care in the future.
Affiliation(s)
- Daniel D Rhoads
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA
- Vitali Sintchenko
  - Marie Bashir Institute for Infectious Diseases and Biosecurity and Sydney Medical School, The University of Sydney, Sydney, New South Wales, Australia
  - Centre for Infectious Diseases and Microbiology-Public Health, Institute of Clinical Pathology and Medical Research, Westmead Hospital, Sydney, New South Wales, Australia
- Carol A Rauch
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University School of Medicine, Nashville, Tennessee, USA
- Liron Pantanowitz
  - Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA

10. Deusch O, O’Flynn C, Colyer A, Morris P, Allaway D, Jones PG, Swanson KS. Deep Illumina-based shotgun sequencing reveals dietary effects on the structure and function of the fecal microbiome of growing kittens. PLoS One 2014; 9:e101021. [PMID: 25010839 PMCID: PMC4091873 DOI: 10.1371/journal.pone.0101021] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 10/15/2013] [Accepted: 06/02/2014] [Indexed: 12/22/2022] Open
Abstract
Background: Previously, we demonstrated that dietary protein:carbohydrate ratio dramatically affects the fecal microbial taxonomic structure of kittens using targeted 16S gene sequencing. The present study, using the same fecal samples, applied deep Illumina shotgun sequencing to identify the diet-associated functional potential and analyze taxonomic changes of the feline fecal microbiome.
Methodology & Principal Findings: Fecal samples from kittens fed one of two diets differing in protein and carbohydrate content (high-protein, low-carbohydrate, HPLC; and moderate-protein, moderate-carbohydrate, MPMC) were collected at 8, 12 and 16 weeks of age (n = 6 per group). A total of 345.3 gigabases of sequence were generated from 36 samples, with 99.75% of annotated sequences identified as bacterial. At the genus level, 26% and 39% of reads were annotated for HPLC- and MPMC-fed kittens, with HPLC-fed cats showing greater species richness and microbial diversity. Two phyla, ten families and fifteen genera were responsible for more than 80% of the sequences at each taxonomic level for both diet groups, consistent with the previous taxonomic study. Significantly different abundances between diet groups were observed for 324 genera (56% of all genera identified), demonstrating widespread diet-induced changes in microbial taxonomic structure. Diversity was not affected over time. Functional analysis identified 2,013 putative enzyme function groups that were different (p<0.000007) between the two dietary groups and were associated with 194 pathways, which formed five discrete clusters based on average relative abundance. Of those, ten contained more (p<0.022) enzyme functions with significant diet effects than expected by chance. Six pathways were related to amino acid biosynthesis and metabolism, linking changes in dietary protein with functional differences of the gut microbiome.
Conclusions: These data indicate that feline feces-derived microbiomes have large structural and functional differences relating to the dietary protein:carbohydrate ratio and highlight the impact of diet early in life.
Affiliation(s)
- Oliver Deusch
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- Ciaran O’Flynn
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- Alison Colyer
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- Penelope Morris
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- David Allaway
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- Paul G. Jones
  - WALTHAM Centre for Pet Nutrition, Waltham-on-the-Wolds, Leicestershire, United Kingdom
- Kelly S. Swanson
  - Department of Animal Sciences, University of Illinois, Urbana, Illinois, United States of America
  - Division of Nutritional Sciences, University of Illinois, Urbana, Illinois, United States of America
  - Department of Veterinary Clinical Medicine, University of Illinois, Urbana, Illinois, United States of America