1
|
Astero M, Rousu J. Learning symmetry-aware atom mapping in chemical reactions through deep graph matching. J Cheminform 2024; 16:46. [PMID: 38650016 PMCID: PMC11036715 DOI: 10.1186/s13321-024-00841-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 04/07/2024] [Indexed: 04/25/2024] [Open Access]
Abstract
Accurate atom mapping, which establishes correspondences between atoms in reactants and products, is a crucial step in analyzing chemical reactions. In this paper, we present a novel end-to-end approach that formulates the atom mapping problem as a deep graph matching task. Our proposed model, AMNet (Atom Matching Network), utilizes molecular graph representations and employs various atom and bond features using graph neural networks to capture the intricate structural characteristics of molecules, ensuring precise atom correspondence predictions. Notably, AMNet takes molecular symmetry into account, enhancing accuracy while simultaneously reducing computational complexity. The integration of the Weisfeiler-Lehman isomorphism test for symmetry identification refines the model's predictions. Furthermore, our model maps the entire atom set in a chemical reaction, offering a comprehensive approach beyond focusing solely on the main molecules in reactions. We evaluated AMNet's performance on a subset of USPTO reaction datasets, addressing various tasks, including assessing the impact of molecular symmetry identification, understanding the influence of feature selection on AMNet performance, and comparing its performance with the state-of-the-art method. The results reveal an average accuracy of 97.3% on mapped atoms, with 99.7% of reactions correctly mapped when the correct mapped atom is within the top 10 predicted atoms.
Scientific contribution: The paper introduces a novel end-to-end deep graph matching model for atom mapping, utilizing molecular graph representations to capture structural characteristics effectively. It enhances accuracy by integrating symmetry detection through the Weisfeiler-Lehman test, reducing the number of possible mappings and improving efficiency. Unlike previous methods, it maps the entire reaction, not just main components, providing a comprehensive view. Additionally, by integrating efficient graph matching techniques, it reduces computational complexity, making atom mapping more feasible.
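The abstract above mentions the Weisfeiler-Lehman test as a way to identify symmetric atoms. The following is a minimal sketch of 1-dimensional Weisfeiler-Lehman color refinement on a toy molecular graph, not the AMNet implementation. The function names `wl_colors` and `symmetry_classes`, the input format (element symbols plus `(i, j, bond_order)` tuples), and the example molecules are assumptions made for illustration. Atoms that end with the same color are candidates for being symmetric; equal colors are a necessary condition for symmetry, not a proof of it.

```python
from collections import defaultdict


def wl_colors(atoms, bonds):
    """1-WL color refinement (illustrative sketch, not AMNet's code).

    atoms: list of element symbols, one per atom index.
    bonds: list of (i, j, bond_order) tuples for undirected bonds.
    Returns one integer color per atom; equal colors mean the atoms
    could not be told apart by their iterated neighborhoods.
    """
    n = len(atoms)
    nbrs = defaultdict(list)
    for i, j, order in bonds:
        nbrs[i].append((j, order))
        nbrs[j].append((i, order))

    colors = list(atoms)  # initial color: the element symbol
    for _ in range(n):  # at most n rounds are needed to stabilize
        # Signature = own color plus the sorted multiset of
        # (bond order, neighbor color) pairs.
        signatures = [
            (colors[i], tuple(sorted((order, colors[j]) for j, order in nbrs[i])))
            for i in range(n)
        ]
        palette = {sig: k for k, sig in enumerate(sorted(set(signatures)))}
        new_colors = [palette[sig] for sig in signatures]
        # Refinement only ever splits classes, so an unchanged class
        # count means the partition is stable.
        stable = len(set(new_colors)) == len(set(colors))
        colors = new_colors
        if stable:
            break
    return colors


def symmetry_classes(colors):
    """Group atom indices that share a final color."""
    groups = defaultdict(list)
    for index, color in enumerate(colors):
        groups[color].append(index)
    return sorted(groups.values())


# Heavy-atom skeleton of propane (C-C-C): the two terminal carbons
# receive the same color, the central carbon a different one.
propane = wl_colors(["C", "C", "C"], [(0, 1, 1), (1, 2, 1)])
print(symmetry_classes(propane))

# Heavy-atom skeleton of ethanol (C-C-O): no two atoms share a color.
ethanol = wl_colors(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
print(symmetry_classes(ethanol))
```

Grouping atoms this way shrinks the set of distinct mappings that need to be scored, which is the efficiency gain the abstract describes.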
Affiliation(s)
- Maryam Astero
- Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland.
| | - Juho Rousu
- Computer Science, Aalto University, Konemiehentie 2, 02150, Espoo, Finland.
| |
|
2
|
Zheng S, Zeng T, Li C, Chen B, Coley CW, Yang Y, Wu R. Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP. Nat Commun 2022; 13:3342. [PMID: 35688826 PMCID: PMC9187661 DOI: 10.1038/s41467-022-30970-9] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/07/2021] [Accepted: 05/27/2022] [Indexed: 12/30/2022] [Open Access]
Abstract
The complete biosynthetic pathways are unknown for most natural products (NPs), so computer-aided bio-retrosynthesis predictions are valuable. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm from iterative multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP can identify biosynthetic pathways for 90.2% of 368 test compounds and recover the reported building blocks as in the test set for 72.8%, 1.7 times more accurate than existing conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit as well as the curated datasets and learned models are freely available to facilitate the elucidation and reconstruction of the biosynthetic pathways for NPs. The complete biosynthetic pathways of most natural products (NPs) are unknown. Here, the authors report BioNavi-NP, a computational toolkit for bio-retrosynthetic pathway elucidation or reconstruction for both NPs and NP-like compounds.
Affiliation(s)
- Shuangjia Zheng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China; School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China; Galixir, Beijing, China
| | - Tao Zeng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
| | | | - Binghong Chen
- College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
| | - Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China.
| | - Ruibo Wu
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China.
| |
|
3
|
Karp PD, Paley S, Krummenacker M, Kothari A, Wannemuehler MJ, Phillips GJ. Pathway Tools Management of Pathway/Genome Data for Microbial Communities. Front Bioinform 2022; 2:869150. [PMID: 36304298 PMCID: PMC9580912 DOI: 10.3389/fbinf.2022.869150] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/03/2022] [Accepted: 04/05/2022] [Indexed: 11/14/2022] [Open Access]
Abstract
The Pathway Tools (PTools) software provides a suite of capabilities for storing and analyzing integrated collections of genomic and metabolic information in the form of organism-specific Pathway/Genome Databases (PGDBs). A microbial community is represented in PTools by generating a PGDB from each metagenome-assembled genome (MAG). PTools computes a metabolic reconstruction for each organism, and predicts its operons. The properties of individual MAGs can be investigated using the many search and visualization operations within PTools. PTools also enables the user to investigate the properties of the microbial community by issuing searches across the full community, and by performing comparative operations across genome and pathway information. The software can generate a metabolic network diagram for the community, and it can overlay community omics datasets on that network diagram. PTools also provides a tool for searching for metabolic transformation routes across an organism community.
Affiliation(s)
- Peter D. Karp
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA, United States (corresponding author)
| | - Suzanne Paley
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA, United States
| | - Markus Krummenacker
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA, United States
| | - Anamika Kothari
- Bioinformatics Research Group, Artificial Intelligence Center, SRI International, Menlo Park, CA, United States
| | | | - Gregory J. Phillips
- Department of Veterinary Microbiology, Iowa State University, Ames, IA, United States
| |
|
4
|
PyMiner: A method for metabolic pathway design based on the uniform similarity of substrate-product pairs and conditional search. PLoS One 2022; 17:e0266783. [PMID: 35404943 PMCID: PMC9000129 DOI: 10.1371/journal.pone.0266783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 03/26/2022] [Indexed: 11/30/2022] [Open Access]
Abstract
Metabolic pathway design is an essential step in constructing an efficient microbial cell factory to produce high value-added chemicals. Meanwhile, the computational design of biologically meaningful metabolic pathways has been attracting much attention as a route to natural and non-natural products. However, there has been a lack of effective methods to perform metabolic network reduction automatically. In addition, comprehensive evaluation indexes for metabolic pathways are still relatively scarce. Here, we define a novel uniform similarity to calculate the main substrate-product pairs of known biochemical reactions, and further develop an efficient metabolic pathway design tool named PyMiner. As a result, the redundant information of the general metabolic network (GMN) is eliminated, and the number of substrate-product pairs is shown to decrease by 81.62% on average. Considering that the nodes in the extracted metabolic network (EMN) constructed in this work are large in scale but imbalanced in distribution, we establish a conditional search strategy (CSS) that cuts search time in 90.6% of cases. Compared with state-of-the-art methods, PyMiner shows obvious advantages and demonstrates equivalent or better performance on 95% of the experimentally verified pathways. Consequently, PyMiner is a practical and effective tool for metabolic pathway design.
|
5
|
Gao Y, Yuan Q, Mao Z, Liu H, Ma H. Global connectivity in genome-scale metabolic networks revealed by comprehensive FBA-based pathway analysis. BMC Microbiol 2021; 21:292. [PMID: 34696732 PMCID: PMC8543872 DOI: 10.1186/s12866-021-02357-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2021] [Accepted: 10/12/2021] [Indexed: 11/10/2022] [Open Access]
Abstract
Background: Graph-based analysis (GBA) of genome-scale metabolic networks has revealed system-level structures such as the bow-tie connectivity that describes the overall mass flow in a network. However, many pathways obtained by GBA are biologically impossible, making it difficult to study how the global structures affect the biological functions of a network. A new method that can calculate the biologically relevant pathways is desirable for structural analysis of metabolic networks. Results: Here, we present a new method to determine the bow-tie connectivity structure by calculating possible pathways between any pair of metabolites in the metabolic network using a flux balance analysis (FBA) approach to ensure that the obtained pathways are biologically relevant. We tested this method with 15 selected high-quality genome-scale metabolic models from the BiGG database. The results confirmed the key roles of central metabolites in network connectivity; these metabolites are located in the core part of the bow-tie structure, the giant strongly connected component (GSC). However, the sizes of the GSCs revealed by GBA are significantly larger than those revealed by the FBA approach. A great number of metabolites in the GSC from GBA actually cannot be produced from or converted to other metabolites through a mass-balanced pathway and thus should not be in the GSC but in other subsets of the bow-tie structure. In contrast, the bow-tie structural classification of metabolites obtained by FBA is more biologically relevant and suitable for the study of the structure-function relationships of genome-scale metabolic networks. Conclusions: The FBA-based pathway calculation improves the biologically relevant classification of metabolites in the bow-tie connectivity structure of the metabolic network, taking us one step further toward understanding how such system-level structures impact the biological functions of an organism.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12866-021-02357-1.
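To make the bow-tie terminology in the abstract above concrete, here is a minimal sketch of the purely graph-based (GBA-style) decomposition that the abstract contrasts with the authors' FBA-based approach. It is not the authors' method and performs no flux balance analysis. The function names, the edge-list input, and the toy network are assumptions made for illustration.

```python
def reachable(adjacency, start):
    """Return all nodes reachable from start, including start itself."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def bow_tie(edges):
    """Split a directed graph into (GSC, IN, OUT, other) by reachability.

    GSC: largest strongly connected component.
    IN: nodes that can reach the GSC but are not in it.
    OUT: nodes reachable from the GSC but not in it.
    other: everything else.
    """
    forward, backward, nodes = {}, {}, set()
    for source, target in edges:
        forward.setdefault(source, set()).add(target)
        backward.setdefault(target, set()).add(source)
        nodes.update((source, target))
    if not nodes:
        return set(), set(), set(), set()

    # The strongly connected component of a node is the intersection of
    # what it can reach and what can reach it. This is quadratic in the
    # worst case, which is acceptable for a small illustrative graph.
    gsc, assigned = set(), set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        component = reachable(forward, node) & reachable(backward, node)
        assigned |= component
        if len(component) > len(gsc):
            gsc = component

    seed = next(iter(gsc))
    in_set = reachable(backward, seed) - gsc
    out_set = reachable(forward, seed) - gsc
    other = nodes - gsc - in_set - out_set
    return gsc, in_set, out_set, other


# Hypothetical toy network: s feeds a cycle a -> b -> c -> a, the cycle
# produces t, and i also produces t without touching the cycle.
toy_edges = [("s", "a"), ("a", "b"), ("b", "c"), ("c", "a"), ("c", "t"), ("i", "t")]
gsc, in_set, out_set, other = bow_tie(toy_edges)
print(sorted(gsc), sorted(in_set), sorted(out_set), sorted(other))
```

The abstract's point is that connectivity alone, as computed here, can place a metabolite in the GSC even when no mass-balanced pathway supports that placement.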
Affiliation(s)
- Yajie Gao
- Biodesign Center, Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China; College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
| | - Qianqian Yuan
- Biodesign Center, Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Zhitao Mao
- Biodesign Center, Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Hao Liu
- College of Biotechnology, Tianjin University of Science & Technology, Tianjin, China
| | - Hongwu Ma
- Biodesign Center, Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China.
| |
|
6
|
Hafner J, Hatzimanikatis V. Finding metabolic pathways in large networks through atom-conserving substrate-product pairs. Bioinformatics 2021; 37:3560-3568. [PMID: 34003971 PMCID: PMC8545321 DOI: 10.1093/bioinformatics/btab368] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/30/2020] [Revised: 03/22/2021] [Accepted: 05/17/2021] [Indexed: 11/29/2022] [Open Access]
Abstract
Motivation: Finding biosynthetic pathways is essential for metabolic engineering of organisms to produce chemicals, biodegradation prediction of pollutants and drugs, and for the elucidation of bioproduction pathways of secondary metabolites. A key step in biosynthetic pathway design is the extraction of novel metabolic pathways from big networks that integrate known biological, as well as novel, predicted biotransformations. However, the efficient analysis and the navigation of big biochemical networks remain a challenge. Results: Here, we propose the construction of searchable graph representations of metabolic networks. Each reaction is decomposed into pairs of reactants and products, and each pair is assigned a weight, which is calculated from the number of conserved atoms between the reactant and the product molecule. We test our method on a biochemical network that spans 6546 known enzymatic reactions to show how our approach elegantly extracts biologically relevant metabolic pathways from biochemical networks, and how the proposed network structure enables the application of efficient graph search algorithms that improve navigation and pathway identification in big metabolic networks. The weighted reactant–product pairs of an example network and the corresponding graph search algorithm are available online. The proposed method extracts metabolic pathways fast and reliably from big biochemical networks, which is inherently important for all applications involving the engineering of metabolic networks. Availability and implementation: https://github.com/EPFL-LCSB/nicepath. Supplementary information: Supplementary data are available at Bioinformatics online.
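The idea of weighting reactant-product pairs by conserved atoms and then running a graph search can be sketched as follows. This is not the code from the repository named in the abstract. The weighting formula (conserved atoms divided by the larger molecule's atom count, with edge cost equal to the reciprocal), the function names, and the compound labels are all assumptions made for illustration; the abstract does not state the exact formula.

```python
import heapq


def build_graph(pairs):
    """Build a weighted directed graph from reactant-product pairs.

    pairs: iterable of (reactant, product, conserved_atoms,
           reactant_atom_count, product_atom_count).
    The conservation score below is a hypothetical choice: pairs that
    share few atoms (for example, through a cofactor) become expensive.
    """
    graph = {}
    for reactant, product, conserved, n_reactant, n_product in pairs:
        graph.setdefault(product, [])
        if conserved <= 0:
            continue  # no atoms conserved: not a usable edge
        score = conserved / max(n_reactant, n_product)
        graph.setdefault(reactant, []).append((product, 1.0 / score))
    return graph


def shortest_pathway(graph, source, target):
    """Dijkstra search returning (total_cost, path) or None."""
    queue = [(0.0, source, [source])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nxt, step_cost in graph.get(node, []):
            if nxt not in visited:
                heapq.heappush(queue, (cost + step_cost, nxt, path + [nxt]))
    return None


# Hypothetical toy network: A reaches C either through intermediate B
# (most atoms conserved) or through hub cofactor X (one atom conserved).
toy_pairs = [
    ("A", "B", 6, 6, 6),
    ("B", "C", 5, 6, 5),
    ("A", "X", 1, 6, 10),
    ("X", "C", 1, 10, 5),
]
toy_graph = build_graph(toy_pairs)
total_cost, pathway = shortest_pathway(toy_graph, "A", "C")
print(pathway)  # ['A', 'B', 'C']
```

With weights of this kind, the search avoids shortcuts through hub cofactors because those edges conserve few atoms and therefore cost more.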
Affiliation(s)
- Jasmin Hafner
- Laboratory of Computational Systems Biotechnology (LCSB), Institute of Chemical Sciences and Engineering (ISIC), School of Basic Sciences (SB), Swiss Federal Institute of Technology (EPFL), 1015 Lausanne, Switzerland
| | - Vassily Hatzimanikatis
- Laboratory of Computational Systems Biotechnology (LCSB), Institute of Chemical Sciences and Engineering (ISIC), School of Basic Sciences (SB), Swiss Federal Institute of Technology (EPFL), 1015 Lausanne, Switzerland
- To whom correspondence should be addressed.
| |
|
7
|
Huang Y, Xie Y, Zhong C, Zhou F. Finding branched pathways in metabolic network via atom group tracking. PLoS Comput Biol 2021; 17:e1008676. [PMID: 33529200 PMCID: PMC7880430 DOI: 10.1371/journal.pcbi.1008676] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/17/2020] [Revised: 02/12/2021] [Accepted: 01/05/2021] [Indexed: 12/27/2022] [Open Access]
Abstract
Finding non-standard or new metabolic pathways has important applications in metabolic engineering, synthetic biology and the analysis and reconstruction of metabolic networks. Branched metabolic pathways dominate in metabolic networks and depict a more comprehensive picture of metabolism compared to linear pathways. Although progress has been made in finding branched metabolic pathways, few efforts have been made to identify branched metabolic pathways via atom group tracking. In this paper, we present a pathfinding method called BPFinder for finding branched metabolic pathways by atom group tracking, which aims to guide the synthetic design of metabolic pathways. BPFinder enumerates linear metabolic pathways by tracking the movements of atom groups in the metabolic network and merges the linear atom group conserving pathways into branched pathways. Two merging rules based on the structure of conserved atom groups are proposed to accurately merge the branched compounds of linear pathways to identify branched pathways. Furthermore, the integrated information of compound similarity, thermodynamic feasibility and conserved atom groups is also used to rank the pathfinding results for feasible branched pathways. Experimental results show that BPFinder is more capable of recovering known branched metabolic pathways as compared to other existing methods, and is able to return biologically relevant branched pathways and discover alternative branched pathways of biochemical interest. The online server of BPFinder is available at http://114.215.129.245:8080/atomic/. The program, source code and data can be downloaded from https://github.com/hyr0771/BPFinder.
Computational search of branched metabolic pathways is a fundamental problem in metabolic engineering and metabolic network analysis, which provides a systematic way of understanding the metabolism and discovering alternative pathways for synthesis of useful biomolecules. We propose BPFinder, a novel computational approach to identify branched metabolic pathways via atom group tracking. Different from other pathfinding methods using atom tracking, BPFinder tracks the movement of atom groups in the metabolic network to find linear atom group conserving pathways, and merges the found linear pathways by the selected branched compounds to generate branched pathways. Based on the structure of conserved atom groups in branched compounds, we design two merging rules for branched compounds: an overlapping rule and a non-overlapping rule. The user can flexibly adopt these rules to accurately find the branched pathways that contain overlapping/non-overlapping conserved atom groups. BPFinder also enables the user to combine the information of compound similarity, Gibbs free energy of reactions, and conserved atom groups to sort resulting pathways. Compared with other existing methods, BPFinder can more accurately recover the known branched pathways. The alternative branched pathways returned by BPFinder reveal that the user can flexibly utilize our proposed merging rules to discover biochemically meaningful pathways of interest.
Affiliation(s)
- Yiran Huang
- School of Computer and Electronics and Information, Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China
| | - Yusi Xie
- School of Computer and Electronics and Information, Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China
| | - Cheng Zhong
- School of Computer and Electronics and Information, Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, China
| | - Fengfeng Zhou
- College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, China
| |
|
8
|
Wang Z, Zhao W, Hao G, Song B. Mapping the resources and approaches facilitating computer-aided synthesis planning. Org Chem Front 2021. [DOI: 10.1039/d0qo00946f] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/21/2023]
Abstract
Computer-aided synthesis planning could facilitate organic synthesis study and relieve chemists of manual tasks. Artificial intelligence and deep learning would be useful for the development of computer-aided synthesis planning.
Affiliation(s)
- Zheng Wang
- State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering
- Key Laboratory of Green Pesticide and Agricultural Bioengineering
- Ministry of Education
- Center for Research and Development of Fine Chemicals
- Guizhou University
| | - Wei Zhao
- State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering
- Key Laboratory of Green Pesticide and Agricultural Bioengineering
- Ministry of Education
- Center for Research and Development of Fine Chemicals
- Guizhou University
| | - Gefei Hao
- State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering
- Key Laboratory of Green Pesticide and Agricultural Bioengineering
- Ministry of Education
- Center for Research and Development of Fine Chemicals
- Guizhou University
| | - Baoan Song
- State Key Laboratory Breeding Base of Green Pesticide and Agricultural Bioengineering
- Key Laboratory of Green Pesticide and Agricultural Bioengineering
- Ministry of Education
- Center for Research and Development of Fine Chemicals
- Guizhou University
| |
|
9
|
Otero-Muras I, Carbonell P. Automated engineering of synthetic metabolic pathways for efficient biomanufacturing. Metab Eng 2020; 63:61-80. [PMID: 33316374 DOI: 10.1016/j.ymben.2020.11.012] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 08/22/2020] [Revised: 11/15/2020] [Accepted: 11/20/2020] [Indexed: 12/19/2022]
Abstract
Metabolic engineering involves the engineering and optimization of processes from single-cell to fermentation in order to increase production of valuable chemicals for health, food, energy, materials and others. A systems approach to metabolic engineering has gained traction in recent years thanks to advances in strain engineering, leading to an accelerated scaling from rapid prototyping to industrial production. Metabolic engineering is nowadays on track towards becoming a true manufacturing technology, with reduced times from conception to production enabled by automated protocols for DNA assembly of metabolic pathways in engineered producer strains. In this review, we discuss how the success of the metabolic engineering pipeline often relies on retrobiosynthetic protocols able to identify promising production routes and dynamic regulation strategies through automated biodesign algorithms, which are subsequently assembled as embedded integrated genetic circuits in the host strain. Those approaches are orchestrated by an experimental design strategy that provides optimal scheduling planning of the DNA assembly, rapid prototyping and, ultimately, brings forward an accelerated Design-Build-Test-Learn cycle and the overall optimization of the biomanufacturing process. Achieving such a vision will address the increasingly compelling demand in our society for delivering valuable biomolecules in an affordable, inclusive and sustainable bioeconomy.
Affiliation(s)
- Irene Otero-Muras
- BioProcess Engineering Group, IIM-CSIC, Spanish National Research Council, Vigo, 36208, Spain.
| | - Pablo Carbonell
- Institute of Industrial Control Systems and Computing (ai2), Universitat Politècnica de València, 46022, Spain.
| |
|
10
|
Atom Identifiers Generated by a Neighborhood-Specific Graph Coloring Method Enable Compound Harmonization across Metabolic Databases. Metabolites 2020; 10:metabo10090368. [PMID: 32933023 PMCID: PMC7570338 DOI: 10.3390/metabo10090368] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/20/2020] [Revised: 09/04/2020] [Accepted: 09/08/2020] [Indexed: 02/06/2023] [Open Access]
Abstract
Metabolic flux analysis requires both a reliable metabolic model and reliable metabolic profiles in characterizing metabolic reprogramming. Advances in analytic methodologies enable production of high-quality metabolomics datasets capturing isotopic flux. However, useful metabolic models can be difficult to derive due to the lack of relatively complete atom-resolved metabolic networks for a variety of organisms, including human. Here, we developed a neighborhood-specific graph coloring method that creates unique identifiers for each atom in a compound facilitating construction of an atom-resolved metabolic network. What is more, this method is guaranteed to generate the same identifier for symmetric atoms, enabling automatic identification of possible additional mappings caused by molecular symmetry. Furthermore, a compound coloring identifier derived from the corresponding atom coloring identifiers can be used for compound harmonization across various metabolic network databases, which is an essential first step in network integration. With the compound coloring identifiers, 8865 correspondences between KEGG (Kyoto Encyclopedia of Genes and Genomes) and MetaCyc compounds are detected, with 5451 of them confirmed by other identifiers provided by the two databases. In addition, we found that the Enzyme Commission numbers (EC) of reactions can be used to validate possible correspondence pairs, with 1848 unconfirmed pairs validated by commonality in reaction ECs. Moreover, we were able to detect various issues and errors with compound representation in KEGG and MetaCyc databases by compound coloring identifiers, demonstrating the usefulness of this methodology for database curation.
|
11
|
Perez De Souza L, Alseekh S, Brotman Y, Fernie AR. Network-based strategies in metabolomics data analysis and interpretation: from molecular networking to biological interpretation. Expert Rev Proteomics 2020; 17:243-255. [PMID: 32380880 DOI: 10.1080/14789450.2020.1766975] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Indexed: 12/28/2022]
Abstract
Introduction: Metabolomics has become a crucial part of systems biology; however, data analysis is still often undertaken in a reductionist way focusing on changes in individual metabolites. Whilst such approaches indeed provide relevant insights into the metabolic phenotype of an organism, the intricate nature of metabolic relationships may be better explored when considering the whole system. Areas covered: This review highlights multiple network strategies that can be applied for metabolomics data analysis from different perspectives including: association networks based on quantitative information, mass spectra similarity networks to assist metabolite annotation and biochemical networks for systematic data interpretation. We also highlight some relevant insights into metabolic organization obtained through the exploration of such approaches. Expert opinion: Network based analysis is an established method that allows the identification of non-intuitive metabolic relationships as well as the identification of unknown compounds in mass spectrometry. Additionally, the representation of data from metabolomics within the context of metabolic networks is intuitive and allows for the use of statistical analysis that can better summarize relevant metabolic changes from a systematic perspective.
Affiliation(s)
- Leonardo Perez De Souza
- Department of molecular physiology, Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany
| | - Saleh Alseekh
- Department of molecular physiology, Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany; Department of plant metabolomics, Centre of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria
| | - Yariv Brotman
- Department of Life Sciences, Ben-Gurion University of the Negev, Beersheba, Israel
| | - Alisdair R Fernie
- Department of molecular physiology, Max-Planck-Institute of Molecular Plant Physiology, Potsdam-Golm, Germany; Department of plant metabolomics, Centre of Plant Systems Biology and Biotechnology, Plovdiv, Bulgaria
| |
|
12
|
Consuegra J, Grenier T, Baa-Puyoulet P, Rahioui I, Akherraz H, Gervais H, Parisot N, da Silva P, Charles H, Calevro F, Leulier F. Drosophila-associated bacteria differentially shape the nutritional requirements of their host during juvenile growth. PLoS Biol 2020; 18:e3000681. [PMID: 32196485 PMCID: PMC7112240 DOI: 10.1371/journal.pbio.3000681] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Received: 08/07/2019] [Revised: 04/01/2020] [Accepted: 03/04/2020] [Indexed: 01/14/2023] [Open Access]
Abstract
The interplay between nutrition and the microbial communities colonizing the gastrointestinal tract (i.e., gut microbiota) determines juvenile growth trajectory. Nutritional deficiencies trigger developmental delays, and an immature gut microbiota is a hallmark of pathologies related to childhood undernutrition. However, how host-associated bacteria modulate the impact of nutrition on juvenile growth remains elusive. Here, using gnotobiotic Drosophila melanogaster larvae independently associated with Acetobacter pomorum WJL (ApWJL) and Lactobacillus plantarum NC8 (LpNC8), 2 model Drosophila-associated bacteria, we performed a large-scale, systematic nutritional screen based on larval growth in 40 different and precisely controlled nutritional environments. We combined these results with genome-based metabolic network reconstruction to define the biosynthetic capacities of Drosophila germ-free (GF) larvae and its 2 bacterial partners. We first established that ApWJL and LpNC8 differentially fulfill the nutritional requirements of the ex-GF larvae and parsed such difference down to individual amino acids, vitamins, other micronutrients, and trace metals. We found that Drosophila-associated bacteria not only fortify the host’s diet with essential nutrients but, in specific instances, functionally compensate for host auxotrophies by either providing a metabolic intermediate or nutrient derivative to the host or by uptaking, concentrating, and delivering contaminant traces of micronutrients. Our systematic work reveals that beyond the molecular dialogue engaged between the host and its bacterial partners, Drosophila and its associated bacteria establish an integrated nutritional network relying on nutrient provision and utilization. A study of gnotobiotic fruit flies shows that the animal is involved in an integrated nutritional network with its facultative commensal bacteria, centered around the utilization and sharing of nutrients.
Affiliation(s)
- Jessika Consuegra
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, École Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard Lyon 1, UMR5242, Lyon, France
| | - Théodore Grenier
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, École Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard Lyon 1, UMR5242, Lyon, France
| | - Patrice Baa-Puyoulet
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - Isabelle Rahioui
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - Houssam Akherraz
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, École Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard Lyon 1, UMR5242, Lyon, France
| | - Hugo Gervais
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, École Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard Lyon 1, UMR5242, Lyon, France
| | - Nicolas Parisot
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - Pedro da Silva
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - Hubert Charles
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - Federica Calevro
- Laboratoire Biologie Fonctionnelle, Insectes et Interactions, Université de Lyon, Institut National des Sciences Appliquées, Institut National de Recherche pour l’Agriculture, l’Alimentation et l’Environnement, UMR0203, Villeurbanne, France
| | - François Leulier
- Institut de Génomique Fonctionnelle de Lyon, Université de Lyon, École Normale Supérieure de Lyon, Centre National de la Recherche Scientifique, Université Claude Bernard Lyon 1, UMR5242, Lyon, France
- * E-mail:
| |
Collapse
|
13
|
Kim SM, Peña MI, Moll M, Bennett GN, Kavraki LE. Improving the organization and interactivity of metabolic pathfinding with precomputed pathways. BMC Bioinformatics 2020; 21:13. [PMID: 31924164 PMCID: PMC6954563 DOI: 10.1186/s12859-019-3328-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 03/14/2019] [Accepted: 12/18/2019] [Indexed: 11/11/2022] Open
Abstract
Background: The rapid growth of available knowledge on metabolic processes across thousands of species continues to expand the possibilities of producing chemicals by combining pathways found in different species. Several computational search algorithms have been developed for automating the identification of possible heterologous pathways; however, these searches may return thousands of pathway results. Although the large number of results is in part due to the large number of possible compounds and reactions, a subset of core reaction modules is repeatedly observed in pathway results across multiple searches, suggesting that some subpaths between common compounds were more consistently explored than others. To reduce the resources spent on searching the same metabolic space, a new meta-algorithm for metabolic pathfinding, Hub Pathway search with Atom Tracking (HPAT), was developed to take advantage of a precomputed network of subpath modules. To investigate the efficacy of this method, we created a table describing a network of common hub metabolites and how they are biochemically connected and only offloaded searches to and from this hub network onto an interactive webserver capable of visualizing the resulting pathways.
Results: A test set of nineteen known pathways taken from literature and metabolic databases was used to evaluate whether HPAT was capable of identifying known pathways. HPAT found the exact pathway for eleven of the nineteen test cases using a diverse set of precomputed subpaths, whereas a comparable pathfinding search algorithm that does not use precomputed subpaths found only seven of the nineteen test cases. The capability of HPAT to find novel pathways was demonstrated by its ability to identify novel 3-hydroxypropanoate (3-HP) synthesis pathways. As for pathway visualization, the new interactive pathway filters enable a reduction of the number of displayed pathways from hundreds down to fewer than ten pathways in several test cases, illustrating their utility in reducing the amount of presented information while retaining pathways of interest.
Conclusions: This work presents the first step in incorporating a precomputed subpath network into metabolic pathfinding and demonstrates how this leads to a concise, interactive visualization of pathway results. The modular nature of metabolic pathways is exploited to facilitate efficient discovery of alternate pathways.
Affiliation(s)
- Sarah M Kim
  - Department of Computer Science, Rice University, Houston, Texas, USA
- Matthew I Peña
  - Department of BioSciences, Rice University, Houston, Texas, USA
- Mark Moll
  - Department of Computer Science, Rice University, Houston, Texas, USA
- Lydia E Kavraki
  - Department of Computer Science, Rice University, Houston, Texas, USA
14
Whitmore LS, Nguyen B, Pinar A, George A, Hudson CM. RetSynth: determining all optimal and sub-optimal synthetic pathways that facilitate synthesis of target compounds in chassis organisms. BMC Bioinformatics 2019; 20:461. [PMID: 31500573 PMCID: PMC6734243 DOI: 10.1186/s12859-019-3025-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/29/2019] [Accepted: 08/12/2019] [Indexed: 11/24/2022] Open
Abstract
Background: The efficient biological production of industrially and economically important compounds is a challenging problem. Brute-force determination of the optimal pathways to efficient production of a target chemical in a chassis organism is computationally intractable. Many current methods provide a single solution to this problem, but fail to provide all optimal pathways, optional sub-optimal solutions or hybrid biological/non-biological solutions.
Results: Here we present RetSynth, software with a novel algorithm for determining all optimal biological pathways given a starting biological chassis and target chemical. By dynamically selecting constraints, the number of potential pathways scales by the number of fully independent pathways and not by the number of overall reactions or size of the metabolic network. This feature allows all optimal pathways to be determined for a large number of chemicals and for a large corpus of potential chassis organisms. Additionally, this software contains other features including the ability to collect data from metabolic repositories, perform flux balance analysis, and view optimal pathways identified by our algorithm using a built-in visualization module. This software also identifies sub-optimal pathways and allows incorporation of non-biological chemical reactions, which may be performed after metabolic production of precursor molecules.
Conclusions: The novel algorithm designed for RetSynth streamlines an arduous and complex process in metabolic engineering. Our stand-alone software allows the identification of candidate optimal and additional sub-optimal pathways, and provides the user with necessary ranking criteria such as target yield to decide which route to select for target production. Furthermore, the ability to incorporate non-biological reactions into the final steps allows determination of pathways to production for targets that cannot be solely produced biologically. With this comprehensive suite of features, RetSynth exceeds any open-source software or webservice currently available for identifying optimal pathways for target production.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-3025-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Bernard Nguyen
  - Sandia National Laboratories, East Avenue, Livermore, 94550, USA
- Ali Pinar
  - Sandia National Laboratories, East Avenue, Livermore, 94550, USA
- Anthe George
  - Sandia National Laboratories, East Avenue, Livermore, 94550, USA
- Corey M Hudson
  - Sandia National Laboratories, East Avenue, Livermore, 94550, USA
15
Sinatti VVC, Gonçalves CAX, Romão-Dumaresq AS. Identification of metabolites identical and similar to drugs as candidates for metabolic engineering. J Biotechnol 2019; 302:67-76. [PMID: 31254549 DOI: 10.1016/j.jbiotec.2019.06.303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2019] [Revised: 04/20/2019] [Accepted: 06/25/2019] [Indexed: 11/18/2022]
Abstract
Natural compounds and their derivatives play an essential role in the pharmaceutical industry. However, the difficulty of resynthesizing natural products or isolating them from the native host often limits their availability, elevates costs, and slows down the pharmaceutical manufacturing process. In this context, the application of synthetic biology could enable the efficient production of large amounts of drugs or drug precursors in heterologous microorganisms, with the aim of accelerating the entire manufacturing process. From this perspective, we developed a pipeline to automatically search for metabolites available in the metabolic space that are structurally similar to drugs approved worldwide. This pipeline involved the in silico screening of metabolites from a metabolic pathway meta-database using both Tanimoto coefficients based on Daylight-like fingerprints and a Maximum Common Substructure algorithm. The method was successfully applied to identify metabolites sharing essential scaffolds with one or more drugs as potential candidates for metabolic engineering. Three of these metabolites (Festuclavine, Scopolamine, and Baccatin III) were identified as similar to many drugs, such as Cabergoline, Oxitropium, and Paclitaxel, and had their metabolic pathways computationally mapped for production in Saccharomyces cerevisiae with our proprietary pathway design software. These compounds are examples of new opportunities for the application of synthetic biology in pharmaceutical production.
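The Tanimoto screening step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each fingerprint is already available as a set of "on" bit indices; the function names, the toy fingerprints, and the 0.7 threshold are hypothetical and are not taken from the authors' pipeline, which additionally uses a Maximum Common Substructure step not shown here.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    # |A ∩ B| / |A ∪ B|, with the union size computed by inclusion-exclusion
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_metabolites(drug_fp: set, metabolite_fps: dict, threshold: float = 0.7) -> list:
    """Return (name, score) pairs at or above the threshold, best score first.

    The threshold value is an illustrative assumption, not a published cutoff.
    """
    scored = [(name, tanimoto(drug_fp, fp)) for name, fp in metabolite_fps.items()]
    hits = [pair for pair in scored if pair[1] >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Toy data: bit indices are made up for illustration only.
drug = {1, 2, 3, 4}
candidates = {"met_a": {1, 2, 3, 4, 5}, "met_b": {1, 9}, "met_c": {1, 2, 3, 4}}
ranking = rank_metabolites(drug, candidates)
```

In practice the bit sets would come from a cheminformatics toolkit; the ranking logic stays the same regardless of how the fingerprints are produced.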
Affiliation(s)
- Vanessa V C Sinatti
  - SENAI Innovation Institute for Biosynthetics, Technology Center for Chemical and Textile Industry, Rio de Janeiro, Brazil
- Carlos Alberto X Gonçalves
  - SENAI Innovation Institute for Biosynthetics, Technology Center for Chemical and Textile Industry, Rio de Janeiro, Brazil
- Aline S Romão-Dumaresq
  - SENAI Innovation Institute for Biosynthetics, Technology Center for Chemical and Textile Industry, Rio de Janeiro, Brazil
16
Karp PD, Billington R, Caspi R, Fulcher CA, Latendresse M, Kothari A, Keseler IM, Krummenacker M, Midford PE, Ong Q, Ong WK, Paley SM, Subhraveti P. The BioCyc collection of microbial genomes and metabolic pathways. Brief Bioinform 2019; 20:1085-1093. [PMID: 29447345 PMCID: PMC6781571 DOI: 10.1093/bib/bbx085] [Citation(s) in RCA: 438] [Impact Index Per Article: 87.6] [Received: 05/18/2017] [Revised: 06/22/2017] [Indexed: 01/31/2023] Open
Abstract
BioCyc.org is a microbial genome Web portal that combines thousands of genomes with additional information inferred by computer programs, imported from other databases and curated from the biomedical literature by biologist curators. BioCyc also provides an extensive range of query tools, visualization services and analysis software. Recent advances in BioCyc include an expansion in the content of BioCyc in terms of both the number of genomes and the types of information available for each genome; an expansion in the amount of curated content within BioCyc; and new developments in the BioCyc software tools including redesigned gene/protein pages and metabolite pages; new search tools; a new sequence-alignment tool; a new tool for visualizing groups of related metabolic pathways; and a facility called SmartTables, which enables biologists to perform analyses that previously would have required a programmer's assistance.
17
Krummenacker M, Latendresse M, Karp PD. Metabolic route computation in organism communities. Microbiome 2019; 7:89. [PMID: 31174602 PMCID: PMC6556054 DOI: 10.1186/s40168-019-0706-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/16/2018] [Accepted: 05/28/2019] [Indexed: 06/09/2023]
Abstract
Background: Microbiomes are complex aggregates of organisms, each of which has its own extensive metabolic network. A variety of metabolites are exchanged between the microbes. The challenge we address is understanding the overall metabolic capabilities of a microbiome: through what series of metabolic transformations can a microbiome convert a starting compound to an ending compound?
Results: We developed an efficient software tool to search for metabolic routes that include metabolic reactions from multiple organisms. The metabolic network for each organism is obtained from BioCyc, where the network was inferred from the annotated genome. The tool searches for optimal metabolic routes that minimize the number of reactions in each route, maximize the number of atoms conserved between the starting and ending compounds, and minimize the number of organism switches. The tool pre-computes the reaction sets found in each organism from BioCyc to facilitate fast computation of the reactions defined in a researcher-specified organism set. The generated routes are depicted graphically, and for each reaction in a route, the tool lists the organisms that can catalyze that reaction. We present solutions for three route-finding problems in the human gut microbiome: (1) production of indoxyl sulfate, (2) production of trimethylamine N-oxide (TMAO), and (3) synthesis and degradation of autoinducers. The optimal routes computed by our multi-organism route-search (MORS) tool for indoxyl sulfate and TMAO were the same as routes reported in the literature.
Conclusions: Our tool quickly found plausible routes for the discussed multi-organism route-finding problems. The routes shed light on how diverse organisms cooperate to perform multi-step metabolic transformations. Our tool enables scientists to consider multiple alternative routes and identifies the organisms responsible for each reaction.
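Two of the optimization criteria named in this abstract (fewest reactions, then fewest organism switches) can be sketched as a lexicographic shortest-path search. This is an illustrative simplification, not the MORS implementation: it treats each reaction as a single substrate-to-product edge, ignores atom conservation, and all names (`best_route`, the toy reactions, organisms `o1` and `o2`) are hypothetical.

```python
import heapq
import itertools


def best_route(reactions, source, target):
    """Dijkstra over (compound, organism) states with cost (n_reactions, n_switches).

    reactions: list of (name, substrate, product, set_of_organisms).
    Returns (cost, route) where route is a list of (reaction, organism), or None.
    """
    by_substrate = {}
    for name, substrate, product, organisms in reactions:
        by_substrate.setdefault(substrate, []).append((name, product, organisms))

    tie = itertools.count()  # tie-breaker so the heap never compares routes
    heap = [((0, 0), next(tie), source, None, [])]
    settled = set()
    while heap:
        cost, _, compound, organism, route = heapq.heappop(heap)
        if compound == target:
            return cost, route
        if (compound, organism) in settled:
            continue
        settled.add((compound, organism))
        for name, product, organisms in by_substrate.get(compound, []):
            for org in sorted(organisms):
                # A switch is counted only when a previous organism exists and differs.
                switch = int(organism is not None and org != organism)
                new_cost = (cost[0] + 1, cost[1] + switch)
                heapq.heappush(
                    heap, (new_cost, next(tie), product, org, route + [(name, org)])
                )
    return None


# Toy network: a three-step single-organism route versus a two-step route with one switch.
toy = [
    ("r1", "A", "B", {"o1"}),
    ("r2", "B", "C", {"o1"}),
    ("r3", "C", "D", {"o1"}),
    ("r4", "B", "D", {"o2"}),
]
result = best_route(toy, "A", "D")
```

Because tuple costs compare lexicographically, the shorter route wins even though it requires an organism switch; swapping the tuple order would prioritize fewer switches instead.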
Affiliation(s)
- Peter D Karp
  - SRI International, 333 Ravenswood Ave., Menlo Park, 94025 CA USA
18
Lin GM, Warden-Rothman R, Voigt CA. Retrosynthetic design of metabolic pathways to chemicals not found in nature. Curr Opin Syst Biol 2019. [DOI: 10.1016/j.coisb.2019.04.004] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Indexed: 12/23/2022]
19
Metabolic pathways synthesis based on ant colony optimization. Sci Rep 2018; 8:16398. [PMID: 30401873 PMCID: PMC6219534 DOI: 10.1038/s41598-018-34454-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/28/2018] [Accepted: 10/17/2018] [Indexed: 02/02/2023] Open
Abstract
One of the current challenges in bioinformatics is to discover new ways to transform a set of compounds into specific products. The usual approach is finding the reactions to synthesize a particular product, from a given substrate, by means of classical searching algorithms. However, they have three main limitations: difficulty in handling large amounts of reactions and compounds; absence of a step that verifies the availability of substrates; and inability to find branched pathways. We present here a novel bio-inspired algorithm for synthesizing linear and branched metabolic pathways. It allows relating several compounds simultaneously, ensuring the availability of substrates for every reaction in the solution. Comparisons with classical searching algorithms and other recent metaheuristic approaches show clear advantages of this proposal, fully recovering well-known pathways. Furthermore, solutions found can be analyzed in a simple way through graphical representations on the web.
20
Jeffryes JG, Seaver SMD, Faria JP, Henry CS. A pathway for every product? Tools to discover and design plant metabolism. Plant Science 2018; 273:61-70. [PMID: 29907310 DOI: 10.1016/j.plantsci.2018.03.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/21/2017] [Revised: 03/13/2018] [Accepted: 03/19/2018] [Indexed: 06/08/2023]
Abstract
The vast diversity of plant natural products is a powerful indication of the biosynthetic capacity of plant metabolism. Synthetic biology seeks to capitalize on this ability by understanding and reconfiguring the biosynthetic pathways that generate this diversity to produce novel products with improved efficiency. Here we review the algorithms and databases that presently support the design and manipulation of metabolic pathways in plants, starting from metabolic models of native biosynthetic pathways, progressing to novel combinations of known reactions, and finally proposing new reactions that may be carried out by existing enzymes. We show how these tools are useful for proposing new pathways as well as identifying side reactions that may affect engineering goals.
Affiliation(s)
- James G Jeffryes
  - Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
- Samuel M D Seaver
  - Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
- José P Faria
  - Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
- Christopher S Henry
  - Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
21
Enumerating all possible biosynthetic pathways in metabolic networks. Sci Rep 2018; 8:9932. [PMID: 29967471 PMCID: PMC6028704 DOI: 10.1038/s41598-018-28007-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 02/21/2018] [Accepted: 06/14/2018] [Indexed: 12/24/2022] Open
Abstract
Exhaustive identification of all possible alternate pathways that exist in metabolic networks can provide valuable insights into cellular metabolism. With the growing number of metabolic reconstructions, there is a need for an efficient method to enumerate pathways, which can also scale well to large metabolic networks, such as those corresponding to microbial communities. We developed MetQuest, an efficient graph-theoretic algorithm to enumerate all possible pathways of a particular size between a given set of source and target molecules. Our algorithm employs a guided breadth-first search to identify all feasible reactions based on the availability of the precursor molecules, followed by a novel dynamic-programming based enumeration, which assembles these reactions into pathways of a specified size producing the target from the source. We demonstrate several interesting applications of our algorithm, ranging from identifying amino acid biosynthesis pathways to identifying the most diverse pathways involved in degradation of complex molecules. We also illustrate the scalability of our algorithm, by studying large graphs such as those corresponding to microbial communities, and identify several metabolic interactions happening therein. MetQuest is available as a Python package, and the source codes can be found at https://github.com/RamanLab/metquest.
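The first stage described in this abstract, a breadth-first search that admits a reaction only once all of its precursors are available, can be sketched as follows. This is a minimal illustration under assumed data structures (a dictionary mapping a reaction name to its substrate and product sets); it is not the MetQuest package API, and it omits the dynamic-programming enumeration stage entirely.

```python
def guided_bfs(reactions, seeds):
    """Expand the set of available metabolites level by level.

    reactions: dict mapping reaction name -> (set of substrates, set of products).
    seeds: iterable of metabolites available at the start.
    Returns (levels, available): the reactions that became feasible in each round,
    and the final set of producible metabolites.
    """
    available = set(seeds)
    pending = dict(reactions)
    levels = []
    while True:
        # A reaction is feasible only if every substrate is already available.
        ready = sorted(
            name for name, (substrates, _) in pending.items() if substrates <= available
        )
        if not ready:
            break
        for name in ready:
            _, products = pending.pop(name)
            available |= products
        levels.append(ready)
    return levels, available


# Toy network: r2 needs the product of r1, and r3 can never fire because E is missing.
toy_reactions = {
    "r1": ({"A"}, {"B"}),
    "r2": ({"B", "C"}, {"D"}),
    "r3": ({"E"}, {"F"}),
}
levels, scope = guided_bfs(toy_reactions, {"A", "C"})
```

Requiring all substrates, rather than any single one, is what distinguishes this traversal from a plain graph BFS and ensures that every visited reaction is actually feasible from the seed set.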
22
Exploring the combinatorial space of complete pathways to chemicals. Biochem Soc Trans 2018; 46:513-522. [DOI: 10.1042/bst20170272] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 12/17/2017] [Revised: 02/21/2018] [Accepted: 02/26/2018] [Indexed: 11/17/2022]
Abstract
Computational pathway design tools often face the challenges of balancing the stoichiometry of co-metabolites and cofactors and of handling reaction rule utilization in a single workflow. To this end, we provide an overview of two complementary stoichiometry-based pathway design tools, optStoic and novoStoic, developed in our group to tackle these challenges. optStoic first determines the stoichiometry of the overall conversion that optimizes a performance criterion (e.g. high carbon/energy efficiency) and ensures a comprehensive search of co-metabolites and cofactors. The procedure then identifies the minimum number of intervening reactions needed to connect the source and sink metabolites. We also extend the pathway design procedure by expanding the search space to include both known and hypothetical reactions, represented by reaction rules, in a new tool termed novoStoic. Reaction rules are derived based on a mixed-integer linear programming (MILP) compatible reaction operator, which allows us to explore natural promiscuous enzymes, engineer candidate enzymes that are not already promiscuous, and design de novo enzymes. The identified biochemical reaction rules then guide novoStoic to design routes that expand the currently known biotransformation space using a single MILP modeling procedure. We demonstrate the use of the two computational tools in pathway elucidation by designing novel synthetic routes for isobutanol.
23
Abd Algfoor Z, Shahrizal Sunar M, Abdullah A, Kolivand H. Identification of metabolic pathways using pathfinding approaches: a systematic review. Brief Funct Genomics 2017; 16:87-98. [PMID: 26969656 DOI: 10.1093/bfgp/elw002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open
Abstract
Metabolic pathways have become increasingly available for various microorganisms. Such pathways have spurred the development of a wide array of computational tools, in particular, mathematical pathfinding approaches. This article can facilitate the understanding of computational analysis of metabolic pathways in genomics. Moreover, stoichiometric and pathfinding approaches in metabolic pathway analysis are discussed. Three major types of studies are elaborated: stoichiometric identification models, pathway-based graph analysis and pathfinding approaches in cellular metabolism. Furthermore, evaluation of the outcomes of the pathways with mathematical benchmarking metrics is provided. This review would lead to better comprehension of metabolism behaviors in living cells, in terms of computed pathfinding approaches.
Affiliation(s)
- Zeyad Abd Algfoor
  - MaGIC-X (Media and Games Innovation Centre of Excellence), UTM-IRDA Digital Media Centre, Universiti Teknologi Malaysia, Johor Bahru, Malaysia
- Mohd Shahrizal Sunar
  - MaGIC-X (Media and Games Innovation Centre of Excellence), UTM-IRDA Digital Media Centre, Universiti Teknologi Malaysia, Johor Bahru, Malaysia
- Afnizanfaizal Abdullah
  - Boston University School of Medicine, Boston Medical Center, Boston, MA, USA; Duke Global Health Institute, Duke University, Durham, NC, USA; Global Health Program, Duke Kunshan University, Jiangsu, China
- Hoshang Kolivand
  - Department of Computer Science, Liverpool John Moores University, Liverpool, UK
24
Wang L, Dash S, Ng CY, Maranas CD. A review of computational tools for design and reconstruction of metabolic pathways. Synth Syst Biotechnol 2017; 2:243-252. [PMID: 29552648 PMCID: PMC5851934 DOI: 10.1016/j.synbio.2017.11.002] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Received: 08/21/2017] [Revised: 11/06/2017] [Accepted: 11/06/2017] [Indexed: 11/28/2022] Open
Abstract
Metabolic pathways reflect an organism's chemical repertoire and hence their elucidation and design have been a primary goal in metabolic engineering. Various computational methods have been developed to design novel metabolic pathways while taking into account several prerequisites such as pathway stoichiometry, thermodynamics, host compatibility, and enzyme availability. The choice of the method is often determined by the nature of the metabolites of interest and preferred host organism, along with computational complexity and availability of software tools. In this paper, we review different computational approaches used to design metabolic pathways based on the reaction network representation of the database (i.e., graph or stoichiometric matrix) and the search algorithm (i.e., graph search, flux balance analysis, or retrosynthetic search). We also put forth a systematic workflow that can be implemented in projects requiring pathway design and highlight current limitations and obstacles in computational pathway design.
Affiliation(s)
- Lin Wang
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Satyakam Dash
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Chiam Yu Ng
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
- Costas D Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
25
Abstract
Systems metabolic engineering, which recently emerged as metabolic engineering integrated with systems biology, synthetic biology, and evolutionary engineering, allows engineering of microorganisms on a systemic level for the production of valuable chemicals far beyond their native capabilities. Here, we review the strategies for systems metabolic engineering and particularly its applications in Escherichia coli. First, we cover the various tools developed for genetic manipulation in E. coli to increase the production titers of desired chemicals. Next, we detail the strategies for systems metabolic engineering in E. coli, covering the engineering of the native metabolism, the expansion of metabolism with synthetic pathways, and the process engineering aspects undertaken to achieve higher production titers of desired chemicals. Finally, we examine, as case studies, a couple of notable products produced in E. coli strains developed by systems metabolic engineering. The large portfolio of chemical products successfully produced by engineered E. coli listed here demonstrates the sheer capacity of what can be envisioned and achieved with respect to microbial production of chemicals. Systems metabolic engineering is no longer in its infancy; it is now widely employed and is also positioned to further embrace next-generation interdisciplinary principles and innovation for its upgrade. Systems metabolic engineering will play increasingly important roles in developing industrial strains, including E. coli, that are capable of efficiently producing natural and nonnatural chemicals and materials from renewable nonfood biomass.
26
Kim SM, Peña MI, Moll M, Bennett GN, Kavraki LE. A review of parameters and heuristics for guiding metabolic pathfinding. J Cheminform 2017; 9:51. [PMID: 29086092 PMCID: PMC5602787 DOI: 10.1186/s13321-017-0239-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/08/2017] [Accepted: 09/07/2017] [Indexed: 12/04/2022] Open
Abstract
Recent developments in metabolic engineering have led to the successful biosynthesis of valuable products, such as the precursor of the antimalarial compound, artemisinin, and opioid precursor, thebaine. Synthesizing these traditionally plant-derived compounds in genetically modified yeast cells introduces the possibility of significantly reducing the total time and resources required for their production, and in turn, allows these valuable compounds to become cheaper and more readily available. Most biosynthesis pathways used in metabolic engineering applications have been discovered manually, requiring a tedious search of existing literature and metabolic databases. However, the recent rapid development of available metabolic information has enabled the development of automated approaches for identifying novel pathways. Computer-assisted pathfinding has the potential to save biochemists time in the initial discovery steps of metabolic engineering. In this paper, we review the parameters and heuristics used to guide the search in recent pathfinding algorithms. These parameters and heuristics capture information on the metabolic network structure, compound structures, reaction features, and organism-specificity of pathways. No one metabolic pathfinding algorithm or search parameter stands out as the best to use broadly for solving the pathfinding problem, as each method and parameter has its own strengths and shortcomings. As assisted pathfinding approaches continue to become more sophisticated, the development of better methods for visualizing pathway results and integrating these results into existing metabolic engineering practices is also important for encouraging wider use of these pathfinding methods.
Affiliation(s)
- Sarah M Kim
  - Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
- Matthew I Peña
  - Department of BioSciences, Rice University, 6100 Main St., Houston, TX, 77005, USA
- Mark Moll
  - Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
- George N Bennett
  - Department of BioSciences, Rice University, 6100 Main St., Houston, TX, 77005, USA
- Lydia E Kavraki
  - Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
27
Sankar A, Ranu S, Raman K. Predicting novel metabolic pathways through subgraph mining. Bioinformatics 2017; 33:3955-3963. [DOI: 10.1093/bioinformatics/btx481] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 04/20/2017] [Accepted: 07/26/2017] [Indexed: 11/13/2022] Open
Affiliation(s)
- Aravind Sankar
- Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India
| | - Sayan Ranu
- Department of Computer Science and Engineering, Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India
- Initiative for Biological Systems Engineering (IBSE), Interdisciplinary Laboratory for Data Sciences, Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India
| | - Karthik Raman
- Initiative for Biological Systems Engineering (IBSE), Interdisciplinary Laboratory for Data Sciences, Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology (IIT) Madras, Chennai, Tamil Nadu, India
| |
|
28
|
Hadadi N, Hafner J, Soh KC, Hatzimanikatis V. Reconstruction of biological pathways and metabolic networks from in silico labeled metabolites. Biotechnol J 2017; 12. [DOI: 10.1002/biot.201600464] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 08/04/2016] [Revised: 11/21/2016] [Accepted: 11/28/2016] [Indexed: 12/13/2022]
Affiliation(s)
- Noushin Hadadi
- Laboratory of Computational Systems Biotechnology (LCSB), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
| | - Jasmin Hafner
- Laboratory of Computational Systems Biotechnology (LCSB), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
| | - Keng Cher Soh
- Laboratory of Computational Systems Biotechnology (LCSB), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
| | - Vassily Hatzimanikatis
- Laboratory of Computational Systems Biotechnology (LCSB), Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
| |
|
29
|
Huang Y, Zhong C, Lin HX, Wang J. A Method for Finding Metabolic Pathways Using Atomic Group Tracking. PLoS One 2017; 12:e0168725. [PMID: 28068354 PMCID: PMC5221824 DOI: 10.1371/journal.pone.0168725] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/05/2016] [Accepted: 12/05/2016] [Indexed: 12/13/2022] Open
Abstract
A fundamental computational problem in metabolic engineering is to find pathways between compounds. Pathfinding methods using atom tracking have been widely used to find biochemically relevant pathways. However, these methods require the user to define the atoms to be tracked. This may lead to failing to predict the pathways that do not conserve the user-defined atoms. In this work, we propose a pathfinding method called AGPathFinder to find biochemically relevant metabolic pathways between two given compounds. In AGPathFinder, we find alternative pathways by tracking the movement of atomic groups through metabolic networks and use combined information of reaction thermodynamics and compound similarity to guide the search towards more feasible pathways and better performance. The experimental results show that atomic group tracking enables our method to find pathways without the need of defining the atoms to be tracked, avoid hub metabolites, and obtain biochemically meaningful pathways. Our results also demonstrate that atomic group tracking, when incorporated with combined information of reaction thermodynamics and compound similarity, improves the quality of the found pathways. In most cases, the average compound inclusion accuracy and reaction inclusion accuracy for the top resulting pathways of our method are around 0.90 and 0.70, respectively, which are better than those of the existing methods. Additionally, AGPathFinder provides the information of thermodynamic feasibility and compound similarity for the resulting pathways.
Affiliation(s)
- Yiran Huang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, China
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
- * E-mail: (YH); (CZ)
| | - Cheng Zhong
- School of Computer, Electronics and Information, Guangxi University, Nanning, China
- * E-mail: (YH); (CZ)
| | - Hai Xiang Lin
- Faculty of Electrical Engineering, Mathematics and Computer Science, Delft University of Technology, Delft, The Netherlands
| | - Jianyi Wang
- School of Chemistry and Chemical Engineering, Guangxi University, Nanning, China
| |
|
30
|
Making sense of genomes of parasitic worms: Tackling bioinformatic challenges. Biotechnol Adv 2016; 34:663-686. [DOI: 10.1016/j.biotechadv.2016.03.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 11/05/2015] [Revised: 02/25/2016] [Accepted: 03/01/2016] [Indexed: 01/25/2023]
|
31
|
Carbonell P, Currin A, Jervis AJ, Rattray NJW, Swainston N, Yan C, Takano E, Breitling R. Bioinformatics for the synthetic biology of natural products: integrating across the Design-Build-Test cycle. Nat Prod Rep 2016; 33:925-32. [PMID: 27185383 PMCID: PMC5063057 DOI: 10.1039/c6np00018e] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 02/05/2016] [Indexed: 12/11/2022]
Abstract
Covering: 2000 to 2016. Progress in synthetic biology is enabled by powerful bioinformatics tools allowing the integration of the design, build and test stages of the biological engineering cycle. In this review we illustrate how this integration can be achieved, with a particular focus on natural products discovery and production. Bioinformatics tools for the DESIGN and BUILD stages include tools for the selection, synthesis, assembly and optimization of parts (enzymes and regulatory elements), devices (pathways) and systems (chassis). TEST tools include those for screening, identification and quantification of metabolites for rapid prototyping. The main advantages and limitations of these tools as well as their interoperability capabilities are highlighted.
Affiliation(s)
- Pablo Carbonell
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Andrew Currin
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Adrian J. Jervis
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Nicholas J. W. Rattray
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Neil Swainston
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Cunyu Yan
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Eriko Takano
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| | - Rainer Breitling
- Manchester Centre for Fine and Specialty Chemicals (SYNBIOCHEM), Manchester Institute of Biotechnology, University of Manchester, Manchester M1 7DN, UK
| |
|
32
|
Frainay C, Jourdan F. Computational methods to identify metabolic sub-networks based on metabolomic profiles. Brief Bioinform 2016; 18:43-56. [PMID: 26822099 DOI: 10.1093/bib/bbv115] [Citation(s) in RCA: 42] [Impact Index Per Article: 5.3] [Received: 10/06/2015] [Revised: 12/16/2015] [Indexed: 11/13/2022] Open
Abstract
Untargeted metabolomics makes it possible to identify compounds that undergo significant changes in concentration in different experimental conditions. The resulting metabolomic profile characterizes the perturbation concerned, but does not explain the underlying biochemical mechanisms. Bioinformatics methods make it possible to interpret results in light of the whole metabolism. This knowledge is modelled into a network, which can be mined using algorithms that originate in graph theory. These algorithms can extract sub-networks related to the compounds identified. Several attempts have been made to adapt them to obtain more biologically meaningful results. However, there is still no consensus on this kind of analysis of metabolic networks. This review presents the main graph approaches used to interpret metabolomic data using metabolic networks. Their advantages and drawbacks are discussed, and the impacts of their parameters are emphasized. We also provide some guidelines for relevant sub-network extraction and also suggest a range of applications for most methods.
|
33
|
Khosraviani M, Saheb Zamani M, Bidkhori G. FogLight: an efficient matrix-based approach to construct metabolic pathways by search space reduction. Bioinformatics 2015; 32:398-408. [PMID: 26454274 DOI: 10.1093/bioinformatics/btv578] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/09/2015] [Accepted: 10/02/2015] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: A fundamental computational problem in the area of metabolic engineering is finding metabolic pathways between a pair of source and target metabolites efficiently. We present an approach, namely FogLight, for searching metabolic networks utilizing Boolean (AND-OR) operations represented in matrix notation to efficiently reduce the search space. This enables the enumeration of all pathways between metabolites that are too distant for the application of brute-force methods. RESULTS: Benchmarking tests run with FogLight show that it can reduce the search space by up to 98%, after which the accelerated search for highly accurate results is guaranteed. Using FogLight, several pathways between eight given pairs of metabolites are found, of which the pathways from CO2 to ethanol are specifically discussed. Additionally, in comparison with three path-finding tools, namely PHT, FMM and RouteSearch, FogLight can find shorter and more pathways for attempted source-target metabolite pairs. CONTACT: szamani@aut.ac.ir, gholamreza.bidkhori@vtt.fi. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mehrshad Khosraviani
- Department of Computer Engineering & IT, Amirkabir University of Technology, Tehran, Iran
| | - Morteza Saheb Zamani
- Department of Computer Engineering & IT, Amirkabir University of Technology, Tehran, Iran
| | - Gholamreza Bidkhori
- Department of Computer Engineering & IT, Amirkabir University of Technology, Tehran, Iran
| |
|
34
|
Karp PD, Latendresse M, Paley SM, Krummenacker M, Ong QD, Billington R, Kothari A, Weaver D, Lee T, Subhraveti P, Spaulding A, Fulcher C, Keseler IM, Caspi R. Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology. Brief Bioinform 2015; 17:877-90. [PMID: 26454094 DOI: 10.1093/bib/bbv079] [Citation(s) in RCA: 173] [Impact Index Per Article: 19.2] [Received: 05/07/2015] [Indexed: 11/15/2022] Open
Abstract
Pathway Tools is a bioinformatics software environment with a broad set of capabilities. The software provides genome-informatics tools such as a genome browser, sequence alignments, a genome-variant analyzer and comparative-genomics operations. It offers metabolic-informatics tools, such as metabolic reconstruction, quantitative metabolic modeling, prediction of reaction atom mappings and metabolic route search. Pathway Tools also provides regulatory-informatics tools, such as the ability to represent and visualize a wide range of regulatory interactions. This article outlines the advances in Pathway Tools in the past 5 years. Major additions include components for metabolic modeling, metabolic route search, computation of atom mappings and estimation of compound Gibbs free energies of formation; addition of editors for signaling pathways, for genome sequences and for cellular architecture; storage of gene essentiality data and phenotype data; display of multiple alignments, and of signaling and electron-transport pathways; and development of Python and web-services application programming interfaces. Scientists around the world have created more than 9800 Pathway/Genome Databases by using Pathway Tools, many of which are curated databases for important model organisms.
|