1
Upadhyay V, Boorla VS, Maranas CD. Rank-ordering of known enzymes as starting points for re-engineering novel substrate activity using a convolutional neural network. Metab Eng 2023; 78:171-182. [PMID: 37301359 DOI: 10.1016/j.ymben.2023.06.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 05/19/2023] [Accepted: 06/02/2023] [Indexed: 06/12/2023]
Abstract
Retro-biosynthetic approaches have made significant advances in predicting synthesis routes of target biofuel, bio-renewable or bio-active molecules. The use of only cataloged enzymatic activities limits the discovery of new production routes. Recent retro-biosynthetic algorithms increasingly use novel conversions that require altering the substrate or cofactor specificities of existing enzymes while connecting pathways leading to a target metabolite. However, identifying and re-engineering enzymes for desired novel conversions are currently the bottlenecks in implementing such designed pathways. Herein, we present EnzRank, a convolutional neural network (CNN) based approach, to rank-order existing enzymes in terms of their suitability to undergo successful protein engineering through directed evolution or de novo design towards a desired specific substrate activity. We train the CNN model on 11,800 known active enzyme-substrate pairs from the BRENDA database as positive samples and data generated by scrambling these pairs as negative samples using substrate dissimilarity between an enzyme's native substrate and all other molecules present in the dataset using Tanimoto similarity score. EnzRank achieves an average recovery rate of 80.72% and 73.08% for positive and negative pairs on test data after using a 10-fold holdout method for training and cross-validation. We further developed a web-based user interface (available at https://huggingface.co/spaces/vuu10/EnzRank) to predict enzyme-substrate activity using SMILES strings of substrates and enzyme sequence as input to allow convenient and easy-to-use access to EnzRank. In summary, this effort can aid de novo pathway design tools to prioritize starting enzyme re-engineering candidates for novel reactions as well as in predicting the potential secondary activity of enzymes in cell metabolism.
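The negative-sampling idea in this abstract rests on the Tanimoto similarity between substrates. The following is a minimal Python sketch of that score, assuming each molecular fingerprint is represented as a set of "on" bit indices. The toy fingerprints, the 0.3 cutoff, and the helper names `tanimoto` and `dissimilar_substrates` are illustrative assumptions and are not taken from EnzRank.

```python
def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0  # convention for two empty fingerprints
    return len(a & b) / len(a | b)


def dissimilar_substrates(native: frozenset, candidates: dict, threshold: float = 0.3) -> list:
    """Names of candidates whose similarity to the native substrate is below threshold.

    The threshold is a hypothetical cutoff chosen for this example only.
    """
    return sorted(
        name for name, fp in candidates.items() if tanimoto(native, fp) < threshold
    )


# Toy fingerprints (hypothetical bit indices, not derived from real molecules).
native = frozenset({1, 2, 3, 4})
candidates = {
    "close_analog": frozenset({1, 2, 3, 5}),      # shares 3 of 5 distinct bits
    "distant_molecule": frozenset({1, 7, 8, 9}),  # shares 1 of 7 distinct bits
    "unrelated": frozenset({10, 11}),             # shares no bits
}

# Candidates dissimilar enough to pair with the enzyme as negative samples.
negatives = dissimilar_substrates(native, candidates)
```

In practice the fingerprints would be computed from SMILES strings with a cheminformatics toolkit; the set-based form here only illustrates how the score separates close analogs from dissimilar molecules.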
Affiliation(s)
- Vikas Upadhyay
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Veda Sheersh Boorla
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
- Costas D Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
2
Sveshnikova A, MohammadiPeyhani H, Hatzimanikatis V. Computational tools and resources for designing new pathways to small molecules. Curr Opin Biotechnol 2022; 76:102722. [PMID: 35483185 DOI: 10.1016/j.copbio.2022.102722] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/01/2021] [Revised: 03/04/2022] [Accepted: 03/22/2022] [Indexed: 12/22/2022]
Abstract
The metabolic engineering community relies on computational methods for pathway design to produce important small molecules in microbial hosts. Metabolic network databases are continuously curated and updated with known and novel reactions that expand the known biochemistry based on different sets of enzymatic reaction rules. To address the complexity of the metabolic networks, elaborate methods were developed to transform them into computable graphs, navigate them, and construct the best possible pathways. However, the recent experimental research points to the new challenges and opportunities for the computational pathway design. Here, we review the most recent advances, especially in the last two years, in computational discovery of new pathways and their prospects for expanding metabolic capabilities. We draw attention to the potential ways of improvement for pathway design algorithms, including the expansion of Design-Build-Test-Learn cycle to novel compounds and reactions and the standardization for the reaction rules and metabolic reaction databases.
Affiliation(s)
- Anastasia Sveshnikova
  - Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
- Homa MohammadiPeyhani
  - Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
- Vassily Hatzimanikatis
  - Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
3
Expanding biochemical knowledge and illuminating metabolic dark matter with ATLASx. Nat Commun 2022; 13:1560. [PMID: 35322036 PMCID: PMC8943196 DOI: 10.1038/s41467-022-29238-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/19/2021] [Accepted: 03/07/2022] [Indexed: 12/23/2022] Open
Abstract
Metabolic “dark matter” describes currently unknown metabolic processes, which form a blind spot in our general understanding of metabolism and slow down the development of biosynthetic cell factories and naturally derived pharmaceuticals. Mapping the dark matter of metabolism remains an open challenge that can be addressed globally and systematically by existing computational solutions. In this work, we use 489 generalized enzymatic reaction rules to map both known and unknown metabolic processes around a biochemical database of 1.5 million biological compounds. We predict over 5 million reactions and integrate nearly 2 million naturally and synthetically derived compounds into the global network of biochemical knowledge, named ATLASx. ATLASx is available to researchers as a powerful online platform that supports the prediction and analysis of biochemical pathways and evaluates the biochemical vicinity of molecule classes (https://lcsb-databases.epfl.ch/Atlas2).
“Mapping the dark matter of metabolism remains an open challenge that can be addressed globally and systematically by existing computational solutions. Here the authors present ATLASx, a repository of known and predicted enzymatic reactions, connecting millions of compounds to help synthetic biologists and metabolic engineers to design and explore metabolic pathways.”
4
Wang L, Upadhyay V, Maranas CD. dGPredictor: Automated fragmentation method for metabolic reaction free energy prediction and de novo pathway design. PLoS Comput Biol 2021; 17:e1009448. [PMID: 34570771 PMCID: PMC8496854 DOI: 10.1371/journal.pcbi.1009448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2021] [Revised: 10/07/2021] [Accepted: 09/13/2021] [Indexed: 11/19/2022] Open
Abstract
Group contribution (GC) methods are conventionally used in thermodynamics analysis of metabolic pathways to estimate the standard Gibbs energy change (ΔrG′°) of enzymatic reactions from limited experimental measurements. However, these methods are limited by their dependence on manually curated groups and inability to capture stereochemical information, leading to low reaction coverage. Herein, we introduce an automated molecular fingerprint-based thermodynamic analysis tool called dGPredictor that enables the consideration of stereochemistry within metabolite structures and thus increases reaction coverage. dGPredictor has prediction accuracy comparable to existing GC methods and can capture Gibbs energy changes for isomerase and transferase reactions, which exhibit no overall group changes. We also demonstrate dGPredictor’s ability to predict the Gibbs energy change for novel reactions and seamless integration within de novo metabolic pathway design tools such as novoStoic for safeguarding against the inclusion of reaction steps with infeasible directionalities. To facilitate easy access to dGPredictor, we developed a graphical user interface to predict the standard Gibbs energy change for reactions at various pH and ionic strengths. The tool allows customized user input of known metabolites as KEGG IDs and novel metabolites as InChI strings (https://github.com/maranasgroup/dGPredictor).
The standard Gibbs energy change is commonly used to check for the feasibility of enzyme-catalyzed reactions as thermodynamics plays a crucial role in pathway design for biochemical synthesis. The group contribution methods using expert-defined functional groups have been extensively used for estimating standard Gibbs energy change. Here, we introduce a molecular fingerprint-based thermodynamic tool, dGPredictor, that enables distinguishing between (stereo)isomers in metabolic reactions, leading to improved reaction coverage and prediction accuracy comparable to GC methods. dGPredictor can also be used alongside de novo pathway design tools to ensure the correct directionality of chosen reaction steps. We applied and tested dGPredictor on reactions from the KEGG database and applied it to screen an isobutanol synthesis pathway design. An open-source, user-friendly web interface is provided to facilitate easy access for standard Gibbs energy change of reactions at different pH values (https://github.com/maranasgroup/dGPredictor).
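The directionality check mentioned in this abstract can be illustrated with the textbook relation ΔrG′ = ΔrG′° + RT ln Q, where a negative ΔrG′ indicates that the forward direction is thermodynamically feasible. The sketch below is not dGPredictor's method, which estimates ΔrG′° itself; it only shows how such an estimate might be used downstream. The function names, the example value of ΔrG′°, the concentrations, and the assumption that every stoichiometric coefficient equals 1 are all hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)


def transformed_gibbs_energy(dg0_prime, substrate_concs, product_concs, temperature_k=298.15):
    """Return dG' = dG'0 + R*T*ln(Q) in kJ/mol.

    Concentrations are in mol/L relative to a 1 M standard state, and all
    stoichiometric coefficients are assumed to be 1 (illustrative simplification).
    """
    q = math.prod(product_concs) / math.prod(substrate_concs)
    return dg0_prime + R_KJ_PER_MOL_K * temperature_k * math.log(q)


def is_forward_feasible(dg0_prime, substrate_concs, product_concs, temperature_k=298.15):
    """A reaction step can proceed forward only if its dG' is negative."""
    return transformed_gibbs_energy(dg0_prime, substrate_concs, product_concs, temperature_k) < 0.0


dg0 = 5.0  # hypothetical standard Gibbs energy change, kJ/mol

# At unit concentrations Q = 1, so dG' equals the standard value.
at_standard = transformed_gibbs_energy(dg0, [1.0], [1.0])

# Keeping the product concentration low pulls an unfavorable step forward.
with_low_product = transformed_gibbs_energy(dg0, [1e-2], [1e-5])
```

With these example numbers the step is infeasible at unit concentrations but becomes feasible once the product is held three orders of magnitude below the substrate, which is why pathway design tools screen directionality under assumed concentration ranges rather than on ΔrG′° alone.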
Affiliation(s)
- Lin Wang
  - Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Vikas Upadhyay
  - Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Costas D. Maranas
  - Department of Chemical Engineering, Pennsylvania State University, University Park, Pennsylvania, United States of America
5
6
Integrating thermodynamic and enzymatic constraints into genome-scale metabolic models. Metab Eng 2021; 67:133-144. [PMID: 34174426 DOI: 10.1016/j.ymben.2021.06.005] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 08/31/2020] [Revised: 03/04/2021] [Accepted: 06/21/2021] [Indexed: 12/23/2022]
Abstract
Stoichiometric genome-scale metabolic network models (GEMs) have been widely used to predict metabolic phenotypes. In addition to stoichiometric ratios, other constraints such as enzyme availability and thermodynamic feasibility can also limit the phenotype solution space. Extended GEM models considering either enzymatic or thermodynamic constraints have been shown to improve prediction accuracy. In this paper, we propose a novel method that integrates both enzymatic and thermodynamic constraints in a single Pyomo modeling framework (ETGEMs). We applied this method to construct the EcoETM (E. coli metabolic model with enzymatic and thermodynamic constraints). Using this model, we calculated the optimal pathways for cellular growth and the production of 22 metabolites. When comparing the results with those of iML1515 and models with one of the two constraints, we observed that many thermodynamically unfavorable and/or high enzyme cost pathways were excluded from EcoETM. For example, the synthesis pathway of carbamoyl-phosphate (Cbp) from iML1515 is both thermodynamically unfavorable and enzymatically costly. After introducing the new constraints, the production pathways and yields of several Cbp-derived products (e.g. L-arginine, orotate) calculated using EcoETM were more realistic. The results of this study demonstrate the great application potential of metabolic models with multiple constraints for pathway analysis and phenotype prediction.
7
Wang L, Maranas CD. Computationally Prospecting Potential Pathways from Lignin Monomers and Dimers toward Aromatic Compounds. ACS Synth Biol 2021; 10:1064-1076. [PMID: 33877818 DOI: 10.1021/acssynbio.0c00598] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/28/2023]
Abstract
The heterogeneity of the aromatic products originating from lignin catalytic depolymerization remains one of the major challenges associated with lignin valorization. Microbes have evolved catabolic pathways that can funnel heterogeneous intermediates to a few central aromatic products. These aromatic compounds can subsequently undergo intra- or extradiol ring opening to produce value-added chemicals. However, such funneling pathways are only partially characterized for a few organisms such as Sphingobium sp. SYK-6 and Pseudomonas putida KT2440. Herein, we apply the de novo pathway design tool (novoStoic) to computationally prospect possible ways of funneling lignin-derived mono- and biaryls. novoStoic employs reaction rules between molecular moieties to hypothesize de novo conversions by flagging known enzymes that carry out the same biotransformation on the most similar substrate. Both reaction rules and known reactions are then deployed by novoStoic to identify a mass-balanced biochemical network that converts a source to a target metabolite while minimizing the number of de novo steps. We demonstrate the application of novoStoic for (i) designing alternative pathways of funneling S, G, and H lignin monomers, and (ii) exploring cleavage pathways of β-1 and β-β dimers. By exploring the uncharted chemical space afforded by enzyme promiscuity, novoStoic can help predict previously unknown native pathways leveraging enzyme promiscuity and propose new carbon/energy efficient lignin funneling pathways with few heterologous enzymes.
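The requirement that a designed network be mass-balanced can be illustrated with a simple elemental balance check on a single overall conversion. This is only a sketch of the balance condition, not novoStoic's pathway search; the helper names and the data layout (negative coefficients for substrates, positive for products) are assumptions made for the example, and glucose fermentation to ethanol is used as a familiar stand-in rather than a lignin funneling reaction.

```python
from collections import Counter


def net_element_balance(stoichiometry, formulas):
    """Net count of each element across a reaction.

    Substrates carry negative coefficients and products positive ones, so a
    balanced reaction yields an empty dictionary.
    """
    balance = Counter()
    for metabolite, coeff in stoichiometry.items():
        for element, count in formulas[metabolite].items():
            balance[element] += coeff * count
    return {element: n for element, n in balance.items() if n != 0}


def is_mass_balanced(stoichiometry, formulas):
    """True when every element is conserved."""
    return not net_element_balance(stoichiometry, formulas)


# Elemental compositions of the example metabolites.
formulas = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "ethanol": {"C": 2, "H": 6, "O": 1},
    "CO2": {"C": 1, "O": 2},
}

# glucose -> 2 ethanol + 2 CO2 conserves C, H, and O.
fermentation = {"glucose": -1, "ethanol": 2, "CO2": 2}

# Dropping one CO2 leaves carbon and oxygen unaccounted for.
unbalanced = {"glucose": -1, "ethanol": 2, "CO2": 1}
```

A check of this kind would typically be applied to the overall source-to-target conversion as well as to each hypothesized step, so that proposed de novo reactions do not create or destroy atoms.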
Affiliation(s)
- Lin Wang
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, United States
- Costas D. Maranas
  - Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, United States
8
Hafner J, Payne J, MohammadiPeyhani H, Hatzimanikatis V, Smolke C. A computational workflow for the expansion of heterologous biosynthetic pathways to natural product derivatives. Nat Commun 2021; 12:1760. [PMID: 33741955 PMCID: PMC7979880 DOI: 10.1038/s41467-021-22022-5] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 08/14/2020] [Accepted: 02/24/2021] [Indexed: 01/31/2023] Open
Abstract
Plant natural products (PNPs) and their derivatives are important but underexplored sources of pharmaceutical molecules. To access this untapped potential, the reconstitution of heterologous PNP biosynthesis pathways in engineered microbes provides a valuable starting point to explore and produce novel PNP derivatives. Here, we introduce a computational workflow to systematically screen the biochemical vicinity of a biosynthetic pathway for pharmaceutical compounds that could be produced by derivatizing pathway intermediates. We apply our workflow to the biosynthetic pathway of noscapine, a benzylisoquinoline alkaloid (BIA) with a long history of medicinal use. Our workflow identifies pathways and enzyme candidates for the production of (S)-tetrahydropalmatine, a known analgesic and anxiolytic, and three additional derivatives. We then construct pathways for these compounds in yeast, resulting in platforms for de novo biosynthesis of BIA derivatives and demonstrating the value of cheminformatic tools to predict reactions, pathways, and enzymes in synthetic biology and metabolic engineering.
Affiliation(s)
- Jasmin Hafner
  - Laboratory of Computational Systems Biotechnology, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- James Payne
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
- Homa MohammadiPeyhani
  - Laboratory of Computational Systems Biotechnology, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Vassily Hatzimanikatis
  - Laboratory of Computational Systems Biotechnology, Swiss Federal Institute of Technology (EPFL), Lausanne, Switzerland
- Christina Smolke
  - Department of Bioengineering, Stanford University, Stanford, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
9
Saa PA, Cortés MP, López J, Bustos D, Maass A, Agosin E. Expanding Metabolic Capabilities Using Novel Pathway Designs: Computational Tools and Case Studies. Biotechnol J 2019; 14:e1800734. [DOI: 10.1002/biot.201800734] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 12/19/2018] [Revised: 04/22/2019] [Indexed: 11/10/2022]
Affiliation(s)
- Pedro A. Saa
  - Departamento de Ingeniería Química y Bioprocesos, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, 7820436 Santiago, Chile
- María P. Cortés
  - Centro de Modelamiento Matemático, Universidad de Chile, Av. Beaucheff 851, Santiago 8370456, Chile
  - Centro de Regulación del Genoma, Universidad de Chile, Av. Beaucheff 851, Santiago 8370456, Chile
- Javiera López
  - Centro de Aromas y Sabores, DICTUC S.A, Av. Vicuña Mackenna 4860, Santiago 7820436, Chile
- Diego Bustos
  - Centro de Aromas y Sabores, DICTUC S.A, Av. Vicuña Mackenna 4860, Santiago 7820436, Chile
- Alejandro Maass
  - Centro de Modelamiento Matemático, Universidad de Chile, Av. Beaucheff 851, Santiago 8370456, Chile
  - Departamento de Ingeniería Matemática, Universidad de Chile, Av. Beaucheff 851, Santiago 8370456, Chile
- Eduardo Agosin
  - Departamento de Ingeniería Química y Bioprocesos, Pontificia Universidad Católica de Chile, Av. Vicuña Mackenna 4860, 7820436 Santiago, Chile