1
Liu W, Wang P, Zhuang X, Ling Y, Liu H, Wang S, Yu H, Ma L, Jiang Y, Zhao G, Yan X, Zhou Z, Zhang G. RDBSB: a database for catalytic bioparts with experimental evidence. Nucleic Acids Res 2024:gkae844. [PMID: 39360609 DOI: 10.1093/nar/gkae844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 09/10/2024] [Accepted: 09/17/2024] [Indexed: 10/04/2024] Open
Abstract
Catalytic bioparts are fundamental to the design, construction and optimization of biological systems for specific metabolic pathways. However, the functional characterization information of these bioparts is frequently dispersed across multiple databases and literature sources, posing significant challenges to the effective design and optimization of specific chassis or cell factories. We developed the Registry and Database of Bioparts for Synthetic Biology (RDBSB), a comprehensive resource encompassing 83 193 curated catalytic bioparts with experimental evidence. RDBSB offers their detailed qualitative and quantitative catalytic information, including critical parameters such as activities, substrates, optimal pH and temperature, and chassis specificity. The platform features an interactive search engine, visualization tools and analysis utilities such as biopart finder, structure prediction and pathway design tools. Additionally, RDBSB promotes community engagement through a catalytic bioparts submission system to facilitate rapid data sharing and utilization. To date, RDBSB has supported the contribution of >1000 catalytic bioparts. We anticipate that the database will significantly enhance the resources available for pathway design in synthetic biology and serve as an essential tool for researchers. RDBSB is freely available at https://www.biosino.org/rdbsb/.
Affiliation(s)
- Wan Liu
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
- Pingping Wang
- CAS-Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, 300 Feng Lin Road, Shanghai 200032, China
- Xinhao Zhuang
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
- Yunchao Ling
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
- Haiyan Liu
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230026, China
- Sheng Wang
- Shanghai Zelixir Biotech Company Ltd., 4/F, Youyue Building, No. 298, Xiangke Road, Pudong New District, Shanghai 200030, China
- Haihan Yu
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230026, China
- Liangxiao Ma
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
- Yuguo Jiang
- Shanghai Key Laboratory of Plant Functional Genomics and Resources, Shanghai Chenshan Botanical Garden, and Chenshan Science Research Center, CAS Center for Excellence in Molecular Plant Sciences (CEMPS), Chinese Academy of Sciences (CAS), 3888 Chenhua Road, Shanghai 201602, China
- Guoping Zhao
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
- School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 1 Sub-lane Xiangshan, Hangzhou 310024, China
- Xing Yan
- CAS-Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, 300 Feng Lin Road, Shanghai 200032, China
- Zhihua Zhou
- CAS-Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, 300 Feng Lin Road, Shanghai 200032, China
- Guoqing Zhang
- National Genomics Data Center & Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, 320 Yueyang Road, Shanghai 200031, China
2
Heid E, Probst D, Green WH, Madsen GKH. EnzymeMap: curation, validation and data-driven prediction of enzymatic reactions. Chem Sci 2023; 14:14229-14242. [PMID: 38098707 PMCID: PMC10718068 DOI: 10.1039/d3sc02048g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 11/21/2023] [Indexed: 12/17/2023] Open
Abstract
Enzymatic reactions are an ecofriendly, selective, and versatile addition, sometimes even alternative to organic reactions for the synthesis of chemical compounds such as pharmaceuticals or fine chemicals. To identify suitable reactions, computational models to predict the activity of enzymes on non-native substrates, to perform retrosynthetic pathway searches, or to predict the outcomes of reactions including regio- and stereoselectivity are becoming increasingly important. However, current approaches are substantially hindered by the limited amount of available data, especially if balanced and atom mapped reactions are needed and if the models feature machine learning components. We therefore constructed a high-quality dataset (EnzymeMap) by developing a large set of correction and validation algorithms for recorded reactions in the literature and showcase its significant positive impact on machine learning models of retrosynthesis, forward prediction, and regioselectivity prediction, outperforming previous approaches by a large margin. Our dataset allows for deep learning models of enzymatic reactions with unprecedented accuracy, and is freely available online.
Affiliation(s)
- Esther Heid
- Institute of Materials Chemistry, TU Wien 1060 Vienna Austria
- Department of Chemical Engineering, Massachusetts Institute of Technology Cambridge Massachusetts 02139 USA
- William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology Cambridge Massachusetts 02139 USA
3
Prešern U, Goličnik M. Enzyme Databases in the Era of Omics and Artificial Intelligence. Int J Mol Sci 2023; 24:16918. [PMID: 38069254 PMCID: PMC10707154 DOI: 10.3390/ijms242316918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 11/24/2023] [Accepted: 11/26/2023] [Indexed: 12/18/2023] Open
Abstract
Enzyme research is important for the development of various scientific fields such as medicine and biotechnology. Enzyme databases facilitate this research by providing a wide range of information relevant to research planning and data analysis. Over the years, various databases that cover different aspects of enzyme biology (e.g., kinetic parameters, enzyme occurrence, and reaction mechanisms) have been developed. Most of the databases are curated manually, which improves reliability of the information; however, such curation cannot keep pace with the exponential growth in published data. Lack of data standardization is another obstacle for data extraction and analysis. Improving machine readability of databases is especially important in the light of recent advances in deep learning algorithms that require big training datasets. This review provides information regarding the current state of enzyme databases, especially in relation to the ever-increasing amount of generated research data and recent advancements in artificial intelligence algorithms. Furthermore, it describes several enzyme databases, providing the reader with necessary information for their use.
Affiliation(s)
- Marko Goličnik
- Institute of Biochemistry and Molecular Genetics, Faculty of Medicine, University of Ljubljana, Vrazov trg 2, 1000 Ljubljana, Slovenia
4
Yu T, Boob AG, Volk MJ, Liu X, Cui H, Zhao H. Machine learning-enabled retrobiosynthesis of molecules. Nat Catal 2023. [DOI: 10.1038/s41929-022-00909-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
5
Walther D. Specifics of Metabolite-Protein Interactions and Their Computational Analysis and Prediction. Methods Mol Biol 2023; 2554:179-197. [PMID: 36178627 DOI: 10.1007/978-1-0716-2624-5_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Computational approaches to the characterization and prediction of compound-protein interactions have a long research history and are well established, driven primarily by the needs of drug development. While, in principle, many of the computational methods developed in the context of drug development can also be applied directly to the investigation of metabolite-protein interactions, the interactions of metabolites with proteins (enzymes) are characterized by a number of particularities that result from their natural evolutionary origin and their biological and biochemical roles, as well as from a different problem setting when investigating them. In this review, these special aspects will be highlighted and recent research on them and developed computational approaches presented, along with available resources. They concern, among others, binding promiscuity, allostery, the role of posttranslational modifications, molecular steering and crowding effects, and metabolic conversion rate predictions. Recent breakthroughs in the field of protein structure prediction and newly developed machine learning techniques are being discussed as a tremendous opportunity for developing a more detailed molecular understanding of metabolism.
Affiliation(s)
- Dirk Walther
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.
6
Merging enzymatic and synthetic chemistry with computational synthesis planning. Nat Commun 2022; 13:7747. [PMID: 36517480 PMCID: PMC9750992 DOI: 10.1038/s41467-022-35422-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 11/30/2022] [Indexed: 12/15/2022] Open
Abstract
Synthesis planning programs trained on chemical reaction data can design efficient routes to new molecules of interest, but are limited in their ability to leverage rare chemical transformations. This challenge is acute for enzymatic reactions, which are valuable due to their selectivity and sustainability but are few in number. We report a retrosynthetic search algorithm using two neural network models for retrosynthesis (one covering 7984 enzymatic transformations and one covering 163,723 synthetic transformations) that balances the exploration of enzymatic and synthetic reactions to identify hybrid synthesis plans. This approach extends the space of retrosynthetic moves by thousands of uniquely enzymatic one-step transformations, discovers routes to molecules for which synthetic or enzymatic searches find none, and designs shorter routes for others. Application to (-)-Δ9 tetrahydrocannabinol (THC) (dronabinol) and R,R-formoterol (arformoterol) illustrates how our strategy facilitates the replacement of metal catalysis, high step counts, or costly enantiomeric resolution with more elegant hybrid proposals.
7
Ding X, Zheng Z, Zhao G, Wang L, Wang H, Yang Q, Zhang M, Li L, Wang P. Bottom-up synthetic biology approach for improving the efficiency of menaquinone-7 synthesis in Bacillus subtilis. Microb Cell Fact 2022; 21:101. [PMID: 35643569 PMCID: PMC9148487 DOI: 10.1186/s12934-022-01823-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/16/2022] [Accepted: 05/13/2022] [Indexed: 11/10/2022] Open
Abstract
Background
Menaquinone-7 (MK-7), which is associated with complex and tightly regulated pathways and redox imbalances, is produced at low titres in Bacillus subtilis. Synthetic biology provides a rational engineering principle for the transcriptional optimisation of key enzymes and the artificial creation of cofactor regeneration systems without regulatory interference. This holds great promise for alleviating pathway bottlenecks and improving the efficiency of carbon and energy utilisation.
Results
We used a bottom-up synthetic biology approach for the synthetic redesign of central carbon and to improve the adaptability between material and energy metabolism in MK-7 synthesis pathways. First, the rate-limiting enzymes, 1-deoxyxylulose-5-phosphate synthase (DXS), isopentenyl-diphosphate delta-isomerase (Fni), 1-deoxyxylulose-5-phosphate reductase (DXR), isochorismate synthase (MenF), and 3-deoxy-7-phosphoheptulonate synthase (AroA) in the MK-7 pathway were sequentially overexpressed. Promoter engineering and fusion tags were used to overexpress the key enzyme MenA, and the titre of MK-7 was 39.01 mg/L. Finally, after stoichiometric calculation and optimisation of the cofactor regeneration pathway, we constructed two NADPH regeneration systems, enhanced the endogenous cofactor regeneration pathway, and introduced a heterologous NADH kinase (Pos5P) to increase the availability of NADPH for MK-7 biosynthesis. The strain expressing pos5P was more efficient in converting NADH to NADPH and had excellent MK-7 synthesis ability. Following three Design-Build-Test-Learn cycles, the titre of MK-7 after flask fermentation reached 53.07 mg/L, which was 4.52 times that of B. subtilis 168. Additionally, the artificially constructed cofactor regeneration system reduced the amount of NADH-dependent by-product lactate in the fermentation broth by 9.15%. This resulted in decreased energy loss and improved carbon conversion.
Conclusions
In summary, a "high-efficiency, low-carbon, cofactor-recycling" MK-7 synthetic strain was constructed, and the strategy used in this study can be generally applied for constructing high-efficiency synthesis platforms for other terpenoids, laying the foundation for the large-scale production of high-value MK-7 as well as terpenoids.
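The titres reported in this abstract imply a parent-strain baseline that the abstract does not state directly. A minimal Python sketch of that arithmetic follows; the variable names are illustrative, and the derived values are back-calculations from the reported figures rather than numbers given by the authors.

```python
# Figures quoted in the abstract above.
final_titre_mg_per_l = 53.07         # MK-7 titre after three Design-Build-Test-Learn cycles
fold_over_parent = 4.52              # reported ratio relative to B. subtilis 168
intermediate_titre_mg_per_l = 39.01  # titre reported after MenA overexpression

# Implied titre of the parent strain (derived here, not reported in the abstract).
implied_parent_titre = final_titre_mg_per_l / fold_over_parent

# Ratio of the final titre to the intermediate titre (also derived here).
final_over_intermediate = final_titre_mg_per_l / intermediate_titre_mg_per_l

print(f"implied parent titre: {implied_parent_titre:.2f} mg/L")
print(f"final / intermediate titre: {final_over_intermediate:.2f}x")
```

Read this way, most of the reported improvement over the parent strain is already present at the intermediate stage, with the later cycles contributing a smaller additional factor.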
8
ARBRE: Computational resource to predict pathways towards industrially important aromatic compounds. Metab Eng 2022; 72:259-274. [DOI: 10.1016/j.ymben.2022.03.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/10/2021] [Revised: 03/15/2022] [Accepted: 03/26/2022] [Indexed: 12/16/2022]
9
Curating a comprehensive set of enzymatic reaction rules for efficient novel biosynthetic pathway design. Metab Eng 2021; 65:79-87. [PMID: 33662575 DOI: 10.1016/j.ymben.2021.02.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/25/2020] [Revised: 02/05/2021] [Accepted: 02/23/2021] [Indexed: 01/29/2023]
Abstract
Enzyme substrate promiscuity has significant implications for metabolic engineering. The ability to predict the space of possible enzymatic side reactions is crucial for elucidating underground metabolic networks in microorganisms, as well as harnessing novel biosynthetic capabilities of enzymes to produce desired chemicals. Reaction rule-based cheminformatics platforms have been implemented to computationally enumerate possible promiscuous reactions, relying on existing knowledge of enzymatic transformations to inform novel reactions. However, past versions of curated reaction rules have been limited by a lack of comprehensiveness in representing all possible transformations, as well as the need to prune rules to enhance computational efficiency in pathway expansion. To this end, we curated a set of 1224 most generalized reaction rules, automatically abstracted from atom-mapped MetaCyc reactions and verified to uniquely cover all common enzymatic transformations. We developed a framework to systematically identify and correct redundancies and errors in the curation process, resulting in a minimal, yet comprehensive, rule set. These reaction rules were capable of reproducing more than 85% of all reactions in the KEGG and BRENDA databases, for which a large fraction of reactions is not present in MetaCyc. Our rules exceed all previously published rule sets for which reproduction was possible in this coverage analysis, which allows for the exploration of a larger space of known enzymatic transformations. By leveraging the entire knowledge of possible metabolic reactions through generalized enzymatic reaction rules, we are able to better utilize underground metabolic pathways and accelerate novel biosynthetic pathway design to enable bioproduction towards a wider range of new molecules.
10
The Metano Modeling Toolbox MMTB: An Intuitive, Web-Based Toolbox Introduced by Two Use Cases. Metabolites 2021; 11:metabo11020113. [PMID: 33671140 PMCID: PMC7923039 DOI: 10.3390/metabo11020113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2021] [Revised: 02/12/2021] [Accepted: 02/15/2021] [Indexed: 11/17/2022] Open
Abstract
Genome-scale metabolic models are of high interest in a number of different research fields. Flux balance analysis (FBA) and other mathematical methods allow the prediction of the steady-state behavior of metabolic networks under different environmental conditions. However, many existing applications for flux optimizations do not provide a metabolite-centric view on fluxes. Metano is a standalone, open-source toolbox for the analysis and refinement of metabolic models. While flux distributions in metabolic networks are predominantly analyzed from a reaction-centric point of view, the Metano methods of split-ratio analysis and metabolite flux minimization also allow a metabolite-centric view on flux distributions. In addition, we present MMTB (Metano Modeling Toolbox), a web-based toolbox for metabolic modeling including a user-friendly interface to Metano methods. MMTB assists during bottom-up construction of metabolic models by integrating reaction and enzymatic annotation data from different databases. Furthermore, MMTB is especially designed for non-experienced users by providing an intuitive interface to the most commonly used modeling methods and offering novel visualizations. Additionally, MMTB allows users to upload their models, which can in turn be explored and analyzed by the community. We introduce MMTB by two use cases, involving a published model of Corynebacterium glutamicum and a newly created model of Phaeobacter inhibens.
11
Moretti S, Tran VDT, Mehl F, Ibberson M, Pagni M. MetaNetX/MNXref: unified namespace for metabolites and biochemical reactions in the context of metabolic models. Nucleic Acids Res 2021; 49:D570-D574. [PMID: 33156326 PMCID: PMC7778905 DOI: 10.1093/nar/gkaa992] [Citation(s) in RCA: 63] [Impact Index Per Article: 21.0] [Received: 09/15/2020] [Revised: 10/09/2020] [Accepted: 10/27/2020] [Indexed: 12/28/2022] Open
Abstract
MetaNetX/MNXref is a reconciliation of metabolites and biochemical reactions providing cross-links between major public biochemistry and Genome-Scale Metabolic Network (GSMN) databases. The new release brings several improvements with respect to the quality of the reconciliation, with particular attention dedicated to preserving the intrinsic properties of GSMN models. The MetaNetX website (https://www.metanetx.org/) provides access to the full database and online services. A major improvement is for mapping of user-provided GSMNs to MNXref, which now provides diagnostic messages about model content. In addition to the website and flat files, the resource can now be accessed through a SPARQL endpoint (https://rdf.metanetx.org).
Affiliation(s)
- Sébastien Moretti
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Van Du T Tran
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Florence Mehl
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Mark Ibberson
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
- Marco Pagni
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
12
Sinha S, Lynn AM, Desai DK. Implementation of homology based and non-homology based computational methods for the identification and annotation of orphan enzymes: using Mycobacterium tuberculosis H37Rv as a case study. BMC Bioinformatics 2020; 21:466. [PMID: 33076816 PMCID: PMC7574302 DOI: 10.1186/s12859-020-03794-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/23/2019] [Accepted: 10/01/2020] [Indexed: 02/06/2023] Open
Abstract
Background
Homology based methods are one of the most important and widely used approaches for functional annotation of high-throughput microbial genome data. A major limitation of these methods is the absence of well-characterized sequences for certain functions. The non-homology methods based on the context and the interactions of a protein are very useful for identifying missing metabolic activities and functional annotation in the absence of significant sequence similarity. In the current work, we employ both homology and context-based methods, incrementally, to identify local holes and chokepoints, whose presence in the Mycobacterium tuberculosis genome is indicated based on its interaction with known proteins in a metabolic network context, but have not been annotated. We have developed two computational procedures using network theory to identify orphan enzymes (‘Hole finding protocol’) coupled with the identification of candidate proteins for the predicted orphan enzyme (‘Hole filling protocol’). We propose an integrated interaction score based on scores from the STRING database to identify candidate protein sequences for the orphan enzymes from M. tuberculosis, as a case study, which are most likely to perform the missing function.
Results
The application of an automated homology-based enzyme identification protocol, ModEnzA, on M. tuberculosis genome yielded 56 novel enzyme predictions. We further predicted 74 putative local holes, 6 choke points, and 3 high confidence local holes in the genome using ‘Hole finding protocol’. The ‘Hole-filling protocol’ was validated on the E. coli genome using artificial in-silico enzyme knockouts where our method showed 25% increased accuracy, compared to other methods, in assigning the correct sequence for the knocked-out enzyme amongst the top 10 ranks. The method was further validated on 8 additional genomes.
Conclusions
We have developed methods that can be generalized to augment homology-based annotation to identify missing enzyme coding genes and to predict a candidate protein for them. For pathogens such as M. tuberculosis, this work holds significance in terms of increasing the protein repertoire and thereby, the potential for identifying novel drug targets.
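The validation described in this abstract scores whether the correct sequence for a knocked-out enzyme appears among the top 10 ranked candidates. A minimal Python sketch of such a top-k accuracy measure is given below. The function name, data layout, and toy knockouts are hypothetical illustrations and are not taken from the paper; the authors' actual scoring procedure may differ.

```python
def top_k_accuracy(ranked_candidates, true_genes, k=10):
    """Fraction of knockouts whose true gene is among the first k ranked candidates."""
    if not ranked_candidates:
        raise ValueError("no knockouts to score")
    hits = sum(
        1
        for knockout, ranking in ranked_candidates.items()
        if true_genes[knockout] in ranking[:k]
    )
    return hits / len(ranked_candidates)


# Hypothetical toy data: candidate genes ranked best-first for each in-silico knockout.
ranked = {
    "knockout_1": ["geneA", "geneB", "geneC"],
    "knockout_2": ["geneX", "geneY", "geneZ"],
    "knockout_3": ["geneM", "geneN", "geneO"],
    "knockout_4": ["geneP", "geneQ", "geneR"],
}
# Hypothetical ground truth: the gene that was actually knocked out.
truth = {
    "knockout_1": "geneB",
    "knockout_2": "geneZ",
    "knockout_3": "geneQ",  # never proposed for this knockout, so always a miss
    "knockout_4": "geneP",
}

print(top_k_accuracy(ranked, truth, k=2))
print(top_k_accuracy(ranked, truth, k=3))
```

Comparing two methods under this measure amounts to computing the same fraction for each method's rankings over the same set of knockouts.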
Affiliation(s)
- Swati Sinha
- Bioinformatics Institute, Agency for Science, Technology, and Research (A*Star), 30 Biopolis Street, #07-01 Matrix, Singapore, 138671, Republic of Singapore
- Andrew M Lynn
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Dhwani K Desai
- Department of Biology and Department of Pharmacology, Dalhousie University, Halifax, NS, B3H4R2, Canada
- School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
13
Chen F, Yuan L, Ding S, Tian Y, Hu QN. Data-driven rational biosynthesis design: from molecules to cell factories. Brief Bioinform 2020; 21:1238-1248. [PMID: 31243440 DOI: 10.1093/bib/bbz065] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 01/25/2019] [Revised: 04/28/2019] [Accepted: 05/08/2019] [Indexed: 11/12/2022] Open
Abstract
A proliferation of chemical, reaction and enzyme databases, new computational methods and software tools for data-driven rational biosynthesis design have emerged in recent years. With the coming of the era of big data, particularly in the bio-medical field, data-driven rational biosynthesis design could potentially be useful to construct target-oriented chassis organisms. Engineering the complicated metabolic systems of chassis organisms to biosynthesize target molecules from inexpensive biomass is the main goal of cell factory design. The process of data-driven cell factory design could be divided into several parts: (1) target molecule selection; (2) metabolic reaction and pathway design; (3) prediction of novel enzymes based on protein domain and structure transformation of biosynthetic reactions; (4) construction of large-scale DNA for metabolic pathways; and (5) DNA assembly methods and visualization tools. The construction of a one-stop cell factory system could achieve automated design from the molecule level to the chassis level. In this article, we outline data-driven rational biosynthesis design steps and provide an overview of related tools in individual steps.
Affiliation(s)
- Fu Chen
- College of Biotechnology, Tianjin University of Science and Technology, Tianjin, People's Republic of China
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, People's Republic of China
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Le Yuan
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, People's Republic of China
- University of Chinese Academy of Sciences, Beijing, People's Republic of China
- Shaozhen Ding
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
- Yu Tian
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, People's Republic of China
- University of Chinese Academy of Sciences, Beijing, People's Republic of China
- Qian-Nan Hu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, People's Republic of China
14
Koblitz J, Schomburg D, Neumann-Schaal M. MetaboMAPS: Pathway sharing and multi-omics data visualization in metabolic context. F1000Res 2020; 9:288. [PMID: 32765840 PMCID: PMC7383707 DOI: 10.12688/f1000research.23427.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 07/13/2020] [Indexed: 01/08/2023] Open
Abstract
Metabolic pathways are an important part of systems biology research since they illustrate complex interactions between metabolites, enzymes, and regulators. Pathway maps are drawn to elucidate metabolism or to set data in a metabolic context. We present MetaboMAPS, a web-based platform to visualize numerical data on individual metabolic pathway maps. Metabolic maps can be stored, distributed and downloaded in SVG-format. MetaboMAPS was designed for users without computational background and supports pathway sharing without strict conventions. In addition to existing applications that established standards for well-studied pathways, MetaboMAPS offers a niche for individual, customized pathways beyond common knowledge, supporting ongoing research by creating publication-ready visualizations of experimental data.
Affiliation(s)
- Julia Koblitz
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Dietmar Schomburg
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Meina Neumann-Schaal
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Leibniz-Institut DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
15
Koblitz J, Schomburg D, Neumann-Schaal M. MetaboMAPS: Pathway sharing and multi-omics data visualization in metabolic context. F1000Res 2020; 9:288. [PMID: 32765840 PMCID: PMC7383707 DOI: 10.12688/f1000research.23427.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 04/17/2020] [Indexed: 03/30/2024] Open
Abstract
Metabolic pathways are an important part of systems biology research since they illustrate complex interactions between metabolites, enzymes, and regulators. Pathway maps are drawn to elucidate metabolism or to set data in a metabolic context. We present MetaboMAPS, a web-based platform to visualize numerical data on individual metabolic pathway maps. Metabolic maps can be stored, distributed and downloaded in SVG-format. MetaboMAPS was designed for users without computational background and supports pathway sharing without strict conventions. In addition to existing applications that established standards for well-studied pathways, MetaboMAPS offers a niche for individual, customized pathways beyond common knowledge, supporting ongoing research by creating publication-ready visualizations of experimental data.
Affiliation(s)
- Julia Koblitz
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Dietmar Schomburg
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Meina Neumann-Schaal
- Braunschweig Integrated Centre of Systems Biology, Technische Universität Braunschweig, Braunschweig, 38106, Germany
- Leibniz-Institut DSMZ - German Collection of Microorganisms and Cell Cultures, Braunschweig, 38124, Germany
| |
|
16
|
Ji H, Xu Y, Lu H, Zhang Z. Deep MS/MS-Aided Structural-Similarity Scoring for Unknown Metabolite Identification. Anal Chem 2019; 91:5629-5637. [DOI: 10.1021/acs.analchem.8b05405] [Citation(s) in RCA: 36] [Impact Index Per Article: 7.2] [Indexed: 01/04/2023]
Affiliation(s)
- Hongchao Ji
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, People’s Republic of China
| | - Yamei Xu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, People’s Republic of China
| | - Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, People’s Republic of China
| | - Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, People’s Republic of China
| |
|
17
|
Griesemer M, Kimbrel JA, Zhou CE, Navid A, D'haeseleer P. Combining multiple functional annotation tools increases coverage of metabolic annotation. BMC Genomics 2018; 19:948. [PMID: 30567498 PMCID: PMC6299973 DOI: 10.1186/s12864-018-5221-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 08/20/2018] [Accepted: 11/05/2018] [Indexed: 12/15/2022] Open
Abstract
Background: Genome-scale metabolic modeling is a cornerstone of systems biology analysis of microbial organisms and communities, yet these genome-scale modeling efforts are invariably based on incomplete functional annotations. Annotated genomes typically contain 30–50% of genes without functional annotation, severely limiting our knowledge of the “parts lists” that the organisms have at their disposal. These incomplete annotations may be sufficient to derive a model of a core set of well-studied metabolic pathways that support growth in pure culture. However, pathways important for growth on unusual metabolites exchanged in complex microbial communities are often less understood, resulting in missing functional annotations in newly sequenced genomes. Results: Here, we present results on a comprehensive reannotation of 27 bacterial reference genomes, focusing on enzymes with EC numbers annotated by KEGG, RAST, EFICAz, and the BRENDA enzyme database, and on membrane transport annotations by TransportDB, KEGG and RAST. Our analysis shows that annotation using multiple tools can result in a drastically larger metabolic network reconstruction, adding on average 40% more EC numbers, 3–8 times more substrate-specific transporters, and 37% more metabolic genes. These results are even more pronounced for bacterial species that are phylogenetically distant from well-studied model organisms such as E. coli. Conclusions: Metabolic annotations are often incomplete and inconsistent. Combining multiple functional annotation tools can greatly improve genome coverage and metabolic network size, especially for non-model organisms and non-core pathways. Electronic supplementary material: The online version of this article (10.1186/s12864-018-5221-9) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Marc Griesemer
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
| | - Jeffrey A Kimbrel
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
| | - Carol E Zhou
- Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
| | - Ali Navid
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
| | - Patrik D'haeseleer
- Biosciences and Biotechnology Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
- Global Security Computing Applications Division, Lawrence Livermore National Laboratory, Livermore, CA, 94551, USA
| |
|
18
|
Jeffryes JG, Seaver SMD, Faria JP, Henry CS. A pathway for every product? Tools to discover and design plant metabolism. Plant Sci 2018; 273:61-70. [PMID: 29907310 DOI: 10.1016/j.plantsci.2018.03.025] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/21/2017] [Revised: 03/13/2018] [Accepted: 03/19/2018] [Indexed: 06/08/2023]
Abstract
The vast diversity of plant natural products is a powerful indication of the biosynthetic capacity of plant metabolism. Synthetic biology seeks to capitalize on this ability by understanding and reconfiguring the biosynthetic pathways that generate this diversity to produce novel products with improved efficiency. Here we review the algorithms and databases that presently support the design and manipulation of metabolic pathways in plants, starting from metabolic models of native biosynthetic pathways, progressing to novel combinations of known reactions, and finally proposing new reactions that may be carried out by existing enzymes. We show how these tools are useful for proposing new pathways as well as identifying side reactions that may affect engineering goals.
Affiliation(s)
- James G Jeffryes
- Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
| | - Samuel M D Seaver
- Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
| | - José P Faria
- Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
| | - Christopher S Henry
- Argonne National Laboratory, Mathematics and Computer Science Division, Argonne, IL, United States
| |
|
19
|
Wohlgemuth R. Horizons of Systems Biocatalysis and Renaissance of Metabolite Synthesis. Biotechnol J 2018; 13:e1700620. [DOI: 10.1002/biot.201700620] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 02/14/2018] [Revised: 04/26/2018] [Indexed: 12/12/2022]
Affiliation(s)
- Roland Wohlgemuth
- European Federation of Biotechnology, Section on Applied Biocatalysis (ESAB), Theodor-Heuss-Allee 25, Frankfurt am Main 60486, Germany
- Sigma-Aldrich, Member of Merck Group, Industriestrasse 25, Buchs 9470, Switzerland
| |
|
20
|
Kumar A, Wang L, Ng CY, Maranas CD. Pathway design using de novo steps through uncharted biochemical spaces. Nat Commun 2018; 9:184. [PMID: 29330441 PMCID: PMC5766603 DOI: 10.1038/s41467-017-02362-x] [Citation(s) in RCA: 53] [Impact Index Per Article: 8.8] [Received: 06/07/2017] [Accepted: 11/21/2017] [Indexed: 12/31/2022] Open
Abstract
Existing retrosynthesis tools generally traverse production routes from a source to a sink metabolite using known enzymes or de novo steps. Generally, important considerations such as blending known transformations with putative steps, complexity of pathway topology, mass conservation, cofactor balance, thermodynamic feasibility, microbial chassis selection, and cost are largely dealt with in a posteriori fashion. The computational procedure we present here designs bioconversion routes while simultaneously considering any combination of the aforementioned design criteria. First, we track and codify as rules all reaction centers using a prime factorization-based encoding technique (rePrime). Reaction rules and known biotransformations are then simultaneously used by the pathway design algorithm (novoStoic) to trace both metabolites and molecular moieties through balanced bio-conversion strategies. We demonstrate the use of novoStoic in bypassing steps in existing pathways through putative transformations, assembling complex pathways blending both known and putative steps toward pharmaceuticals, and postulating ways to biodegrade xenobiotics.
Affiliation(s)
- Akhil Kumar
- The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Lin Wang
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Chiam Yu Ng
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Costas D Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, 16802, USA
| |
|
21
|
Ellens KW, Christian N, Singh C, Satagopam VP, May P, Linster CL. Confronting the catalytic dark matter encoded by sequenced genomes. Nucleic Acids Res 2017; 45:11495-11514. [PMID: 29059321 PMCID: PMC5714238 DOI: 10.1093/nar/gkx937] [Citation(s) in RCA: 52] [Impact Index Per Article: 7.4] [Received: 01/18/2017] [Accepted: 10/03/2017] [Indexed: 01/02/2023] Open
Abstract
The post-genomic era has provided researchers with a deluge of protein sequences. However, a significant fraction of the proteins encoded by sequenced genomes remains without an identified function. Here, we aim at determining how many enzymes of uncertain or unknown function are still present in the Saccharomyces cerevisiae and human proteomes. Using information available in the Swiss-Prot, BRENDA and KEGG databases in combination with a Hidden Markov Model-based method, we estimate that >600 yeast and 2000 human proteins (>30% of their proteins of unknown function) are enzymes whose precise function(s) remain(s) to be determined. This illustrates the impressive scale of the ‘unknown enzyme problem’. We extensively review classical biochemical as well as more recent systematic experimental and computational approaches that can be used to support enzyme function discovery research. Finally, we discuss the possible roles of the elusive catalysts in light of recent developments in the fields of enzymology and metabolism as well as the significance of the unknown enzyme problem in the context of metabolic modeling, metabolic engineering and rare disease research.
Affiliation(s)
- Kenneth W Ellens
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| | - Nils Christian
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| | - Charandeep Singh
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| | - Venkata P Satagopam
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| | - Patrick May
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| | - Carole L Linster
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, L-4362 Esch-sur-Alzette, Luxembourg
| |
|
22
|
Wang L, Dash S, Ng CY, Maranas CD. A review of computational tools for design and reconstruction of metabolic pathways. Synth Syst Biotechnol 2017; 2:243-252. [PMID: 29552648 PMCID: PMC5851934 DOI: 10.1016/j.synbio.2017.11.002] [Citation(s) in RCA: 75] [Impact Index Per Article: 10.7] [Received: 08/21/2017] [Revised: 11/06/2017] [Accepted: 11/06/2017] [Indexed: 11/28/2022] Open
Abstract
Metabolic pathways reflect an organism's chemical repertoire and hence their elucidation and design have been a primary goal in metabolic engineering. Various computational methods have been developed to design novel metabolic pathways while taking into account several prerequisites such as pathway stoichiometry, thermodynamics, host compatibility, and enzyme availability. The choice of the method is often determined by the nature of the metabolites of interest and preferred host organism, along with computational complexity and availability of software tools. In this paper, we review different computational approaches used to design metabolic pathways based on the reaction network representation of the database (i.e., graph or stoichiometric matrix) and the search algorithm (i.e., graph search, flux balance analysis, or retrosynthetic search). We also put forth a systematic workflow that can be implemented in projects requiring pathway design and highlight current limitations and obstacles in computational pathway design.
Affiliation(s)
- Lin Wang
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Satyakam Dash
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Chiam Yu Ng
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Costas D Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA, USA
| |
|
23
|
Schomburg I, Jeske L, Ulbrich M, Placzek S, Chang A, Schomburg D. The BRENDA enzyme information system–From a database to an expert system. J Biotechnol 2017; 261:194-206. [DOI: 10.1016/j.jbiotec.2017.04.020] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 02/16/2017] [Revised: 04/11/2017] [Accepted: 04/18/2017] [Indexed: 02/06/2023]
|
24
|
Kim SM, Peña MI, Moll M, Bennett GN, Kavraki LE. A review of parameters and heuristics for guiding metabolic pathfinding. J Cheminform 2017; 9:51. [PMID: 29086092 PMCID: PMC5602787 DOI: 10.1186/s13321-017-0239-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/08/2017] [Accepted: 09/07/2017] [Indexed: 12/04/2022] Open
Abstract
Recent developments in metabolic engineering have led to the successful biosynthesis of valuable products, such as the precursor of the antimalarial compound, artemisinin, and opioid precursor, thebaine. Synthesizing these traditionally plant-derived compounds in genetically modified yeast cells introduces the possibility of significantly reducing the total time and resources required for their production, and in turn, allows these valuable compounds to become cheaper and more readily available. Most biosynthesis pathways used in metabolic engineering applications have been discovered manually, requiring a tedious search of existing literature and metabolic databases. However, the recent rapid development of available metabolic information has enabled the development of automated approaches for identifying novel pathways. Computer-assisted pathfinding has the potential to save biochemists time in the initial discovery steps of metabolic engineering. In this paper, we review the parameters and heuristics used to guide the search in recent pathfinding algorithms. These parameters and heuristics capture information on the metabolic network structure, compound structures, reaction features, and organism-specificity of pathways. No one metabolic pathfinding algorithm or search parameter stands out as the best to use broadly for solving the pathfinding problem, as each method and parameter has its own strengths and shortcomings. As assisted pathfinding approaches continue to become more sophisticated, the development of better methods for visualizing pathway results and integrating these results into existing metabolic engineering practices is also important for encouraging wider use of these pathfinding methods.
Affiliation(s)
- Sarah M Kim
- Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
| | - Matthew I Peña
- Department of BioSciences, Rice University, 6100 Main St., Houston, TX, 77005, USA
| | - Mark Moll
- Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
| | - George N Bennett
- Department of BioSciences, Rice University, 6100 Main St., Houston, TX, 77005, USA
| | - Lydia E Kavraki
- Department of Computer Science, Rice University, 6100 Main St., Houston, TX, 77005, USA
| |
|
25
|
Perez de Souza L, Naake T, Tohge T, Fernie AR. From chromatogram to analyte to metabolite. How to pick horses for courses from the massive web resources for mass spectral plant metabolomics. Gigascience 2017; 6:1-20. [PMID: 28520864 PMCID: PMC5499862 DOI: 10.1093/gigascience/gix037] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 02/24/2017] [Revised: 05/08/2017] [Accepted: 05/12/2017] [Indexed: 01/19/2023] Open
Abstract
The grand challenge currently facing metabolomics is the expansion of the coverage of the metabolome from a minor percentage of the metabolic complement of the cell toward the level of coverage afforded by other post-genomic technologies such as transcriptomics and proteomics. In plants, this problem is exacerbated by the sheer diversity of chemicals that constitute the metabolome, with the number of metabolites in the plant kingdom generally considered to be in excess of 200 000. In this review, we focus on web resources that can be exploited in order to improve analyte and ultimately metabolite identification and quantification. There is a wide range of available software that not only aids in this but also in the related area of peak alignment; however, for the uninitiated, choosing which program to use is a daunting task. For this reason, we provide an overview of the pros and cons of the software as well as comments regarding the level of programing skills required to effectively exploit their basic functions. In addition, the torrent of available genome and transcriptome sequences that followed the advent of next-generation sequencing has opened up further valuable resources for metabolite identification. All things considered, we posit that only via a continued communal sharing of information such as that deposited in the databases described within the article are we likely to be able to make significant headway toward improving our coverage of the plant metabolome.
Affiliation(s)
- Leonardo Perez de Souza
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Thomas Naake
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Takayuki Tohge
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| | - Alisdair R Fernie
- Max-Planck-Institute of Molecular Plant Physiology, Am Mühlenberg 1, 14476 Potsdam-Golm, Germany
| |
|
26
|
Labena AA, Ye YN, Dong C, Zhang FZ, Guo FB. SSER: Species specific essential reactions database. BMC Syst Biol 2017; 11:50. [PMID: 28420402 PMCID: PMC5395902 DOI: 10.1186/s12918-017-0426-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/23/2016] [Accepted: 04/13/2017] [Indexed: 01/05/2023]
Abstract
BACKGROUND: Essential reactions are vital components of cellular networks. They are the foundations of synthetic biology and are potential candidate targets for antimetabolic drug design. Especially if a single reaction is catalyzed by multiple enzymes, then inhibiting the reaction would be a better option than targeting the enzymes or the corresponding enzyme-encoding gene. The existing databases such as BRENDA, BiGG, KEGG, Bio-models, Biosilico, and many others offer useful and comprehensive information on biochemical reactions. But none of these databases especially focus on essential reactions. Therefore, building a centralized repository for this class of reactions would be of great value. DESCRIPTION: Here, we present a species-specific essential reactions database (SSER). The current version comprises essential biochemical and transport reactions of twenty-six organisms which are identified via flux balance analysis (FBA) combined with manual curation on experimentally validated metabolic network models. Quantitative data on the number of essential reactions, number of the essential reactions associated with their respective enzyme-encoding genes and shared essential reactions across organisms are the main contents of the database. CONCLUSION: SSER would be a prime source to obtain essential reactions data and related gene and metabolite information and it can significantly facilitate the metabolic network models reconstruction and analysis, and drug target discovery studies. Users can browse, search, compare and download the essential reactions of organisms of their interest through the website http://cefg.uestc.edu.cn/sser.
Affiliation(s)
- Abraham A Labena
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- College of Computational and Natural Sciences, Dilla University, Dilla, Ethiopia
| | - Yuan-Nong Ye
- School of Biology and Engineering, Guizhou Medical University, Guiyang Shi, China
| | - Chuan Dong
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Fa-Z Zhang
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
| | - Feng-Biao Guo
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
- Bioinformatics Center in School of Life Science and Technology, University of Electronic Science and Technology of China, No.4, Section 2, North JianShe Road, Chengdu, 610054, China
| |
|
27
|
Dannheim H, Will SE, Schomburg D, Neumann-Schaal M. Clostridioides difficile 630Δerm in silico and in vivo - quantitative growth and extensive polysaccharide secretion. FEBS Open Bio 2017; 7:602-615. [PMID: 28396843 PMCID: PMC5377389 DOI: 10.1002/2211-5463.12208] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 11/09/2016] [Revised: 02/09/2017] [Accepted: 02/10/2017] [Indexed: 12/15/2022] Open
Abstract
Antibiotic-associated infections with Clostridioides difficile are a severe and often lethal risk for hospitalized patients, and can also affect populations without these classical risk factors. For a rational design of therapeutical concepts, a better knowledge of the metabolism of the pathogen is crucial. Metabolic modeling can provide a simulation of quantitative growth and usage of metabolic pathways, leading to a deeper understanding of the organism. Here, we present an elaborate genome-scale metabolic model of C. difficile 630Δerm. The model iHD992 includes experimentally determined product and substrate uptake rates and is able to simulate the energy metabolism and quantitative growth of C. difficile. Dynamic flux balance analysis was used for time-resolved simulations of the quantitative growth in two different media. The model predicts oxidative Stickland reactions and glucose degradation as main sources of energy, while the resulting reduction potential is mostly used for acetogenesis via the Wood-Ljungdahl pathway. Initial modeling experiments did not reproduce the observed growth behavior before the production of large quantities of a previously unknown polysaccharide was detected. Combined genome analysis and laboratory experiments indicated that the polysaccharide is an acetylated glucose polymer. Time-resolved simulations showed that polysaccharide secretion was coupled to growth even during unstable glucose uptake in minimal medium. This is accomplished by metabolic shifts between active glycolysis and gluconeogenesis which were also observed in laboratory experiments.
Affiliation(s)
- Henning Dannheim
- Braunschweig Integrated Centre of Systems Biology (BRICS), Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
| | - Sabine E Will
- Braunschweig Integrated Centre of Systems Biology (BRICS), Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
| | - Dietmar Schomburg
- Braunschweig Integrated Centre of Systems Biology (BRICS), Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
| | - Meina Neumann-Schaal
- Braunschweig Integrated Centre of Systems Biology (BRICS), Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, Germany
| |
|
28
|
Tu W, Ding S, Wu L, Deng Z, Zhu H, Xu X, Lin C, Ye C, Han M, Zhao M, Liu J, Deng Z, Chen J, Wei DQ, Hu QN. SynBioEcoli: a comprehensive metabolism network of engineered E. coli in three dimensional visualization. Quant Biol 2017. [DOI: 10.1007/s40484-017-0098-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
|
29
|
Wolf J, Stark H, Fafenrot K, Albersmeier A, Pham TK, Müller KB, Meyer BH, Hoffmann L, Shen L, Albaum SP, Kouril T, Schmidt-Hohagen K, Neumann-Schaal M, Bräsen C, Kalinowski J, Wright PC, Albers SV, Schomburg D, Siebers B. A systems biology approach reveals major metabolic changes in the thermoacidophilic archaeon Sulfolobus solfataricus in response to the carbon source L-fucose versus D-glucose. Mol Microbiol 2016; 102:882-908. [PMID: 27611014 DOI: 10.1111/mmi.13498] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Accepted: 09/03/2016] [Indexed: 12/01/2022]
Abstract
Archaea are characterised by a complex metabolism with many unique enzymes that differ from their bacterial and eukaryotic counterparts. The thermoacidophilic archaeon Sulfolobus solfataricus is known for its metabolic versatility and is able to utilize a great variety of different carbon sources. However, the underlying degradation pathways and their regulation are often unknown. In this work, the growth on different carbon sources was analysed, using an integrated systems biology approach. The comparison of growth on L-fucose and D-glucose allows first insights into the genome-wide changes in response to the two carbon sources and revealed a new pathway for L-fucose degradation in S. solfataricus. During growth on L-fucose major changes in the central carbon metabolic network, as well as an increased activity of the glyoxylate bypass and the 3-hydroxypropionate/4-hydroxybutyrate cycle were observed. Within the newly discovered pathway for L-fucose degradation the following key reactions were identified: (i) L-fucose oxidation to L-fuconate via a dehydrogenase, (ii) dehydration to 2-keto-3-deoxy-L-fuconate via dehydratase, (iii) 2-keto-3-deoxy-L-fuconate cleavage to pyruvate and L-lactaldehyde via aldolase and (iv) L-lactaldehyde conversion to L-lactate via aldehyde dehydrogenase. This pathway as well as L-fucose transport shows interesting overlaps to the D-arabinose pathway, representing another example for pathway promiscuity in Sulfolobus species.
Affiliation(s)
- Jacqueline Wolf
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Helge Stark
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Katharina Fafenrot
- Molecular Enzyme Technology and Biochemistry, Biofilm Centre, Universität Duisburg-Essen, Essen, 45141, Germany
| | - Andreas Albersmeier
- Center for Biotechnology - CeBiTec, Universität Bielefeld, Bielefeld, 33615, Germany
| | - Trong K Pham
- Department of Chemical and Biological Engineering, ChELSI Institute, University of Sheffield, Sheffield, S1 3JD, UK
| | - Katrin B Müller
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Benjamin H Meyer
- Molecular Biology of Archaea, Institute for Biology II - Microbiology, Universität Freiburg, Freiburg, 79104, Germany
| | - Lena Hoffmann
- Molecular Biology of Archaea, Institute for Biology II - Microbiology, Universität Freiburg, Freiburg, 79104, Germany
| | - Lu Shen
- Molecular Enzyme Technology and Biochemistry, Biofilm Centre, Universität Duisburg-Essen, Essen, 45141, Germany
| | - Stefan P Albaum
- Center for Biotechnology - CeBiTec, Universität Bielefeld, Bielefeld, 33615, Germany
| | - Theresa Kouril
- Molecular Enzyme Technology and Biochemistry, Biofilm Centre, Universität Duisburg-Essen, Essen, 45141, Germany
| | - Kerstin Schmidt-Hohagen
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Meina Neumann-Schaal
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Christopher Bräsen
- Molecular Enzyme Technology and Biochemistry, Biofilm Centre, Universität Duisburg-Essen, Essen, 45141, Germany
| | - Jörn Kalinowski
- Center for Biotechnology - CeBiTec, Universität Bielefeld, Bielefeld, 33615, Germany
| | - Phillip C Wright
- Department of Chemical and Biological Engineering, ChELSI Institute, University of Sheffield, Sheffield, S1 3JD, UK
| | - Sonja-Verena Albers
- Molecular Biology of Archaea, Institute for Biology II - Microbiology, Universität Freiburg, Freiburg, 79104, Germany
| | - Dietmar Schomburg
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Braunschweig, 38106, Germany
| | - Bettina Siebers
- Molecular Enzyme Technology and Biochemistry, Biofilm Centre, Universität Duisburg-Essen, Essen, 45141, Germany
|
30
|
Placzek S, Schomburg I, Chang A, Jeske L, Ulbrich M, Tillack J, Schomburg D. BRENDA in 2017: new perspectives and new tools in BRENDA. Nucleic Acids Res 2016; 45:D380-D388. [PMID: 27924025 PMCID: PMC5210646 DOI: 10.1093/nar/gkw952] [Citation(s) in RCA: 175] [Impact Index Per Article: 21.9] [Received: 09/12/2016] [Accepted: 10/17/2016] [Indexed: 01/11/2023] Open
Abstract
The BRENDA enzyme database (www.brenda-enzymes.org) has developed into the main enzyme and enzyme-ligand information system in its 30 years of existence. The information is manually extracted from primary literature and extended by text mining procedures, integration of external data and prediction algorithms. Approximately 3 million data from 83 000 enzymes and 137 000 literature references constitute the manually annotated core. Text mining procedures extend these data with information on occurrence, enzyme-disease relationships and kinetic data. Prediction algorithms contribute locations and genome annotations. External data and links complete the data with sequences and 3D structures. A total of 206 000 enzyme ligands provide functional and structural data. BRENDA offers a complex query tool engine allowing the users an efficient access to the data via different search methods and explorers. The new design of the BRENDA entry page and the enzyme summary pages improves the user access and the performance. New interactive and intuitive BRENDA pathway maps give an overview on biochemical processes and facilitate the visualization of enzyme, ligand and organism information in the biochemical context. SCOPe and CATH, databases for protein structure classification, are included. New online and video tutorials provide online training for the users. BRENDA is freely available for academic users.
Affiliation(s)
- Sandra Placzek
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Ida Schomburg
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Antje Chang
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Lisa Jeske
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Marcus Ulbrich
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Jana Tillack
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
| | - Dietmar Schomburg
- Technische Universität Braunschweig, Braunschweig Integrated Centre of Systems Biology (BRICS), Rebenring 56, 38106 Braunschweig, Germany
|
31
|
He S, Li M, Ye X, Wang H, Yu W, He W, Wang Y, Qiao Y. Site of metabolism prediction for oxidation reactions mediated by oxidoreductases based on chemical bond. Bioinformatics 2016; 33:363-372. [DOI: 10.1093/bioinformatics/btw617] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 04/20/2016] [Accepted: 09/22/2016] [Indexed: 12/31/2022] Open
|
32
|
Dönertaş HM, Martínez Cuesta S, Rahman SA, Thornton JM. Characterising Complex Enzyme Reaction Data. PLoS One 2016; 11:e0147952. [PMID: 26840640 PMCID: PMC4740462 DOI: 10.1371/journal.pone.0147952] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/01/2015] [Accepted: 01/11/2016] [Indexed: 01/05/2023] Open
Abstract
The relationship between enzyme-catalysed reactions and the Enzyme Commission (EC) number, the widely accepted classification scheme used to characterise enzyme activity, is complex and with the rapid increase in our knowledge of the reactions catalysed by enzymes needs revisiting. We present a manual and computational analysis to investigate this complexity and found that almost one-third of all known EC numbers are linked to more than one reaction in the secondary reaction databases (e.g., KEGG). Although this complexity is often resolved by defining generic, alternative and partial reactions, we have also found individual EC numbers with more than one reaction catalysing different types of bond changes. This analysis adds a new dimension to our understanding of enzyme function and might be useful for the accurate annotation of the function of enzymes and to study the changes in enzyme function during evolution.
Affiliation(s)
- Handan Melike Dönertaş
- European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
- Department of Biological Sciences, Middle East Technical University, Ankara, Turkey
| | - Sergio Martínez Cuesta
- European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Syed Asad Rahman
- European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
| | - Janet M. Thornton
- European Molecular Biology Laboratory, European Bioinformatics Institute EMBL-EBI, Wellcome Trust Genome Campus, Hinxton, Cambridge, United Kingdom
|
33
|
Moretti S, Martin O, Van Du Tran T, Bridge A, Morgat A, Pagni M. MetaNetX/MNXref--reconciliation of metabolites and biochemical reactions to bring together genome-scale metabolic networks. Nucleic Acids Res 2015; 44:D523-6. [PMID: 26527720 PMCID: PMC4702813 DOI: 10.1093/nar/gkv1117] [Citation(s) in RCA: 105] [Impact Index Per Article: 11.7] [Received: 08/15/2015] [Accepted: 10/11/2015] [Indexed: 12/02/2022] Open
Abstract
MetaNetX is a repository of genome-scale metabolic networks (GSMNs) and biochemical pathways from a number of major resources imported into a common namespace of chemical compounds, reactions, cellular compartments—namely MNXref—and proteins. The MetaNetX.org website (http://www.metanetx.org/) provides access to these integrated data as well as a variety of tools that allow users to import their own GSMNs, map them to the MNXref reconciliation, and manipulate, compare, analyze, simulate (using flux balance analysis) and export the resulting GSMNs. MNXref and MetaNetX are regularly updated and freely available.
Affiliation(s)
- Sébastien Moretti
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland ; Department of Ecology and Evolution, Biophore, Evolutionary Bioinformatics group, University of Lausanne, Lausanne 1015, Switzerland
| | - Olivier Martin
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - T Van Du Tran
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
| | - Alan Bridge
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva 1206, Switzerland
| | - Anne Morgat
- Swiss-Prot Group, SIB Swiss Institute of Bioinformatics, Geneva 1206, Switzerland ; Equipe ERABLE, INRIA Grenoble Rhône-Alpes, Montbonnot Saint-Martin 38330, France
| | - Marco Pagni
- Vital-IT group, SIB Swiss Institute of Bioinformatics, Lausanne 1015, Switzerland
|
34
|
Tu W, Zhang H, Liu J, Hu QN. BioSynther: a customized biosynthetic potential explorer. Bioinformatics 2015; 32:472-3. [DOI: 10.1093/bioinformatics/btv599] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 07/17/2015] [Accepted: 10/12/2015] [Indexed: 11/13/2022] Open
|
35
|
Jeffryes JG, Colastani RL, Elbadawi-Sidhu M, Kind T, Niehaus TD, Broadbelt LJ, Hanson AD, Fiehn O, Tyo KEJ, Henry CS. MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics. J Cheminform 2015; 7:44. [PMID: 26322134 PMCID: PMC4550642 DOI: 10.1186/s13321-015-0087-1] [Citation(s) in RCA: 135] [Impact Index Per Article: 15.0] [Received: 03/30/2015] [Accepted: 07/06/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. DESCRIPTION Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem while returning far fewer candidates per spectra than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. CONCLUSIONS MINE databases are freely accessible for non-commercial use via user-friendly web-tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs. 
MINEs improve metabolomics peak identification as compared to general chemical databases whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left. The methods for accessing the database are shown on the right.
Affiliation(s)
- James G Jeffryes
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA ; Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
| | - Ricardo L Colastani
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
| | - Tobias Kind
- West Coast Metabolomics Center, University of California, Davis, CA USA
| | - Thomas D Niehaus
- Horticultural Sciences Department, University of Florida, Gainesville, FL USA
| | - Linda J Broadbelt
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA
| | - Andrew D Hanson
- Horticultural Sciences Department, University of Florida, Gainesville, FL USA
| | - Oliver Fiehn
- West Coast Metabolomics Center, University of California, Davis, CA USA ; Biochemistry Department, King Abdulaziz University, Jeddah, Saudi Arabia
| | - Keith E J Tyo
- Department of Chemical and Biological Engineering, Northwestern University, Evanston, IL USA
| | - Christopher S Henry
- Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL USA
|
36
|
Warr WA. Many InChIs and quite some feat. J Comput Aided Mol Des 2015; 29:681-94. [PMID: 26081259 DOI: 10.1007/s10822-015-9854-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/28/2015] [Accepted: 06/10/2015] [Indexed: 12/14/2022]
Affiliation(s)
- Wendy A Warr
- Wendy Warr & Associates, Holmes Chapel, Crewe, Cheshire, CW4 7HZ, UK
|
37
|
Dias O, Rocha M, Ferreira EC, Rocha I. Reconstructing genome-scale metabolic models with merlin. Nucleic Acids Res 2015; 43:3899-910. [PMID: 25845595 PMCID: PMC4417185 DOI: 10.1093/nar/gkv294] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Received: 03/14/2015] [Accepted: 03/18/2015] [Indexed: 01/13/2023] Open
Abstract
The Metabolic Models Reconstruction Using Genome-Scale Information (merlin) tool is a user-friendly Java application that aids the reconstruction of genome-scale metabolic models for any organism that has its genome sequenced. It performs the major steps of the reconstruction process, including the functional genomic annotation of the whole genome and subsequent construction of the portfolio of reactions. Moreover, merlin includes tools for the identification and annotation of genes encoding transport proteins, generating the transport reactions for those carriers. It also performs the compartmentalisation of the model, predicting the organelle localisation of the proteins encoded in the genome and thus the localisation of the metabolites involved in the reactions promoted by such enzymes. The gene-proteins-reactions (GPR) associations are automatically generated and included in the model. Finally, merlin expedites the transition from genomic data to draft metabolic models reconstructions exported in the SBML standard format, allowing the user to have a preliminary view of the biochemical network, which can be manually curated within the environment provided by merlin.
Affiliation(s)
- Oscar Dias
- Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
| | - Miguel Rocha
- Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
| | - Eugénio C Ferreira
- Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
| | - Isabel Rocha
- Centre of Biological Engineering, University of Minho, Campus de Gualtar, 4710-057 Braga, Portugal
|
38
|
Genome-scale modeling for metabolic engineering. J Ind Microbiol Biotechnol 2015; 42:327-38. [PMID: 25578304 DOI: 10.1007/s10295-014-1576-3] [Citation(s) in RCA: 55] [Impact Index Per Article: 6.1] [Received: 11/19/2014] [Accepted: 12/20/2014] [Indexed: 01/04/2023]
Abstract
We focus on the application of constraint-based methodologies and, more specifically, flux balance analysis in the field of metabolic engineering, and enumerate recent developments and successes of the field. We also review computational frameworks that have been developed with the express purpose of automatically selecting optimal gene deletions for achieving improved production of a chemical of interest. The application of flux balance analysis methods in rational metabolic engineering requires a metabolic network reconstruction and a corresponding in silico metabolic model for the microorganism in question. For this reason, we additionally present a brief overview of automated reconstruction techniques. Finally, we emphasize the importance of integrating metabolic networks with regulatory information-an area which we expect will become increasingly important for metabolic engineering-and present recent developments in the field of metabolic and regulatory integration.
|
39
|
Chang A, Schomburg I, Placzek S, Jeske L, Ulbrich M, Xiao M, Sensen CW, Schomburg D. BRENDA in 2015: exciting developments in its 25th year of existence. Nucleic Acids Res 2014; 43:D439-46. [PMID: 25378310 PMCID: PMC4383907 DOI: 10.1093/nar/gku1068] [Citation(s) in RCA: 150] [Impact Index Per Article: 15.0] [Indexed: 01/13/2023] Open
Abstract
The BRENDA enzyme information system (http://www.brenda-enzymes.org/) has developed into an elaborate system of enzyme and enzyme-ligand information obtained from different sources, combined with flexible query systems and evaluation tools. The information is obtained by manual extraction from primary literature, text and data mining, data integration, and prediction algorithms. Approximately 300 million data include enzyme function and molecular data from more than 30 000 organisms. The manually derived core contains 3 million data from 77 000 enzymes annotated from 135 000 literature references. Each entry is connected to the literature reference and the source organism. They are complemented by information on occurrence, enzyme/disease relationships from text mining, sequences and 3D structures from other databases, and predicted enzyme location and genome annotation. Functional and structural data of more than 190 000 enzyme ligands are stored in BRENDA. New features improving the functionality and analysis tools were implemented. The human anatomy atlas CAVEman is linked to the BRENDA Tissue Ontology terms providing a connection between anatomical and functional enzyme data. Word Maps for enzymes obtained from PubMed abstracts highlight application and scientific relevance of enzymes. The EnzymeDetector genome annotation tool and the reaction database BKM-react including reactions from BRENDA, KEGG and MetaCyc were improved. The website was redesigned providing new query options.
Affiliation(s)
- Antje Chang
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
| | - Ida Schomburg
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
| | - Sandra Placzek
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
| | - Lisa Jeske
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
| | - Marcus Ulbrich
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
| | - Mei Xiao
- Department of Biochemistry & Molecular Biology, Faculty of Medicine, University of Calgary, 3330 Hospital Drive N.W., Calgary, Alberta T2N 4N1, Canada
| | - Christoph W Sensen
- The Jackson Laboratory, 263 Farmington Avenue, Farmington, CT 06030, USA
| | - Dietmar Schomburg
- Department of Bioinformatics and Biochemistry, Technische Universität Braunschweig, Langer Kamp 19 B, D-38106 Braunschweig, Germany
|
40
|
Genetics of the human metabolome, what is next? Biochim Biophys Acta Mol Basis Dis 2014; 1842:1923-1931. [PMID: 24905732 DOI: 10.1016/j.bbadis.2014.05.030] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 11/29/2013] [Revised: 05/04/2014] [Accepted: 05/28/2014] [Indexed: 11/23/2022]
Abstract
Increases in throughput and decreases in costs have facilitated large scale metabolomics studies, the simultaneous measurement of large numbers of biochemical components in biological samples. Initial large scale studies focused on biomarker discovery for disease or disease progression and helped to understand biochemical pathways underlying disease. The first population-based studies that combined metabolomics and genome wide association studies (mGWAS) have increased our understanding of the (genetic) regulation of biochemical conversions. Measurements of metabolites as intermediate phenotypes are a potentially very powerful approach to uncover how genetic variation affects disease susceptibility and progression. However, we still face many hurdles in the interpretation of mGWAS data. Due to the composite nature of many metabolites, single enzymes may affect the levels of multiple metabolites and, conversely, levels of single metabolites may be affected by multiple enzymes. Here, we will provide a global review of the current status of mGWAS. We will specifically discuss the application of prior biological knowledge present in databases to the interpretation of mGWAS results and discuss the potential of mathematical models. As the technology continuously improves to detect metabolites and to measure genetic variation, it is clear that comprehensive systems biology based approaches are required to further our insight in the association between genes, metabolites and disease. This article is part of a Special Issue entitled: From Genome to Function.
|
41
|
Maarleveld TR, Boele J, Bruggeman FJ, Teusink B. A data integration and visualization resource for the metabolic network of Synechocystis sp. PCC 6803. Plant Physiol 2014; 164:1111-21. [PMID: 24402049 PMCID: PMC3938606 DOI: 10.1104/pp.113.224394] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 05/05/2023]
Abstract
Data integration is a central activity in systems biology. The integration of genomic, transcript, protein, metabolite, flux, and computational data yields unprecedented information about the system level functioning of organisms. Often, data integration is done purely computationally, leaving the user with little insight in addition to statistical information. In this article, we present a visualization tool for the metabolic network of Synechocystis sp. PCC 6803, an important model cyanobacterium for sustainable biofuel production. We illustrate how this metabolic map can be used to integrate experimental and computational data for Synechocystis sp. PCC 6803 systems biology and metabolic engineering studies. Additionally, we discuss how this map, and the software infrastructure that we supply with it, can be used in the development of other organism-specific metabolic network visualizations. In addition to the Python console package VoNDA (http://vonda.sf.net), we provide a working demonstration of the interactive metabolic map and the associated Synechocystis sp. PCC 6803 genome-scale stoichiometric model, as well as various ready-to-visualize microarray data sets, at http://f-a-m-e.org/synechocytis.
|
42
|
Toya Y, Shimizu H. Flux analysis and metabolomics for systematic metabolic engineering of microorganisms. Biotechnol Adv 2013; 31:818-26. [PMID: 23680193 DOI: 10.1016/j.biotechadv.2013.05.002] [Citation(s) in RCA: 81] [Impact Index Per Article: 7.4] [Received: 03/21/2013] [Revised: 04/23/2013] [Accepted: 05/04/2013] [Indexed: 12/29/2022]
Abstract
Rational engineering of metabolism is important for bio-production using microorganisms. Metabolic design based on in silico simulations and experimental validation of the metabolic state in the engineered strain helps in accomplishing systematic metabolic engineering. Flux balance analysis (FBA) is a method for the prediction of metabolic phenotype, and many applications have been developed using FBA to design metabolic networks. Elementary mode analysis (EMA) and ensemble modeling techniques are also useful tools for in silico strain design. The metabolome and flux distribution of the metabolic pathways enable us to evaluate the metabolic state and provide useful clues to improve target productivity. Here, we reviewed several computational applications for metabolic engineering by using genome-scale metabolic models of microorganisms. We also discussed the recent progress made in the field of metabolomics and ¹³C-metabolic flux analysis techniques, and reviewed these applications pertaining to bio-production development. Because these in silico or experimental approaches have their respective advantages and disadvantages, the combined usage of these methods is complementary and effective for metabolic engineering.
Affiliation(s)
- Yoshihiro Toya
- Department of Bioinformatic Engineering, Graduate School of Information Science and Technology, Osaka University, 1-5 Yamadaoka, Suita, Osaka 565-0871, Japan
|
43
|
Altman T, Travers M, Kothari A, Caspi R, Karp PD. A systematic comparison of the MetaCyc and KEGG pathway databases. BMC Bioinformatics 2013; 14:112. [PMID: 23530693 PMCID: PMC3665663 DOI: 10.1186/1471-2105-14-112] [Citation(s) in RCA: 89] [Impact Index Per Article: 8.1] [Received: 07/06/2012] [Accepted: 03/04/2013] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND The MetaCyc and KEGG projects have developed large metabolic pathway databases that are used for a variety of applications including genome analysis and metabolic engineering. We present a comparison of the compound, reaction, and pathway content of MetaCyc version 16.0 and a KEGG version downloaded on Feb-27-2012 to increase understanding of their relative sizes, their degree of overlap, and their scope. To assess their overlap, we must know the correspondences between compounds, reactions, and pathways in MetaCyc, and those in KEGG. We devoted significant effort to computational and manual matching of these entities, and we evaluated the accuracy of the correspondences. RESULTS KEGG contains 179 module pathways versus 1,846 base pathways in MetaCyc; KEGG contains 237 map pathways versus 296 super pathways in MetaCyc. KEGG pathways contain 3.3 times as many reactions on average as do MetaCyc pathways, and the databases employ different conceptualizations of metabolic pathways. KEGG contains 8,692 reactions versus 10,262 for MetaCyc. 6,174 KEGG reactions are components of KEGG pathways versus 6,348 for MetaCyc. KEGG contains 16,586 compounds versus 11,991 for MetaCyc. 6,912 KEGG compounds act as substrates in KEGG reactions versus 8,891 for MetaCyc. MetaCyc contains a broader set of database attributes than does KEGG, such as relationships from a compound to enzymes that it regulates, identification of spontaneous reactions, and the expected taxonomic range of metabolic pathways. MetaCyc contains many pathways not found in KEGG, from plants, fungi, metazoa, and actinobacteria; KEGG contains pathways not found in MetaCyc, for xenobiotic degradation, glycan metabolism, and metabolism of terpenoids and polyketides. MetaCyc contains fewer unbalanced reactions, which facilitates metabolic modeling such as using flux-balance analysis. MetaCyc includes generic reactions that may be instantiated computationally. 
CONCLUSIONS KEGG contains significantly more compounds than does MetaCyc, whereas MetaCyc contains significantly more reactions and pathways than does KEGG; in particular, KEGG modules are quite incomplete. The number of reactions occurring in pathways in the two DBs is quite similar.
Affiliation(s)
- Tomer Altman
- Bioinformatics Research Group, SRI International, Menlo Park, USA
|
44
|
Fukushima A, Kusano M. Recent progress in the development of metabolome databases for plant systems biology. Front Plant Sci 2013; 4:73. [PMID: 23577015 PMCID: PMC3616245 DOI: 10.3389/fpls.2013.00073] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 12/19/2012] [Accepted: 03/15/2013] [Indexed: 05/19/2023]
Abstract
Metabolomics has grown greatly as a functional genomics tool, and has become an invaluable diagnostic tool for biochemical phenotyping of biological systems. Over the past decades, a number of databases involving information related to mass spectra, compound names and structures, statistical/mathematical models and metabolic pathways, and metabolite profile data have been developed. Such databases complement each other and support efficient growth in this area, although the data resources remain scattered across the World Wide Web. Here, we review available metabolome databases and summarize the present status of development of related tools, particularly focusing on the plant metabolome. Data sharing discussed here will pave way for the robust interpretation of metabolomic data and advances in plant systems biology.
Affiliation(s)
- Atsushi Fukushima
- RIKEN Plant Science Center, Yokohama, Kanagawa, Japan
- *Correspondence: Atsushi Fukushima, RIKEN Plant Science Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan.
| | - Miyako Kusano
- RIKEN Plant Science Center, Yokohama, Kanagawa, Japan
- Department of Genome System Sciences, Graduate School of Nanobioscience, Kihara Institute for Biological Research, Yokohama, Kanagawa, Japan
|
45
|
Schomburg I, Chang A, Placzek S, Söhngen C, Rother M, Lang M, Munaretto C, Ulas S, Stelzer M, Grote A, Scheer M, Schomburg D. BRENDA in 2013: integrated reactions, kinetic data, enzyme function data, improved disease classification: new options and contents in BRENDA. Nucleic Acids Res 2013; 41:D764-72. [PMID: 23203881 PMCID: PMC3531171 DOI: 10.1093/nar/gks1049] [Citation(s) in RCA: 271] [Impact Index Per Article: 24.6] [Received: 09/13/2012] [Revised: 10/08/2012] [Accepted: 10/10/2012] [Indexed: 11/13/2022] Open
Abstract
The BRENDA (BRaunschweig ENzyme DAtabase) enzyme portal (http://www.brenda-enzymes.org) is the main information system of functional biochemical and molecular enzyme data and provides access to seven interconnected databases. BRENDA contains 2.7 million manually annotated data on enzyme occurrence, function, kinetics and molecular properties. Each entry is connected to a reference and the source organism. Enzyme ligands are stored with their structures and can be accessed via their names, synonyms or via a structure search. FRENDA (Full Reference ENzyme DAta) and AMENDA (Automatic Mining of ENzyme DAta) are based on text mining methods and represent a complete survey of PubMed abstracts with information on enzymes in different organisms, tissues or organelles. The supplemental database DRENDA provides more than 910 000 new EC number-disease relations in more than 510 000 references from automatic search and a classification of enzyme-disease-related information. KENDA (Kinetic ENzyme DAta), a new amendment extracts and displays kinetic values from PubMed abstracts. The integration of the EnzymeDetector offers an automatic comparison, evaluation and prediction of enzyme function annotations for prokaryotic genomes. The biochemical reaction database BKM-react contains non-redundant enzyme-catalysed and spontaneous reactions and was developed to facilitate and accelerate the construction of biochemical models.
Affiliation(s)
- Dietmar Schomburg
- Technische Universität Braunschweig, Dpt. for Bioinformatics and Biochemistry, Langer Kamp 19 B, 38106 Braunschweig, Germany
|
46
|
Bernard T, Bridge A, Morgat A, Moretti S, Xenarios I, Pagni M. Reconciliation of metabolites and biochemical reactions for metabolic networks. Brief Bioinform 2012; 15:123-35. [PMID: 23172809 PMCID: PMC3896926 DOI: 10.1093/bib/bbs058] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 12/22/2022] Open
Abstract
Genome-scale metabolic network reconstructions are now routinely used in the study of metabolic pathways, their evolution and design. The development of such reconstructions involves the integration of information on reactions and metabolites from the scientific literature as well as public databases and existing genome-scale metabolic models. The reconciliation of discrepancies between data from these sources generally requires significant manual curation, which constitutes a major obstacle in efforts to develop and apply genome-scale metabolic network reconstructions. In this work, we discuss some of the major difficulties encountered in the mapping and reconciliation of metabolic resources and review three recent initiatives that aim to accelerate this process, namely BKM-react, MetRxn and MNXref (presented in this article). Each of these resources provides a pre-compiled reconciliation of many of the most commonly used metabolic resources. By reducing the time required for manual curation of metabolite and reaction discrepancies, these resources aim to accelerate the development and application of high-quality genome-scale metabolic network reconstructions and models.
|
47
|
Ningthoujam SS, Talukdar AD, Potsangbam KS, Choudhury MD. Challenges in developing medicinal plant databases for sharing ethnopharmacological knowledge. J Ethnopharmacol 2012; 141:9-32. [PMID: 22401841 DOI: 10.1016/j.jep.2012.02.042] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 11/04/2011] [Revised: 02/19/2012] [Accepted: 02/25/2012] [Indexed: 05/31/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE Major research contributions in ethnopharmacology have generated vast amount of data associated with medicinal plants. Computerized databases facilitate data management and analysis making coherent information available to researchers, planners and other users. Web-based databases also facilitate knowledge transmission and feed the circle of information exchange between the ethnopharmacological studies and public audience. However, despite the development of many medicinal plant databases, a lack of uniformity is still discernible. Therefore, it calls for defining a common standard to achieve the common objectives of ethnopharmacology. AIM OF THE STUDY The aim of the study is to review the diversity of approaches in storing ethnopharmacological information in databases and to provide some minimal standards for these databases. MATERIALS AND METHODS Survey for articles on medicinal plant databases was done on the Internet by using selective keywords. Grey literatures and printed materials were also searched for information. Listed resources were critically analyzed for their approaches in content type, focus area and software technology. RESULTS Necessity for rapid incorporation of traditional knowledge by compiling primary data has been felt. While citation collection is common approach for information compilation, it could not fully assimilate local literatures which reflect traditional knowledge. Need for defining standards for systematic evaluation, checking quality and authenticity of the data is felt. Databases focussing on thematic areas, viz., traditional medicine system, regional aspect, disease and phytochemical information are analyzed. Issues pertaining to data standard, data linking and unique identification need to be addressed in addition to general issues like lack of update and sustainability. In the background of the present study, suggestions have been made on some minimum standards for development of medicinal plant database. CONCLUSION In spite of variations in approaches, existence of many overlapping features indicates redundancy of resources and efforts. As the development of global data in a single database may not be possible in view of the culture-specific differences, efforts can be given to specific regional areas. Existing scenario calls for collaborative approach for defining a common standard in medicinal plant database for knowledge sharing and scientific advancement.
|
48
|
Kumar A, Suthers PF, Maranas CD. MetRxn: a knowledgebase of metabolites and reactions spanning metabolic models and databases. BMC Bioinformatics 2012; 13:6. [PMID: 22233419 PMCID: PMC3277463 DOI: 10.1186/1471-2105-13-6] [Citation(s) in RCA: 100] [Impact Index Per Article: 8.3] [Received: 06/27/2011] [Accepted: 01/10/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Increasingly, metabolite and reaction information is organized in the form of genome-scale metabolic reconstructions that describe the reaction stoichiometry, directionality, and gene to protein to reaction associations. A key bottleneck in the pace of reconstruction of new, high-quality metabolic models is the inability to directly make use of metabolite/reaction information from biological databases or other models due to incompatibilities in content representation (i.e., metabolites with multiple names across databases and models), stoichiometric errors such as elemental or charge imbalances, and incomplete atomistic detail (e.g., use of generic R-group or non-explicit specification of stereo-specificity). DESCRIPTION MetRxn is a knowledgebase that includes standardized metabolite and reaction descriptions by integrating information from BRENDA, KEGG, MetaCyc, Reactome.org and 44 metabolic models into a single unified data set. All metabolite entries have matched synonyms, resolved protonation states, and are linked to unique structures. All reaction entries are elementally and charge balanced. This is accomplished through the use of a workflow of lexicographic, phonetic, and structural comparison algorithms. MetRxn allows for the download of standardized versions of existing genome-scale metabolic models and the use of metabolic information for the rapid reconstruction of new ones. CONCLUSIONS The standardization in description allows for the direct comparison of the metabolite and reaction content between metabolic models and databases and the exhaustive prospecting of pathways for biotechnological production. This ever-growing dataset currently consists of over 76,000 metabolites participating in more than 72,000 reactions (including unresolved entries). MetRxn is hosted on a web-based platform that uses relational database models (MySQL).
Affiliation(s)
- Akhil Kumar
- Department of Chemical Engineering, The Pennsylvania State University, University Park, PA 16802, USA.
|
49
|
Alcántara R, Axelsen KB, Morgat A, Belda E, Coudert E, Bridge A, Cao H, de Matos P, Ennis M, Turner S, Owen G, Bougueleret L, Xenarios I, Steinbeck C. Rhea--a manually curated resource of biochemical reactions. Nucleic Acids Res 2011; 40:D754-60. [PMID: 22135291 PMCID: PMC3245052 DOI: 10.1093/nar/gkr1126] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.5] [Indexed: 11/12/2022] Open
Abstract
Rhea (http://www.ebi.ac.uk/rhea) is a comprehensive resource of expert-curated biochemical reactions. Rhea provides a non-redundant set of chemical transformations for use in a broad spectrum of applications, including metabolic network reconstruction and pathway inference. Rhea includes enzyme-catalyzed reactions (covering the IUBMB Enzyme Nomenclature list), transport reactions and spontaneously occurring reactions. Rhea reactions are described using chemical species from the Chemical Entities of Biological Interest ontology (ChEBI) and are stoichiometrically balanced for mass and charge. They are extensively manually curated with links to source literature and other public resources on metabolism including enzyme and pathway databases. This cross-referencing facilitates the mapping and reconciliation of common reactions and compounds between distinct resources, which is a common first step in the reconstruction of genome scale metabolic networks and models.
Affiliation(s)
- Rafael Alcántara
- Chemoinformatics and Metabolism Team, European Bioinformatics Institute, Hinxton, Cambridge CB10 1SD, UK.
|