1
Xie X, Gui L, Qiao B, Wang G, Huang S, Zhao Y, Sun S. Deep learning in template-free de novo biosynthetic pathway design of natural products. Brief Bioinform 2024; 25:bbae495. [PMID: 39373052 PMCID: PMC11456888 DOI: 10.1093/bib/bbae495] [Received: 06/10/2024] [Revised: 09/12/2024] [Accepted: 09/20/2024] [Indexed: 10/08/2024] [Open access]
Abstract
Natural products (NPs) are indispensable in drug development, particularly in combating infections, cancer, and neurodegenerative diseases. However, their limited availability poses significant challenges. Template-free de novo biosynthetic pathway design provides a strategic solution for NP production, with deep learning standing out as a powerful tool in this domain. This review delves into state-of-the-art deep learning algorithms in NP biosynthesis pathway design. It provides an in-depth discussion of databases like Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, and UniProt, which are essential for model training, along with chemical databases such as Reaxys, SciFinder, and PubChem for transfer learning to expand models' understanding of the broader chemical space. It evaluates the potential and challenges of sequence-to-sequence and graph-to-graph translation models for accurate single-step prediction. Additionally, it discusses search algorithms for multistep prediction and deep learning algorithms for predicting enzyme function. The review also highlights the pivotal role of deep learning in improving catalytic efficiency through enzyme engineering, which is essential for enhancing NP production. Moreover, it examines the application of large language models in pathway design, enzyme discovery, and enzyme engineering. Finally, it addresses the challenges and prospects associated with template-free approaches, offering insights into potential advancements in NP biosynthesis pathway design.
Affiliation(s)
- Xueying Xie
  - Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
  - College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Lin Gui
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Baixue Qiao
  - Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
  - College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Guohua Wang
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Shan Huang
  - Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, No. 246 Xuefu Road, Nangang District, Harbin 150081, China
- Yuming Zhao
  - College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Shanwen Sun
  - Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
  - College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
2
Kim T, Lee S, Kwak Y, Choi MS, Park J, Hwang SJ, Kim SG. READRetro: natural product biosynthesis predicting with retrieval-augmented dual-view retrosynthesis. New Phytol 2024; 243:2512-2527. [PMID: 39081009 DOI: 10.1111/nph.20012] [Received: 03/26/2024] [Accepted: 07/08/2024] [Indexed: 08/23/2024]
Abstract
As sessile organisms, plants produce various secondary metabolites to interact with the environment. These chemicals have fascinated the plant science community because of their ecological significance and notable biological activity. However, predicting the complete biosynthetic pathways from target molecules to metabolic building blocks remains a challenge. Here, we propose retrieval-augmented dual-view retrosynthesis (READRetro) as a practical bio-retrosynthesis tool to predict the biosynthetic pathways of plant natural products. Conventional bio-retrosynthesis models have been limited in their ability to predict biosynthetic pathways for natural products. READRetro was optimized for the prediction of complex metabolic pathways by incorporating cutting-edge deep learning architectures, an ensemble approach, and two retrievers. Evaluation of single- and multi-step retrosynthesis showed that each component of READRetro significantly improved its ability to predict biosynthetic pathways. READRetro was also able to propose the known pathways of secondary metabolites such as monoterpene indole alkaloids and the unknown pathway of menisdaurilide, demonstrating its applicability to real-world bio-retrosynthesis of plant natural products. For researchers interested in the biosynthesis and production of secondary metabolites, a user-friendly website (https://readretro.net) and the open-source code of READRetro have been made available.
Affiliation(s)
- Taein Kim
  - Department of Biological Sciences, KAIST, Daejeon, 34141, Korea
- Seul Lee
  - Kim Jaechul Graduate School of AI, KAIST, Daejeon, 34141, Korea
- Yejin Kwak
  - Department of BioMedical Convergence Engineering, Pusan National University, Yangsan, 50612, Korea
- Min-Soo Choi
  - Department of Biological Sciences, KAIST, Daejeon, 34141, Korea
- Jeongbin Park
  - Department of BioMedical Convergence Engineering, Pusan National University, Yangsan, 50612, Korea
- Sung Ju Hwang
  - Kim Jaechul Graduate School of AI, KAIST, Daejeon, 34141, Korea
  - School of Computing, KAIST, Daejeon, 34141, Korea
- Sang-Gyu Kim
  - Department of Biological Sciences, KAIST, Daejeon, 34141, Korea
3
Kundu P, Beura S, Mondal S, Das AK, Ghosh A. Machine learning for the advancement of genome-scale metabolic modeling. Biotechnol Adv 2024; 74:108400. [PMID: 38944218 DOI: 10.1016/j.biotechadv.2024.108400] [Received: 10/25/2023] [Revised: 05/13/2024] [Accepted: 06/23/2024] [Indexed: 07/01/2024]
Abstract
Constraint-based modeling (CBM) has evolved as the core systems biology tool to map the interrelations between genotype, phenotype, and external environment. The recent advancement of high-throughput experimental approaches and multi-omics strategies has generated a plethora of new and precise information from wide-ranging biological domains. On the other hand, the continuously growing field of machine learning (ML) and its specialized branch of deep learning (DL) provide essential computational architectures for decoding complex and heterogeneous biological data. In recent years, both multi-omics and ML have assisted in the escalation of CBM. Condition-specific omics data, such as transcriptomics and proteomics, helped contextualize the model prediction while analyzing a particular phenotypic signature. At the same time, the advanced ML tools have eased the model reconstruction and analysis to increase the accuracy and prediction power. However, the development of these multi-disciplinary methodological frameworks mainly occurs independently, which limits the concatenation of biological knowledge from different domains. Hence, we have reviewed the potential of integrating multi-disciplinary tools and strategies from various fields, such as synthetic biology, CBM, omics, and ML, to explore the biochemical phenomenon beyond the conventional biological dogma. How the integrative knowledge of these intersected domains has improved bioengineering and biomedical applications has also been highlighted. We categorically explained the conventional genome-scale metabolic model (GEM) reconstruction tools and their improvement strategies through ML paradigms. Further, the crucial role of ML and DL in omics data restructuring for GEM development has also been briefly discussed. Finally, the case-study-based assessment of the state-of-the-art method for improving biomedical and metabolic engineering strategies has been elaborated. 
Therefore, this review demonstrates how integrating experimental and in silico strategies can help map the ever-expanding knowledge of biological systems driven by condition-specific cellular information. This multiview approach will elevate the application of ML-based CBM in the biomedical and bioengineering fields for the betterment of society and the environment.
Affiliation(s)
- Pritam Kundu
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Satyajit Beura
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Suman Mondal
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
- Amit Kumar Das
  - Department of Bioscience and Biotechnology, Indian Institute of Technology, Kharagpur, West Bengal 721302, India
- Amit Ghosh
  - School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
  - P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India
4
Gricourt G, Meyer P, Duigou T, Faulon JL. Artificial Intelligence Methods and Models for Retro-Biosynthesis: A Scoping Review. ACS Synth Biol 2024; 13:2276-2294. [PMID: 39047143 PMCID: PMC11334239 DOI: 10.1021/acssynbio.4c00091] [Received: 02/09/2024] [Revised: 06/14/2024] [Accepted: 06/14/2024] [Indexed: 07/27/2024]
Abstract
Retrosynthesis aims to efficiently plan the synthesis of desirable chemicals by strategically breaking down molecules into readily available building block compounds. Having a long history in chemistry, retro-biosynthesis has also been used in the fields of biocatalysis and synthetic biology. Artificial intelligence (AI) is driving us toward new frontiers in synthesis planning and the exploration of chemical spaces, arriving at an opportune moment for promoting bioproduction that would better align with green chemistry, enhancing environmental practices. In this review, we summarize the recent advancements in the application of AI methods and models for retrosynthetic and retro-biosynthetic pathway design. These techniques can be based either on reaction templates or generative models and require scoring functions and planning strategies to navigate through the retrosynthetic graph of possibilities. We finally discuss limitations and promising research directions in this field.
Affiliation(s)
- Guillaume Gricourt
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Philippe Meyer
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Thomas Duigou
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
- Jean-Loup Faulon
  - Université Paris-Saclay, INRAE, AgroParisTech, Micalis Institute, 78350 Jouy-en-Josas, France
  - The University of Manchester, Manchester Institute of Biotechnology, Manchester M1 7DN, U.K.
5
Zhang X, Liu J, Yang F, Zhang Q, Yang Z, Shah HA. Planning biosynthetic pathways of target molecules based on metabolic reaction prediction and AND-OR tree search. Comput Biol Chem 2024; 111:108106. [PMID: 38833912 DOI: 10.1016/j.compbiolchem.2024.108106] [Received: 08/17/2023] [Revised: 05/06/2024] [Accepted: 05/13/2024] [Indexed: 06/06/2024]
Abstract
The bioretrosynthesis problem is to predict synthetic routes from substrates to given natural products (NPs). However, the huge number of metabolic reactions leads to a combinatorial explosion of the search space, which makes the search highly time-consuming and costly. Here, we propose a framework called BioRetro to predict bioretrosynthesis pathways using a one-step bioretrosynthesis network, termed HybridMLP, combined with AND-OR tree heuristic search. HybridMLP predicts the precursors that produce the target NPs, while the AND-OR tree iteratively generates multi-step biosynthetic pathways. One-step bioretrosynthesis prediction experiments on the MetaNetX dataset show that HybridMLP achieves top-1, top-5, and top-10 accuracies of 46.5%, 74.6%, and 81.6%, respectively. This performance demonstrates the effectiveness of HybridMLP in one-step bioretrosynthesis. In addition, evaluation on two benchmark datasets reveals that BioRetro can significantly improve the speed and success rate of biosynthetic pathway prediction. BioRetro is further shown to find synthetic pathways for compounds such as ginsenoside F1 that use the same substrates as reported but different enzymes, which may be novel candidate enzymes with better catalytic performance.
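The interplay the abstract describes between a one-step predictor and an AND-OR tree can be illustrated with a minimal sketch. It is not the BioRetro implementation: the table `ONE_STEP` stands in for HybridMLP and its scores are invented, the molecule names and `BUILDING_BLOCKS` are hypothetical, and the search is a plain depth-first traversal that tries higher-scored candidates first rather than whatever heuristic the paper uses. A molecule is an OR node (any one reaction that works is enough), and a reaction is an AND node (every precursor must be reachable).

```python
from typing import Dict, List, Optional, Tuple

# A candidate reaction: (score, precursors). All values are illustrative assumptions.
Reaction = Tuple[float, Tuple[str, ...]]
# One solved step: (product, precursors).
Step = Tuple[str, Tuple[str, ...]]

# Hypothetical one-step predictions; in BioRetro this role is played by HybridMLP.
ONE_STEP: Dict[str, List[Reaction]] = {
    "target": [(0.7, ("A", "B")), (0.3, ("C",))],
    "A": [(0.9, ("glucose",))],
    "B": [(0.8, ("X",))],  # X has no known route, so this branch fails
    "C": [(0.6, ("acetyl-CoA", "glucose"))],
}
# Hypothetical set of available starting substrates.
BUILDING_BLOCKS = {"glucose", "acetyl-CoA"}


def solve(molecule: str, depth: int = 5) -> Optional[List[Step]]:
    """Return the reaction steps that make `molecule`, or None if no route is found."""
    if molecule in BUILDING_BLOCKS:
        return []  # a building block needs no further reactions
    if depth == 0:
        return None  # depth limit reached without reaching building blocks
    candidates = sorted(ONE_STEP.get(molecule, []), key=lambda r: r[0], reverse=True)
    # OR node: one successful candidate reaction is sufficient.
    for _score, precursors in candidates:
        steps: List[Step] = [(molecule, precursors)]
        # AND node: every precursor of the reaction must itself be solvable.
        for precursor in precursors:
            sub_route = solve(precursor, depth - 1)
            if sub_route is None:
                break  # one unsolvable precursor invalidates this reaction
            steps.extend(sub_route)
        else:
            return steps
    return None


route = solve("target")
print(route)
```

In this toy table the higher-scored reaction for `target` is abandoned because precursor `B` cannot be traced back to a building block, so the search falls back to the route through `C`. That fallback is the behaviour the AND-OR structure exists to support.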
Affiliation(s)
- Xiaolei Zhang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
- Juan Liu
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
- Feng Yang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
- Qiang Zhang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
- Zhihui Yang
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
- Hayat Ali Shah
  - Institute of Artificial Intelligence, School of Computer Science, Wuhan University, Wuhan, 430072, China
6
Zeng T, Jin Z, Zheng S, Yu T, Wu R. Developing BioNavi for Hybrid Retrosynthesis Planning. JACS Au 2024; 4:2492-2502. [PMID: 39055138 PMCID: PMC11267531 DOI: 10.1021/jacsau.4c00228] [Received: 03/11/2024] [Revised: 06/18/2024] [Accepted: 06/20/2024] [Indexed: 07/27/2024]
Abstract
Illuminating synthetic pathways is essential for producing valuable chemicals, such as bioactive molecules. Chemical and biological syntheses are crucial, and their integration often leads to more efficient and sustainable pathways. Despite the rapid development of retrosynthesis models, few of them consider both chemical and biological syntheses, hindering the pathway design for high-value chemicals. Here, we propose BioNavi by innovating multitask learning and reaction templates into the deep learning-driven model to design hybrid synthesis pathways in a more interpretable manner. BioNavi outperforms existing approaches on different data sets, achieving a 75% hit rate in replicating reported biosynthetic pathways and displaying superior ability in designing hybrid synthesis pathways. Additional case studies further illustrate the potential application of BioNavi in a de novo pathway design. The enhanced web server (http://biopathnavi.qmclab.com/bionavi/) simplifies input operations and implements step-by-step exploration according to user experience. We show that BioNavi is a handy navigator for designing synthetic pathways for various chemicals.
Affiliation(s)
- Tao Zeng
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, P. R. China
- Zhehao Jin
  - Center for Synthetic Biochemistry, CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen 518055, P. R. China
- Shuangjia Zheng
  - Global Institute of Future Technology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
- Tao Yu
  - Center for Synthetic Biochemistry, CAS Key Laboratory of Quantitative Engineering Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences (CAS), Shenzhen 518055, P. R. China
- Ruibo Wu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, P. R. China
7
Martín Lázaro H, Marín Bautista R, Carbonell P. DetSpace: a web server for engineering detectable pathways for bio-based chemical production. Nucleic Acids Res 2024; 52:W476-W480. [PMID: 38634809 PMCID: PMC11223873 DOI: 10.1093/nar/gkae287] [Received: 01/29/2024] [Revised: 03/18/2024] [Accepted: 04/16/2024] [Indexed: 04/19/2024] [Open access]
Abstract
Tackling climate change challenges requires replacing current chemical industrial processes through the rational and sustainable use of biodiversity resources. To that end, production routes to key bio-based chemicals for the bioeconomy have been identified. However, their production still remains inefficient in terms of titers, rates, and yields; because of the hurdles found when scaling up. In order to make production more efficient, strategies like automated screening and dynamic pathway regulation through biosensors have been applied as part of strain optimization. However, to date, no systematic way exists to design a genetic circuit that is responsive to concentrations of a given target compound. Here, the DetSpace web server provides a set of integrated tools that allows a user to select and design a biological circuit that performs the sensing of a molecule of interest by its enzymatic conversion to a detectable molecule through a transcription factor. In that way, the DetSpace web server allows synthetic biologists to easily design biosensing routes for the dynamic regulation of metabolic pathways in applications ranging from genetic circuits design, screening, production, and bioremediation of bio-based chemicals, to diagnostics and drug delivery.
Affiliation(s)
- Hèctor Martín Lázaro
  - Institute of Industrial Control Systems and Computing (AI2), Universitat Politècnica de València (UPV), Camí de Vera s/n, 46022 València, Spain
- Ricardo Marín Bautista
  - Institute of Industrial Control Systems and Computing (AI2), Universitat Politècnica de València (UPV), Camí de Vera s/n, 46022 València, Spain
- Pablo Carbonell
  - Institute of Industrial Control Systems and Computing (AI2), Universitat Politècnica de València (UPV), Camí de Vera s/n, 46022 València, Spain
  - Institute for Integrative Systems Biology I2SysBio, Universitat de València-CSIC, Escardino Street 9, Paterna, 46980 València, Spain
8
Balzerani F, Blasco T, Pérez-Burillo S, Valcarcel LV, Hassoun S, Planes FJ. Extending PROXIMAL to predict degradation pathways of phenolic compounds in the human gut microbiota. NPJ Syst Biol Appl 2024; 10:56. [PMID: 38802371 PMCID: PMC11130242 DOI: 10.1038/s41540-024-00381-1] [Received: 05/15/2023] [Accepted: 05/09/2024] [Indexed: 05/29/2024] [Open access]
Abstract
Despite significant advances in reconstructing genome-scale metabolic networks, the understanding of cellular metabolism remains incomplete for many organisms. A promising approach for elucidating cellular metabolism is analysing the full scope of enzyme promiscuity, which exploits the capacity of enzymes to bind to non-annotated substrates and generate novel reactions. To guide time-consuming costly experimentation, different computational methods have been proposed for exploring enzyme promiscuity. One relevant algorithm is PROXIMAL, which strongly relies on KEGG to define generic reaction rules and link specific molecular substructures with associated chemical transformations. Here, we present a completely new pipeline, PROXIMAL2, which overcomes the dependency on KEGG data. In addition, PROXIMAL2 introduces two relevant improvements with respect to the former version: i) correct treatment of multi-step reactions and ii) tracking of electric charges in the transformations. We compare PROXIMAL and PROXIMAL2 in recovering annotated products from substrates in KEGG reactions, finding a highly significant improvement in the level of accuracy. We then applied PROXIMAL2 to predict degradation reactions of phenolic compounds in the human gut microbiota. The results were compared to RetroPath RL, a different and relevant enzyme promiscuity method. We found a significant overlap between these two methods but also complementary results, which open new research directions into this relevant question in nutrition.
Affiliation(s)
- Francesco Balzerani
  - University of Navarra, Tecnun School of Engineering, Manuel de Lardizábal 13, 20018, San Sebastián, Spain
- Telmo Blasco
  - University of Navarra, Tecnun School of Engineering, Manuel de Lardizábal 13, 20018, San Sebastián, Spain
- Sergio Pérez-Burillo
  - University of Navarra, Tecnun School of Engineering, Manuel de Lardizábal 13, 20018, San Sebastián, Spain
- Luis V Valcarcel
  - University of Navarra, Tecnun School of Engineering, Manuel de Lardizábal 13, 20018, San Sebastián, Spain
  - University of Navarra, Biomedical Engineering Center, Campus Universitario, 31009, Pamplona, Navarra, Spain
  - University of Navarra, Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Campus Universitario, 31080, Pamplona, Spain
- Soha Hassoun
  - Department of Computer Science, Tufts University, Medford, MA, 02155, USA
  - Department of Chemical and Biological Engineering, Tufts University, Medford, MA, 02155, USA
- Francisco J Planes
  - University of Navarra, Tecnun School of Engineering, Manuel de Lardizábal 13, 20018, San Sebastián, Spain
  - University of Navarra, Biomedical Engineering Center, Campus Universitario, 31009, Pamplona, Navarra, Spain
  - University of Navarra, Instituto de Ciencia de los Datos e Inteligencia Artificial (DATAI), Campus Universitario, 31080, Pamplona, Spain
9
Orsi E, Schada von Borzyskowski L, Noack S, Nikel PI, Lindner SN. Automated in vivo enzyme engineering accelerates biocatalyst optimization. Nat Commun 2024; 15:3447. [PMID: 38658554 PMCID: PMC11043082 DOI: 10.1038/s41467-024-46574-4] [Received: 12/21/2023] [Accepted: 03/04/2024] [Indexed: 04/26/2024] [Open access]
Abstract
Achieving cost-competitive bio-based processes requires development of stable and selective biocatalysts. Their realization through in vitro enzyme characterization and engineering is mostly low throughput and labor-intensive. Therefore, strategies for increasing throughput while diminishing manual labor are gaining momentum, such as in vivo screening and evolution campaigns. Computational tools like machine learning further support enzyme engineering efforts by widening the explorable design space. Here, we propose an integrated solution to enzyme engineering challenges whereby ML-guided, automated workflows (including library generation, implementation of hypermutation systems, adapted laboratory evolution, and in vivo growth-coupled selection) could be realized to accelerate pipelines towards superior biocatalysts.
Affiliation(s)
- Enrico Orsi
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800, Kongens Lyngby, Denmark
- Stephan Noack
  - Institute of Bio- and Geosciences, IBG-1: Biotechnology, Forschungszentrum Jülich, 52425, Jülich, Germany
- Pablo I Nikel
  - The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, 2800, Kongens Lyngby, Denmark
- Steffen N Lindner
  - Max Planck Institute of Molecular Plant Physiology, 14476, Potsdam-Golm, Germany
  - Department of Biochemistry, Charité Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität, 10117, Berlin, Germany
10
Wang X, Quinn D, Moody TS, Huang M. ALDELE: All-Purpose Deep Learning Toolkits for Predicting the Biocatalytic Activities of Enzymes. J Chem Inf Model 2024; 64:3123-3139. [PMID: 38573056 PMCID: PMC11040732 DOI: 10.1021/acs.jcim.4c00058] [Received: 01/11/2024] [Revised: 02/15/2024] [Accepted: 03/11/2024] [Indexed: 04/05/2024]
Abstract
Rapidly predicting enzyme properties for catalyzing specific substrates is essential for identifying potential enzymes for industrial transformations. The demand for sustainable production of valuable industry chemicals utilizing biological resources raised a pressing need to speed up biocatalyst screening using machine learning techniques. In this research, we developed an all-purpose deep-learning-based multiple-toolkit (ALDELE) workflow for screening enzyme catalysts. ALDELE incorporates both structural and sequence representations of proteins, alongside representations of ligands by subgraphs and overall physicochemical properties. Comprehensive evaluation demonstrated that ALDELE can predict the catalytic activities of enzymes, and particularly, it identifies residue-based hotspots to guide enzyme engineering and generates substrate heat maps to explore the substrate scope for a given biocatalyst. Moreover, our models notably match empirical data, reinforcing the practicality and reliability of our approach through the alignment with confirmed mutation sites. ALDELE offers a facile and comprehensive solution by integrating different toolkits tailored for different purposes at affordable computational cost and therefore would be valuable to speed up the discovery of new functional enzymes for their exploitation by the industry.
Affiliation(s)
- Xiangwen Wang
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, Belfast BT9 5AG, Northern Ireland, U.K.
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
- Derek Quinn
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
- Thomas S. Moody
  - Department of Biocatalysis and Isotope Chemistry, Almac Sciences, Craigavon BT63 5QD, Northern Ireland, U.K.
  - Arran Chemical Company Limited, Unit 1 Monksland Industrial Estate, Athlone, Co. Roscommon N37 DN24, Ireland
- Meilan Huang
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, Belfast BT9 5AG, Northern Ireland, U.K.
11
Boob AG, Chen J, Zhao H. Enabling pathway design by multiplex experimentation and machine learning. Metab Eng 2024; 81:70-87. [PMID: 38040110 DOI: 10.1016/j.ymben.2023.11.006] [Received: 09/14/2023] [Revised: 11/01/2023] [Accepted: 11/25/2023] [Indexed: 12/03/2023]
Abstract
The remarkable metabolic diversity observed in nature has provided a foundation for sustainable production of a wide array of valuable molecules. However, transferring the biosynthetic pathway to the desired host often runs into inherent failures that arise from intermediate accumulation and reduced flux resulting from competing pathways within the host cell. Moreover, the conventional trial and error methods utilized in pathway optimization struggle to fully grasp the intricacies of installed pathways, leading to time-consuming and labor-intensive experiments, ultimately resulting in suboptimal yields. Considering these obstacles, there is a pressing need to explore the enzyme expression landscape and identify the optimal pathway configuration for enhanced production of molecules. This review delves into recent advancements in pathway engineering, with a focus on multiplex experimentation and machine learning techniques. These approaches play a pivotal role in overcoming the limitations of traditional methods, enabling exploration of a broader design space and increasing the likelihood of discovering optimal pathway configurations for enhanced production of molecules. We discuss several tools and strategies for pathway design, construction, and optimization for sustainable and cost-effective microbial production of molecules ranging from bulk to fine chemicals. We also highlight major successes in academia and industry through compelling case studies.
Affiliation(s)
- Aashutosh Girish Boob
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, United States
- Junyu Chen
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, United States
- Huimin Zhao
  - Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - Department of Bioengineering, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois Urbana-Champaign, Urbana, IL, 61801, United States
  - DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois Urbana-Champaign, Urbana, Illinois 61801, United States
12
Heid E, Probst D, Green WH, Madsen GKH. EnzymeMap: curation, validation and data-driven prediction of enzymatic reactions. Chem Sci 2023; 14:14229-14242. [PMID: 38098707 PMCID: PMC10718068 DOI: 10.1039/d3sc02048g] [Received: 04/20/2023] [Accepted: 11/21/2023] [Indexed: 12/17/2023] [Open access]
Abstract
Enzymatic reactions are an ecofriendly, selective, and versatile addition, sometimes even alternative to organic reactions for the synthesis of chemical compounds such as pharmaceuticals or fine chemicals. To identify suitable reactions, computational models to predict the activity of enzymes on non-native substrates, to perform retrosynthetic pathway searches, or to predict the outcomes of reactions including regio- and stereoselectivity are becoming increasingly important. However, current approaches are substantially hindered by the limited amount of available data, especially if balanced and atom mapped reactions are needed and if the models feature machine learning components. We therefore constructed a high-quality dataset (EnzymeMap) by developing a large set of correction and validation algorithms for recorded reactions in the literature and showcase its significant positive impact on machine learning models of retrosynthesis, forward prediction, and regioselectivity prediction, outperforming previous approaches by a large margin. Our dataset allows for deep learning models of enzymatic reactions with unprecedented accuracy, and is freely available online.
Affiliation(s)
- Esther Heid
- Institute of Materials Chemistry, TU Wien 1060 Vienna Austria
- Department of Chemical Engineering, Massachusetts Institute of Technology Cambridge Massachusetts 02139 USA
| | | | - William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology Cambridge Massachusetts 02139 USA
|
13
|
Ryu G, Kim GB, Yu T, Lee SY. Deep learning for metabolic pathway design. Metab Eng 2023; 80:130-141. [PMID: 37734652 DOI: 10.1016/j.ymben.2023.09.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Revised: 09/17/2023] [Accepted: 09/19/2023] [Indexed: 09/23/2023]
Abstract
The establishment of a bio-based circular economy is imperative in tackling the climate crisis and advancing sustainable development. In this realm, the creation of microbial cell factories is central to generating a variety of chemicals and materials. The design of metabolic pathways is crucial in shaping these microbial cell factories, especially when it comes to producing chemicals with yet-to-be-discovered biosynthetic routes. To aid in navigating the complexities of chemical and metabolic domains, computer-supported tools for metabolic pathway design have emerged. In this paper, we evaluate how digital strategies can be employed for pathway prediction and enzyme discovery. Additionally, we touch upon the recent strides made in using deep learning techniques for metabolic pathway prediction. These computational tools and strategies streamline the design of metabolic pathways, facilitating the development of microbial cell factories. Leveraging the capabilities of deep learning in metabolic pathway design is profoundly promising, potentially hastening the advent of a bio-based circular economy.
Affiliation(s)
- Gahyeon Ryu
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Four), KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, Daejeon, 34141, Republic of Korea
| | - Gi Bae Kim
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Four), KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, Daejeon, 34141, Republic of Korea
| | - Taeho Yu
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Four), KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, Daejeon, 34141, Republic of Korea
| | - Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory, Department of Chemical and Biomolecular Engineering (BK21 Four), KAIST Institute for BioCentury, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, 34141, Republic of Korea; Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, KAIST, Daejeon, 34141, Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon, 34141, Republic of Korea; Graduate School of Engineering Biology, KAIST, Daejeon, 34141, Republic of Korea.
|
14
|
Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023; 22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Accepted: 07/20/2023] [Indexed: 09/13/2023]
Abstract
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.
Affiliation(s)
| | - Katherine R Duncan
- Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
| | - Somayah S Elsayed
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Neha Garg
- School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
| | - Nathaniel I Martin
- Biological Chemistry Group, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Barbara R Terlouw
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Friederike Biermann
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
| | - Kai Blin
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
| | | | - Marina Gorostiola González
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- ONCODE institute, Leiden, The Netherlands
| | - Eric J N Helfrich
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
| | - Florian Huber
- Center for Digitalization and Digitality, Hochschule Düsseldorf, Düsseldorf, Germany
| | - Stefan Leopold-Messer
- Institut für Mikrobiologie, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
| | - Kohulan Rajan
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
| | - Tristan de Rond
- School of Chemical Sciences, University of Auckland, Auckland, New Zealand
| | - Jeffrey A van Santen
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Maria Sorokina
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University, Jena, Germany
- Pharmaceuticals R&D, Bayer AG, Berlin, Germany
| | - Marcy J Balunas
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Mehdi A Beniddir
- Équipe "Chimie des Substances Naturelles", Université Paris-Saclay, CNRS, BioCIS, Orsay, France
| | - Doris A van Bergeijk
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Laura M Carroll
- Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
| | - Chase M Clark
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
| | | | | | - Chao Du
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | | | - Francesca Grisoni
- Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
| | | | - Willem Jespers
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
| | - Olga V Kalinina
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Drug Bioinformatics, Medical Faculty, Saarland University, Homburg, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | | | - Hyunwoo Kim
- College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University Seoul, Goyang-si, Republic of Korea
| | - Tiago F Leao
- Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
| | - Joleen Masschelein
- Center for Microbiology, VIB-KU Leuven, Heverlee, Belgium
- Department of Biology, KU Leuven, Heverlee, Belgium
| | - Evan R Rees
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
| | - Raphael Reher
- Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany
- Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
| | - Daniel Reker
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Duke Microbiome Center, Duke University, Durham, NC, USA
| | - Philippe Schwaller
- Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | | | - Michael A Skinnider
- Adapsyn Bioscience, Hamilton, Ontario, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Allison S Walker
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
| | - Egon L Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | - Barbara Zdrazil
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, UK
| | - Nadine Ziemert
- Interfaculty Institute for Microbiology and Infection Medicine Tuebingen (IMIT), Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany
| | | | - Pierre Guyomard
- Bonsai team, CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Université de Lille, Villeneuve d'Ascq Cedex, France
| | - Andrea Volkamer
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- In silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
| | - Hyun Uk Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
| | - Rolf Müller
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Department of Pharmacy, Saarland University, Saarbrücken, Germany
- German Center for infection research (DZIF), Braunschweig, Germany
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
| | - Gilles P van Wezel
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Netherlands Institute of Ecology, NIOO-KNAW, Wageningen, The Netherlands
| | - Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands.
| | - Anna K H Hirsch
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany.
- Department of Pharmacy, Saarland University, Saarbrücken, Germany.
- German Center for infection research (DZIF), Braunschweig, Germany.
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany.
| | - Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada.
| | - Serina L Robinson
- Department of Environmental Microbiology, Eawag: Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland.
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Institute of Biology, Leiden University, Leiden, The Netherlands.
|
15
|
Merzbacher C, Oyarzún DA. Applications of artificial intelligence and machine learning in dynamic pathway engineering. Biochem Soc Trans 2023; 51:1871-1879. [PMID: 37656433 PMCID: PMC10657174 DOI: 10.1042/bst20221542] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/24/2023] [Revised: 08/07/2023] [Accepted: 08/21/2023] [Indexed: 09/02/2023]
Abstract
Dynamic pathway engineering aims to build metabolic production systems embedded with intracellular control mechanisms for improved performance. These control systems enable host cells to self-regulate the temporal activity of a production pathway in response to perturbations, using a combination of biosensors and feedback circuits for controlling expression of heterologous enzymes. Pathway design, however, requires assembling together multiple biological parts into suitable circuit architectures, as well as careful calibration of the function of each component. This results in a large design space that is costly to navigate through experimentation alone. Methods from artificial intelligence (AI) and machine learning are gaining increasing attention as tools to accelerate the design cycle, owing to their ability to identify hidden patterns in data and rapidly screen through large collections of designs. In this review, we discuss recent developments in the application of machine learning methods to the design of dynamic pathways and their components. We cover recent successes and offer perspectives for future developments in the field. The integration of AI into metabolic engineering pipelines offers great opportunities to streamline design and discover control systems for improved production of high-value chemicals.
Affiliation(s)
| | - Diego A. Oyarzún
- School of Informatics, University of Edinburgh, Edinburgh, U.K
- The Alan Turing Institute, London, U.K
- School of Biological Sciences, University of Edinburgh, Edinburgh, U.K
|
16
|
Trostel L, Coll C, Fenner K, Hafner J. Combining predictive and analytical methods to elucidate pharmaceutical biotransformation in activated sludge. Environ Sci Process Impacts 2023; 25:1322-1336. [PMID: 37539453 DOI: 10.1039/d3em00161j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/05/2023]
Abstract
While man-made chemicals in the environment are ubiquitous and a potential threat to human health and ecosystem integrity, the environmental fate of chemical contaminants such as pharmaceuticals is often poorly understood. Biodegradation processes driven by microbial communities convert chemicals into transformation products (TPs) that may themselves have adverse ecological effects. The detection of TPs formed during biodegradation has been continuously improved thanks to the development of TP prediction algorithms and analytical workflows. Here, we contribute to this advance by (i) reviewing past applications of TP identification workflows, (ii) applying an updated workflow for TP prediction to 42 pharmaceuticals in biodegradation experiments with activated sludge, and (iii) benchmarking 5 different pathway prediction models, comprising 4 prediction models trained on different datasets provided by enviPath, and the state-of-the-art EAWAG pathway prediction system. Using the updated workflow, we could tentatively identify 79 transformation products for 31 pharmaceutical compounds. Compared to previous works, we have further automated several steps that were previously performed by hand. By benchmarking the enviPath prediction system on experimental data, we demonstrate the usefulness of the pathway prediction tool for generating suspect lists for screening, and we propose new avenues to improve its accuracy. Moreover, we provide a well-documented workflow that can be (i) readily applied to detect transformation products in activated sludge and (ii) potentially extended to other environmental studies.
Affiliation(s)
- Leo Trostel
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
| | - Claudia Coll
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
| | - Kathrin Fenner
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
- Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland
| | - Jasmin Hafner
- Department of Environmental Chemistry, Swiss Federal Institute of Aquatic Science and Technology (Eawag), Dübendorf, 8600, Zürich, Switzerland.
- Department of Chemistry, University of Zürich, 8057 Zürich, Switzerland
|
17
|
Wang X, Mohsin A, Sun Y, Li C, Zhuang Y, Wang G. From Spatial-Temporal Multiscale Modeling to Application: Bridging the Valley of Death in Industrial Biotechnology. Bioengineering (Basel) 2023; 10:744. [PMID: 37370675 DOI: 10.3390/bioengineering10060744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2023] [Revised: 06/13/2023] [Accepted: 06/19/2023] [Indexed: 06/29/2023] Open
Abstract
The Valley of Death poses a significant challenge to the commercialization of products in industrial biotechnology. Fortunately, with the integration of computation, automation and artificial intelligence (AI) technology, industrial biotechnology is crossing the Valley of Death at an accelerating pace. The Fourth Industrial Revolution (Industry 4.0) has spurred the development of intelligent biomanufacturing, which has reshaped industrial structures in line with the worldwide trend. To achieve this, intelligent biomanufacturing can be structured into three main parts, digitalization, modeling and intellectualization, with modeling forming a crucial link between the other two components. This paper provides an overview of mechanistic models, data-driven models and their applications in bioprocess development. We provide a detailed elaboration of the hybrid model and its applications in bioprocess engineering, including strain design, process control and optimization, as well as bioreactor scale-up. Finally, the challenges and opportunities of biomanufacturing towards Industry 4.0 are also discussed.
Affiliation(s)
- Xueting Wang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology (ECUST), Shanghai 200237, China
| | - Ali Mohsin
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology (ECUST), Shanghai 200237, China
| | - Yifei Sun
- Key Laboratory of Smart Manufacturing in Energy Chemical Process, Ministry of Education, East China University of Science and Technology (ECUST), Shanghai 200237, China
| | - Chao Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology (ECUST), Shanghai 200237, China
| | - Yingping Zhuang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology (ECUST), Shanghai 200237, China
| | - Guan Wang
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology (ECUST), Shanghai 200237, China
|
18
|
Tan Z, Li J, Hou J, Gonzalez R. Designing artificial pathways for improving chemical production. Biotechnol Adv 2023; 64:108119. [PMID: 36764336 DOI: 10.1016/j.biotechadv.2023.108119] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/05/2022] [Revised: 02/01/2023] [Accepted: 02/06/2023] [Indexed: 02/11/2023]
Abstract
Metabolic engineering exploits manipulation of catalytic and regulatory elements to improve a specific function of the host cell, often the synthesis of chemicals of interest. Although naturally occurring pathways are significant resources for metabolic engineering, these pathways are frequently inefficient and suffer from a series of inherent drawbacks. Designing artificial pathways in a rational manner provides a promising alternative for chemical production. However, the entry barrier to designing artificial pathways is relatively high, as it requires a comprehensive and deep understanding of physical, chemical and biological principles. In addition, the designed artificial pathways frequently suffer from low efficiencies, which impair their further application in host cells. Here, we illustrate the concept and basic workflow of retrobiosynthesis in designing artificial pathways, as well as the most commonly used methods, including knowledge- and computer-based approaches. Then, we discuss how to obtain desired enzymes for novel biochemistries, and how to trim the initially designed artificial pathways to further improve their functionality. Finally, we summarize the current applications of artificial pathways, from feedstock utilization to the synthesis of various products, as well as our future perspectives on designing artificial pathways.
Affiliation(s)
- Zaigao Tan
- State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, Shanghai, China; School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Department of Bioengineering, Shanghai Jiao Tong University, Shanghai, China.
| | - Jian Li
- State Key Laboratory of Microbial Metabolism, Shanghai Jiao Tong University, Shanghai, China; School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China; Department of Bioengineering, Shanghai Jiao Tong University, Shanghai, China
| | - Jin Hou
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, China
| | - Ramon Gonzalez
- Department of Chemical, Biological, and Materials Engineering, University of South Florida, Tampa, FL, USA.
|
19
|
Recent progress in the synthesis of advanced biofuel and bioproducts. Curr Opin Biotechnol 2023; 80:102913. [PMID: 36854202 DOI: 10.1016/j.copbio.2023.102913] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/17/2022] [Revised: 01/20/2023] [Accepted: 01/30/2023] [Indexed: 02/27/2023]
Abstract
Energy is one of the most complex fields of study and an issue that influences nearly every aspect of modern life. Over the past century, combustion of fossil fuels, particularly in the transportation sector, has been the dominant form of energy release. Refining of petroleum and natural gas into liquid transportation fuels is also the centerpiece of the modern chemical industry used to produce materials, solvents, and other consumer goods. In the face of global climate change, the world is searching for alternative, sustainable means of producing energy carriers and chemical building blocks. The use of biofuels in engines predates modern refinery optimization and today represents a small but significant fraction of liquid transportation fuels burnt each year. Similarly, white biotechnology has been used to produce many natural products through fermentation. The evolution of recombinant DNA technology into modern synthetic biology has expanded the scope of biofuels and bioproducts that can be made by biocatalysts. This opinion examines the current trends in this research space, highlighting the substantial growth in computational tools and the growing influence of renewable electricity in the design of metabolic engineering strategies. In short, advanced biofuel and bioproduct synthesis remains a vibrant and critically important field of study whose focus is shifting away from the conversion of lignocellulosic biomass toward a broader consideration of how to reduce carbon dioxide to fuels and chemical products.
|
20
|
Yu T, Boob AG, Volk MJ, Liu X, Cui H, Zhao H. Machine learning-enabled retrobiosynthesis of molecules. Nat Catal 2023. [DOI: 10.1038/s41929-022-00909-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
|
21
|
Lim PK, Julca I, Mutwil M. Redesigning plant specialized metabolism with supervised machine learning using publicly available reactome data. Comput Struct Biotechnol J 2023; 21:1639-1650. [PMID: 36874159 PMCID: PMC9976193 DOI: 10.1016/j.csbj.2023.01.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/27/2022] [Revised: 01/12/2023] [Accepted: 01/12/2023] [Indexed: 01/19/2023] Open
Abstract
The immense structural diversity of products and intermediates of plant specialized metabolism (specialized metabolites) makes them rich sources of therapeutic medicine, nutrients, and other useful materials. With the rapid accumulation of reactome data accessible in biological and chemical databases, along with recent advances in machine learning, this review outlines how supervised machine learning can be used to design new compounds and pathways by exploiting the wealth of these data. We first examine the various sources from which reactome data can be obtained, then explain the different machine learning encoding methods for reactome data. We then discuss current supervised machine learning developments that can be employed to help redesign plant specialized metabolism.
Affiliation(s)
- Peng Ken Lim
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Irene Julca
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
| | - Marek Mutwil
- School of Biological Sciences, Nanyang Technological University, Singapore, Singapore
|
22
|
Qin Y, Li Q, Fan L, Ning X, Wei X, You C. Biomanufacturing by In Vitro Biotransformation (ivBT) Using Purified Cascade Multi-enzymes. Adv Biochem Eng Biotechnol 2023; 186:1-27. [PMID: 37455283 DOI: 10.1007/10_2023_231] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2023]
Abstract
In vitro biotransformation (ivBT) refers to the use of an artificial biological reaction system that employs purified enzymes for the one-pot conversion of low-cost materials into biocommodities such as ethanol, organic acids, and amino acids. Unshackled from cell growth and metabolism, ivBT exhibits distinct advantages compared with metabolic engineering, including but not limited to high engineering flexibility, ease of operation, fast reaction rate, high product yields, and good scalability. These characteristics position ivBT as a promising next-generation biomanufacturing platform. Nevertheless, challenges persist in the enhancement of bulk enzyme preparation methods, the acquisition of enzymes with superior catalytic properties, and the development of sophisticated approaches for pathway design and system optimization. In alignment with the workflow of ivBT development, this chapter presents a systematic introduction to pathway design, enzyme mining and engineering, system construction, and system optimization. The chapter also proffers perspectives on ivBT development.
Affiliation(s)
- Yanmei Qin
- University of Chinese Academy of Sciences, Beijing, China
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Qiangzi Li
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Lin Fan
- University of Chinese Academy of Sciences, Beijing, China
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
- University of Chinese Academy of Sciences Sino-Danish College, Beijing, China
| | - Xiao Ning
- University of Chinese Academy of Sciences, Beijing, China
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China
| | - Xinlei Wei
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China.
- National Technology Innovation Center of Synthetic Biology, Tianjin, China.
| | - Chun You
- In Vitro Synthetic Biology Center, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, China.
- National Technology Innovation Center of Synthetic Biology, Tianjin, China.
|
23
|
Walther D. Specifics of Metabolite-Protein Interactions and Their Computational Analysis and Prediction. Methods Mol Biol 2023; 2554:179-197. [PMID: 36178627 DOI: 10.1007/978-1-0716-2624-5_12] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/16/2023]
Abstract
Computational approaches to the characterization and prediction of compound-protein interactions have a long research history and are well established, driven primarily by the needs of drug development. While, in principle, many of the computational methods developed in the context of drug development can also be applied directly to the investigation of metabolite-protein interactions, the interactions of metabolites with proteins (enzymes) are characterized by a number of particularities that result from their natural evolutionary origin and their biological and biochemical roles, as well as from a different problem setting when investigating them. In this review, these special aspects are highlighted, and recent research on them is presented together with the computational approaches developed and the available resources. They concern, among others, binding promiscuity, allostery, the role of posttranslational modifications, molecular steering and crowding effects, and metabolic conversion rate predictions. Recent breakthroughs in the field of protein structure prediction and newly developed machine learning techniques are discussed as a tremendous opportunity for developing a more detailed molecular understanding of metabolism.
Affiliation(s)
- Dirk Walther
- Max Planck Institute of Molecular Plant Physiology, Potsdam, Germany.
| |
|
24
|
Patra P, B R D, Kundu P, Das M, Ghosh A. Recent advances in machine learning applications in metabolic engineering. Biotechnol Adv 2023; 62:108069. [PMID: 36442697 DOI: 10.1016/j.biotechadv.2022.108069] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 07/17/2022] [Revised: 10/18/2022] [Accepted: 11/22/2022] [Indexed: 11/27/2022]
Abstract
Metabolic engineering encompasses several widely-used strategies, which currently hold a high seat in the field of biotechnology when its potential is manifesting through a plethora of research and commercial products with a strong societal impact. The genomic revolution that occurred almost three decades ago has initiated the generation of large omics-datasets which has helped in gaining a better understanding of cellular behavior. The itinerary of metabolic engineering that has occurred based on these large datasets has allowed researchers to gain detailed insights and a reasonable understanding of the intricacies of biosystems. However, the existing trial-and-error approaches for metabolic engineering are laborious and time-intensive when it comes to the production of target compounds with high yields through genetic manipulations in host organisms. Machine learning (ML) coupled with the available metabolic engineering test instances and omics data brings a comprehensive and multidisciplinary approach that enables scientists to evaluate various parameters for effective strain design. This vast amount of biological data should be standardized through knowledge engineering to train different ML models for providing accurate predictions in gene circuit design, modification of proteins, optimization of bioprocess parameters for scaling up, and screening of hyper-producing robust cell factories. This review briefs on the premise of ML, followed by mentioning various ML methods and algorithms alongside the numerous omics datasets available to train ML models for predicting metabolic outcomes with high accuracy. The combinative interplay between the ML algorithms and biological datasets through knowledge engineering has guided the recent advancements in applications such as CRISPR/Cas systems, gene circuits, protein engineering, metabolic pathway reconstruction, and bioprocess engineering. Finally, this review addresses the probable challenges of applying ML in metabolic engineering which will guide the researchers toward novel techniques to overcome the limitations.
Affiliation(s)
- Pradipta Patra
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Disha B R
- B.M.S College of Engineering, Basavanagudi, Bengaluru, Karnataka 560019, India
| | - Pritam Kundu
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Manali Das
- School of Bioscience, Indian Institute of Technology Kharagpur, West Bengal 721302, India
| | - Amit Ghosh
- School of Energy Science and Engineering, Indian Institute of Technology Kharagpur, West Bengal 721302, India; P.K. Sinha Centre for Bioenergy and Renewables, Indian Institute of Technology Kharagpur, West Bengal 721302, India.
| |
|
25
|
Volk MJ, Tran VG, Tan SI, Mishra S, Fatma Z, Boob A, Li H, Xue P, Martin TA, Zhao H. Metabolic Engineering: Methodologies and Applications. Chem Rev 2022; 123:5521-5570. [PMID: 36584306 DOI: 10.1021/acs.chemrev.2c00403] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Indexed: 12/31/2022]
Abstract
Metabolic engineering aims to improve the production of economically valuable molecules through the genetic manipulation of microbial metabolism. While the discipline is a little over 30 years old, advancements in metabolic engineering have given way to industrial-level molecule production benefitting multiple industries such as chemical, agriculture, food, pharmaceutical, and energy industries. This review describes the design, build, test, and learn steps necessary for leading a successful metabolic engineering campaign. Moreover, we highlight major applications of metabolic engineering, including synthesizing chemicals and fuels, broadening substrate utilization, and improving host robustness with a focus on specific case studies. Finally, we conclude with a discussion on perspectives and future challenges related to metabolic engineering.
Affiliation(s)
- Michael J Volk
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Vinh G Tran
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Shih-I Tan
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Department of Chemical Engineering, National Cheng Kung University, Tainan 70101, Taiwan
| | - Shekhar Mishra
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Zia Fatma
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Aashutosh Boob
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Hongxiang Li
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Pu Xue
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Teresa A Martin
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Huimin Zhao
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States; Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| |
|
26
|
Merging enzymatic and synthetic chemistry with computational synthesis planning. Nat Commun 2022; 13:7747. [PMID: 36517480 PMCID: PMC9750992 DOI: 10.1038/s41467-022-35422-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 11/30/2022] [Indexed: 12/15/2022] Open
Abstract
Synthesis planning programs trained on chemical reaction data can design efficient routes to new molecules of interest, but are limited in their ability to leverage rare chemical transformations. This challenge is acute for enzymatic reactions, which are valuable due to their selectivity and sustainability but are few in number. We report a retrosynthetic search algorithm using two neural network models for retrosynthesis (one covering 7984 enzymatic transformations and one covering 163,723 synthetic transformations) that balances the exploration of enzymatic and synthetic reactions to identify hybrid synthesis plans. This approach extends the space of retrosynthetic moves by thousands of uniquely enzymatic one-step transformations, discovers routes to molecules for which synthetic or enzymatic searches find none, and designs shorter routes for others. Application to (-)-Δ9 tetrahydrocannabinol (THC) (dronabinol) and R,R-formoterol (arformoterol) illustrates how our strategy facilitates the replacement of metal catalysis, high step counts, or costly enantiomeric resolution with more elegant hybrid proposals.
|
27
|
Duong-Trung N, Born S, Kim JW, Schermeyer MT, Paulick K, Borisyak M, Cruz-Bournazou MN, Werner T, Scholz R, Schmidt-Thieme L, Neubauer P, Martinez E. When Bioprocess Engineering Meets Machine Learning: A Survey from the Perspective of Automated Bioprocess Development. Biochem Eng J 2022. [DOI: 10.1016/j.bej.2022.108764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022]
|
28
|
The automated Galaxy-SynBioCAD pipeline for synthetic biology design and engineering. Nat Commun 2022; 13:5082. [PMID: 36038542 PMCID: PMC9424320 DOI: 10.1038/s41467-022-32661-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 03/04/2022] [Accepted: 08/11/2022] [Indexed: 11/27/2022] Open
Abstract
Here we introduce the Galaxy-SynBioCAD portal, a toolshed for synthetic biology, metabolic engineering, and industrial biotechnology. The tools and workflows currently shared on the portal enable one to build libraries of strains producing desired chemical targets, covering an end-to-end metabolic pathway design and engineering process from the selection of strains and targets, through the design of DNA parts to be assembled, to the generation of scripts driving liquid handlers for plasmid assembly and strain transformations. Standard formats like SBML and SBOL are used throughout to enforce the compatibility of the tools. In a study carried out at four different sites, we illustrate the link between pathway design and engineering with the building of a library of E. coli lycopene-producing strains. We also benchmark our workflows on literature and expert validated pathways. Overall, we find an 83% success rate in retrieving the validated pathways among the top 10 pathways generated by the workflows.
|
29
|
Cho JS, Kim GB, Eun H, Moon CW, Lee SY. Designing Microbial Cell Factories for the Production of Chemicals. JACS Au 2022; 2:1781-1799. [PMID: 36032533 PMCID: PMC9400054 DOI: 10.1021/jacsau.2c00344] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 06/09/2022] [Revised: 07/26/2022] [Accepted: 07/26/2022] [Indexed: 05/24/2023]
Abstract
The sustainable production of chemicals from renewable, nonedible biomass has emerged as an essential alternative to address pressing environmental issues arising from our heavy dependence on fossil resources. Microbial cell factories are engineered microorganisms harboring biosynthetic pathways streamlined to produce chemicals of interest from renewable carbon sources. The biosynthetic pathways for the production of chemicals can be classified into three categories with reference to the microbial host selected for engineering: native-existing pathways, nonnative-existing pathways, and nonnative-created pathways. Recent trends in leveraging native-existing pathways, discovering nonnative-existing pathways, and designing de novo pathways (as nonnative-created pathways) are discussed in this Perspective. We highlight key approaches and successful case studies that exemplify these concepts. Once these pathways are designed and constructed in the microbial cell factory, systems metabolic engineering strategies can be used to improve the performance of the strain to meet industrial production standards. In the second part of the Perspective, current trends in design tools and strategies for systems metabolic engineering are discussed with an eye toward the future. Finally, we survey current and future challenges that need to be addressed to advance microbial cell factories for the sustainable production of chemicals.
Affiliation(s)
- Jae Sung Cho
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Gi Bae Kim
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Hyunmin Eun
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Cheon Woo Moon
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| | - Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- BioProcess Engineering Research Center and BioInformatics Research Center, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
| |
|
30
|
Yang D, Eun H, Prabowo CPS, Cho S, Lee SY. Metabolic and cellular engineering for the production of natural products. Curr Opin Biotechnol 2022; 77:102760. [PMID: 35908315 DOI: 10.1016/j.copbio.2022.102760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Revised: 06/14/2022] [Accepted: 06/30/2022] [Indexed: 11/25/2022]
Abstract
Increased awareness of the environmental and health concerns of consuming chemically synthesized products has led to a rising demand for natural products that are greener and more sustainable. Despite their importance, however, industrial-scale production of natural products has been challenging due to the low yield and high cost of the bioprocesses. To cope with this problem, systems metabolic engineering has been employed to efficiently produce natural products from renewable biomass. Here, we review the recent systems metabolic engineering strategies employed for enhanced production of value-added natural products, together with accompanying examples. Particular focus is set on systems-level engineering and cell physiology engineering strategies. Future perspectives are also discussed.
Affiliation(s)
- Dongsoo Yang
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea.
| | - Hyunmin Eun
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea
| | - Cindy Pricilia Surya Prabowo
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea
| | - Sumin Cho
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea
| | - Sang Yup Lee
- Metabolic and Biomolecular Engineering National Research Laboratory and Systems Metabolic Engineering and Systems Healthcare Cross-Generation Collaborative Laboratory, Department of Chemical and Biomolecular Engineering (BK21 four), Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea; KAIST Institute for the BioCentury and KAIST Institute for Artificial Intelligence, KAIST, Daejeon 34141, Republic of Korea; BioProcess Engineering Research Center and BioInformatics Research Center, KAIST, Daejeon 34141, Republic of Korea.
| |
|
31
|
Prediction of degradation pathways of phenolic compounds in the human gut microbiota through enzyme promiscuity methods. NPJ Syst Biol Appl 2022; 8:24. [PMID: 35831427 PMCID: PMC9279433 DOI: 10.1038/s41540-022-00234-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/09/2021] [Accepted: 06/20/2022] [Indexed: 11/08/2022] Open
Abstract
The relevance of phenolic compounds in the human diet has increased in recent years, particularly due to their role as natural antioxidants and chemopreventive agents in different diseases. In the human body, phenolic compounds are mainly metabolized by the gut microbiota; however, their metabolism is not well represented in public databases and existing reconstructions. In a previous work, using different sources of knowledge, bioinformatic and modelling tools, we developed AGREDA, an extended metabolic network more amenable to analyze the interaction of the human gut microbiota with diet. Despite the substantial improvement achieved by AGREDA, it was not sufficient to represent the diverse metabolic space of phenolic compounds. In this article, we make use of an enzyme promiscuity approach to complete further the metabolism of phenolic compounds in the human gut microbiota. In particular, we apply RetroPath RL, a previously developed approach based on Monte Carlo Tree Search strategy reinforcement learning, in order to predict the degradation pathways of compounds present in Phenol-Explorer, the largest database of phenolic compounds in the literature. Reactions predicted by RetroPath RL were integrated with AGREDA, leading to a more complete version of the human gut microbiota metabolic network. We assess the impact of our improvements in the metabolic processing of various foods, finding previously undetected connections with output microbial metabolites. By means of untargeted metabolomics data, we present in vitro experimental validation for output microbial metabolites released in the fermentation of lentils with feces of children representing different clinical conditions.
|
32
|
Xu Z, Mahadevan R. Efficient Enumeration of Branched Novel Biochemical Pathways Using a Probabilistic Technique. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.1c02211] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Zhiqing Xu
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario M5S 3E5, Canada
| | - Radhakrishnan Mahadevan
- Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, Ontario M5S 3E5, Canada
- Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario M5S 3G9, Canada
| |
|
33
|
Shi Z, Liu P, Liao X, Mao Z, Zhang J, Wang Q, Sun J, Ma H, Ma Y. Data-Driven Synthetic Cell Factories Development for Industrial Biomanufacturing. BioDesign Research 2022; 2022:9898461. [PMID: 37850146 PMCID: PMC10521697 DOI: 10.34133/2022/9898461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 05/26/2022] [Indexed: 10/19/2023] Open
Abstract
Revolutionary breakthroughs in artificial intelligence (AI) and machine learning (ML) have had a profound impact on a wide range of scientific disciplines, including the development of artificial cell factories for biomanufacturing. In this paper, we review the latest studies on the application of data-driven methods for the design of new proteins, pathways, and strains. We first briefly introduce the various types of data and databases relevant to industrial biomanufacturing, which are the basis for data-driven research. Different types of algorithms, including traditional ML and more recent deep learning methods, are also presented. We then demonstrate how these data-based approaches can be applied to address various issues in cell factory development using examples from recent studies, including the prediction of protein function, improvement of metabolic models, and estimation of missing kinetic parameters, design of non-natural biosynthesis pathways, and pathway optimization. In the last section, we discuss the current limitations of these data-driven approaches and propose that data-driven methods should be integrated with mechanistic models to complement each other and facilitate the development of synthetic strains for industrial biomanufacturing.
Affiliation(s)
- Zhenkun Shi
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Pi Liu
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Xiaoping Liao
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Zhitao Mao
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Jianqi Zhang
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Qinhong Wang
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Jibin Sun
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Hongwu Ma
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| | - Yanhe Ma
- Key Laboratory of Systems Microbial Technology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, China
- National Technology Innovation Center of Synthetic Biology, Tianjin 300308, China
| |
|
34
|
Zheng S, Zeng T, Li C, Chen B, Coley CW, Yang Y, Wu R. Deep learning driven biosynthetic pathways navigation for natural products with BioNavi-NP. Nat Commun 2022; 13:3342. [PMID: 35688826 PMCID: PMC9187661 DOI: 10.1038/s41467-022-30970-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Received: 10/07/2021] [Accepted: 05/27/2022] [Indexed: 12/30/2022] Open
Abstract
The complete biosynthetic pathways are unknown for most natural products (NPs); it is thus valuable to make computer-aided bio-retrosynthesis predictions. Here, a navigable and user-friendly toolkit, BioNavi-NP, is developed to predict the biosynthetic pathways for both NPs and NP-like compounds. First, a single-step bio-retrosynthesis prediction model is trained using both general organic and biosynthetic reactions through end-to-end transformer neural networks. Based on this model, plausible biosynthetic pathways can be efficiently sampled through an AND-OR tree-based planning algorithm from iterative multi-step bio-retrosynthetic routes. Extensive evaluations reveal that BioNavi-NP can identify biosynthetic pathways for 90.2% of 368 test compounds and recover the reported building blocks as in the test set for 72.8%, 1.7 times more accurate than existing conventional rule-based approaches. The model is further shown to identify biologically plausible pathways for complex NPs collected from the recent literature. The toolkit as well as the curated datasets and learned models are freely available to facilitate the elucidation and reconstruction of the biosynthetic pathways for NPs. The complete biosynthetic pathways of most natural products (NPs) are unknown. Here, the authors report BioNavi-NP, a computational toolkit for bio-retrosynthetic pathway elucidation or reconstruction for both NPs and NP-like compounds.
Affiliation(s)
- Shuangjia Zheng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China; School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China; Galixir, Beijing, China
| | - Tao Zeng
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China
| | | | - Binghong Chen
- College of Computing, Georgia Institute of Technology, Atlanta, GA, USA
| | - Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, 510006, China.
| | - Ruibo Wu
- School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou, 510006, China.
| |
|
35
|
Sabzevari M, Szedmak S, Penttilä M, Jouhten P, Rousu J. Strain design optimization using reinforcement learning. PLoS Comput Biol 2022; 18:e1010177. [PMID: 35658018 PMCID: PMC9200333 DOI: 10.1371/journal.pcbi.1010177] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 06/15/2022] [Accepted: 05/06/2022] [Indexed: 11/18/2022] Open
Abstract
Engineered microbial cells present a sustainable alternative to fossil-based synthesis of chemicals and fuels. Cellular synthesis routes are readily assembled and introduced into microbial strains using state-of-the-art synthetic biology tools. However, the optimization of the strains required to reach industrially feasible production levels is far less efficient. It typically relies on trial-and-error leading into high uncertainty in total duration and cost. New techniques that can cope with the complexity and limited mechanistic knowledge of the cellular regulation are called for guiding the strain optimization.
In this paper, we put forward a multi-agent reinforcement learning (MARL) approach that learns from experiments to tune the metabolic enzyme levels so that the production is improved. Our method is model-free and does not assume prior knowledge of the microbe’s metabolic network or its regulation. The multi-agent approach is well-suited to make use of parallel experiments such as multi-well plates commonly used for screening microbial strains.
We demonstrate the method’s capabilities using the genome-scale kinetic model of Escherichia coli, k-ecoli457, as a surrogate for an in vivo cell behaviour in cultivation experiments. We investigate the method’s performance relevant for practical applicability in strain engineering i.e. the speed of convergence towards the optimum response, noise tolerance, and the statistical stability of the solutions found. We further evaluate the proposed MARL approach in improving L-tryptophan production by yeast Saccharomyces cerevisiae, using publicly available experimental data on the performance of a combinatorial strain library.
Overall, our results show that multi-agent reinforcement learning is a promising approach for guiding the strain optimization beyond mechanistic knowledge, with the goal of faster and more reliably obtaining industrially attractive production levels.
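The idea in this abstract, several agents that each tune one enzyme level and learn only from a shared measured production value, can be illustrated with a minimal sketch. This is not the authors' MARL algorithm: it uses independent epsilon-greedy learners, and the level grid `LEVELS`, the response surface `toy_production`, and every parameter value are hypothetical stand-ins for a real cultivation experiment or the k-ecoli457 surrogate.

```python
import random

# Hypothetical relative expression levels each agent can choose from.
LEVELS = [0.5, 1.0, 2.0]


def toy_production(levels):
    """Hypothetical response surface standing in for a measured titre.

    The maximum (0.0) is reached at levels (2.0, 0.5, 1.0).
    """
    target = (2.0, 0.5, 1.0)
    return -sum((x - t) ** 2 for x, t in zip(levels, target))


class Agent:
    """One agent controls one enzyme level and keeps a running mean reward per level."""

    def __init__(self, rng, epsilon=0.2):
        self.rng = rng
        self.epsilon = epsilon
        self.value = [0.0] * len(LEVELS)
        self.count = [0] * len(LEVELS)

    def act(self):
        # Explore at random with probability epsilon, or while any level is untried.
        if self.rng.random() < self.epsilon or 0 in self.count:
            return self.rng.randrange(len(LEVELS))
        # Otherwise exploit the level with the best mean reward so far.
        return max(range(len(LEVELS)), key=lambda i: self.value[i])

    def learn(self, action, reward):
        # Incremental update of the mean reward observed for this level.
        self.count[action] += 1
        self.value[action] += (reward - self.value[action]) / self.count[action]


def optimise(n_enzymes=3, rounds=200, seed=0):
    """Model-free loop: agents act, one shared reward is observed, agents update."""
    rng = random.Random(seed)
    agents = [Agent(rng) for _ in range(n_enzymes)]
    best_levels, best_reward = None, float("-inf")
    for _ in range(rounds):
        actions = [agent.act() for agent in agents]
        levels = tuple(LEVELS[i] for i in actions)
        reward = toy_production(levels)  # the only feedback the agents receive
        for agent, action in zip(agents, actions):
            agent.learn(action, reward)
        if reward > best_reward:
            best_levels, best_reward = levels, reward
    return best_levels, best_reward


best_levels, best_reward = optimise()
print(best_levels, best_reward)
```

The agents never see the form of `toy_production`, which mirrors the model-free setting described above. The sketch runs one experiment per round and does not model the parallel multi-well experiments the abstract mentions.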
Affiliation(s)
- Maryam Sabzevari
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Sandor Szedmak
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Merja Penttilä
- VTT Technical Research Centre of Finland Ltd, Espoo, Finland
| | - Paula Jouhten
- VTT Technical Research Centre of Finland Ltd, Espoo, Finland
| | - Juho Rousu
- Department of Computer Science, Aalto University, Espoo, Finland
|
36
|
Liao X, Ma H, Tang YJ. Artificial intelligence: a solution to involution of design–build–test–learn cycle. Curr Opin Biotechnol 2022; 75:102712. [DOI: 10.1016/j.copbio.2022.102712] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/22/2021] [Revised: 02/05/2022] [Accepted: 03/01/2022] [Indexed: 01/08/2023]
|
37
|
Sankaranarayanan K, Heid E, Coley CW, Verma D, Green WH, Jensen KF. Similarity based enzymatic retrosynthesis. Chem Sci 2022; 13:6039-6053. [PMID: 35685792 PMCID: PMC9132021 DOI: 10.1039/d2sc01588a] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/18/2022] [Accepted: 04/26/2022] [Indexed: 11/29/2022]
Abstract
Enzymes synthesize complex natural products effortlessly by catalyzing chemo-, regio-, and enantio-selective transformations. Further, biocatalytic processes are increasingly replacing conventional organic synthesis steps because they use mild solvents, avoid the use of metals, and reduce overall non-biodegradable waste. Here, we present a single-step retrosynthesis search algorithm to facilitate enzymatic synthesis of natural product analogs. First, we develop a tool, RDEnzyme, capable of extracting and applying stereochemically consistent enzymatic reaction templates, i.e., subgraph patterns that describe the changes in connectivity between a product molecule and its corresponding reactant(s). Using RDEnzyme, we demonstrate that molecular similarity is an effective metric to propose retrosynthetic disconnections based on analogy to precedent enzymatic reactions in UniProt/RHEA. Using ∼5500 reactions from RHEA as a knowledge base, the recorded reactants to the product are among the top 10 proposed suggestions in 71% of ∼700 test reactions. Second, we trained a statistical model capable of discriminating between reaction pairs belonging to homologous enzymes and evolutionarily distant enzymes using ∼30 000 reaction pairs from SwissProt as a knowledge base. This model is capable of understanding patterns in enzyme promiscuity to evaluate the likelihood of experimental evolution success. By recursively applying the similarity-based single-step retrosynthesis and evolution prediction workflow, we successfully plan the enzymatic synthesis routes for both active pharmaceutical ingredients (e.g. Islatravir, Molnupiravir) and commodity chemicals (e.g. 1,4-butanediol, branched-chain higher alcohols/biofuels), in a retrospective fashion. 
Through the development and demonstration of the single-step enzymatic retrosynthesis strategy using natural transformations, our approach provides a first step towards solving the challenging problem of incorporating both enzyme- and organic-chemistry based transformations into a computer aided synthesis planning workflow.
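The ranking step described in this abstract, proposing disconnections by analogy to the precedent reactions whose products most resemble the target, can be sketched as follows. This is an illustration only and does not use RDEnzyme or its API: `toy_fingerprint` (character bigrams of a SMILES string) is a deliberately crude, hypothetical stand-in for a real molecular fingerprint, the three-entry `KNOWLEDGE_BASE` is invented, and the template extraction and application steps of the published workflow are omitted.

```python
from typing import NamedTuple


class Precedent(NamedTuple):
    product: str    # SMILES of the recorded product
    reactants: str  # SMILES of the recorded reactant(s)


def toy_fingerprint(smiles, n=2):
    """Set of character n-grams; a toy stand-in for a real molecular fingerprint."""
    return frozenset(smiles[i:i + n] for i in range(len(smiles) - n + 1))


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_precedents(target, knowledge_base, top_k=10):
    """Return the top_k (similarity, precedent) pairs, most similar product first."""
    target_fp = toy_fingerprint(target)
    scored = [
        (tanimoto(target_fp, toy_fingerprint(p.product)), p)
        for p in knowledge_base
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Invented three-reaction knowledge base (product, reactant).
KNOWLEDGE_BASE = [
    Precedent(product="CCO", reactants="CC=O"),            # ethanol from acetaldehyde
    Precedent(product="c1ccccc1O", reactants="c1ccccc1"),  # phenol from benzene
    Precedent(product="CC(=O)O", reactants="CC=O"),        # acetate from acetaldehyde
]

# Target: propanoic acid. The acetate precedent should rank first.
ranked = rank_precedents("CCC(=O)O", KNOWLEDGE_BASE, top_k=2)
for score, precedent in ranked:
    print(f"{score:.2f}  {precedent.product}  <-  {precedent.reactants}")
```

With bigram features the acetate precedent scores 1.0 against propanoic acid even though the molecules differ, which shows why a real workflow would use a chemically meaningful fingerprint and then apply the precedent's reaction template to the target.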
Affiliation(s)
- Karthik Sankaranarayanan
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge Massachusetts 02139 USA
| | - Esther Heid
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge Massachusetts 02139 USA
- Institute of Materials Chemistry, TU Wien 1060 Vienna Austria
| | - Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge Massachusetts 02139 USA
| | - Deeptak Verma
- Computational and Structural Chemistry, Discovery Chemistry, Merck & Co., Inc. Kenilworth NJ 07033 USA
| | - William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge Massachusetts 02139 USA
| | - Klavs F Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology 77 Massachusetts Avenue Cambridge Massachusetts 02139 USA
|
38
|
Kellman BP, Richelle A, Yang JY, Chapla D, Chiang AWT, Najera JA, Liang C, Fürst A, Bao B, Koga N, Mohammad MA, Bruntse AB, Haymond MW, Moremen KW, Bode L, Lewis NE. Elucidating Human Milk Oligosaccharide biosynthetic genes through network-based multi-omics integration. Nat Commun 2022; 13:2455. [PMID: 35508452 PMCID: PMC9068700 DOI: 10.1038/s41467-022-29867-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 09/11/2020] [Accepted: 04/04/2022] [Indexed: 12/18/2022]
Abstract
Human Milk Oligosaccharides (HMOs) are abundant carbohydrates fundamental to infant health and development. Although these oligosaccharides were discovered more than half a century ago, their biosynthesis in the mammary gland remains largely uncharacterized. Here, we use a systems biology framework that integrates glycan and RNA expression data to construct an HMO biosynthetic network and predict glycosyltransferases involved. To accomplish this, we construct models describing the most likely pathways for the synthesis of the oligosaccharides accounting for >95% of the HMO content in human milk. Through our models, we propose candidate genes for elongation, branching, fucosylation, and sialylation of HMOs. Our model aggregation approach recovers 2 of 2 previously known gene-enzyme relations and 2 of 3 empirically confirmed gene-enzyme relations. The top genes we propose for the remaining 5 linkage reactions are consistent with previously published literature. These results provide the molecular basis of HMO biosynthesis necessary to guide progress in HMO research and application with the goal of understanding and improving infant health and development.
Affiliation(s)
- Benjamin P Kellman
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Anne Richelle
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Jeong-Yeh Yang
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, USA
| | - Digantkumar Chapla
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, USA
| | - Austin W T Chiang
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Julia A Najera
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Chenguang Liang
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Annalee Fürst
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Bokan Bao
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA, 92093, USA
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Natalia Koga
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Mahmoud A Mohammad
- Department of Pediatrics, Children's Nutrition Research Center, US Department of Agriculture/Agricultural Research Service, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Anders Bech Bruntse
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Morey W Haymond
- Department of Pediatrics, Children's Nutrition Research Center, US Department of Agriculture/Agricultural Research Service, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Kelley W Moremen
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA, USA
| | - Lars Bode
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA
- Larsson-Rosenquist Foundation Mother-Milk-Infant Center of Research Excellence (MOMI CORE), University of California, San Diego, La Jolla, CA, 92093, USA
| | - Nathan E Lewis
- Department of Pediatrics, University of California, San Diego, La Jolla, CA, 92093, USA.
- Department of Bioengineering, University of California, San Diego, La Jolla, CA, 92093, USA.
|
39
|
Sveshnikova A, MohammadiPeyhani H, Hatzimanikatis V. Computational tools and resources for designing new pathways to small molecules. Curr Opin Biotechnol 2022; 76:102722. [PMID: 35483185 DOI: 10.1016/j.copbio.2022.102722] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/01/2021] [Revised: 03/04/2022] [Accepted: 03/22/2022] [Indexed: 12/22/2022]
Abstract
The metabolic engineering community relies on computational methods for pathway design to produce important small molecules in microbial hosts. Metabolic network databases are continuously curated and updated with known and novel reactions that expand the known biochemistry based on different sets of enzymatic reaction rules. To address the complexity of the metabolic networks, elaborate methods were developed to transform them into computable graphs, navigate them, and construct the best possible pathways. However, the recent experimental research points to the new challenges and opportunities for the computational pathway design. Here, we review the most recent advances, especially in the last two years, in computational discovery of new pathways and their prospects for expanding metabolic capabilities. We draw attention to the potential ways of improvement for pathway design algorithms, including the expansion of Design-Build-Test-Learn cycle to novel compounds and reactions and the standardization for the reaction rules and metabolic reaction databases.
Affiliation(s)
- Anastasia Sveshnikova
- Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Homa MohammadiPeyhani
- Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland
| | - Vassily Hatzimanikatis
- Laboratory of Computational Systems Biotechnology, École Polytechnique Fédérale de Lausanne, EPFL, Lausanne, Switzerland.
|
40
|
Fessner ND, Badenhorst CPS, Bornscheuer UT. Enzyme Kits to Facilitate the Integration of Biocatalysis into Organic Chemistry – First Aid for Synthetic Chemists. ChemCatChem 2022. [DOI: 10.1002/cctc.202200156] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Nico D. Fessner
- Dept. of Biotechnology & Enzyme Catalysis Institute of Biochemistry University of Greifswald Felix-Hausdorff-Str. 4 17487 Greifswald Germany
| | - Christoffel P. S. Badenhorst
- Dept. of Biotechnology & Enzyme Catalysis Institute of Biochemistry University of Greifswald Felix-Hausdorff-Str. 4 17487 Greifswald Germany
| | - Uwe T. Bornscheuer
- Dept. of Biotechnology & Enzyme Catalysis Institute of Biochemistry University of Greifswald Felix-Hausdorff-Str. 4 17487 Greifswald Germany
|
41
|
Expanding biochemical knowledge and illuminating metabolic dark matter with ATLASx. Nat Commun 2022; 13:1560. [PMID: 35322036 PMCID: PMC8943196 DOI: 10.1038/s41467-022-29238-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/19/2021] [Accepted: 03/07/2022] [Indexed: 12/23/2022]
Abstract
Metabolic “dark matter” describes currently unknown metabolic processes, which form a blind spot in our general understanding of metabolism and slow down the development of biosynthetic cell factories and naturally derived pharmaceuticals. Mapping the dark matter of metabolism remains an open challenge that can be addressed globally and systematically by existing computational solutions. In this work, we use 489 generalized enzymatic reaction rules to map both known and unknown metabolic processes around a biochemical database of 1.5 million biological compounds. We predict over 5 million reactions and integrate nearly 2 million naturally and synthetically-derived compounds into the global network of biochemical knowledge, named ATLASx. ATLASx is available to researchers as a powerful online platform that supports the prediction and analysis of biochemical pathways and evaluates the biochemical vicinity of molecule classes (https://lcsb-databases.epfl.ch/Atlas2). “Mapping the dark matter of metabolism remains an open challenge that can be addressed globally and systematically by existing computational solutions. Here the authors present ATLASx, a repository of known and predicted enzymatic reaction, connecting millions of compounds to help synthetic biologists and metabolic engineers to design and explore metabolic pathways.”
|
42
|
Machine learning modeling of family wide enzyme-substrate specificity screens. PLoS Comput Biol 2022; 18:e1009853. [PMID: 35143485 PMCID: PMC8865696 DOI: 10.1371/journal.pcbi.1009853] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Received: 10/30/2021] [Revised: 02/23/2022] [Accepted: 01/21/2022] [Indexed: 11/19/2022]
Abstract
Biocatalysis is a promising approach to sustainably synthesize pharmaceuticals, complex natural products, and commodity chemicals at scale. However, the adoption of biocatalysis is limited by our ability to select enzymes that will catalyze their natural chemical transformation on non-natural substrates. While machine learning and in silico directed evolution are well-posed for this predictive modeling challenge, efforts to date have primarily aimed to increase activity against a single known substrate, rather than to identify enzymes capable of acting on new substrates of interest. To address this need, we curate 6 different high-quality enzyme family screens from the literature that each measure multiple enzymes against multiple substrates. We compare machine learning-based compound-protein interaction (CPI) modeling approaches from the literature used for predicting drug-target interactions. Surprisingly, comparing these interaction-based models against collections of independent (single task) enzyme-only or substrate-only models reveals that current CPI approaches are incapable of learning interactions between compounds and proteins in the current family level data regime. We further validate this observation by demonstrating that our no-interaction baseline can outperform CPI-based models from the literature used to guide the discovery of kinase inhibitors. Given the high performance of non-interaction based models, we introduce a new structure-based strategy for pooling residue representations across a protein sequence. Altogether, this work motivates a principled path forward in order to build and evaluate meaningful predictive models for biocatalysis and other drug discovery applications. Predicting interactions between compounds and proteins represents a long-standing dream of drug discovery and protein engineering. Robust models of enzyme-substrate scope would dramatically advance our ability to design synthetic routes involving enzymatic catalysis. 
However, the lack of standardization between compound-protein interaction studies makes it difficult to evaluate the generalizability of such models. In this work we take a critical step forward by standardizing high-quality datasets measuring enzyme-substrate interactions, outlining rigorous evaluations, and proposing a new way to integrate structural information into protein representations. In testing previous modeling approaches, we highlight a surprising inability of existing models to effectively leverage compound-protein interactions to improve generalization, which challenges a perception in the literature. This establishes future opportunities for model development and integration of enzyme-substrate scope models into computer-aided synthesis planning software.
|
43
|
Designing a multilayer film via machine learning of scientific literature. Sci Rep 2022; 12:930. [PMID: 35042971 PMCID: PMC8766440 DOI: 10.1038/s41598-022-05010-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/25/2021] [Accepted: 01/04/2022] [Indexed: 12/23/2022]
Abstract
Scientists who design chemical substances often use materials informatics (MI), a data-driven approach with either computer simulation or artificial intelligence (AI). MI is a valuable technique, but applying it to layered structures is difficult. Most of the proposed computer-aided material search techniques use atomic or molecular simulations, which are limited to small areas. Some AI approaches have planned layered structures, but they require a physical theory or abundant experimental results. There is no universal design tool for multilayer films in MI. Here, we show a multilayer film can be designed through machine learning (ML) of experimental procedures extracted from chemical-coating articles. We converted material names according to International Union of Pure and Applied Chemistry rules and stored them in databases for each fabrication step without any physicochemical theory. Compared with experimental results, which depend on the authors, experimental protocols have the advantage of being nearly uniform and losing less data. Connecting scientific knowledge through ML enables us to predict untrained film structures. This suggests that AI imitates research activity, which is normally inspired by other scientific achievements and can thus be used as a general design technique.
|
44
|
Green biomanufacturing promoted by automatic retrobiosynthesis planning and computational enzyme design. Chin J Chem Eng 2022. [DOI: 10.1016/j.cjche.2021.08.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
|
45
|
Breger JC, Ellis GA, Walper SA, Susumu K, Medintz IL. Implementing Multi-Enzyme Biocatalytic Systems Using Nanoparticle Scaffolds. Methods Mol Biol 2022; 2487:227-262. [PMID: 35687240 DOI: 10.1007/978-1-0716-2269-8_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Interest in multi-enzyme synthesis outside of cells (in vitro) is becoming far more prevalent as the field of cell-free synthetic biology grows exponentially. Such synthesis would allow for complex chemical transformations based on the exquisite specificity of enzymes in a "greener" manner as compared to organic chemical transformations. Here, we describe how nanoparticles, in this specific case semiconductor quantum dots, can be used to both stabilize enzymes and further allow them to self-assemble into nanocomplexes that facilitate high-efficiency channeling phenomena. Pertinent protocol information is provided on enzyme expression, choice of nanoparticulate material, confirmation of enzyme attachment to nanoparticles, assay format and tracking, data analysis, and optimization of assay formats to draw the best analytical information from the underlying processes.
Affiliation(s)
- Joyce C Breger
- Center for Bio/Molecular Science and Engineering, Code 6900, Washington, DC, USA
| | - Gregory A Ellis
- Center for Bio/Molecular Science and Engineering, Code 6900, Washington, DC, USA
| | - Scott A Walper
- Center for Bio/Molecular Science and Engineering, Code 6900, Washington, DC, USA
| | - Kimihiro Susumu
- Optical Sciences Division, Code 5611, U.S. Naval Research Laboratory, Washington, DC, USA
- Jacobs Corporation, Hanover, MD, USA
| | - Igor L Medintz
- Center for Bio/Molecular Science and Engineering, Code 6900, Washington, DC, USA.
|
46
|
Vila-Santa A, Mendes FC, Ferreira FC, Prather KLJ, Mira NP. Implementation of Synthetic Pathways to Foster Microbe-Based Production of Non-Naturally Occurring Carboxylic Acids and Derivatives. J Fungi (Basel) 2021; 7:jof7121020. [PMID: 34947002 PMCID: PMC8706239 DOI: 10.3390/jof7121020] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/27/2021] [Revised: 11/15/2021] [Accepted: 11/20/2021] [Indexed: 11/20/2022]
Abstract
Microbially produced carboxylic acids (CAs) are considered key players in the implementation of more sustainable industrial processes due to their potential to replace a set of oil-derived commodity chemicals. Most CAs are intermediates of microbial central carbon metabolism, and therefore, a biochemical production pathway is described and can be transferred to a host of choice to enable/improve production at an industrial scale. However, for some CAs, the implementation of this approach is difficult, either because they do not occur naturally (as is the case for levulinic acid) or because the described production pathway cannot be easily ported (as it is the case for adipic, muconic or glucaric acids). Synthetic biology has been reshaping the range of molecules that can be produced by microbial cells by setting new-to-nature pathways that leverage on enzyme arrangements not observed in vivo, often in association with the use of substrates that are not enzymes’ natural ones. In this review, we provide an overview of how the establishment of synthetic pathways, assisted by computational tools for metabolic retrobiosynthesis, has been applied to the field of CA production. The translation of these efforts in bridging the gap between the synthesis of CAs and of their more interesting derivatives, often themselves non-naturally occurring molecules, is also reviewed using as case studies the production of methacrylic, methylmethacrylic and poly-lactic acids.
Affiliation(s)
- Ana Vila-Santa
- Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Department of Bioengineering, University of Lisbon, 1049-001 Lisbon, Portugal
- Associate Laboratory i4HB—Institute for Health and Bioeconomy at Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
| | - Fernão C. Mendes
- Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Department of Bioengineering, University of Lisbon, 1049-001 Lisbon, Portugal
- Associate Laboratory i4HB—Institute for Health and Bioeconomy at Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
| | - Frederico C. Ferreira
- Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Department of Bioengineering, University of Lisbon, 1049-001 Lisbon, Portugal
- Associate Laboratory i4HB—Institute for Health and Bioeconomy at Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
| | - Kristala L. J. Prather
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Nuno P. Mira
- Institute for Bioengineering and Biosciences, Instituto Superior Técnico, Department of Bioengineering, University of Lisbon, 1049-001 Lisbon, Portugal
- Associate Laboratory i4HB—Institute for Health and Bioeconomy at Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais, 1049-001 Lisboa, Portugal
|
47
|
Weber JM, Guo Z, Zhang C, Schweidtmann AM, Lapkin AA. Chemical data intelligence for sustainable chemistry. Chem Soc Rev 2021; 50:12013-12036. [PMID: 34520507 DOI: 10.1039/d1cs00477h] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 02/05/2023]
Abstract
This study highlights new opportunities for optimal reaction route selection from large chemical databases brought about by the rapid digitalisation of chemical data. The chemical industry requires a transformation towards more sustainable practices, eliminating its dependencies on fossil fuels and limiting its impact on the environment. However, identifying more sustainable process alternatives is, at present, a cumbersome, manual, iterative process, based on chemical intuition and modelling. We give a perspective on methods for automated discovery and assessment of competitive sustainable reaction routes based on renewable or waste feedstocks. Three key areas of transition are outlined and reviewed based on their state-of-the-art as well as bottlenecks: (i) data, (ii) evaluation metrics, and (iii) decision-making. We elucidate their synergies and interfaces since only together these areas can bring about the most benefit. The field of chemical data intelligence offers the opportunity to identify the inherently more sustainable reaction pathways and to identify opportunities for a circular chemical economy. Our review shows that at present the field of data brings about most bottlenecks, such as data completion and data linkage, but also offers the principal opportunity for advancement.
Affiliation(s)
- Jana M Weber
- Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore
| | - Zhen Guo
- Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore; Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, 138602, Singapore
| | - Chonghuan Zhang
- Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK.
| | - Artur M Schweidtmann
- Department of Chemical Engineering, Delft University of Technology, Van der Maasweg 9, Delft 2629 HZ, The Netherlands
| | - Alexei A Lapkin
- Department of Chemical Engineering and Biotechnology, University of Cambridge, West Cambridge Site, Philippa Fawcett Drive, Cambridge CB3 0AS, UK; Chemical Data Intelligence (CDI) Pte Ltd, Robinson Road, #02-00, 068898, Singapore; Cambridge Centre for Advanced Research and Education in Singapore, CARES Ltd. 1 CREATE Way, CREATE Tower #05-05, 138602, Singapore
|
48
|
Munro LJ, Kell DB. Intelligent host engineering for metabolic flux optimisation in biotechnology. Biochem J 2021; 478:3685-3721. [PMID: 34673920 PMCID: PMC8589332 DOI: 10.1042/bcj20210535] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/25/2021] [Revised: 09/22/2021] [Accepted: 09/24/2021] [Indexed: 12/13/2022]
Abstract
Optimising the function of a protein of length N amino acids by directed evolution involves navigating a 'search space' of possible sequences of some 20^N. Optimising the expression levels of P proteins that materially affect host performance, each of which might also take 20 (logarithmically spaced) values, implies a similar search space of 20^P. In this combinatorial sense, then, the problems of directed protein evolution and of host engineering are broadly equivalent. In practice, however, they have different means for avoiding the inevitable difficulties of implementation. The spare capacity exhibited in metabolic networks implies that host engineering may admit substantial increases in flux to targets of interest. Thus, we rehearse the relevant issues for those wishing to understand and exploit those modern genome-wide host engineering tools and thinking that have been designed and developed to optimise fluxes towards desirable products in biotechnological processes, with a focus on microbial systems. The aim throughout is 'making such biology predictable'. Strategies have been aimed at both transcription and translation, especially for regulatory processes that can affect multiple targets. However, because there is a limit on how much protein a cell can produce, increasing kcat in selected targets may be a better strategy than increasing protein expression levels for optimal host engineering.
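To give a sense of scale for the two search spaces named in this abstract (20 to the power N sequences and 20 to the power P expression settings), the arithmetic can be worked for illustrative sizes. The values N = 300 and P = 10 are assumptions chosen here for the example and do not come from the source.

```latex
\begin{align*}
N = 300:&\quad 20^{N} = 10^{300\log_{10} 20} \approx 10^{390.3} \\
P = 10:&\quad 20^{P} = 2^{10}\times 10^{10} = 1.024\times 10^{13}
\end{align*}
```

Both counts grow exponentially in the number of positions or proteins, which is the combinatorial sense in which the abstract calls the two problems broadly equivalent. Neither space can be enumerated experimentally at these sizes.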
Affiliation(s)
- Lachlan J. Munro
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
| | - Douglas B. Kell
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St, Liverpool L69 7ZB, U.K
- Mellizyme Biotechnology Ltd, IC1, Liverpool Science Park, 131 Mount Pleasant, Liverpool L3 5TF, U.K
|
49
|
Heid E, Goldman S, Sankaranarayanan K, Coley CW, Flamm C, Green WH. EHreact: Extended Hasse Diagrams for the Extraction and Scoring of Enzymatic Reaction Templates. J Chem Inf Model 2021; 61:4949-4961. [PMID: 34587449 PMCID: PMC8549070 DOI: 10.1021/acs.jcim.1c00921] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/30/2021] [Indexed: 11/29/2022]
Abstract
Data-driven computer-aided synthesis planning utilizing organic or biocatalyzed reactions from large databases has gained increasing interest in the last decade, sparking the development of numerous tools to extract, apply, and score general reaction templates. The generation of reaction rules for enzymatic reactions is especially challenging since substrate promiscuity varies between enzymes, causing the optimal levels of rule specificity and optimal number of included atoms to differ between enzymes. This complicates an automated extraction from databases and has promoted the creation of manually curated reaction rule sets. Here, we present EHreact, a purely data-driven open-source software tool, to extract and score reaction rules from sets of reactions known to be catalyzed by an enzyme at appropriate levels of specificity without expert knowledge. EHreact extracts and groups reaction rules into tree-like structures, Hasse diagrams, based on common substructures in the imaginary transition structures. Each diagram can be utilized to output a single or a set of reaction rules, as well as calculate the probability of a new substrate to be processed by the given enzyme by inferring information about the reactive site of the enzyme from the known reactions and their grouping in the template tree. EHreact heuristically predicts the activity of a given enzyme on a new substrate, outperforming current approaches in accuracy and functionality.
Affiliation(s)
- Esther Heid
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Samuel Goldman
- Computational and Systems Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Karthik Sankaranarayanan
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Connor W. Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| | - Christoph Flamm
- Department of Theoretical Chemistry, University of Vienna, 1090 Vienna, Austria
| | - William H. Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
| |
|
50
|
Krantz M, Zimmer D, Adler SO, Kitashova A, Klipp E, Mühlhaus T, Nägele T. Data Management and Modeling in Plant Biology. FRONTIERS IN PLANT SCIENCE 2021; 12:717958. [PMID: 34539712 PMCID: PMC8446634 DOI: 10.3389/fpls.2021.717958] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/31/2021] [Accepted: 07/29/2021] [Indexed: 05/25/2023]
Abstract
The study of plant-environment interactions is a multidisciplinary research field. With the emergence of quantitative large-scale and high-throughput techniques, the amount and dimensionality of experimental data have strongly increased. Appropriate strategies for data storage, management, and evaluation are needed to make efficient use of experimental findings. Computational approaches of data mining are essential for deriving statistical trends and signatures contained in data matrices. Although current biology is challenged by high data dimensionality in general, this is particularly true for plant biology. Plants as sessile organisms have to cope with environmental fluctuations. This typically results in strong dynamics of metabolite and protein concentrations which are often challenging to quantify. Summarizing experimental output results in complex data arrays, which need computational statistics and numerical methods for building quantitative models. Experimental findings need to be combined by computational models to gain a mechanistic understanding of plant metabolism. For this, bioinformatics and mathematics need to be combined with experimental setups in physiology, biochemistry, and molecular biology. This review presents and discusses concepts at the interface of experiment and computation, which are likely to shape current and future plant biology. Finally, this interface is discussed with regard to its capabilities and limitations to develop a quantitative model of plant-environment interactions.
Affiliation(s)
- Maria Krantz
- Theoretical Biophysics, Institute of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - David Zimmer
- Computational Systems Biology, Technische Universität Kaiserslautern, Kaiserslautern, Germany
| | - Stephan O. Adler
- Theoretical Biophysics, Institute of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Anastasia Kitashova
- Plant Evolutionary Cell Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| | - Edda Klipp
- Theoretical Biophysics, Institute of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
| | - Timo Mühlhaus
- Computational Systems Biology, Technische Universität Kaiserslautern, Kaiserslautern, Germany
| | - Thomas Nägele
- Plant Evolutionary Cell Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, Planegg-Martinsried, Germany
| |
|