1
Bogetti A, Zwier MC, Chong LT. Revisiting Textbook Azide-Clock Reactions: A "Propeller-Crawling" Mechanism Explains Differences in Rates. J Am Chem Soc 2024; 146:12828-12835. [PMID: 38687173 PMCID: PMC11078601 DOI: 10.1021/jacs.4c03360] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/08/2024] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 05/02/2024]
Abstract
An ongoing challenge to chemists is the analysis of pathways and kinetics for chemical reactions in solution, including transient structures between the reactants and products that are difficult to resolve using laboratory experiments. Here, we enabled direct molecular dynamics simulations of a textbook series of chemical reactions on the hundreds of ns to μs time scale using the weighted ensemble (WE) path sampling strategy with hybrid quantum mechanical/molecular mechanical (QM/MM) models. We focused on azide-clock reactions involving addition of an azide anion to each of three long-lived trityl cations in an acetonitrile-water solvent mixture. Results reveal a two-step mechanism: (1) diffusional collision of reactants to form an ion-pair intermediate; (2) "activation" or rearrangement of the intermediate to the product. Our simulations yield not only reaction rates that are within error of experiment but also rates for individual steps, indicating the activation step as rate-limiting for all three cations. Further, the trend in reaction rates is due to dynamical effects, i.e., differing extents of the azide anion "crawling" along the cation's phenyl-ring "propellers" during the activation step. Our study demonstrates the power of analyzing pathways and kinetics to gain insights on reaction mechanisms, underscoring the value of including WE and other related path sampling strategies in the modern toolbox for chemists.
Affiliation(s)
- Anthony T. Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Matthew C. Zwier
  - Department of Chemistry, Drake University, Des Moines, Iowa 50311, United States
- Lillian T. Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
2
Wang X, Li A, Li X, Cui H. Empowering Protein Engineering through Recombination of Beneficial Substitutions. Chemistry 2024; 30:e202303889. [PMID: 38288640 DOI: 10.1002/chem.202303889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Indexed: 02/24/2024]
Abstract
Directed evolution stands as a seminal technology for generating novel protein functionalities, a cornerstone in biocatalysis, metabolic engineering, and synthetic biology. Today, with the development of various mutagenesis methods and advanced analytical machines, the challenge of diversity generation and high-throughput screening platforms is largely solved, and one of the remaining challenges is: how to empower the potential of single beneficial substitutions with recombination to achieve the epistatic effect. This review overviews experimental and computer-assisted recombination methods in protein engineering campaigns. In addition, integrated and machine learning-guided strategies were highlighted to discuss how these recombination approaches contribute to generating the screening library with better diversity, coverage, and size. A decision tree was finally summarized to guide the further selection of proper recombination strategies in practice, which was beneficial for accelerating protein engineering.
Affiliation(s)
- Xinyue Wang
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Anni Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Xiujuan Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Haiyang Cui
  - School of Life Sciences, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
3
Bogetti A, Leung JMG, Chong LT. LPATH: A Semiautomated Python Tool for Clustering Molecular Pathways. J Chem Inf Model 2023; 63:7610-7616. [PMID: 38048485 PMCID: PMC10751797 DOI: 10.1021/acs.jcim.3c01318] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2023] [Revised: 10/14/2023] [Accepted: 11/09/2023] [Indexed: 12/06/2023]
Abstract
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here, we present the LPATH Python tool, which implements a semiautomated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of the alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
Affiliation(s)
- Anthony T. Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jeremy M. G. Leung
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Lillian T. Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
4
Kouba P, Kohout P, Haddadi F, Bushuiev A, Samusevich R, Sedlar J, Damborsky J, Pluskal T, Sivic J, Mazurenko S. Machine Learning-Guided Protein Engineering. ACS Catal 2023; 13:13863-13895. [PMID: 37942269 PMCID: PMC10629210 DOI: 10.1021/acscatal.3c02743] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 06/15/2023] [Revised: 09/20/2023] [Indexed: 11/10/2023]
Abstract
Recent progress in engineering highly promising biocatalysts has increasingly involved machine learning methods. These methods leverage existing experimental and simulation data to aid in the discovery and annotation of promising enzymes, as well as in suggesting beneficial mutations for improving known targets. The field of machine learning for protein engineering is gathering steam, driven by recent success stories and notable progress in other areas. It already encompasses ambitious tasks such as understanding and predicting protein structure and function, catalytic efficiency, enantioselectivity, protein dynamics, stability, solubility, aggregation, and more. Nonetheless, the field is still evolving, with many challenges to overcome and questions to address. In this Perspective, we provide an overview of ongoing trends in this domain, highlight recent case studies, and examine the current limitations of machine learning-based methods. We emphasize the crucial importance of thorough experimental validation of emerging models before their use for rational protein design. We present our opinions on the fundamental problems and outline the potential directions for future research.
Affiliation(s)
- Petr Kouba
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
  - Faculty of Electrical Engineering, Czech Technical University in Prague, Technicka 2, 166 27 Prague 6, Czech Republic
- Pavel Kohout
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Faraneh Haddadi
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Anton Bushuiev
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Raman Samusevich
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Jiri Sedlar
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Jiri Damborsky
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Tomas Pluskal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo nám. 2, 160 00 Prague 6, Czech Republic
- Josef Sivic
  - Czech Institute of Informatics, Robotics and Cybernetics, Czech Technical University in Prague, Jugoslavskych partyzanu 1580/3, 160 00 Prague 6, Czech Republic
- Stanislav Mazurenko
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic
  - International Clinical Research Center, St. Anne’s University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
5
Chen J, Zhang M, Xu Z, Ma R, Shi Q. Machine-learning analysis to predict the fluorescence quantum yield of carbon quantum dots in biochar. The Science of the Total Environment 2023; 896:165136. [PMID: 37379935 DOI: 10.1016/j.scitotenv.2023.165136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 06/11/2023] [Accepted: 06/23/2023] [Indexed: 06/30/2023]
Abstract
Biochar nanoparticles have recently attracted attention, owing to their environmental behavior and ecological effects. However, biochar has not been shown to contain carbon quantum dots (< 10 nm) with unique photovoltaic properties. Therefore, this study utilized several characterization techniques to demonstrate the generation of carbon quantum dots in biochar produced from 10 types of farm waste. The generated carbon quantum dots had a quasi-spherical morphology and high-resolution lattice stripes with lattice spacings of 0.20-0.23 nm. Moreover, they contained functional groups with good hydrophilic properties, such as amino and hydroxyl groups, and elemental O, C, and N on the surface. A crucial determinant of the photoluminescence properties of carbon quantum dots is their fluorescence quantum yield. Therefore, the relationship between the biochar preparation parameters and the fluorescence quantum yield was investigated using six machine learning analytical models based on 480 samples. Among the models, the gradient-boosting decision-tree regression model exhibited the best predictive performance (R² > 0.9, RMSE < 0.02, and MAPE < 3), and was used for the analysis of feature importance; compared to the properties of the raw material, the production parameters had a greater effect on the fluorescence quantum yield. Additionally, four key features were identified: pyrolysis temperature, residence time, N content, and C/N ratio, which were independent of farm waste type. These features can be used to accurately predict the fluorescence quantum yield of carbon quantum dots in biochar. The relative error range between the predicted and the experimental value of fluorescence quantum yield is 0.00-4.60%. Thus, the prediction model has the potential to predict the fluorescence quantum yield of carbon quantum dots in other types of farm waste biochar, and provides fundamental information for the study of biochar nanoparticles.
Affiliation(s)
- Jiao Chen
  - College of Ecology and Environment, Xin Jiang University, Urumqi 830046, PR China
- Mengqian Zhang
  - China Energy Conservation and Environmental Protection Group, Beijing 100035, PR China
- Zijun Xu
  - College of Ecology and Environment, Xin Jiang University, Urumqi 830046, PR China
- Ruoxin Ma
  - College of Ecology and Environment, Xin Jiang University, Urumqi 830046, PR China
- Qingdong Shi
  - College of Ecology and Environment, Xin Jiang University, Urumqi 830046, PR China
6
Bogetti AT, Leung JMG, Chong LT. LPATH: A semi-automated Python tool for clustering molecular pathways. bioRxiv: the preprint server for biology 2023:2023.08.17.553774. [PMID: 37645995 PMCID: PMC10462149 DOI: 10.1101/2023.08.17.553774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/01/2023]
Abstract
The pathways by which a molecular process transitions to a target state are highly sought-after as direct views of a transition mechanism. While great strides have been made in the physics-based simulation of such pathways, the analysis of these pathways can be a major challenge due to their diversity and variable lengths. Here we present the LPATH Python tool, which implements a semi-automated method for linguistics-assisted clustering of pathways into distinct classes (or routes). This method involves three steps: 1) discretizing the configurational space into key states, 2) extracting a text-string sequence of key visited states for each pathway, and 3) pairwise matching of pathways based on a text-string similarity score. To circumvent the prohibitive memory requirements of the first step, we have implemented a general two-stage method for clustering conformational states that exploits machine learning. LPATH is primarily designed for use with the WESTPA software for weighted ensemble simulations; however, the tool can also be applied to conventional simulations. As demonstrated for the C7eq to C7ax conformational transition of alanine dipeptide, LPATH provides physically reasonable classes of pathways and corresponding probabilities.
Affiliation(s)
- Anthony T. Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Jeremy M. G. Leung
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
- Lillian T. Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260
7
Abstract
A survey of protein databases indicates that the majority of enzymes exist in oligomeric forms, with about half of those found in the UniProt database being homodimeric. Understanding why many enzymes are in their dimeric form is imperative. Recent developments in experimental and computational techniques have allowed for a deeper comprehension of the cooperative interactions between the subunits of dimeric enzymes. This review aims to succinctly summarize these recent advancements by providing an overview of experimental and theoretical methods, as well as an understanding of cooperativity in substrate binding and the molecular mechanisms of cooperative catalysis within homodimeric enzymes. Focus is set upon the beneficial effects of dimerization and cooperative catalysis. These advancements not only provide essential case studies and theoretical support for comprehending dimeric enzyme catalysis but also serve as a foundation for designing highly efficient catalysts, such as dimeric organic catalysts. Moreover, these developments have significant implications for drug design, as exemplified by Paxlovid, which was designed for the homodimeric main protease of SARS-CoV-2.
Affiliation(s)
- Ke-Wei Chen
  - Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
- Tian-Yu Sun
  - Shenzhen Bay Laboratory, Shenzhen 518132, China
- Yun-Dong Wu
  - Lab of Computational Chemistry and Drug Design, State Key Laboratory of Chemical Oncogenomics, Peking University Shenzhen Graduate School, Shenzhen 518055, China
  - Shenzhen Bay Laboratory, Shenzhen 518132, China
8
Platero-Rochart D, Krivobokova T, Gastegger M, Reibnegger G, Sánchez-Murcia PA. Prediction of Enzyme Catalysis by Computing Reaction Energy Barriers via Steered QM/MM Molecular Dynamics Simulations and Machine Learning. J Chem Inf Model 2023; 63:4623-4632. [PMID: 37479222 PMCID: PMC10430765 DOI: 10.1021/acs.jcim.3c00772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Indexed: 07/23/2023]
Abstract
The prediction of enzyme activity is one of the main challenges in catalysis. With computer-aided methods, it is possible to simulate the reaction mechanism at the atomic level. However, these methods are usually expensive if they are to be used on a large scale, as they are needed for protein engineering campaigns. To alleviate this situation, machine learning methods can help in the generation of predictive-decision models. Herein, we test different regression algorithms for the prediction of the reaction energy barrier of the rate-limiting step of the hydrolysis of mono-(2-hydroxyethyl)terephthalic acid by the MHETase of Ideonella sakaiensis. As a training data set, we use steered quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulation snapshots and their corresponding pulling work values. We have explored three algorithms together with three chemical representations. As an outcome, our trained models are able to predict pulling works along the steered QM/MM MD simulations with a mean absolute error below 3 kcal mol⁻¹ and a score value above 0.90. More challenging is the prediction of the energy maximum with a single geometry. Whereas the use of the initial snapshot of the QM/MM MD trajectory as input geometry yields a very poor prediction of the reaction energy barrier, the use of an intermediate snapshot of the former trajectory brings the score value above 0.40 with a low mean absolute error (ca. 3 kcal mol⁻¹). Altogether, we have faced in this work some initial challenges of the final goal of getting an efficient workflow for the semiautomatic prediction of enzyme-catalyzed energy barriers and catalytic efficiencies.
Affiliation(s)
- Daniel Platero-Rochart
  - Laboratory of Computer-Aided Molecular Design, Division of Medicinal Chemistry, Otto-Loewi Research Center, Medical University of Graz, Neue Stiftingtalstraße 6/III, A-8010 Graz, Austria
- Tatyana Krivobokova
  - Department of Statistics and Operations Research, University of Vienna, Oskar-Morgenstern-Platz 1, A-1090 Vienna, Austria
- Michael Gastegger
  - Institute of Software Engineering and Theoretical Computer Science, Machine Learning Group, Technische Universität, 10587 Berlin, Germany
- Gilbert Reibnegger
  - Laboratory of Computer-Aided Molecular Design, Division of Medicinal Chemistry, Otto-Loewi Research Center, Medical University of Graz, Neue Stiftingtalstraße 6/III, A-8010 Graz, Austria
- Pedro A. Sánchez-Murcia
  - Laboratory of Computer-Aided Molecular Design, Division of Medicinal Chemistry, Otto-Loewi Research Center, Medical University of Graz, Neue Stiftingtalstraße 6/III, A-8010 Graz, Austria
9
Naleem N, Abreu CRA, Warmuz K, Tong M, Kirmizialtin S, Tuckerman ME. An exploration of machine learning models for the determination of reaction coordinates associated with conformational transitions. J Chem Phys 2023; 159:034102. [PMID: 37458344 DOI: 10.1063/5.0147597] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/23/2023] [Accepted: 06/23/2023] [Indexed: 07/20/2023]
Abstract
Determining collective variables (CVs) for conformational transitions is crucial to understanding their dynamics and targeting them in enhanced sampling simulations. Often, CVs are proposed based on intuition or prior knowledge of a system. However, the problem of systematically determining a proper reaction coordinate (RC) for a specific process in terms of a set of putative CVs can be achieved using committor analysis (CA). Identifying essential degrees of freedom that govern such transitions using CA remains elusive because of the high dimensionality of the conformational space. Various schemes exist to leverage the power of machine learning (ML) to extract an RC from CA. Here, we extend these studies and compare the ability of 17 different ML schemes to identify accurate RCs associated with conformational transitions. We tested these methods on an alanine dipeptide in vacuum and on a sarcosine dipeptoid in an implicit solvent. Our comparison revealed that the light gradient boosting machine method outperforms other methods. In order to extract key features from the models, we employed Shapley Additive exPlanations analysis and compared its interpretation with the "feature importance" approach. For the alanine dipeptide, our methodology identifies ϕ and θ dihedrals as essential degrees of freedom in the C7ax to C7eq transition. For the sarcosine dipeptoid system, the dihedrals ψ and ω are the most important for the cisαD to transαD transition. We further argue that analysis of the full dynamical pathway, and not just endpoint states, is essential for identifying key degrees of freedom governing transitions.
Affiliation(s)
- Nawavi Naleem
  - Chemistry Program, Science Division, New York University, Abu Dhabi, UAE
- Charlles R A Abreu
  - Chemical Engineering Department, Escola de Química, Universidade Federal do Rio de Janeiro, 21941-909 Rio de Janeiro, RJ, Brazil
- Krzysztof Warmuz
  - Computer Science Program, Science Division, New York University, Abu Dhabi, UAE
- Muchen Tong
  - Department of Chemistry, New York University (NYU), New York, New York 10003, USA
- Serdal Kirmizialtin
  - Chemistry Program, Science Division, New York University, Abu Dhabi, UAE
  - Department of Chemistry, New York University (NYU), New York, New York 10003, USA
  - Center for Smart Engineering Materials, New York University, Abu Dhabi, UAE
- Mark E Tuckerman
  - Department of Chemistry, New York University (NYU), New York, New York 10003, USA
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Rd. North, Shanghai 200062, China
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, USA
10
Li Y, Zhang R, Yan X, Fan K. Machine learning facilitating the rational design of nanozymes. J Mater Chem B 2023. [PMID: 37325942 DOI: 10.1039/d3tb00842h] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 06/17/2023]
Abstract
As a component substitute for natural enzymes, nanozymes have the advantages of easy synthesis, convenient modification, low cost, and high stability, and are widely used in many fields. However, their application is seriously restricted by the difficulty of rapidly creating high-performance nanozymes. The use of machine learning techniques to guide the rational design of nanozymes holds great promise to overcome this difficulty. In this review, we introduce the recent progress of machine learning in assisting the design of nanozymes. Particular attention is given to the successful strategies of machine learning in predicting the activity, selectivity, catalytic mechanisms, optimal structures and other features of nanozymes. The typical procedures and approaches for conducting machine learning in the study of nanozymes are also highlighted. Moreover, we discuss in detail the difficulties of machine learning methods in dealing with the redundant and chaotic nanozyme data and provide an outlook on the future application of machine learning in the nanozyme field. We hope that this review will serve as a useful handbook for researchers in related fields and promote the utilization of machine learning in nanozyme rational design and related topics.
Affiliation(s)
- Yucong Li
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
- Ruofei Zhang
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
- Xiyun Yan
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
  - Nanozyme Medical Center, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou 450052, China
- Kelong Fan
  - CAS Engineering Laboratory for Nanozyme, Key Laboratory of Protein and Peptide Pharmaceutical, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Chinese Academy of Sciences, Beijing 100408, China
  - Nanozyme Medical Center, School of Basic Medical Sciences, Zhengzhou University, Zhengzhou 450052, China
11
Xu S, Gao S, An Y. Research progress of engineering microbial cell factories for pigment production. Biotechnol Adv 2023; 65:108150. [PMID: 37044266 DOI: 10.1016/j.biotechadv.2023.108150] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/24/2022] [Revised: 03/14/2023] [Accepted: 04/06/2023] [Indexed: 04/14/2023]
Abstract
Pigments are widely used in people's daily life, such as food additives, cosmetics, pharmaceuticals, textiles, etc. In recent years, the natural pigments produced by microorganisms have attracted increased attention because these processes cannot be affected by seasons like the plant extraction methods, and can also avoid the environmental pollution problems caused by chemical synthesis. Synthetic biology and metabolic engineering have been used to construct and optimize metabolic pathways for production of natural pigments in cellular factories. Building microbial cell factories for synthesis of natural pigments has many advantages, including well-defined genetic background of the strains, high-density and rapid culture of cells, etc. Until now, the technical means about engineering microbial cell factories for pigment production and metabolic regulation processes have not been systematically analyzed and summarized. Therefore, the studies about construction, modification and regulation of synthetic pathways for microbial synthesis of pigments in recent years have been reviewed, aiming to provide an up-to-date summary of engineering strategies for microbial synthesis of natural pigments including carotenoids, melanins, riboflavins, azomycetes and quinones. This review should provide new ideas for further improving microbial production of natural pigments in the future.
Affiliation(s)
- Shumin Xu
  - College of Biosciences and Biotechnology, Shenyang Agricultural University, Shenyang, China; College of Food Science, Shenyang Agricultural University, Shenyang, China
- Song Gao
  - College of Biosciences and Biotechnology, Shenyang Agricultural University, Shenyang, China
- Yingfeng An
  - College of Biosciences and Biotechnology, Shenyang Agricultural University, Shenyang, China; College of Food Science, Shenyang Agricultural University, Shenyang, China; Shenyang Key Laboratory of Microbial Resources Mining and Molecular Breeding, Shenyang, China; Liaoning Provincial Key Laboratory of Agricultural Biotechnology, Shenyang, China
12
Wang Y, Yu L, Shao J, Zhu Z, Zhang L. Structure-driven protein engineering for production of valuable natural products. Trends in Plant Science 2023; 28:460-470. [PMID: 36473772 DOI: 10.1016/j.tplants.2022.11.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/18/2022] [Revised: 09/25/2022] [Accepted: 11/11/2022] [Indexed: 06/17/2023]
Abstract
Proteins are the most frequently used biocatalysts, and their structures determine their functions. Modifying the functions of proteins on the basis of their structures lies at the heart of protein engineering, opening a new horizon for metabolic engineering by efficiently generating stable enzymes. Many attempts at classical metabolic engineering have focused on improving specific metabolic fluxes and producing more valuable natural products by increasing gene expression levels and enzyme concentrations. However, most naturally occurring enzymes show limitations, and such limitations have hindered practical applications. Here we review recent advances in protein engineering in synthetic biology, chemoenzymatic synthesis, and plant metabolic engineering and describe opportunities for designing and constructing novel enzymes or proteins with desirable properties to obtain more active natural products.
Affiliation(s)
- Yun Wang
  - Institute of Interdisciplinary Integrative Medicine Research, Medical School of Nantong University, Nantong 226001, China; Biomedical Innovation R&D Centre, School of Medicine, Shanghai University, Shanghai 200444, China
- Luyao Yu
  - Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai 200433, China
- Jie Shao
  - Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai 200433, China
- Zhanpin Zhu
  - Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai 200433, China
- Lei Zhang
  - Institute of Interdisciplinary Integrative Medicine Research, Medical School of Nantong University, Nantong 226001, China; Biomedical Innovation R&D Centre, School of Medicine, Shanghai University, Shanghai 200444, China; Department of Pharmaceutical Botany, School of Pharmacy, Second Military Medical University, Shanghai 200433, China; Innovative Drug R&D Center, College of Life Sciences, Huaibei Normal University, Huaibei 235000, China
13
Zhang K, Chen L, Zhang T, Lu J, Liu D, Wu J. Machine learning quantitatively characterizes the deformation and destruction of explosive molecules. Phys Chem Chem Phys 2023; 25:8692-8704. [PMID: 36892514 DOI: 10.1039/d2cp04623g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2023]
Abstract
Explosives are widely used in mining, road construction, demolition of old buildings, and munitions. However, how chemical bonds between atoms break and recombine, how the molecular structure is deformed and destroyed, how the reaction product molecules are formed, and the details of this rapid change process in explosive reactions are not yet fully understood, which limits the full use of explosive energy and the safer use of explosives. This paper presents a quantitative model of molecular structure deformation using machine learning algorithms as well as a qualitative model of its relationship with molecular structure destruction, based on a molecular dynamics simulation and detailed analysis of the shock-loaded ε-CL-20, providing new perspectives for explosive community research. Specifically, the quantitative model of molecular structure deformation establishes the quantitative relationship between the molecular volume change and molecular position change, and between molecular distance change and molecular volume change, using machine learning algorithms such as Delaunay triangulation, clustering, and gradient descent. We find that the molecular spacing in explosives is strongly compressed after being shocked, and the peripheral structure can shrink inward, which helps keep the cage structure stable. When the peripheral structure is compressed to a certain extent, the cage structure volume begins to expand and is then destroyed. In addition, hydrogen atom transfer occurs within the explosive molecule. This study amplifies the structural changes and the chemical reaction process for explosive molecules after being strongly compressed by a shock wave, which can enrich the knowledge of the real detonation reaction process. The analysis method based on quantitative characterization using machine learning proposed in this study can also be used to analyze the microscopic reaction mechanism in other materials.
Affiliation(s)
- Kaining Zhang
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
- Lang Chen
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
- Teng Zhang
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
- Jianying Lu
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
- Danyang Liu
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
- Junying Wu
  - State Key Laboratory of Explosion Science and Technology, Beijing Institute of Technology, Beijing 100081, China.
14
Jiang Y, Ran X, Yang ZJ. Data-driven enzyme engineering to identify function-enhancing enzymes. Protein Eng Des Sel 2023; 36:gzac009. [PMID: 36214500 PMCID: PMC10365845 DOI: 10.1093/protein/gzac009] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 03/31/2022] [Revised: 08/08/2022] [Accepted: 09/28/2022] [Indexed: 01/22/2023] Open
Abstract
Identifying function-enhancing enzyme variants is a 'holy grail' challenge in protein science because it will allow researchers to expand the biocatalytic toolbox for late-stage functionalization of drug-like molecules, environmental degradation of plastics and other pollutants, and medical treatment of food allergies. Data-driven strategies, including statistical modeling, machine learning, and deep learning, have largely advanced the understanding of the sequence-structure-function relationships for enzymes. They have also enhanced the capability of predicting and designing new enzymes and enzyme variants for catalyzing the transformation of new-to-nature reactions. Here, we reviewed the recent progresses of data-driven models that were applied in identifying efficiency-enhancing mutants for catalytic reactions. We also discussed existing challenges and obstacles faced by the community. Although the review is by no means comprehensive, we hope that the discussion can inform the readers about the state-of-the-art in data-driven enzyme engineering, inspiring more joint experimental-computational efforts to develop and apply data-driven modeling to innovate biocatalysts for synthetic and pharmaceutical applications.
Affiliation(s)
- Yaoyukun Jiang
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Xinchun Ran
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
- Zhongyue J Yang
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37235, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN 37235, USA
  - Vanderbilt Institute of Chemical Biology, Vanderbilt University, Nashville, TN 37235, USA
  - Data Science Institute, Vanderbilt University, Nashville, TN 37235, USA
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, TN 37235, USA
15
Opuu V, Simonson T. Enzyme redesign and genetic code expansion. Protein Eng Des Sel 2023; 36:gzad017. [PMID: 37879093 DOI: 10.1093/protein/gzad017] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/13/2023] [Revised: 09/10/2023] [Accepted: 09/19/2023] [Indexed: 10/27/2023] Open
Abstract
Enzyme design is an important application of computational protein design (CPD). It can benefit enormously from the additional chemistries provided by noncanonical amino acids (ncAAs). These can be incorporated into an 'expanded' genetic code, and introduced in vivo into target proteins. The key step for genetic code expansion is to engineer an aminoacyl-transfer RNA (tRNA) synthetase (aaRS) and an associated tRNA that handles the ncAA. Experimental directed evolution has been successfully used to engineer aaRSs and incorporate over 200 ncAAs into expanded codes. But directed evolution has severe limits, and is not yet applicable to noncanonical AA backbones. CPD can help address several of its limitations, and has begun to be applied to this problem. We review efforts to redesign aaRSs, studies that designed new proteins and functionalities with the help of ncAAs, and some of the method developments that have been used, such as adaptive landscape flattening Monte Carlo, which allows an enzyme to be redesigned with substrate or transition state binding as the design target.
Affiliation(s)
- Vaitea Opuu
  - Institut Chimie Biologie Innovation (CNRS UMR8231), Ecole Supérieure de Physique et Chimie de Paris (ESPCI), 75005 Paris, France
- Thomas Simonson
  - Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, Institut Polytechnique de Paris, 91128 Palaiseau, France
16
Zhang W, Huang W, Tan J, Guo Q, Wu B. Heterogeneous catalysis mediated by light, electricity and enzyme via machine learning: Paradigms, applications and prospects. Chemosphere 2022; 308:136447. [PMID: 36116627 DOI: 10.1016/j.chemosphere.2022.136447] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2022] [Revised: 09/08/2022] [Accepted: 09/11/2022] [Indexed: 06/15/2023]
Abstract
The energy crisis and environmental pollution have become bottlenecks for sustainable human development. Therefore, there is an urgent need to develop new catalysts for energy production and environmental remediation. Because of the high cost of blind screening and limited computing resources, traditional experimental methods and theoretical calculations struggle to meet these requirements. In the past decades, computer science has made great progress, especially in the field of machine learning (ML). As a new research paradigm, ML greatly accelerates theoretical calculation methods such as first-principles calculations and molecular dynamics, and helps establish the physical picture of heterogeneous catalytic processes for energy and environment. This review first summarizes the general research paradigms of ML in the discovery of catalysts. Then, the latest progress of ML in light-, electricity- and enzyme-mediated heterogeneous catalysis is reviewed from the perspective of catalytic performance, operating conditions and reaction mechanism. General guidelines of ML for heterogeneous catalysis are proposed. Finally, the existing problems and future development trends of ML in heterogeneous catalysis mediated by light, electricity and enzyme are summarized. We expect that this review will facilitate the interaction between ML and heterogeneous catalysis and illuminate the development prospects of heterogeneous catalysis.
Affiliation(s)
- Wentao Zhang
  - Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, 518055, People's Republic of China
- Wenguang Huang
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China.
- Jie Tan
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China
- Qingwei Guo
  - South China Institute of Environmental Sciences, Ministry of Ecology and Environment of PRC, Guangzhou, 510655, People's Republic of China
- Bingdang Wu
  - School of Environmental Science and Engineering, Suzhou University of Science and Technology, Suzhou, 215009, People's Republic of China; Key Laboratory of Suzhou Sponge City Technology, Suzhou, 215002, People's Republic of China.
17
Yan B, Ran X, Gollu A, Cheng Z, Zhou X, Chen Y, Yang ZJ. IntEnzyDB: an Integrated Structure-Kinetics Enzymology Database. J Chem Inf Model 2022; 62:5841-5848. [PMID: 36286319 DOI: 10.1021/acs.jcim.2c01139] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
Abstract
Data-driven modeling has emerged as a new paradigm for biocatalyst design and discovery. Biocatalytic databases that integrate enzyme structure and function data are in urgent need. Here we describe IntEnzyDB as an integrated structure-kinetics database for facile statistical modeling and machine learning. IntEnzyDB employs a relational database architecture with a flattened data structure, which allows rapid data operation. This architecture also makes it easy for IntEnzyDB to incorporate more types of enzyme function data. IntEnzyDB contains enzyme kinetics and structure data from six enzyme commission classes. Using 1050 enzyme structure-kinetics pairs, we investigated the efficiency-perturbing propensities of mutations that are close or distal to the active site. The statistical results show that efficiency-enhancing mutations are globally encoded and that deleterious mutations are much more likely to occur in close mutations than in distal mutations. Finally, we describe a web interface that allows public users to access enzymology data stored in IntEnzyDB. IntEnzyDB will provide a computational facility for data-driven modeling in biocatalysis and molecular evolution.
Affiliation(s)
- Bailu Yan
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Department of Biostatistics, Vanderbilt University, Nashville, Tennessee 37205, United States
- Xinchun Ran
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Anvita Gollu
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Zihao Cheng
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Xiang Zhou
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Yiwen Chen
  - Data Science Institute, Vanderbilt University, Nashville, Tennessee 37235, United States
- Zhongyue J Yang
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Vanderbilt Institute of Chemical Biology, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Data Science Institute, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, Tennessee 37205, United States
18
Chang YF, Chen SY, Lee CC, Chen J, Lai CS. Easy and Rapid Approach to Obtaining the Binding Affinity of Biomolecular Interactions Based on the Deep Learning Boost. Anal Chem 2022; 94:10427-10434. [PMID: 35837692 DOI: 10.1021/acs.analchem.2c01620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Recently, the deep learning (DL) dimension of artificial intelligence has received much attention from biochemical researchers and thus has gradually become the key approach adopted in the area of biosensing applications. Studies have shown that the use of DL techniques for sensing can not only shorten the time of data analysis but also significantly increase the accuracy of data analysis and prediction, resulting in the performance improvement of biosensing systems in comparison to conventional methods. However, obtaining reliable equilibrium and rate constants of biomolecular interactions during the detection process remains difficult and time-consuming to date. In this study, we propose a transformed model based on deep transfer learning and a sequence-to-sequence autoencoder that can successfully transfer the SPR sensorgram to the protein-binding constants, that is, the association rate constant (ka) and dissociation rate constant (kd), which provide crucial information to understand the mechanisms of drug action and the functional structures of biomolecules. Experimentally, we first trained and tested the pre-trained model using the Langmuir model, which generated ideal SPR sensorgrams, and then we fine-tuned the pre-trained model with augmented SPR sensorgrams synthesized from a moderate-scale experiment using the synthetic minority oversampling technique (SMOTE). Next, the fine-tuned model was given a short experimental SPR sensorgram that only needs 110 s, and the sensorgram was directly transformed into a reconstructed ideal sensorgram. Finally, the binding kinetic constants, that is, ka and kd, as outputs, were obtained through fitting the reconstructed ideal sensorgram. The results showed that the prediction errors of ka and kd obtained by our model were less than 12% and 24%, respectively. Based on the convenience, accuracy, and reliability of the proposed DL approach, we believe our strategy significantly boosts the feasibility of monitoring the binding affinity of antibodies online during production.
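To make the role of the Langmuir model in the abstract above concrete, the following is a minimal sketch of an ideal 1:1 Langmuir sensorgram and of recovering kd from its dissociation phase. It is not the authors' deep learning pipeline: the function names, the parameter values, and the log-linear fit are illustrative assumptions, and only the standard 1:1 binding equations are used.

```python
import math


def langmuir_sensorgram(ka, kd, conc, rmax, t_assoc, t_total, dt=1.0):
    """Ideal 1:1 Langmuir SPR response sampled every dt seconds.

    ka: association rate constant (1/(M*s)); kd: dissociation rate constant (1/s);
    conc: analyte concentration (M); rmax: saturation response (RU).
    All names and values here are hypothetical, chosen for illustration.
    """
    k_obs = ka * conc + kd                  # observed rate during association
    r_eq = rmax * ka * conc / k_obs         # steady-state response at this concentration
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_assoc))  # response when injection stops
    times, response = [], []
    for i in range(int(round(t_total / dt)) + 1):
        t = i * dt
        if t <= t_assoc:
            r = r_eq * (1.0 - math.exp(-k_obs * t))     # association phase
        else:
            r = r_end * math.exp(-kd * (t - t_assoc))   # dissociation phase
        times.append(t)
        response.append(r)
    return times, response


def fit_kd_from_dissociation(times, response, t_assoc):
    """Estimate kd as minus the least-squares slope of ln(R) versus t after t_assoc."""
    pts = [(t, math.log(r)) for t, r in zip(times, response) if t > t_assoc and r > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    return -sxy / sxx


# Illustrative values only; they are not taken from the paper.
times, resp = langmuir_sensorgram(ka=1e5, kd=1e-2, conc=1e-7, rmax=100.0,
                                  t_assoc=110.0, t_total=300.0)
kd_est = fit_kd_from_dissociation(times, resp, t_assoc=110.0)
```

On noise-free synthetic data the fitted kd matches the input value; real sensorgrams carry noise and drift, which is the difficulty the abstract's reconstruction step is meant to address.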
Affiliation(s)
- Ying-Feng Chang
  - Artificial Intelligence Research Center, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
- Sin-You Chen
  - Artificial Intelligence Research Center, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
- Chi-Ching Lee
  - Artificial Intelligence Research Center, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Department of Computer Science and Information Engineering, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Genomic Medicine Core Laboratory, Chang Gung Memorial Hospital, Kweishan District, Taoyuan City 33305, Taiwan
- Jenhui Chen
  - Artificial Intelligence Research Center, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Department of Computer Science and Information Engineering, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Division of Breast Surgery and General Surgery, Department of Surgery, Chang Gung Memorial Hospital, Linkou Branch, Kweishan District, Taoyuan City 33305, Taiwan
  - Department of Electronic Engineering, Ming Chi University of Technology, Taishan District, New Taipei City 24301, Taiwan
- Chao-Sung Lai
  - Artificial Intelligence Research Center, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Department of Electronic Engineering, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Center for Biomedical Engineering, Chang Gung University, Kweishan District, Taoyuan City 33302, Taiwan
  - Department of Nephrology, Chang Gung Memorial Hospital, Kweishan District, Taoyuan City 33305, Taiwan
  - Department of Materials Engineering, Ming Chi University of Technology, Taishan District, New Taipei City 24301, Taiwan
19
Laveglia V, Giachetti A, Sala D, Andreini C, Rosato A. Learning to Identify Physiological and Adventitious Metal-Binding Sites in the Three-Dimensional Structures of Proteins by Following the Hints of a Deep Neural Network. J Chem Inf Model 2022; 62:2951-2960. [PMID: 35679182 PMCID: PMC9241070 DOI: 10.1021/acs.jcim.2c00522] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/18/2023]
Abstract
Thirty-eight percent of protein structures in the Protein Data Bank contain at least one metal ion. However, not all these metal sites are biologically relevant. Cations present as impurities during sample preparation or in the crystallization buffer can cause the formation of protein-metal complexes that do not exist in vivo. We implemented a deep learning approach to build a classifier able to distinguish between physiological and adventitious zinc-binding sites in the 3D structures of metalloproteins. We trained the classifier using manually annotated sites extracted from the MetalPDB database. Using a 10-fold cross validation procedure, the classifier achieved an accuracy of about 90%. The same neural classifier could predict the physiological relevance of non-heme mononuclear iron sites with an accuracy of nearly 80%, suggesting that the rules learned on zinc sites have general relevance. By quantifying the relative importance of the features describing the input zinc sites from the network perspective and by analyzing the characteristics of the MetalPDB datasets, we inferred some common principles. Physiological sites present a low solvent accessibility of the amino acids forming coordination bonds with the metal ion (the metal ligands), a relatively large number of residues in the metal environment (≥20), and a distinct pattern of conservation of Cys and His residues in the site. Adventitious sites, on the other hand, tend to have a low number of donor atoms from the polypeptide chain (often one or two). These observations support the evaluation of the physiological relevance of novel metal-binding sites in protein structures.
Affiliation(s)
- Vincenzo Laveglia
  - Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Andrea Giachetti
  - Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Davide Sala
  - Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
  - Institute for Drug Discovery, Leipzig University, Brüderstr. 34, 04103 Leipzig, Germany
  - Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
- Claudia Andreini
  - Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
  - Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
  - Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy
- Antonio Rosato
  - Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
  - Magnetic Resonance Center (CERM), University of Florence, Via Luigi Sacconi 6, 50019 Sesto Fiorentino, Italy
  - Department of Chemistry, University of Florence, Via della Lastruccia 3, 50019 Sesto Fiorentino, Italy
20
Hall SW, Díaz Leines G, Sarupria S, Rogal J. Practical guide to replica exchange transition interface sampling and forward flux sampling. J Chem Phys 2022; 156:200901. [DOI: 10.1063/5.0080053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Path sampling approaches have become invaluable tools to explore the mechanisms and dynamics of the so-called rare events that are characterized by transitions between metastable states separated by sizable free energy barriers. Their practical application, in particular to ever more complex molecular systems, is, however, not entirely trivial. Focusing on replica exchange transition interface sampling (RETIS) and forward flux sampling (FFS), we discuss a range of analysis tools that can be used to assess the quality and convergence of such simulations, which is crucial to obtain reliable results. The basic ideas of a step-wise evaluation are exemplified for the study of nucleation in several systems with different complexities, providing a general guide for the critical assessment of RETIS and FFS simulations.
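As a rough illustration of the quantity that an FFS simulation ultimately reports, the sketch below computes a rate constant from the standard FFS expression: the flux through the first interface multiplied by the product of the conditional crossing probabilities between successive interfaces. The expression is the textbook form rather than something stated in the abstract, and the function name, arguments, and numbers are hypothetical.

```python
def ffs_rate(n_crossings, sim_time, successes, trials):
    """Rate constant from forward flux sampling bookkeeping.

    n_crossings / sim_time is the flux out of the initial state through the
    first interface; successes[i] / trials[i] estimates the probability of
    reaching interface i+1 from interface i before returning to the initial state.
    """
    if len(successes) != len(trials):
        raise ValueError("successes and trials must have the same length")
    flux = n_crossings / sim_time
    p_total = 1.0
    for s, n in zip(successes, trials):
        p_total *= s / n          # chain the conditional crossing probabilities
    return flux * p_total


# Hypothetical counts: 200 first-interface crossings in 1000 time units,
# then three interfaces with 50%, 25% and 10% success fractions.
rate = ffs_rate(n_crossings=200, sim_time=1000.0,
                successes=[50, 25, 10], trials=[100, 100, 100])
```

Because the rate is a product of estimated factors, a poorly converged probability at any single interface propagates directly into the final value, which is one reason a stepwise assessment of each stage matters.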
Affiliation(s)
- Steven W. Hall
  - Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, Minnesota 55455, USA
- Grisell Díaz Leines
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridgeshire CB2 1EW, United Kingdom
- Sapna Sarupria
  - Department of Chemistry, University of Minnesota, Minneapolis, Minnesota 55455, USA
  - Department of Chemical and Biomolecular Engineering, Clemson University, Clemson, South Carolina 29634, USA
- Jutta Rogal
  - Department of Chemistry, New York University, New York, New York 10003, USA
  - Fachbereich Physik, Freie Universität Berlin, 14195 Berlin, Germany
21
Song Z, Trozzi F, Tian H, Yin C, Tao P. Mechanistic Insights into Enzyme Catalysis from Explaining Machine-Learned Quantum Mechanical and Molecular Mechanical Minimum Energy Pathways. ACS Physical Chemistry Au 2022; 2:316-330. [PMID: 35936506 PMCID: PMC9344433 DOI: 10.1021/acsphyschemau.2c00005] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/20/2023]
Abstract
With the increasing popularity of machine learning (ML) applications, the demand for explainable artificial intelligence techniques to explain ML models developed for computational chemistry has also emerged. In this study, we present the development of the Boltzmann-weighted cumulative integrated gradients (BCIG) approach for effective explanation of mechanistic insights into ML models trained on high-level quantum mechanical and molecular mechanical (QM/MM) minimum energy pathways. Using the acylation reactions of the Toho-1 β-lactamase and two antibiotics (ampicillin and cefalexin) as the model systems, we show that the BCIG approach could quantitatively attribute the energetic contribution in one system and the relative reactivity of individual steps across different systems to specific chemical processes such as the bond making/breaking and proton transfers. The proposed BCIG contribution attribution method quantifies chemistry-interpretable insights in terms of contributions from each elementary chemical process, which is in agreement with the validating QM/MM calculations and our intuitive mechanistic understandings of the model reactions.
22
Antoniou D, Schwartz SD. Method for Identifying Common Features in Reactive Trajectories of a Transition Path Sampling Ensemble. J Chem Theory Comput 2022; 18:3997-4004. [PMID: 35536190 DOI: 10.1021/acs.jctc.2c00186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Simulation methods like transition path sampling (TPS) generate an abundance of information buried in the collection of reactive trajectories that they produce. However, only limited use has been made of this information, mainly for the identification of the reaction coordinate. The standard TPS tools have been designed for monitoring the progress of the system from reactants to products. However, the reaction coordinate does not contain all the information regarding the mechanism. In our earlier work, we have used TPS on enzymatic systems and have identified important motions in the reactant well that prepare the system for the reaction. Since these events take place in the reactant well, they are beyond the reach of standard TPS postprocessing methods. We present a simple scheme for identifying the common trends in enzymatic trajectories. This scheme was designed for a specific class of enzymatic reactions: it can be used for identifying motions that guide the system to reaction-ready conformations. We have applied it to two enzymatic systems that we have studied in the past, formate dehydrogenase and purine nucleoside phosphorylase, and we were able to identify interactions, far from the transition state, that are important for preparing the system for the reaction but that had been overlooked in earlier work.
Affiliation(s)
- Dimitri Antoniou
  - Department of Chemistry and Biochemistry, University of Arizona, 1306 East University Blvd., Tucson, Arizona 85721, United States
- Steven D Schwartz
  - Department of Chemistry and Biochemistry, University of Arizona, 1306 East University Blvd., Tucson, Arizona 85721, United States
23
Betinol IO, Reid JP. A predictive and mechanistic statistical modelling workflow for improving decision making in organic synthesis and catalysis. Org Biomol Chem 2022; 20:6012-6018. [PMID: 35389396 DOI: 10.1039/d2ob00272h] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/02/2023]
Abstract
The application of multivariate linear regression models has been widely utilized as a strategy to streamline the reaction optimization process. While these tools likely provide relatively safe predictions, embedding a method for forecasting the probability of achieving the desired reaction outcome would be valuable for streamlining the identification of promising structures with the best chance of success. Herein, we present a workflow that predicts the probability that a reaction will be successful and is easy and quick to apply. We show that this probabilistic framework can effectively differentiate between predictions often indistinguishable by multivariate linear regression analysis. Moreover, these techniques can enhance the development of mechanistically informative correlations by producing more direct pathways for molecular development and design. Overall, we anticipate this protocol will be generally applicable and useful for accelerating successful chemical discovery.
Affiliation(s)
- Isaiah O Betinol
  - Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada.
- Jolene P Reid
  - Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada.
24
Computation of photovoltaic and stability properties of hybrid organic–inorganic perovskites via convolutional neural networks. Theor Chem Acc 2022. [DOI: 10.1007/s00214-022-02875-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
25
Meuwly M. Atomistic Simulations for Reactions and Vibrational Spectroscopy in the Era of Machine Learning: Quo Vadis? J Phys Chem B 2022; 126:2155-2167. [PMID: 35286087 DOI: 10.1021/acs.jpcb.2c00212] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Atomistic simulations using accurate energy functions can provide molecular-level insight into functional motions of molecules in the gas and in the condensed phase. This Perspective delineates the present status of the field from the efforts of others and some of our own work and discusses open questions and future prospects. The combination of physics-based long-range representations using multipolar charge distributions and kernel representations for the bonded interactions is shown to provide realistic models for the exploration of the infrared spectroscopy of molecules in solution. For reactions, empirical models connecting dedicated energy functions for the reactant and product states allow statistically meaningful sampling of conformational space whereas machine-learned energy functions are superior in accuracy. The future combination of physics-based models with machine-learning techniques and integration into all-purpose molecular simulation software provides a unique opportunity to bring such dynamics simulations closer to reality.
Affiliation(s)
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
26
Tatta ER, Imchen M, Moopantakath J, Kumavath R. Bioprospecting of microbial enzymes: current trends in industry and healthcare. Appl Microbiol Biotechnol 2022; 106:1813-1835. [PMID: 35254498 DOI: 10.1007/s00253-022-11859-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 02/15/2022] [Revised: 02/15/2022] [Accepted: 02/26/2022] [Indexed: 12/13/2022]
Abstract
Microbial enzymes have an indispensable role in producing foods, pharmaceuticals, and other commercial goods. Many novel enzymes have been reported from all domains of life, such as plants, microbes, and animals. Nonetheless, industrially desirable enzymes of microbial origin are limited. This review article discusses the classifications, applications, sources, and challenges of most demanded industrial enzymes such as pectinases, cellulase, lipase, and protease. In addition, the production of novel enzymes through protein engineering technologies such as directed evolution, rational, and de novo design, for the improvement of existing industrial enzymes is also explored. We have also explored the role of metagenomics, nanotechnology, OMICs, and machine learning approaches in the bioprospecting of novel enzymes. Overall, this review covers the basics of biocatalysts in industrial and healthcare applications and provides an overview of existing microbial enzyme optimization tools. KEY POINTS: • Microbial bioactive molecules are vital for therapeutic and industrial applications. • High-throughput OMIC is the most proficient approach for novel enzyme discovery. • Comprehensive databases and efficient machine learning models are the need of the hour to fast forward de novo enzyme design and discovery.
Affiliation(s)
- Eswar Rao Tatta
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
- Madangchanok Imchen
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
- Jamseel Moopantakath
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
- Ranjith Kumavath
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India.
|
27
|
Vanella R, Kovacevic G, Doffini V, Fernández de Santaella J, Nash MA. High-throughput screening, next generation sequencing and machine learning: advanced methods in enzyme engineering. Chem Commun (Camb) 2022; 58:2455-2467. [PMID: 35107442 PMCID: PMC8851469 DOI: 10.1039/d1cc04635g] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/29/2022]
Abstract
Enzyme engineering is an important biotechnological process capable of generating tailored biocatalysts for applications in industrial chemical conversion and biopharma. Typical enhancements sought in enzyme engineering and in vitro evolution campaigns include improved folding stability, catalytic activity, and/or substrate specificity. Despite significant progress in recent years in the areas of high-throughput screening and DNA sequencing, our ability to explore the vast space of functional enzyme sequences remains severely limited. Here, we review the currently available suite of modern methods for enzyme engineering, with a focus on novel readout systems based on enzyme cascades, and new approaches to reaction compartmentalization including single-cell hydrogel encapsulation techniques to achieve a genotype–phenotype link. We further summarize systematic scanning mutagenesis approaches and their merger with deep mutational scanning and massively parallel next-generation DNA sequencing technologies to generate mutability landscapes. Finally, we discuss the implementation of machine learning models for computational prediction of enzyme phenotypic fitness from sequence. This broad overview of current state-of-the-art approaches for enzyme engineering and evolution will aid newcomers and experienced researchers alike in identifying the important challenges that should be addressed to move the field forward.
Affiliation(s)
- Rosario Vanella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
- Gordana Kovacevic
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
- Vanni Doffini
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
- Jaime Fernández de Santaella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
- Michael A Nash
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland.
|
28
|
Paruchuri BC, Gopal V, Sarupria S, Larsen J. Toward enzyme-responsive polymersome drug delivery. Nanomedicine (Lond) 2021; 16:2679-2693. [PMID: 34870451 DOI: 10.2217/nnm-2021-0194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/01/2023]
Abstract
In drug delivery, enzyme-responsive drug carriers are becoming increasingly relevant because of the growing association of disease pathology with enzyme overexpression. Polymersomes are of interest to such applications because of their tunable properties. While polymersomes open up a wide range of chemical and physical properties to explore, they also present a challenge in developing generalized rules for the synthesis of novel systems. Motivated by this issue, in this perspective, we summarize the existing knowledge on enzyme-responsive polymersomes and outline the main design choices. Then, we propose heuristics to guide the design of novel systems. Finally, we discuss the potential of an integrated approach using computer simulations and experimental studies to streamline this design process and close the existing knowledge gaps.
Affiliation(s)
- Varun Gopal
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC 29631, USA; Department of Chemical Engineering & Material Science, University of Minnesota, Minneapolis, MN 55455, USA
- Sapna Sarupria
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC 29631, USA; Center for Optical Materials Science & Engineering Technologies (COMSET), Clemson University, Clemson, SC 29670, USA; Department of Chemistry, University of Minnesota, Minneapolis, MN 55455, USA
- Jessica Larsen
- Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC 29631, USA; Department of Bioengineering, Clemson University, Clemson, SC 29631, USA
|
29
|
Rangel-Martinez D, Nigam K, Ricardez-Sandoval LA. Machine learning on sustainable energy: A review and outlook on renewable energy systems, catalysis, smart grid and energy storage. Chem Eng Res Des 2021. [DOI: 10.1016/j.cherd.2021.08.013] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/24/2022]
|
30
|
Abstract
Given the importance of catalysts in the chemical industry, they have been extensively investigated by experimental and numerical methods. With the development of computational algorithms and computer hardware, large-scale simulations have enabled influential studies with more atomic details reflecting microscopic mechanisms. This review provides a comprehensive summary of recent developments in molecular dynamics, including ab initio molecular dynamics and reaction force-field molecular dynamics. Recent research on both approaches to catalyst calculations is reviewed, including growth, dehydrogenation, hydrogenation, oxidation reactions, bias, and recombination of carbon materials that can guide catalyst calculations. Machine learning has attracted increasing interest in recent years, and its combination with the field of catalysts has inspired promising development approaches. Its applications in machine learning potential, catalyst design, performance prediction, structure optimization, and classification have been summarized in detail. This review hopes to shed light and perspective on ML approaches in catalysts.
|
31
|
Lyu Y, Scrimin P. Mimicking Enzymes: The Quest for Powerful Catalysts from Simple Molecules to Nanozymes. ACS Catal 2021. [DOI: 10.1021/acscatal.1c01219] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/12/2022]
Affiliation(s)
- Yanchao Lyu
- University of Padova, Department of Chemical Sciences, via Marzolo, 1, 35131 Padova, Italy
- Paolo Scrimin
- University of Padova, Department of Chemical Sciences, via Marzolo, 1, 35131 Padova, Italy
|
32
|
Michael E, Simonson T. How much can physics do for protein design? Curr Opin Struct Biol 2021; 72:46-54. [PMID: 34461593 DOI: 10.1016/j.sbi.2021.07.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/12/2021] [Revised: 07/22/2021] [Accepted: 07/25/2021] [Indexed: 01/03/2023]
Abstract
Physics and physical chemistry are an important thread in computational protein design, complementary to knowledge-based tools. They provide molecular mechanics scoring functions that need little or no ad hoc parameter readjustment, methods to thoroughly sample equilibrium ensembles, and different levels of approximation for conformational flexibility. They led recently to the successful redesign of a small protein using a physics-based folded state energy. Adaptive Monte Carlo or molecular dynamics schemes were discovered where protein variants are populated as per their ligand-binding free energy or catalytic efficiency. Molecular dynamics have been used for backbone flexibility. Implicit solvent models have been refined, polarizable force fields applied, and many physical insights obtained.
Affiliation(s)
- Eleni Michael
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France
- Thomas Simonson
- Laboratoire de Biologie Structurale de la Cellule (CNRS UMR7654), Ecole Polytechnique, 91128, Palaiseau, France.
|
33
|
Dutta K, Shityakov S, Khalifa I. New Trends in Bioremediation Technologies Toward Environment-Friendly Society: A Mini-Review. Front Bioeng Biotechnol 2021; 9:666858. [PMID: 34409018 PMCID: PMC8365754 DOI: 10.3389/fbioe.2021.666858] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/11/2021] [Accepted: 05/26/2021] [Indexed: 01/29/2023]
Abstract
Today's environmental balance has been compromised by the unreasonable and sometimes dangerous actions committed by humans to maintain their dominance over the Earth's natural resources. As a result, oceans are contaminated by the different types of plastic trash, crude oil coming from mismanagement of transporting ships spilling it in the water, and air pollution due to increasing production of greenhouse gases, such as CO2 and CH4 etc., into the atmosphere. The lands, agricultural fields, and groundwater are also contaminated by the infamous chemicals viz., polycyclic aromatic hydrocarbons, pyrethroids pesticides, bisphenol-A, and dioxanes. Therefore, bioremediation might function as a convenient alternative to restore a clean environment. However, at present, the majority of bioremediation reports are limited to the natural capabilities of microbial enzymes. Synthetic biology with uncompromised supervision of ethical standards could help to outsmart nature's engineering, such as the CETCH cycle for improved CO2 fixation. Additionally, a blend of synthetic biology with machine learning algorithms could expand the possibilities of bioengineering. This review summarized current state-of-the-art knowledge of the data-assisted enzyme redesigning to actively promote new research on important enzymes to ameliorate the environment.
Affiliation(s)
- Kunal Dutta
- Department of Human Physiology, Vidyasagar University, Medinipur, India
- Sergey Shityakov
- Department of Chemoinformatics, Infochemistry Scientific Center, Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University), Saint-Petersburg, Russia
- Ibrahim Khalifa
- Food Technology Department, Faculty of Agriculture, Benha University, Moshtohor, Egypt
|
34
|
Harder, better, faster, stronger: Large-scale QM and QM/MM for predictive modeling in enzymes and proteins. Curr Opin Struct Biol 2021; 72:9-17. [PMID: 34388673 DOI: 10.1016/j.sbi.2021.07.004] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 05/28/2021] [Revised: 06/25/2021] [Accepted: 07/05/2021] [Indexed: 11/23/2022]
Abstract
Computational prediction of enzyme mechanism and protein function requires accurate physics-based models and suitable sampling. We discuss recent advances in large-scale quantum mechanical (QM) modeling of biochemical systems that have reduced the cost of high-accuracy models. Tradeoffs between sampling and accuracy have motivated modeling with molecular mechanics (MM) in a multiscale QM/MM or iterative approach. Limitations to both conventional density-functional theory and classical MM force fields remain for describing noncovalent interactions in comparison to experiment or wavefunction theory. Because predictions of enzyme action (i.e. electrostatics), free energy barriers, and mechanisms are sensitive to the protocol and embedding method in QM/MM, convergence tests and systematic methods for quantifying QM-level interactions are a needed, active area of development.
|
35
|
Efficient hydrogenation catalyst designing via preferential adsorption sites construction towards active copper. J Catal 2021. [DOI: 10.1016/j.jcat.2021.06.025] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/19/2022]
|
36
|
Purslow JA, Nguyen TT, Khatiwada B, Singh A, Venditti V. N6-methyladenosine binding induces a metal-centered rearrangement that activates the human RNA demethylase Alkbh5. SCIENCE ADVANCES 2021; 7:7/34/eabi8215. [PMID: 34407931 PMCID: PMC8373141 DOI: 10.1126/sciadv.abi8215] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/01/2021] [Accepted: 06/29/2021] [Indexed: 05/13/2023]
Abstract
Alkbh5 catalyzes demethylation of the N6-methyladenosine (m6A), an epigenetic mark that controls several physiological processes including carcinogenesis and stem cell differentiation. The activity of Alkbh5 comprises two coupled reactions. The first reaction involves decarboxylation of α-ketoglutarate (αKG) and formation of a Fe4+═O species. This oxyferryl intermediate oxidizes the m6A to reestablish the canonical base. Despite coupling between the two reactions being required for the correct Alkbh5 functioning, the mechanisms linking dioxygen activation to m6A binding are not fully understood. Here, we use solution NMR to investigate the structure and dynamics of apo and holo Alkbh5. We show that binding of m6A to Alkbh5 induces a metal-centered rearrangement of αKG that increases the exposed area of the metal, making it available for binding O2. Our study reveals the molecular mechanisms underlying activation of Alkbh5, therefore opening new perspectives for the design of novel strategies to control gene expression and cancer progression.
Affiliation(s)
- Trang T Nguyen
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA
- Aayushi Singh
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA
- Vincenzo Venditti
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA.
- Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50011, USA
|
37
|
Abstract
Machine learning (ML) techniques applied to chemical reactions have a long history. The present contribution discusses applications ranging from small molecule reaction dynamics to computational platforms for reaction planning. ML-based techniques can be particularly relevant for problems involving both computation and experiments. For one, Bayesian inference is a powerful approach to develop models consistent with knowledge from experiments. Second, ML-based methods can also be used to handle problems that are formally intractable using conventional approaches, such as exhaustive characterization of state-to-state information in reactive collisions. Finally, the explicit simulation of reactive networks as they occur in combustion has become possible using machine-learned neural network potentials. This review provides an overview of the questions that can and have been addressed using machine learning techniques, and an outlook discusses challenges in this diverse and stimulating field. It is concluded that ML applied to chemistry problems as practiced and conceived today has the potential to transform the way with which the field approaches problems involving chemical reactions, in both research and academic teaching.
Affiliation(s)
- Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland; Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
|
38
|
Feehan R, Montezano D, Slusky JSG. Machine learning for enzyme engineering, selection and design. Protein Eng Des Sel 2021; 34:gzab019. [PMID: 34296736 PMCID: PMC8299298 DOI: 10.1093/protein/gzab019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/14/2020] [Revised: 06/18/2021] [Accepted: 06/23/2021] [Indexed: 11/15/2022]
Abstract
Machine learning is a useful computational tool for large and complex tasks such as those in the field of enzyme engineering, selection and design. In this review, we examine enzyme-related applications of machine learning. We start by comparing tools that can identify the function of an enzyme and the site responsible for that function. Then we detail methods for optimizing important experimental properties, such as the enzyme environment and enzyme reactants. We describe recent advances in enzyme systems design and enzyme design itself. Throughout we compare and contrast the data and algorithms used for these tasks to illustrate how the algorithms and data can be best used by future designers.
Affiliation(s)
- Ryan Feehan
- Center for Computational Biology, The University of Kansas, 2030 Becker Dr., Lawrence, KS 66047-1620, USA
- Daniel Montezano
- Center for Computational Biology, The University of Kansas, 2030 Becker Dr., Lawrence, KS 66047-1620, USA
- Joanna S G Slusky
- Center for Computational Biology, The University of Kansas, 2030 Becker Dr., Lawrence, KS 66047-1620, USA
- Department of Molecular Biosciences, The University of Kansas, 1200 Sunnyside Ave. Lawrence, KS 66045-7600, USA
|
39
|
Gargiulo S, Soumillion P. Directed evolution for enzyme development in biocatalysis. Curr Opin Chem Biol 2020; 61:107-113. [PMID: 33385931 DOI: 10.1016/j.cbpa.2020.11.006] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/02/2020] [Revised: 11/25/2020] [Accepted: 11/29/2020] [Indexed: 02/07/2023]
Abstract
As an important sector of the chemical industry, biocatalysis requires the continuous development of enzymes with tailor-made activity, selectivity, stability, or tolerance to unnatural environments. This is now routinely achieved by directed evolution based on iterative cycles of genetic diversification and activity screening. Here, we highlight its recent developments. First, the design of "smarter" libraries by focused mutagenesis may be a crucial starting point for a fast and successful outcome. Second, library assembly and expression are also key steps that benefit from progress in modern molecular biology. Finally, various strategies may be considered for library screening depending on the final objective: while low-throughput direct assays have been very successful in generating enzymes for important biocatalytic processes, even in bringing completely new chemistries to the enzyme world, ultrahigh-throughput screening methods are emerging as powerful approaches for engineering the next generation of industrial enzymes.
Affiliation(s)
- Serena Gargiulo
- Louvain Institute of Biomolecular Science and Technology, Université catholique de Louvain, Place Croix du Sud 4-5, 1390 Louvain-la-Neuve, Belgium
- Patrice Soumillion
- Louvain Institute of Biomolecular Science and Technology, Université catholique de Louvain, Place Croix du Sud 4-5, 1390 Louvain-la-Neuve, Belgium.
|
40
|
Han Y, Tang B, Wang L, Bao H, Lu Y, Guan C, Zhang L, Le M, Liu Z, Wu M. Machine-Learning-Driven Synthesis of Carbon Dots with Enhanced Quantum Yields. ACS NANO 2020; 14:14761-14768. [PMID: 32960048 DOI: 10.1021/acsnano.0c01899] [Citation(s) in RCA: 80] [Impact Index Per Article: 20.0] [Indexed: 05/27/2023]
Abstract
Knowing the correlation of reaction parameters in the preparation process of carbon dots (CDs) is essential for optimizing the synthesis strategy, exploring exotic properties, and exploiting potential applications. However, the integrated screening experimental data on the synthesis of CDs are huge and noisy. Machine learning (ML) has recently been successfully used for the screening of high-performance materials. Here, we demonstrate how ML-based techniques can offer insight into the successful prediction, optimization, and acceleration of CDs' synthesis process. A regression ML model on hydrothermal-synthesized CDs is established capable of revealing the relationship between various synthesis parameters and experimental outcomes as well as enhancing the process-related properties such as the fluorescent quantum yield (QY). CDs exhibiting a strong green emission with QY up to 39.3% are obtained through the combined ML guidance and experimental verification. The mass of precursors and the volume of alkaline catalysts are identified as the most important features in the synthesis of high-QY CDs by the trained ML model. The CDs are applied as an ultrasensitive fluorescence probe for monitoring the Fe3+ ion because of their superior optical behaviors. The probe exhibits the linear response to the Fe3+ ion with a wide concentration range (0-150 μM), and its detection limit is 0.039 μM. Our findings demonstrate the great capability of ML to guide the synthesis of high-quality CDs, accelerating the development of intelligent material.
Affiliation(s)
- Yu Han
- Institute of Nanochemistry and Nanobiology, School of Environmental and Chemical Engineering, Shanghai University, 99 Shangda Road, BaoShan District, Shanghai 200444, P.R. China
- Bijun Tang
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Liang Wang
- Institute of Nanochemistry and Nanobiology, School of Environmental and Chemical Engineering, Shanghai University, 99 Shangda Road, BaoShan District, Shanghai 200444, P.R. China
- Hong Bao
- Institute of Nanochemistry and Nanobiology, School of Environmental and Chemical Engineering, Shanghai University, 99 Shangda Road, BaoShan District, Shanghai 200444, P.R. China
- Yuhao Lu
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Cuntai Guan
- School of Computer Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Liang Zhang
- Institute of Nanochemistry and Nanobiology, School of Environmental and Chemical Engineering, Shanghai University, 99 Shangda Road, BaoShan District, Shanghai 200444, P.R. China
- Mengying Le
- Institute of Nanochemistry and Nanobiology, School of Environmental and Chemical Engineering, Shanghai University, 99 Shangda Road, BaoShan District, Shanghai 200444, P.R. China
- Zheng Liu
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798, Singapore
- Minghong Wu
- Shanghai Applied Radiation Institute, Shanghai University, 333 Nanchen Road, BaoShan District, Shanghai 200444, P.R. China
|
41
|
Machine learning approach for elucidating and predicting the role of synthesis parameters on the shape and size of TiO2 nanoparticles. Sci Rep 2020; 10:18910. [PMID: 33144623 PMCID: PMC7609603 DOI: 10.1038/s41598-020-75967-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/04/2020] [Accepted: 10/19/2020] [Indexed: 01/03/2023]
Abstract
In the present work, a series of design rules is developed to tune the morphology of TiO2 nanoparticles through a hydrothermal process. Through a careful experimental design, the influence of relevant process parameters on the synthesis outcome is studied, leading to the development of predictive models using machine learning methods. The models, after training and validation, are able to predict with high accuracy the synthesis outcome in terms of nanoparticle size, polydispersity and aspect ratio. Furthermore, they are applied in a reverse-engineering approach to perform the inverse process, i.e. to obtain the optimal synthesis parameters given a specific product characteristic. For the first time, a synthesis method is presented that allows continuous and precise control of NP morphology, with the possibility to tune the aspect ratio over a large range from 1.4 (perfect truncated bipyramids) to 6 (elongated nanoparticles) and the length from 20 to 140 nm.
|
42
|
Abstract
We have analyzed the reaction catalyzed by formate dehydrogenase using transition path sampling. This system has recently received experimental attention using infrared spectroscopy and heavy-enzyme studies. Some of the experimental results point to the possible importance of protein motions that are coupled to the chemical step. We found that the residue Val123 that lies behind the nicotinamide ring occasionally comes into van der Waals contact with the acceptor and that in all reactive trajectories, the barrier-crossing event is preceded by this contact, meaning that the motion of Val123 is part of the reaction coordinate. Experimental results have been interpreted with a two-dimensional formula for the chemical rate, which cannot capture effects such as the one we describe.
Affiliation(s)
- Dimitri Antoniou
- Department of Chemistry and Biochemistry, University of Arizona, 1306 East University Blvd., Tucson, Arizona 85721, United States
- Steven D Schwartz
- Department of Chemistry and Biochemistry, University of Arizona, 1306 East University Blvd., Tucson, Arizona 85721, United States
|
43
|
Co-evolution of activity and thermostability of an aldo-keto reductase KmAKR for asymmetric synthesis of statin precursor dichiral diols. Bioorg Chem 2020; 103:104228. [DOI: 10.1016/j.bioorg.2020.104228] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/28/2020] [Revised: 07/28/2020] [Accepted: 08/11/2020] [Indexed: 12/13/2022]
|
44
|
Zhou J, Xu G, Ni Y. Stereochemistry in Asymmetric Reduction of Bulky–Bulky Ketones by Alcohol Dehydrogenases. ACS Catal 2020. [DOI: 10.1021/acscatal.0c02646] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 12/13/2022]
Affiliation(s)
- Jieyu Zhou
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122 Jiangsu, China
- Guochao Xu
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122 Jiangsu, China
- Ye Ni
- Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, 214122 Jiangsu, China
|
45
|
Volk MJ, Lourentzou I, Mishra S, Vo LT, Zhai C, Zhao H. Biosystems Design by Machine Learning. ACS Synth Biol 2020; 9:1514-1533. [PMID: 32485108 DOI: 10.1021/acssynbio.0c00129] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Indexed: 12/16/2022]
Abstract
Biosystems such as enzymes, pathways, and whole cells have been increasingly explored for biotechnological applications. However, the intricate connectivity and resulting complexity of biosystems poses a major hurdle in designing biosystems with desirable features. As -omics and other high throughput technologies have been rapidly developed, the promise of applying machine learning (ML) techniques in biosystems design has started to become a reality. ML models enable the identification of patterns within complicated biological data across multiple scales of analysis and can augment biosystems design applications by predicting new candidates for optimized performance. ML is being used at every stage of biosystems design to help find nonobvious engineering solutions with fewer design iterations. In this review, we first describe commonly used models and modeling paradigms within ML. We then discuss some applications of these models that have already shown success in biotechnological applications. Moreover, we discuss successful applications at all scales of biosystems design, including nucleic acids, genetic circuits, proteins, pathways, genomes, and bioprocesses. Finally, we discuss some limitations of these methods and potential solutions as well as prospects of the combination of ML and biosystems design.
|
46
|
Dotas RR, Nguyen TT, Stewart CE, Ghirlando R, Potoyan DA, Venditti V. Hybrid Thermophilic/Mesophilic Enzymes Reveal a Role for Conformational Disorder in Regulation of Bacterial Enzyme I. J Mol Biol 2020; 432:4481-4498. [PMID: 32504625 DOI: 10.1016/j.jmb.2020.05.024] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 04/16/2020] [Revised: 05/23/2020] [Accepted: 05/29/2020] [Indexed: 02/08/2023]
Abstract
Conformational disorder is emerging as an important feature of biopolymers, regulating a vast array of cellular functions, including signaling, phase separation, and enzyme catalysis. Here we combine NMR, crystallography, computer simulations, protein engineering, and functional assays to investigate the role played by conformational heterogeneity in determining the activity of the C-terminal domain of bacterial Enzyme I (EIC). In particular, we design chimeric proteins by hybridizing EIC from thermophilic and mesophilic organisms, and we characterize the resulting constructs for structure, dynamics, and biological function. We show that EIC exists as a mixture of active and inactive conformations and that functional regulation is achieved by tuning the thermodynamic balance between active and inactive states. Interestingly, we also present a hybrid thermophilic/mesophilic enzyme that is thermostable and more active than the wild-type thermophilic enzyme, suggesting that hybridizing thermophilic and mesophilic proteins is a valid strategy to engineer thermostable enzymes with significant low-temperature activity.
Affiliation(s)
- Rochelle R Dotas
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA
- Trang T Nguyen
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA
| | - Charles E Stewart
- Macromolecular X-ray Crystallography Facility, Office of Biotechnology, Iowa State University, Ames, IA 50011, USA
| | - Rodolfo Ghirlando
- Laboratory of Molecular Biology, NIDDK, National Institutes of Health, Bethesda, MD 20892, USA
| | - Davit A Potoyan
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA; Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50011, USA.
| | - Vincenzo Venditti
- Department of Chemistry, Iowa State University, Ames, IA 50011, USA; Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, IA 50011, USA.
| |
|
47
|
Mdluli V, Diluzio S, Lewis J, Kowalewski JF, Connell TU, Yaron D, Kowalewski T, Bernhard S. High-throughput Synthesis and Screening of Iridium(III) Photocatalysts for the Fast and Chemoselective Dehalogenation of Aryl Bromides. ACS Catal 2020. [DOI: 10.1021/acscatal.0c02247] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 01/24/2023]
Affiliation(s)
- Velabo Mdluli
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Stephen Diluzio
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Jacqueline Lewis
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Jakub F. Kowalewski
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Timothy U. Connell
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- David Yaron
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Tomasz Kowalewski
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Stefan Bernhard
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
|
48
|
von der Esch B, Dietschreit JCB, Peters LDM, Ochsenfeld C. Finding Reactive Configurations: A Machine Learning Approach for Estimating Energy Barriers Applied to Sirtuin 5. J Chem Theory Comput 2019; 15:6660-6667. [PMID: 31765138 DOI: 10.1021/acs.jctc.9b00876] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 12/15/2022]
Abstract
Sirtuin 5 is a class III histone deacetylase that, unlike its classification, mainly catalyzes desuccinylation and demalonylation reactions. It is an interesting drug target that we use here to test new ideas for calculating reaction pathways of large molecular systems such as enzymes. A major issue with most schemes (e.g., adiabatic mapping) is that the resulting activation barrier height heavily depends on the chosen educt conformation. This makes the selection of the initial structure decisive for the success of the characterization. Here, we apply machine learning to a large number of molecular dynamics frames and potential energy barriers obtained by quantum mechanics/molecular mechanics calculations in order to identify (1) suitable start-conformations for reaction path calculations and (2) structural features relevant for the first step of the desuccinylation reaction catalyzed by Sirtuin 5. The latter generally aids the understanding of reaction mechanisms and important interactions in active centers. Using our novel approach, we found eleven key features that govern the reactivity. We were able to estimate reaction barriers with a mean absolute error of 3.6 kcal/mol and identified reactive configurations.
Affiliation(s)
- Beatriz von der Esch
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Johannes C B Dietschreit
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Laurens D M Peters
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
- Christian Ochsenfeld
  - Chair of Theoretical Chemistry, Department of Chemistry, University of Munich (LMU), Butenandtstr. 7, D-81377 München, Germany
|