1
Duan R, Wang S, Li Z, Zhang W, Wu J, Jiang Y, Lin Q, Yuan P, Yue X, Yao Y, Xiao X, Xiao Y, Wang Z. Computer-assisted semi-rational design enhanced the enzymatic activity and protein stability of Proteinase K in calcium-free conditions. Biochem Biophys Res Commun 2024; 721:150109. [PMID: 38762932 DOI: 10.1016/j.bbrc.2024.150109] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Accepted: 05/12/2024] [Indexed: 05/21/2024]
Abstract
Wild-type Proteinase K binds to two Ca2+ ions, which play an important role in regulating enzymatic activity and maintaining protein stability. Therefore, a predetermined concentration of Ca2+ must be added during the use of Proteinase K, which increases its commercial cost. Herein, we addressed this challenge using a computational strategy to engineer a Proteinase K mutant that does not require Ca2+ and exhibits high enzymatic activity and protein stability. In the absence of Ca2+, the best mutant, MT24 (S17W-S176N-D260F), displayed an activity approximately 9.2-fold higher than that of wild-type Proteinase K. It also exhibited excellent protein stability, retaining 56.2% of its enzymatic activity after storage at 4 °C for 5 days. The residual enzymatic activity was 65-fold higher than that of the wild-type Proteinase K under the same storage conditions. Structural analysis and molecular dynamics simulations suggest that the introduction of a new hydrogen bond and π-π stacking at the Ca2+ binding sites due to the mutation may be the reasons for the increased enzymatic activity and stability of MT24.
Affiliation(s)
- Rongdi Duan
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Shen Wang
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Zhetao Li
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Wenjun Zhang
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Junteng Wu
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Yifei Jiang
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Qinting Lin
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Peixiong Yuan
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Xiaoyan Yue
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Yunxiao Yao
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Xiaoyue Xiao
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Yunjie Xiao
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
- Zefang Wang
  - School of Life Sciences, Tianjin University, Tianjin, 300072, China
2
Shanker VR, Bruun TUJ, Hie BL, Kim PS. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science 2024; 385:46-53. [PMID: 38963838 DOI: 10.1126/science.adk8946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 05/29/2024] [Indexed: 07/06/2024]
Abstract
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
Affiliation(s)
- Varun R Shanker
  - Stanford Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Theodora U J Bruun
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian L Hie
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Peter S Kim
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
3
Zhu X, Zhao YF, Wen HJ, Lu Y, You S, Herman RA, Wang J. Silkworm pupae protein co-degradation by magnetic nanoparticles immobilized proteinase K and Mucor circinelloides aspartic protease for further utilization of sericulture by-products. Environ Res 2024; 249:118385. [PMID: 38331140 DOI: 10.1016/j.envres.2024.118385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Revised: 01/18/2024] [Accepted: 01/30/2024] [Indexed: 02/10/2024]
Abstract
Silkworm pupae, a by-product of the sericulture industry, are discarded in large quantities. The degradation rate of silkworm pupae protein is critical to its further use, which in turn reduces the environmental impact of this waste. Herein, magnetic Janus mesoporous silica nanoparticle-immobilized proteinase K mutant T206M and Mucor circinelloides aspartic protease were employed for co-degradation. The thermostability of T206M was improved by enhanced structural rigidity (t1/2 increased by 30 min and T50 by 5 °C), which promoted degradation efficiency. At 65 °C and pH 7, the degradation rate reached a maximum of 61.7%, an improvement of 26% compared with degradation by a single free protease. In addition, the immobilized protease is easy to separate and reuse, retaining 50% of its activity after 10 cycles. Immobilized protease co-degradation was thus applied for the first time to the development and utilization of silkworm pupae, resulting in the release of promising antioxidant properties and reducing environmental impact by utilizing a natural and renewable resource.
Affiliation(s)
- Xuan Zhu
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China
- Yi-Fan Zhao
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China
- Hong-Jian Wen
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China
- Yu Lu
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China
- Shuai You
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China; Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Affairs, The Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang, 212100, China
- Richard Ansah Herman
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China; Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Affairs, The Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang, 212100, China
- Jun Wang
  - Jiangsu Key Laboratory of Sericultural Biology and Biotechnology, School of Biotechnology, Jiangsu University of Science and Technology, Zhenjiang, 212100, China; Key Laboratory of Silkworm and Mulberry Genetic Improvement, Ministry of Agriculture and Rural Affairs, The Sericultural Research Institute, Chinese Academy of Agricultural Sciences, Zhenjiang, 212100, China
4
Gelman S, Johnson B, Freschlin C, D'Costa S, Gitter A, Romero PA. Biophysics-based protein language models for protein engineering. bioRxiv 2024:2024.03.15.585128. [PMID: 38559182 PMCID: PMC10980077 DOI: 10.1101/2024.03.15.585128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure, and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose Mutational Effect Transfer Learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure, and energetics. We finetune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity, and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.
Affiliation(s)
- Sam Gelman
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
- Bryce Johnson
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
- Sameer D'Costa
  - Department of Biochemistry, University of Wisconsin-Madison
- Anthony Gitter
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison
5
Wang X, Li A, Li X, Cui H. Empowering Protein Engineering through Recombination of Beneficial Substitutions. Chemistry 2024; 30:e202303889. [PMID: 38288640 DOI: 10.1002/chem.202303889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Indexed: 02/24/2024]
Abstract
Directed evolution stands as a seminal technology for generating novel protein functionalities, a cornerstone in biocatalysis, metabolic engineering, and synthetic biology. Today, with the development of various mutagenesis methods and advanced analytical machines, the challenge of diversity generation and high-throughput screening platforms is largely solved, and one of the remaining challenges is: how to empower the potential of single beneficial substitutions with recombination to achieve the epistatic effect. This review overviews experimental and computer-assisted recombination methods in protein engineering campaigns. In addition, integrated and machine learning-guided strategies were highlighted to discuss how these recombination approaches contribute to generating the screening library with better diversity, coverage, and size. A decision tree was finally summarized to guide the further selection of proper recombination strategies in practice, which was beneficial for accelerating protein engineering.
Affiliation(s)
- Xinyue Wang
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Anni Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Xiujuan Li
  - School of Food Science and Pharmaceutical Engineering, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
- Haiyang Cui
  - School of Life Sciences, Nanjing Normal University, No. 2 Xuelin Road, Nanjing, 210097, China
6
Nam K, Shao Y, Major DT, Wolf-Watz M. Perspectives on Computational Enzyme Modeling: From Mechanisms to Design and Drug Development. ACS Omega 2024; 9:7393-7412. [PMID: 38405524 PMCID: PMC10883025 DOI: 10.1021/acsomega.3c09084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
Understanding enzyme mechanisms is essential for unraveling the complex molecular machinery of life. In this review, we survey the field of computational enzymology, highlighting key principles governing enzyme mechanisms and discussing ongoing challenges and promising advances. Over the years, computer simulations have become indispensable in the study of enzyme mechanisms, with the integration of experimental and computational exploration now established as a holistic approach to gain deep insights into enzymatic catalysis. Numerous studies have demonstrated the power of computer simulations in characterizing reaction pathways, transition states, substrate selectivity, product distribution, and dynamic conformational changes for various enzymes. Nevertheless, significant challenges remain in investigating the mechanisms of complex multistep reactions, large-scale conformational changes, and allosteric regulation. Beyond mechanistic studies, computational enzyme modeling has emerged as an essential tool for computer-aided enzyme design and the rational discovery of covalent drugs for targeted therapies. Overall, enzyme design/engineering and covalent drug development can greatly benefit from our understanding of the detailed mechanisms of enzymes, such as protein dynamics, entropy contributions, and allostery, as revealed by computational studies. Such a convergence of different research approaches is expected to continue, creating synergies in enzyme research. This review, by outlining the ever-expanding field of enzyme research, aims to provide guidance for future research directions and facilitate new developments in this important and evolving field.
Affiliation(s)
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
- Dan T. Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
7
Ismail A, Govindarajan S, Mannervik B. Human GST P1-1 Redesigned for Enhanced Catalytic Activity with the Anticancer Prodrug Telcyta and Improved Thermostability. Cancers (Basel) 2024; 16:762. [PMID: 38398153 PMCID: PMC10887215 DOI: 10.3390/cancers16040762] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 02/09/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024] Open
Abstract
Protein engineering can be used to tailor enzymes for medical purposes, including antibody-directed enzyme prodrug therapy (ADEPT), which can act as a tumor-targeted alternative to conventional chemotherapy for cancer. In ADEPT, the antibody serves as a vector, delivering a drug-activating enzyme selectively to the tumor site. Glutathione transferases (GSTs) are a family of naturally occurring detoxication enzymes, and the finding that some of them are overexpressed in tumors has been exploited to develop GST-activated prodrugs. The prodrug Telcyta is activated by GST P1-1, which is the GST most commonly elevated in cancer cells, implying that tumors overexpressing GST P1-1 should be particularly vulnerable to Telcyta. Promising antitumor activity has been noted in clinical trials, but the wildtype enzyme has modest activity with Telcyta, and further functional improvement would enhance its usefulness for ADEPT. We utilized protein engineering to construct human GST P1-1 gene variants in the search for enzymes with enhanced activity with Telcyta. The variant Y109H displayed a 2.9-fold higher enzyme activity compared to the wild-type GST P1-1. However, increased catalytic potency was accompanied by decreased thermal stability of the Y109H enzyme, losing 99% of its activity in 8 min at 50 °C. Thermal stability was restored by four additional mutations simultaneously introduced without loss of the enhanced activity with Telcyta. The mutation Q85R was identified as an important contributor to the regained thermostability. These results represent a first step towards a functional ADEPT application for Telcyta.
Affiliation(s)
- Aram Ismail
  - Arrhenius Laboratories, Department of Biochemistry and Biophysics, Stockholm University, SE-10691 Stockholm, Sweden
- Bengt Mannervik
  - Arrhenius Laboratories, Department of Biochemistry and Biophysics, Stockholm University, SE-10691 Stockholm, Sweden
  - Department of Chemistry, Scripps Research, La Jolla, CA 92037, USA
8
Shanker VR, Bruun TU, Hie BL, Kim PS. Inverse folding of protein complexes with a structure-informed language model enables unsupervised antibody evolution. bioRxiv 2023:2023.12.19.572475. [PMID: 38187780 PMCID: PMC10769282 DOI: 10.1101/2023.12.19.572475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2024]
Abstract
Large language models trained on sequence information alone are capable of learning high level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here we show that a general protein language model augmented with protein structure backbone coordinates and trained on the inverse folding problem can guide evolution for diverse proteins without needing to explicitly model individual functional tasks. We demonstrate inverse folding to be an effective unsupervised, structure-based sequence optimization strategy that also generalizes to multimeric complexes by implicitly learning features of binding and amino acid epistasis. Using this approach, we screened ~30 variants of two therapeutic clinical antibodies used to treat SARS-CoV-2 infection and achieved up to 26-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants-of-concern BQ.1.1 and XBB.1.5, respectively. In addition to substantial overall improvements in protein function, we find inverse folding performs with leading experimental success rates among other reported machine learning-guided directed evolution methods, without requiring any task-specific training data.
Affiliation(s)
- Varun R. Shanker
  - Stanford Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Theodora U.J. Bruun
  - Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian L. Hie
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Peter S. Kim
  - Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
  - Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
9
Wang J, Chen C, Yao G, Ding J, Wang L, Jiang H. Intelligent Protein Design and Molecular Characterization Techniques: A Comprehensive Review. Molecules 2023; 28:7865. [PMID: 38067593 PMCID: PMC10707872 DOI: 10.3390/molecules28237865] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Revised: 11/13/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
In recent years, the widespread application of artificial intelligence algorithms in protein structure, function prediction, and de novo protein design has significantly accelerated the process of intelligent protein design and led to many noteworthy achievements. This advancement in protein intelligent design holds great potential to accelerate the development of new drugs, enhance the efficiency of biocatalysts, and even create entirely new biomaterials. Protein characterization is the key to the performance of intelligent protein design. However, there is no consensus on the most suitable characterization method for intelligent protein design tasks. This review describes the methods, characteristics, and representative applications of traditional descriptors, sequence-based and structure-based protein characterization. It discusses their advantages, disadvantages, and scope of application. It is hoped that this could help researchers to better understand the limitations and application scenarios of these methods, and provide valuable references for choosing appropriate protein characterization techniques for related research in the field, so as to better carry out protein research.
Affiliation(s)
- Junjie Ding
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
- Liangliang Wang
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
- Hui Jiang
  - State Key Laboratory of NBC Protection for Civilian, Beijing 102205, China; (J.W.); (C.C.); (G.Y.)
10
Braun M, Gruber CC, Krassnigg A, Kummer A, Lutz S, Oberdorfer G, Siirola E, Snajdrova R. Accelerating Biocatalysis Discovery with Machine Learning: A Paradigm Shift in Enzyme Engineering, Discovery, and Design. ACS Catal 2023; 13:14454-14469. [PMID: 37942268 PMCID: PMC10629211 DOI: 10.1021/acscatal.3c03417] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 07/25/2023] [Revised: 09/29/2023] [Accepted: 10/03/2023] [Indexed: 11/10/2023]
Abstract
Emerging computational tools promise to revolutionize protein engineering for biocatalytic applications and accelerate the development timelines previously needed to optimize an enzyme to its more efficient variant. For over a decade, the benefits of predictive algorithms have helped scientists and engineers navigate the complexity of functional protein sequence space. More recently, spurred by dramatic advances in underlying computational tools, the promise of faster, cheaper, and more accurate enzyme identification, characterization, and engineering has catapulted terms such as artificial intelligence and machine learning to the must-have vocabulary in the field. This Perspective aims to showcase the current status of applications in pharmaceutical industry and also to discuss and celebrate the innovative approaches in protein science by highlighting their potential in selected recent developments and offering thoughts on future opportunities for biocatalysis. It also critically assesses the technology's limitations, unanswered questions, and unmet challenges.
Affiliation(s)
- Markus Braun
  - Department of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010 Graz, Austria
- Christian C Gruber
  - Enzyme and Drug Discovery, Innophore, 1700 Montgomery Street, San Francisco, California 94111, United States
- Andreas Krassnigg
  - Enzyme and Drug Discovery, Innophore, 1700 Montgomery Street, San Francisco, California 94111, United States
- Arkadij Kummer
  - Moderna, Inc., 200 Technology Square, Cambridge, Massachusetts 02139, United States
- Stefan Lutz
  - Codexis Inc., 200 Penobscot Drive, Redwood City, California 94063, United States
- Gustav Oberdorfer
  - Department of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010 Graz, Austria
- Elina Siirola
  - Novartis Institute for Biomedical Research, Global Discovery Chemistry, Basel CH-4108, Switzerland
- Radka Snajdrova
  - Novartis Institute for Biomedical Research, Global Discovery Chemistry, Basel CH-4108, Switzerland
11
Sun Y, Huang X, Osawa Y, Chen YE, Zhang H. The Versatile Biocatalyst of Cytochrome P450 CYP102A1: Structure, Function, and Engineering. Molecules 2023; 28:5353. [PMID: 37513226 PMCID: PMC10383305 DOI: 10.3390/molecules28145353] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2023] [Revised: 07/07/2023] [Accepted: 07/10/2023] [Indexed: 07/30/2023] Open
Abstract
Wild-type cytochrome P450 CYP102A1 from Bacillus megaterium is a highly efficient monooxygenase for the oxidation of long-chain fatty acids. The unique features of CYP102A1, such as high catalytic activity, expression yield, regio- and stereoselectivity, and self-sufficiency in electron transfer as a fusion protein, afford the requirements for an ideal biocatalyst. In the past three decades, remarkable progress has been made in engineering CYP102A1 for applications in drug discovery, biosynthesis, and biotechnology. The repertoire of engineered CYP102A1 variants has grown tremendously, whereas the substrate repertoire is avalanched to encompass alkanes, alkenes, aromatics, organic solvents, pharmaceuticals, drugs, and many more. In this article, we highlight the major advances in the past five years in our understanding of the structure and function of CYP102A1 and the methodologies used to engineer CYP102A1 for novel applications. The objective is to provide a succinct review of the latest developments with reference to the body of CYP102A1-related literature.
Affiliation(s)
- Yudong Sun
  - Department of Pharmacology, University of Michigan, Ann Arbor, MI 48109, USA
- Xiaoqiang Huang
  - Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Yoichi Osawa
  - Department of Pharmacology, University of Michigan, Ann Arbor, MI 48109, USA
- Yuqing Eugene Chen
  - Department of Internal Medicine, University of Michigan, Ann Arbor, MI 48109, USA
- Haoming Zhang
  - Department of Pharmacology, University of Michigan, Ann Arbor, MI 48109, USA
12
Ramírez-Palacios C, Marrink SJ. Super High-Throughput Screening of Enzyme Variants by Spectral Graph Convolutional Neural Networks. J Chem Theory Comput 2023. [PMID: 36961994 PMCID: PMC10373491 DOI: 10.1021/acs.jctc.2c01227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/26/2023]
Abstract
Finding new enzyme variants with the desired substrate scope requires screening through a large number of potential variants. In a typical in silico enzyme engineering workflow, it is possible to scan a few thousand variants, and gather several candidates for further screening or experimental verification. In this work, we show that a Graph Convolutional Neural Network (GCN) can be trained to predict the binding energy of combinatorial libraries of enzyme complexes using only sequence information. The GCN model uses a stack of message-passing and graph pooling layers to extract information from the protein input graph and yield a prediction. The GCN model is agnostic to the identity of the ligand, which is kept constant within the mutant libraries. Using a minuscule subset of the total combinatorial space (20^4-20^8 mutants) as training data, the proposed GCN model achieves a high accuracy in predicting the binding energy of unseen variants. The network's accuracy was further improved by injecting feature embeddings obtained from a language module pretrained on 10 million protein sequences. Since no structural information is needed to evaluate new variants, the deep learning algorithm is capable of scoring an enzyme variant in under 1 ms, allowing the search of billions of candidates on a single GPU.
Affiliation(s)
- Carlos Ramírez-Palacios
  - Molecular Dynamics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Siewert J Marrink
  - Molecular Dynamics, Groningen Biomolecular Sciences and Biotechnology Institute (GBB), University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
13
Xu C, Battig A, Schartel B, Siegel R, Senker J, von der Forst I, Unverzagt C, Agarwal S, Möglich A, Greiner A. Investigation of the Thermal Stability of Proteinase K for the Melt Processing of Poly(L-lactide). Biomacromolecules 2022; 23:4841-4850. [PMID: 36327974 PMCID: PMC9667878 DOI: 10.1021/acs.biomac.2c01008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 09/14/2022] [Indexed: 11/06/2022]
Abstract
The enzymatic degradation of aliphatic polyesters offers unique opportunities for various use cases in materials science. Although evidently desirable, the implementation of enzymes in technical applications of polyesters is generally challenging due to the thermal lability of enzymes. To prospectively overcome this intrinsic limitation, we here explored the thermal stability of proteinase K at conditions applicable for polymer melt processing, given that this hydrolytic enzyme is well established for its ability to degrade poly(l-lactide) (PLLA). Using assorted spectroscopic methods and enzymatic assays, we investigated the effects of high temperatures on the structure and specific activity of proteinase K. Whereas in solution, irreversible unfolding occurred at temperatures above 75-80 °C, in the dry, bulk state, proteinase K withstood prolonged incubation at elevated temperatures. Unexpectedly little activity loss occurred during incubation at up to 130 °C, and intermediate levels of catalytic activity were preserved at up to 150 °C. The resistance of bulk proteinase K to thermal treatment was slightly enhanced by absorption into polyacrylamide (PAM) particles. Under these conditions, after 5 min at a temperature of 200 °C, which is required for the melt processing of PLLA, proteinase K was not completely denatured but retained around 2% enzymatic activity. Our findings reveal that the thermal processing of proteinase K in the dry state is principally feasible, but equally, they also identify needs and prospects for improvement. The experimental pipeline we establish for proteinase K analysis stands to benefit efforts directed to this end. More broadly, our work sheds light on enzymatically degradable polymers and the thermal processing of enzymes, which are of increasing economical and societal relevance.
Affiliation(s)
- Chengzhang Xu
- Macromolecular Chemistry and Bavarian Polymer Institute, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95440, Germany
| | - Alexander Battig
- Bundesanstalt für Materialforschung und -prüfung (BAM), Unter den Eichen 87, Berlin 12205, Germany
| | - Bernhard Schartel
- Bundesanstalt für Materialforschung und -prüfung (BAM), Unter den Eichen 87, Berlin 12205, Germany
| | - Renée Siegel
- Inorganic Chemistry III and Northern Bavarian NMR Centre (NBNC), University of Bayreuth, Universitätsstrasse 30, Bayreuth 95440, Germany
| | - Jürgen Senker
- Inorganic Chemistry III and Northern Bavarian NMR Centre (NBNC), University of Bayreuth, Universitätsstrasse 30, Bayreuth 95440, Germany
| | - Inge von der Forst
- Bioorganic Chemistry, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95447, Germany
| | - Carlo Unverzagt
- Bioorganic Chemistry, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95447, Germany
| | - Seema Agarwal
- Macromolecular Chemistry and Bavarian Polymer Institute, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95440, Germany
| | - Andreas Möglich
- Department of Biochemistry, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95447, Germany
| | - Andreas Greiner
- Macromolecular Chemistry and Bavarian Polymer Institute, University of Bayreuth, Universitätsstrasse 30, Bayreuth 95440, Germany
| |
|
14
|
Villalobos-Alva J, Ochoa-Toledo L, Villalobos-Alva MJ, Aliseda A, Pérez-Escamirosa F, Altamirano-Bustamante NF, Ochoa-Fernández F, Zamora-Solís R, Villalobos-Alva S, Revilla-Monsalve C, Kemper-Valverde N, Altamirano-Bustamante MM. Protein Science Meets Artificial Intelligence: A Systematic Review and a Biochemical Meta-Analysis of an Inter-Field. Front Bioeng Biotechnol 2022; 10:788300. [PMID: 35875501 PMCID: PMC9301016 DOI: 10.3389/fbioe.2022.788300] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2021] [Accepted: 05/25/2022] [Indexed: 11/23/2022] Open
Abstract
Proteins are some of the most fascinating and challenging molecules in the universe, and they pose a big challenge for artificial intelligence. The implementation of machine learning/AI in protein science gives rise to a world of knowledge adventures in the workhorse of the cell and proteome homeostasis, which are essential for making life possible. This opens up epistemic horizons thanks to a coupling of human tacit–explicit knowledge with machine learning power, the benefits of which are already tangible, such as important advances in protein structure prediction. Moreover, the driving force behind the protein processes of self-organization, adjustment, and fitness requires a space corresponding to gigabytes of life data in its order of magnitude. There are many tasks such as novel protein design, protein folding pathways, and synthetic metabolic routes, as well as protein-aggregation mechanisms, pathogenesis of protein misfolding and disease, and proteostasis networks that are currently unexplored or unrevealed. In this systematic review and biochemical meta-analysis, we aim to contribute to bridging the gap between what we call binomial artificial intelligence (AI) and protein science (PS), a growing research enterprise with exciting and promising biotechnological and biomedical applications. We undertake our task by exploring “the state of the art” in AI and machine learning (ML) applications to protein science in the scientific literature to address some critical research questions in this domain, including What kind of tasks are already explored by ML approaches to protein sciences? What are the most common ML algorithms and databases used? What is the situational diagnostic of the AI–PS inter-field? What do ML processing steps have in common? We also formulate novel questions such as Is it possible to discover what the rules of protein evolution are with the binomial AI–PS? How do protein folding pathways evolve? What are the rules that dictate the folds? 
What are the minimal nuclear protein structures? How do protein aggregates form and why do they exhibit different toxicities? What are the structural properties of amyloid proteins? How can we design an effective proteostasis network to deal with misfolded proteins? We are a cross-functional group of scientists from several academic disciplines, and we have conducted the systematic review using a variant of the PICO and PRISMA approaches. The search was carried out in four databases (PubMed, Bireme, OVID, and EBSCO Web of Science), resulting in 144 research articles. After three rounds of quality screening, 93 articles were finally selected for further analysis. A summary of our findings is as follows: regarding AI applications, there are mainly four types: 1) genomics, 2) protein structure and function, 3) protein design and evolution, and 4) drug design. In terms of the ML algorithms and databases used, supervised learning was the most common approach (85%). As for the databases used for the ML models, PDB and UniprotKB/Swissprot were the most common ones (21 and 8%, respectively). Moreover, we identified that approximately 63% of the articles organized their results into three steps, which we labeled pre-process, process, and post-process. A few studies combined data from several databases or created their own databases after the pre-process. Our main finding is that, as of today, there are no research road maps serving as guides to address gaps in our knowledge of the AI–PS binomial. All research efforts to collect, integrate multidimensional data features, and then analyze and validate them are, so far, uncoordinated and scattered throughout the scientific literature without a clear epistemic goal or connection between the studies. 
Therefore, our main contribution to the scientific literature is to offer a road map to help solve problems in drug design, protein structures, design, and function prediction while also presenting the “state of the art” on research in the AI–PS binomial until February 2021. Thus, we pave the way toward future advances in the synthetic redesign of novel proteins and protein networks and artificial metabolic pathways, learning lessons from nature for the welfare of humankind. Many of the novel proteins and metabolic pathways are currently non-existent in nature, nor are they used in the chemical industry or biomedical field.
Affiliation(s)
- Jalil Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Luis Ochoa-Toledo
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
| | - Mario Javier Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Atocha Aliseda
- Instituto de Investigaciones Filosóficas, Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
| | - Fernando Pérez-Escamirosa
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
| | | | - Francine Ochoa-Fernández
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Ricardo Zamora-Solís
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Sebastián Villalobos-Alva
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Cristina Revilla-Monsalve
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
| | - Nicolás Kemper-Valverde
- Instituto de Ciencias Aplicadas y Tecnología (ICAT), Universidad Nacional Autónoma de México (UNAM), Mexico City, Mexico
| | - Myriam M. Altamirano-Bustamante
- Unidad de Investigación en Enfermedades Metabólicas, Centro Médico Nacional Siglo XXI, Instituto Mexicano del Seguro Social, Mexico City, Mexico
- *Correspondence: Myriam M. Altamirano-Bustamante
| |
|
15
|
Talluri S. Algorithms for protein design. Advances in Protein Chemistry and Structural Biology 2022; 130:1-38. [PMID: 35534105 DOI: 10.1016/bs.apcsb.2022.01.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
Computational Protein Design has the potential to contribute to major advances in enzyme technology, vaccine design, receptor-ligand engineering, biomaterials, nanosensors, and synthetic biology. Although Protein Design is a challenging problem, proteins can be designed by experts in Protein Design, as well as by non-experts whose primary interests are in the applications of Protein Design. The increased accessibility of Protein Design technology is attributable to the accumulated knowledge and experience with Protein Design as well as to the availability of software and online resources. The objective of this review is to serve as a guide to the relevant literature with a focus on the novel methods and algorithms that have been developed or applied for Protein Design, and to assist in the selection of algorithms for Protein Design. Novel algorithms and models that have been introduced to utilize the enormous amount of experimental data and novel computational hardware have the potential for producing substantial increases in the accuracy, reliability and range of applications of designed proteins.
Affiliation(s)
- Sekhar Talluri
- Department of Biotechnology, GITAM, Visakhapatnam, India.
| |
|
16
|
Tatta ER, Imchen M, Moopantakath J, Kumavath R. Bioprospecting of microbial enzymes: current trends in industry and healthcare. Appl Microbiol Biotechnol 2022; 106:1813-1835. [PMID: 35254498 DOI: 10.1007/s00253-022-11859-5] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 02/15/2022] [Revised: 02/15/2022] [Accepted: 02/26/2022] [Indexed: 12/13/2022]
Abstract
Microbial enzymes have an indispensable role in producing foods, pharmaceuticals, and other commercial goods. Many novel enzymes have been reported from all domains of life, such as plants, microbes, and animals. Nonetheless, industrially desirable enzymes of microbial origin are limited. This review article discusses the classifications, applications, sources, and challenges of most demanded industrial enzymes such as pectinases, cellulase, lipase, and protease. In addition, the production of novel enzymes through protein engineering technologies such as directed evolution, rational, and de novo design, for the improvement of existing industrial enzymes is also explored. We have also explored the role of metagenomics, nanotechnology, OMICs, and machine learning approaches in the bioprospecting of novel enzymes. Overall, this review covers the basics of biocatalysts in industrial and healthcare applications and provides an overview of existing microbial enzyme optimization tools. KEY POINTS: • Microbial bioactive molecules are vital for therapeutic and industrial applications. • High-throughput OMIC is the most proficient approach for novel enzyme discovery. • Comprehensive databases and efficient machine learning models are the need of the hour to fast forward de novo enzyme design and discovery.
Affiliation(s)
- Eswar Rao Tatta
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
| | - Madangchanok Imchen
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
| | - Jamseel Moopantakath
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India
| | - Ranjith Kumavath
- Department of Genomic Science, School of Biological Sciences, Central University of Kerala, Tejaswini Hills, Periya (PO.), Kasaragod, Kerala, 671320, India.
| |
|
17
|
Kenny SE, Antaw F, Locke WJ, Howard CB, Korbie D, Trau M. Next-Generation Molecular Discovery: From Bottom-Up In Vivo and In Vitro Approaches to In Silico Top-Down Approaches for Therapeutics Neogenesis. Life (Basel) 2022; 12:life12030363. [PMID: 35330114 PMCID: PMC8950575 DOI: 10.3390/life12030363] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Accepted: 02/23/2022] [Indexed: 12/02/2022] Open
Abstract
Protein and drug engineering comprises a major part of the medical and research industries, and yet approaches to discovering and understanding therapeutic molecular interactions in biological systems rely on trial and error. The general approach to molecular discovery involves screening large libraries of compounds, proteins, or antibodies, or in vivo antibody generation, which could be considered “bottom-up” approaches to therapeutic discovery. In these bottom-up approaches, a minimal amount is known about the therapeutics at the start of the process, but through meticulous and exhaustive laboratory work, the molecule is characterised in detail. In contrast, the advent of “big data” and access to extensive online databases and machine learning technologies offers promising new avenues to understanding molecular interactions. Artificial intelligence (AI) now has the potential to predict protein structure at an unprecedented accuracy using only the genetic sequence. This predictive approach to characterising molecular structure—when accompanied by high-quality experimental data for model training—has the capacity to invert the process of molecular discovery and characterisation. The process has potential to be transformed into a top-down approach, where new molecules can be designed directly based on the structure of a target and the desired function, rather than performing screening of large libraries of molecular variants. This paper will provide a brief evaluation of bottom-up approaches to discovering and characterising biological molecules and will discuss recent advances towards developing top-down approaches and the prospects of this.
Affiliation(s)
- Sophie E. Kenny
- Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
| | - Fiach Antaw
- Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
| | - Warwick J. Locke
- Molecular Diagnostic Solutions, Health and Biosecurity, Commonwealth Scientific and Industrial Research Organisation, Building 101, Clunies Ross Street, Canberra, ACT 2601, Australia
| | - Christopher B. Howard
- Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
| | - Darren Korbie
- Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- Correspondence: (D.K.); (M.T.)
| | - Matt Trau
- Centre for Personalised Nanomedicine, Australian Institute for Bioengineering and Nanotechnology (AIBN), The University of Queensland, Corner of College and Cooper Roads (Bldg 75), Brisbane, QLD 4072, Australia
- School of Chemistry and Molecular Biosciences, The University of Queensland, Brisbane, QLD 4072, Australia
- Correspondence: (D.K.); (M.T.)
| |
|
18
|
Computational enzyme redesign: large jumps in function. Trends in Chemistry 2022. [DOI: 10.1016/j.trechm.2022.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
19
|
Vanella R, Kovacevic G, Doffini V, Fernández de Santaella J, Nash MA. High-throughput screening, next generation sequencing and machine learning: advanced methods in enzyme engineering. Chem Commun (Camb) 2022; 58:2455-2467. [PMID: 35107442 PMCID: PMC8851469 DOI: 10.1039/d1cc04635g] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/29/2022]
Abstract
Enzyme engineering is an important biotechnological process capable of generating tailored biocatalysts for applications in industrial chemical conversion and biopharma. Typical enhancements sought in enzyme engineering and in vitro evolution campaigns include improved folding stability, catalytic activity, and/or substrate specificity. Despite significant progress in recent years in the areas of high-throughput screening and DNA sequencing, our ability to explore the vast space of functional enzyme sequences remains severely limited. Here, we review the currently available suite of modern methods for enzyme engineering, with a focus on novel readout systems based on enzyme cascades, and new approaches to reaction compartmentalization including single-cell hydrogel encapsulation techniques to achieve a genotype–phenotype link. We further summarize systematic scanning mutagenesis approaches and their merger with deep mutational scanning and massively parallel next-generation DNA sequencing technologies to generate mutability landscapes. Finally, we discuss the implementation of machine learning models for computational prediction of enzyme phenotypic fitness from sequence. This broad overview of current state-of-the-art approaches for enzyme engineering and evolution will aid newcomers and experienced researchers alike in identifying the important challenges that should be addressed to move the field forward.
Affiliation(s)
- Rosario Vanella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| | - Gordana Kovacevic
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| | - Vanni Doffini
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| | - Jaime Fernández de Santaella
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| | - Michael A Nash
- Department of Chemistry, University of Basel, 4058 Basel, Switzerland; Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| |
|
20
|
Yu Y, Wang R, Teo RD. Machine Learning Approaches for Metalloproteins. Molecules (Basel, Switzerland) 2022; 27:molecules27041277. [PMID: 35209064 PMCID: PMC8878495 DOI: 10.3390/molecules27041277] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/02/2022] [Revised: 02/10/2022] [Accepted: 02/11/2022] [Indexed: 01/10/2023]
Abstract
Metalloproteins are a family of proteins characterized by metal ion binding, whereby the presence of these ions confers key catalytic and ligand-binding properties. Due to their ubiquity among biological systems, researchers have made immense efforts to predict the structural and functional roles of metalloproteins. Ultimately, having a comprehensive understanding of metalloproteins will lead to tangible applications, such as designing potent inhibitors in drug discovery. Recently, there has been an acceleration in the number of studies applying machine learning to predict metalloprotein properties, primarily driven by the advent of more sophisticated machine learning algorithms. This review covers how machine learning tools have consolidated and expanded our comprehension of various aspects of metalloproteins (structure, function, stability, ligand-binding interactions, and inhibitors). Future avenues of exploration are also discussed.
Affiliation(s)
- Yue Yu
- Division of Natural and Applied Sciences, Duke Kunshan University, Kunshan, Jiangsu 215316, China
- Department of Physics, Duke University, Durham, NC 27708, USA
| | - Ruobing Wang
- Department of Chemistry, Duke University, Durham, NC 27708, USA
| | - Ruijie D. Teo
- Department of Chemistry, Duke University, Durham, NC 27708, USA
- UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
| |
|
21
|
Cadet XF, Gelly JC, van Noord A, Cadet F, Acevedo-Rocha CG. Learning Strategies in Protein Directed Evolution. Methods Mol Biol 2022; 2461:225-275. [PMID: 35727454 DOI: 10.1007/978-1-0716-2152-3_15] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
Synthetic biology is a fast-evolving research field that combines biology and engineering principles to develop new biological systems for medical, pharmacological, and industrial applications. Synthetic biologists use iterative "design, build, test, and learn" cycles to efficiently engineer genetic systems that are reliable, reproducible, and predictable. Protein engineering by directed evolution can benefit from such a systematic engineering approach for various reasons. Learning can be carried out before starting, throughout or after finalizing a directed evolution project. Computational tools, bioinformatics, and scanning mutagenesis methods can be excellent starting points, while molecular dynamics simulations and other strategies can guide engineering efforts. Similarly, studying protein intermediates along evolutionary pathways offers fascinating insights into the molecular mechanisms shaped by evolution. The learning step of the cycle is not only crucial for proteins or enzymes that are not suitable for high-throughput screening or selection systems, but it is also valuable for any platform that can generate a large amount of data that can be aided by machine learning algorithms. The main challenge in protein engineering is to predict the effect of a single mutation on one functional parameter-to say nothing of several mutations on multiple parameters. This is largely due to nonadditive mutational interactions, known as epistatic effects-beneficial mutations present in a genetic background may not be beneficial in another genetic background. In this work, we provide an overview of experimental and computational strategies that can guide the user to learn protein function at different stages in a directed evolution project. We also discuss how epistatic effects can influence the success of directed evolution projects. 
Since machine learning is gaining momentum in protein engineering and the field is becoming more interdisciplinary thanks to collaboration between mathematicians, computational scientists, engineers, molecular biologists, and chemists, we provide a general workflow that familiarizes nonexperts with the basic concepts, dataset requirements, learning approaches, model capabilities and performance metrics of this intriguing area. Finally, we also provide some practical recommendations on how machine learning can harness epistatic effects for engineering proteins in an "outside-the-box" way.
Affiliation(s)
- Xavier F Cadet
- PEACCEL, Artificial Intelligence Department, Paris, France
| | - Jean Christophe Gelly
- Laboratoire d'Excellence GR-Ex, Paris, France
- BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
| | | | - Frédéric Cadet
- Laboratoire d'Excellence GR-Ex, Paris, France
- BIGR, DSIMB, UMR_S1134, INSERM, University of Paris & University of Reunion, Paris, France
| | | |
|
22
|
Saito Y, Oikawa M, Sato T, Nakazawa H, Ito T, Kameda T, Tsuda K, Umetsu M. Machine-Learning-Guided Library Design Cycle for Directed Evolution of Enzymes: The Effects of Training Data Composition on Sequence Space Exploration. ACS Catal 2021. [DOI: 10.1021/acscatal.1c03753] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 01/02/2023]
Affiliation(s)
- Yutaka Saito
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
- Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo 103-0027, Japan
| | - Misaki Oikawa
- Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan
| | - Takumi Sato
- Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan
| | - Hikaru Nakazawa
- Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan
| | - Tomoyuki Ito
- Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan
| | - Tomoshi Kameda
- Artificial Intelligence Research Center, National Institute of Advanced Industrial Science and Technology (AIST), 2-4-7 Aomi, Koto-ku, Tokyo 135-0064, Japan
- Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo 103-0027, Japan
| | - Koji Tsuda
- Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8561, Japan
- Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo 103-0027, Japan
- Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, 1-2-1 Sengen, Tsukuba, Ibaraki 305-0047, Japan
| | - Mitsuo Umetsu
- Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Aoba-ku, Sendai 980-8579, Japan
- Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo 103-0027, Japan
| |
|
23
|
Bertelsen AB, Hackney CM, Bayer CN, Kjelgaard LD, Rennig M, Christensen B, Sørensen ES, Safavi‐Hemami H, Wulff T, Ellgaard L, Nørholm MHH. DisCoTune: versatile auxiliary plasmids for the production of disulphide-containing proteins and peptides in the E. coli T7 system. Microb Biotechnol 2021; 14:2566-2580. [PMID: 34405535 PMCID: PMC8601162 DOI: 10.1111/1751-7915.13895] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/15/2021] [Revised: 06/15/2021] [Accepted: 07/04/2021] [Indexed: 11/28/2022] Open
Abstract
Secreted proteins and peptides hold large potential both as therapeutics and as enzyme catalysts in biotechnology. The high stability of many secreted proteins helps maintain functional integrity in changing chemical environments and is a contributing factor to their commercial potential. Disulphide bonds constitute an important post-translational modification that stabilizes many of these proteins and thus preserves the active state under chemically stressful conditions. Despite their importance, the discovery and applications within this group of proteins and peptides are limited by the availability of synthetic biology tools and heterologous production systems that allow for efficient formation of disulphide bonds. Here, we refine the design of two DisCoTune (Disulphide bond formation in E. coli with tunable expression) plasmids that enable the formation of disulphides in the highly popular Escherichia coli T7 protein production system. We show that this new system promotes significantly higher yield and activity of an industrial protease and a conotoxin, which belongs to a group of disulphide-rich venom peptides from cone snails with strong potential as research tools and pharmacological agents.
Affiliation(s)
- Andreas B. Bertelsen
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Celeste Menuet Hackney
- Department of Biology, Linderstrøm‐Lang Centre for Protein Science, University of Copenhagen, Copenhagen N 2200, Denmark
| | - Carolyn N. Bayer
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Lau D. Kjelgaard
- Department of Biology, Linderstrøm‐Lang Centre for Protein Science, University of Copenhagen, Copenhagen N 2200, Denmark
| | - Maja Rennig
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Brian Christensen
- Department of Molecular Biology and Genetics, Aarhus University, Aarhus C 8000, Denmark
| | | | - Helena Safavi‐Hemami
- Department of Biology, Linderstrøm‐Lang Centre for Protein Science, University of Copenhagen, Copenhagen N 2200, Denmark
- Department of Biomedical Sciences, University of Copenhagen, Copenhagen N 2200, Denmark
- Department of Biochemistry and School of Biological Sciences, University of Utah, Salt Lake City, UT 84112, USA
| | - Tune Wulff
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| | - Lars Ellgaard
- Department of Biology, Linderstrøm‐Lang Centre for Protein Science, University of Copenhagen, Copenhagen N 2200, Denmark
| | - Morten H. H. Nørholm
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby 2800, Denmark
| |
|
24
|
Machine learning-guided acyl-ACP reductase engineering for improved in vivo fatty alcohol production. Nat Commun 2021; 12:5825. [PMID: 34611172 PMCID: PMC8492656 DOI: 10.1038/s41467-021-25831-w] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 05/21/2021] [Accepted: 09/01/2021] [Indexed: 02/04/2023] Open
Abstract
Alcohol-forming fatty acyl reductases (FARs) catalyze the reduction of thioesters to alcohols and are key enzymes for microbial production of fatty alcohols. Many metabolic engineering strategies utilize FARs to produce fatty alcohols from intracellular acyl-CoA and acyl-ACP pools; however, enzyme activity, especially on acyl-ACPs, remains a significant bottleneck to high-flux production. Here, we engineer FARs with enhanced activity on acyl-ACP substrates by implementing a machine learning (ML)-driven approach to iteratively search the protein fitness landscape. Over the course of ten design-test-learn rounds, we engineer enzymes that produce over twofold more fatty alcohols than the starting natural sequences. We characterize the top sequence and show that it has an enhanced catalytic rate on palmitoyl-ACP. Finally, we analyze the sequence-function data to identify features, like the net charge near the substrate-binding site, that correlate with in vivo activity. This work demonstrates the power of ML to navigate the fitness landscape of traditionally difficult-to-engineer proteins.
|
25
|
Galanie S, Entwistle D, Lalonde J. Engineering biosynthetic enzymes for industrial natural product synthesis. Nat Prod Rep 2021; 37:1122-1143. [PMID: 32364202 DOI: 10.1039/c9np00071b] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Indexed: 12/12/2022]
Abstract
Covering: 2000 to 2020. Natural products and their derivatives are commercially important medicines, agrochemicals, flavors, fragrances, and food ingredients. Industrial strategies to produce these structurally complex molecules encompass varied combinations of chemical synthesis, biocatalysis, and extraction from natural sources. Interest in engineering natural product biosynthesis began with the advent of genetic tools for pathway discovery. Genes and strains can now readily be synthesized, mutated, recombined, and sequenced. Enzyme engineering has succeeded commercially due to the development of genetic methods, analytical technologies, and machine learning algorithms. Today, engineered biosynthetic enzymes from organisms spanning the tree of life are used industrially to produce diverse molecules. These biocatalytic processes include single enzymatic steps, multienzyme cascades, and engineered native and heterologous microbial strains. This review will describe how biosynthetic enzymes have been engineered to enable commercial and near-commercial syntheses of natural products and their analogs.
Affiliation(s)
- Stephanie Galanie
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA.
- David Entwistle
- Process Chemistry, Codexis, Inc., Redwood City, California, USA
- James Lalonde
- Microbial Digital Genome Engineering, Inscripta, Inc., Pleasanton, California, USA
|
26
|
Dutta K, Shityakov S, Khalifa I. New Trends in Bioremediation Technologies Toward Environment-Friendly Society: A Mini-Review. Front Bioeng Biotechnol 2021; 9:666858. [PMID: 34409018 PMCID: PMC8365754 DOI: 10.3389/fbioe.2021.666858] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/11/2021] [Accepted: 05/26/2021] [Indexed: 01/29/2023] Open
Abstract
Today's environmental balance has been compromised by the unreasonable and sometimes dangerous actions committed by humans to maintain their dominance over the Earth's natural resources. As a result, oceans are contaminated by different types of plastic trash and by crude oil spilled from mismanaged transport ships, and the air is polluted by the increasing release of greenhouse gases, such as CO2 and CH4, into the atmosphere. The lands, agricultural fields, and groundwater are also contaminated by notorious chemicals, namely polycyclic aromatic hydrocarbons, pyrethroid pesticides, bisphenol-A, and dioxanes. Therefore, bioremediation might function as a convenient alternative to restore a clean environment. However, at present, the majority of bioremediation reports are limited to the natural capabilities of microbial enzymes. Synthetic biology with uncompromised supervision of ethical standards could help to outsmart nature's engineering, such as the CETCH cycle for improved CO2 fixation. Additionally, a blend of synthetic biology with machine learning algorithms could expand the possibilities of bioengineering. This review summarizes current state-of-the-art knowledge of data-assisted enzyme redesign to actively promote new research on important enzymes to ameliorate the environment.
Affiliation(s)
- Kunal Dutta
- Department of Human Physiology, Vidyasagar University, Medinipur, India
- Sergey Shityakov
- Department of Chemoinformatics, Infochemistry Scientific Center, Saint Petersburg National Research University of Information Technologies, Mechanics and Optics (ITMO University), Saint-Petersburg, Russia
- Ibrahim Khalifa
- Food Technology Department, Faculty of Agriculture, Benha University, Moshtohor, Egypt
|
27
|
Yi D, Bayer T, Badenhorst CPS, Wu S, Doerr M, Höhne M, Bornscheuer UT. Recent trends in biocatalysis. Chem Soc Rev 2021; 50:8003-8049. [PMID: 34142684 PMCID: PMC8288269 DOI: 10.1039/d0cs01575j] [Citation(s) in RCA: 115] [Impact Index Per Article: 38.3] [Received: 12/19/2020] [Indexed: 12/13/2022]
Abstract
Biocatalysis has undergone revolutionary progress in the past century. Benefited by the integration of multidisciplinary technologies, natural enzymatic reactions are constantly being explored. Protein engineering gives birth to robust biocatalysts that are widely used in industrial production. These research achievements have gradually constructed a network containing natural enzymatic synthesis pathways and artificially designed enzymatic cascades. Nowadays, the development of artificial intelligence, automation, and ultra-high-throughput technology provides infinite possibilities for the discovery of novel enzymes, enzymatic mechanisms and enzymatic cascades, and gradually complements the lack of remaining key steps in the pathway design of enzymatic total synthesis. Therefore, the research of biocatalysis is gradually moving towards the era of novel technology integration, intelligent manufacturing and enzymatic total synthesis.
Affiliation(s)
- Dong Yi
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Thomas Bayer
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Christoffel P. S. Badenhorst
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Shuke Wu
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Mark Doerr
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Matthias Höhne
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
- Uwe T. Bornscheuer
- Department of Biotechnology & Enzyme Catalysis, Institute of Biochemistry, University Greifswald, Felix-Hausdorff-Str. 4, D-17487 Greifswald, Germany
|
28
|
Siedhoff NE, Illig AM, Schwaneberg U, Davari MD. PyPEF-An Integrated Framework for Data-Driven Protein Engineering. J Chem Inf Model 2021; 61:3463-3476. [PMID: 34260225 DOI: 10.1021/acs.jcim.1c00099] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/12/2022]
Abstract
Data-driven strategies are gaining increased attention in protein engineering due to recent advances in access to large experimental databanks of proteins, next-generation sequencing (NGS), high-throughput screening (HTS) methods, and the development of artificial intelligence algorithms. However, the reliable prediction of beneficial amino acid substitutions, their combination, and the effect on functional properties remain the most significant challenges in protein engineering, which is applied to develop proteins and enzymes for biocatalysis, biomedicine, and life sciences. Here, we present a general-purpose framework (PyPEF: pythonic protein engineering framework) for performing data-driven protein engineering using machine learning methods combined with techniques from signal processing and statistical physics. PyPEF guides the identification and selection of beneficial proteins of a defined sequence space by systematically or randomly exploring the fitness of variants and by sampling random evolution pathways. The performance of PyPEF was evaluated concerning its predictive accuracy and throughput on four public protein and enzyme data sets using common regression models. It was proved that the program could efficiently predict the fitness of protein sequences for different target properties (predictive models with coefficient of determination values ranging from 0.58 to 0.92). By combining machine learning and protein evolution, PyPEF enabled the screening of proteins with various functions, reaching a screening capacity of more than 500,000 protein sequence variants in the timeframe of only a few minutes on a personal computer. 
PyPEF displayed significant accuracies on four public data sets (different proteins and properties) and underlined the potential of integrating data-driven technologies for covering different philosophies by either predicting the fitness of the variants to the highest accuracy accounting for epistatic effects or capturing the general trend of introduced mutations on the fitness in directed protein evolution campaigns. In essence, PyPEF can provide a powerful solution to current sequence exploration and combinatorial problems faced in protein engineering through exhaustive in silico screening of the sequence space.
Affiliation(s)
- Niklas E Siedhoff
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
- Ulrich Schwaneberg
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany; DWI-Leibniz Institute for Interactive Materials, Forckenbeckstraße 50, 52074 Aachen, Germany
- Mehdi D Davari
- Institute of Biotechnology, RWTH Aachen University, Worringer Weg 3, 52074 Aachen, Germany
|
29
|
Wu Z, Johnston KE, Arnold FH, Yang KK. Protein sequence design with deep generative models. Curr Opin Chem Biol 2021; 65:18-27. [PMID: 34051682 DOI: 10.1016/j.cbpa.2021.04.004] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 02/24/2021] [Revised: 04/02/2021] [Accepted: 04/07/2021] [Indexed: 12/20/2022]
Abstract
Protein engineering seeks to identify protein sequences with optimized properties. When guided by machine learning, protein sequence generation methods can draw on prior knowledge and experimental efforts to improve this process. In this review, we highlight recent applications of machine learning to generate protein sequences, focusing on the emerging field of deep generative methods.
Affiliation(s)
- Zachary Wu
- Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, 91125, CA, USA
- Kadina E Johnston
- Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, 91125, CA, USA
- Frances H Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, 91125, CA, USA; Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Blvd, Pasadena, 91125, CA, USA
- Kevin K Yang
- Microsoft Research New England, 1 Memorial Drive, Cambridge, 02142, MA, USA.
|
30
|
Sunny JS, Nisha K, Natarajan A, Saleena LM. IND-enzymes: a repository for hydrolytic enzymes derived from thermophilic and psychrophilic bacterial species with potential industrial usage. Extremophiles 2021; 25:319-325. [PMID: 33961119 DOI: 10.1007/s00792-021-01231-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/11/2021] [Accepted: 04/22/2021] [Indexed: 10/21/2022]
Abstract
Biocatalysts provide many advantages over the traditional chemically assisted processes prevalent in industries. Consequently, the search for novel enzymes has increased over the years with a renewed interest in thermophilic and psychrophilic bacterial species. Enzymes or extremozymes extracted from such species have exhibited an affinity to extreme temperatures which is a prerequisite for many industrial applications. However, utilisation of these enzymes faces a major bottleneck. The distribution of sequence data associated with thermophiles and psychrophiles is overwhelming, spanning various databases and scientific literature. Based on more than 100 publications and genomes from over 300 thermophilic and psychrophilic bacterial species, we have constructed the database IND-Enzymes (indenzymes.srmist.edu.in). This database consists of over 20,120 nucleotide and protein sequences belonging to the hydrolytic enzyme class lipase, protease, esterase and amylase. Users can access over 100 published enzymes, 200 PDB structural data. Enzymes derived from genomes can be directly downloaded and users can also access the entire annotation data derived from species individually. Along with an alignment tool and python based pipelines, IND-Enzymes serves as the largest sequence repository for hydrolytic enzymes from thermophilic and psychrophilic bacterial species. This database showcases resources that are essential for protein engineering of hot-cold stable enzymes.
Affiliation(s)
- Jithin S Sunny
- Department of Biotechnology, School of Bioengineering, SRM Institute of Science and Technology, Room no. 508, SRM Nagar, Kattankulathur, 603203, Kanchipuram, TN, India
- Khairun Nisha
- Department of Biotechnology, School of Bioengineering, SRM Institute of Science and Technology, Room no. 508, SRM Nagar, Kattankulathur, 603203, Kanchipuram, TN, India
- Anuradha Natarajan
- Department of Biotechnology, School of Bioengineering, SRM Institute of Science and Technology, Room no. 508, SRM Nagar, Kattankulathur, 603203, Kanchipuram, TN, India
- Lilly M Saleena
- Department of Biotechnology, School of Bioengineering, SRM Institute of Science and Technology, Room no. 508, SRM Nagar, Kattankulathur, 603203, Kanchipuram, TN, India.
|
31
|
Ferguson AL, Ranganathan R. 100th Anniversary of Macromolecular Science Viewpoint: Data-Driven Protein Design. ACS Macro Lett 2021; 10:327-340. [PMID: 35549066 DOI: 10.1021/acsmacrolett.0c00885] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/13/2022]
Abstract
The design of synthetic proteins with the desired function is a long-standing goal in biomolecular science, with broad applications in biochemical engineering, agriculture, medicine, and public health. Rational de novo design and experimental directed evolution have achieved remarkable successes but are challenged by the requirement to find functional "needles" in the vast "haystack" of protein sequence space. Data-driven models for fitness landscapes provide a predictive map between protein sequence and function and can prospectively identify functional candidates for experimental testing to greatly improve the efficiency of this search. This Viewpoint reviews the applications of machine learning and, in particular, deep learning as part of data-driven protein engineering platforms. We highlight recent successes, review promising computational methodologies, and provide an outlook on future challenges and opportunities. The article is written for a broad audience comprising both polymer and protein scientists and computer and data scientists interested in an up-to-date review of recent innovations and opportunities in this rapidly evolving field.
Affiliation(s)
- Andrew L. Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Rama Ranganathan
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Center for Physics of Evolving Systems, University of Chicago, Chicago, Illinois 60637, United States
- Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
|
32
|
Li G, Qin Y, Fontaine NT, Ng Fuk Chong M, Maria‐Solano MA, Feixas F, Cadet XF, Pandjaitan R, Garcia‐Borràs M, Cadet F, Reetz MT. Machine Learning Enables Selection of Epistatic Enzyme Mutants for Stability Against Unfolding and Detrimental Aggregation. Chembiochem 2021; 22:904-914. [PMID: 33094545 PMCID: PMC7984044 DOI: 10.1002/cbic.202000612] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/31/2020] [Revised: 10/22/2020] [Indexed: 12/15/2022]
Abstract
Machine learning (ML) has pervaded most areas of protein engineering, including stability and stereoselectivity. Using limonene epoxide hydrolase as the model enzyme and innov'SAR as the ML platform, comprising a digital signal process, we achieved high protein robustness that can resist unfolding with concomitant detrimental aggregation. Fourier transform (FT) allows us to take into account the order of the protein sequence and the nonlinear interactions between positions, and thus to grasp epistatic phenomena. The innov'SAR approach is interpolative and extrapolative, and makes outside-the-box predictions not found in other state-of-the-art ML or deep learning approaches. Equally significant is the finding that our approach to ML in the present context, flanked by advanced molecular dynamics simulations, uncovers the connection between epistatic mutational interactions and protein robustness.
Affiliation(s)
- Guangyue Li
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Key Laboratory of Control of Biological Hazard Factors (Plant Origin) for Agri-product Quality and Safety, Ministry of Agriculture, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100081, P. R. China
- Youcai Qin
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Key Laboratory of Control of Biological Hazard Factors (Plant Origin) for Agri-product Quality and Safety, Ministry of Agriculture, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100081, P. R. China
- Nicolas T. Fontaine
- PEACCEL, Artificial Intelligence Department, 6 Square Albin Cachot, Box 42, 75013 Paris, France
- Matthieu Ng Fuk Chong
- PEACCEL, Artificial Intelligence Department, 6 Square Albin Cachot, Box 42, 75013 Paris, France
- Miguel A. Maria‐Solano
- Institut de Química Computacional i Catàlisi and Departament de Química, Universitat de Girona, Campus Montilivi, 17003 Girona, Catalonia, Spain
- Ferran Feixas
- Institut de Química Computacional i Catàlisi and Departament de Química, Universitat de Girona, Campus Montilivi, 17003 Girona, Catalonia, Spain
- Xavier F. Cadet
- PEACCEL, Artificial Intelligence Department, 6 Square Albin Cachot, Box 42, 75013 Paris, France
- Rudy Pandjaitan
- PEACCEL, Artificial Intelligence Department, 6 Square Albin Cachot, Box 42, 75013 Paris, France
- Marc Garcia‐Borràs
- Institut de Química Computacional i Catàlisi and Departament de Química, Universitat de Girona, Campus Montilivi, 17003 Girona, Catalonia, Spain
- Frederic Cadet
- PEACCEL, Artificial Intelligence Department, 6 Square Albin Cachot, Box 42, 75013 Paris, France
- Manfred T. Reetz
- Department of Chemistry, Philipps-Universität, 35032 Marburg, Germany
- Max-Planck-Institut fuer Kohlenforschung, 45470 Mülheim, Germany
- Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 32 West 7th Avenue, Tianjin Airport Economic Area, 300308 Tianjin, P. R. China
|
33
|
Zhao Y, Li D, Bai X, Luo M, Feng Y, Zhao Y, Ma F, Yang GY. Improved thermostability of proteinase K and recognizing the synergistic effect of Rosetta and FoldX approaches. Protein Eng Des Sel 2021; 34:6404066. [PMID: 34671809 DOI: 10.1093/protein/gzab024] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/21/2021] [Revised: 08/24/2021] [Accepted: 08/25/2021] [Indexed: 11/14/2022] Open
Abstract
Proteinase K (PRK) is a proteolytic enzyme that has been widely used in industrial applications. However, poor stability has severely limited the uses of PRK. In this work, we used two structure-guided rational design methods, Rosetta and FoldX, to modify PRK thermostability. Fifty-two single amino acid conversion mutants were constructed based on software predictions of residues that could affect protein stability. Experimental characterization revealed that 46% (21 mutants) exhibited enhanced thermostability. The top four variants, D260V, T4Y, S216Q, and S219Q, showed improved half-lives at 69°C by 12.4-, 2.6-, 2.3-, and 2.2-fold that of the parent enzyme, respectively. We also found that selecting mutations predicted by both methods could increase the predictive accuracy over that of either method alone, with 73% of the shared predicted mutations resulting in higher thermostability. In addition to providing promising new variants of PRK in industrial applications, our findings also show that combining these programs may synergistically improve their predictive accuracy.
Affiliation(s)
- Yang Zhao
- Institute of Biothermal Science and Technology, University of Shanghai for Science and Technology, 516 Jungong Rd., Shanghai 200093, People's Republic of China
- Daixi Li
- Institute of Biothermal Science and Technology, University of Shanghai for Science and Technology, 516 Jungong Rd., Shanghai 200093, People's Republic of China
- Xue Bai
- Institute of Biothermal Science and Technology, University of Shanghai for Science and Technology, 516 Jungong Rd., Shanghai 200093, People's Republic of China
- Manjie Luo
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai 200240, People's Republic of China
- Yan Feng
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai 200240, People's Republic of China
- Yilei Zhao
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai 200240, People's Republic of China
- Fuqiang Ma
- CAS Key Lab of Bio-Medical Diagnostics, Suzhou Institute of Biomedical Engineering and Technology, Chinese Academy of Sciences, 88 Keling Rd., Suzhou 215163, China
- Guang-Yu Yang
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 800 Dongchuan Rd., Shanghai 200240, People's Republic of China
|
34
|
Unger EK, Keller JP, Altermatt M, Liang R, Matsui A, Dong C, Hon OJ, Yao Z, Sun J, Banala S, Flanigan ME, Jaffe DA, Hartanto S, Carlen J, Mizuno GO, Borden PM, Shivange AV, Cameron LP, Sinning S, Underhill SM, Olson DE, Amara SG, Temple Lang D, Rudnick G, Marvin JS, Lavis LD, Lester HA, Alvarez VA, Fisher AJ, Prescher JA, Kash TL, Yarov-Yarovoy V, Gradinaru V, Looger LL, Tian L. Directed Evolution of a Selective and Sensitive Serotonin Sensor via Machine Learning. Cell 2020; 183:1986-2002.e26. [PMID: 33333022 PMCID: PMC8025677 DOI: 10.1016/j.cell.2020.11.040] [Citation(s) in RCA: 92] [Impact Index Per Article: 23.0] [Received: 11/23/2019] [Revised: 06/22/2020] [Accepted: 11/20/2020] [Indexed: 12/28/2022]
Abstract
Serotonin plays a central role in cognition and is the target of most pharmaceuticals for psychiatric disorders. Existing drugs have limited efficacy; creation of improved versions will require better understanding of serotonergic circuitry, which has been hampered by our inability to monitor serotonin release and transport with high spatial and temporal resolution. We developed and applied a binding-pocket redesign strategy, guided by machine learning, to create a high-performance, soluble, fluorescent serotonin sensor (iSeroSnFR), enabling optical detection of millisecond-scale serotonin transients. We demonstrate that iSeroSnFR can be used to detect serotonin release in freely behaving mice during fear conditioning, social interaction, and sleep/wake transitions. We also developed a robust assay of serotonin transporter function and modulation by drugs. We expect that both machine-learning-guided binding-pocket redesign and iSeroSnFR will have broad utility for the development of other sensors and in vitro and in vivo serotonin detection, respectively.
Affiliation(s)
- Elizabeth K Unger
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Jacob P Keller
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA
- Michael Altermatt
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Ruqiang Liang
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Aya Matsui
- Laboratory on Neurobiology of Compulsive Behaviors, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD 20892, USA
- Chunyang Dong
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Olivia J Hon
- Bowles Center for Alcohol Studies, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599, USA
- Zi Yao
- Department of Chemistry, University of California, Irvine, Irvine, CA 92697, USA
- Junqing Sun
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Samba Banala
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA
- Meghan E Flanigan
- Bowles Center for Alcohol Studies, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599, USA
- David A Jaffe
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Samantha Hartanto
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Jane Carlen
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Grace O Mizuno
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Phillip M Borden
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA
- Amol V Shivange
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Lindsay P Cameron
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Steffen Sinning
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06520, USA
- Suzanne M Underhill
- Laboratory of Molecular and Cellular Neurobiology, National Institute on Mental Health, NIH, Bethesda, MD 20892, USA
- David E Olson
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
- Susan G Amara
- Laboratory of Molecular and Cellular Neurobiology, National Institute on Mental Health, NIH, Bethesda, MD 20892, USA
- Duncan Temple Lang
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
| | - Gary Rudnick
- Department of Pharmacology, Yale University School of Medicine, New Haven, CT 06520, USA
| | - Jonathan S Marvin
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA
| | - Luke D Lavis
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA
| | - Henry A Lester
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Veronica A Alvarez
- Laboratory on Neurobiology of Compulsive Behaviors, National Institute on Alcohol Abuse and Alcoholism, NIH, Bethesda, MD 20892, USA
| | - Andrew J Fisher
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
| | - Jennifer A Prescher
- Department of Chemistry, University of California, Irvine, Irvine, CA 92697, USA
| | - Thomas L Kash
- Bowles Center for Alcohol Studies, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, NC 27599, USA
| | - Vladimir Yarov-Yarovoy
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA
| | - Viviana Gradinaru
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Loren L Looger
- Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, VA 20174, USA.
| | - Lin Tian
- Departments of Biochemistry and Molecular Medicine, Chemistry, Statistics, Molecular and Cellular Biology, and Physiology and Membrane Biology, the Center for Neuroscience, and Graduate Programs in Molecular, Cellular, and Integrative Physiology, Biochemistry, Molecular, Cellular and Developmental Biology and Neuroscience, University of California, Davis, Davis, CA 95616, USA.
| |
Collapse
|
35
|
Song H, Bremer BJ, Hinds EC, Raskutti G, Romero PA. Inferring Protein Sequence-Function Relationships with Large-Scale Positive-Unlabeled Learning. Cell Syst 2020; 12:92-101.e8. [PMID: 33212013 DOI: 10.1016/j.cels.2020.10.007] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 11/02/2019] [Revised: 08/13/2020] [Accepted: 10/22/2020] [Indexed: 10/22/2022]
Abstract
Machine learning can infer how protein sequence maps to function without requiring a detailed understanding of the underlying physical or biological mechanisms. It is challenging to apply existing supervised learning frameworks to large-scale experimental data generated by deep mutational scanning (DMS) and related methods. DMS data often contain high-dimensional and correlated sequence variables, experimental sampling error and bias, and the presence of missing data. Notably, most DMS data do not contain examples of negative sequences, making it challenging to directly estimate how sequence affects function. Here, we develop a positive-unlabeled (PU) learning framework to infer sequence-function relationships from large-scale DMS data. Our PU learning method displays excellent predictive performance across ten large-scale sequence-function datasets, representing proteins of different folds, functions, and library types. The estimated parameters pinpoint key residues that dictate protein structure and function. Finally, we apply our statistical sequence-function model to design highly stabilized enzymes.
Affiliation(s)
- Hyebin Song
- Department of Statistics, The Pennsylvania State University, State College, PA 16802, USA; Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Bennett J Bremer
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Emily C Hinds
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA
- Garvesh Raskutti
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Philip A Romero
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706, USA; Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
36
|
Troiano D, Orsat V, Dumont MJ. Status of Biocatalysis in the Production of 2,5-Furandicarboxylic Acid. ACS Catal 2020. [DOI: 10.1021/acscatal.0c02378] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Indexed: 01/12/2023]
Affiliation(s)
- Derek Troiano
- Bioresource Engineering Department, McGill University, Ste-Anne-de-Bellevue, Quebec H9X 3V9, Canada
- Valérie Orsat
- Bioresource Engineering Department, McGill University, Ste-Anne-de-Bellevue, Quebec H9X 3V9, Canada
- Marie-Josée Dumont
- Bioresource Engineering Department, McGill University, Ste-Anne-de-Bellevue, Quebec H9X 3V9, Canada
|
37
|
Siedhoff NE, Schwaneberg U, Davari MD. Machine learning-assisted enzyme engineering. Methods Enzymol 2020; 643:281-315. [DOI: 10.1016/bs.mie.2020.05.005] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Indexed: 02/06/2023]
|
38
|
Chowdhury R, Maranas CD. From directed evolution to computational enzyme engineering—A review. AIChE J 2019. [DOI: 10.1002/aic.16847] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Indexed: 12/13/2022]
Affiliation(s)
- Ratul Chowdhury
- Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania
- Costas D. Maranas
- Department of Chemical Engineering, The Pennsylvania State University, University Park, Pennsylvania
|
39
|
Improving the catalytic performance of Proteinase K from Parengyodontium album for use in feather degradation. Int J Biol Macromol 2019; 154:1586-1595. [PMID: 31706815 DOI: 10.1016/j.ijbiomac.2019.11.043] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 10/05/2019] [Revised: 11/06/2019] [Accepted: 11/06/2019] [Indexed: 01/14/2023]
Abstract
Proteinase K (PROK) from Parengyodontium album hydrolyzes keratin, a major protein component of poultry feathers, which are an inexpensive and renewable protein resource. Based on structural studies for analysis of amino acid flexibility near the catalytic center, identification of highly conserved residues, and experimental screening, we obtained a mutant R218S with residual activity 1.6-fold higher than that of PROK after incubation at 60 °C for 1 h. Molecular dynamics simulation indicated that substitution of Arg218 with Ser leads to three hydrogen bonds being introduced into the structure, stabilizing the β-sheet in which Ser218 is located, and thus improvement of thermostability. Additionally, the mutant R218S had a 15% increase in specific activity compared to PROK and improvement in the rate and thoroughness of feather degradation compared with PROK. We confirmed the positive effects of enhancing catalytic center rigidity on enzyme thermostability, a finding which may have broad applications.
|
40
|
Yang KK, Wu Z, Arnold FH. Machine-learning-guided directed evolution for protein engineering. Nat Methods 2019; 16:687-694. [PMID: 31308553 DOI: 10.1038/s41592-019-0496-6] [Citation(s) in RCA: 431] [Impact Index Per Article: 86.2] [Received: 10/25/2018] [Accepted: 06/17/2019] [Indexed: 02/06/2023]
Abstract
Protein engineering through machine-learning-guided directed evolution enables the optimization of protein functions. Machine-learning approaches predict how sequence maps to function in a data-driven manner without requiring a detailed model of the underlying physics or biological pathways. Such methods accelerate directed evolution by learning from the properties of characterized variants and using that information to select sequences that are likely to exhibit improved properties. Here we introduce the steps required to build machine-learning sequence-function models and to use those models to guide engineering, making recommendations at each stage. This review covers basic concepts relevant to the use of machine learning for protein engineering, as well as the current literature and applications of this engineering paradigm. We illustrate the process with two case studies. Finally, we look to future opportunities for machine learning to enable the discovery of unknown protein functions and uncover the relationship between protein sequence and function.
Affiliation(s)
- Kevin K Yang
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA
- Zachary Wu
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA
- Frances H Arnold
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA.
|
41
|
Kent R, Dixon N. Systematic Evaluation of Genetic and Environmental Factors Affecting Performance of Translational Riboswitches. ACS Synth Biol 2019; 8:884-901. [PMID: 30897329 PMCID: PMC6492952 DOI: 10.1021/acssynbio.9b00017] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022]
Abstract
Since their discovery, riboswitches have been attractive tools for the user-controlled regulation of gene expression in bacterial systems. Riboswitches facilitate small molecule mediated fine-tuning of protein expression, making these tools of great use to the synthetic biology community. However, the use of riboswitches is often restricted due to context dependent performance and limited dynamic range. Here, we report the drastic improvement of a previously developed orthogonal riboswitch achieved through in vivo functional selection and optimization of flanking coding and noncoding sequences. The behavior of the derived riboswitches was mapped under a wide array of growth and induction conditions, using a structured Design of Experiments approach. This approach successfully improved the maximal protein expression levels 8.2-fold relative to the original riboswitches, and the dynamic range was improved to afford riboswitch dependent control of 80-fold. The optimized orthogonal riboswitch was then integrated downstream of four endogenous stress promoters, responsive to phosphate starvation, hyperosmotic stress, redox stress, and carbon starvation. These responsive stress promoter-riboswitch devices were demonstrated to allow for tuning of protein expression up to ∼650-fold in response to both environmental and cellular stress responses and riboswitch dependent attenuation. We envisage that these riboswitch stress responsive devices will be useful tools for the construction of advanced genetic circuits, bioprocessing, and protein expression.
Affiliation(s)
- R. Kent
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester M13 9PL, United Kingdom
- N. Dixon
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester M13 9PL, United Kingdom
|
42
|
Li G, Dong Y, Reetz MT. Can Machine Learning Revolutionize Directed Evolution of Selective Enzymes? Adv Synth Catal 2019. [DOI: 10.1002/adsc.201900149] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Indexed: 01/01/2023]
Affiliation(s)
- Guangyue Li
- State Key Laboratory for Biology of Plant Diseases and Insect Pests/Key Laboratory of Control of Biological Hazard Factors (Plant Origin) for Agri-product Quality and Safety, Ministry of Agriculture, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100081, People's Republic of China
- Yijie Dong
- State Key Laboratory for Biology of Plant Diseases and Insect Pests/Key Laboratory of Control of Biological Hazard Factors (Plant Origin) for Agri-product Quality and Safety, Ministry of Agriculture, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100081, People's Republic of China
- Manfred T. Reetz
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Fachbereich Chemie der Philipps-Universität, Hans-Meerwein-Strasse, 35032 Marburg, Germany
|
43
|
Moore JC, Rodriguez-Granillo A, Crespo A, Govindarajan S, Welch M, Hiraga K, Lexa K, Marshall N, Truppo MD. "Site and Mutation"-Specific Predictions Enable Minimal Directed Evolution Libraries. ACS Synth Biol 2018; 7:1730-1741. [PMID: 29782150 DOI: 10.1021/acssynbio.7b00359] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022]
Abstract
Directed evolution experiments designed to improve the activity of a biocatalyst have increased in sophistication from the early days of completely random mutagenesis. Sequence-based and structure-based methods have been developed to identify "hotspot" positions that when randomized provide a higher frequency of beneficial mutations that improve activity. These focused mutagenesis methods reduce library sizes and therefore reduce screening burden, accelerating the rate of finding improved enzymes. Looking for further acceleration in finding improved enzymes, we investigated whether two existing methods, one sequence-based (Protein GPS) and one structure-based (using Bioluminate and MOE), were sufficiently predictive to provide not just the hotspot position, but also the amino acid substitution that improved activity at that position. By limiting the libraries to variants that contained only specific amino acid substitutions, library sizes were kept to less than 100 variants. For an initial round of ATA-117 R-selective transaminase evolution, we found that the methods used produced libraries where 9% and 18% of the amino acid substitutions chosen were amino acids that improved reaction performance in lysates. The ability to create combinations of mutations as part of the initial design was confounded by the relatively large number of predicted mutations that were inactivating (30% and 45% for the sequence-based and structure-based methods, respectively). Despite this, combining several mutations identified within a given method produced variant lysates 7- and 9-fold more active than the wild-type lysate, highlighting the capability of mutations chosen this way to generate large advances in activity in addition to the reductions in screening.
Affiliation(s)
- Mark Welch
- ATUM, 37950 Central Court, Newark, California 94560, United States
|
44
|
Brown SR, Staff M, Lee R, Love J, Parker DA, Aves SJ, Howard TP. Design of Experiments Methodology to Build a Multifactorial Statistical Model Describing the Metabolic Interactions of Alcohol Dehydrogenase Isozymes in the Ethanol Biosynthetic Pathway of the Yeast Saccharomyces cerevisiae. ACS Synth Biol 2018; 7:1676-1684. [PMID: 29976056 DOI: 10.1021/acssynbio.8b00112] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/20/2022]
Abstract
Multifactorial approaches can quickly and efficiently model complex, interacting natural or engineered biological systems in a way that traditional one-factor-at-a-time experimentation can fail to do. We applied a Design of Experiments (DOE) approach to model ethanol biosynthesis in yeast, which is well-understood and genetically tractable, yet complex. Six alcohol dehydrogenase (ADH) isozymes catalyze ethanol synthesis, differing in their transcriptional and post-translational regulation, subcellular localization, and enzyme kinetics. We generated a combinatorial library of all ADH gene deletions and measured the impact of gene deletion(s) and environmental context on ethanol production of a subset of this library. The data were used to build a statistical model that described known behaviors of ADH isozymes and identified novel interactions. Importantly, the model described features of ADH metabolic behavior without explicit a priori knowledge. The method is therefore highly suited to understanding and optimizing metabolic pathways in less well-understood systems.
Affiliation(s)
- Steven R. Brown
- Biosciences, Geoffrey Pope Building, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, U.K
- Marta Staff
- Biosciences, Geoffrey Pope Building, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, U.K
- Rob Lee
- Biodomain, Shell Technology Center Houston, 3333 Highway 6 South, Houston, Texas 77082-3101, United States
- John Love
- Biosciences, Geoffrey Pope Building, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, U.K
- David A. Parker
- Biodomain, Shell Technology Center Houston, 3333 Highway 6 South, Houston, Texas 77082-3101, United States
- Stephen J. Aves
- Biosciences, Geoffrey Pope Building, College of Life and Environmental Sciences, University of Exeter, Exeter EX4 4QD, U.K
- Thomas P. Howard
- School of Natural and Environmental Sciences, Devonshire Building, Faculty of Science, Agriculture and Engineering, Newcastle University, Newcastle-upon-Tyne NE1 7RU, U.K
|
45
|
Rigoldi F, Donini S, Redaelli A, Parisini E, Gautieri A. Review: Engineering of thermostable enzymes for industrial applications. APL Bioeng 2018; 2:011501. [PMID: 31069285 PMCID: PMC6481699 DOI: 10.1063/1.4997367] [Citation(s) in RCA: 155] [Impact Index Per Article: 25.8] [Received: 07/24/2017] [Accepted: 11/14/2017] [Indexed: 01/19/2023] Open
Abstract
The catalytic properties of some selected enzymes have long been exploited to carry out efficient and cost-effective bioconversions in a multitude of research and industrial sectors, such as food, health, cosmetics, agriculture, chemistry, energy, and others. Nonetheless, for several applications, naturally occurring enzymes are not considered to be viable options owing to their limited stability in the required working conditions. Over the years, the quest for novel enzymes with actual potential for biotechnological applications has involved various complementary approaches such as mining enzyme variants from organisms living in extreme conditions (extremophiles), mimicking evolution in the laboratory to develop more stable enzyme variants, and more recently, using rational, computer-assisted enzyme engineering strategies. In this review, we provide an overview of the most relevant enzymes that are used for industrial applications and we discuss the strategies that are adopted to enhance enzyme stability and/or activity, along with some of the most relevant achievements. In all living species, many different enzymes catalyze fundamental chemical reactions with high substrate specificity and rate enhancements. Besides specificity, enzymes also possess many other favorable properties, such as, for instance, cost-effectiveness, good stability under mild pH and temperature conditions, generally low toxicity levels, and ease of termination of activity. As efficient natural biocatalysts, enzymes provide great opportunities to carry out important chemical reactions in several research and industrial settings, ranging from food to pharmaceutical, cosmetic, agricultural, and other crucial economic sectors.
Affiliation(s)
- Federica Rigoldi
- Biomolecular Engineering Lab, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
- Stefano Donini
- Center for Nano Science and Technology at Polimi, Istituto Italiano di Tecnologia, Via G. Pascoli 70/3, 20133 Milano, Italy
- Alberto Redaelli
- Biomolecular Engineering Lab, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
- Emilio Parisini
- Center for Nano Science and Technology at Polimi, Istituto Italiano di Tecnologia, Via G. Pascoli 70/3, 20133 Milano, Italy
- Alfonso Gautieri
- Biomolecular Engineering Lab, Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Piazza Leonardo da Vinci 32, 20133 Milano, Italy
|
46
|
Getting Momentum: From Biocatalysis to Advanced Synthetic Biology. Trends Biochem Sci 2018; 43:180-198. [DOI: 10.1016/j.tibs.2018.01.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Received: 11/03/2017] [Revised: 01/08/2018] [Accepted: 01/10/2018] [Indexed: 11/20/2022]
|
47
|
Abstract
The last decade has seen a dramatic increase in the utilization of enzymes as green and sustainable (bio)catalysts in pharmaceutical and industrial applications. This trend has to a significant degree been fueled by advances in scientists' and engineers' ability to customize native enzymes by protein engineering. A review of the literature quickly reveals the tremendous success of this approach; protein engineering has generated enzyme variants with improved catalytic activity, broadened or altered substrate specificity, as well as raised or reversed stereoselectivity. Enzymes have been tailored to retain activity at elevated temperatures and to function in the presence of organic solvents, salts and pH values far from physiological conditions. However, readers unfamiliar with the field will soon encounter the confusingly large number of experimental techniques that have been employed to accomplish these engineering feats. Herein, we use history to guide a brief overview of the major strategies for protein engineering: past, present, and future.
Affiliation(s)
- Stefan Lutz
- Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, GA, 30322, USA.
- Samantha M Iamurri
- Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, GA, 30322, USA
|
48
|
Learning epistatic interactions from sequence-activity data to predict enantioselectivity. J Comput Aided Mol Des 2017; 31:1085-1096. [DOI: 10.1007/s10822-017-0090-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/10/2017] [Accepted: 12/04/2017] [Indexed: 10/18/2022]
|
49
|
Musdal Y, Govindarajan S, Mannervik B. Exploring sequence-function space of a poplar glutathione transferase using designed information-rich gene variants. Protein Eng Des Sel 2017; 30:543-549. [PMID: 28967959 PMCID: PMC5914380 DOI: 10.1093/protein/gzx045] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/23/2017] [Accepted: 08/15/2017] [Indexed: 01/19/2023] Open
Abstract
Exploring the vicinity around a locus of a protein in sequence space may identify homologs with enhanced properties, which could become valuable in biotechnical and other applications. A rational approach to this pursuit is the use of ‘infologs’, i.e. synthetic sequences with specific substitutions capturing maximal sequence information derived from the evolutionary history of the protein family. Ninety-five such infolog genes of poplar glutathione transferase were synthesized and expressed in Escherichia coli, and the catalytic activities of the proteins determined with alternative substrates. Sequence–activity relationships derived from the infologs were used to design a second set of 47 infologs in which 90% of the members exceeded wild-type properties. Two mutants, C2 (V55I/E95D/D108E/A160V) and G5 (F13L/C70A/G122E), were further functionally characterized. The activities of the infologs with the alternative substrates 1-chloro-2,4-dinitrobenzene and phenethyl isothiocyanate, subject to different chemistries, were positively correlated, indicating that the examined mutations were affecting the overall catalytic competence without a major shift in substrate discrimination. By contrast, the enhanced protein expressivity observed in many of the mutants was not similarly correlated with the activities. In conclusion, small libraries of well-defined infologs can be used to systematically explore sequence space to optimize proteins in multidimensional functional space.
Affiliation(s)
- Yaman Musdal
- Department of Neurochemistry, Arrhenius Laboratories, Stockholm University, Svante Arrhenius väg 16B, SE-10691 Stockholm, Sweden
- Bengt Mannervik
- Department of Neurochemistry, Arrhenius Laboratories, Stockholm University, Svante Arrhenius väg 16B, SE-10691 Stockholm, Sweden
|
50
|
Carlin DA, Caster RW, Wang X, Betzenderfer SA, Chen CX, Duong VM, Ryklansky CV, Alpekin A, Beaumont N, Kapoor H, Kim N, Mohabbot H, Pang B, Teel R, Whithaus L, Tagkopoulos I, Siegel JB. Kinetic Characterization of 100 Glycoside Hydrolase Mutants Enables the Discovery of Structural Features Correlated with Kinetic Constants. PLoS One 2016; 11:e0147596. [PMID: 26815142 PMCID: PMC4729467 DOI: 10.1371/journal.pone.0147596] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 07/31/2015] [Accepted: 01/06/2016] [Indexed: 11/18/2022] Open
Abstract
The use of computational modeling algorithms to guide the design of novel enzyme catalysts is a rapidly growing field. Force-field based methods have now been used to engineer both enzyme specificity and activity. However, the proportion of designed mutants with the intended function is often less than ten percent. One potential reason for this is that current force-field based approaches are trained on indirect measures of function rather than direct correlation to experimentally-determined functional effects of mutations. We hypothesize that this is partially due to the lack of data sets for which a large panel of enzyme variants has been produced, purified, and kinetically characterized. Here we report the kcat and KM values of 100 purified mutants of a glycoside hydrolase enzyme. We demonstrate the utility of this data set by using machine learning to train a new algorithm that enables prediction of each kinetic parameter based on readily-modeled structural features. The generated dataset and analyses carried out in this study not only provide insight into how this enzyme functions, they also provide a clear path forward for the improvement of computational enzyme redesign algorithms.
Affiliation(s)
- Dylan Alexander Carlin
- Biophysics Graduate Group, University of California Davis, California, United States of America
- Ryan W. Caster
- Genome Center, University of California Davis, Davis, California, United States of America
- Xiaokang Wang
- Department of Biomedical Engineering, University of California Davis, Davis, California, United States of America
- Claire X. Chen
- Genome Center, University of California Davis, Davis, California, United States of America
- Veasna M. Duong
- Genome Center, University of California Davis, Davis, California, United States of America
- Carolina V. Ryklansky
- Genome Center, University of California Davis, Davis, California, United States of America
- Alp Alpekin
- Genome Center, University of California Davis, Davis, California, United States of America
- Nathan Beaumont
- Genome Center, University of California Davis, Davis, California, United States of America
- Harshul Kapoor
- Genome Center, University of California Davis, Davis, California, United States of America
- Nicole Kim
- Genome Center, University of California Davis, Davis, California, United States of America
- Hosna Mohabbot
- Genome Center, University of California Davis, Davis, California, United States of America
- Boyu Pang
- Genome Center, University of California Davis, Davis, California, United States of America
- Rachel Teel
- Genome Center, University of California Davis, Davis, California, United States of America
- Lillian Whithaus
- Genome Center, University of California Davis, Davis, California, United States of America
- Ilias Tagkopoulos
- Genome Center, University of California Davis, Davis, California, United States of America
- Department of Computer Science, University of California Davis, Davis, California, United States of America
- Justin B. Siegel
- Genome Center, University of California Davis, Davis, California, United States of America
- Department of Chemistry, University of California Davis, Davis, California, United States of America
- Department of Biochemistry & Molecular Medicine, University of California Davis, Davis, California, United States of America
|