1
Lino BR, Williams SJ, Castor ME, Van Deventer JA. Reaching New Heights in Genetic Code Manipulation with High Throughput Screening. Chem Rev 2024; 124:12145-12175. [PMID: 39418482 DOI: 10.1021/acs.chemrev.4c00329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2024]
Abstract
The chemical and physical properties of proteins are limited by the 20 canonical amino acids. Genetic code manipulation allows for the incorporation of noncanonical amino acids (ncAAs) that enhance or alter protein functionality. This review explores advances in the three main strategies for introducing ncAAs into biosynthesized proteins, focusing on the role of high throughput screening in these advancements. The first section discusses engineering aminoacyl-tRNA synthetases (aaRSs) and tRNAs, emphasizing how novel selection methods improve characteristics including ncAA incorporation efficiency and selectivity. The second section examines high-throughput techniques for improving protein translation machinery, enabling accommodation of alternative genetic codes. This includes opportunities to enhance ncAA incorporation through engineering cellular components unrelated to translation. The final section highlights various discovery platforms for high-throughput screening of ncAA-containing proteins, showcasing innovative binding ligands and enzymes that are challenging to create with only canonical amino acids. These advances have led to promising drug leads and biocatalysts. Overall, the ability to discover unexpected functionalities through high-throughput methods significantly influences ncAA incorporation and its applications. Future innovations in experimental techniques, along with advancements in computational protein design and machine learning, are poised to further elevate this field.
Affiliation(s)
- Briana R Lino
- Chemical and Biological Engineering Department, Tufts University, Medford, Massachusetts 02155, United States
- Sean J Williams
- Chemical and Biological Engineering Department, Tufts University, Medford, Massachusetts 02155, United States
- Michelle E Castor
- Chemical and Biological Engineering Department, Tufts University, Medford, Massachusetts 02155, United States
- James A Van Deventer
- Chemical and Biological Engineering Department, Tufts University, Medford, Massachusetts 02155, United States
- Biomedical Engineering Department, Tufts University, Medford, Massachusetts 02155, United States
2
Chiang CH, Wang Y, Hussain A, Brooks CL, Narayan ARH. Ancestral Sequence Reconstruction to Enable Biocatalytic Synthesis of Azaphilones. J Am Chem Soc 2024; 146:30194-30203. [PMID: 39441831 DOI: 10.1021/jacs.4c08761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2024]
Abstract
Biocatalysis can be powerful in organic synthesis but is often limited by enzymes' substrate scope and selectivity. Developing a biocatalytic step involves identifying an initial enzyme for the target reaction followed by optimization through rational design, directed evolution, or both. These steps are time consuming, resource-intensive, and require expertise beyond typical organic chemistry. Thus, an effective strategy for streamlining the process from enzyme identification to implementation is essential to expanding biocatalysis. Here, we present a strategy combining bioinformatics-guided enzyme mining and ancestral sequence reconstruction (ASR) to resurrect enzymes for biocatalytic synthesis. Specifically, we achieve an enantioselective synthesis of azaphilone natural products using two ancestral enzymes: a flavin-dependent monooxygenase (FDMO) for stereodivergent oxidative dearomatization and a substrate-selective acyltransferase (AT) for the acylation of the enzymatically installed hydroxyl group. This cascade, stereocomplementary to established chemoenzymatic routes, expands access to enantiomeric linear tricyclic azaphilones. By leveraging the co-occurrence and coevolution of FDMO and AT in azaphilone biosynthetic pathways, we identified an AT candidate, CazE, and addressed its low solubility and stability through ASR, obtaining a more soluble, stable, promiscuous, and reactive ancestral AT (AncAT). Sequence analysis revealed AncAT as a chimeric composition of its descendants with enhanced reactivity likely due to ancestral promiscuity. Flexible receptor docking and molecular dynamics simulations showed that the most reactive AncAT promotes a reactive geometry between substrates. We anticipate that our bioinformatics-guided, ASR-based approach can be broadly applied in target-oriented synthesis, reducing the time required to develop biocatalytic steps and efficiently access superior biocatalysts.
Affiliation(s)
- Chang-Hwa Chiang
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
- Ye Wang
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
- Azam Hussain
- Macromolecular Science and Engineering Program, University of Michigan, Ann Arbor, Michigan 48109, United States
- Charles L Brooks
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Program in Chemical Biology, University of Michigan, Ann Arbor, Michigan 48109, United States
- Enhanced Program in Biophysics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Alison R H Narayan
- Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan 48109, United States
- Program in Chemical Biology, University of Michigan, Ann Arbor, Michigan 48109, United States
3
Abriata LA. The Nobel Prize in Chemistry: past, present, and future of AI in biology. Commun Biol 2024; 7:1409. [PMID: 39472680 PMCID: PMC11522274 DOI: 10.1038/s42003-024-07113-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Accepted: 10/21/2024] [Indexed: 11/02/2024] Open
Abstract
A Comment on the transformative progress of artificial intelligence for structural and protein biology, referencing the 2024 Nobel Prize in Chemistry.
Affiliation(s)
- Luciano A Abriata
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, CH-1015, Lausanne, Switzerland.
4
Tripp A, Braun M, Wieser F, Oberdorfer G, Lechner H. Click, Compute, Create: A Review of Web-based Tools for Enzyme Engineering. Chembiochem 2024; 25:e202400092. [PMID: 38634409 DOI: 10.1002/cbic.202400092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/14/2024] [Accepted: 04/15/2024] [Indexed: 04/19/2024]
Abstract
Enzyme engineering, though pivotal across various biotechnological domains, is often plagued by its time-consuming and labor-intensive nature. This review aims to offer an overview of supportive in silico methodologies for this demanding endeavor. Starting from methods to predict protein structures, classify their activity, and even discover new enzymes, we continue by describing tools used to increase the thermostability and production yields of selected targets. Subsequently, we discuss computational methods to modulate both the activity and the selectivity of enzymes. Last, we present recent approaches based on cutting-edge machine learning methods to redesign enzymes. With the exception of the last chapter, there is a strong focus on methods easily accessible via web interfaces or simple Python scripts, and therefore readily usable by a diverse and broad community.
Affiliation(s)
- Adrian Tripp
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Markus Braun
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Florian Wieser
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Gustav Oberdorfer
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- BioTechMed, Graz, Austria
- Horst Lechner
- Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- BioTechMed, Graz, Austria
5
Li Y, Li F, Duan Z, Liu R, Jiao W, Wu H, Zhu F, Xue W. SYNBIP 2.0: epitopes mapping, sequence expansion and scaffolds discovery for synthetic binding protein innovation. Nucleic Acids Res 2024:gkae893. [PMID: 39413165 DOI: 10.1093/nar/gkae893] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/19/2024] [Revised: 09/18/2024] [Accepted: 09/26/2024] [Indexed: 10/18/2024] Open
Abstract
Synthetic binding proteins (SBPs) represent a pivotal class of artificially engineered proteins, meticulously crafted to exhibit targeted binding properties and specific functions. Here, the SYNBIP database, a comprehensive resource for SBPs, has been significantly updated. These enhancements include (i) featuring 3D structures of 899 SBP-target complexes to illustrate the binding epitopes of SBPs, (ii) expanding the sequence space of SBPs fivefold to 12 025 by applying a structure-based protein generation framework and a protein property prediction tool to the structures of SBPs in monomer form or in complex with target proteins, (iii) offering detailed information on 78 473 newly identified SBP-like scaffolds from the RCSB Protein Data Bank, and an additional 16 401 555 from the AlphaFold Protein Structure Database, and (iv) regularly updating the database, incorporating 153 new SBPs. Furthermore, the structural models of all SBPs have been enhanced through the application of AlphaFold2, with their clinical statuses concurrently refreshed. Additionally, the design methods employed for each SBP are now prominently featured in the database. In sum, SYNBIP 2.0 is designed to provide researchers with essential SBP data, facilitating their innovation in research, diagnosis and therapy. SYNBIP 2.0 is now freely accessible at https://idrblab.org/synbip/.
Affiliation(s)
- Yanlin Li
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
- Fengcheng Li
- Children's Hospital, Zhejiang University School of Medicine, National Clinical Research Center for Child Health, 3333 Binsheng Road, Hangzhou, Zhejiang 310052, China
- College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
- Zixin Duan
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
- Ruihan Liu
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
- Wantong Jiao
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
- Haibo Wu
- School of Life Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
- Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, 866 Yuhangtang Road, Hangzhou, Zhejiang 310058, China
- Weiwei Xue
- Chongqing Key Laboratory of Natural Product Synthesis and Drug Research, School of Pharmaceutical Sciences, Chongqing University, No. 55 South University Town Road, High-tech Zone, Chongqing 401331, China
6
Pleiss J. Modeling Enzyme Kinetics: Current Challenges and Future Perspectives for Biocatalysis. Biochemistry 2024; 63:2533-2541. [PMID: 39325558 DOI: 10.1021/acs.biochem.4c00501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/28/2024]
Abstract
Biocatalysis is becoming a data science. High-throughput experimentation generates a rapidly increasing stream of biocatalytic data, which is the raw material for mechanistic and novel data-driven modeling approaches for the predictive design of improved biocatalysts and novel bioprocesses. The holistic and molecular understanding of enzymatic reaction systems will enable us to identify and overcome kinetic bottlenecks and shift the thermodynamics of a reaction. The full characterization and modeling of reaction systems is a community effort; therefore, published methods and results should be findable, accessible, interoperable, and reusable (FAIR), which is achieved by developing standardized data exchange formats, by a complete and reproducible documentation of experimentation, by collaborative platforms for developing sustainable software and for analyzing data, and by repositories for publishing results together with raw data. The FAIRification of biocatalysis is a prerequisite to developing highly automated laboratory infrastructures that improve the reproducibility of scientific results and reduce the time and costs required to develop novel synthesis routes.
Affiliation(s)
- Jürgen Pleiss
- Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
7
Nurheibah SI, Sayyed ND, Batyanovskii AV, Talwar CS, Ahn WC, Park KH, Tuzikov AV, Ha KS, Woo EJ. Scyllatoxin-based peptide design for E. coli expression and HIV gp120 binding. Biochem Biophys Res Commun 2024; 727:150310. [PMID: 38941793 DOI: 10.1016/j.bbrc.2024.150310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 06/22/2024] [Indexed: 06/30/2024]
Abstract
Targeting the hydrophobic Phe43 pocket of HIV's envelope glycoprotein gp120 is a critical strategy for antiviral interventions due to its role in interacting with the host cell's CD4. Previous inhibitors, including small molecules and CD4 mimetic peptides based on scyllatoxin, have demonstrated significant binding and neutralization capabilities but were often chemically synthesized or contained non-canonical amino acids. Microbial expression using natural amino acids offers advantages such as cost-effectiveness, scalability, and efficient production of fusion proteins. In this study, we enhanced the previous scyllatoxin-based synthetic peptide by substituting natural amino acids and successfully expressed it in E. coli. The peptide was optimized by mutating the C-terminal amidated valine to valine and glutamine, and by reducing the disulfide bonds from three to two. Circular dichroism confirmed proper secondary structure formation, and fluorescence polarization analysis revealed specific, concentration-dependent binding to HIV gp120, supported by molecular dynamics simulations. These findings indicate the potential for scalable microbial production of effective antiviral peptides, with significant applications in pharmaceutical development for HIV treatment.
Affiliation(s)
- Salsabilla Izzah Nurheibah
- Department of Proteome Structural Biology, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34113, Republic of Korea; Disease Target Structure Research Center, KRIBB, Daejeon, 31441, Republic of Korea
- Nilofar Danishmalik Sayyed
- Disease Target Structure Research Center, KRIBB, Daejeon, 31441, Republic of Korea; Department of Molecular and Cellular Biochemistry, Kangwon National University School of Medicine, Chuncheon, Kangwon-do, 24341, Republic of Korea
- Alexander V Batyanovskii
- United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, 220012, Belarus
- Chandana S Talwar
- Department of Proteome Structural Biology, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34113, Republic of Korea; Disease Target Structure Research Center, KRIBB, Daejeon, 31441, Republic of Korea
- Woo-Chan Ahn
- Critical Diseases Diagnostics Convergence Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
- Kwang-Hyun Park
- Department of Proteome Structural Biology, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34113, Republic of Korea; Critical Diseases Diagnostics Convergence Research Center, Korea Research Institute of Bioscience and Biotechnology (KRIBB), Daejeon, 34141, Republic of Korea
- Alexander V Tuzikov
- United Institute of Informatics Problems, National Academy of Sciences of Belarus, Minsk, 220012, Belarus
- Kwon-Soo Ha
- Department of Molecular and Cellular Biochemistry, Kangwon National University School of Medicine, Chuncheon, Kangwon-do, 24341, Republic of Korea.
- Eui-Jeon Woo
- Department of Proteome Structural Biology, KRIBB School of Bioscience, Korea University of Science and Technology (UST), Daejeon, 34113, Republic of Korea; Disease Target Structure Research Center, KRIBB, Daejeon, 31441, Republic of Korea.
8
Son A, Park J, Kim W, Yoon Y, Lee S, Park Y, Kim H. Revolutionizing Molecular Design for Innovative Therapeutic Applications through Artificial Intelligence. Molecules 2024; 29:4626. [PMID: 39407556 PMCID: PMC11477718 DOI: 10.3390/molecules29194626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Revised: 09/19/2024] [Accepted: 09/27/2024] [Indexed: 10/20/2024] Open
Abstract
The field of computational protein engineering has been transformed by recent advancements in machine learning, artificial intelligence, and molecular modeling, enabling the design of proteins with unprecedented precision and functionality. Computational methods now play a crucial role in enhancing the stability, activity, and specificity of proteins for diverse applications in biotechnology and medicine. Techniques such as deep learning, reinforcement learning, and transfer learning have dramatically improved protein structure prediction, optimization of binding affinities, and enzyme design. These innovations have streamlined the process of protein engineering by allowing the rapid generation of targeted libraries, reducing experimental sampling, and enabling the rational design of proteins with tailored properties. Furthermore, the integration of computational approaches with high-throughput experimental techniques has facilitated the development of multifunctional proteins and novel therapeutics. However, challenges remain in bridging the gap between computational predictions and experimental validation and in addressing ethical concerns related to AI-driven protein design. This review provides a comprehensive overview of the current state and future directions of computational methods in protein engineering, emphasizing their transformative potential in creating next-generation biologics and advancing synthetic biology.
Affiliation(s)
- Ahrum Son
- Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Jongham Park
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Woojin Kim
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Yoonki Yoon
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Sangwoon Lee
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Yongho Park
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Hyunsoo Kim
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Protein AI Design Institute, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- SCICS, Prove beyond AI, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
9
Xie X, Gui L, Qiao B, Wang G, Huang S, Zhao Y, Sun S. Deep learning in template-free de novo biosynthetic pathway design of natural products. Brief Bioinform 2024; 25:bbae495. [PMID: 39373052 PMCID: PMC11456888 DOI: 10.1093/bib/bbae495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2024] [Revised: 09/12/2024] [Accepted: 09/20/2024] [Indexed: 10/08/2024] Open
Abstract
Natural products (NPs) are indispensable in drug development, particularly in combating infections, cancer, and neurodegenerative diseases. However, their limited availability poses significant challenges. Template-free de novo biosynthetic pathway design provides a strategic solution for NP production, with deep learning standing out as a powerful tool in this domain. This review delves into state-of-the-art deep learning algorithms in NP biosynthesis pathway design. It provides an in-depth discussion of databases like Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, and UniProt, which are essential for model training, along with chemical databases such as Reaxys, SciFinder, and PubChem for transfer learning to expand models' understanding of the broader chemical space. It evaluates the potential and challenges of sequence-to-sequence and graph-to-graph translation models for accurate single-step prediction. Additionally, it discusses search algorithms for multistep prediction and deep learning algorithms for predicting enzyme function. The review also highlights the pivotal role of deep learning in improving catalytic efficiency through enzyme engineering, which is essential for enhancing NP production. Moreover, it examines the application of large language models in pathway design, enzyme discovery, and enzyme engineering. Finally, it addresses the challenges and prospects associated with template-free approaches, offering insights into potential advancements in NP biosynthesis pathway design.
Affiliation(s)
- Xueying Xie
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
- College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Lin Gui
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Baixue Qiao
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
- College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Guohua Wang
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Shan Huang
- Department of Neurology, The Second Affiliated Hospital, Harbin Medical University, No. 246 Xuefu Road, Nangang District, Harbin 150081, China
- Yuming Zhao
- College of Computer and Control Engineering, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
- Shanwen Sun
- Key Laboratory of Saline-Alkali Vegetation Ecology Restoration, Ministry of Education (Northeast Forestry University), No. 26 Hexing Road, Xiangfang District, Harbin 150001, China
- College of Life Science, Northeast Forestry University, No. 26 Hexing Road, Xiangfang District, Harbin 150040, China
10
Sun C, Lou M, Li Z, Cheng F, Li Z. Combining an Enhanced Polyphosphate Kinase-Driven UDP-Glucose Regeneration System with the Screening of Key Glycosyltransferases for Efficient In Vitro Synthesis of Nucleoside Disaccharides. J Agric Food Chem 2024; 72:20557-20567. [PMID: 39250657 DOI: 10.1021/acs.jafc.4c05329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Nucleoside disaccharides are essential glycosides that naturally occur in specific living organisms. This study developed an enhanced UDP-glucose regeneration system to facilitate the in vitro multienzyme synthesis of nucleoside disaccharides by integrating it with nucleoside-specific glycosyltransferases. The system utilizes maltodextrin and polyphosphate as cost-effective substrates for UDP-glucose supply, catalyzed by α-glucan phosphorylase (αGP) and UDP-glucose pyrophosphorylase (UGP). To address the low activity of known polyphosphate kinases (PPKs) in the UDP phosphorylation reaction, a sequence-driven screening identified RhPPK with high activity against UDP (>1000 U/mg). Computational design further led to the creation of a double mutant with a 2566-fold increase in thermostability at 50 °C. The enhanced UDP-glucose regeneration system increased the production rate of nucleoside disaccharide synthesis by 25-fold. In addition, our UDP-glucose regeneration system is expected to be applied to other glycosyl transfer reactions.
Affiliation(s)
- Chuanqi Sun
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Miaozi Lou
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zonglin Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Feiyan Cheng
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Zhimin Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai 200237, China
- Shanghai Collaborative Innovation Center for Biomanufacturing Technology, 130 Meilong Road, Shanghai 200237, China
11
Gantz M, Mathis SV, Nintzel FEH, Lio P, Hollfelder F. On synergy between ultrahigh throughput screening and machine learning in biocatalyst engineering. Faraday Discuss 2024; 252:89-114. [PMID: 39133073 PMCID: PMC11318516 DOI: 10.1039/d4fd00065j] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/25/2024] [Accepted: 04/23/2024] [Indexed: 08/13/2024]
Abstract
Protein design and directed evolution have separately contributed enormously to protein engineering. Without being mutually exclusive, the former relies on computation from first principles, while the latter is a combinatorial approach based on chance. Advances in ultrahigh throughput (uHT) screening, next generation sequencing and machine learning may create alternative routes to engineered proteins, where functional information linked to specific sequences is interpreted and extrapolated in silico. In particular, the miniaturisation of functional tests in water-in-oil emulsion droplets with picoliter volumes and their rapid generation and analysis (>1 kHz) allows screening of >10^7-membered libraries in a day. Subsequently, decoding the selected clones by short or long-read sequencing methods leads to large sequence-function datasets that may allow extrapolation from experimental directed evolution to further improved mutants beyond the observed hits. In this work, we explore experimental strategies for how to draw up 'fitness landscapes' in sequence space with uHT droplet microfluidics, review the current state of AI/ML in enzyme engineering and discuss how uHT datasets may be combined with AI/ML to make meaningful predictions and accelerate biocatalyst engineering.
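The throughput figure quoted in this abstract can be checked with simple arithmetic. The sketch below assumes a constant droplet analysis rate of exactly 1 kHz (the abstract gives this only as a lower bound) and uninterrupted operation; the function names are illustrative and do not come from the paper.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s


def droplets_per_day(rate_hz: float) -> float:
    """Number of droplets analysed in 24 h at a constant rate (assumed, not measured)."""
    return rate_hz * SECONDS_PER_DAY


def hours_to_screen(library_size: float, rate_hz: float) -> float:
    """Hours needed to analyse one droplet per library member at a constant rate."""
    return library_size / rate_hz / 3600


if __name__ == "__main__":
    rate_hz = 1_000.0      # lower bound quoted in the abstract (>1 kHz)
    library_size = 1e7     # library size quoted in the abstract (>10^7)
    print(f"Droplets per day at {rate_hz:.0f} Hz: {droplets_per_day(rate_hz):.3g}")
    print(f"Hours for one pass over {library_size:.0e} members: "
          f"{hours_to_screen(library_size, rate_hz):.2f}")
```

Under these assumptions, 1 kHz corresponds to about 8.64 × 10^7 droplets per day, so a single pass over a 10^7-member library would take just under three hours. This is consistent with the abstract's claim; in practice, oversampling and droplets that do not contain a library member are likely to reduce the effective coverage.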
Affiliation(s)
- Maximilian Gantz
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Simon V Mathis
- Department of Computer Science, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Friederike E H Nintzel
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Pietro Lio
- Department of Computer Science, University of Cambridge, 15 JJ Thomson Avenue, Cambridge CB3 0FD, UK
- Florian Hollfelder
- Department of Biochemistry, University of Cambridge, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
12
Luo J, Song C, Cui W, Wang Q, Zhou Z, Han L. Precise redesign for improving enzyme robustness based on coevolutionary analysis and multidimensional virtual screening. Chem Sci 2024:d4sc02058h. [PMID: 39257856 PMCID: PMC11382147 DOI: 10.1039/d4sc02058h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 06/27/2024] [Indexed: 09/12/2024] Open
Abstract
Natural enzymes are able to function effectively under optimal physiological conditions, but the intrinsic performance often fails to meet the demands of industrial production. Existing strategies are based mainly on the evaluation and subsequent combination of single-point mutations; however, this approach often suffers from a limited number of designable residues and from low accuracy. Here, we propose a strategy (Co-MdVS) based on coevolutionary analysis and multidimensional virtual screening for precise design to improve enzyme robustness, employing nattokinase as a model. Using this strategy, we efficiently screened 8 dual mutants with enhanced thermostability from a virtual mutation library containing 7980 mutants. After further iterative combination, the optimal mutant M6 exhibited a 31-fold increase in half-life at 55 °C, significantly enhanced acid resistance, and improved catalytic efficiency with different substrates. Molecular dynamics simulations indicated that the reduced flexibility of thermal and acid-sensitive regions resulted in a significantly increased robustness of M6. Furthermore, the potential of multidimensional virtual screening in enhancing design precision has been validated on L-rhamnose isomerase and PETase. Therefore, the Co-MdVS strategy introduced in this research may offer a viable approach for developing enzymes with enhanced robustness.
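The reported 31-fold increase in half-life can be put in more intuitive terms. The sketch below assumes first-order (single-exponential) thermal inactivation, which the abstract does not state; under that assumption the inactivation rate constant scales inversely with the half-life. Only the fold change of 31 comes from the abstract, and the function names are illustrative.

```python
def rate_constant_ratio(half_life_fold: float) -> float:
    """k_variant / k_wild_type, assuming first-order decay (k = ln 2 / t_half)."""
    return 1.0 / half_life_fold


def residual_activity(wild_type_half_lives: float, half_life_fold: float = 1.0) -> float:
    """Fraction of activity remaining after a time expressed in wild-type half-lives.

    half_life_fold = 1.0 describes the wild type; larger values describe a
    variant whose half-life is that many times longer (assumed first-order decay).
    """
    return 0.5 ** (wild_type_half_lives / half_life_fold)


if __name__ == "__main__":
    fold = 31.0  # fold increase in half-life reported in the abstract
    print(f"k(variant)/k(wild type): {rate_constant_ratio(fold):.4f}")
    print(f"Wild type after one wild-type half-life: {residual_activity(1.0):.3f}")
    print(f"Variant after the same time: {residual_activity(1.0, fold):.3f}")
```

If inactivation were indeed first-order, the wild type would retain 50% of its activity after one wild-type half-life, while a variant with a 31-fold longer half-life would retain roughly 98% over the same period.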
Affiliation(s)
- Jie Luo
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
- Chenshuo Song
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
- Wenjing Cui
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
- Qiong Wang
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
- Zhemin Zhou
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
- Laichuang Han
- Key Laboratory of Industrial Biotechnology (Ministry of Education), School of Biotechnology, Jiangnan University, Wuxi, Jiangsu 214122, China
13
|
Thornton EL, Paterson SM, Stam MJ, Wood CW, Laohakunakorn N, Regan L. Applications of cell free protein synthesis in protein design. Protein Sci 2024; 33:e5148. [PMID: 39180484 PMCID: PMC11344276 DOI: 10.1002/pro.5148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Revised: 07/26/2024] [Accepted: 08/02/2024] [Indexed: 08/26/2024]
Abstract
In protein design, the ultimate test of success is that the designs function as desired. Here, we discuss the utility of cell free protein synthesis (CFPS) as a rapid, convenient and versatile method to screen for activity. We champion the use of CFPS in screening potential designs. Compared to in vivo protein screening, a wider range of different activities can be evaluated using CFPS, and the scale on which it can easily be used (screening tens to hundreds of designed proteins) is ideally suited to current needs. Protein design using physics-based strategies tended to have a relatively low success rate, compared with current machine-learning based methods. Screening steps (such as yeast display) were often used to identify proteins that displayed the desired activity from many designs that were highly ranked computationally. We also describe how CFPS is well-suited to identify the reasons designs fail, which may include problems with transcription, translation, and solubility, in addition to not achieving the desired structure and function.
Affiliation(s)
- Ella Lucille Thornton
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Sarah Maria Paterson
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Michael J. Stam
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Christopher W. Wood
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Nadanai Laohakunakorn
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Lynne Regan
- Centre for Engineering Biology, Institute of Quantitative Biology, Biochemistry and Biotechnology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK
14
|
Son A, Park J, Kim W, Lee W, Yoon Y, Ji J, Kim H. Integrating Computational Design and Experimental Approaches for Next-Generation Biologics. Biomolecules 2024; 14:1073. [PMID: 39334841 PMCID: PMC11430650 DOI: 10.3390/biom14091073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2024] [Revised: 08/13/2024] [Accepted: 08/26/2024] [Indexed: 09/30/2024] Open
Abstract
Therapeutic protein engineering has revolutionized medicine by enabling the development of highly specific and potent treatments for a wide range of diseases. This review examines recent advances in computational and experimental approaches for engineering improved protein therapeutics. Key areas of focus include antibody engineering, enzyme replacement therapies, and cytokine-based drugs. Computational methods like structure-based design, machine learning integration, and protein language models have dramatically enhanced our ability to predict protein properties and guide engineering efforts. Experimental techniques such as directed evolution and rational design approaches continue to evolve, with high-throughput methods accelerating the discovery process. Applications of these methods have led to breakthroughs in affinity maturation, bispecific antibodies, enzyme stability enhancement, and the development of conditionally active cytokines. Emerging approaches like intracellular protein delivery, stimulus-responsive proteins, and de novo designed therapeutic proteins offer exciting new possibilities. However, challenges remain in predicting in vivo behavior, scalable manufacturing, immunogenicity mitigation, and targeted delivery. Addressing these challenges will require continued integration of computational and experimental methods, as well as a deeper understanding of protein behavior in complex physiological environments. As the field advances, we can anticipate increasingly sophisticated and effective protein therapeutics for treating human diseases.
Affiliation(s)
- Ahrum Son
- Department of Molecular Medicine, Scripps Research, La Jolla, CA 92037, USA
- Jongham Park
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Woojin Kim
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Wonseok Lee
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Yoonki Yoon
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Jaeho Ji
- Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Hyunsoo Kim
- Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Protein AI Design Institute, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- SCICS (Sciences for Panomics), 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
15
|
Zlobin A, Smirnov I, Golovin A. Dynamic interchange between two protonation states is characteristic of active sites of cholinesterases. Protein Sci 2024; 33:e5100. [PMID: 39022909 PMCID: PMC11255601 DOI: 10.1002/pro.5100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 05/28/2024] [Accepted: 06/19/2024] [Indexed: 07/20/2024]
Abstract
Cholinesterases are well-known and widely studied enzymes crucial to human health and involved in neurology, Alzheimer's, and lipid metabolism. The protonation pattern of active sites of cholinesterases influences all the chemical processes within, including reaction, covalent inhibition by nerve agents, and reactivation. Despite its significance, our comprehension of the fine structure of cholinesterases remains limited. In this study, we employed enhanced-sampling quantum-mechanical/molecular-mechanical calculations to show that cholinesterases predominantly operate as dynamic mixtures of two protonation states. The proton transfer between two non-catalytic glutamate residues follows the Grotthuss mechanism facilitated by a mediator water molecule. We show that this uncovered complexity of active sites presents a challenge for classical molecular dynamics simulations and calls for special treatment. The calculated proton transfer barrier of 1.65 kcal/mol initiates a discussion on the potential existence of two coupled low-barrier hydrogen bonds in the inhibited form of butyrylcholinesterase. These findings expand our understanding of structural features expressed by highly evolved enzymes and guide future advances in cholinesterase-related protein and drug design studies.
Affiliation(s)
- Alexander Zlobin
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Germany
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Ivan Smirnov
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
- Andrey Golovin
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, Moscow, Russia
- Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, Russia
- Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
16
|
King BR, Sumida KH, Caruso JL, Baker D, Zalatan JG. Computational stabilization of a non-heme iron enzyme enables efficient evolution of new function. bioRxiv 2024:2024.04.18.590141. [PMID: 39091854 PMCID: PMC11290999 DOI: 10.1101/2024.04.18.590141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/04/2024]
Abstract
Directed evolution has emerged as a powerful tool for engineering new biocatalysts. However, introducing new catalytic residues can be destabilizing, and it is generally beneficial to start with a stable enzyme parent. Here we show that the deep learning-based tool ProteinMPNN can be used to redesign Fe(II)/αKG superfamily enzymes for greater stability, solubility, and expression while retaining both native activity and industrially-relevant non-native functions. For the Fe(II)/αKG enzyme tP4H, we performed site-saturation mutagenesis with both the wild-type and stabilized design variant and screened for activity increases in a non-native C-H hydroxylation reaction. We observed substantially larger increases in non-native activity for variants obtained from the stabilized scaffold compared to those from the wild-type enzyme. ProteinMPNN is user-friendly and widely-accessible, and straightforward structural criteria were sufficient to obtain stabilized, catalytically-functional variants of the Fe(II)/αKG enzymes tP4H and GriE. Our work suggests that stabilization by computational sequence redesign could be routinely implemented as a first step in directed evolution campaigns for novel biocatalysts.
Affiliation(s)
- Brianne R King
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- Kiera H Sumida
- Department of Chemistry and Institute for Protein Design, University of Washington, Seattle, Washington 98195, United States
- Jessica L Caruso
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
- David Baker
- Institute for Protein Design, Department of Biochemistry, and Howard Hughes Medical Institute, University of Washington, Seattle, Washington 98195, United States
- Jesse G Zalatan
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
17
|
Birch-Price Z, Hardy FJ, Lister TM, Kohn AR, Green AP. Noncanonical Amino Acids in Biocatalysis. Chem Rev 2024; 124:8740-8786. [PMID: 38959423 PMCID: PMC11273360 DOI: 10.1021/acs.chemrev.4c00120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 06/11/2024] [Accepted: 06/12/2024] [Indexed: 07/05/2024]
Abstract
In recent years, powerful genetic code reprogramming methods have emerged that allow new functional components to be embedded into proteins as noncanonical amino acid (ncAA) side chains. In this review, we will illustrate how the availability of an expanded set of amino acid building blocks has opened a wealth of new opportunities in enzymology and biocatalysis research. Genetic code reprogramming has provided new insights into enzyme mechanisms by allowing introduction of new spectroscopic probes and the targeted replacement of individual atoms or functional groups. NcAAs have also been used to develop engineered biocatalysts with improved activity, selectivity, and stability, as well as enzymes with artificial regulatory elements that are responsive to external stimuli. Perhaps most ambitiously, the combination of genetic code reprogramming and laboratory evolution has given rise to new classes of enzymes that use ncAAs as key catalytic elements. With the framework for developing ncAA-containing biocatalysts now firmly established, we are optimistic that genetic code reprogramming will become a progressively more powerful tool in the armory of enzyme designers and engineers in the coming years.
Affiliation(s)
- Anthony P. Green
- Manchester Institute of Biotechnology, School of Chemistry, University of Manchester, Manchester M1 7DN, U.K.
18
|
Kantroo P, Wagner GP, Machta BB. Pseudo-perplexity in One Fell Swoop for Protein Fitness Estimation. bioRxiv 2024:2024.07.09.602754. [PMID: 39026871 PMCID: PMC11257618 DOI: 10.1101/2024.07.09.602754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Protein language models trained on the masked language modeling objective learn to predict the identity of hidden amino acid residues within a sequence using the remaining observable sequence as context. They do so by embedding the residues into a high dimensional space that encapsulates the relevant contextual cues. These embedding vectors serve as an informative context-sensitive representation that not only aids with the defined training objective, but can also be used for other tasks by downstream models. We propose a scheme to use the embeddings of an unmasked sequence to estimate the corresponding masked probability vectors for all the positions in a single forward pass through the language model. This One Fell Swoop (OFS) approach allows us to efficiently estimate the pseudo-perplexity of the sequence, a measure of the model's uncertainty in its predictions, that can also serve as a fitness estimate. We find that ESM2 OFS pseudo-perplexity performs nearly as well as the true pseudo-perplexity at fitness estimation, and more notably it defines a new state of the art on the ProteinGym Indels benchmark. The strong performance of the fitness measure prompted us to investigate if it could be used to detect the elevated stability reported in reconstructed ancestral sequences. We find that this measure ranks ancestral reconstructions as more fit than extant sequences. Finally, we show that the computational efficiency of the technique allows for the use of Monte Carlo methods that can rapidly explore functional sequence space.
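For orientation, pseudo-perplexity is commonly defined as the exponential of the mean negative log-probability that a masked language model assigns to each true residue given the rest of the sequence. The sketch below illustrates only that final arithmetic step. It assumes the per-position probabilities have already been produced by some model, whether by masking one position at a time or by a single-pass estimate such as the one described in the abstract; the function name `pseudo_perplexity` and the example values are hypothetical and are not taken from the paper.

```python
import math


def pseudo_perplexity(true_residue_probs):
    """Return exp(mean negative log-probability) over sequence positions.

    true_residue_probs: one value per position, each the probability the
    model assigned to the residue actually present there (assumed to be
    computed elsewhere; this function does not run a language model).
    """
    if not true_residue_probs:
        raise ValueError("need at least one position")
    for p in true_residue_probs:
        if not 0.0 < p <= 1.0:
            raise ValueError("probabilities must lie in (0, 1]")
    mean_nll = sum(-math.log(p) for p in true_residue_probs) / len(true_residue_probs)
    return math.exp(mean_nll)


# Illustrative values only: a model that is more confident about the true
# residues yields a lower pseudo-perplexity.
confident = pseudo_perplexity([0.9, 0.8, 0.95])
uncertain = pseudo_perplexity([0.2, 0.1, 0.3])
```

Under this definition a model that spreads probability uniformly over the 20 canonical amino acids gives a pseudo-perplexity of 20, which is a convenient reference point when reading scores.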
Affiliation(s)
- Pranav Kantroo
- Computational Biology and Bioinformatics Program, Yale University, New Haven, CT-06520, USA
- Quantitative Biology Institute, Yale University, New Haven, CT-06520, USA
- Günter P. Wagner
- Emeritus, Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT-06520, USA
- Department of Evolutionary Biology, University of Vienna, Djerassi Platz 1, A-1030 Vienna, Austria
- Hagler Institute for Advanced Studies, Texas A&M, College Station, TX-77843, USA
- Benjamin B. Machta
- Department of Physics, Yale University, New Haven, CT-06520, USA
- Quantitative Biology Institute, Yale University, New Haven, CT-06520, USA
19
|
Li Z, Lou M, Sun C, Li Z. Engineering a Robust UDP-Glucose Pyrophosphorylase for Enhanced Biocatalytic Synthesis via ProteinMPNN and Ancestral Sequence Reconstruction. J Agric Food Chem 2024; 72:15284-15292. [PMID: 38918953 DOI: 10.1021/acs.jafc.4c03126] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/27/2024]
Abstract
UDP-glucose is a key metabolite in carbohydrate metabolism and plays a vital role in glycosyl transfer reactions. Its significance spans the food and agricultural industries. This study focuses on UDP-glucose synthesis via multienzyme catalysis using dextrin, incorporating UTP production and ATP regeneration modules to reduce costs. To address thermal stability limitations of the key UDP-glucose pyrophosphorylase (UGP), a deep learning-based protein sequence design approach and ancestral sequence reconstruction are employed to engineer a thermally stable UGP variant. The engineered UGP variant is 500-fold more thermally stable at 60 °C than the wild-type enzyme, with a half-life of 49.8 h. MD simulations and umbrella sampling calculations provide insights into the mechanism behind the enhanced thermal stability. Experimental validation demonstrates that the engineered UGP variant can produce 52.6 mM UDP-glucose within 6 h in an in vitro cascade reaction. This study offers practical insights for efficient UDP-glucose synthesis methods.
Affiliation(s)
- Zonglin Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Miaozi Lou
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Chuanqi Sun
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Zhimin Li
- State Key Laboratory of Bioreactor Engineering, East China University of Science and Technology, Shanghai 200237, China
- Shanghai Collaborative Innovation Center for Biomanufacturing Technology, East China University of Science and Technology, Shanghai 200237, China
20
|
Kantroo P, Wagner GP, Machta BB. Pseudo-perplexity in One Fell Swoop for Protein Fitness Estimation. arXiv 2024:arXiv:2407.07265v1. [PMID: 39040648 PMCID: PMC11261985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/24/2024]
Abstract
Protein language models trained on the masked language modeling objective learn to predict the identity of hidden amino acid residues within a sequence using the remaining observable sequence as context. They do so by embedding the residues into a high dimensional space that encapsulates the relevant contextual cues. These embedding vectors serve as an informative context-sensitive representation that not only aids with the defined training objective, but can also be used for other tasks by downstream models. We propose a scheme to use the embeddings of an unmasked sequence to estimate the corresponding masked probability vectors for all the positions in a single forward pass through the language model. This One Fell Swoop (OFS) approach allows us to efficiently estimate the pseudo-perplexity of the sequence, a measure of the model's uncertainty in its predictions, that can also serve as a fitness estimate. We find that ESM2 OFS pseudo-perplexity performs nearly as well as the true pseudo-perplexity at fitness estimation, and more notably it defines a new state of the art on the ProteinGym Indels benchmark. The strong performance of the fitness measure prompted us to investigate if it could be used to detect the elevated stability reported in reconstructed ancestral sequences. We find that this measure ranks ancestral reconstructions as more fit than extant sequences. Finally, we show that the computational efficiency of the technique allows for the use of Monte Carlo methods that can rapidly explore functional sequence space.
Affiliation(s)
- Pranav Kantroo
- Computational Biology and Bioinformatics Program, Yale University, New Haven, CT-06520, USA
- Quantitative Biology Institute, Yale University, New Haven, CT-06520, USA
- Günter P. Wagner
- Emeritus, Department of Ecology and Evolutionary Biology, Yale University, New Haven, CT-06520, USA
- Department of Evolutionary Biology, University of Vienna, Djerassi Platz 1, A-1030 Vienna, Austria
- Hagler Institute for Advanced Studies, Texas A&M, College Station, TX-77843, USA
- Benjamin B. Machta
- Department of Physics, Yale University, New Haven, CT-06520, USA
- Quantitative Biology Institute, Yale University, New Haven, CT-06520, USA
21
|
Felbinger N, Ribeiro-Filho H, Pierce B. Proscan: a structure-based proline design web server. Nucleic Acids Res 2024; 52:W280-W286. [PMID: 38769060 PMCID: PMC11223860 DOI: 10.1093/nar/gkae408] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 04/16/2024] [Accepted: 05/01/2024] [Indexed: 05/22/2024] Open
Abstract
The ability to control protein conformations and dynamics through structure-based design has been useful in various scenarios, including engineering of viral antigens for vaccines. One effective design strategy is the substitution of residues with proline, which owing to its unique cyclic side chain can favor and rigidify key backbone conformations. To provide the community with a means to readily identify and explore proline designs for target proteins of interest, we developed the Proscan web server. Proscan provides assessment of backbone angles, energetic and deep learning-based favorability scores, and other parameters for proline substitutions at each position of an input structure, along with interactive visualization of backbone angles and candidate substitution sites on structures. It identifies known favorable proline substitutions for viral antigens, and was benchmarked against datasets of proline substitution stability effects from deep mutational scanning and thermodynamic measurements. This tool can enable researchers to identify and prioritize designs for prospective vaccine antigen targets, or other designs to favor stability of key protein conformations. Proscan is available at: https://proscan.ibbr.umd.edu.
Affiliation(s)
- Nathaniel Felbinger
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
- Helder V Ribeiro-Filho
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Brazilian Biosciences National Laboratory, Brazilian Center for Research in Energy and Materials, Campinas 13083-100, Brazil
- Brian G Pierce
- University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD 20850, USA
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA
22
|
Shanker VR, Bruun TUJ, Hie BL, Kim PS. Unsupervised evolution of protein and antibody complexes with a structure-informed language model. Science 2024; 385:46-53. [PMID: 38963838 DOI: 10.1126/science.adk8946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2023] [Accepted: 05/29/2024] [Indexed: 07/06/2024]
Abstract
Large language models trained on sequence information alone can learn high-level principles of protein design. However, beyond sequence, the three-dimensional structures of proteins determine their specific function, activity, and evolvability. Here, we show that a general protein language model augmented with protein structure backbone coordinates can guide evolution for diverse proteins without the need to model individual functional tasks. We also demonstrate that ESM-IF1, which was only trained on single-chain structures, can be extended to engineer protein complexes. Using this approach, we screened about 30 variants of two therapeutic clinical antibodies used to treat severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. We achieved up to 25-fold improvement in neutralization and 37-fold improvement in affinity against antibody-escaped viral variants of concern BQ.1.1 and XBB.1.5, respectively. These findings highlight the advantage of integrating structural information to identify efficient protein evolution trajectories without requiring any task-specific training data.
Affiliation(s)
- Varun R Shanker
- Stanford Biophysics Program, Stanford University School of Medicine, Stanford, CA 94305, USA
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Theodora U J Bruun
- Stanford Medical Scientist Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Brian L Hie
- Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Peter S Kim
- Sarafan ChEM-H, Stanford University, Stanford, CA 94305, USA
- Department of Biochemistry, Stanford University School of Medicine, Stanford, CA 94305, USA
- Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
23
|
Nestl BM, Nebel BA, Resch V, Schürmann M, Tischler D. The Development and Opportunities of Predictive Biotechnology. Chembiochem 2024; 25:e202300863. [PMID: 38713151 DOI: 10.1002/cbic.202300863] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Revised: 04/05/2024] [Indexed: 05/08/2024]
Abstract
Recent advances in bioeconomy allow a holistic view of existing and new process chains and enable novel production routines continuously advanced by academia and industry. All this progress benefits from a growing number of prediction tools that have found their way into the field. For example, automated genome annotations, tools for building model structures of proteins, and structural protein prediction methods such as AlphaFold2™ or RoseTTAFold have gained popularity in recent years. Recently, it has become apparent that more and more AI-based tools are being developed and used for biocatalysis and biotechnology. This is an excellent opportunity for academia and industry to accelerate advancements in the field further. Biotechnology, as a rapidly growing interdisciplinary field, stands to benefit greatly from these developments.
Affiliation(s)
- Bettina M Nestl
- Joint working group on biotransformations of the Association for General and Applied Microbiology VAAM, the Society for Chemical Engineering, Biotechnology DECHEMA, Theodor-Heuss-Allee 25, 60486, Frankfurt, Germany
- Innophore GmbH, Am Eisernen Tor 3, 8010, Graz, Austria
- Bernd A Nebel
- Innophore GmbH, Am Eisernen Tor 3, 8010, Graz, Austria
- Verena Resch
- Innophore GmbH, Am Eisernen Tor 3, 8010, Graz, Austria
- Martin Schürmann
- Joint working group on biotransformations of the Association for General and Applied Microbiology VAAM, the Society for Chemical Engineering, Biotechnology DECHEMA, Theodor-Heuss-Allee 25, 60486, Frankfurt, Germany
- InnoSyn B. V., Urmonderbaan 22, 6167 RD, Geleen, The Netherlands
- SynSilico B. V., Urmonderbaan 22, 6167 RD, Geleen, The Netherlands
- Dirk Tischler
- Joint working group on biotransformations of the Association for General and Applied Microbiology VAAM, the Society for Chemical Engineering, Biotechnology DECHEMA, Theodor-Heuss-Allee 25, 60486, Frankfurt, Germany
- Microbial Biotechnology, Ruhr University Bochum, Universitätsstrasse 150, 44780, Bochum, Germany
24
|
Chavas LMG, Coulibaly F, Garriga D. Bridging the microscopic divide: a comprehensive overview of micro-crystallization and in vivo crystallography. IUCrJ 2024; 11:476-485. [PMID: 38958014 PMCID: PMC11220871 DOI: 10.1107/s205225252400513x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 05/30/2024] [Indexed: 07/04/2024]
Abstract
A series of events underscoring the significant advancements in micro-crystallization and in vivo crystallography were held during the 26th IUCr Congress in Melbourne, positioning microcrystallography as a pivotal field within structural biology. Through collaborative discussions and the sharing of innovative methodologies, these sessions outlined frontier approaches in macromolecular crystallography. This review provides an overview of this rapidly moving field in light of the rich dialogues and forward-thinking proposals explored during the congress workshop and microsymposium. These advances in microcrystallography shed light on the potential to reshape current research paradigms and enhance our comprehension of biological mechanisms at the molecular scale.
Affiliation(s)
- Fasséli Coulibaly
- Biomedicine Discovery Institute & Department of Biochemistry and Molecular Biology, Monash University, Clayton, Australia
25
|
Basu S, Subedi U, Tonelli M, Afshinpour M, Tiwari N, Fuentes EJ, Chakravarty S. Assessing the functional roles of coevolving PHD finger residues. Protein Sci 2024; 33:e5065. [PMID: 38923615 PMCID: PMC11201814 DOI: 10.1002/pro.5065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Revised: 04/21/2024] [Accepted: 05/16/2024] [Indexed: 06/28/2024]
Abstract
Although in silico folding based on coevolving residue constraints in the deep-learning era has transformed protein structure prediction, the contributions of coevolving residues to protein folding, stability, and other functions in physical contexts remain to be clarified and experimentally validated. Herein, the PHD finger module, a well-known histone reader with distinct subtypes containing subtype-specific coevolving residues, was used as a model to experimentally assess the contributions of coevolving residues and to clarify their specific roles. The results of the assessment, including proteolysis and thermal unfolding of wildtype and mutant proteins, suggested that coevolving residues have varying contributions, despite their large in silico constraints. Residue positions with large constraints were found to contribute to stability in one subtype but not others. Computational sequence design and generative model-based energy estimates of individual structures were also implemented to complement the experimental assessment. Sequence design and energy estimates distinguish coevolving residues that contribute to folding from those that do not. The results of proteolytic analysis of mutations at positions contributing to folding were consistent with those suggested by sequence design and energy estimation. Thus, we report a comprehensive assessment of the contributions of coevolving residues, as well as a strategy based on a combination of approaches that should enable detailed understanding of the residue contributions in other large protein families.
Affiliation(s)
- Shraddha Basu
- Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
- Ujwal Subedi
- Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
- Marco Tonelli
- National Magnetic Resonance Facility at Madison (NMRFAM), University of Wisconsin‐Madison, Madison, Wisconsin, USA
- Maral Afshinpour
- Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
- Nitija Tiwari
- Department of Biochemistry & Molecular Biology, University of Iowa, Iowa City, Iowa, USA
- Ernesto J. Fuentes
- Department of Biochemistry & Molecular Biology, University of Iowa, Iowa City, Iowa, USA
- Suvobrata Chakravarty
- Department of Chemistry & Biochemistry, South Dakota State University, Brookings, South Dakota, USA
|
26
|
Ma B, Liu D, Wang Z, Zhang D, Jian Y, Zhang K, Zhou T, Gao Y, Fan Y, Ma J, Gao Y, Chen Y, Chen S, Liu J, Li X, Li L. A Top-Down Design Approach for Generating a Peptide PROTAC Drug Targeting Androgen Receptor for Androgenetic Alopecia Therapy. J Med Chem 2024; 67:10336-10349. [PMID: 38836467 DOI: 10.1021/acs.jmedchem.4c00828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2024]
Abstract
While large-scale artificial intelligence (AI) models for protein structure prediction and design are advancing rapidly, the translation of deep learning models for practical macromolecular drug development remains limited. This investigation aims to bridge this gap by combining cutting-edge methodologies to create a novel peptide-based PROTAC drug development paradigm. Using ProteinMPNN and RFdiffusion, we identified binding peptides for androgen receptor (AR) and von Hippel-Lindau (VHL), followed by computational modeling with AlphaFold2-Multimer and ZDOCK to predict spatial interrelationships. Experimental validation confirmed the designed peptide's binding ability to AR and VHL. Transdermal microneedle patching technology was seamlessly integrated for the peptide PROTAC drug delivery in androgenetic alopecia treatment. In summary, our approach provides a generic method for generating peptide PROTACs and offers a practical application for designing potential therapeutic drugs for androgenetic alopecia. This showcases the potential of interdisciplinary approaches in advancing drug development and personalized medicine.
Affiliation(s)
- Bohan Ma
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Donghua Liu
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Zhe Wang
- Institute of Bioengineering, College of Chemical and Biological Engineering, Zhejiang University, Hangzhou, Zhejiang 310000, China
- Dize Zhang
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Yanlin Jian
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Kun Zhang
- Department of Respiratory and Critical Care Medicine, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Tianyang Zhou
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Yibo Gao
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Yizeng Fan
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Jian Ma
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Yang Gao
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Yule Chen
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Si Chen
- School of Medicine, Shanghai University, 99 Shangda Road, Shanghai 200444, China
- Jing Liu
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
- Xiang Li
- School of Pharmacy, Second Military Medical University, 325 Guohe Road, Shanghai 200433, China
- Lei Li
- Department of Urology, The First Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710049, China
|
27
|
Fram B, Su Y, Truebridge I, Riesselman AJ, Ingraham JB, Passera A, Napier E, Thadani NN, Lim S, Roberts K, Kaur G, Stiffler MA, Marks DS, Bahl CD, Khan AR, Sander C, Gauthier NP. Simultaneous enhancement of multiple functional properties using evolution-informed protein design. Nat Commun 2024; 15:5141. [PMID: 38902262 PMCID: PMC11190266 DOI: 10.1038/s41467-024-49119-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2023] [Accepted: 05/24/2024] [Indexed: 06/22/2024]
Abstract
A major challenge in protein design is to augment existing functional proteins with multiple property enhancements. Altering several properties likely necessitates numerous primary sequence changes, and novel methods are needed to accurately predict combinations of mutations that maintain or enhance function. Models of sequence co-variation (e.g., EVcouplings), which leverage extensive information about various protein properties and activities from homologous protein sequences, have proven effective for many applications including structure determination and mutation effect prediction. We apply EVcouplings to computationally design variants of the model protein TEM-1 β-lactamase. Nearly all the 14 experimentally characterized designs were functional, including one with 84 mutations from the nearest natural homolog. The designs also had large increases in thermostability, increased activity on multiple substrates, and nearly identical structure to the wild type enzyme. This study highlights the efficacy of evolutionary models in guiding large sequence alterations to generate functional diversity for protein design applications.
Affiliation(s)
- Benjamin Fram
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
- Yang Su
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Ian Truebridge
- Institute for Protein Innovation, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- AI Proteins, Boston, MA, USA
- Adam J Riesselman
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Program in Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- John B Ingraham
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Alessandro Passera
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Research Institute of Molecular Pathology (IMP), Vienna BioCenter (VBC), Campus-Vienna-Biocenter 1, 1030, Vienna, Austria
- Eve Napier
- School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
- Nicole N Thadani
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Apriori Bio, Cambridge, MA, USA
- Samuel Lim
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Kristen Roberts
- Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA
- Gurleen Kaur
- Selux Diagnostics Inc., 56 Roland Street, Charlestown, MA, USA
- Michael A Stiffler
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Dyno Therapeutics, 343 Arsenal Street, Watertown, MA, USA
- Debora S Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Christopher D Bahl
- Institute for Protein Innovation, Boston, MA, USA
- Division of Hematology/Oncology, Boston Children's Hospital, Harvard Medical School, Boston, MA, USA
- AI Proteins, Boston, MA, USA
- Amir R Khan
- School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
- Division of Newborn Medicine, Boston Children's Hospital, Boston, MA, USA
- Chris Sander
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nicholas P Gauthier
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, USA.
- Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
28
|
Daffern N, Johansson KE, Baumer ZT, Robertson NR, Woojuh J, Bedewitz MA, Davis Z, Wheeldon I, Cutler SR, Lindorff-Larsen K, Whitehead TA. GMMA Can Stabilize Proteins Across Different Functional Constraints. J Mol Biol 2024; 436:168586. [PMID: 38663544 DOI: 10.1016/j.jmb.2024.168586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2024] [Revised: 04/16/2024] [Accepted: 04/17/2024] [Indexed: 05/06/2024]
Abstract
Stabilizing proteins without otherwise hampering their function is a central task in protein engineering and design. PYR1 is a plant hormone receptor that has been engineered to bind diverse small molecule ligands. We sought a set of generalized mutations that would provide stability without affecting functionality for PYR1 variants with diverse ligand-binding capabilities. To do this, we used a global multi-mutant analysis (GMMA) approach, which can identify substitutions that have stabilizing effects and do not lower function. GMMA has the added benefit of finding substitutions that are stabilizing in different sequence contexts, and we hypothesized that applying GMMA to PYR1 with different functionalities would identify this set of generalized mutations. Indeed, conducting FACS and deep sequencing of libraries for PYR1 variants with two different functionalities and applying GMMA identified 5 substitutions that, when inserted into four PYR1 variants that each bind a unique ligand, provided an increase of 2-6 °C in thermal inactivation temperature and no decrease in functionality.
Affiliation(s)
- Nicolas Daffern
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80305, USA
- Kristoffer E Johansson
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Zachary T Baumer
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80305, USA
- Janty Woojuh
- Department of Botany and Plant Sciences, University of California, Riverside, USA
- Matthew A Bedewitz
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80305, USA
- Zoë Davis
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80305, USA
- Ian Wheeldon
- Department of Chemical and Environmental Engineering, University of California, Riverside, USA; Institute for Integrative Genome Biology, University of California, Riverside, Riverside, CA, USA
- Sean R Cutler
- Department of Botany and Plant Sciences, University of California, Riverside, USA; Institute for Integrative Genome Biology, University of California, Riverside, Riverside, CA, USA; Center for Plant Cell Biology, University of California, Riverside, Riverside, CA, USA
- Kresten Lindorff-Larsen
- Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark.
- Timothy A Whitehead
- Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, CO 80305, USA.
|
29
|
Nikolaev A, Kuzmin A, Markeeva E, Kuznetsova E, Ryzhykau YL, Semenov O, Anuchina A, Remeeva A, Gushchin I. Reengineering of a flavin-binding fluorescent protein using ProteinMPNN. Protein Sci 2024; 33:e4958. [PMID: 38501498 PMCID: PMC10949330 DOI: 10.1002/pro.4958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2023] [Revised: 01/12/2024] [Accepted: 02/18/2024] [Indexed: 03/20/2024]
Abstract
Recent advances in machine learning techniques have led to the development of a number of protein design and engineering approaches. One of them, ProteinMPNN, predicts an amino acid sequence that would fold and match a user-defined backbone structure. Its performance was previously tested for proteins composed of standard amino acids, as well as for peptide- and protein-binding proteins. In this short report, we test whether ProteinMPNN can be used to reengineer a non-proteinaceous ligand-binding protein, the flavin-based fluorescent protein CagFbFP. We fixed the native backbone conformation and the identity of 20 amino acids interacting with the chromophore (flavin mononucleotide, FMN) while letting ProteinMPNN predict the rest of the sequence. The software package suggested replacing 36-48 out of the remaining 86 amino acids so that the resulting sequences are 55%-66% identical to the original one. The three designs that we tested experimentally displayed different expression levels, yet all were able to bind FMN and displayed fluorescence, thermal stability, and other properties similar to those of CagFbFP. Our results demonstrate that ProteinMPNN can be used to generate diverging unnatural variants of fluorescent proteins, and, more generally, to reengineer proteins without losing their ligand-binding capabilities.
Affiliation(s)
- Andrey Nikolaev
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alexander Kuzmin
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Elena Markeeva
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Elizaveta Kuznetsova
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Yury L. Ryzhykau
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Frank Laboratory of Neutron Physics, Joint Institute for Nuclear Research, Dubna, Russia
- Oleg Semenov
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Arina Anuchina
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Alina Remeeva
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
- Ivan Gushchin
- Research Center for Molecular Mechanisms of Aging and Age‐Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Russia
|
30
|
Notin P, Rollins N, Gal Y, Sander C, Marks D. Machine learning for functional protein design. Nat Biotechnol 2024; 42:216-228. [PMID: 38361074 DOI: 10.1038/s41587-024-02127-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/01/2023] [Accepted: 01/05/2024] [Indexed: 02/17/2024]
Abstract
Recent breakthroughs in AI coupled with the rapid accumulation of protein sequence and structure data have radically transformed computational protein design. New methods promise to escape the constraints of natural and laboratory evolution, accelerating the generation of proteins for applications in biotechnology and medicine. To make sense of the exploding diversity of machine learning approaches, we introduce a unifying framework that classifies models on the basis of their use of three core data modalities: sequences, structures and functional labels. We discuss the new capabilities and outstanding challenges for the practical design of enzymes, antibodies, vaccines, nanomachines and more. We then highlight trends shaping the future of this field, from large-scale assays to more robust benchmarks, multimodal foundation models, enhanced sampling strategies and laboratory automation.
Affiliation(s)
- Pascal Notin
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Department of Computer Science, University of Oxford, Oxford, UK.
- Yarin Gal
- Department of Computer Science, University of Oxford, Oxford, UK
- Chris Sander
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Broad Institute of Harvard and MIT, Cambridge, MA, USA
- Debora Marks
- Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
- Broad Institute of Harvard and MIT, Cambridge, MA, USA.
|
31
|
Dolorfino M, Samanta R, Vorobieva A. ProteinMPNN Recovers Complex Sequence Properties of Transmembrane β-barrels. bioRxiv 2024:2024.01.16.575764. [PMID: 38352434 PMCID: PMC10862708 DOI: 10.1101/2024.01.16.575764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/19/2024]
Abstract
Recent deep-learning (DL) protein design methods have been successfully applied to a range of protein design problems, including the de novo design of novel folds, protein binders, and enzymes. However, DL methods have yet to meet the challenge of de novo membrane protein (MP) design and the design of complex β-sheet folds. We performed a comprehensive benchmark of one DL protein sequence design method, ProteinMPNN, using transmembrane and water-soluble β-barrel folds as a model, and compared the performance of ProteinMPNN to the new membrane-specific Rosetta Franklin2023 energy function. We tested the effect of input backbone refinement on ProteinMPNN performance and found that given refined and well-defined inputs, ProteinMPNN more accurately captures global sequence properties despite complex folding biophysics. It generates more diverse transmembrane β-barrel (TMB) sequences than Franklin2023 in pore-facing positions. In addition, ProteinMPNN generated TMB sequences that passed state-of-the-art in silico filters for experimental validation, suggesting that the model could be used in de novo design tasks of diverse nanopores for single-molecule sensing and sequencing. Lastly, our results indicate that the low success rate of ProteinMPNN for the design of β-sheet proteins stems from backbone input accuracy rather than software limitations.
Affiliation(s)
- Marissa Dolorfino
- Structural Biology Brussel, Vrije Universiteit Brussel, Brussels, Belgium
- VUB-VIB Center for Structural Biology, Brussels, Belgium
- Anastassia Vorobieva
- Structural Biology Brussel, Vrije Universiteit Brussel, Brussels, Belgium
- VUB-VIB Center for Structural Biology, Brussels, Belgium
- VIB Center for AI and Computational Biology, Belgium
|