1
Liu D, Dou W, Song H, Deng H, Tian Z, Chen R, Liu Z, Jiao Z, Akhberdi O. Insights into the functional mechanism of the non-specific lipid transfer protein nsLTP in Kalanchoe fedtschenkoi (Lavender scallops). Protein Expr Purif 2025; 226:106607. [PMID: 39260807 DOI: 10.1016/j.pep.2024.106607] [Received: 07/18/2024] [Revised: 09/07/2024] [Accepted: 09/07/2024] [Indexed: 09/13/2024]
Abstract
Plant non-specific lipid transfer protein (nsLTP) is able to bind and transport lipids and essential oils, as well as engage in various physiological processes, including defense against phytopathogens. Kalanchoe fedtschenkoi (Lavender Scallops) is an attractive and versatile succulent. To investigate the functional mechanism of Kalanchoe fedtschenkoi nsLTP (Ka-nsLTP), we expressed, purified and successfully obtained monomeric Ka-nsLTP. Mutational experiments revealed that the C6A variant retained the same activity as the wild-type (WT) Ka-nsLTP. Ka-nsLTP showed weak antiphytopathogenic bacterial activity, but inhibited fungal growth. Ka-nsLTP possessed a hydrophobic cavity effectively binding lauric acid. Our results offer novel molecular insights into the functional mechanism of nsLTP, which broadens our knowledge of the biological function of nsLTP in crops and provides a useful locus for genetic improvement of plants.
Affiliation(s)
- Dafeng Liu
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China; School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China.
- Wenrui Dou
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China
- Hongying Song
- School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Huashui Deng
- School of Life Sciences, Xiamen University, Xiamen, 361102, Fujian, China
- Zhu Tian
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China
- Rong Chen
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China
- Zhen Liu
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China
- Ziwei Jiao
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China.
- Oren Akhberdi
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, 835000, Xinjiang, China.
2
Madaj R, Martinez-Goikoetxea M, Kaminski K, Ludwiczak J, Dunin-Horkawicz S. Applicability of AlphaFold2 in the modeling of dimeric, trimeric, and tetrameric coiled-coil domains. Protein Sci 2025; 34:e5244. [PMID: 39688306 DOI: 10.1002/pro.5244] [Received: 03/12/2024] [Revised: 10/10/2024] [Accepted: 11/20/2024] [Indexed: 12/18/2024]
Abstract
Coiled coils are a common protein structural motif involved in cellular functions ranging from mediating protein-protein interactions to facilitating processes such as signal transduction or regulation of gene expression. They are formed by two or more alpha helices that wind around a central axis to form a buried hydrophobic core. Various forms of coiled-coil bundles have been reported, each characterized by the number, orientation, and degree of winding of the constituent helices. This variability is underpinned by short sequence repeats that form coiled coils and whose properties determine both their overall topology and the local geometry of the hydrophobic core. The strikingly repetitive sequence has enabled the development of accurate sequence-based coiled-coil prediction methods; however, the modeling of coiled-coil domains remains a challenging task. In this work, we evaluated the accuracy of AlphaFold2 in modeling coiled-coil domains, both in modeling local geometry and in predicting global topological properties. Furthermore, we show that the prediction of the oligomeric state of coiled-coil bundles can be achieved by using the internal representations of AlphaFold2, with a performance better than any previous state-of-the-art method (code available at https://github.com/labstructbioinf/dc2_oligo).
Affiliation(s)
- Rafal Madaj
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Kamil Kaminski
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Jan Ludwiczak
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Stanislaw Dunin-Horkawicz
- Institute of Evolutionary Biology, Faculty of Biology, Biological and Chemical Research Centre, University of Warsaw, Warsaw, Poland
- Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen, Germany
3
Guo W, Ren K, Long Z, Fu X, Zhang J, Liu M, Chen Y. Efficient screening and discovery of umami peptides in Douchi enhanced by molecular dynamics simulations. Food Chem X 2024; 24:101940. [PMID: 39559460 PMCID: PMC11570484 DOI: 10.1016/j.fochx.2024.101940] [Received: 09/13/2024] [Revised: 10/24/2024] [Accepted: 10/28/2024] [Indexed: 11/20/2024]
Abstract
In this study, a partial least squares discriminant analysis (PLS-DA) model for umami peptides was constructed based on molecular dynamics simulation data, achieving an R² value of 0.949 and a Q² value of 0.558. Using this novel model and bioinformatics screening methods, five new umami peptides (EALEATAQ, SPPTEE, SEEG, KEE, and FEE, with umami taste thresholds of 0.139, 0.085, 0.096, 0.060, and 0.079 mg/mL, respectively) were identified in Douchi. Molecular docking revealed that the residues ASN150 of T1R1, as well as SER170, GLU301 and GLN389 of T1R3, might be key amino acid residues for the binding of umami peptides to T1R1/T1R3. Molecular dynamics simulations revealed significant differences in the root-mean-square fluctuation (RMSF) values between the two complex systems of umami peptides-T1R1/T1R3 and non-umami peptides-T1R1/T1R3. The newly constructed umami peptide discriminant model can improve the accuracy of umami peptide screening and enhance the efficiency of discovering new umami peptides.
Affiliation(s)
- Weidan Guo
- College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Kangzi Ren
- College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Zhao Long
- College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Xiangjin Fu
- College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Seasonings Green Manufacturing Engineering Technology Research Center of Hunan Province, Hunan Huixiangxuan Bio. Tech. Ltd. Com., Liuyang 410323, China
- Jianan Zhang
- College of Food Science and Engineering, Central South University of Forestry and Technology, Changsha 410004, China
- Min Liu
- Seasonings Green Manufacturing Engineering Technology Research Center of Hunan Province, Hunan Huixiangxuan Bio. Tech. Ltd. Com., Liuyang 410323, China
- Yaquan Chen
- Hunan Xiangdian Food Ltd. Com, Liuyang 410301, China
4
Xu J, Wang Y. Generating Multistate Conformations of P-type ATPases with a Conditional Diffusion Model. J Chem Inf Model 2024; 64:9227-9239. [PMID: 39480276 DOI: 10.1021/acs.jcim.4c01519] [Indexed: 12/11/2024]
Abstract
Understanding and predicting the diverse conformational states of membrane proteins is essential for elucidating their biological functions. Despite advancements in computational methods, accurately capturing these complex structural changes remains a significant challenge. Here, we introduce a computational approach to generate diverse and biologically relevant conformations of membrane proteins using a conditional diffusion model. Our approach integrates forward and backward diffusion processes, incorporating state classifiers and additional conditioners to control the generation gradient of conformational states. We specifically targeted the P-type ATPases, a critical family of membrane transporters, and constructed a comprehensive data set through a combination of experimental structures and molecular dynamics simulations. Our model, incorporating a graph neural network with specialized membrane constraints, demonstrates exceptional accuracy in generating a wide range of P-type ATPase conformations associated with different functional states. This approach represents a meaningful step forward in the computational generation of membrane protein conformations using AI and holds promise for studying the dynamics of other membrane proteins.
Affiliation(s)
- Jingtian Xu
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
- Yong Wang
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
5
Chen C, Zhang Z, Duan M, Wu Q, Yang M, Jiang L, Liu M, Li C. Aromatic-aromatic interactions drive fold switch of GA95 and GB95 with three residue difference. Chem Sci 2024:d4sc04951a. [PMID: 39720130 PMCID: PMC11665817 DOI: 10.1039/d4sc04951a] [Received: 07/25/2024] [Accepted: 12/17/2024] [Indexed: 12/26/2024]
Abstract
Proteins typically adopt a single fold to carry out their function, but metamorphic proteins, with multiple folding states, defy this norm. Deciphering the mechanism of conformational interconversion of metamorphic proteins is challenging. Herein, we employed nuclear magnetic resonance (NMR), circular dichroism (CD), and all-atom molecular dynamics (MD) simulations to elucidate the mechanism of fold switching in proteins GA95 and GB95, which share 95% sequence homology. The results reveal that long-range interactions, especially aromatic π-π interactions involving residues F52, Y45, F30, and Y29, are critical for the protein switching from a 3α to a 4β + α fold. This study contributes to understanding how proteins with highly similar sequences fold into distinct conformations and may provide valuable insights into the protein folding code.
Affiliation(s)
- Chen Chen
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
- Zeting Zhang
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
- Mojie Duan
- Interdisciplinary Institute of NMR and Molecular Sciences, School of Chemistry and Chemical Engineering, The State Key Laboratory of Refractories and Metallurgy, Wuhan University of Science and Technology Wuhan 430081 China
- Qiong Wu
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Minghui Yang
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
- Ling Jiang
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
- Maili Liu
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
- Conggang Li
- Key Laboratory of Magnetic Resonance in Biological Systems, State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan National Laboratory for Optoelectronics, Wuhan Institute of Physics and Mathematics, Innovation Academy of Precision Measurement, Chinese Academy of Sciences Wuhan 430071 China
- Graduate University of Chinese Academy of Sciences Beijing 100049 China
6
Ille AM, Markosian C, Burley SK, Pasqualini R, Arap W. Prediction of peptide structural conformations with AlphaFold2. bioRxiv 2024:2024.12.03.626727. [PMID: 39677766 PMCID: PMC11642853 DOI: 10.1101/2024.12.03.626727] [Indexed: 12/17/2024]
Abstract
Protein structure prediction via artificial intelligence/machine learning (AI/ML) approaches has sparked substantial research interest in structural biology and adjacent disciplines. More recently, AlphaFold2 (AF2) has been adapted for the prediction of multiple structural conformations in addition to single-state structures. This novel avenue of research has focused on proteins (typically 50 residues in length or greater), while multi-conformation prediction of shorter peptides has not yet been explored in this context. Here, we report AF2-based structural conformation prediction of a total of 557 peptides (ranging in length from 10 to 40 residues) for a benchmark dataset with corresponding nuclear magnetic resonance (NMR)-determined conformational ensembles. De novo structure predictions were accompanied by structural comparison analyses to assess prediction accuracy. We found that the prediction of conformational ensembles for peptides with AF2 varied in accuracy versus NMR data, with average root-mean-square deviation (RMSD) among structured regions under 2.5 Å and average root-mean-square fluctuation (RMSF) differences under 1.5 Å. Our results reveal notable capabilities of AF2-based structural conformation prediction for peptides but also underscore the necessity for interpretation discretion.
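The two accuracy measures named in this abstract, RMSD and RMSF, can be illustrated with a minimal sketch. It assumes coordinates are plain lists of (x, y, z) tuples in Å that have already been superimposed. The function names `rmsd` and `rmsf` are illustrative and are not taken from the preprint, which does not describe its implementation here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-aligned conformations.

    Each argument is a list of (x, y, z) tuples with the same atom order.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsf(ensemble):
    """Per-atom root-mean-square fluctuation over an ensemble.

    `ensemble` is a list of conformations; each conformation is a list of
    (x, y, z) tuples. Returns one value per atom, measured around that
    atom's mean position across the ensemble.
    """
    n_models = len(ensemble)
    n_atoms = len(ensemble[0])
    fluctuations = []
    for i in range(n_atoms):
        mean = [sum(model[i][k] for model in ensemble) / n_models for k in range(3)]
        mean_sq = sum(
            sum((model[i][k] - mean[k]) ** 2 for k in range(3)) for model in ensemble
        ) / n_models
        fluctuations.append(math.sqrt(mean_sq))
    return fluctuations
```

Comparing a predicted ensemble with an NMR ensemble would then amount to computing `rmsf` for each and taking per-residue differences. The superposition step is deliberately left out of this sketch.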
7
Zhang N, Sood D, Guo SC, Chen N, Antoszewski A, Marianchuk T, Dey S, Xiao Y, Hong L, Peng X, Baxa M, Partch C, Wang LP, Sosnick TR, Dinner AR, LiWang A. Temperature-dependent fold-switching mechanism of the circadian clock protein KaiB. Proc Natl Acad Sci U S A 2024; 121:e2412327121. [PMID: 39671178 DOI: 10.1073/pnas.2412327121] [Received: 07/01/2024] [Accepted: 10/24/2024] [Indexed: 12/14/2024]
Abstract
The oscillator of the cyanobacterial circadian clock relies on the ability of the KaiB protein to switch reversibly between a stable ground-state fold (gsKaiB) and an unstable fold-switched fold (fsKaiB). Rare fold-switching events by KaiB provide a critical delay in the negative feedback loop of this posttranslational oscillator. In this study, we experimentally and computationally investigate the temperature dependence of fold switching and its mechanism. We demonstrate that the stability of gsKaiB increases with temperature compared to fsKaiB and that the Q10 value for the gsKaiB → fsKaiB transition is nearly three times smaller than that for the reverse transition in a construct optimized for NMR studies. Simulations and native-state hydrogen-deuterium exchange NMR experiments suggest that fold switching can involve both partially and completely unfolded intermediates. The simulations predict that the transition state for fold switching coincides with isomerization of conserved prolines in the most rapidly exchanging region, and we confirm experimentally that proline isomerization is a rate-limiting step for fold switching. We explore the implications of our results for temperature compensation, a hallmark of circadian clocks, through a kinetic model.
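The Q10 value mentioned in this abstract is the standard temperature coefficient: the factor by which a rate changes for a 10 °C rise in temperature. A minimal sketch of that textbook definition follows. The function name and the example rates are illustrative assumptions, not values from the paper.

```python
def q10(rate_low, rate_high, temp_low, temp_high):
    """Temperature coefficient Q10 = (k2 / k1) ** (10 / (T2 - T1)).

    Rates share any one unit; temperatures are in degrees Celsius
    (or kelvin, since only the difference matters).
    """
    if temp_high == temp_low:
        raise ValueError("the two temperatures must differ")
    return (rate_high / rate_low) ** (10.0 / (temp_high - temp_low))


# Hypothetical rates, for illustration only: a rate that doubles
# between 20 and 30 degrees has Q10 = 2.
example = q10(1.0, 2.0, 20.0, 30.0)
```

A temperature-compensated process, which the abstract describes as a hallmark of circadian clocks, has an overall Q10 close to 1 even when its individual steps do not.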
Affiliation(s)
- Ning Zhang
- Department of Chemistry and Biochemistry, University of California, Merced, CA 95343
- Damini Sood
- Department of Chemistry and Biochemistry, University of California, Merced, CA 95343
- Spencer C Guo
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, IL 60637
- Nanhao Chen
- Department of Chemistry, University of California, Davis, CA 95616
- Adam Antoszewski
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, IL 60637
- Tegan Marianchuk
- Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637
- Supratim Dey
- Department of Chemistry and Biochemistry, University of California, Merced, CA 95343
- Yunxian Xiao
- Department of Chemistry and Biochemistry, University of California, Merced, CA 95343
- Lu Hong
- Graduate Program in Biophysical Sciences, University of Chicago, Chicago, IL 60637
- Xiangda Peng
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
- Michael Baxa
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
- Carrie Partch
- Department of Chemistry and Biochemistry, University of California, Santa Cruz, CA 95064
- Lee-Ping Wang
- Department of Chemistry, University of California, Davis, CA 95616
- Tobin R Sosnick
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
- Aaron R Dinner
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, IL 60637
- Andy LiWang
- Department of Chemistry and Biochemistry, University of California, Merced, CA 95343
- Center for Cellular and Biomolecular Machines, University of California, Merced, CA 95343
8
Xu Z, Schahl A, Jolivet MD, Legrand A, Grélard A, Berbon M, Morvan E, Lagardere L, Piquemal JP, Loquet A, Germain V, Chavent M, Mongrand S, Habenstein B. Dynamic pre-structuration of lipid nanodomain-segregating remorin proteins. Commun Biol 2024; 7:1620. [PMID: 39639105 PMCID: PMC11621693 DOI: 10.1038/s42003-024-07330-y] [Received: 05/16/2024] [Accepted: 11/28/2024] [Indexed: 12/07/2024]
Abstract
Remorins are multifunctional proteins, regulating immunity, development and symbiosis in plants. When associating to the membrane, remorins sequester specific lipids into functional membrane nanodomains. The multigenic protein family contains six groups, classified upon their protein-domain composition. Membrane targeting of remorins occurs independently from the secretory pathway. Instead, they are directed into different nanodomains depending on their phylogenetic group. All family members contain a C-terminal membrane anchor and a homo-oligomerization domain, flanked by an intrinsically disordered region of variable length at the N-terminal end. We here combined molecular imaging, NMR spectroscopy, protein structure calculations and advanced molecular dynamics simulation to unveil a stable pre-structuration of coiled-coil dimers as nanodomain-targeting units, containing a tunable fuzzy coat and a bar code-like positive surface charge before membrane association. Our data suggest that remorins fold in the cytosol with the N-terminal disordered region as a structural ensemble around a dimeric anti-parallel coiled-coil core containing a symmetric interface motif reminiscent of a hydrophobic Leucine zipper. The domain geometry, the charge distribution in the coiled-coil remorins and the differences in structures and dynamics between C-terminal lipid anchors of the remorin groups provide a selective platform for phospholipid binding when encountering the membrane surface.
Affiliation(s)
- Zeren Xu
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France
- Adrien Schahl
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, Université Paul Sabatier, 31400, Toulouse, France
- Sorbonne Université, LCT, UMR7616 CNRS, 75005 Paris, France; Qubit Pharmaceuticals, Advanced Research Department, 75014, Paris, France
- Marie-Dominique Jolivet
- Laboratoire de Biogenèse Membranaire (LBM) UMR-5200, CNRS-Univ. Bordeaux, F-33140, Villenave d'Ornon, France
- Anthony Legrand
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France
- Axelle Grélard
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France
- Mélanie Berbon
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France
- Estelle Morvan
- Univ. Bordeaux, CNRS, Inserm, IECB, UAR3033, US01, Pessac, France
- Louis Lagardere
- Sorbonne Université, LCT, UMR7616 CNRS, 75005 Paris, France; Qubit Pharmaceuticals, Advanced Research Department, 75014, Paris, France
- Jean-Philip Piquemal
- Sorbonne Université, LCT, UMR7616 CNRS, 75005 Paris, France; Qubit Pharmaceuticals, Advanced Research Department, 75014, Paris, France
- Antoine Loquet
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France
- Véronique Germain
- Laboratoire de Biogenèse Membranaire (LBM) UMR-5200, CNRS-Univ. Bordeaux, F-33140, Villenave d'Ornon, France
- Matthieu Chavent
- Institut de Pharmacologie et de Biologie Structurale, Université de Toulouse, CNRS, Université Paul Sabatier, 31400, Toulouse, France.
- Laboratoire de Microbiologie et Génétique Moléculaires (LMGM), Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, UPS, Toulouse, France.
- Sébastien Mongrand
- Laboratoire de Biogenèse Membranaire (LBM) UMR-5200, CNRS-Univ. Bordeaux, F-33140, Villenave d'Ornon, France.
- Birgit Habenstein
- Univ. Bordeaux, CNRS, Bordeaux INP, CBMN, UMR 5248, IECB, F-33600, Pessac, France.
9
Liu D, Song H, Deng H, Abdiriyim A, Zhang L, Jiao Z, Li X, Liu L, Bai S. Insights into the functional mechanisms of three terpene synthases from Lavandula angustifolia (Lavender). Front Plant Sci 2024; 15:1497345. [PMID: 39691479 PMCID: PMC11649398 DOI: 10.3389/fpls.2024.1497345] [Received: 09/19/2024] [Accepted: 11/11/2024] [Indexed: 12/19/2024]
Abstract
Lavender species are of significant economic value being cultivated extensively worldwide for their essential oils (EOs), which include terpenes that play crucial roles in the cosmetic, personal care, and pharmaceutical industries. The terpene synthases in lavender, such as Lavandula angustifolia linalool synthase (LaLINS), limonene synthase (LaLIMS), and bergamotene synthase (LaBERS), are key enzymes in terpene biosynthesis. However, the functional mechanisms underlying these enzymes remain poorly understood. Here, we used AlphaFold2 to predict the three-dimensional structures of LaLINS, LaLIMS, and LaBERS. The hydrodynamic radii of LaLINS, LaLIMS, and LaBERS were 5.7 ± 0.2, 6.2 ± 0.3, and 5.4 ± 0.2 nm, respectively. Mutations D320A or D324A led to a complete loss of activity in LaLINS compared to the wild-type (WT) enzyme; similarly, mutations D356A or D360A abolished activity in LaLIMS, and D291A or D295A eliminated activity in LaBERS. Furthermore, the genes LaLINS, LaLIMS, and LaBERS exhibited significantly higher expression levels in leaves compared to stems and flowers, with peak expression occurring at 8:00 a.m. Our findings contribute to a deeper understanding of terpene biosynthesis in lavender and offer insights for improving essential oil production through genetic engineering.
Affiliation(s)
- Dafeng Liu
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- School of Life Sciences, Xiamen University, Xiamen, Fujian, China
- Hongying Song
- School of Life Sciences, Xiamen University, Xiamen, Fujian, China
- Huashui Deng
- School of Life Sciences, Xiamen University, Xiamen, Fujian, China
- Ablikim Abdiriyim
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- Lvxia Zhang
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- Ziwei Jiao
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- Xueru Li
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- Lu Liu
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
- Shuangqin Bai
- Xinjiang Key Laboratory of Lavender Conservation and Utilization, College of Biological Sciences and Technology, Yili Normal University, Yining, Xinjiang, China
10
Raisinghani N, Parikh V, Foley B, Verkhivker G. AlphaFold2-Based Characterization of Apo and Holo Protein Structures and Conformational Ensembles Using Randomized Alanine Sequence Scanning Adaptation: Capturing Shared Signature Dynamics and Ligand-Induced Conformational Changes. Int J Mol Sci 2024; 25:12968. [PMID: 39684679 DOI: 10.3390/ijms252312968] [Received: 11/04/2024] [Revised: 11/24/2024] [Accepted: 11/29/2024] [Indexed: 12/18/2024]
Abstract
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use an AlphaFold2 adaptation that combines randomized alanine sequence masking with shallow multiple sequence alignment subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while also displaying notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins are defined by the structural topology of the fold and favor conserved conformational motions driven by soft modes. Our findings provide evidence that AlphaFold2 combined with randomized alanine sequence masking can yield accurate and consistent results in predicting moderate conformational adjustments between apo and holo states, especially for proteins with localized changes upon ligand binding. For large hinge-like domain movements, the proposed approach can predict functional conformations characteristic of both apo and ligand-bound holo ensembles in the absence of ligand information. These results are relevant for using this AlphaFold adaptation for probing conformational selection mechanisms according to which proteins can adopt multiple conformations, including those that are competent for ligand binding. The results of this study indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and the detection of high-energy conformations. By incorporating a wider variety of protein structures in training datasets, including both apo and holo forms, the model can learn to recognize and predict the structural changes that occur upon ligand binding.
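The idea of randomized alanine sequence masking can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the masking fraction, the sampling scheme, or whether the masking is applied to the query sequence or to the alignment, so the function `alanine_mask` and its `fraction` and `seed` parameters are hypothetical and not the authors' code.

```python
import random


def alanine_mask(sequence, fraction, seed=None):
    """Return a copy of `sequence` with a random subset of positions set to 'A'.

    `fraction` (0.0 to 1.0) is the share of positions to replace; it is an
    assumed parameter, not a value from the paper. `seed` makes the choice
    of positions reproducible.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_mask = round(fraction * len(sequence))
    positions = rng.sample(range(len(sequence)), n_mask)
    residues = list(sequence)
    for pos in positions:
        residues[pos] = "A"  # alanine substitution at the sampled position
    return "".join(residues)


# Several independently masked variants of one toy sequence; each variant
# would be submitted to structure prediction separately.
variants = [alanine_mask("MKTLLVQEGW", 0.3, seed=s) for s in range(5)]
```

Drawing many variants with different seeds is what would give an ensemble of perturbed inputs, and hence a spread of predicted conformations.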
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Vedant Parikh
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Brandon Foley
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
11
Totaro MG, Vide U, Zausinger R, Winkler A, Oberdorfer G. ESM-scan - A tool to guide amino acid substitutions. Protein Sci 2024; 33:e5221. [PMID: 39565080 PMCID: PMC11577456 DOI: 10.1002/pro.5221] [Received: 03/15/2024] [Revised: 09/27/2024] [Accepted: 10/28/2024] [Indexed: 11/21/2024]
Abstract
Protein structure prediction and (re)design have gone through a revolution in the last 3 years. The tremendous progress in these fields has been almost exclusively driven by readily available machine learning algorithms applied to protein folding and sequence design problems. Despite these advancements, predicting site-specific mutational effects on protein stability and function remains an unsolved problem. This is a persistent challenge, mainly because the free energy of large systems is very difficult to compute with absolute accuracy and subtle changes to protein structures are hard to capture with computational models. Here, we describe the implementation and use of ESM-Scan, which uses the ESM zero-shot predictor to scan entire protein sequences for preferential amino acid changes, thus enabling in silico deep mutational scanning experiments. We benchmark ESM-Scan on its predictive capabilities for stability and functionality of sequence changes using three publicly available datasets and proceed by experimentally testing the tool's performance on a challenging test case of a blue-light-activated diguanylate cyclase from Methylotenera species (MsLadC), where it accurately predicted the importance of a highly conserved residue in a region involved in allosteric product inhibition. Our experimental results show that the ESM zero-shot model can infer the effects of a set of amino acid substitutions, as reflected in the correlation between predicted fitness and experimental results. ESM-Scan is publicly available at https://huggingface.co/spaces/thaidaev/zsp.
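An in silico deep mutational scan of the kind described here amounts to enumerating every single-residue substitution and ranking the variants by a model score. The Python sketch below shows that loop under stated assumptions: `scan_substitutions` and `toy_score` are hypothetical names, and `toy_score` is a placeholder that merely penalizes proline, standing in for a real predictor. It does not reproduce the ESM-Scan implementation or call any ESM model.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def scan_substitutions(sequence, score_fn):
    """Score every single substitution and return (label, score), best first."""
    results = []
    for pos, wt in enumerate(sequence, start=1):
        for mut in AMINO_ACIDS:
            if mut != wt:
                label = f"{wt}{pos}{mut}"  # e.g. "K2R": wild type, position, mutant
                results.append((label, score_fn(sequence, pos, mut)))
    return sorted(results, key=lambda r: r[1], reverse=True)


def toy_score(sequence, pos, mut):
    """Placeholder scorer (assumption): penalize proline, neutral otherwise."""
    return -1.0 if mut == "P" else 0.0
```

In practice the scoring callable would wrap a language-model prediction; keeping it as a parameter lets the scan loop stay independent of any particular model.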
Affiliation(s)
| | - Uršula Vide
- Institute of Biochemistry, Graz University of Technology, Graz, Austria
| | - Regina Zausinger
- Institute of Biochemistry, Graz University of Technology, Graz, Austria
| | - Andreas Winkler
- Institute of Biochemistry, Graz University of Technology, Graz, Austria
- BioTechMed, Graz, Austria
| | - Gustav Oberdorfer
- Institute of Biochemistry, Graz University of Technology, Graz, Austria
- BioTechMed, Graz, Austria
| |
|
12
|
Li C, Luo Y, Xie Y, Zhang Z, Liu Y, Zou L, Xiao F. Structural and functional prediction, evaluation, and validation in the post-sequencing era. Comput Struct Biotechnol J 2024; 23:446-451. [PMID: 38223342 PMCID: PMC10787220 DOI: 10.1016/j.csbj.2023.12.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 12/20/2023] [Accepted: 12/22/2023] [Indexed: 01/16/2024] Open
Abstract
The surge of genome sequencing data has revealed a substantial number of genetic variants of uncertain significance (VUS). The decryption of VUS discovered by sequencing poses a major challenge in the post-sequencing era. Although experimental assays have progressed in classifying VUS, only a tiny fraction of human genes have been explored experimentally. Thus, state-of-the-art in silico functional predictors of VUS are urgently needed. Artificial intelligence (AI) is an invaluable tool to assist in the identification of VUS with high efficiency and accuracy. An increasing number of studies indicate that AI has brought an exciting acceleration in the interpretation of VUS, and our group has already used AI to develop protein structure-based prediction models. In this review, we provide an overview of the previous research on AI-based prediction of missense variants, and elucidate the challenges and opportunities for protein structure-based variant prediction in the post-sequencing era.
Affiliation(s)
- Chang Li
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Yixuan Luo
- Beijing Normal University, Beijing, China
| | - Yibo Xie
- Information Center, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Zaifeng Zhang
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Ye Liu
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Lihui Zou
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
| | - Fei Xiao
- Clinical Biobank, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- The Key Laboratory of Geriatrics, Beijing Institute of Geriatrics, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China
- Beijing Normal University, Beijing, China
| |
|
13
|
Opuni KFM, Ruß M, Geens R, Vocht LD, Wielendaele PV, Debuy C, Sterckx YGJ, Glocker MO. Mass spectrometry-complemented molecular modeling predicts the interaction interface for a camelid single-domain antibody targeting the Plasmodium falciparum circumsporozoite protein's C-terminal domain. Comput Struct Biotechnol J 2024; 23:3300-3314. [PMID: 39296809 PMCID: PMC11409006 DOI: 10.1016/j.csbj.2024.08.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2024] [Revised: 08/26/2024] [Accepted: 08/26/2024] [Indexed: 09/21/2024] Open
Abstract
Background: Bioanalytical methods that enable rapid and high-detail characterization of binding specificities and strengths of protein complexes with low sample consumption are highly desired. The interaction between a camelid single domain antibody (sdAbCSP1) and its target antigen (PfCSP-Cext) was selected as a model system to provide proof-of-principle for the methodology described here. Research design and methods: The structure of the sdAbCSP1 - PfCSP-Cext complex was modeled using AlphaFold2. The recombinantly expressed proteins, sdAbCSP1, PfCSP-Cext, and the sdAbCSP1 - PfCSP-Cext complex, were subjected to limited proteolysis and mass spectrometric peptide analysis. ITEM MS (Intact Transition Epitope Mapping Mass Spectrometry) and ITC (Isothermal Titration Calorimetry) were applied to determine stoichiometry and binding strength. Results: The paratope of sdAbCSP1 mainly consists of its CDR3 (aa100-118). PfCSP-Cext's epitope is assembled from its α-helix (aa40-52) and opposing loop (aa83-90). PfCSP-Cext's GluC cleavage sites E46 and E58 were shielded by complex formation, confirming the predicted epitope. Likewise, sdAbCSP1's tryptic cleavage sites R105 and R108 were shielded by complex formation, confirming the predicted paratope. ITEM MS determined the 1:1 stoichiometry and the high complex binding strength, exemplified by the gas phase dissociation reaction enthalpy of 50.2 kJ/mol. The in-solution complex dissociation constant is 5 × 10⁻¹⁰ M. Conclusions: Combining AlphaFold2 modeling with mass spectrometry/limited proteolysis generated a trustworthy model for the sdAbCSP1 - PfCSP-Cext complex interaction interface.
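To put the reported in-solution dissociation constant in energetic terms, the standard relation ΔG° = RT ln(Kd / 1 M) can be evaluated directly. The short Python sketch below does this. The temperature of 298.15 K is an assumption, since the abstract does not state the measurement temperature, and the function name is hypothetical. The result is a solution-phase free energy and is not directly comparable to the gas-phase dissociation enthalpy quoted above.

```python
import math

GAS_CONSTANT = 8.314462618  # J/(mol*K)


def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol from Kd (1 M standard state).

    temperature_k defaults to 25 degrees C, an assumed value.
    """
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0


delta_g = binding_free_energy_kj(5e-10)  # Kd taken from the abstract
```

At the assumed temperature this gives roughly -53 kJ/mol, the magnitude expected for a sub-nanomolar binder.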
Affiliation(s)
- Kwabena F M Opuni
- Department of Pharmaceutical Chemistry, School of Pharmacy, College of Health Science, University of Ghana, P.O. Box LG43, Legon, Ghana
| | - Manuela Ruß
- Proteome Center Rostock, University Medicine Rostock and University of Rostock, Schillingallee 69, 18057 Rostock, Germany
| | - Rob Geens
- Laboratory of Medical Biochemistry, Faculty of Pharmaceutical, Biomedical, and Veterinary Sciences, University of Antwerp, Universiteitsplein 1, Wilrijk, 2610 Antwerp, Belgium
| | - Line De Vocht
- Laboratory of Medical Biochemistry, Faculty of Pharmaceutical, Biomedical, and Veterinary Sciences, University of Antwerp, Universiteitsplein 1, Wilrijk, 2610 Antwerp, Belgium
| | - Pieter Van Wielendaele
- Laboratory of Medical Biochemistry, Faculty of Pharmaceutical, Biomedical, and Veterinary Sciences, University of Antwerp, Universiteitsplein 1, Wilrijk, 2610 Antwerp, Belgium
| | - Christophe Debuy
- Laboratory of Medical Biochemistry, Faculty of Pharmaceutical, Biomedical, and Veterinary Sciences, University of Antwerp, Universiteitsplein 1, Wilrijk, 2610 Antwerp, Belgium
| | - Yann G-J Sterckx
- Laboratory of Medical Biochemistry, Faculty of Pharmaceutical, Biomedical, and Veterinary Sciences, University of Antwerp, Universiteitsplein 1, Wilrijk, 2610 Antwerp, Belgium
| | - Michael O Glocker
- Proteome Center Rostock, University Medicine Rostock and University of Rostock, Schillingallee 69, 18057 Rostock, Germany
| |
|
14
|
Stohr AM, Ma D, Chen W, Blenner M. Engineering conditional protein-protein interactions for dynamic cellular control. Biotechnol Adv 2024; 77:108457. [PMID: 39343083 DOI: 10.1016/j.biotechadv.2024.108457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2024] [Revised: 08/28/2024] [Accepted: 09/26/2024] [Indexed: 10/01/2024]
Abstract
Conditional protein-protein interactions enable dynamic regulation of cellular activity and are an attractive approach to probe native protein interactions, improve metabolic engineering of microbial factories, and develop smart therapeutics. Conditional protein-protein interactions have been engineered to respond to various chemical, light, and nucleic acid-based stimuli. These interactions have been applied to assemble protein fragments, build protein scaffolds, and spatially organize proteins in many microbial and higher-order hosts. To foster the development of novel conditional protein-protein interactions that respond to new inputs or can be utilized in alternative settings, we provide an overview of the process of designing new engineered protein interactions while showcasing many recently developed computational tools that may accelerate protein engineering in this space.
Affiliation(s)
- Anthony M Stohr
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA
| | - Derron Ma
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA
| | - Wilfred Chen
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA.
| | - Mark Blenner
- Department of Chemical and Biomolecular Engineering, University of Delaware, Newark, DE 19716, USA.
| |
|
15
|
Harding-Larsen D, Funk J, Madsen NG, Gharabli H, Acevedo-Rocha CG, Mazurenko S, Welner DH. Protein representations: Encoding biological information for machine learning in biocatalysis. Biotechnol Adv 2024; 77:108459. [PMID: 39366493 DOI: 10.1016/j.biotechadv.2024.108459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/19/2024] [Accepted: 09/29/2024] [Indexed: 10/06/2024]
Abstract
Enzymes offer a more environmentally friendly and low-impact solution to conventional chemistry, but they often require additional engineering for their application in industrial settings, an endeavour that is challenging and laborious. To address this issue, the power of machine learning can be harnessed to produce predictive models that enable the in silico study and engineering of improved enzymatic properties. Such machine learning models, however, require the conversion of the complex biological information to a numerical input, also called protein representations. These inputs demand special attention to ensure the training of accurate and precise models, and, in this review, we therefore examine the critical step of encoding protein information to numeric representations for use in machine learning. We selected the most important approaches for encoding the three distinct biological protein representations - primary sequence, 3D structure, and dynamics - to explore their requirements for employment and inductive biases. Combined representations of proteins and substrates are also introduced as emergent tools in biocatalysis. We propose the division of fixed representations, a collection of rule-based encoding strategies, and learned representations extracted from the latent spaces of large neural networks. To select the most suitable protein representation, we propose two main factors to consider. The first one is the model setup, which is influenced by the size of the training dataset and the choice of architecture. The second factor is the model objectives such as consideration about the assayed property, the difference between wild-type models and mutant predictors, and requirements for explainability. This review is aimed at serving as a source of information and guidance for properly representing enzymes in future machine learning models for biocatalysis.
Affiliation(s)
- David Harding-Larsen
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Jonathan Funk
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Niklas Gesmar Madsen
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Hani Gharabli
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Carlos G Acevedo-Rocha
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Ditte Hededam Welner
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark.
| |
|
16
|
Cao D, Chen M, Zhang R, Wang Z, Huang M, Yu J, Jiang X, Fan Z, Zhang W, Zhou H, Li X, Fu Z, Zhang S, Zheng M. SurfDock is a surface-informed diffusion generative model for reliable and accurate protein-ligand complex prediction. Nat Methods 2024:10.1038/s41592-024-02516-y. [PMID: 39604569 DOI: 10.1038/s41592-024-02516-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 10/16/2024] [Indexed: 11/29/2024]
Abstract
Accurately predicting protein-ligand interactions is crucial for understanding cellular processes. We introduce SurfDock, a deep-learning method that addresses this challenge by integrating protein sequence, three-dimensional structural graphs and surface-level features into an equivariant architecture. SurfDock employs a generative diffusion model on a non-Euclidean manifold, optimizing molecular translations, rotations and torsions to generate reliable binding poses. Our extensive evaluations across various benchmarks demonstrate SurfDock's superiority over existing methods in docking success rates and adherence to physical constraints. It also exhibits remarkable generalizability to unseen proteins and predicted apo structures, while achieving state-of-the-art performance in virtual screening tasks. In a real-world application, SurfDock identified seven novel hit molecules in a virtual screening project targeting aldehyde dehydrogenase 1B1, a key enzyme in cellular metabolism. This showcases SurfDock's ability to elucidate molecular mechanisms underlying cellular processes. These results highlight SurfDock's potential as a transformative tool in structural biology, offering enhanced accuracy, physical plausibility and practical applicability in understanding protein-ligand interactions.
Affiliation(s)
- Duanhua Cao
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
| | - Mingan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai, China
| | - Runze Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhaokun Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Manlin Huang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Nanchang University, Nanchang, China
| | - Jie Yu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- Lingang Laboratory, Shanghai, China
- School of Information Science and Technology, ShanghaiTech University, Shanghai, China
| | - Xinyu Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Zhehuan Fan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wei Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Hao Zhou
- Institute for AI Industry Research (AIR), Tsinghua University, Beijing, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
| | - Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
| | - Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Mingyue Zheng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, China.
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China.
- University of Chinese Academy of Sciences, Beijing, China.
| |
|
17
|
Lazou M, Khan O, Nguyen T, Padhorny D, Kozakov D, Joseph-McCarthy D, Vajda S. Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much. Proc Natl Acad Sci U S A 2024; 121:e2412719121. [PMID: 39565312 PMCID: PMC11621821 DOI: 10.1073/pnas.2412719121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 10/21/2024] [Indexed: 11/21/2024] Open
Abstract
The goal of this paper is predicting the conformational distributions of ligand binding sites using the AlphaFold2 (AF2) protein structure prediction program with stochastic subsampling of the multiple sequence alignment (MSA). We explored the opening of cryptic ligand binding sites in 16 proteins, where the closed and open conformations define the expected extreme points of the conformational variation. Due to the many structures of these proteins in the Protein Data Bank (PDB), we were able to study whether the distribution of X-ray structures affects the distribution of AF2 models. We have found that AF2 generates both a cluster of open and a cluster of closed models for proteins that have comparable numbers of open and closed structures in the PDB and not too many other conformations. This was observed even with default MSA parameters, thus without further subsampling. In contrast, with the exception of a single protein, AF2 did not yield multiple clusters of conformations for proteins that had imbalanced numbers of open and closed structures in the PDB, or had substantial numbers of other structures. Subsampling improved the results only for a single protein, but very shallow MSA led to incorrect structures. The ability of generating both open and closed conformations for six out of the 16 proteins agrees with the success rates of similar studies reported in the literature. However, we showed that this partial success is due to AF2 "remembering" the conformational distributions in the PDB and that the approach fails to predict rarely seen conformations.
Affiliation(s)
- Maria Lazou
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
| | - Omeir Khan
- Department of Chemistry, Boston University, Boston, MA 02215
| | - Thu Nguyen
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
| | - Diane Joseph-McCarthy
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215
| |
|
18
|
Olanders G, Testa G, Tibo A, Nittinger E, Tyrchan C. Challenge for Deep Learning: Protein Structure Prediction of Ligand-Induced Conformational Changes at Allosteric and Orthosteric Sites. J Chem Inf Model 2024; 64:8481-8494. [PMID: 39484820 DOI: 10.1021/acs.jcim.4c01475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
In the realm of biomedical research, understanding the intricate structure of proteins is crucial, as these structures determine how proteins function within our bodies and interact with potential drugs. Traditionally, methods like X-ray crystallography and cryo-electron microscopy have been used to unravel these structures, but they are often challenging, time-consuming and costly. Recently, a breakthrough in computational biology has emerged with the development of deep learning algorithms capable of predicting protein structures based on their amino acid sequences (Jumper, J., et al. Nature 2021, 596, 583. Lane, T. J. Nature Methods 2023, 20, 170. Kryshtafovych, A., et al. Proteins: Structure, Function and Bioinformatics 2021, 89, 1607). This study focuses on predicting the dynamic changes that proteins undergo upon ligand binding, specifically when they bind to allosteric sites, i.e. a pocket different from the active site. Allosteric modulators are particularly important for drug discovery, as they open new avenues for designing drugs that can target proteins more effectively and with fewer side effects (Nussinov, R.; Tsai, C. J. Cell 2013, 153, 293). To study this, we curated a data set of 578 X-ray structures comprised of proteins displaying orthosteric and allosteric binding as well as a general framework to evaluate deep learning-based structure prediction methods. Our findings demonstrate the potential and current limitations of deep learning methods, such as AlphaFold2 (Jumper, J., et al. Nature 2021, 596, 583), NeuralPLexer (Qiao, Z., et al. Nat Mach Intell 2024, 6, 195), and RoseTTAFold All-Atom (Krishna, R., et al. Science 2024, 384, eadl2528) to predict not just static protein structures but also the dynamic conformational changes. 
Herein we show that predicting the allosteric induced-fit conformation still poses a challenge to deep learning methods, as they predict the orthosteric bound conformation more accurately than the allosteric induced-fit conformation. For AlphaFold2, we observed that conformational diversity and sampling between the apo and holo states could be increased by modifying the MSA depth, but this did not enhance the ability to generate conformations close to the allosteric induced-fit conformation. To further support advancements in the protein structure prediction field, the curated data set and evaluation framework are made publicly available.
Affiliation(s)
- Gustav Olanders
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Giulia Testa
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Alessandro Tibo
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Eva Nittinger
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Christian Tyrchan
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| |
|
19
|
Fan J, Li Z, Alcaide E, Ke G, Huang H, E W. Accurate Conformation Sampling via Protein Structural Diffusion. J Chem Inf Model 2024; 64:8414-8426. [PMID: 39340358 DOI: 10.1021/acs.jcim.4c00928] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/30/2024]
Abstract
Accurate sampling of protein conformations is pivotal for advances in biology and medicine. Although there has been tremendous progress in protein structure prediction in recent years due to deep learning, models that can predict the different stable conformations of proteins with high accuracy and structural validity are still lacking. Here, we introduce UFConf, a cutting-edge approach designed for robust sampling of diverse protein conformations based solely on amino acid sequences. This method transforms AlphaFold2 into a diffusion model by implementing a conformation-based diffusion process and adapting the architecture to process diffused inputs effectively. To counteract the inherent conformational bias in the Protein Data Bank, we developed a novel hierarchical reweighting protocol based on structural clustering. Our evaluations demonstrate that UFConf outperforms existing methods in terms of successful sampling and structural validity. The comparisons with long-time molecular dynamics show that UFConf can overcome the energy barrier existing in molecular dynamics simulations and perform more efficient sampling. Furthermore, we showcase UFConf's utility in drug discovery through its application in neural protein-ligand docking. In a blind test, it accurately predicted a novel protein-ligand complex, underscoring its potential to impact real-world biological research. Additionally, we present other modes of sampling using UFConf, including partial sampling with fixed motif, Langevin dynamics, and structural interpolation.
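The general idea of reweighting training structures by structural cluster, so that over-represented conformations do not dominate, can be sketched in a few lines of Python. This is a single-level simplification under stated assumptions: every cluster receives equal total weight and members share it uniformly. The function name is hypothetical, and this is not the hierarchical protocol used by UFConf, whose details the abstract does not give.

```python
from collections import Counter


def cluster_balanced_weights(cluster_ids):
    """Sampling weights that give each cluster equal total probability.

    cluster_ids: one cluster label per structure (labels assumed precomputed
    by any structural clustering method).
    """
    counts = Counter(cluster_ids)
    n_clusters = len(counts)
    return [1.0 / (n_clusters * counts[c]) for c in cluster_ids]
```

For example, with one "open" and three "closed" structures, the open structure receives weight 0.5 and each closed structure 1/6, so both states are sampled equally often despite the imbalance.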
Affiliation(s)
- Jiahao Fan
- School of Physics, Peking University, Beijing 100871, China
- DP Technology, Beijing 100080, China
| | - Ziyao Li
- DP Technology, Beijing 100080, China
- Center for Data Science, Peking University, Beijing 100871, China
| | - Eric Alcaide
- DP Technology, Beijing 100080, China
- University of Barcelona, Barcelona 08007, Spain
| | - Guolin Ke
- DP Technology, Beijing 100080, China
| | - Huaqing Huang
- School of Physics, Peking University, Beijing 100871, China
| | - Weinan E
- School of Mathematical Sciences, Peking University, Beijing 100871, China
| |
|
20
|
Mishra N, Avillion G, Callaghan S, DiBiase C, Hurtado J, Liendo N, Burbach S, Messmer T, Briney B. Conformational ensemble-based framework enables rapid development of Lassa virus vaccine candidates. bioRxiv 2024:2024.11.21.624760. [PMID: 39605488 PMCID: PMC11601624 DOI: 10.1101/2024.11.21.624760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2024]
Abstract
Lassa virus (LASV), an arenavirus endemic to West Africa, poses a significant public health threat due to its high pathogenicity and expanding geographic risk zone. LASV glycoprotein complex (GPC) is the only known target of neutralizing antibodies, but its inherent metastability and conformational flexibility have hindered the development of GPC-based vaccines. We employed a variant of AlphaFold2 (AF2), called subsampled AF2, to generate diverse structures of LASV GPC that capture an array of potential conformational states using MSA subsampling and dropout layers. Conformational ensembles identified several metamorphic domains-areas of significant conformational flexibility-that could be targeted to stabilize the GPC in its immunogenic prefusion state. ProteinMPNN was then used to redesign GPC sequences to minimize the mobility of metamorphic domains. These redesigned sequences were further filtered using subsampled AF2, leading to the identification of promising GPC variants for further testing. A small library of redesigned GPC sequences was experimentally validated and showed significantly increased protein yields compared to controls. Antigenic profiles indicated these variants preserved essential epitopes for effective immune response, suggesting their potential for broad protective efficacy. Our results demonstrate that AI-driven approaches can predict the conformational landscape of complex pathogens. This knowledge can be used to stabilize viral proteins, such as LASV GPC, in their prefusion conformation, optimizing them for stability and expression, and offering a streamlined framework for vaccine design. Our deep learning / machine learning enabled framework contributes to global efforts to combat LASV and has broader implications for vaccine design and pandemic preparedness.
Affiliation(s)
- Nitesh Mishra
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Gabriel Avillion
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Sean Callaghan
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Charlotte DiBiase
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Jonathan Hurtado
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Nathan Liendo
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Sarah Burbach
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Terrence Messmer
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Bryan Briney
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Center for Viral Systems Biology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Multi-Omics Vaccine Evaluation Consortium, The Scripps Research Institute, La Jolla, CA 92037 USA
- Scripps Consortium for HIV/AIDS Vaccine Development, The Scripps Research Institute, La Jolla, CA 92037 USA
- San Diego Center for AIDS Research, The Scripps Research Institute, La Jolla, CA 92037 USA
| |
|
21
|
Huang Y, Zhang Z, Hattori M. Recent Advances in Expression Screening and Sample Evaluation for Structural Studies of Membrane Proteins. J Mol Biol 2024; 436:168809. [PMID: 39362625 DOI: 10.1016/j.jmb.2024.168809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2024] [Revised: 09/26/2024] [Accepted: 09/27/2024] [Indexed: 10/05/2024]
Abstract
Membrane proteins are involved in numerous biological processes and represent more than half of all drug targets; thus, structural information on these proteins is invaluable. However, the low expression level of membrane proteins, as well as their poor stability in solution and tendency to precipitate and aggregate, are major bottlenecks in the preparation of purified membrane proteins for structural studies. Traditionally, the evaluation of membrane protein constructs for structural studies has been quite time consuming and expensive since it is necessary to express and purify the proteins on a large scale, particularly for X-ray crystallography. The emergence of fluorescence detection size exclusion chromatography (FSEC) has drastically changed this situation, as this method can be used to rapidly evaluate the expression and behavior of membrane proteins on a small scale without the need for purification. FSEC has become the most widely used method for the screening of expression conditions and sample evaluation for membrane proteins, leading to the successful determination of numerous structures. Even in the era of cryo-EM, FSEC and the new generation of FSEC derivative methods are being widely used in various manners to facilitate structural analysis. In addition, the application of FSEC is not limited to structural analysis; this method is also widely used for functional analysis of membrane proteins, including for analysis of oligomerization state, screening of antibodies and ligands, and affinity profiling. This review presents the latest advances and applications in membrane protein expression screening and sample evaluation, with a particular focus on FSEC methods.
Affiliation(s)
- Yichen Huang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Physiology and Neurobiology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Ziyi Zhang
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Physiology and Neurobiology, School of Life Sciences, Fudan University, Shanghai 200438, China
| | - Motoyuki Hattori
- State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Physiology and Neurobiology, School of Life Sciences, Fudan University, Shanghai 200438, China.
| |
|
22
|
Raisinghani N, Alshahrani M, Gupta G, Tian H, Xiao S, Tao P, Verkhivker G. Probing Functional Allosteric States and Conformational Ensembles of the Allosteric Protein Kinase States and Mutants: Atomistic Modeling and Comparative Analysis of AlphaFold2, OmegaFold, and AlphaFlow Approaches and Adaptations. J Phys Chem B 2024; 128:11088-11107. [PMID: 39485490 DOI: 10.1021/acs.jpcb.4c04985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
This study reports a comprehensive analysis and comparison of several AlphaFold2 adaptations and OmegaFold and AlphaFlow approaches in predicting distinct allosteric states, conformational ensembles, and mutation-induced structural effects for a panel of state-switching allosteric ABL mutants. The results revealed that the proposed AlphaFold2 adaptation with randomized alanine sequence scanning can generate functionally relevant allosteric states and conformational ensembles of the ABL kinase that qualitatively capture a unique pattern of population shifts between the active and inactive states in the allosteric ABL mutants. Consistent with the NMR experiments, the proposed AlphaFold2 adaptation predicted that G269E/M309L/T408Y mutant could induce population changes and sample a significant fraction of the fully inactive I2 form which is a low-populated, high-energy state for the wild-type ABL protein. We also demonstrated that other ABL mutants G269E/M309L/T334I and M309L/L320I/T334I that introduce a single activating T334I mutation can reverse equilibrium and populate exclusively the active ABL form. While the precise quantitative predictions of the relative populations of the active and various hidden inactive states in the ABL mutants remain challenging, our results provide evidence that AlphaFold2 adaptation with randomized alanine sequence scanning can adequately detect a spectrum of the allosteric ABL states and capture the equilibrium redistributions between structurally distinct functional ABL conformations. We further validated the robustness of the proposed AlphaFold2 adaptation for predicting the unique inactive architecture of the BSK8 kinase and structural differences between ligand-unbound apo and ATP-bound forms of BSK8. 
The results of this comparative study suggested that AlphaFold2, OmegaFold, and AlphaFlow approaches may be driven by structural memorization of existing protein folds and are strongly biased toward predictions of the thermodynamically stable ground states of the protein kinases, highlighting limitations and challenges of AI-based methodologies in detecting alternative functional conformations, accurate characterization of physically significant conformational ensembles, and prediction of mutation-induced allosteric structural changes.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Mohammed Alshahrani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Grace Gupta
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Hao Tian
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Sian Xiao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Gennady Verkhivker
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States
- Department of Pharmacology, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
| |
|
23
|
Mishra N, Callaghan S, Briney B. Decoding protein dynamicity in DNA ligase activity through deep learning-based structural ensembles. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.11.07.622521. [PMID: 39574676 PMCID: PMC11581005 DOI: 10.1101/2024.11.07.622521] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/02/2024]
Abstract
Numerous proteins perform their functions by transitioning between various structures. Understanding the conformational ensembles associated with these states is essential for uncovering crucial mechanistic aspects that regulate protein function. In this study, we utilized AlphaFold3 (AF3) to investigate the structural dynamics and mechanisms of enzymes involved in DNA homeostasis, using NAD-dependent Taq ligases and human DNA Ligase 1 as a case example. Modifying the input parameters for AF3 yielded detailed conformational states of a DNA-binding enzyme, thereby offering enhanced mechanistic insights. We applied AF3 to model the various stages of thermophilic Taq DNA ligase activity, from its ground state to substrate-bound complexes, revealing significant mobility in the N-terminal adenylation and C-terminal BRCT domains. These detailed structural ensembles provided novel insights into the enzyme's behavior during DNA repair, underscoring the potential of AF3 in elucidating mechanistic details critical for therapeutic and biotechnological targeting. Extending this approach to human LIG1, we examined its end-joining activity on double-strand breaks (DSBs) with short 3' and 5' overhangs. In alignment with published experimental data, AF3 conformational ensembles indicated LIG1 has lower catalytic efficiency for 5' overhangs due to suboptimal DNA positioning within the catalytic site, demonstrating AF3's capability to capture subtle yet functionally significant conformational differences by generating conformational ensembles capturing greater structural variance.
Affiliation(s)
- Nitesh Mishra
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Sean Callaghan
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
| | - Bryan Briney
- Department of Immunology and Microbiology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Center for Viral Systems Biology, The Scripps Research Institute, La Jolla, CA 92037 USA
- Multi-Omics Vaccine Evaluation Consortium, The Scripps Research Institute, La Jolla, CA 92037 USA
- Scripps Consortium for HIV/AIDS Vaccine Development, The Scripps Research Institute, La Jolla, CA 92037 USA
- San Diego Center for AIDS Research, The Scripps Research Institute, La Jolla, CA 92037 USA
| |
|
24
|
Riccabona JR, Spoendlin FC, Fischer ALM, Loeffler JR, Quoika PK, Jenkins TP, Ferguson JA, Smorodina E, Laustsen AH, Greiff V, Forli S, Ward AB, Deane CM, Fernández-Quintero ML. Assessing AF2's ability to predict structural ensembles of proteins. Structure 2024; 32:2147-2159.e2. [PMID: 39332396 DOI: 10.1016/j.str.2024.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 08/07/2024] [Accepted: 09/02/2024] [Indexed: 09/29/2024]
Abstract
Recent breakthroughs in protein structure prediction have enhanced the precision and speed at which protein configurations can be determined. Additionally, molecular dynamics (MD) simulations serve as a crucial tool for capturing the conformational space of proteins, providing valuable insights into their structural fluctuations. However, the scope of MD simulations is often limited by the accessible timescales and the computational resources available, posing challenges to comprehensively exploring protein behaviors. Recently emerging approaches have focused on expanding the capability of AlphaFold2 (AF2) to predict conformational substates of proteins. Here, we benchmark the performance of various workflows that have adapted AF2 for ensemble prediction and compare the obtained structures with ensembles obtained from MD simulations and NMR. We provide an overview of the levels of performance and accessible timescales that can currently be achieved with machine learning (ML) based ensemble generation. Significant minima of the free energy surfaces remain undetected.
Affiliation(s)
- Jakob R Riccabona
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Fabian C Spoendlin
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
| | - Anna-Lena M Fischer
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Johannes R Loeffler
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Patrick K Quoika
- Center for Functional Protein Assemblies, Technical University of Munich, Ernst-Otto-Fischer-Str. 8, 85748 Garching, Germany
| | - Timothy P Jenkins
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - James A Ferguson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Eva Smorodina
- Department of Immunology, University of Oslo, Oslo, Norway
| | - Andreas H Laustsen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway
| | - Stefano Forli
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Andrew B Ward
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
| | - Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
| | - Monica L Fernández-Quintero
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
| |
|
25
|
Wayment-Steele HK, Otten R, Pitsawong W, Ojoawo AM, Glaser A, Calderone LA, Kern D. The conformational landscape of fold-switcher KaiB is tuned to the circadian rhythm timescale. Proc Natl Acad Sci U S A 2024; 121:e2412293121. [PMID: 39475637 PMCID: PMC11551320 DOI: 10.1073/pnas.2412293121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Accepted: 10/02/2024] [Indexed: 11/06/2024] Open
Abstract
How can a single protein domain encode a conformational landscape with multiple stably folded states, and how do those states interconvert? Here, we use real-time and relaxation-dispersion NMR to characterize the conformational landscape of the circadian rhythm protein KaiB from Rhodobacter sphaeroides. Unique among known natural metamorphic proteins, this KaiB variant spontaneously interconverts between two monomeric states: the "Ground" and "Fold-switched" (FS) states. KaiB in its FS state interacts with multiple binding partners, including the central KaiC protein, to regulate circadian rhythms. We find that KaiB itself takes hours to interconvert between the Ground and FS state, underscoring the ability of a single-sequence to encode the slow process needed for function. We reveal the rate-limiting step between the Ground and FS state is the cis-trans isomerization of three prolines in the fold-switching region by demonstrating interconversion acceleration by the prolyl isomerase Cyclophilin A. The interconversion proceeds through a "partially disordered" (PD) state, where the C-terminal half becomes disordered while the N-terminal half remains stably folded. We found two additional properties of KaiB's landscape. First, the Ground state experiences cold denaturation: At 4 °C, the PD state becomes the majorly populated state. Second, the Ground state exchanges with a fourth state, the "Enigma" state, on the millisecond-timescale. We combine AlphaFold2-based predictions and NMR chemical shift predictions to predict this Enigma state is a beta-strand register shift that relieves buried charged residues, and support this structure experimentally. These results provide mechanistic insight into how evolution can design a single-sequence that achieves specific timing needed for its function.
Affiliation(s)
| | - Renee Otten
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| | - Warintra Pitsawong
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| | - Adedolapo M. Ojoawo
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| | - Andrew Glaser
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| | - Logan A. Calderone
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| | - Dorothee Kern
- Department of Biochemistry, Brandeis University, Waltham, MA 02453
- HHMI, Waltham, MA 02453
| |
|
26
|
Lau AM, Bordin N, Kandathil SM, Sillitoe I, Waman VP, Wells J, Orengo CA, Jones DT. Exploring structural diversity across the protein universe with The Encyclopedia of Domains. Science 2024; 386:eadq4946. [PMID: 39480926 DOI: 10.1126/science.adq4946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2024] [Accepted: 08/30/2024] [Indexed: 11/02/2024]
Abstract
The AlphaFold Protein Structure Database (AFDB) contains more than 214 million predicted protein structures composed of domains, which are independently folding units found in multiple structural and functional contexts. Identifying domains can enable many functional and evolutionary analyses but has remained challenging because of the sheer scale of the data. Using deep learning methods, we have detected and classified every domain in the AFDB, producing The Encyclopedia of Domains. We detected nearly 365 million domains, over 100 million more than can be found by sequence methods, covering more than 1 million taxa. Reassuringly, 77% of the nonredundant domains are similar to known superfamilies, greatly expanding representation of their domain space. We uncovered more than 10,000 new structural interactions between superfamilies and thousands of new folds across the fold space continuum.
Affiliation(s)
- Andy M Lau
- Department of Computer Science, University College London, London WC1E 6BT, UK
| | - Nicola Bordin
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | - Shaun M Kandathil
- Department of Computer Science, University College London, London WC1E 6BT, UK
| | - Ian Sillitoe
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | - Vaishali P Waman
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | - Jude Wells
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
- Centre for Artificial Intelligence, University College London, London WC1V 6BH, UK
| | - Christine A Orengo
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| | - David T Jones
- Department of Computer Science, University College London, London WC1E 6BT, UK
- Institute of Structural and Molecular Biology, University College London, London WC1E 6BT, UK
| |
|
27
|
Omidi A, Møller MH, Malhis N, Bui JM, Gsponer J. AlphaFold-Multimer accurately captures interactions and dynamics of intrinsically disordered protein regions. Proc Natl Acad Sci U S A 2024; 121:e2406407121. [PMID: 39446390 PMCID: PMC11536093 DOI: 10.1073/pnas.2406407121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Accepted: 09/12/2024] [Indexed: 10/27/2024] Open
Abstract
Interactions mediated by intrinsically disordered protein regions (IDRs) pose formidable challenges in structural characterization. IDRs are highly versatile, capable of adopting diverse structures and engagement modes. Motivated by recent strides in protein structure prediction, we embarked on exploring the extent to which AlphaFold-Multimer can faithfully reproduce the intricacies of interactions involving IDRs. To this end, we gathered multiple datasets covering the versatile spectrum of IDR binding modes and used them to probe AlphaFold-Multimer's prediction of IDR interactions and their dynamics. Our analyses revealed that AlphaFold-Multimer is not only capable of predicting various types of bound IDR structures with high success rate, but that distinguishing true interactions from decoys, and unreliable predictions from accurate ones is achievable by appropriate use of AlphaFold-Multimer's intrinsic scores. We found that the quality of predictions drops for more heterogeneous, fuzzy interaction types, most likely due to lower interface hydrophobicity and higher coil content. Notably though, certain AlphaFold-Multimer scores, such as the Predicted Aligned Error and residue-ipTM, are highly correlated with structural heterogeneity of the bound IDR, enabling clear distinctions between predictions of fuzzy and more homogeneous binding modes. Finally, our benchmarking revealed that predictions of IDR interactions can also be successful when using full-length proteins, but not as accurate as with cognate IDRs. To facilitate identification of the cognate IDR of a given partner, we established "minD," which pinpoints potential interaction sites in a full-length protein. Our study demonstrates that AlphaFold-Multimer can correctly identify interacting IDRs and predict their mode of engagement with a given partner.
Affiliation(s)
- Alireza Omidi
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Mads Harder Møller
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Nawar Malhis
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Jennifer M. Bui
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| | - Jörg Gsponer
- Michael Smith Laboratories, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Department of Biochemistry and Molecular Biology, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
| |
|
28
|
Lopez-Mateos D, Narang K, Yarov-Yarovoy V. Exploring voltage-gated sodium channel conformations and protein-protein interactions using AlphaFold2. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.10.15.618559. [PMID: 39463944 PMCID: PMC11507785 DOI: 10.1101/2024.10.15.618559] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Voltage-gated sodium (NaV) channels are vital regulators of electrical activity in excitable cells, playing critical roles in generating and propagating action potentials. Given their importance in physiology, NaV channels are key therapeutic targets for treating numerous conditions, yet developing subtype-selective drugs remains challenging due to the high sequence and structural conservation among NaV family members. Recent advances in cryo-electron microscopy have resolved nearly all human NaV channels, providing valuable insights into their structure and function. However, limitations persist in fully capturing the complex conformational states that underlie NaV channel gating and modulation. This study explores the capability of AlphaFold2 to sample multiple NaV channel conformations and assess AlphaFold Multimer's accuracy in modeling interactions between the NaV α-subunit and its protein partners, including auxiliary β-subunits and calmodulin. We enhance conformational sampling to explore NaV channel conformations using a subsampled multiple sequence alignment approach and varying the number of recycles. Our results demonstrate that AlphaFold2 models multiple NaV channel conformations, including those from experimental structures, new states not yet experimentally identified, and potential intermediate states. Furthermore, AlphaFold Multimer models NaV complexes with auxiliary β-subunits and calmodulin with high accuracy, and the presence of protein partners significantly alters the conformational landscape of the NaV α-subunit. These findings highlight the potential of deep learning-based methods to expand our understanding of NaV channel structure, gating, and modulation, with significant implications for future drug discovery efforts.
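The subsampled multiple sequence alignment idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes uniform random sampling of alignment rows with the query sequence always retained; the function names `subsample_msa` and `make_ensemble_inputs` and their parameters are hypothetical and are not taken from the paper or from the AlphaFold2 code base.

```python
import random


def subsample_msa(msa, max_seqs, seed=None):
    """Return the query (first row) plus a random subset of the other rows.

    msa      -- list of aligned sequences; msa[0] is assumed to be the query
    max_seqs -- total number of rows to keep, including the query
    seed     -- optional seed so that a given subsample is reproducible
    """
    if not msa:
        raise ValueError("MSA must contain at least the query sequence")
    if max_seqs < 1:
        raise ValueError("max_seqs must be at least 1")
    rng = random.Random(seed)
    query, others = msa[0], msa[1:]
    k = min(max_seqs - 1, len(others))
    # Sampling without replacement keeps every retained row distinct.
    return [query] + rng.sample(others, k)


def make_ensemble_inputs(msa, max_seqs, n_samples, seed=0):
    """Draw several independent shallow MSAs, one per prediction run."""
    return [subsample_msa(msa, max_seqs, seed=seed + i) for i in range(n_samples)]
```

Each shallow alignment would then be passed to a separate prediction run; the intuition is that a reduced alignment depth weakens the coevolutionary signal and may let different runs settle on different conformations.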
Affiliation(s)
- Diego Lopez-Mateos
- Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA 95616
- Biophysics Graduate Group, University of California School of Medicine, Davis, CA 95616
| | - Kush Narang
- Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA 95616
| | - Vladimir Yarov-Yarovoy
- Department of Physiology and Membrane Biology, University of California School of Medicine, Davis, CA 95616
- Biophysics Graduate Group, University of California School of Medicine, Davis, CA 95616
- Department of Anesthesiology and Pain Medicine, University of California School of Medicine, Davis, CA 95616
| |
|
29
|
Qian R, Xue J, Xu Y, Huang J. Alchemical Transformations and Beyond: Recent Advances and Real-World Applications of Free Energy Calculations in Drug Discovery. J Chem Inf Model 2024; 64:7214-7237. [PMID: 39360948 DOI: 10.1021/acs.jcim.4c01024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/15/2024]
Abstract
Computational methods constitute efficient strategies for screening and optimizing potential drug molecules. A critical factor in this process is the binding affinity between candidate molecules and targets, quantified as binding free energy. Among various estimation methods, alchemical transformation methods stand out for their theoretical rigor. Despite challenges in force field accuracy and sampling efficiency, advancements in algorithms, software, and hardware have increased the application of free energy perturbation (FEP) calculations in the pharmaceutical industry. Here, we review the practical applications of FEP in drug discovery projects since 2018, covering both ligand-centric and residue-centric transformations. We show that relative binding free energy calculations have steadily achieved chemical accuracy in real-world applications. In addition, we discuss alternative physics-based simulation methods and the incorporation of deep learning into free energy calculations.
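As a rough illustration of what a free energy perturbation estimate computes, the sketch below applies the classical Zwanzig exponential-averaging relation, ΔG = -kT ln⟨exp(-ΔU/kT)⟩, to a list of energy differences sampled in the reference state. The function name `zwanzig_free_energy`, the units (kcal/mol), and the default temperature are assumptions made for the example; production workflows of the kind the review covers typically split the transformation into intermediate windows and use more robust estimators.

```python
import math

K_B_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def zwanzig_free_energy(delta_u, temperature=298.15):
    """Estimate dG (kcal/mol) from energy differences U_B - U_A sampled in state A.

    delta_u     -- iterable of energy differences in kcal/mol
    temperature -- absolute temperature in kelvin
    """
    delta_u = list(delta_u)
    if not delta_u:
        raise ValueError("at least one energy difference is required")
    beta = 1.0 / (K_B_KCAL * temperature)
    exponents = [-beta * du for du in delta_u]
    # Log-sum-exp trick: subtract the largest exponent to avoid overflow.
    largest = max(exponents)
    mean_exp = sum(math.exp(x - largest) for x in exponents) / len(exponents)
    log_mean = largest + math.log(mean_exp)
    return -log_mean / beta
```

If every sample has the same energy difference, the estimate equals that difference; with fluctuating samples the estimate never exceeds the plain average of the differences, which follows from Jensen's inequality.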
Affiliation(s)
- Runtong Qian
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| | - Jing Xue
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| | - You Xu
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| | - Jing Huang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| |
|
30
|
Konermann L, Scrosati PM. Hydrogen/Deuterium Exchange Mass Spectrometry: Fundamentals, Limitations, and Opportunities. Mol Cell Proteomics 2024; 23:100853. [PMID: 39383946 PMCID: PMC11570944 DOI: 10.1016/j.mcpro.2024.100853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2024] [Revised: 09/11/2024] [Accepted: 10/02/2024] [Indexed: 10/11/2024] Open
Abstract
Hydrogen/deuterium exchange mass spectrometry (HDX-MS) probes dynamic motions of proteins by monitoring the kinetics of backbone amide deuteration. Dynamic regions exhibit rapid HDX, while rigid segments are more protected. Current data readouts focus on qualitative comparative observations (such as "residues X to Y become more protected after protein exposure to ligand Z"). At present, it is not possible to decode HDX protection patterns in an atomistic fashion. In other words, the exact range of protein motions under a given set of conditions cannot be uncovered, leaving space for speculative interpretations. Amide back exchange is an under-appreciated problem, as the widely used (m-m0)/(m100-m0) correction method can distort HDX kinetic profiles. Future data analysis strategies require a better fundamental understanding of HDX events, going beyond the classical Linderstrøm-Lang model. Combined with experiments that offer enhanced spatial resolution and suppressed back exchange, it should become possible to uncover the exact range of motions exhibited by a protein under a given set of conditions. Such advances would provide a greatly improved understanding of protein behavior in health and disease.
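The (m-m0)/(m100-m0) correction named in this abstract can be written out as a short sketch. It assumes that m is the measured mass of a peptide at a given labeling time, m0 the mass of the undeuterated control, and m100 the mass of the fully deuterated control; the function name `back_exchange_corrected` is hypothetical.

```python
def back_exchange_corrected(m, m0, m100):
    """Return the fractional deuteration (m - m0) / (m100 - m0).

    m    -- measured peptide mass at a given labeling time
    m0   -- mass of the undeuterated control
    m100 -- mass of the fully deuterated control
    """
    if m100 <= m0:
        raise ValueError("fully deuterated control must be heavier than m0")
    return (m - m0) / (m100 - m0)


# Example: a peptide halfway between the two controls.
fraction = back_exchange_corrected(m=1005.0, m0=1000.0, m100=1010.0)
```

The normalization applies one common scaling factor to the whole peptide, which is one way to see how it could distort kinetic profiles if individual amides lose label at different rates, as the abstract cautions.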
Affiliation(s)
- Lars Konermann
- Department of Chemistry, The University of Western Ontario, London, Ontario, Canada.
- Pablo M Scrosati
- Department of Chemistry, The University of Western Ontario, London, Ontario, Canada

31
Lisanza SL, Gershon JM, Tipps SWK, Sims JN, Arnoldt L, Hendel SJ, Simma MK, Liu G, Yase M, Wu H, Tharp CD, Li X, Kang A, Brackenbrough E, Bera AK, Gerben S, Wittmann BJ, McShan AC, Baker D. Multistate and functional protein design using RoseTTAFold sequence space diffusion. Nat Biotechnol 2024:10.1038/s41587-024-02395-w. [PMID: 39322764 DOI: 10.1038/s41587-024-02395-w] [Received: 02/05/2024] [Accepted: 08/21/2024] [Indexed: 09/27/2024]
Abstract
Protein denoising diffusion probabilistic models are used for the de novo generation of protein backbones but are limited in their ability to guide generation of proteins with sequence-specific attributes and functional properties. To overcome this limitation, we developed ProteinGenerator (PG), a sequence space diffusion model based on RoseTTAFold that simultaneously generates protein sequences and structures. Beginning from a noised sequence representation, PG generates sequence and structure pairs by iterative denoising, guided by desired sequence and structural protein attributes. We designed thermostable proteins with varying amino acid compositions and internal sequence repeats and cage bioactive peptides, such as melittin. By averaging sequence logits between diffusion trajectories with distinct structural constraints, we designed multistate parent-child protein triples in which the same sequence folds to different supersecondary structures when intact in the parent versus split into two child domains. PG design trajectories can be guided by experimental sequence-activity data, providing a general approach for integrated computational and experimental optimization of protein function.
Affiliation(s)
- Sidney Lyayuga Lisanza
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Graduate Program in Biological Physics, Structure and Design, University of Washington, Seattle, WA, USA
- Jacob Merle Gershon
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Molecular Engineering, University of Washington, Seattle, WA, USA
- Samuel W K Tipps
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Jeremiah Nelson Sims
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Molecular & Cellular Biology, Medical Scientist Training Program, University of Washington, Seattle, WA, USA
- Lucas Arnoldt
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Faculty of Engineering Sciences, Heidelberg University, Heidelberg, Germany
- Samuel J Hendel
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Miriam K Simma
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA
- Ge Liu
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Muna Yase
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Department of Molecular Engineering, University of Washington, Seattle, WA, USA
- Hongwei Wu
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA
- Claire D Tharp
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA
- Xinting Li
- Department of Biochemistry, University of Washington, Seattle, WA, USA
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Alex Kang
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Asim K Bera
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Stacey Gerben
- Institute for Protein Design, University of Washington, Seattle, WA, USA
- Bruce J Wittmann
- Office of the Chief Scientific Officer, Microsoft, Redmond, WA, USA
- Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA
- David Baker
- Department of Biochemistry, University of Washington, Seattle, WA, USA.
- Institute for Protein Design, University of Washington, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.

32
Rosignoli S, Pacelli M, Manganiello F, Paiardini A. An outlook on structural biology after AlphaFold: tools, limits and perspectives. FEBS Open Bio 2024. [PMID: 39313455 DOI: 10.1002/2211-5463.13902] [Received: 03/13/2024] [Revised: 08/19/2024] [Accepted: 09/13/2024] [Indexed: 09/25/2024] Open
Abstract
AlphaFold and similar groundbreaking AI-based tools have revolutionized the field of structural bioinformatics with their remarkable accuracy in ab initio protein structure prediction. This success has catalyzed the development of new software and pipelines aimed at incorporating AlphaFold's predictions, often focusing on addressing the algorithm's remaining challenges. Here, we present the current landscape of structural bioinformatics shaped by AlphaFold, and discuss how the field is dynamically responding to this revolution, with new software, methods, and pipelines. While the excitement around AI-based tools led to their widespread application, it is essential to acknowledge that their practical success hinges on their integration into established protocols within structural bioinformatics, often neglected in the context of AI-driven advancements. Indeed, user-driven intervention is still as pivotal in the structure prediction process as in complementing state-of-the-art algorithms with functional and biological knowledge.
Affiliation(s)
- Serena Rosignoli
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Maddalena Pacelli
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Francesca Manganiello
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy
- Alessandro Paiardini
- Department of Biochemical sciences "A. Rossi Fanelli", Sapienza Università di Roma, Italy

33
Núñez-Franco R, Muriel-Olaya MM, Jiménez-Osés G, Peccati F. AlphaFold2 Predicts Alternative Conformation Populations in Green Fluorescent Protein Variants. J Chem Inf Model 2024; 64:7135-7140. [PMID: 39227031 PMCID: PMC11423400 DOI: 10.1021/acs.jcim.4c01388] [Indexed: 09/05/2024]
Abstract
Artificial intelligence-based protein structure prediction methods such as AlphaFold2 have emerged as powerful tools for characterizing protein sequence-structure relationships, offering unprecedented opportunities for the molecular interpretation of biological and biochemical phenomena. While initially confined to providing a static representation of proteins through their global free-energy minimum, AlphaFold2 has demonstrated the ability to partially sample conformational landscapes, providing insights into protein dynamics, which is fundamental for interpreting and potentially tuning the function of natural and artificial proteins. In this study, we show that targeted column masking of AlphaFold2's multiple sequence alignment enables the characterization and estimation of the population ratio of the two main conformations of engineered green fluorescent proteins with alternative β-strands. The possibility of quickly estimating relative populations through AlphaFold2 predictions is expected to speed up the computational design of related systems for sensing applications.
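The idea of masking selected alignment columns can be sketched in a few lines. This is only an illustration under stated assumptions: the function name, the mask character `X`, the choice to leave the query (first) sequence untouched, and the toy alignment are all hypothetical and are not taken from the paper, whose actual masking protocol is not described in the abstract.

```python
def mask_msa_columns(msa, columns, mask_char="X", keep_query=True):
    """Return a copy of an MSA with the given 0-based columns masked.

    msa        : list of equal-length aligned sequences, query first
    columns    : iterable of column indices to mask
    mask_char  : replacement character (assumed; the paper's choice may differ)
    keep_query : if True, the first sequence is returned unchanged
    """
    length = len(msa[0])
    if any(len(seq) != length for seq in msa):
        raise ValueError("all sequences in the alignment must have equal length")
    columns = set(columns)
    masked = []
    for index, seq in enumerate(msa):
        if keep_query and index == 0:
            masked.append(seq)
            continue
        masked.append(
            "".join(mask_char if pos in columns else aa for pos, aa in enumerate(seq))
        )
    return masked


# Toy alignment; the sequences are invented for illustration.
toy_msa = ["MKTAY", "MRTAF", "MKSAY"]
masked_msa = mask_msa_columns(toy_msa, columns=[1, 3])
print(masked_msa)
```

Masking hides the evolutionary signal at the chosen positions from the homologous sequences while the query sequence itself stays intact in this sketch.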
Affiliation(s)
- Reyes Núñez-Franco
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA) Bizkaia Technology Park, Building 801, 48160 Derio, Spain
- M Milagros Muriel-Olaya
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA) Bizkaia Technology Park, Building 801, 48160 Derio, Spain
- Gonzalo Jiménez-Osés
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA) Bizkaia Technology Park, Building 801, 48160 Derio, Spain
- Ikerbasque, Basque Foundation for Science, 48013 Bilbao, Spain
- Francesca Peccati
- Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA) Bizkaia Technology Park, Building 801, 48160 Derio, Spain
- Ikerbasque, Basque Foundation for Science, 48013 Bilbao, Spain

34
Benavides TL, Montelione GT. Integrative Modeling of Protein-Polypeptide Complexes by Bayesian Model Selection using AlphaFold and NMR Chemical Shift Perturbation Data. bioRxiv 2024:2024.09.19.613999. [PMID: 39345459 PMCID: PMC11430059 DOI: 10.1101/2024.09.19.613999] [Indexed: 10/01/2024]
Abstract
Protein-polypeptide interactions, including those involving intrinsically-disordered peptides and intrinsically-disordered regions of protein binding partners, are crucial for many biological functions. However, experimental structure determination of protein-peptide complexes can be challenging. Computational methods, while promising, generally require experimental data for validation and refinement. Here we present CSP_Rank, an integrated modeling approach to determine the structures of protein-peptide complexes. This method combines AlphaFold2 (AF2) enhanced sampling methods with a Bayesian conformational selection process based on experimental Nuclear Magnetic Resonance (NMR) Chemical Shift Perturbation (CSP) data and AF2 confidence metrics. Using a curated dataset of 108 protein-peptide complexes from the Biological Magnetic Resonance Data Bank (BMRB), we observe that while AF2 typically yields models with excellent consistency with experimental CSP data, applying enhanced sampling followed by data-guided conformational selection routinely results in ensembles of structures with improved agreement with NMR observables. For two systems, we cross-validate the CSP-selected models using independently acquired nuclear Overhauser effect (NOE) NMR data and demonstrate how CSP and NMR can be combined using our Bayesian framework for model selection. CSP_Rank is a novel method for integrative modeling of protein-peptide complexes and has broad implications for studies of protein-peptide interactions and aiding in understanding their biological functions.
Affiliation(s)
- Tiburon L. Benavides
- Department of Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
- Gaetano T. Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA

35
Raisinghani N, Alshahrani M, Gupta G, Verkhivker G. Predicting Mutation-Induced Allosteric Changes in Structures and Conformational Ensembles of the ABL Kinase Using AlphaFold2 Adaptations with Alanine Sequence Scanning. Int J Mol Sci 2024; 25:10082. [PMID: 39337567 PMCID: PMC11432724 DOI: 10.3390/ijms251810082] [Received: 07/14/2024] [Revised: 09/18/2024] [Accepted: 09/18/2024] [Indexed: 09/30/2024] Open
Abstract
Despite the success of AlphaFold2 approaches in predicting single protein structures, these methods showed intrinsic limitations in predicting multiple functional conformations of allosteric proteins and have been challenged to accurately capture the effects of single point mutations that induce significant structural changes. We examined several implementations of AlphaFold2 methods to predict conformational ensembles for state-switching mutants of the ABL kinase. The results revealed that a combination of randomized alanine sequence masking with shallow multiple sequence alignment subsampling can significantly expand the conformational diversity of the predicted structural ensembles and capture shifts in populations of the active and inactive ABL states. Consistent with the NMR experiments, the predicted conformational ensembles for M309L/L320I and M309L/H415P ABL mutants that perturb the regulatory spine networks featured an increased population of the fully closed inactive state. The proposed adaptation of AlphaFold can reproduce the experimentally observed mutation-induced redistributions in the relative populations of the active and inactive ABL states and capture the effects of regulatory mutations on allosteric structural rearrangements of the kinase domain. The ensemble-based network analysis complemented AlphaFold predictions by revealing allosteric hotspots that correspond to state-switching mutational sites, which may explain the global effect of regulatory mutations on structural changes between the ABL states. This study suggested that attention-based learning of long-range dependencies between sequence positions in homologous folds and deciphering patterns of allosteric interactions may further augment the predictive abilities of AlphaFold methods for modeling of alternative protein states, conformational ensembles and mutation-induced structural transformations.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Mohammed Alshahrani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Grace Gupta
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA

36
Liu J, Guo Z, You H, Zhang C, Lai L. All-Atom Protein Sequence Design Based on Geometric Deep Learning. Angew Chem Int Ed Engl 2024:e202411461. [PMID: 39295564 DOI: 10.1002/anie.202411461] [Received: 06/18/2024] [Revised: 09/09/2024] [Accepted: 09/18/2024] [Indexed: 09/21/2024]
Abstract
Designing sequences for specific protein backbones is a key step in creating new functional proteins. Here, we introduce GeoSeqBuilder, a deep learning framework that integrates protein sequence generation with side chain conformation prediction to produce the complete all-atom structures for designed sequences. GeoSeqBuilder uses spatial geometric features from protein backbones and explicitly includes three-body interactions of neighboring residues. GeoSeqBuilder achieves native residue type recovery rate of 51.6 %, comparable to ProteinMPNN and other leading methods, while accurately predicting side chain conformations. We first used GeoSeqBuilder to design sequences for thioredoxin and a hallucinated three-helical bundle protein. All the 15 tested sequences expressed as soluble monomeric proteins with high thermal stability, and the 2 high-resolution crystal structures solved closely match the designed models. The generated protein sequences exhibit low similarity (minimum 23 %) to the original sequences, with significantly altered hydrophobic cores. We further redesigned the hydrophobic core of glutathione peroxidase 4, and 3 of the 5 designs showed improved enzyme activity. Although further testing is needed, the high experimental success rate in our testing demonstrates that GeoSeqBuilder is a powerful tool for designing novel sequences for predefined protein structures with atomic details. GeoSeqBuilder is available at https://github.com/PKUliujl/GeoSeqBuilder.
Affiliation(s)
- Jiale Liu
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Zheng Guo
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Hantian You
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Changsheng Zhang
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Luhua Lai
- Center for Life Sciences Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Center for Quantitative Biology Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China
- Chengdu Academy for Advanced Interdisciplinary Biotechnologies, Peking University, Chengdu, 510100, Sichuan, China

37
Niphakis MJ, Cravatt BF. Ligand discovery by activity-based protein profiling. Cell Chem Biol 2024; 31:1636-1651. [PMID: 39303700 DOI: 10.1016/j.chembiol.2024.08.006] [Received: 07/16/2024] [Revised: 08/15/2024] [Accepted: 08/19/2024] [Indexed: 09/22/2024]
Abstract
Genomic technologies have led to massive gains in our understanding of human gene function and disease relevance. Chemical biologists are a primary beneficiary of this information, which can guide the prioritization of proteins for chemical probe and drug development. The vast functional and structural diversity of disease-relevant proteins, however, presents challenges for conventional small molecule screening libraries and assay development that in turn raise questions about the broader "druggability" of the human proteome. Here, we posit that activity-based protein profiling (ABPP), by generating global maps of small molecule-protein interactions in native biological systems, is well positioned to address major obstacles in human biology-guided chemical probe and drug discovery. We will support this viewpoint with case studies highlighting a range of small molecule mechanisms illuminated by ABPP that include the disruption and stabilization of biomolecular (protein-protein/nucleic acid) interactions and underscore allostery as a rich source of chemical tools for historically "undruggable" protein classes.

38
Kacher J, Sokolova OS, Tarek M. A Deep Learning Approach to Uncover Voltage-Gated Ion Channels' Intermediate States. J Phys Chem B 2024; 128:8724-8736. [PMID: 39213618 DOI: 10.1021/acs.jpcb.4c03182] [Indexed: 09/04/2024]
Abstract
Owing to recent advancements in cryo-electron microscopy, the structural characteristics of voltage-gated ion channels are now better understood. However, a significant enigma remains unsolved for a large majority of these channels: their gating mechanism. This mechanism, which encompasses the conformational changes between open and closed states, is pivotal to their proper functioning. Beyond the binary states of open and closed, an ensemble of intermediate states defines the transition path in between. Due to the lack of experimental data, one might resort to molecular dynamics simulations as an alternative to decipher these states and the transitions between them. However, the high energy barriers and the colossal time scales involved hinder access to the latter. We present here an application of deep learning as a reliable pipeline for a comprehensive exploration of voltage-gated ion channel conformational rearrangements during gating. We showcase the pipeline performance specifically on the Kv1.2 voltage sensor domain and compare the results with existing data. We demonstrate how our physics-based deep learning approach contributes to the theoretical understanding of these channels and how it might provide further insights into the exploration of channelopathies.
Affiliation(s)
- Julia Kacher
- Université de Lorraine, CNRS, LPCT, F-54000 Nancy, France
- Olga S Sokolova
- Faculty of Biology, Lomonosov Moscow State University, 1-12 Leninskie Gory, 119234 Moscow, Russia
- Shenzhen MSU-BIT University, 1 International University Park Road, Dayun New Town, Longgang District, Shenzhen 518172, China
- Mounir Tarek
- Université de Lorraine, CNRS, LPCT, F-54000 Nancy, France

39
Duran C, Casadevall G, Osuna S. Harnessing conformational dynamics in enzyme catalysis to achieve nature-like catalytic efficiencies: the shortest path map tool for computational enzyme redesign. Faraday Discuss 2024; 252:306-322. [PMID: 38910409 PMCID: PMC11389851 DOI: 10.1039/d3fd00156c] [Indexed: 06/25/2024]
Abstract
Enzymes exhibit diverse conformations, as represented in the free energy landscape (FEL). Such conformational diversity provides enzymes with the ability to evolve towards novel functions. The challenge lies in identifying mutations that enhance specific conformational changes, especially if they are located at sites distal from the active site cavity. The shortest path map (SPM) method, which we developed to address this challenge, constructs a graph based on the distances and correlated motions of residues observed in nanosecond timescale molecular dynamics (MD) simulations. We recently introduced a template-based AlphaFold2 (tAF2) approach coupled with 10 nanosecond MD simulations to quickly estimate the conformational landscape of enzymes and assess how the FEL is shifted after mutation. In this study, we evaluate the potential of SPM when coupled with tAF2-MD in estimating conformational heterogeneity and identifying key conformationally relevant positions. The selected model system is the beta subunit of tryptophan synthase (TrpB). We compare how the SPM pathways differ when integrating tAF2 with different MD simulation lengths, from as short as 10 ns up to 50 ns, and considering two distinct Amber force field and water model combinations (ff14SB/TIP3P versus ff19SB/OPC). The new methodology can more effectively capture the distal mutations found in laboratory evolution, thus showcasing the efficacy of tAF2-MD-SPM in rapidly estimating enzyme dynamics and identifying the key conformationally relevant hotspots for computational enzyme engineering.
Affiliation(s)
- Cristina Duran
- Departament de Química, Institut de Química Computacional i Catàlisi, Universitat de Girona, c/Maria Aurèlia Capmany 69, 17003, Girona, Spain.
- Guillem Casadevall
- Departament de Química, Institut de Química Computacional i Catàlisi, Universitat de Girona, c/Maria Aurèlia Capmany 69, 17003, Girona, Spain.
- Sílvia Osuna
- Departament de Química, Institut de Química Computacional i Catàlisi, Universitat de Girona, c/Maria Aurèlia Capmany 69, 17003, Girona, Spain.
- ICREA, Pg. Lluís Companys 23, 08010, Barcelona, Spain

40
Gu X, Aranganathan A, Tiwary P. Empowering AlphaFold2 for protein conformation selective drug discovery with AlphaFold2-RAVE. eLife 2024; 13:RP99702. [PMID: 39240197 PMCID: PMC11379456 DOI: 10.7554/elife.99702] [Indexed: 09/07/2024] Open
Abstract
Small-molecule drug design hinges on obtaining co-crystallized ligand-protein structures. Despite AlphaFold2's strides in protein native structure prediction, its focus on apo structures overlooks ligands and associated holo structures. Moreover, designing selective drugs often benefits from the targeting of diverse metastable conformations. Therefore, direct application of AlphaFold2 models in virtual screening and drug discovery remains tentative. Here, we demonstrate an AlphaFold2-based framework combined with all-atom enhanced sampling molecular dynamics and Induced Fit docking, named AF2RAVE-Glide, to conduct computational model-based small-molecule binding of metastable protein kinase conformations, initiated from protein sequences. We demonstrate the AF2RAVE-Glide workflow on three different mammalian protein kinases and their type I and II inhibitors, with special emphasis on binding of known type II kinase inhibitors which target the metastable classical DFG-out state. These states are not easy to sample from AlphaFold2. Here, we demonstrate how with AF2RAVE these metastable conformations can be sampled for different kinases with high enough accuracy to enable subsequent docking of known type II kinase inhibitors with more than 50% success rates across docking calculations. We believe the protocol should be deployable for other kinases and more proteins generally.
Affiliation(s)
- Xinyu Gu
- Institute for Physical Science and Technology, University of Maryland, College Park, United States
- University of Maryland Institute for Health Computing, Bethesda, United States
- Akashnathan Aranganathan
- Institute for Physical Science and Technology, University of Maryland, College Park, United States
- Biophysics Program, University of Maryland, College Park, United States
- Pratyush Tiwary
- Institute for Physical Science and Technology, University of Maryland, College Park, United States
- University of Maryland Institute for Health Computing, Bethesda, United States
- Department of Chemistry and Biochemistry, University of Maryland, College Park, United States

41
Licht JA, Berry SP, Gutierrez MA, Gaudet R. They all rock: A systematic comparison of conformational movements in LeuT-fold transporters. Structure 2024; 32:1528-1543.e3. [PMID: 39025067 PMCID: PMC11380583 DOI: 10.1016/j.str.2024.06.015] [Received: 01/24/2024] [Revised: 05/30/2024] [Accepted: 06/21/2024] [Indexed: 07/20/2024]
Abstract
Many membrane transporters share the LeuT fold: two five-helix repeats inverted across the membrane plane. Despite hundreds of structures, whether distinct conformational mechanisms are supported by the LeuT fold has not been systematically determined. After annotating published LeuT-fold structures, we analyzed distance difference matrices (DDMs) for nine proteins with multiple available conformations. We identified rigid bodies and relative movements of transmembrane helices (TMs) during distinct steps of the transport cycle. In all transporters, the bundle (first two TMs of each repeat) rotates relative to the hash (third and fourth TMs). Motions of the arms (fifth TM) to close or open the intracellular and outer vestibules are common, as is a TM1a swing, with notable variations in the opening-closing motions of the outer vestibule. Our analyses suggest that LeuT-fold transporters layer distinct motions on a common bundle-hash rock and demonstrate that systematic analyses can provide new insights into large structural datasets.
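A distance difference matrix of the kind mentioned in the abstract can be sketched as follows. This is a generic illustration, not the authors' pipeline: the function names and the three-point toy coordinates are hypothetical, and real use would take matched Cα coordinates from two conformations of the same protein.

```python
import numpy as np


def distance_matrix(coords):
    """Pairwise Euclidean distances for an (N, 3) coordinate array."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def distance_difference_matrix(coords_a, coords_b):
    """DDM between two conformations with identical residue ordering.

    Entry (i, j) is d_B(i, j) - d_A(i, j); no structural superposition is needed
    because only internal distances are compared.
    """
    if coords_a.shape != coords_b.shape:
        raise ValueError("both conformations must contain the same residues")
    return distance_matrix(coords_b) - distance_matrix(coords_a)


# Toy coordinates (arbitrary units), invented for illustration:
# the third point moves away from the other two in state B.
state_a = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 4.0, 0.0]])
state_b = np.array([[0.0, 0.0, 0.0], [3.0, 0.0, 0.0], [0.0, 8.0, 0.0]])

ddm = distance_difference_matrix(state_a, state_b)
print(np.round(ddm, 2))
```

Blocks of near-zero entries in such a matrix indicate residues that keep their mutual distances, which is one way rigid bodies like the bundle and hash could be recognized.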
Affiliation(s)
- Jacob A Licht
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Samuel P Berry
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Michael A Gutierrez
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA
- Rachelle Gaudet
- Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA 02138, USA.

42
Liu ZH, Tsanai M, Zhang O, Forman-Kay J, Head-Gordon T. Computational Methods to Investigate Intrinsically Disordered Proteins and their Complexes. arXiv 2024:arXiv:2409.02240v1. [PMID: 39279844 PMCID: PMC11398552] [Indexed: 09/18/2024]
Abstract
In 1999 Wright and Dyson highlighted the fact that large sections of the proteome of all organisms are comprised of protein sequences that lack globular folded structures under physiological conditions. Since then the biophysics community has made significant strides in unraveling the intricate structural and dynamic characteristics of intrinsically disordered proteins (IDPs) and intrinsically disordered regions (IDRs). Unlike crystallographic beamlines and their role in streamlining acquisition of structures for folded proteins, an integrated experimental and computational approach aimed at IDPs/IDRs has emerged. In this Perspective we aim to provide a robust overview of current computational tools for IDPs and IDRs, and most recently their complexes and phase separated states, including statistical models, physics-based approaches, and machine learning methods that permit structural ensemble generation and validation against many solution experimental data types.
Affiliation(s)
- Zi Hao Liu
- Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Maria Tsanai
- Kenneth S. Pitzer Center for Theoretical Chemistry and Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Oufan Zhang
- Kenneth S. Pitzer Center for Theoretical Chemistry and Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Julie Forman-Kay
- Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Teresa Head-Gordon
- Kenneth S. Pitzer Center for Theoretical Chemistry and Department of Chemistry, University of California, Berkeley, Berkeley, California 94720, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, University of California, Berkeley, Berkeley, California 94720, USA

43
Rahimzadeh F, Mohammad Khanli L, Salehpoor P, Golabi F, PourBahrami S. Unveiling the evolution of policies for enhancing protein structure predictions: A comprehensive analysis. Comput Biol Med 2024; 179:108815. [PMID: 38986287 DOI: 10.1016/j.compbiomed.2024.108815] [Received: 03/04/2024] [Revised: 06/09/2024] [Accepted: 06/24/2024] [Indexed: 07/12/2024]
Abstract
Predicting protein structure is both fascinating and formidable, playing a crucial role in structure-based drug discovery and in unraveling diseases with elusive origins. The Critical Assessment of Protein Structure Prediction (CASP) serves as a biennial battleground where scientists from around the world converge to untangle the intricate relationships within amino acid chains. Two primary methods, Template-Based Modeling (TBM) and Template-Free (TF) strategies, dominate protein structure prediction. The trend has shifted towards Template-Free predictions due to their broader sequence coverage with fewer templates. The predictive process can be broadly classified into contact map, binned-distance, and real-valued distance predictions, each with distinctive strengths and limitations manifested through tailored loss functions. We have also introduced revolutionary end-to-end and all-atom diffusion-based techniques that have transformed protein structure prediction. Recent advancements in deep learning techniques have significantly improved prediction accuracy, although their effectiveness is contingent upon the quality of input features derived from natural bio-physicochemical attributes and Multiple Sequence Alignments (MSA). Hence, the generation of high-quality MSA data is of paramount importance in harnessing informative input features for enhanced prediction outcomes. Remarkable successes have been achieved in protein structure prediction accuracy, but they still fall short of what structural knowledge is intended for, which implies a need for development in other aspects of prediction. In this regard, scientists have opened other frontiers for protein structure prediction. The utilization of subsampling in multiple sequence alignment (MSA) and protein language modeling appears to be particularly promising in enhancing the accuracy and efficiency of predictions, ultimately aiding drug discovery efforts. The exploration of predicting protein complex structure also opens up exciting opportunities to deepen our knowledge of molecular interactions and to design more effective therapeutics. In this article, we have discussed the vicissitudes that scientists have gone through to improve prediction accuracy, and examined the effective policies in prediction from different aspects, including the construction of high-quality MSA, the provision of informative input features, and progress in deep learning approaches. We have also briefly touched upon the transition from predicting single-chain protein structures to predicting protein complex structures. Our findings point towards promoting open research environments to support the objectives of protein structure prediction.
Affiliation(s)
- Faezeh Rahimzadeh
- Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
| | | | - Pedram Salehpoor
- Faculty of Electrical and Computer Engineering, University of Tabriz, Tabriz, Iran
| | - Faegheh Golabi
- Department of Biomedical Engineering, Faculty of Advanced Medical Sciences, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Shahin PourBahrami
- Department of Computer Engineering, Technical and Vocational University (TVU), Tehran, Iran
| |
|
44
|
Ai H, Pan M, Liu L. Chemical Synthesis of Human Proteoforms and Application in Biomedicine. ACS CENTRAL SCIENCE 2024; 10:1442-1459. [PMID: 39220697 PMCID: PMC11363345 DOI: 10.1021/acscentsci.4c00642] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2024] [Revised: 07/04/2024] [Accepted: 07/05/2024] [Indexed: 09/04/2024]
Abstract
Limited understanding of human proteoforms with complex posttranslational modifications and the underlying mechanisms poses a major obstacle to research on human health and disease. This Outlook discusses opportunities and challenges of de novo chemical protein synthesis in human proteoform studies. Our analysis suggests that to develop a comprehensive, robust, and cost-effective methodology for chemical synthesis of various human proteoforms, new chemistries of the following types need to be developed: (1) easy-to-use peptide ligation chemistries allowing more efficient de novo synthesis of protein structural domains, (2) robust temporary structural support strategies for ligation and folding of challenging targets, and (3) efficient transpeptidative protein domain-domain ligation methods for multidomain proteins. Our analysis also indicates that accurate chemical synthesis of human proteoforms can be applied to the following aspects of biomedical research: (1) dissection and reconstitution of the proteoform interaction networks, (2) structural mechanism elucidation and functional analysis of human proteoform complexes, and (3) development and evaluation of drugs targeting human proteoforms. Overall, we suggest that through integrating chemical protein synthesis with in vivo functional analysis, mechanistic biochemistry, and drug development, synthetic chemistry would play a pivotal role in human proteoform research and facilitate the development of precision diagnostics and therapeutics.
Affiliation(s)
- Huasong Ai
- New Cornerstone Science Laboratory, Tsinghua-Peking Joint Center for Life Sciences, MOE Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology, Center for Synthetic and Systems Biology, Department of Chemistry, Tsinghua University, Beijing 100084, China
- Institute of Translational Medicine, School of Pharmacy, School of Chemistry and Chemical Engineering, National Center for Translational Medicine (Shanghai), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Man Pan
- Institute of Translational Medicine, School of Pharmacy, School of Chemistry and Chemical Engineering, National Center for Translational Medicine (Shanghai), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Lei Liu
- New Cornerstone Science Laboratory, Tsinghua-Peking Joint Center for Life Sciences, MOE Key Laboratory of Bioorganic Phosphorus Chemistry and Chemical Biology, Center for Synthetic and Systems Biology, Department of Chemistry, Tsinghua University, Beijing 100084, China
| |
|
45
|
Guan X, Tang QY, Ren W, Chen M, Wang W, Wolynes PG, Li W. Predicting protein conformational motions using energetic frustration analysis and AlphaFold2. Proc Natl Acad Sci U S A 2024; 121:e2410662121. [PMID: 39163334 PMCID: PMC11363347 DOI: 10.1073/pnas.2410662121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 07/16/2024] [Indexed: 08/22/2024] Open
Abstract
Proteins perform their biological functions through motion. Although high-throughput prediction of the three-dimensional static structures of proteins has proved feasible using deep-learning-based methods, predicting conformational motions remains a challenge. Purely data-driven machine learning methods have difficulty addressing such motions because available laboratory data on conformational motions are still limited. In this work, we develop a method for generating protein allosteric motions by integrating physical energy landscape information into deep-learning-based methods. We show that local energetic frustration, which quantifies the local features of the energy landscape governing protein allosteric dynamics, can be utilized to empower AlphaFold2 (AF2) to predict protein conformational motions. Starting from ground state static structures, this integrative method generates alternative structures as well as pathways of protein conformational motions, using a progressive enhancement of the energetic frustration features in the input multiple sequence alignment sequences. For the model protein adenylate kinase, we show that the generated conformational motions are consistent with available experimental and molecular dynamics simulation data. Applying the method to two other proteins, KaiB and ribose-binding protein, which involve large-amplitude conformational changes, also successfully generates the alternative conformations. We also show how to extract overall features of the AF2 energy landscape topography, which has been considered by many to be a black box. Incorporating physical knowledge into deep-learning-based structure prediction algorithms provides a useful strategy to address the challenges of dynamic structure prediction of allosteric proteins.
Affiliation(s)
- Xingyue Guan
- Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
| | - Qian-Yuan Tang
- Department of Physics, Hong Kong Baptist University, Kowloon Tong, Hong Kong Special Administrative Region 999077, China
| | - Weitong Ren
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
| | | | - Wei Wang
- Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
| | - Peter G. Wolynes
- Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
| | - Wenfei Li
- Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
- Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
| |
|
46
|
Bryant P, Noé F. Structure prediction of alternative protein conformations. Nat Commun 2024; 15:7328. [PMID: 39187507 PMCID: PMC11347660 DOI: 10.1038/s41467-024-51507-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 08/07/2024] [Indexed: 08/28/2024] Open
Abstract
Proteins are dynamic molecules whose movements result in different conformations with different functions. Neural networks such as AlphaFold2 can predict the structure of single-chain proteins with conformations most likely to exist in the PDB. However, almost all protein structures with multiple conformations represented in the PDB have been used while training these models. Therefore, it is unclear whether alternative protein conformations can be genuinely predicted using these networks, or if they are simply reproduced from memory. Here, we train a structure prediction network, Cfold, on a conformational split of the PDB to generate alternative conformations. Cfold enables efficient exploration of the conformational landscape of monomeric protein structures. Over 50% of experimentally known nonredundant alternative protein conformations evaluated here are predicted with high accuracy (TM-score > 0.8).
Affiliation(s)
- Patrick Bryant
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany.
- The Department of Molecular Biosciences, The Wenner-Gren Institute, Stockholm University, Svante Arrhenius väg 20C, 114 18, Stockholm, Sweden.
- Science for Life Laboratory, 172 21, Solna, Sweden.
| | - Frank Noé
- Department of Mathematics and Informatics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
| |
|
47
|
Chakravarty D, Schafer JW, Chen EA, Thole JF, Ronish LA, Lee M, Porter LL. AlphaFold predictions of fold-switched conformations are driven by structure memorization. Nat Commun 2024; 15:7296. [PMID: 39181864 PMCID: PMC11344769 DOI: 10.1038/s41467-024-51801-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 08/19/2024] [Indexed: 08/27/2024] Open
Abstract
Recent work suggests that AlphaFold (AF), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. We find that (1) AF is a weak predictor of fold switching and (2) some of its successes result from memorization of training-set structures rather than learned protein energetics. Combining >280,000 models from several implementations of AF2 and AF3, a 35% success rate was achieved for fold switchers likely in AF's training sets. AF2's confidence metrics selected against models consistent with experimentally determined fold-switching structures and failed to discriminate between low and high energy conformations. Further, AF captured only one out of seven experimentally confirmed fold switchers outside of its training sets despite extensive sampling of an additional ~280,000 models. Several observations indicate that AF2 has memorized structural information during training, and AF3 misassigns coevolutionary restraints. These limitations constrain the scope of successful predictions, highlighting the need for physically based methods that readily predict multiple protein conformations.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Joseph W Schafer
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Ethan A Chen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Joseph F Thole
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Leslie A Ronish
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Myeongsang Lee
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Lauren L Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA.
- Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
| |
|
48
|
Kovalevskiy O, Mateos-Garcia J, Tunyasuvunakool K. AlphaFold two years on: Validation and impact. Proc Natl Acad Sci U S A 2024; 121:e2315002121. [PMID: 39133843 PMCID: PMC11348012 DOI: 10.1073/pnas.2315002121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Two years on from the initial release of AlphaFold, we have seen its widespread adoption as a structure prediction tool. Here, we discuss some of the latest work based on AlphaFold, with a particular focus on its use within the structural biology community. This encompasses use cases like speeding up structure determination itself, enabling new computational studies, and building new tools and workflows. We also look at the ongoing validation of AlphaFold, as its predictions continue to be compared against large numbers of experimental structures to further delineate the model's capabilities and limitations.
|
49
|
Zhou J, Huang M. Navigating the landscape of enzyme design: from molecular simulations to machine learning. Chem Soc Rev 2024; 53:8202-8239. [PMID: 38990263 DOI: 10.1039/d4cs00196f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Global environmental issues and sustainable development call for new technologies for fine chemical synthesis and waste valorization. Biocatalysis has attracted great attention as an alternative to traditional organic synthesis. However, it is challenging to navigate the vast sequence space to identify proteins with desirable biocatalytic functions. The recent development of deep-learning-based structure prediction methods such as AlphaFold2, reinforced by different computational simulations or multiscale calculations, has greatly expanded the 3D structure databases and enabled structure-based design. While structure-based approaches shed light on site-specific enzyme engineering, they are not suitable for large-scale screening of potential biocatalysts. Effective utilization of big data using machine learning techniques opens up a new era of accelerated predictions. Here, we review the approaches and applications of structure-based and machine-learning-guided enzyme design. We also provide our view on the challenges and perspectives of effectively employing enzyme design approaches that integrate traditional molecular simulations and machine learning, and on the importance of database construction and algorithm development in attaining predictive ML models to explore the sequence fitness landscape for the design of desirable biocatalysts.
Affiliation(s)
- Jiahui Zhou
- School of Chemistry and Chemical Engineering, Queen's University, David Keir Building, Stranmillis Road, Belfast BT9 5AG, Northern Ireland, UK.
| | - Meilan Huang
- School of Chemistry and Chemical Engineering, Queen's University, David Keir Building, Stranmillis Road, Belfast BT9 5AG, Northern Ireland, UK.
| |
|
50
|
Correa Marrero M, Jänes J, Baptista D, Beltrao P. Integrating Large-Scale Protein Structure Prediction into Human Genetics Research. Annu Rev Genomics Hum Genet 2024; 25:123-140. [PMID: 38621234 DOI: 10.1146/annurev-genom-120622-020615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
The last five years have seen impressive progress in deep learning models applied to protein research. Most notably, sequence-based structure predictions have seen transformative gains in the form of AlphaFold2 and related approaches. Millions of missense protein variants in the human population lack annotations, and these computational methods are a valuable means to prioritize variants for further analysis. Here, we review the recent progress in deep learning models applied to the prediction of protein structure and protein variants, with particular emphasis on their implications for human genetics and health. Improved prediction of protein structures facilitates annotations of the impact of variants on protein stability, protein-protein interaction interfaces, and small-molecule binding pockets. Moreover, it contributes to the study of host-pathogen interactions and the characterization of protein function. As genome sequencing in large cohorts becomes increasingly prevalent, we believe that better integration of state-of-the-art protein informatics technologies into human genetics research is of paramount importance.
Affiliation(s)
- Miguel Correa Marrero
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
| | - Jürgen Jänes
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
| | | | - Pedro Beltrao
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland
| |
|