1
|
Zhang N, Sood D, Guo SC, Chen N, Antoszewski A, Marianchuk T, Chavan A, Dey S, Xiao Y, Hong L, Peng X, Baxa M, Partch C, Wang LP, Sosnick TR, Dinner AR, LiWang A. Temperature-Dependent Fold-Switching Mechanism of the Circadian Clock Protein KaiB. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.05.21.594594. [PMID: 38826295 PMCID: PMC11142059 DOI: 10.1101/2024.05.21.594594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
The oscillator of the cyanobacterial circadian clock relies on the ability of the KaiB protein to switch reversibly between a stable ground-state fold (gsKaiB) and an unstable fold-switched fold (fsKaiB). Rare fold-switching events by KaiB provide a critical delay in the negative feedback loop of this post-translational oscillator. In this study, we experimentally and computationally investigate the temperature dependence of fold switching and its mechanism. We demonstrate that the stability of gsKaiB increases with temperature compared to fsKaiB and that the Q10 value for the gsKaiB → fsKaiB transition is nearly three times smaller than that for the reverse transition. Simulations and native-state hydrogen-deuterium exchange NMR experiments suggest that fold switching can involve both subglobally and near-globally unfolded intermediates. The simulations predict that the transition state for fold switching coincides with isomerization of conserved prolines in the most rapidly exchanging region, and we confirm experimentally that proline isomerization is a rate-limiting step for fold switching. We explore the implications of our results for temperature compensation, a hallmark of circadian clocks, through a kinetic model.
|
2
|
Baxa MC, Lin X, Mukinay CD, Chakravarthy S, Sachleben JR, Antilla S, Hartrampf N, Riback JA, Gagnon IA, Pentelute BL, Clark PL, Sosnick TR. How hydrophobicity, side chains, and salt affect the dimensions of disordered proteins. Protein Sci 2024; 33:e4986. [PMID: 38607226 PMCID: PMC11010952 DOI: 10.1002/pro.4986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Revised: 03/13/2024] [Accepted: 03/26/2024] [Indexed: 04/13/2024]
Abstract
Despite the generally accepted role of the hydrophobic effect as the driving force for folding, many intrinsically disordered proteins (IDPs), including those with hydrophobic content typical of foldable proteins, behave nearly as self-avoiding random walks (SARWs) under physiological conditions. Here, we tested how temperature and ionic conditions influence the dimensions of the N-terminal domain of pertactin (PNt), an IDP with an amino acid composition typical of folded proteins. While PNt contracts somewhat with temperature, it nevertheless remains expanded over 10-58°C, with a Flory exponent, ν, >0.50. Both low and high ionic strength also produce contraction in PNt, but this contraction is mitigated by reducing charge segregation. With 46% glycine and low hydrophobicity, the reduced form of snow flea anti-freeze protein (red-sfAFP) is unaffected by temperature and ionic strength and persists as a near-SARW, ν ~ 0.54, arguing that the thermal contraction of PNt is due to stronger interactions between hydrophobic side chains. Additionally, red-sfAFP is a proxy for the polypeptide backbone, which has been thought to collapse in water. Increasing the glycine segregation in red-sfAFP had minimal effect on ν. Water remained a good solvent even with 21 consecutive glycine residues (ν > 0.5), and red-sfAFP variants lacked stable backbone hydrogen bonds according to hydrogen exchange. Similarly, changing glycine segregation has little impact on ν in other glycine-rich proteins. These findings underscore the generality that many disordered states can be expanded and unstructured, and that the hydrophobic effect alone is insufficient to drive significant chain collapse for typical protein sequences.
Affiliation(s)
- Michael C. Baxa
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, Illinois, USA
- Xiaoxuan Lin
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, Illinois, USA
- Cedrick D. Mukinay
  - Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, USA
- Srinivas Chakravarthy
  - Biophysics Collaborative Access Team (BioCAT), Center for Synchrotron Radiation Research and Instrumentation and Department of Biological and Chemical Sciences, Illinois Institute of Technology, Chicago, Illinois, USA
  - Present address: Cytiva, Fast Trak, Marlborough, MA, USA
- Sarah Antilla
  - Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Nina Hartrampf
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
  - Present address: Department of Chemistry, University of Zurich, Switzerland
- Joshua A. Riback
  - Graduate Program in Biophysical Science, University of Chicago, Chicago, Illinois, USA
  - Present address: Department of Molecular and Cellular Biology, Baylor College of Medicine, Houston, TX, USA
- Isabelle A. Gagnon
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, Illinois, USA
- Bradley L. Pentelute
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Patricia L. Clark
  - Department of Chemistry & Biochemistry, University of Notre Dame, Notre Dame, Indiana, USA
- Tobin R. Sosnick
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, Illinois, USA
|
3
|
Yao J, Hong H. Steric trapping strategy for studying the folding of helical membrane proteins. Methods 2024; 225:1-12. [PMID: 38428472 PMCID: PMC11107808 DOI: 10.1016/j.ymeth.2024.02.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Revised: 02/11/2024] [Accepted: 02/18/2024] [Indexed: 03/03/2024] Open
Abstract
Elucidating the folding energy landscape of membrane proteins is essential to understanding the proteins' stabilizing forces, folding mechanisms, biogenesis, and quality control. This is not a trivial task because the reversible control of folding is inherently difficult in a lipid bilayer environment. Recently, novel methods have been developed, each of which has a unique strength in investigating specific aspects of membrane protein folding. Among such methods, steric trapping is a versatile strategy allowing reversible control of membrane protein folding with minimal perturbation of native protein-water and protein-lipid interactions. In a nutshell, steric trapping exploits the coupling of spontaneous denaturation of a doubly biotinylated protein to the simultaneous binding of bulky monovalent streptavidin molecules. This strategy has evolved to investigate key elements of membrane protein folding such as thermodynamic stability, spontaneous denaturation rates, conformational features of the denatured states, and cooperativity of stabilizing interactions. In this review, we describe the critical methodological advances, limitations, and outlook of the steric trapping strategy.
Affiliation(s)
- Jiaqi Yao
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA
- Heedeok Hong
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824, USA; Department of Biochemistry & Molecular Biology, Michigan State University, East Lansing, MI 48824, USA.
|
4
|
Pirnia A, Maqdisi R, Mittal S, Sener M, Singharoy A. Perspective on Integrative Simulations of Bioenergetic Domains. J Phys Chem B 2024; 128:3302-3319. [PMID: 38562105 DOI: 10.1021/acs.jpcb.3c07335] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Bioenergetic processes in cells, such as photosynthesis or respiration, integrate many time and length scales, which makes the simulation of energy conversion with a mere single level of theory impossible. Just like the myriad of experimental techniques required to examine each level of organization, an array of overlapping computational techniques is necessary to model energy conversion. Here, a perspective is presented on recent efforts for modeling bioenergetic phenomena with a focus on molecular dynamics simulations and their variants as a primary method. An overview of the various classical, quantum mechanical, enhanced sampling, coarse-grained, Brownian dynamics, and Monte Carlo methods is presented. Example applications discussed include multiscale simulations of membrane-wide electron transport, rate kinetics of ATP turnover from electrochemical gradients, and finally, integrative modeling of the chromatophore, a photosynthetic pseudo-organelle.
Affiliation(s)
- Adam Pirnia
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287-1004, United States
- Ranel Maqdisi
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287-1004, United States
- Sumit Mittal
  - VIT Bhopal University, Sehore 466114, Madhya Pradesh, India
- Melih Sener
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287-1004, United States
  - Beckman Institute, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Abhishek Singharoy
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85287-1004, United States
|
5
|
Greener JG. Differentiable simulation to develop molecular dynamics force fields for disordered proteins. Chem Sci 2024; 15:4897-4909. [PMID: 38550690 PMCID: PMC10966991 DOI: 10.1039/d3sc05230c] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/03/2023] [Accepted: 02/08/2024] [Indexed: 11/11/2024] Open
Abstract
Implicit solvent force fields are computationally efficient but can be unsuitable for running molecular dynamics on disordered proteins. Here I improve the a99SB-disp force field and the GBNeck2 implicit solvent model to better describe disordered proteins. Differentiable molecular simulations with 5 ns trajectories are used to jointly optimise 108 parameters to better match explicit solvent trajectories. Simulations with the improved force field better reproduce the radius of gyration and secondary structure content seen in experiments, whilst showing slightly degraded performance on folded proteins and protein complexes. The force field, called GB99dms, reproduces the results of a small molecule binding study and improves agreement with experiment for the aggregation of amyloid peptides. GB99dms, which can be used in OpenMM, is available at https://github.com/greener-group/GB99dms. This work is the first to show that gradients can be obtained directly from nanosecond-length differentiable simulations of biomolecules and highlights the effectiveness of this approach to training whole force fields to match desired properties.
Affiliation(s)
- Joe G Greener
  - Medical Research Council Laboratory of Molecular Biology, Cambridge CB2 0QH, UK
|
6
|
Chen R, Glauninger H, Kahan DN, Shangguan J, Sachleben JR, Riback JA, Drummond DA, Sosnick TR. HDX-MS finds that partial unfolding with sequential domain activation controls condensation of a cellular stress marker. Proc Natl Acad Sci U S A 2024; 121:e2321606121. [PMID: 38513106 PMCID: PMC10990091 DOI: 10.1073/pnas.2321606121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/29/2024] [Indexed: 03/23/2024] Open
Abstract
Eukaryotic cells form condensates to sense and adapt to their environment [S. F. Banani, H. O. Lee, A. A. Hyman, M. K. Rosen, Nat. Rev. Mol. Cell Biol. 18, 285-298 (2017), H. Yoo, C. Triandafillou, D. A. Drummond, J. Biol. Chem. 294, 7151-7159 (2019)]. Poly(A)-binding protein (Pab1), a canonical stress granule marker, condenses upon heat shock or starvation, promoting adaptation [J. A. Riback et al., Cell 168, 1028-1040.e19 (2017)]. The molecular basis of condensation has remained elusive due to a dearth of techniques to probe structure directly in condensates. We apply hydrogen-deuterium exchange/mass spectrometry to investigate the mechanism of Pab1's condensation. Pab1's four RNA recognition motifs (RRMs) undergo different levels of partial unfolding upon condensation, and the changes are similar for thermal and pH stresses. Although structural heterogeneity is observed, the ability of MS to describe populations allows us to identify which regions contribute to the condensate's interaction network. Our data yield a picture of Pab1's stress-triggered condensation, which we term sequential activation (Fig. 1A), wherein each RRM becomes activated at a temperature where it partially unfolds and associates with other likewise activated RRMs to form the condensate. Subsequent association is dictated more by the underlying free energy surface than specific interactions, an effect we refer to as thermodynamic specificity. Our study represents an advance for elucidating the interactions that drive condensation. Furthermore, our findings demonstrate how condensation can use thermodynamic specificity to perform an acute response to multiple stresses, a potentially general mechanism for stress-responsive proteins.
Affiliation(s)
- Ruofan Chen
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637
- Hendrik Glauninger
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
  - Graduate Program in Biophysical Sciences, Division of Physical Sciences, University of Chicago, Chicago, IL 60637
- Darren N. Kahan
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
- Julia Shangguan
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
- Joshua A. Riback
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
  - Graduate Program in Biophysical Sciences, Division of Physical Sciences, University of Chicago, Chicago, IL 60637
- D. Allan Drummond
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
  - Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
- Tobin R. Sosnick
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, IL 60637
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL 60637
  - Institute for Biophysical Dynamics, University of Chicago, Chicago, IL 60637
|
7
|
Lin X, Haller PR, Bavi N, Faruk N, Perozo E, Sosnick TR. Folding of prestin's anion-binding site and the mechanism of outer hair cell electromotility. eLife 2023; 12:RP89635. [PMID: 38054956 PMCID: PMC10699807 DOI: 10.7554/elife.89635] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/07/2023] Open
Abstract
Prestin responds to transmembrane voltage fluctuations by changing its cross-sectional area, a process underlying the electromotility of outer hair cells and cochlear amplification. Prestin belongs to the SLC26 family of anion transporters yet is the only member capable of displaying electromotility. Prestin's voltage-dependent conformational changes are driven by the putative displacement of residue R399 and a set of sparse charged residues within the transmembrane domain, following the binding of a Cl- anion at a conserved binding site formed by the amino termini of the TM3 and TM10 helices. However, a major conundrum arises as to how an anion that binds in proximity to a positive charge (R399), can promote the voltage sensitivity of prestin. Using hydrogen-deuterium exchange mass spectrometry, we find that prestin displays an unstable anion-binding site, where folding of the amino termini of TM3 and TM10 is coupled to Cl- binding. This event shortens the TM3-TM10 electrostatic gap, thereby connecting the two helices, resulting in reduced cross-sectional area. These folding events upon anion binding are absent in SLC26A9, a non-electromotile transporter closely related to prestin. Dynamics of prestin embedded in a lipid bilayer closely match that in detergent micelle, except for a destabilized lipid-facing helix TM6 that is critical to prestin's mechanical expansion. We observe helix fraying at prestin's anion-binding site but cooperative unfolding of multiple lipid-facing helices, features that may promote prestin's fast electromechanical rearrangements. These results highlight a novel role of the folding equilibrium of the anion-binding site, and help define prestin's unique voltage-sensing mechanism and electromotility.
Affiliation(s)
- Xiaoxuan Lin
  - Center for Mechanical Excitability, The University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
- Patrick R Haller
  - Center for Mechanical Excitability, The University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
- Navid Bavi
  - Center for Mechanical Excitability, The University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
- Nabil Faruk
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
- Eduardo Perozo
  - Center for Mechanical Excitability, The University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
  - Institute for Neuroscience, The University of Chicago, Chicago, United States
  - Institute for Biophysical Dynamics, The University of Chicago, Chicago, United States
- Tobin R Sosnick
  - Center for Mechanical Excitability, The University of Chicago, Chicago, United States
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, United States
  - Institute for Biophysical Dynamics, The University of Chicago, Chicago, United States
  - Pritzker School for Molecular Engineering, The University of Chicago, Chicago, United States
|
8
|
Lin X, Haller P, Bavi N, Faruk N, Perozo E, Sosnick TR. Folding of Prestin's Anion-Binding Site and the Mechanism of Outer Hair Cell Electromotility. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.02.27.530320. [PMID: 36909622 PMCID: PMC10002659 DOI: 10.1101/2023.02.27.530320] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/04/2023]
Abstract
Prestin responds to transmembrane voltage fluctuations by changing its cross-sectional area, a process underlying the electromotility of outer hair cells and cochlear amplification. Prestin belongs to the SLC26 family of anion transporters yet is the only member capable of displaying electromotility. Prestin's voltage-dependent conformational changes are driven by the putative displacement of residue R399 and a set of sparse charged residues within the transmembrane domain, following the binding of a Cl- anion at a conserved binding site formed by the amino termini of the TM3 and TM10 helices. However, a major conundrum arises as to how an anion that binds in proximity to a positive charge (R399), can promote the voltage sensitivity of prestin. Using hydrogen-deuterium exchange mass spectrometry, we find that prestin displays an unstable anion-binding site, where folding of the amino termini of TM3 and TM10 is coupled to Cl- binding. This event shortens the TM3-TM10 electrostatic gap, thereby connecting the two helices, resulting in reduced cross-sectional area. These folding events upon anion binding are absent in SLC26A9, a non-electromotile transporter closely related to prestin. Dynamics of prestin embedded in a lipid bilayer closely match that in detergent micelle, except for a destabilized lipid-facing helix TM6 that is critical to prestin's mechanical expansion. We observe helix fraying at prestin's anion-binding site but cooperative unfolding of multiple lipid-facing helices, features that may promote prestin's fast electromechanical rearrangements. These results highlight a novel role of the folding equilibrium of the anion-binding site, and help define prestin's unique voltage-sensing mechanism and electromotility.
|
9
|
Sosnick TR. AlphaFold developers Demis Hassabis and John Jumper share the 2023 Albert Lasker Basic Medical Research Award. J Clin Invest 2023:e174915. [PMID: 37731359 DOI: 10.1172/jci174915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023] Open
|
10
|
Hellemann E, Durrant JD. Worth the Weight: Sub-Pocket EXplorer (SubPEx), a Weighted Ensemble Method to Enhance Binding-Pocket Conformational Sampling. J Chem Theory Comput 2023; 19:5677-5689. [PMID: 37585617 PMCID: PMC10500992 DOI: 10.1021/acs.jctc.3c00478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Indexed: 08/18/2023]
Abstract
Structure-based virtual screening (VS) is an effective method for identifying potential small-molecule ligands, but traditional VS approaches consider only a single binding-pocket conformation. Consequently, they struggle to identify ligands that bind to alternate conformations. Ensemble docking helps address this issue by incorporating multiple conformations into the docking process, but it depends on methods that can thoroughly explore pocket flexibility. We here introduce Sub-Pocket EXplorer (SubPEx), an approach that uses weighted ensemble (WE) path sampling to accelerate binding-pocket sampling. As proof of principle, we apply SubPEx to three proteins relevant to drug discovery: heat shock protein 90, influenza neuraminidase, and yeast hexokinase 2. SubPEx is available free of charge without registration under the terms of the open-source MIT license: http://durrantlab.com/subpex/.
Affiliation(s)
- Erich Hellemann
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Jacob D. Durrant
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
|
11
|
Strahan J, Finkel J, Dinner AR, Weare J. Predicting rare events using neural networks and short-trajectory data. JOURNAL OF COMPUTATIONAL PHYSICS 2023; 488:112152. [PMID: 37332834 PMCID: PMC10270692 DOI: 10.1016/j.jcp.2023.112152] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/20/2023]
Abstract
Estimating the likelihood, timing, and nature of events is a major goal of modeling stochastic dynamical systems. When the event is rare in comparison with the timescales of simulation and/or measurement needed to resolve the elemental dynamics, accurate prediction from direct observations becomes challenging. In such cases a more effective approach is to cast statistics of interest as solutions to Feynman-Kac equations (partial differential equations). Here, we develop an approach to solve Feynman-Kac equations by training neural networks on short-trajectory data. Our approach is based on a Markov approximation but otherwise avoids assumptions about the underlying model and dynamics. This makes it applicable to treating complex computational models and observational data. We illustrate the advantages of our method using a low-dimensional model that facilitates visualization, and this analysis motivates an adaptive sampling strategy that allows on-the-fly identification of and addition of data to regions important for predicting the statistics of interest. Finally, we demonstrate that we can compute accurate statistics for a 75-dimensional model of sudden stratospheric warming. This system provides a stringent test bed for our method.
Affiliation(s)
- John Strahan
  - Department of Chemistry and James Franck Institute, the University of Chicago, Chicago, IL 60637
- Justin Finkel
  - Department of Earth, Atmospheric, and Planetary Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, the University of Chicago, Chicago, IL 60637
  - Committee on Computational and Applied Mathematics, the University of Chicago, Chicago, IL 60637
- Jonathan Weare
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012
|
12
|
Kandathil SM, Lau AM, Jones DT. Machine learning methods for predicting protein structure from single sequences. Curr Opin Struct Biol 2023; 81:102627. [PMID: 37320955 DOI: 10.1016/j.sbi.2023.102627] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/25/2023] [Revised: 05/17/2023] [Accepted: 05/17/2023] [Indexed: 06/17/2023]
Abstract
Recent breakthroughs in protein structure prediction have increasingly relied on the use of deep neural networks. These recent methods are notable in that they produce 3-D atomic coordinates as a direct output of the networks, a feature which presents many advantages. Although most techniques of this type make use of multiple sequence alignments as their primary input, a new wave of methods have attempted to use just single sequences as the input. We discuss the make-up and operating principles of these models, and highlight new developments in these areas, as well as areas for future development.
Affiliation(s)
- Shaun M Kandathil
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
- Andy M Lau
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom
- David T Jones
  - Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, United Kingdom.
|
13
|
Factors That Control the Force Needed to Unfold a Membrane Protein in Silico Depend on the Mode of Denaturation. Int J Mol Sci 2023; 24:ijms24032654. [PMID: 36768981 PMCID: PMC9917119 DOI: 10.3390/ijms24032654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 01/23/2023] [Accepted: 01/24/2023] [Indexed: 02/01/2023] Open
Abstract
Single-molecule force spectroscopy methods, such as AFM and magnetic tweezers, have proved extremely beneficial in elucidating folding pathways for soluble and membrane proteins. To identify factors that determine the force rupture levels in force-induced membrane protein unfolding, we applied our near-atomic-level Upside molecular dynamics package to study the vertical and lateral pulling of bacteriorhodopsin (bR) and GlpG, respectively. With our algorithm, we were able to selectively alter the magnitudes of individual interaction terms and identify that, for vertical pulling, hydrogen bond strength had the strongest effect, whereas other non-bonded protein and membrane-protein interactions had only moderate influences, except for the extraction of the last helix where the membrane-protein interactions had a stronger influence. The up-down topology of the transmembrane helices caused helices to be pulled out as pairs. The rate-limiting rupture event often was the loss of H-bonds and the ejection of the first helix, which then propagated tension to the second helix, which rapidly exited the bilayer. The pulling of the charged linkers across the membrane had minimal influence, as did changing the bilayer thickness. For the lateral pulling of GlpG, the rate-limiting rupture corresponded to the separation of the helices within the membrane, with the H-bonds generally being broken only afterward. Beyond providing a detailed picture of the rupture events, our study emphasizes that the pulling mode greatly affects the factors that determine the forces needed to unfold a membrane protein.
|
14
|
The protein folding rate and the geometry and topology of the native state. Sci Rep 2022; 12:6384. [PMID: 35430582 PMCID: PMC9013383 DOI: 10.1038/s41598-022-09924-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Accepted: 03/21/2022] [Indexed: 11/08/2022] Open
Abstract
Proteins fold into 3-dimensional conformations that are important for their function. Characterizing the global conformation of proteins rigorously and separating secondary structure effects from topological effects is a challenge. New developments in applied knot theory make it possible to characterize the topological characteristics of proteins (knotted or not). By analyzing a small set of two-state and multi-state proteins with no knots or slipknots, our results show that 95.4% of the analyzed proteins have non-trivial topological characteristics, as reflected by the second Vassiliev measure, and that the logarithm of the experimental protein folding rate depends on both the local geometry and the topology of the protein’s native state.
|
15
|
Faruk NF, Peng X, Freed KF, Roux B, Sosnick TR. Challenges and Advantages of Accounting for Backbone Flexibility in Prediction of Protein-Protein Complexes. J Chem Theory Comput 2022; 18:2016-2032. [PMID: 35213808 DOI: 10.1021/acs.jctc.1c01255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
Predicting protein binding is a core problem of computational biophysics. That this objective can be partly achieved with some amount of success using docking algorithms based on rigid protein models is remarkable, although going further requires allowing for protein flexibility. However, accurately capturing the conformational changes upon binding remains an enduring challenge for docking algorithms. Here, we adapt our Upside folding model, where side chains are represented as multi-position beads, to explore how flexibility may impact predictions of protein-protein complexes. Specifically, the Upside model is used to investigate where backbone flexibility helps, which types of interactions are important, and what is the impact of coarse graining. These efforts also shed light on the relative challenges posed by folding and docking. After training the Upside energy function for docking, the model is competitive with the established all-atom methods. However, allowing for backbone flexibility during docking is generally detrimental, as the presence of comparatively minor (3-5 Å) deviations relative to the docked structure has a large negative effect on performance. While this issue appears to be inherent to current forcefield-guided flexible docking methods, systems involving the co-folding of flexible loops such as antibody-antigen complexes represent an interesting exception. In this case, binding is improved when backbone flexibility is allowed using the Upside model.
Affiliation(s)
- Nabil F Faruk
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, Illinois 60637, United States
- Xiangda Peng
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
- Karl F Freed
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Benoît Roux
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Tobin R Sosnick
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
|
16
|
Gaffney KA, Guo R, Bridges MD, Muhammednazaar S, Chen D, Kim M, Yang Z, Schilmiller AL, Faruk NF, Peng X, Jones AD, Kim KH, Sun L, Hubbell WL, Sosnick TR, Hong H. Lipid bilayer induces contraction of the denatured state ensemble of a helical-bundle membrane protein. Proc Natl Acad Sci U S A 2022; 119:e2109169119. [PMID: 34969836 PMCID: PMC8740594 DOI: 10.1073/pnas.2109169119] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Accepted: 11/17/2021] [Indexed: 12/19/2022] Open
Abstract
Defining the denatured state ensemble (DSE) and disordered proteins is essential to understanding folding, chaperone action, degradation, and translocation. As compared with water-soluble proteins, the DSE of membrane proteins is much less characterized. Here, we measure the DSE of the helical membrane protein GlpG of Escherichia coli (E. coli) in native-like lipid bilayers. The DSE was obtained using our steric trapping method, which couples denaturation of doubly biotinylated GlpG to binding of two streptavidin molecules. The helices and loops are probed using limited proteolysis and mass spectrometry, while the dimensions are determined using our paramagnetic biotin derivative and double electron-electron resonance spectroscopy. These data, along with our Upside simulations, identify the DSE as being highly dynamic, involving the topology changes and unfolding of some of the transmembrane (TM) helices. The DSE is expanded relative to the native state but only to 15 to 75% of the fully expanded condition. The degree of expansion depends on the local protein packing and the lipid composition. E. coli's lipid bilayer promotes the association of TM helices in the DSE and, probably in general, facilitates interhelical interactions. This tendency may be the outcome of a general lipophobic effect of proteins within the cell membranes.
Affiliation(s)
- Kristen A Gaffney
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824
- Ruiqiong Guo
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824
- Michael D Bridges
  - Jules Stein Eye Institute, University of California, Los Angeles, CA 90095
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095
- Daoyang Chen
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824
- Miyeon Kim
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824
- Zhongyu Yang
  - Department of Chemistry and Biochemistry, North Dakota State University, Fargo, ND 58108
- Anthony L Schilmiller
  - Research Technology Support Facility Mass Spectrometry and Metabolomics Core, Michigan State University, East Lansing, MI 48824
- Nabil F Faruk
  - Graduate Program in Biophysical Sciences, The University of Chicago, Chicago, IL 60637
- Xiangda Peng
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, IL 60637
  - Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637
- A Daniel Jones
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824
  - Research Technology Support Facility Mass Spectrometry and Metabolomics Core, Michigan State University, East Lansing, MI 48824
- Kelly H Kim
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824
- Liangliang Sun
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824
- Wayne L Hubbell
  - Jules Stein Eye Institute, University of California, Los Angeles, CA 90095
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095
- Tobin R Sosnick
  - Department of Biochemistry & Molecular Biology, The University of Chicago, Chicago, IL 60637
  - Institute for Biophysical Dynamics, The University of Chicago, Chicago, IL 60637
- Heedeok Hong
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824
  - Department of Chemistry, Michigan State University, East Lansing, MI 48824
|
17
|
Baxa MC, Sosnick TR. Engineered Metal-Binding Sites to Probe Protein Folding Transition States: Psi Analysis. Methods Mol Biol 2022; 2376:31-63. [PMID: 34845602 DOI: 10.1007/978-1-0716-1716-8_2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
The formation of the transition state ensemble (TSE) represents the rate-limiting step in protein folding. The TSE is the least populated state on the pathway, and its characterization remains a challenge. Properties of the TSE can be inferred from the effects on folding and unfolding rates for various perturbations. A difficulty remains on how to translate these kinetic effects to structural properties of the TSE. Several factors can obscure the translation of point mutations in the frequently used method, "mutational Phi analysis." We take a complementary approach in "Psi analysis," employing rationally inserted metal binding sites designed to probe pairwise contacts in the TSE. These contacts can be confidently identified and used to construct structural models of the TSE. The method has been applied to multiple proteins and consistently produces a considerably more structured and native-like TSE than Phi analysis. This difference has significant implications to our understanding of protein folding mechanisms. Here we describe the application of the method and discuss how it can be used to study other conformational transitions such as binding.
Affiliation(s)
- Michael C Baxa
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
- Tobin R Sosnick
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
|
18
|
Greener JG, Kandathil SM, Moffat L, Jones DT. A guide to machine learning for biologists. Nat Rev Mol Cell Biol 2022; 23:40-55. [PMID: 34518686 DOI: 10.1038/s41580-021-00407-0] [Citation(s) in RCA: 593] [Impact Index Per Article: 296.5] [Accepted: 07/23/2021] [Indexed: 02/08/2023]
Abstract
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.
Affiliation(s)
- Joe G Greener
  - Department of Computer Science, University College London, London, UK
- Shaun M Kandathil
  - Department of Computer Science, University College London, London, UK
- Lewis Moffat
  - Department of Computer Science, University College London, London, UK
- David T Jones
  - Department of Computer Science, University College London, London, UK
|
19
|
Peng X, Baxa M, Faruk N, Sachleben JR, Pintscher S, Gagnon IA, Houliston S, Arrowsmith CH, Freed KF, Rocklin GJ, Sosnick TR. Prediction and Validation of a Protein's Free Energy Surface Using Hydrogen Exchange and (Importantly) Its Denaturant Dependence. J Chem Theory Comput 2021; 18:550-561. [PMID: 34936354 PMCID: PMC8757463 DOI: 10.1021/acs.jctc.1c00960] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
The denaturant dependence of hydrogen-deuterium exchange (HDX) is a powerful measurement to identify the breaking of individual H-bonds and map the free energy surface (FES) of a protein including the very rare states. Molecular dynamics (MD) can identify each partial unfolding event with atomic-level resolution. Hence, their combination provides a great opportunity to test the accuracy of simulations and to verify the interpretation of HDX data. For this comparison, we use Upside, our new and extremely fast MD package that is capable of folding proteins with an accuracy comparable to that of all-atom methods. The FESs of two naturally occurring and two designed proteins are so generated and compared to our NMR/HDX data. We find that Upside's accuracy is considerably improved upon modifying the energy function using a new machine-learning procedure that trains for proper protein behavior including realistic denatured states in addition to stable native states. The resulting increase in cooperativity is critical for replicating the HDX data and protein stability, indicating that we have properly encoded the underlying physiochemical interactions into an MD package. We did observe some mismatch, however, underscoring the ongoing challenges faced by simulations in calculating accurate FESs. Nevertheless, our ensembles can identify the properties of the fluctuations that lead to HDX, whether they be small-, medium-, or large-scale openings, and can speak to the breadth of the native ensemble that has been a matter of debate.
Affiliation(s)
- Xiangda Peng
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
- Michael Baxa
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
- Nabil Faruk
  - Graduate Program in Biophysical Sciences, University of Chicago, Chicago, Illinois 60637, United States
- Joseph R Sachleben
  - Division of Biological Sciences, University of Chicago, Chicago, Illinois 60637, United States
- Sebastian Pintscher
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
  - Department of Molecular Biophysics, Faculty of Biochemistry, Biophysics and Biotechnology, Jagiellonian University, Kraków 30387, Poland
- Isabelle A Gagnon
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
- Scott Houliston
  - Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L7, Canada
  - Princess Margaret Cancer Centre and Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 2M9, Canada
- Cheryl H Arrowsmith
  - Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L7, Canada
  - Princess Margaret Cancer Centre and Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 2M9, Canada
- Karl F Freed
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Gabriel J Rocklin
  - Department of Pharmacology & Center for Synthetic Biology, Northwestern University, Chicago, Illinois 60614, United States
- Tobin R Sosnick
  - Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois 60637, United States
|
20
|
Ovchinnikov S, Huang PS. Structure-based protein design with deep learning. Curr Opin Chem Biol 2021; 65:136-144. [PMID: 34547592 PMCID: PMC8671290 DOI: 10.1016/j.cbpa.2021.08.004] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 06/07/2021] [Accepted: 08/13/2021] [Indexed: 12/11/2022]
Abstract
Since the first revelation of proteins functioning as macromolecular machines through their three dimensional structures, researchers have been intrigued by the marvelous ways the biochemical processes are carried out by proteins. The aspiration to understand protein structures has fueled extensive efforts across different scientific disciplines. In recent years, it has been demonstrated that proteins with new functionality or shapes can be designed via structure-based modeling methods, and the design strategies have combined all available information - but largely piece-by-piece - from sequence derived statistics to the detailed atomic-level modeling of chemical interactions. Despite the significant progress, incorporating data-derived approaches through the use of deep learning methods can be a game changer. In this review, we summarize current progress, compare the arc of developing the deep learning approaches with the conventional methods, and describe the motivation and concepts behind current strategies that may lead to potential future opportunities.
Affiliation(s)
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellowship Program, Harvard University, Cambridge, MA, 02138, USA
- Po-Ssu Huang
  - Department of Bioengineering, Stanford University, Stanford, CA, 94305, USA
|
21
|
AlQuraishi M, Sorger PK. Differentiable biology: using deep learning for biophysics-based and data-driven modeling of molecular mechanisms. Nat Methods 2021; 18:1169-1180. [PMID: 34608321 PMCID: PMC8793939 DOI: 10.1038/s41592-021-01283-4] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/13/2021] [Accepted: 08/27/2021] [Indexed: 02/08/2023]
Abstract
Deep learning using neural networks relies on a class of machine-learnable models constructed using 'differentiable programs'. These programs can combine mathematical equations specific to a particular domain of natural science with general-purpose, machine-learnable components trained on experimental data. Such programs are having a growing impact on molecular and cellular biology. In this Perspective, we describe an emerging 'differentiable biology' in which phenomena ranging from the small and specific (for example, one experimental assay) to the broad and complex (for example, protein folding) can be modeled effectively and efficiently, often by exploiting knowledge about basic natural phenomena to overcome the limitations of sparse, incomplete and noisy data. By distilling differentiable biology into a small set of conceptual primitives and illustrative vignettes, we show how it can help to address long-standing challenges in integrating multimodal data from diverse experiments across biological scales. This promises to benefit fields as diverse as biophysics and functional genomics.
Affiliation(s)
- Mohammed AlQuraishi
  - Department of Systems Biology, Columbia University, New York, NY, USA
  - Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School, Boston, MA, USA
- Peter K Sorger
  - Laboratory of Systems Pharmacology, Department of Systems Biology, Harvard Medical School, Boston, MA, USA
|
22
|
Peter EK, Manstein DJ, Shea JE, Schug A. CORE-MD II: A fast, adaptive, and accurate enhanced sampling method. J Chem Phys 2021; 155:104114. [PMID: 34525829 DOI: 10.1063/5.0063664] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
In this paper, we present a fast and adaptive correlation guided enhanced sampling method (CORE-MD II). The CORE-MD II technique relies, in part, on partitioning of the entire pathway into short trajectories that we refer to as instances. The sampling within each instance is accelerated by adaptive path-dependent metadynamics simulations. The second part of this approach involves kinetic Monte Carlo (kMC) sampling between the different states that have been accessed during each instance. Through the combination of the partition of the total simulation into short non-equilibrium simulations and the kMC sampling, the CORE-MD II method is capable of sampling protein folding without any a priori definitions of reaction pathways and additional parameters. In the validation simulations, we applied the CORE-MD II on the dialanine peptide and the folding of two peptides: TrpCage and TrpZip2. In a comparison with long time equilibrium Molecular Dynamics (MD), 1 µs replica exchange MD (REMD), and CORE-MD I simulations, we find that the level of convergence of the CORE-MD II method is improved by a factor of 8.8, while the CORE-MD II method reaches acceleration factors of ∼120. In the CORE-MD II simulation of TrpZip2, we observe the formation of the native state in contrast to the REMD and the CORE-MD I simulations. The method is broadly applicable for MD simulations and is not restricted to simulations of protein folding or even biomolecules but also applicable to simulations of protein aggregation, protein signaling, or even materials science simulations.
Affiliation(s)
- Emanuel K Peter
  - Institute for Biophysical Chemistry, Fritz-Hartmann-Centre for Medical Research, Hannover Medical School, Carl-Neuberg-Str. 1, Hannover 30625, Germany
- Dietmar J Manstein
  - Institute for Biophysical Chemistry, Fritz-Hartmann-Centre for Medical Research, Hannover Medical School, Carl-Neuberg-Str. 1, Hannover 30625, Germany
- Joan-Emma Shea
  - Department of Chemistry and Biochemistry, Department of Physics, University of California, Santa Barbara, California 93106, USA
- Alexander Schug
  - John von Neumann Institute for Computing and Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich, 52425 Jülich, Germany
|
23
|
Laine E, Eismann S, Elofsson A, Grudinin S. Protein sequence-to-structure learning: Is this the end(-to-end revolution)? Proteins 2021; 89:1770-1786. [PMID: 34519095 DOI: 10.1002/prot.26235] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/05/2021] [Revised: 08/16/2021] [Accepted: 09/03/2021] [Indexed: 01/08/2023]
Abstract
The potential of deep learning has been recognized in the protein structure prediction community for some time, and became indisputable after CASP13. In CASP14, deep learning has boosted the field to unanticipated levels reaching near-experimental accuracy. This success comes from advances transferred from other machine learning areas, as well as methods specifically designed to deal with protein sequences and structures, and their abstractions. Novel emerging approaches include (i) geometric learning, that is, learning on representations such as graphs, three-dimensional (3D) Voronoi tessellations, and point clouds; (ii) pretrained protein language models leveraging attention; (iii) equivariant architectures preserving the symmetry of 3D space; (iv) use of large meta-genome databases; (v) combinations of protein representations; and (vi) finally truly end-to-end architectures, that is, differentiable models starting from a sequence and returning a 3D structure. Here, we provide an overview and our opinion of the novel deep learning approaches developed in the last 2 years and widely used in CASP14.
Affiliation(s)
- Elodie Laine
  - Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
- Stephan Eismann
  - Department of Computer Science and Applied Physics, Stanford University, Stanford, California, USA
- Arne Elofsson
  - Department of Biochemistry and Biophysics and Science for Life Laboratory, Stockholm University, Solna, Sweden
- Sergei Grudinin
  - Univ. Grenoble Alpes, CNRS, Grenoble INP, LJK, Grenoble, France
|
24
|
Greener JG, Jones DT. Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins. PLoS One 2021; 16:e0256990. [PMID: 34473813 PMCID: PMC8412298 DOI: 10.1371/journal.pone.0256990] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Accepted: 08/19/2021] [Indexed: 11/26/2022] Open
Abstract
Finding optimal parameters for force fields used in molecular simulation is a challenging and time-consuming task, partly due to the difficulty of tuning multiple parameters at once. Automatic differentiation presents a general solution: run a simulation, obtain gradients of a loss function with respect to all the parameters, and use these to improve the force field. This approach takes advantage of the deep learning revolution whilst retaining the interpretability and efficiency of existing force fields. We demonstrate that this is possible by parameterising a simple coarse-grained force field for proteins, based on training simulations of up to 2,000 steps learning to keep the native structure stable. The learned potential matches chemical knowledge and PDB data, can fold and reproduce the dynamics of small proteins, and shows ability in protein design and model scoring applications. Problems in applying differentiable molecular simulation to all-atom models of proteins are discussed along with possible solutions and the variety of available loss functions. The learned potential, simulation scripts and training code are made available at https://github.com/psipred/cgdms.
Affiliation(s)
- Joe G. Greener
  - Department of Computer Science, University College London, London, United Kingdom
- David T. Jones
  - Department of Computer Science, University College London, London, United Kingdom
|
25
|
|
26
|
Machine learning in protein structure prediction. Curr Opin Chem Biol 2021; 65:1-8. [PMID: 34015749 DOI: 10.1016/j.cbpa.2021.04.005] [Citation(s) in RCA: 102] [Impact Index Per Article: 34.0] [Received: 03/31/2021] [Accepted: 04/10/2021] [Indexed: 12/31/2022]
Abstract
Prediction of protein structure from sequence has been intensely studied for many decades, owing to the problem's importance and its uniquely well-defined physical and computational bases. While progress has historically ebbed and flowed, the past two years saw dramatic advances driven by the increasing "neuralization" of structure prediction pipelines, whereby computations previously based on energy models and sampling procedures are replaced by neural networks. The extraction of physical contacts from the evolutionary record; the distillation of sequence-structure patterns from known structures; the incorporation of templates from homologs in the Protein Databank; and the refinement of coarsely predicted structures into finely resolved ones have all been reformulated using neural networks. Cumulatively, this transformation has resulted in algorithms that can now predict single protein domains with a median accuracy of 2.1 Å, setting the stage for a foundational reconfiguration of the role of biomolecular modeling within the life sciences.
|
27
|
Peter EK, Shea JE, Schug A. CORE-MD, a path correlated molecular dynamics simulation method. J Chem Phys 2020; 153:084114. [DOI: 10.1063/5.0015398] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022] Open
Affiliation(s)
- Emanuel K. Peter
  - John von Neumann Institute for Computing and Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich, Germany
- Joan-Emma Shea
  - Department of Chemistry and Biochemistry, Department of Physics, University of California, Santa Barbara, Santa Barbara, California 93106, USA
- Alexander Schug
  - John von Neumann Institute for Computing and Jülich Supercomputing Centre, Institute for Advanced Simulation, Forschungszentrum Jülich, Jülich, Germany
  - Faculty of Biology, University of Duisburg-Essen, Duisburg, Germany
|
28
|
Abstract
Proteins are molecular machines whose function depends on their ability to achieve complex folds with precisely defined structural and dynamic properties. The rational design of proteins from first-principles, or de novo, was once considered to be impossible, but today proteins with a variety of folds and functions have been realized. We review the evolution of the field from its earliest days, placing particular emphasis on how this endeavor has illuminated our understanding of the principles underlying the folding and function of natural proteins, and is informing the design of macromolecules with unprecedented structures and properties. An initial set of milestones in de novo protein design focused on the construction of sequences that folded in water and membranes to adopt folded conformations. The first proteins were designed from first-principles using very simple physical models. As computers became more powerful, the use of the rotamer approximation allowed one to discover amino acid sequences that stabilize the desired fold. As the crystallographic database of protein structures expanded in subsequent years, it became possible to construct proteins by assembling short backbone fragments that frequently recur in Nature. The second set of milestones in de novo design involves the discovery of complex functions. Proteins have been designed to bind a variety of metals, porphyrins, and other cofactors. The design of proteins that catalyze hydrolysis and oxygen-dependent reactions has progressed significantly. However, de novo design of catalysts for energetically demanding reactions, or even proteins that bind with high affinity and specificity to highly functionalized complex polar molecules, remains an important challenge that is now being achieved. Finally, protein design has contributed significantly to our understanding of membrane protein folding and transport of ions across membranes. The area of membrane protein design, or more generally of biomimetic polymers that function in mixed or non-aqueous environments, is now becoming increasingly possible.
|
29
|
Wang Z, Jumper JM, Freed KF, Sosnick TR. On the Interpretation of Force-Induced Unfolding Studies of Membrane Proteins Using Fast Simulations. Biophys J 2019; 117:1429-1441. [PMID: 31587831 DOI: 10.1016/j.bpj.2019.09.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/02/2019] [Revised: 08/25/2019] [Accepted: 09/12/2019] [Indexed: 11/25/2022] Open
Abstract
Single-molecule force spectroscopy has proven extremely beneficial in elucidating folding pathways for membrane proteins. Here, we simulate these measurements, conducting hundreds of unfolding trajectories using our fast Upside algorithm for slow enough speeds to reproduce key experimental features that may be missed using all-atom methods. The speed also enables us to determine the logarithmic dependence of pulling velocities on the rupture levels to better compare to experimental values. For simulations of atomic force microscope measurements in which force is applied vertically to the C-terminus of bacteriorhodopsin, we reproduce the major experimental features including even the back-and-forth unfolding of single helical turns. When pulling laterally on GlpG to mimic the experiment, we observe quite different behavior depending on the stiffness of the spring. With a soft spring, as used in the experimental studies with magnetic tweezers, the force remains nearly constant after the initial unfolding event, and a few pathways and a high degree of cooperativity are observed in both the experiment and simulation. With a stiff spring, however, the force drops to near zero after each major unfolding event, and numerous intermediates are observed along a wide variety of pathways. Hence, the mode of force application significantly alters the perception of the folding landscape, including the number of intermediates and the degree of folding cooperativity, important issues that should be considered when designing experiments and interpreting unfolding data.
Affiliation(s)
- Zongan Wang
  - Department of Chemistry, James Franck Institute, The University of Chicago, Chicago, Illinois
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois
- John M Jumper
  - Department of Chemistry, James Franck Institute, The University of Chicago, Chicago, Illinois
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois
- Karl F Freed
  - Department of Chemistry, James Franck Institute, The University of Chicago, Chicago, Illinois
- Tobin R Sosnick
  - Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois
  - Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois
|
30
|
Commonly used FRET fluorophores promote collapse of an otherwise disordered protein. Proc Natl Acad Sci U S A 2019; 116:8889-8894. [PMID: 30992378 DOI: 10.1073/pnas.1813038116] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Indexed: 12/21/2022] Open
Abstract
The dimensions that unfolded proteins, including intrinsically disordered proteins (IDPs), adopt in the absence of denaturant remain controversial. We developed an analysis procedure for small-angle X-ray scattering (SAXS) profiles and used it to demonstrate that even relatively hydrophobic IDPs remain nearly as expanded in water as they are in high denaturant concentrations. In contrast, as demonstrated here, most fluorescence resonance energy transfer (FRET) measurements have indicated that relatively hydrophobic IDPs contract significantly in the absence of denaturant. We use two independent approaches to further explore this controversy. First, using SAXS we show that fluorophores employed in FRET can contribute to the observed discrepancy. Specifically, we find that addition of Alexa-488 to a normally expanded IDP causes contraction by an additional 15%, a value in reasonable accord with the contraction reported in FRET-based studies. Second, using our simulations and analysis procedure to accurately extract both the radius of gyration (Rg) and end-to-end distance (Ree) from SAXS profiles, we tested the recent suggestion that FRET and SAXS results can be reconciled if the Rg and Ree are "uncoupled" (i.e., no longer simply proportional), in contrast to the case for random walk homopolymers. We find, however, that even for unfolded proteins, these two measures of unfolded state dimensions remain proportional. Together, these results suggest that improved analysis procedures and a correction for significant, fluorophore-driven interactions are sufficient to reconcile prior SAXS and FRET studies, thus providing a unified picture of the nature of unfolded polypeptide chains in the absence of denaturant.
|