1
|
Ghorbani M, Prasad S, Klauda JB, Brooks BR. Variational embedding of protein folding simulations using Gaussian mixture variational autoencoders. J Chem Phys 2021; 155:194108. [PMID: 34800961 DOI: 10.1063/5.0069708] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022]
Abstract
Conformational sampling of biomolecules using molecular dynamics simulations often produces a large amount of high dimensional data that makes it difficult to interpret using conventional analysis techniques. Dimensionality reduction methods are thus required to extract useful and relevant information. Here, we devise a machine learning method, Gaussian mixture variational autoencoder (GMVAE), that can simultaneously perform dimensionality reduction and clustering of biomolecular conformations in an unsupervised way. We show that GMVAE can learn a reduced representation of the free energy landscape of protein folding with highly separated clusters that correspond to the metastable states during folding. Since GMVAE uses a mixture of Gaussians as its prior, it can directly acknowledge the multi-basin nature of the protein folding free energy landscape. To make the model end-to-end differentiable, we use a Gumbel-softmax distribution. We test the model on three long-timescale protein folding trajectories and show that GMVAE embedding resembles the folding funnel with folded states down the funnel and unfolded states outside the funnel path. Additionally, we show that the latent space of GMVAE can be used for kinetic analysis and Markov state models built on this embedding produce folding and unfolding timescales that are in close agreement with other rigorous dynamical embeddings such as time independent component analysis.
Affiliation(s)
- Mahdi Ghorbani
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
| | - Samarjeet Prasad
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
| | - Jeffery B Klauda
- Department of Chemical and Biomolecular Engineering, University of Maryland, College Park, Maryland 20742, USA
| | - Bernard R Brooks
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
|
2
|
Appadurai R, Nagesh J, Srivastava A. High resolution ensemble description of metamorphic and intrinsically disordered proteins using an efficient hybrid parallel tempering scheme. Nat Commun 2021; 12:958. [PMID: 33574233 PMCID: PMC7878814 DOI: 10.1038/s41467-021-21105-7] [Citation(s) in RCA: 44] [Impact Index Per Article: 14.7] [Received: 06/15/2020] [Accepted: 01/08/2021] [Indexed: 12/26/2022]
Abstract
Mapping free energy landscapes of complex multi-funneled metamorphic proteins and weakly-funneled intrinsically disordered proteins (IDPs) remains challenging. While rare-event sampling molecular dynamics simulations can be useful, they often need to either impose restraints or reweigh the generated data to match experiments. Here, we present a parallel-tempering method that takes advantage of accelerated water dynamics and allows efficient and accurate conformational sampling across a wide variety of proteins. We demonstrate the improved sampling efficiency by benchmarking against standard model systems such as alanine di-peptide, TRP-cage and β-hairpin. The method successfully scales to large metamorphic proteins such as RFA-H and to highly disordered IDPs such as Histatin-5. Across the diverse proteins, the calculated ensemble averages match well with the NMR, SAXS and other biophysical experiments without the need to reweigh. By allowing accurate sampling across different landscapes, the method opens doors for sampling free energy landscape of complex uncharted proteins.
Affiliation(s)
- Rajeswari Appadurai
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
| | - Jayashree Nagesh
- Solid State & Structural Chemistry Unit, Indian Institute of Science, Bangalore, Karnataka, India
| | - Anand Srivastava
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, Karnataka, India
|
3
|
Sidky H, Chen W, Ferguson AL. High-Resolution Markov State Models for the Dynamics of Trp-Cage Miniprotein Constructed Over Slow Folding Modes Identified by State-Free Reversible VAMPnets. J Phys Chem B 2019; 123:7999-8009. [DOI: 10.1021/acs.jpcb.9b05578] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Indexed: 01/31/2023]
Affiliation(s)
- Hythem Sidky
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
| | - Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, United States
| | - Andrew L. Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
|
4
|
Kamiya M, Sugita Y. Flexible selection of the solute region in replica exchange with solute tempering: Application to protein-folding simulations. J Chem Phys 2018; 149:072304. [PMID: 30134668 DOI: 10.1063/1.5016222] [Citation(s) in RCA: 65] [Impact Index Per Article: 10.8] [Indexed: 01/02/2023]
Abstract
Replica-exchange molecular dynamics (REMD) and their variants have been widely used in simulations of the biomolecular structure and dynamics. Replica exchange with solute tempering (REST) is one of the methods where temperature of a pre-defined solute molecule is exchanged between replicas, while solvent temperatures in all the replicas are kept constant. REST greatly reduces the number of replicas compared to the temperature REMD, while replicas at low temperatures are often trapped under their conditions, interfering with the conformational sampling. Here, we introduce a new scheme of REST, referred to as generalized REST (gREST), where the solute region is defined as a part of a molecule or a part of the potential energy terms, such as the dihedral-angle energy term or Lennard-Jones energy term. We applied this new method to folding simulations of a β-hairpin (16 residues) and a Trp-cage (20 residues) in explicit water. The protein dihedral-angle energy term is chosen as the solute region in the simulations. gREST reduces the number of replicas necessary for good random walks in the solute-temperature space and covers a wider conformational space compared to the conventional REST2. Considering the general applicability, gREST should become a promising tool for the simulations of protein folding, conformational dynamics, and an in silico drug design.
Affiliation(s)
- Motoshi Kamiya
- Computational Biophysics Research Team, RIKEN Advanced Institute for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
| | - Yuji Sugita
- Computational Biophysics Research Team, RIKEN Advanced Institute for Computational Science, 7-1-26 Minatojima-minamimachi, Chuo-ku, Kobe 650-0047, Japan
|
5
|
Kitazawa S, Fossat MJ, McCallum SA, Garcia AE, Royer CA. NMR and Computation Reveal a Pressure-Sensitive Folded Conformation of Trp-Cage. J Phys Chem B 2017; 121:1258-1267. [DOI: 10.1021/acs.jpcb.6b11810] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
Affiliation(s)
- Soichiro Kitazawa
- Biological Sciences, Rensselaer Polytechnic Institute, Troy, New York
| | - Martin J. Fossat
- Biological Sciences, Rensselaer Polytechnic Institute, Troy, New York
- Laboratoire Charles Coulomb UMR 5221 CNRS-UM, Montpellier, France
| | - Scott A. McCallum
- Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
| | - Angel E. Garcia
- Department of Physics, Rensselaer Polytechnic Institute, Troy, New York
|
6
|
Gopi S, Singh A, Suresh S, Paul S, Ranu S, Naganathan AN. Toward a quantitative description of microscopic pathway heterogeneity in protein folding. Phys Chem Chem Phys 2017; 19:20891-20903. [DOI: 10.1039/c7cp03011h] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 11/21/2022]
Abstract
Experimentally consistent statistical modeling of protein folding thermodynamics reveals unprecedented complexity with numerous parallel folding routes in five different proteins.
Affiliation(s)
- Soundhararajan Gopi
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
| | - Animesh Singh
- Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, India
| | - Suvadip Paul
- Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, India
| | - Sayan Ranu
- Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai 600036, India
| | - Athi N. Naganathan
- Department of Biotechnology, Bhupat & Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
|
7
|
Smeller L. Folding superfunnel to describe cooperative folding of interacting proteins. Proteins 2016; 84:1009-16. [DOI: 10.1002/prot.25051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/05/2016] [Revised: 04/06/2016] [Accepted: 04/08/2016] [Indexed: 12/18/2022]
Affiliation(s)
- László Smeller
- Department of Biophysics and Radiation Biology, Semmelweis University, Budapest, Hungary
|
8
|
Neale C, Pomès R, García AE. Peptide Bond Isomerization in High-Temperature Simulations. J Chem Theory Comput 2016; 12:1989-99. [PMID: 26866899 DOI: 10.1021/acs.jctc.5b01022] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 12/29/2022]
Abstract
Force fields for molecular simulation are generally optimized to model macromolecules such as proteins at ambient temperature and pressure. Nevertheless, elevated temperatures are frequently used to enhance conformational sampling, either during system setup or as a component of an advanced sampling technique such as temperature replica exchange. Because macromolecular force fields are now put upon to simulate temperatures and time scales that greatly exceed their original design specifications, it is appropriate to re-evaluate whether these force fields are up to the task. Here, we quantify the rates of peptide bond isomerization in high-temperature simulations of three octameric peptides and a small fast-folding protein. We show that peptide octamers with and without proline residues undergo cis/trans isomerization every 1-5 ns at 800 K with three classical atomistic force fields (AMBER99SB-ILDN, CHARMM22/CMAP, and OPLS-AA/L). On the low microsecond time scale, these force fields permit isomerization of nonprolyl peptide bonds at temperatures ≥500 K, and the CHARMM22/CMAP force field permits isomerization of prolyl peptide bonds ≥400 K. Moreover, the OPLS-AA/L force field allows chiral inversion about the Cα atom at 800 K. Finally, we show that temperature replica exchange permits cis peptide bonds developed at 540 K to subsequently migrate back to the 300 K ensemble, where cis peptide bonds are present in 2 ± 1% of the population of Trp-cage TC5b, including up to 4% of its folded state. Further work is required to assess the accuracy of cis/trans isomerization in the current generation of protein force fields.
Affiliation(s)
- Chris Neale
- Center for NonLinear Studies (CNLS), MS B258, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Régis Pomès
- Molecular Structure and Function, The Hospital for Sick Children, 686 Bay Street, Toronto, Ontario M5G 0A4, Canada
- Department of Biochemistry, University of Toronto, 101 College Street, Toronto, Ontario M5G 1L7, Canada
| | - Angel E García
- Center for NonLinear Studies (CNLS), MS B258, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
|
9
|
Faraj SE, González-Lebrero RM, Roman EA, Santos J. Human Frataxin Folds Via an Intermediate State. Role of the C-Terminal Region. Sci Rep 2016; 6:20782. [PMID: 26856628 PMCID: PMC4746760 DOI: 10.1038/srep20782] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 11/02/2015] [Accepted: 01/12/2016] [Indexed: 11/30/2022]
Abstract
The aim of this study is to investigate the folding reaction of human frataxin, whose deficiency causes the neurodegenerative disease Friedreich's Ataxia (FRDA). The characterization of different conformational states would provide knowledge about how frataxin can be stabilized without altering its functionality. Wild-type human frataxin and a set of mutants, including two highly destabilized FRDA-associated variants were studied by urea-induced folding/unfolding in a rapid mixing device and followed by circular dichroism. The analysis clearly indicates the existence of an intermediate state (I) in the folding route with significant secondary structure content but relatively low compactness, compared with the native ensemble. However, at high NaCl concentrations I-state gains substantial compaction, and the unfolding barrier is strongly affected, revealing the importance of electrostatics in the folding mechanism. The role of the C-terminal region (CTR), the key determinant of frataxin stability, was also studied. Simulations consistently with experiments revealed that this stretch is essentially unstructured, in the most compact transition state ensemble (TSE2). The complete truncation of the CTR drastically destabilizes the native state without altering TSE2. Results presented here shed light on the folding mechanism of frataxin, opening the possibility of mutating it to generate hyperstable variants without altering their folding kinetics.
Affiliation(s)
- Santiago E. Faraj
- Instituto de Química y Físico-Química Biológicas, Universidad de Buenos Aires, Junín 956, 1113AAD, Buenos Aires, Argentina
| | - Rodolfo M. González-Lebrero
- Instituto de Química y Físico-Química Biológicas, Universidad de Buenos Aires, Junín 956, 1113AAD, Buenos Aires, Argentina
| | - Ernesto A. Roman
- Instituto de Química y Físico-Química Biológicas, Universidad de Buenos Aires, Junín 956, 1113AAD, Buenos Aires, Argentina
| | - Javier Santos
- Instituto de Química y Físico-Química Biológicas, Universidad de Buenos Aires, Junín 956, 1113AAD, Buenos Aires, Argentina
|
10
|
Affiliation(s)
- Zachary A. Levine
- Department of Physics, University of California Santa Barbara, Santa Barbara, California 93106, United States
- Department of Chemistry and Biochemistry, University of California Santa Barbara, Santa Barbara, California 93106, United States
| | - Sean A. Fischer
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98105, United States
| | - Joan-Emma Shea
- Department of Physics, University of California Santa Barbara, Santa Barbara, California 93106, United States
- Department of Chemistry and Biochemistry, University of California Santa Barbara, Santa Barbara, California 93106, United States
| | - Jim Pfaendtner
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98105, United States
|