1
Huang Y, Zhang H, Lin Z, Wei Y, Xi W. RevGraphVAMP: A protein molecular simulation analysis model combining graph convolutional neural networks and physical constraints. Methods 2024; 229:163-174. [PMID: 38972499 DOI: 10.1016/j.ymeth.2024.06.011] [Received: 06/25/2023] [Revised: 06/19/2024] [Accepted: 06/24/2024] [Indexed: 07/09/2024] Open
Abstract
Molecular dynamics (MD) simulation is a crucial research domain within the life sciences, focusing on comprehending the mechanisms of biomolecular interactions at atomic scales. Protein simulation, a critical subfield, often relies on MD, and the resulting trajectory data play a pivotal role in drug discovery. With advances in high-performance computing, deep learning has become a popular and important way to predict protein properties from vast trajectory data, which poses challenges in extracting features from complicated simulation data and in reducing their dimensionality. At the same time, it is essential to provide a meaningful explanation of the biological mechanism behind the reduced dimensions. To tackle this challenge, we propose a new unsupervised model named RevGraphVAMP to analyze simulation trajectories. This model is based on the variational approach for Markov processes (VAMP) and integrates graph convolutional neural networks and physical constraint optimization to enhance the learning performance. Additionally, we introduce an attention mechanism to assess the importance of key interaction regions, facilitating the interpretation of molecular mechanisms. Compared with other VAMPNets models, our model shows competitive performance and improved accuracy in state transition prediction, as demonstrated through its application to two public datasets and the Shank3-Rap1 complex, which is associated with autism spectrum disorder. Moreover, it enhances the discrimination between different substates after dimensionality reduction and provides interpretable results for protein structural characterization.
Affiliation(s)
- Ying Huang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Huiling Zhang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; College of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510642, China
- Zhenli Lin
  - Department of Ophthalmology, Shenzhen University General Hospital, Shenzhen 518055, China
- Yanjie Wei
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Faculty of Computer Science and Control Engineering, Shenzhen University of Advanced Technology, Shenzhen 518107, China.
- Wenhui Xi
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Faculty of Computer Science and Control Engineering, Shenzhen University of Advanced Technology, Shenzhen 518107, China.
2
Martino SA, Morado J, Li C, Lu Z, Rosta E. Kemeny Constant-Based Optimization of Network Clustering Using Graph Neural Networks. J Phys Chem B 2024; 128:8103-8115. [PMID: 39145603 PMCID: PMC11367579 DOI: 10.1021/acs.jpcb.3c08213] [Received: 12/16/2023] [Revised: 06/28/2024] [Accepted: 07/08/2024] [Indexed: 08/16/2024]
Abstract
The recent trend in using network and graph structures to represent a variety of different data types has renewed interest in the graph partitioning (GP) problem. This interest stems from the need for general methods that can both efficiently identify network communities and reduce the dimensionality of large graphs while satisfying various application-specific criteria. Traditional clustering algorithms often struggle to capture the complex relationships within graphs and generalize to arbitrary clustering criteria. The emergence of graph neural networks (GNNs) as a powerful framework for learning representations of graph data provides new approaches to solving the problem. Previous work has shown GNNs to be capable of proposing partitionings using a variety of criteria. However, these approaches have not yet been extended to Markov chains or kinetic networks. These arise frequently in the study of molecular systems and are of particular interest to the biomolecular modeling community. In this work, we propose several GNN-based architectures to tackle the GP problem for Markov Chains described as kinetic networks. This approach aims to maximize the Kemeny constant, which is a variational quantity and it represents the sum of time scales of the system. We propose using an encoder-decoder architecture and show how simple GraphSAGE-based GNNs with linear layers can outperform much larger and more expressive attention-based models in this context. As a proof of concept, we first demonstrate the method's ability to cluster randomly connected graphs. We also use a linear chain architecture corresponding to a 1D free energy profile as our kinetic network. Subsequently, we demonstrate the effectiveness of our method through experiments on a data set derived from molecular dynamics. We compare the performance of our method to other partitioning techniques, such as PCCA+. 
We explore the importance of feature and hyperparameter selection and propose a general strategy for large-scale parallel training of GNNs for discovering optimal graph partitionings.
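The abstract describes the Kemeny constant as the sum of the system's time scales and uses it as the objective for partitioning. The following is a minimal sketch of that quantity for a discrete-time Markov chain, not the authors' GNN pipeline. It assumes the convention K = Σ_{i≥2} 1/(1 − λ_i) over the eigenvalues of a row-stochastic matrix; the function names and the stationary-flux lumping rule in `lump` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def stationary_distribution(P):
    """Left eigenvector of P for the eigenvalue closest to 1, normalized to sum to 1."""
    w, v = np.linalg.eig(np.asarray(P, dtype=float).T)
    pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    return pi / pi.sum()

def kemeny_constant(P):
    """Assumed convention: sum of 1 / (1 - lambda_i) over all eigenvalues except the one at 1."""
    eigvals = np.linalg.eigvals(np.asarray(P, dtype=float))
    rest = np.delete(eigvals, np.argmin(np.abs(eigvals - 1.0)))
    return float(np.real(np.sum(1.0 / (1.0 - rest))))

def lump(P, labels, n_clusters):
    """Coarse-grain P by aggregating stationary fluxes (an assumed lumping rule).

    labels[i] is the cluster index of state i; every cluster must be non-empty.
    """
    P = np.asarray(P, dtype=float)
    pi = stationary_distribution(P)
    M = np.zeros((len(pi), n_clusters))
    M[np.arange(len(pi)), labels] = 1.0
    flux = M.T @ (pi[:, None] * P) @ M          # flux[a, b] = sum of pi_i * P_ij over i in a, j in b
    return flux / flux.sum(axis=1, keepdims=True)

# Four states forming two metastable pairs: {0, 1} and {2, 3}.
P4 = np.array([[0.89, 0.10, 0.01, 0.00],
               [0.10, 0.89, 0.00, 0.01],
               [0.01, 0.00, 0.89, 0.10],
               [0.00, 0.01, 0.10, 0.89]])
k_good = kemeny_constant(lump(P4, [0, 0, 1, 1], 2))  # groups the metastable pairs
k_bad = kemeny_constant(lump(P4, [0, 1, 0, 1], 2))   # cuts through them
```

In this toy chain the partition that respects the metastable pairs gives the larger Kemeny constant, which is the behaviour a variational objective of this kind is meant to reward.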
Affiliation(s)
- Sam Alexander Martino
  - Department of Physics and Astronomy, University College London, London WC1E 6BT, U.K.
- João Morado
  - Department of Physics and Astronomy, University College London, London WC1E 6BT, U.K.
- Chenghao Li
  - Department of Physics and Astronomy, University College London, London WC1E 6BT, U.K.
- Zhenghao Lu
  - Department of Physics and Astronomy, University College London, London WC1E 6BT, U.K.
- Edina Rosta
  - Department of Physics and Astronomy, University College London, London WC1E 6BT, U.K.
3
Rinaldi S, Colombo G, Morra G. Exploring Mutation-Driven Changes in the ATP-ADP Conformational Cycle of Human Hsp70 by All-Atom MD Adaptive Sampling. J Phys Chem B 2024; 128:7770-7780. [PMID: 39091167 DOI: 10.1021/acs.jpcb.4c03603] [Indexed: 08/04/2024]
Abstract
Hsp70 belongs to a family of molecular chaperones, ubiquitous across organisms, that assist client protein folding and prevent aggregation. It works through a tightly ATP-regulated allosteric cycle, which organizes its two domains, the nucleotide-binding domain (NBD) and the substrate-binding domain (SBD), into alternate open and closed arrangements that facilitate loading and unloading of client proteins. The two cytosolic human isoforms Hsc70 and HspA1 are relevant targets for neurodegenerative diseases and cancer. Illuminating the molecular details of Hsp70 functional dynamics is essential to rationalize differences between the well-characterized bacterial homologue DnaK and the less explored human forms, and to develop subtype- or species-selective allosteric drugs. We present here a molecular dynamics-based analysis of the conformational dynamics of HspA1. By using an "allosterically impaired" mutant for comparison, we can reconstruct the impact of the ADP-ATP swap on interdomain contacts and dynamic coordination in full-length HspA1, supporting previous predictions that were, however, limited to the NBD. We model the initial onset of the conformational cycle by proposing a sequence of structural steps, which reveal the role of a specific human sequence insertion at the linker and a modulation of the angle formed by the two NBD lobes during the progression of docking. Our findings pinpoint functionally relevant conformations and set the basis for a selective structure-based drug discovery approach targeting allosteric sites in human Hsp70.
Affiliation(s)
- Silvia Rinaldi
  - Institute for the Chemistry of Organometallic Compounds (ICCOM), National Research Council (CNR), Via Madonna del Piano, 10, Sesto Fiorentino, Firenze 50019, Italy
- Giorgio Colombo
  - Department of Chemistry, University of Pavia, Via Taramelli 12, Pavia 27100, Italy
- Giulia Morra
  - Institute of Chemical Sciences and Technologies (SCITEC), National Research Council (CNR), Via Mario Bianco 9, Milano 20131, Italy
4
Boccardo F, Pierre-Louis O. Reinforcement learning with thermal fluctuations at the nanoscale. Phys Rev E 2024; 110:L023301. [PMID: 39294981 DOI: 10.1103/physreve.110.l023301] [Received: 12/18/2023] [Accepted: 08/06/2024] [Indexed: 09/21/2024]
Abstract
Reinforcement Learning offers a framework to learn to choose actions in order to control a system. However, at small scales Brownian fluctuations limit the control of nanomachine actuation or nanonavigation and of the molecular machinery of life. We analyze this regime using the general framework of Markov decision processes. We show that at the nanoscale, while optimal control actions should bring an improvement proportional to the small ratio of the applied force times a length scale over the temperature, the learned improvement is smaller and proportional to the square of this small ratio. Consequently, the efficiency of learning, which compares the learning improvement to the theoretical optimal improvement, drops to zero. Nevertheless, these limitations can be circumvented by using actions learned at a lower temperature. These results are illustrated with simulations of the control of the shape of small particle clusters.
5
Xu Q, Yang M, Ji J, Weng J, Wang W, Xu X. Impact of Nonnative Interactions on the Binding Kinetics of Intrinsically Disordered p53 with MDM2: Insights from All-Atom Simulation and Markov State Model Analysis. J Chem Inf Model 2024; 64:5219-5231. [PMID: 38916177 DOI: 10.1021/acs.jcim.3c01833] [Indexed: 06/26/2024]
Abstract
Intrinsically disordered proteins (IDPs) lack a well-defined tertiary structure but are essential players in various biological processes. Their ability to undergo a disorder-to-order transition upon binding to their partners, known as the folding-upon-binding process, is crucial for their function. One classical example is the intrinsically disordered transactivation domain (TAD) of the tumor suppressor protein p53, which quickly forms a structured α-helix after binding to its partner MDM2, with clinical significance for cancer treatment. However, the contribution of nonnative interactions between the IDP and its partner to the rapid binding kinetics, as well as their interplay with native interactions, is not well understood at the atomic level. Here, we used molecular dynamics simulation and Markov state model (MSM) analysis to study the folding-upon-binding mechanism between p53-TAD and MDM2. Our results suggest that the system progresses from the nascent encounter complex to the well-structured encounter complex and finally reaches the native complex, following an induced-fit mechanism. We found that nonnative hydrophobic and hydrogen bond interactions, combined with native interactions, effectively stabilize the nascent and well-structured encounter complexes. Among the nonnative interactions, Leu25(p53)-Leu54(MDM2) and Leu25(p53)-Phe55(MDM2) are particularly noteworthy, as their interaction strength is close to the optimum. Evidently, strengthening or weakening these interactions could both adversely affect the binding kinetics. Overall, our findings suggest that nonnative interactions are evolutionarily optimized to accelerate the binding kinetics of IDPs in conjunction with native interactions.
Affiliation(s)
- Qianjun Xu
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
- Maohua Yang
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
- Jie Ji
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
- Jingwei Weng
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
- Wenning Wang
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
- Xin Xu
  - Department of Chemistry, Institute of Biomedical Sciences and Multiscale Research Institute of Complex Systems, Fudan University, Shanghai 200438, China
6
Guo F, Yang H, Li S, Jiang Y, Bai X, Hu C, Li W, Han W. Using Gaussian accelerated molecular dynamics combined with Markov state models to explore the mechanism of action of new oral inhibitors on Complex I. Comput Biol Med 2024; 177:108598. [PMID: 38776729 DOI: 10.1016/j.compbiomed.2024.108598] [Received: 03/18/2024] [Revised: 04/15/2024] [Accepted: 05/11/2024] [Indexed: 05/25/2024]
Abstract
In this study, we investigated the H-1,2,3-triazole derivative HP661, a novel and highly efficient oral OXPHOS inhibitor whose molecular-level inhibitory mechanism is not yet fully understood. We selected the ND1, NDUFS2, and NDUFS7 subunits of mitochondrial Complex I as the receptor proteins and established three systems for comparative analysis: protein-IACS-010759, protein-lead compound 10, and protein-HP661. We analyzed these systems using 500 ns Gaussian accelerated molecular dynamics simulations. Additionally, we constructed a Markov state model to examine changes in secondary structure during the motion processes. The findings suggest that the inhibitor HP661 enhances the extensibility and hydrophilicity of the receptor protein. Furthermore, HP661 induces the unwinding of the α-helical structure in the region of residues 726-730. Notably, key roles were identified for Met37, Phe53, and Pro212 in the binding of various inhibitors. In conclusion, we explored the potential molecular mechanisms by which the triazole derivative HP661 inhibits Complex I. These results provide information for a deeper understanding of the mechanisms underlying OXPHOS inhibition and offer theoretical support for drug development and disease treatment design.
Affiliation(s)
- Fangfang Guo
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Hengzheng Yang
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Shihong Li
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Yongxin Jiang
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Xue Bai
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Chengxiang Hu
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China
- Wannan Li
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China.
- Weiwei Han
  - Edmond H. Fischer Signal Transduction Laboratory and Key Laboratory for Molecular Enzymology and Engineering of Ministry of Education, School of Life Science, Jilin University, 2699 Qianjin Street, Changchun, 130012, China.
7
Schäfer JL, Keller BG. Implementation of Girsanov Reweighting in OpenMM and Deeptime. J Phys Chem B 2024; 128:6014-6027. [PMID: 38865491 PMCID: PMC11215775 DOI: 10.1021/acs.jpcb.4c01702] [Received: 03/14/2024] [Revised: 05/22/2024] [Accepted: 05/22/2024] [Indexed: 06/14/2024]
Abstract
Classical molecular dynamics (MD) simulations provide invaluable insights into complex molecular systems but face limitations in capturing phenomena occurring on time scales beyond their reach. To bridge this gap, various enhanced sampling techniques have been developed, which are complemented by reweighting techniques to recover the unbiased dynamics. Girsanov reweighting is a reweighting technique that reweights simulation paths, generated by a stochastic MD integrator, without invoking an effective model of the dynamics. Instead, it calculates the relative path probability density at the time resolution of the MD integrator. Efficient implementation of Girsanov reweighting requires that the reweighting factors are calculated on the fly during the simulations and thus needs to be implemented within the MD integrator. Here, we present a comprehensive guide for implementing Girsanov reweighting into MD simulations. We demonstrate the implementation in the MD simulation package OpenMM by extending the library openmmtools. Additionally, we implemented a reweighted Markov state model estimator within the time series analysis package Deeptime.
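To illustrate what "calculating the relative path probability on the fly inside the integrator" can look like, here is a minimal one-dimensional sketch. It is not the OpenMM or openmmtools implementation described in the abstract: it assumes overdamped Langevin dynamics integrated with the Euler-Maruyama scheme, a simulated potential V + U, and a target potential V, and every name in it (`simulate_with_log_weight`, `grad_V`, `grad_U`, `xi`) is hypothetical.

```python
import math
import random

def simulate_with_log_weight(x0, grad_V, grad_U, n_steps, dt, kT=1.0, xi=1.0, seed=0):
    """Euler-Maruyama on the biased potential V + U.

    Accumulates log M, the log ratio of the path probability under the target
    potential V to that under the simulated potential V + U, step by step.
    """
    rng = random.Random(seed)
    sigma = math.sqrt(2.0 * kT / xi)          # noise amplitude of overdamped Langevin dynamics
    x, log_M = x0, 0.0
    path = [x0]
    for _ in range(n_steps):
        eta = rng.gauss(0.0, 1.0)             # random number actually used by the integrator
        # Shift of the random number that would give the same step under V alone.
        d_eta = -grad_U(x) * math.sqrt(dt) / (xi * sigma)
        # Ratio of Gaussian densities exp(-(eta + d_eta)^2 / 2) / exp(-eta^2 / 2), in log form.
        log_M += -eta * d_eta - 0.5 * d_eta * d_eta
        x = x - (grad_V(x) + grad_U(x)) / xi * dt + sigma * math.sqrt(dt) * eta
        path.append(x)
    return path, log_M

# Example: harmonic target potential V(x) = x^2 / 2 with a linear bias U(x) = -0.5 * x.
path, log_M = simulate_with_log_weight(
    x0=0.0, grad_V=lambda x: x, grad_U=lambda x: -0.5, n_steps=100, dt=0.01)
```

The weight exp(log_M) is accumulated with quantities that are only available during integration (the random number and the bias force at each step), which is why this kind of reweighting is naturally placed inside the integrator loop.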
Affiliation(s)
- Joana-Lysiane Schäfer
  - Department of Biology, Chemistry, and Pharmacy, Freie Universität Berlin, Berlin 14195, Germany
- Bettina G. Keller
  - Department of Biology, Chemistry, and Pharmacy, Freie Universität Berlin, Berlin 14195, Germany
8
Wang D, Qiu Y, Beyerle ER, Huang X, Tiwary P. Information Bottleneck Approach for Markov Model Construction. J Chem Theory Comput 2024; 20:5352-5367. [PMID: 38859575 PMCID: PMC11199095 DOI: 10.1021/acs.jctc.4c00449] [Indexed: 06/12/2024]
Abstract
Markov state models (MSMs) have proven valuable in studying the dynamics of protein conformational changes via statistical analysis of molecular dynamics simulations. In MSMs, the complex configuration space is coarse-grained into conformational states, with dynamics modeled by a series of Markovian transitions among these states at discrete lag times. Constructing the Markovian model at a specific lag time necessitates defining states that circumvent significant internal energy barriers, enabling internal dynamics relaxation within the lag time. This process effectively coarse-grains time and space, integrating out rapid motions within metastable states. Thus, MSMs possess a multiresolution nature, where the granularity of states can be adjusted according to the time-resolution, offering flexibility in capturing system dynamics. This work introduces a continuous embedding approach for molecular conformations using the state predictive information bottleneck (SPIB), a framework that unifies dimensionality reduction and state space partitioning via a continuous, machine learned basis set. Without explicit optimization of the VAMP-based scores, SPIB demonstrates state-of-the-art performance in identifying slow dynamical processes and constructing predictive multiresolution Markovian models. Through applications to well-validated mini-proteins, SPIB showcases unique advantages compared to competing methods. It autonomously and self-consistently adjusts the number of metastable states based on a specified minimal time resolution, eliminating the need for manual tuning. While maintaining efficacy in dynamical properties, SPIB excels in accurately distinguishing metastable states and capturing numerous well-populated macrostates. This contrasts with existing VAMP-based methods, which often emphasize slow dynamics at the expense of incorporating numerous sparsely populated states. 
Furthermore, SPIB's ability to learn a low-dimensional continuous embedding of the underlying MSMs enhances the interpretation of dynamic pathways. With these benefits, we propose SPIB as an easy-to-implement methodology for end-to-end MSM construction.
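The abstract describes MSMs as Markovian transitions among discrete states at a chosen lag time. A minimal sketch of that construction, assuming the trajectory has already been assigned to integer state labels, is given below. It uses the simplest row-normalized (non-reversible) count estimator rather than SPIB or any estimator from the paper, and the function names are illustrative.

```python
import math
import numpy as np

def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from a discrete trajectory at lag time `lag` (in frames)."""
    counts = np.zeros((n_states, n_states))
    for i, j in zip(dtraj[:-lag], dtraj[lag:]):   # all pairs (x_t, x_{t+lag})
        counts[i, j] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every state needs at least one outgoing transition count")
    return counts / row_sums

def implied_timescales(T, lag):
    """Implied timescales t_i = -lag / ln|lambda_i| for the non-stationary eigenvalues."""
    eigvals = np.sort(np.abs(np.linalg.eigvals(T)))[::-1]
    return [-lag / math.log(lam) for lam in eigvals[1:] if 0.0 < lam < 1.0]

# Toy two-state trajectory, lag time of one frame.
dtraj = [0, 0, 1, 1, 0, 0, 1, 1]
T = estimate_msm(dtraj, n_states=2, lag=1)
its = implied_timescales(T, lag=1)
```

Repeating the estimate for several lag times and checking that the implied timescales level off is the usual way to choose a lag time at which the Markovian assumption is reasonable.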
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Yunrui Qiu
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Eric R. Beyerle
  - Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, WI 53706, United States
  - Data Science Institute, University of Wisconsin-Madison, Madison, WI, 53706, United States
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742, United States
  - University of Maryland Institute for Health Computing, Bethesda, MD 20852, United States
9
Weigle AT, Shukla D. The Arabidopsis AtSWEET13 transporter discriminates sugars by selective facial and positional substrate recognition. Commun Biol 2024; 7:764. [PMID: 38914639 PMCID: PMC11196581 DOI: 10.1038/s42003-024-06291-6] [Received: 10/26/2022] [Accepted: 05/03/2024] [Indexed: 06/26/2024] Open
Abstract
Transporters are targeted by endogenous metabolites and exogenous molecules to reach cellular destinations, but it is generally not understood how different substrate classes exploit the same transporter's mechanism. Any disclosure of plasticity in transporter mechanism when treated with different substrates becomes critical for developing general selectivity principles in membrane transport catalysis. Using extensive molecular dynamics simulations with an enhanced sampling approach, we select the Arabidopsis sugar transporter AtSWEET13 as a model system to identify the basis for glucose versus sucrose molecular recognition and transport. Here we find that AtSWEET13 chemical selectivity originates from a conserved substrate facial selectivity demonstrated when committing alternate access, despite mono-/di-saccharides experiencing differing degrees of conformational and positional freedom throughout other stages of transport. However, substrate interactions with structural hallmarks associated with known functional annotations can help reinforce selective preferences in molecular transport.
Affiliation(s)
- Austin T Weigle
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA
- Diwakar Shukla
  - Department of Chemical & Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
  - Center for Biophysics and Computational Biology, University of Illinois at Urbana-Champaign, Urbana, IL, 61801, USA.
10
Keller BG, Bolhuis PG. Dynamical Reweighting for Biased Rare Event Simulations. Annu Rev Phys Chem 2024; 75:137-162. [PMID: 38941527 DOI: 10.1146/annurev-physchem-083122-124538] [Indexed: 06/30/2024]
Abstract
Dynamical reweighting techniques aim to recover the correct molecular dynamics from a simulation at a modified potential energy surface. They are important for unbiasing enhanced sampling simulations of molecular rare events. Here, we review the theoretical frameworks of dynamical reweighting for modified potentials. Based on an overview of kinetic models with increasing level of detail, we discuss techniques to reweight two-state dynamics, multistate dynamics, and path integrals. We explore the natural link to transition path sampling and how the effect of nonequilibrium forces can be reweighted. We end by providing an outlook on how dynamical reweighting integrates with techniques for optimizing collective variables and with modern potential energy surfaces.
Affiliation(s)
- Bettina G Keller
  - Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
- Peter G Bolhuis
  - Van 't Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, The Netherlands
11
Champion C, Lehner M, Smith AA, Ferrage F, Bolik-Coulon N, Riniker S. Unraveling motion in proteins by combining NMR relaxometry and molecular dynamics simulations: A case study on ubiquitin. J Chem Phys 2024; 160:104105. [PMID: 38465679 DOI: 10.1063/5.0188416] [Received: 11/21/2023] [Accepted: 02/20/2024] [Indexed: 03/12/2024] Open
Abstract
Nuclear magnetic resonance (NMR) relaxation experiments shine light onto the dynamics of molecular systems in the picosecond to millisecond timescales. As these methods cannot provide an atomically resolved view of the motion of atoms, functional groups, or domains giving rise to such signals, relaxation techniques have been combined with molecular dynamics (MD) simulations to obtain mechanistic descriptions and gain insights into the functional role of side chain or domain motion. In this work, we present a comparison of five computational methods that permit the joint analysis of MD simulations and NMR relaxation experiments. We discuss their relative strengths and areas of applicability and demonstrate how they may be utilized to interpret the dynamics in MD simulations with the small protein ubiquitin as a test system. We focus on the aliphatic side chains given the rigidity of the backbone of this protein. We find encouraging agreement between experiment, Markov state models built in the χ1/χ2 rotamer space of isoleucine residues, explicit rotamer jump models, and a decomposition of the motion using ROMANCE. These methods allow us to ascribe the dynamics to specific rotamer jumps. Simulations with eight different combinations of force field and water model highlight how the different metrics may be employed to pinpoint force field deficiencies. Furthermore, the presented comparison offers a perspective on the utility of NMR relaxation to serve as validation data for the prediction of kinetics by state-of-the-art biomolecular force fields.
Affiliation(s)
- Candide Champion
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Marc Lehner
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Albert A Smith
  - Institute for Medical Physics and Biophysics, Leipzig University, Härtelstrasse 16-18, 04107 Leipzig, Germany
- Fabien Ferrage
  - Laboratoire des Biomolécules, LBM, Département de Chimie, École normale supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
- Nicolas Bolik-Coulon
  - Department of Molecular Genetics, University of Toronto, Toronto, Ontario M5S 1A8, Canada
  - Department of Chemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 3H6, Canada
- Sereina Riniker
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
12
Hradiská H, Kurečka M, Beránek J, Tedeschi G, Višňovský V, Křenek A, Spiwok V. Acceleration of Molecular Simulations by Parametric Time-Lagged tSNE Metadynamics. J Phys Chem B 2024; 128:903-913. [PMID: 38237064 PMCID: PMC10839826 DOI: 10.1021/acs.jpcb.3c05669] [Received: 08/22/2023] [Revised: 12/22/2023] [Accepted: 12/28/2023] [Indexed: 02/02/2024]
Abstract
The potential of molecular simulations is limited by their computational costs. There is often a need to accelerate simulations using enhanced sampling methods. Metadynamics applies a history-dependent bias potential that disfavors previously visited states. To apply metadynamics, it is necessary to select a few properties of the system, the collective variables (CVs), that can be used to define the bias potential. Over the past few years, there have been emerging opportunities for machine learning and, in particular, artificial neural networks within this domain. In this broad context, a specific unsupervised machine learning method, parametric time-lagged t-distributed stochastic neighbor embedding (ptltSNE), was used to design CVs. The approach was tested on a Trp-cage (tryptophan cage) trajectory from the literature. The trajectory was used to generate a map of conformations, distinguish fast conformational changes from slow ones, and design CVs. Then, metadynamics simulations were performed. To accelerate the formation of the α-helix, we added the α-RMSD collective variable. This simulation led to one folding event in a 350 ns metadynamics simulation. To accelerate degrees of freedom not addressed by CVs, we performed parallel tempering metadynamics. This simulation led to 10 folding events in a 200 ns simulation with 32 replicas.
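The history-dependent bias mentioned in the abstract can be sketched in one dimension as a growing sum of Gaussians deposited along the collective variable. The example below is only an illustration under stated assumptions: the CV is the coordinate itself, the potential is a toy double well V(x) = (x² − 1)², the dynamics are overdamped Langevin, and all names and parameter values are hypothetical rather than taken from the paper.

```python
import math
import random

def bias(s, centers, height, width):
    """Metadynamics bias: sum of Gaussians centred on previously visited CV values."""
    return sum(height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2)) for c in centers)

def bias_force(s, centers, height, width):
    """Force from the bias, -dV_bias/ds; it pushes the system away from the Gaussian centres."""
    return sum(height * (s - c) / width ** 2
               * math.exp(-((s - c) ** 2) / (2.0 * width ** 2)) for c in centers)

def run_metadynamics(n_steps=2000, stride=50, height=0.2, width=0.2,
                     dt=1e-3, kT=0.5, x0=-1.0, seed=0):
    """Overdamped Langevin dynamics in a double well with periodic Gaussian deposition."""
    rng = random.Random(seed)
    x, centers, traj = x0, [], [x0]
    noise = math.sqrt(2.0 * kT * dt)          # unit friction is assumed
    for step in range(1, n_steps + 1):
        force = -4.0 * x * (x * x - 1.0) + bias_force(x, centers, height, width)
        x += force * dt + noise * rng.gauss(0.0, 1.0)
        traj.append(x)
        if step % stride == 0:                # deposit a new Gaussian at the current CV value
            centers.append(x)
    return traj, centers

traj, centers = run_metadynamics()
```

Because each deposited Gaussian raises the energy of the region just visited, the bias gradually fills the starting well and makes barrier crossings more likely than in the unbiased simulation.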
Affiliation(s)
- Helena Hradiská
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Martin Kurečka
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Jan Beránek
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Guglielmo Tedeschi
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Vladimír Višňovský
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Aleš Křenek
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
13
|
Woods EJ, Wales DJ. Analysis and interpretation of first passage time distributions featuring rare events. Phys Chem Chem Phys 2024; 26:1640-1657. [PMID: 38059562 DOI: 10.1039/d3cp04199a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
In this contribution we consider theory and associated computational tools to treat the kinetics associated with competing pathways on multifunnel energy landscapes. Multifunnel landscapes are associated with molecular switches and multifunctional materials, and are expected to exhibit multiple relaxation time scales and associated thermodynamic signatures in the heat capacity. Our focus here is on the first passage time distribution, which is encoded in a kinetic transition network containing all the locally stable states and the pathways between them. This network can be renormalised to reduce the dimensionality, while exactly conserving the mean first passage time and approximately conserving the full distribution. The structure of the reduced network can be visualised using disconnectivity graphs. We show how features in the first passage time distribution can be associated with specific kinetic traps, and how the appearance of competing relaxation time scales depends on the starting conditions. The theory is tested for two model landscapes and applied to an atomic cluster and a disordered peptide. Our most important contribution is probably the reconstruction of the full distribution for long time scales, where numerical problems prevent direct calculations. Here we combine accurate treatment of the mean first passage time with the reliable part of the distribution corresponding to faster time scales. Hence we now have a fundamental understanding of both thermodynamic and kinetic signatures of multifunnel landscapes.
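As a point of reference for the mean first passage time mentioned in this abstract, the quantity can be computed for a small kinetic transition network by solving a linear system. The sketch below is a direct linear solve, not the renormalisation procedure the authors describe; the function name `mean_first_passage_times` and the three-state example network are assumptions made for illustration.

```python
import numpy as np


def mean_first_passage_times(rates, target):
    """Mean first passage time to `target` from every state.

    rates[i, j] is the rate constant for the transition i -> j.
    Solves sum_j k_ij (tau_j - tau_i) = -1 for i != target, with tau_target = 0.
    """
    rates = np.asarray(rates, dtype=float)
    n = rates.shape[0]
    gen = rates.copy()
    np.fill_diagonal(gen, 0.0)
    np.fill_diagonal(gen, -gen.sum(axis=1))   # generator: rows sum to zero
    keep = [i for i in range(n) if i != target]
    sub = gen[np.ix_(keep, keep)]             # generator restricted to non-target states
    tau = np.zeros(n)
    tau[keep] = np.linalg.solve(sub, -np.ones(len(keep)))
    return tau


# Hypothetical three-state chain 0 <-> 1 -> 2 with unit rate constants.
example_rates = np.array([
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
])
example_tau = mean_first_passage_times(example_rates, target=2)
```

For this chain, state 1 reaches the target directly or returns to state 0 with equal probability, which gives mean first passage times of 3 and 2 time units from states 0 and 1. A direct solve like this becomes ill-conditioned when rates span many orders of magnitude, which is the regime where the numerical problems noted in the abstract arise.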
Affiliation(s)
- Esmae J Woods
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge CB3 0HE, UK
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- David J Wales
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
|
14
|
Chen J, Wang W, Sun H, He W. Roles of Accelerated Molecular Dynamics Simulations in Predictions of Binding Kinetic Parameters. Mini Rev Med Chem 2024; 24:1323-1333. [PMID: 38265367 DOI: 10.2174/0113895575252165231122095555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 09/05/2023] [Accepted: 10/16/2023] [Indexed: 01/25/2024]
Abstract
Rational prediction of the binding kinetic parameters of drugs to their targets plays a significant role in future drug design. Full conformational sampling of targets is a prerequisite for accurate prediction of binding kinetic parameters. In this review, we mainly focus on the application of enhanced sampling technologies to the calculation of binding kinetic parameters and the residence time of drugs. Molecular dynamics methods are applied not only to probe conformational changes of targets but also to calculate the residence time, which is significant for drug efficiency. Special attention is paid to accelerated molecular dynamics (aMD) and Gaussian aMD (GaMD) simulations, which have been adopted to predict association and dissociation rate constants. We also expect that this review can provide useful information for future drug design.
Affiliation(s)
- Jianzhong Chen
  - School of Science, Shandong Jiaotong University, Jinan-250357, China
- Wei Wang
  - School of Science, Shandong Jiaotong University, Jinan-250357, China
- Haibo Sun
  - School of Science, Shandong Jiaotong University, Jinan-250357, China
- Weikai He
  - School of Science, Shandong Jiaotong University, Jinan-250357, China
|
15
|
Lazzeri G, Jung H, Bolhuis PG, Covino R. Molecular Free Energies, Rates, and Mechanisms from Data-Efficient Path Sampling Simulations. J Chem Theory Comput 2023; 19:9060-9076. [PMID: 37988412 PMCID: PMC10753783 DOI: 10.1021/acs.jctc.3c00821] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/28/2023] [Revised: 10/24/2023] [Accepted: 10/24/2023] [Indexed: 11/23/2023]
Abstract
Molecular dynamics is a powerful tool for studying the thermodynamics and kinetics of complex molecular events. However, these simulations can rarely sample the required time scales in practice. Transition path sampling overcomes this limitation by collecting unbiased trajectories and capturing the relevant events. Moreover, the integration of machine learning can boost the sampling while simultaneously learning a quantitative representation of the mechanism. Still, the resulting trajectories are by construction non-Boltzmann-distributed, preventing the calculation of free energies and rates. We developed an algorithm to approximate the equilibrium path ensemble from machine-learning-guided path sampling data. At the same time, our algorithm provides efficient sampling, mechanism, free energy, and rates of rare molecular events at a very moderate computational cost. We tested the method on the folding of the mini-protein chignolin. Our algorithm is straightforward and data-efficient, opening the door to applications in many challenging molecular systems.
Affiliation(s)
- Gianmarco Lazzeri
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, 60438, Germany
  - Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
- Hendrik Jung
  - Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
  - Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, 60438, Germany
- Peter G. Bolhuis
  - Van’t Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, 1090GD, The Netherlands
- Roberto Covino
  - Frankfurt Institute for Advanced Studies, Frankfurt am Main, 60438, Germany
  - Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
|
16
|
Arad E, Pedersen KB, Malka O, Mambram Kunnath S, Golan N, Aibinder P, Schiøtt B, Rapaport H, Landau M, Jelinek R. Staphylococcus aureus functional amyloids catalyze degradation of β-lactam antibiotics. Nat Commun 2023; 14:8198. [PMID: 38081813 PMCID: PMC10713593 DOI: 10.1038/s41467-023-43624-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2023] [Accepted: 11/15/2023] [Indexed: 12/18/2023] Open
Abstract
Antibiotic resistance of bacteria is considered one of the most alarming developments in modern medicine. While varied pathways by which bacteria acquire antibiotic resistance have been identified, there are still open questions concerning the mechanisms underlying resistance. Here, we show that alpha phenol-soluble modulins (PSMαs), functional bacterial amyloids secreted by Staphylococcus aureus, catalyze hydrolysis of β-lactams, a prominent class of antibiotic compounds. Specifically, we show that PSMα2 and, particularly, PSMα3 catalyze hydrolysis of the amide-like bond of the four-membered β-lactam ring of nitrocefin, an antibiotic β-lactam surrogate. Examination of the catalytic activities of several PSMα3 variants allowed mapping of the active sites on the amyloid fibrils' surface, specifically underscoring the key roles of the cross-α fibril organization and the combined electrostatic and nucleophilic functions of the lysine arrays. Molecular dynamics simulations further illuminate the structural features of β-lactam association with the fibril surface. Complementary experimental data underscore the generality of the functional amyloid-mediated catalytic phenomenon, demonstrating hydrolysis of clinically employed β-lactams by PSMα3 fibrils and illustrating antibiotic degradation in actual S. aureus biofilms and live bacteria environments. Overall, this study unveils functional amyloids as catalytic agents inducing degradation of β-lactam antibiotics, underlying possible antibiotic resistance mechanisms associated with bacterial biofilms.
Affiliation(s)
- Elad Arad
  - Ilse Katz Institute (IKI) for Nanoscale Science and Technology, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
  - Department of Chemistry, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
- Kasper B Pedersen
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000, Aarhus C, Denmark
- Orit Malka
  - Department of Chemistry, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
- Sisira Mambram Kunnath
  - Ilse Katz Institute (IKI) for Nanoscale Science and Technology, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
  - Department of Chemistry, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
- Nimrod Golan
  - Department of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
- Polina Aibinder
  - Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
- Birgit Schiøtt
  - Department of Chemistry, Aarhus University, Langelandsgade 140, 8000, Aarhus C, Denmark
  - Interdisciplinary Nanoscience Center (iNANO), Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark
- Hanna Rapaport
  - Ilse Katz Institute (IKI) for Nanoscale Science and Technology, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
  - Avram and Stella Goldstein-Goren Department of Biotechnology Engineering, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
- Meytal Landau
  - Department of Biology, Technion-Israel Institute of Technology, Haifa, 3200003, Israel
  - Centre for Structural Systems Biology (CSSB), and European Molecular Biology Laboratory (EMBL), Hamburg, 22607, Germany
- Raz Jelinek
  - Ilse Katz Institute (IKI) for Nanoscale Science and Technology, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
  - Department of Chemistry, Ben Gurion University of the Negev, Beer Sheva, 8410501, Israel
|
17
|
Dandekar BR, Majumdar BB, Mondal J. Nonmonotonic Modulation of the Protein-Ligand Recognition Event by Inert Crowders. J Phys Chem B 2023; 127:7449-7461. [PMID: 37590118 DOI: 10.1021/acs.jpcb.3c03946] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/19/2023]
Abstract
The ubiquitous event of a protein recognizing small molecules or ligands at its native binding site is crucial for initiating major biological processes. However, how a crowded environment, as typically represented by a cellular interior, would modulate the protein-ligand search process is still debated. Excluded volume-based theory suggests that the presence of an inert crowder would steadily stabilize and enhance the protein-ligand recognition process. Here, we counter this long-held perspective using molecular dynamics simulations and a Markov state model of the protein-ligand recognition event in the presence of inert crowders. Specifically, we demonstrate that, depending on concentration, even purely inert crowders can exert a nonmonotonic effect by either stabilizing or destabilizing the protein-ligand binding event. Analysis of the kinetic network of binding pathways reveals that the crowders either modulate pre-existing non-native on-pathway intermediates or create additional ones in a multistate recognition event across a wide range of concentrations. As an important insight, crowders gradually shift the relative transitional preference of these intermediates toward a native-bound state, with ligand residence time at the binding pocket dictating the trend of nonmonotonic concentration dependence by simple inert crowders.
|
18
|
Kozlowski N, Grubmüller H. Uncertainties in Markov State Models of Small Proteins. J Chem Theory Comput 2023; 19:5516-5524. [PMID: 37540193 PMCID: PMC10448719 DOI: 10.1021/acs.jctc.3c00372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Indexed: 08/05/2023]
Abstract
Markov state models are widely used to describe and analyze protein dynamics based on molecular dynamics simulations, specifically to extract functionally relevant characteristic time scales and motions. Particularly for larger biomolecules such as proteins, however, insufficient sampling is a notorious concern and often the source of large uncertainties that are difficult to quantify. Furthermore, there are several other sources of uncertainty, such as choice of the number of Markov states and lag time, choice and parameters of the dimension reduction preprocessing step, and uncertainty due to the limited number of observed transitions; the latter is often estimated via a Bayesian approach. Here, we quantified and ranked all of these uncertainties for four small globular test proteins. We found that the largest uncertainty is due to insufficient sampling and initially increases with the total trajectory length T up to a critical tipping point, after which it decreases as 1/T, thus providing guidelines for how much sampling is required for a given accuracy. We also found that single long trajectories yielded better sampling accuracy than many shorter trajectories starting from the same structure. In comparison, the remaining sources of the above uncertainties are generally smaller by a factor of about 5, rendering them less of a concern but certainly not negligible. Importantly, the Bayes uncertainty, commonly used as the only uncertainty estimate, captures only a relatively small part of the true uncertainty, which is thus often drastically underestimated.
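To make concrete where the number of states, the lag time, and the observed transition counts enter a Markov state model, the sketch below estimates a transition matrix from a discretized trajectory and converts its eigenvalues into implied time scales. It is a minimal illustration using a simple row-normalized (non-reversible) estimator, not the analysis pipeline of the paper; the function names and the toy trajectory are assumptions.

```python
import numpy as np


def estimate_msm(dtraj, n_states, lag):
    """Row-normalized transition matrix from transition counts at a given lag."""
    dtraj = np.asarray(dtraj)
    counts = np.zeros((n_states, n_states))
    # Count every observed transition dtraj[t] -> dtraj[t + lag].
    np.add.at(counts, (dtraj[:-lag], dtraj[lag:]), 1.0)
    row_sums = counts.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("some state has no outgoing transitions at this lag")
    return counts / row_sums


def implied_timescales(tmat, lag):
    """Implied time scales t_i = -lag / ln|lambda_i| for the non-stationary modes."""
    eigvals = np.sort(np.abs(np.linalg.eigvals(tmat)))[::-1]
    return -lag / np.log(eigvals[1:])   # skip the stationary eigenvalue 1


# Hypothetical two-state discretized trajectory.
toy_dtraj = [0, 0, 0, 1, 1, 1, 0, 0, 0, 1]
toy_tmat = estimate_msm(toy_dtraj, n_states=2, lag=1)
toy_its = implied_timescales(toy_tmat, lag=1)
```

With so few transitions the estimated matrix is dominated by counting noise, which is the finite-sampling effect the abstract identifies as the largest source of uncertainty. Repeating the estimate for different lag times or state definitions shows how strongly the implied time scales depend on those choices.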
Affiliation(s)
- Nicolai Kozlowski
  - Department of Theoretical and Computational Biophysics, Max-Planck-Institute for Multidisciplinary Sciences, Göttingen 37077, Germany
- Helmut Grubmüller
  - Department of Theoretical and Computational Biophysics, Max-Planck-Institute for Multidisciplinary Sciences, Göttingen 37077, Germany
|
19
|
Voelz VA, Pande VS, Bowman GR. Folding@home: Achievements from over 20 years of citizen science herald the exascale era. Biophys J 2023; 122:2852-2863. [PMID: 36945779 PMCID: PMC10398258 DOI: 10.1016/j.bpj.2023.03.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 09/02/2022] [Revised: 01/26/2023] [Accepted: 03/16/2023] [Indexed: 03/23/2023] Open
Abstract
Simulations of biomolecules have enormous potential to inform our understanding of biology but require extremely demanding calculations. For over 20 years, the Folding@home distributed computing project has pioneered a massively parallel approach to biomolecular simulation, harnessing the resources of citizen scientists across the globe. Here, we summarize the scientific and technical advances this perspective has enabled. As the project's name implies, the early years of Folding@home focused on driving advances in our understanding of protein folding by developing statistical methods for capturing long-timescale processes and facilitating insight into complex dynamical processes. Success laid a foundation for broadening the scope of Folding@home to address other functionally relevant conformational changes, such as receptor signaling, enzyme dynamics, and ligand binding. Continued algorithmic advances, hardware developments such as graphics processing unit (GPU)-based computing, and the growing scale of Folding@home have enabled the project to focus on new areas where massively parallel sampling can be impactful. While previous work sought to expand toward larger proteins with slower conformational changes, new work focuses on large-scale comparative studies of different protein sequences and chemical compounds to better understand biology and inform the development of small-molecule drugs. Progress on these fronts enabled the community to pivot quickly in response to the COVID-19 pandemic, expanding to become the world's first exascale computer and deploying this massive resource to provide insight into the inner workings of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) virus and aid the development of new antivirals. This success provides a glimpse of what is to come as exascale supercomputers come online and as Folding@home continues its work.
Affiliation(s)
- Vincent A Voelz
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania
- Gregory R Bowman
  - Departments of Biochemistry & Biophysics and of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania
|
20
|
Strahan J, Guo SC, Lorpaiboon C, Dinner AR, Weare J. Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction. J Chem Phys 2023; 159:014110. [PMID: 37409704 PMCID: PMC10328561 DOI: 10.1063/5.0151309] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/20/2023] [Accepted: 06/02/2023] [Indexed: 07/07/2023] Open
Abstract
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Affiliation(s)
- John Strahan
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Spencer C. Guo
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Chatipat Lorpaiboon
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Aaron R. Dinner
  - Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
- Jonathan Weare
  - Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
|
21
|
Boothroyd S, Behara PK, Madin OC, Hahn DF, Jang H, Gapsys V, Wagner JR, Horton JT, Dotson DL, Thompson MW, Maat J, Gokey T, Wang LP, Cole DJ, Gilson MK, Chodera JD, Bayly CI, Shirts MR, Mobley DL. Development and Benchmarking of Open Force Field 2.0.0: The Sage Small Molecule Force Field. J Chem Theory Comput 2023; 19:3251-3275. [PMID: 37167319 PMCID: PMC10269353 DOI: 10.1021/acs.jctc.3c00039] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 01/09/2023] [Indexed: 05/13/2023]
Abstract
We introduce the Open Force Field (OpenFF) 2.0.0 small molecule force field for drug-like molecules, code-named Sage, which builds upon our previous iteration, Parsley. OpenFF force fields are based on direct chemical perception, which generalizes easily to highly diverse sets of chemistries based on substructure queries. Like the previous OpenFF iterations, the Sage generation of OpenFF force fields was validated in protein-ligand simulations to be compatible with AMBER biopolymer force fields. In this work, we detail the methodology used to develop this force field, as well as the innovations and improvements introduced since the release of Parsley 1.0.0. One particularly significant feature of Sage is a set of improved Lennard-Jones (LJ) parameters retrained against condensed phase mixture data, the first refit of LJ parameters in the OpenFF small molecule force field line. Sage also includes valence parameters refit to a larger database of quantum chemical calculations than previous versions, as well as improvements in how this fitting is performed. Force field benchmarks show improvements in general metrics of performance against quantum chemistry reference data such as root-mean-square deviations (RMSD) of optimized conformer geometries, torsion fingerprint deviations (TFD), and improved relative conformer energetics (ΔΔE). We present a variety of benchmarks for these metrics against our previous force fields as well as in some cases other small molecule force fields. Sage also demonstrates improved performance in estimating physical properties, including comparison against experimental data from various thermodynamic databases for small molecule properties such as ΔHmix, ρ(x), ΔGsolv, and ΔGtrans. Additionally, we benchmarked against protein-ligand binding free energies (ΔGbind), where Sage yields results statistically similar to previous force fields. 
All the data is made publicly available along with complete details on how to reproduce the training results at https://github.com/openforcefield/openff-sage.
Affiliation(s)
- Pavan Kumar Behara
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
- Owen C. Madin
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David F. Hahn
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
- Hyesu Jang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
  - OpenEye Scientific Software, Santa Fe, New Mexico 87508, United States
- Vytautas Gapsys
  - Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, Beerse B-2340, Belgium
  - Computational Biomolecular Dynamics Group, Department of Theoretical and Computational Biophysics, Max Planck Institute for Multidisciplinary Sciences, Am Fassberg 11, D-37077, Göttingen, Germany
- Jeffrey R. Wagner
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Joshua T. Horton
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- David L. Dotson
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
  - Datryllic LLC, Phoenix, Arizona 85003, United States
- Matthew W. Thompson
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
  - The Open Force Field Initiative, Open Molecular Software Foundation, Davis, California 95616, United States
- Jessica Maat
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Lee-Ping Wang
  - Chemistry Department, The University of California at Davis, Davis, California 95616, United States
- Daniel J. Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, U.K.
- Michael K. Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- John D. Chodera
  - Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Michael R. Shirts
  - Chemical & Biological Engineering Department, University of Colorado Boulder, Boulder, Colorado 80309, United States
- David L. Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
  - Department of Chemistry, University of California, Irvine, California 92697, United States
|
22
|
Dominic AJ, Cao S, Montoya-Castillo A, Huang X. Memory Unlocks the Future of Biomolecular Dynamics: Transformative Tools to Uncover Physical Insights Accurately and Efficiently. J Am Chem Soc 2023; 145:9916-9927. [PMID: 37104720 DOI: 10.1021/jacs.3c01095] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 04/29/2023]
Abstract
Conformational changes underpin function and encode complex biomolecular mechanisms. Gaining atomic-level detail of how such changes occur has the potential to reveal these mechanisms and is of critical importance in identifying drug targets, facilitating rational drug design, and enabling bioengineering applications. While the past two decades have brought Markov state model techniques to the point where practitioners can regularly use them to glimpse the long-time dynamics of slow conformations in complex systems, many systems are still beyond their reach. In this Perspective, we discuss how including memory (i.e., non-Markovian effects) can reduce the computational cost to predict the long-time dynamics in these complex systems by orders of magnitude and with greater accuracy and resolution than state-of-the-art Markov state models. We illustrate how memory lies at the heart of successful and promising techniques, ranging from the Fokker-Planck and generalized Langevin equations to deep-learning recurrent neural networks and generalized master equations. We delineate how these techniques work, identify insights that they can offer in biomolecular systems, and discuss their advantages and disadvantages in practical settings. We show how generalized master equations can enable the investigation of, for example, the gate-opening process in RNA polymerase II and demonstrate how our recent advances tame the deleterious influence of statistical underconvergence of the molecular dynamics simulations used to parameterize these techniques. This represents a significant leap forward that will enable our memory-based techniques to interrogate systems that are currently beyond the reach of even the best Markov state models. We conclude by discussing some current challenges and future prospects for how exploiting memory will open the door to many exciting opportunities.
Affiliation(s)
- Anthony J Dominic
  - Department of Chemistry, University of Colorado Boulder, Boulder, Colorado 80309, USA
- Siqin Cao
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Xuhui Huang
  - Department of Chemistry, Theoretical Chemistry Institute, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
|
23
|
Ojha AA, Srivastava A, Votapka LW, Amaro RE. Selectivity and Ranking of Tight-Binding JAK-STAT Inhibitors Using Markovian Milestoning with Voronoi Tessellations. J Chem Inf Model 2023; 63:2469-2482. [PMID: 37023323 PMCID: PMC10131228 DOI: 10.1021/acs.jcim.2c01589] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/08/2023]
Abstract
Janus kinases (JAKs), a group of proteins in the nonreceptor tyrosine kinase (NRTK) family, play a crucial role in growth, survival, and angiogenesis. They are activated by cytokines through the Janus kinase-signal transducer and activator of transcription (JAK-STAT) signaling pathway. JAK-STAT signaling pathways have significant roles in the regulation of cell division, apoptosis, and immunity. Identification of the V617F mutation in the Janus homology 2 (JH2) domain of JAK2, which leads to myeloproliferative disorders, has stimulated great interest in the drug discovery community in developing JAK2-specific inhibitors. However, such inhibitors should be selective toward JAK2 over other JAKs and display an extended residence time. Recently, novel JAK2/STAT5 axis inhibitors (N-(1H-pyrazol-3-yl)pyrimidin-2-amino derivatives) have displayed extended residence times (hours or longer) on target and adequate selectivity excluding JAK3. To facilitate a deeper understanding of the kinase-inhibitor interactions and advance the development of such inhibitors, we utilize a multiscale Markovian milestoning with Voronoi tessellations (MMVT) approach within the Simulation-Enabled Estimation of Kinetic Rates v.2 (SEEKR2) program to rank order these inhibitors based on their kinetic properties and further explain the selectivity of JAK2 inhibitors over JAK3. Our approach investigates the kinetic and thermodynamic properties of JAK-inhibitor complexes in a user-friendly, fast, efficient, and accurate manner compared to other brute force and hybrid-enhanced sampling approaches.
Affiliation(s)
- Anupam Anand Ojha
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Ambuj Srivastava
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Lane William Votapka
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
- Rommie E Amaro
  - Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
|
24
|
Chakraborty D, Straub JE, Thirumalai D. Energy landscapes of Aβ monomers are sculpted in accordance with Ostwald's rule of stages. Sci Adv 2023; 9:eadd6921. [PMID: 36947617 PMCID: PMC10032606 DOI: 10.1126/sciadv.add6921] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 06/28/2022] [Accepted: 02/22/2023] [Indexed: 06/18/2023]
Abstract
The transition from a disordered to an assembly-competent monomeric state (N*) in amyloidogenic sequences is a crucial event in the aggregation cascade. Using a well-calibrated model for intrinsically disordered proteins (IDPs), we show that the N* states, which bear considerable resemblance to the polymorphic fibril structures found in experiments, not only appear as excitations in the free energy landscapes of Aβ40 and Aβ42, but also initiate the aggregation cascade. For Aβ42, the transitions to the different N* states are in accord with Ostwald's rule of stages, with the least stable structures forming ahead of thermodynamically favored ones. The Aβ40 and Aβ42 monomer landscapes exhibit different extents of local frustration, which we show have profound implications in dictating subsequent self-assembly. Using kinetic transition networks, we illustrate that the most favored dimerization routes proceed via N* states. We argue that Ostwald's rule also holds for the aggregation of fused in sarcoma and polyglutamine proteins.
Affiliation(s)
- Debayan Chakraborty
- Department of Chemistry, The University of Texas at Austin, 105 E 24th Street, Stop A5300, Austin TX 78712, USA
| | - John E. Straub
- Department of Chemistry, Boston University, MA 022155, USA
| | - D. Thirumalai
- Department of Chemistry, The University of Texas at Austin, 105 E 24th Street, Stop A5300, Austin TX 78712, USA
|
25
|
Bernard DN, Narayanan C, Hempel T, Bafna K, Bhojane PP, Létourneau M, Howell EE, Agarwal PK, Doucet N. Conformational exchange divergence along the evolutionary pathway of eosinophil-associated ribonucleases. Structure 2023; 31:329-342.e4. [PMID: 36649708 PMCID: PMC9992247 DOI: 10.1016/j.str.2022.12.011] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/06/2022] [Revised: 11/24/2022] [Accepted: 12/20/2022] [Indexed: 01/18/2023]
Abstract
The evolutionary role of conformational exchange in the emergence and preservation of function within structural homologs remains elusive. While protein engineering has revealed the importance of flexibility in function, productive modulation of atomic-scale dynamics has only been achieved on a finite number of distinct folds. Allosteric control of unique members within dynamically diverse structural families requires a better appreciation of exchange phenomena. Here, we examined the functional and structural role of conformational exchange within eosinophil-associated ribonucleases (EARs). The biological and catalytic activities of various EARs were characterized in parallel with mapping their conformational behavior on multiple timescales using NMR and computational analyses. Despite functional conservation and conformational seclusion to a specific domain, we show that EARs can display similar or distinct motional profiles, implying divergence rather than conservation of flexibility. Comparing progressively more distant enzymes should unravel how this subfamily has evolved new functions and/or altered their behavior at the molecular level.
Affiliation(s)
- David N Bernard
- Centre Armand-Frappier Santé Biotechnologie, Institut national de la recherche scientifique (INRS), Université du Québec, 531 Boulevard des Prairies, Laval, QC H7V 1B7, Canada
| | - Chitra Narayanan
- Centre Armand-Frappier Santé Biotechnologie, Institut national de la recherche scientifique (INRS), Université du Québec, 531 Boulevard des Prairies, Laval, QC H7V 1B7, Canada; Department of Chemistry, New Jersey City University, Jersey City, NJ 07305, USA
| | - Tim Hempel
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
| | - Khushboo Bafna
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
| | - Purva Prashant Bhojane
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
| | - Myriam Létourneau
- Centre Armand-Frappier Santé Biotechnologie, Institut national de la recherche scientifique (INRS), Université du Québec, 531 Boulevard des Prairies, Laval, QC H7V 1B7, Canada
| | - Elizabeth E Howell
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
| | - Pratul K Agarwal
- Department of Physiological Sciences and High-Performance Computing Center, Oklahoma State University, Stillwater, OK 74078, USA.
| | - Nicolas Doucet
- Centre Armand-Frappier Santé Biotechnologie, Institut national de la recherche scientifique (INRS), Université du Québec, 531 Boulevard des Prairies, Laval, QC H7V 1B7, Canada; PROTEO, the Québec Network for Research on Protein Function, Engineering, and Applications, Université Laval, 1045 Avenue de la Médecine, Québec, QC G1V 0A6, Canada.
|
26
|
Omar SI, Keasar C, Ben-Sasson AJ, Haber E. Protein Design Using Physics Informed Neural Networks. Biomolecules 2023; 13:biom13030457. [PMID: 36979392 PMCID: PMC10046838 DOI: 10.3390/biom13030457] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/18/2023] [Revised: 02/16/2023] [Accepted: 02/27/2023] [Indexed: 03/06/2023] Open
Abstract
The inverse protein folding problem, also known as protein sequence design, seeks to predict an amino acid sequence that folds into a specific structure and performs a specific function. Recent advancements in machine learning techniques have been successful in generating functional sequences, outperforming previous energy function-based methods. However, these machine learning methods are limited in their interoperability and robustness, especially when designing proteins that must function under non-ambient conditions, such as high temperature, extreme pH, or in various ionic solvents. To address this issue, we propose a new Physics-Informed Neural Networks (PINNs)-based protein sequence design approach. Our approach combines all-atom molecular dynamics simulations, a PINNs MD surrogate model, and a relaxation of binary programming to solve the protein design task while optimizing both energy and the structural stability of proteins. We demonstrate the effectiveness of our design framework in designing proteins that can function under non-ambient conditions.
Affiliation(s)
| | - Chen Keasar
- Department of Computer Science, Ben Gurion University of the Negev, Be’er Sheva 84105, Israel
| | - Ariel J. Ben-Sasson
- Independent Researcher, Haifa 3436301, Israel
- Correspondence: (A.J.B.-S.); (E.H.)
| | - Eldad Haber
- Department of Earth Ocean and Atmospheric Sciences, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Correspondence: (A.J.B.-S.); (E.H.)
|
27
|
Yang W, Zhuang J, Li C, Cheng GJ. Unveiling the Methyl Transfer Mechanisms in the Epigenetic Machinery DNMT3A-3L: A Comprehensive Study Integrating Assembly Dynamics with Catalytic Reactions. Comput Struct Biotechnol J 2023; 21:2086-2099. [PMID: 36968013 PMCID: PMC10034213 DOI: 10.1016/j.csbj.2023.03.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 03/02/2023] [Accepted: 03/02/2023] [Indexed: 03/07/2023] Open
Abstract
In epigenetic mechanisms, DNA methyltransferase 3 alpha (DNMT3A) acts as an initiator of DNA methylation and prevents downstream genes from being expressed. Perturbations of DNMT3A function may cause uncontrolled gene expression, resulting in pathogenic consequences such as cancers. It is, therefore, vitally important to understand the catalytic process of DNMT3A in its biological macromolecular assembly, viz., the heterotetramer (DNMT3A-3L) dimer. In this study, we utilized molecular dynamics (MD) simulations, Markov State Models (MSM), and quantum mechanics/molecular mechanics (QM/MM) simulations to investigate the de novo methyl transfer process. We identified the dynamics of the key residues relevant to the insertion of the target cytosine (dC) into the catalytic domain of DNMT3A, and the detailed potential energy surface of the seven-step methyl transfer reaction. Our calculated potential energy barrier (22.51 kcal/mol) is close to the previously reported experimental value (23.12 kcal/mol). The conformational change of the 5-methyl-cytosine (5mC) intermediate was found to be necessary for forming a four-water chain for the elimination step, a feature that distinguishes DNMT3A from the other DNMTs. The biological assembly facilitates the creation of such a water chain, and the elimination occurs asynchronously in the two catalytic pockets. We anticipate that the findings will enable a better understanding of the general mechanisms of de novo methyl transfer in fulfilling key enzymatic functions in epigenetics. The unique elimination mechanism of DNMT3A might also inspire novel methods for designing anti-cancer and anti-tumor inhibitors of DNMTs.
Affiliation(s)
- Wei Yang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Biotechnology, University of Science and Technology of China, Hefei 230026, China
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Jingyuan Zhuang
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
| | - Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
| | - Gui-Juan Cheng
- Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Shenzhen Key Laboratory of Steroid Drug Development, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Corresponding author at: Warshel Institute for Computational Biology, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China.
|
28
|
Galama MM, Wu H, Krämer A, Sadeghi M, Noé F. Stochastic Approximation to MBAR and TRAM: Batchwise Free Energy Estimation. J Chem Theory Comput 2023; 19:758-766. [PMID: 36689637 DOI: 10.1021/acs.jctc.2c00976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
The dynamics of molecules are governed by rare event transitions between long-lived (metastable) states. To explore these transitions efficiently, many enhanced sampling protocols have been introduced that involve using simulations with biases or changed temperatures. Two established statistically optimal estimators for obtaining unbiased equilibrium properties from such simulations are the multistate Bennett acceptance ratio (MBAR) and the transition-based reweighting analysis method (TRAM). Both MBAR and TRAM are solved iteratively and can suffer from long convergence times. Here, we introduce stochastic approximators (SA) for both estimators, resulting in SAMBAR and SATRAM, which are shown to converge faster than their deterministic counterparts, without significant accuracy loss. Both methods are demonstrated on different molecular systems.
Affiliation(s)
- Maaike M Galama
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Hao Wu
- School of Mathematical Sciences, Institute of Natural Sciences, and MOE-LSC, Shanghai Jiao Tong University, 200240 Shanghai, China; School of Mathematical Sciences, Tongji University, 200092 Shanghai, China
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Mohsen Sadeghi
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany; Microsoft Research AI4Science, Karl Liebknecht Str 32, 10178 Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany; Department of Chemistry, Rice University, 6100 Main St., Houston, Texas 77005-1827, United States
|
29
|
Aristoff D, Copperman J, Simpson G, Webber RJ, Zuckerman DM. Weighted ensemble: Recent mathematical developments. J Chem Phys 2023; 158:014108. [PMID: 36610976 PMCID: PMC9822651 DOI: 10.1063/5.0110873] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/17/2022] [Accepted: 12/04/2022] [Indexed: 12/12/2022] Open
Abstract
Weighted ensemble (WE) is an enhanced sampling method based on periodically replicating and pruning trajectories generated in parallel. WE has grown increasingly popular for computational biochemistry problems due, in part, to improved hardware and accessible software implementations. Algorithmic and analytical improvements have played an important role, and progress has accelerated in recent years. Here, we discuss and elaborate on the WE method from a mathematical perspective, highlighting recent results that enhance the computational efficiency. The mathematical theory reveals a new strategy for optimizing trajectory management that approaches the best possible variance while generalizing to systems of arbitrary dimension.
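The replicate-and-prune step described in the abstract above can be illustrated with a minimal Python sketch. It assumes the classic split/merge rule inside a single bin (split the heaviest walker, merge the two lightest and keep one of them with probability proportional to its weight); it is not the optimized trajectory-management strategy that the paper derives. The function name `resample_bin`, the `(state, weight)` tuple layout, and the `target` count are hypothetical choices made for illustration.

```python
import random


def resample_bin(walkers, target, rng):
    """Return `target` walkers carrying the same total weight as the input.

    walkers: non-empty list of (state, weight) tuples in one bin (hypothetical layout).
    target:  desired number of walkers in the bin (at least 1).
    rng:     a random.Random instance, passed in for reproducibility.
    """
    walkers = list(walkers)

    # Replicate: split the heaviest walker into two copies of half weight.
    while len(walkers) < target:
        walkers.sort(key=lambda w: w[1])
        state, weight = walkers.pop()
        walkers.extend([(state, weight / 2), (state, weight / 2)])

    # Prune: merge the two lightest walkers; the survivor is chosen with
    # probability proportional to its weight and inherits the summed weight.
    while len(walkers) > target:
        walkers.sort(key=lambda w: w[1])
        (s1, w1), (s2, w2) = walkers[0], walkers[1]
        total = w1 + w2
        survivor = s1 if rng.random() < w1 / total else s2
        walkers = walkers[2:] + [(survivor, total)]

    return walkers


rng = random.Random(0)
grown = resample_bin([("a", 0.5), ("b", 0.3), ("c", 0.2)], target=5, rng=rng)
pruned = resample_bin([("a", 0.4), ("b", 0.3), ("c", 0.2), ("d", 0.1)], target=2, rng=rng)
```

Both operations leave the total weight in the bin unchanged, which is what lets the resampled ensemble keep the same statistical weight as the original one.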
Affiliation(s)
- D. Aristoff
- Mathematics, Colorado State University, Fort Collins, CO 80521 USA
| | - J. Copperman
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239 USA
| | - G. Simpson
- Mathematics, Drexel University, Philadelphia, Pennsylvania 19104 USA
| | - R. J. Webber
- Computing and Mathematical Sciences, California Institute of Technology, Pasadena, California 91125 USA
| | - D. M. Zuckerman
- Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239 USA
|
30
|
Markov field models: Scaling molecular kinetics approaches to large molecular machines. Curr Opin Struct Biol 2022; 77:102458. [PMID: 36162297 DOI: 10.1016/j.sbi.2022.102458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Accepted: 08/05/2022] [Indexed: 12/14/2022]
Abstract
With recent advances in structural biology, including experimental techniques and deep learning-enabled high-precision structure predictions, molecular dynamics methods that scale up to large biomolecular systems are required. Current state-of-the-art approaches in molecular dynamics modeling focus on encoding global configurations of molecular systems as distinct states. This paradigm commands us to map out all possible structures and sample transitions between them, a task that becomes impossible for large-scale systems such as biomolecular complexes. To arrive at scalable molecular models, we suggest moving away from global state descriptions to a set of coupled models that each describe the dynamics of local domains or sites of the molecular system. We describe limitations in the current state-of-the-art global-state Markovian modeling approaches and then introduce Markov field models as an umbrella term that includes models from various scientific communities, including Independent Markov decomposition, Ising and Potts models, and (dynamic) graphical models, and evaluate their use for computational molecular biology. Finally, we give a few examples of early adoptions of these ideas for modeling molecular kinetics and thermodynamics.
|
31
|
Mardt A, Hempel T, Clementi C, Noé F. Deep learning to decompose macromolecules into independent Markovian domains. Nat Commun 2022; 13:7101. [PMID: 36402768 PMCID: PMC9675806 DOI: 10.1038/s41467-022-34603-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/16/2022] [Accepted: 10/27/2022] [Indexed: 11/21/2022] Open
Abstract
The increasing interest in modeling the dynamics of ever larger proteins has revealed a fundamental problem with models that describe the molecular system as being in a global configuration state. This notion limits our ability to gather sufficient statistics of state probabilities or state-to-state transitions because for large molecular systems the number of metastable states grows exponentially with size. In this manuscript, we approach this challenge by introducing a method that combines our recent progress on independent Markov decomposition (IMD) with VAMPnets, a deep learning approach to Markov modeling. We establish a training objective that quantifies how well a given decomposition of the molecular system into independent subdomains with Markovian dynamics approximates the overall dynamics. By constructing an end-to-end learning framework, the decomposition into such subdomains and their individual Markov state models are simultaneously learned, providing a data-efficient and easily interpretable summary of the complex system dynamics. While learning the dynamical coupling between Markovian subdomains is still an open issue, the present results are a significant step towards learning Ising models of large molecular complexes from simulation data.
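The scaling argument in the abstract above can be made concrete with a small worked example. For subsystems that evolve independently, the transition matrix of the global system is the Kronecker product of the subsystem transition matrices, so the number of global states multiplies with every added subsystem. The Python sketch below assumes two hypothetical two-state subdomains with made-up transition probabilities; it illustrates the independence assumption only and is not the learning method of the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Global state (i, j) is stored at index i * len(b) + j.
    """
    return [
        [a_ik * b_jl for a_ik in row_a for b_jl in row_b]
        for row_a in a
        for row_b in b
    ]


def is_row_stochastic(matrix, tol=1e-12):
    """Check that every row sums to one."""
    return all(abs(sum(row) - 1.0) < tol for row in matrix)


def n_global_states(n_subsystems, states_per_subsystem=2):
    """Number of global states for independent, equally sized subsystems."""
    return states_per_subsystem ** n_subsystems


# Hypothetical transition matrices for two independent two-state subdomains.
t_a = [[0.9, 0.1], [0.2, 0.8]]
t_b = [[0.7, 0.3], [0.4, 0.6]]

# Global 4x4 transition matrix over the joint states (0,0), (0,1), (1,0), (1,1).
t_global = kron(t_a, t_b)
```

Ten independent two-state subdomains already imply `n_global_states(10) == 1024` global states, whereas the decomposed description needs only ten 2x2 matrices.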
Affiliation(s)
- Andreas Mardt
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany
| | - Tim Hempel
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany; Freie Universität Berlin, Department of Physics, Berlin, Germany
| | - Cecilia Clementi
- Freie Universität Berlin, Department of Physics, Berlin, Germany; Rice University, Department of Chemistry, Houston, TX, USA; Rice University, Center for Theoretical Biological Physics, Houston, TX, USA
| | - Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science, Berlin, Germany; Freie Universität Berlin, Department of Physics, Berlin, Germany; Rice University, Department of Chemistry, Houston, TX, USA; Microsoft Research AI4Science, Berlin, Germany
|
32
|
Casalino L, Seitz C, Lederhofer J, Tsybovsky Y, Wilson IA, Kanekiyo M, Amaro RE. Breathing and tilting: mesoscale simulations illuminate influenza glycoprotein vulnerabilities. bioRxiv 2022:2022.08.02.502576. [PMID: 35982676 PMCID: PMC9387122 DOI: 10.1101/2022.08.02.502576] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/05/2023]
Abstract
Influenza virus has resurfaced recently from inactivity during the early stages of the COVID-19 pandemic, raising serious concerns about the nature and magnitude of future epidemics. The main antigenic targets of influenza virus are two surface glycoproteins, hemagglutinin (HA) and neuraminidase (NA). Whereas the structural and dynamical properties of both glycoproteins have been studied previously, the understanding of their plasticity in the whole-virion context is fragmented. Here, we investigate the dynamics of influenza glycoproteins in a crowded protein environment through mesoscale all-atom molecular dynamics simulations of two evolutionary-linked glycosylated influenza A whole-virion models. Our simulations reveal and kinetically characterize three main molecular motions of influenza glycoproteins: NA head tilting, HA ectodomain tilting, and HA head breathing. The flexibility of HA and NA highlights antigenically relevant conformational states, as well as facilitates the characterization of a novel monoclonal antibody, derived from human convalescent plasma, that binds to the underside of the NA head. Our work provides previously unappreciated views on the dynamics of HA and NA, advancing the understanding of their interplay and suggesting possible strategies for the design of future vaccines and antivirals against influenza.
Affiliation(s)
- Lorenzo Casalino
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Christian Seitz
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
| | - Julia Lederhofer
- Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Yaroslav Tsybovsky
- Electron Microscopy Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research sponsored by the National Cancer Institute, Frederick, MD 21702, United States
| | - Ian A. Wilson
- Department of Integrative Structural and Computational Biology and the Skaggs Institute for Chemical Biology, The Scripps Research Institute, La Jolla, CA 92037, United States
| | - Masaru Kanekiyo
- Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Rommie E. Amaro
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
|
33
|
Soltani S, Sinclair CW, Rottler J. Exploring glassy dynamics with Markov state models from graph dynamical neural networks. Phys Rev E 2022; 106:025308. [PMID: 36109953 DOI: 10.1103/physreve.106.025308] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/17/2022] [Accepted: 07/21/2022] [Indexed: 06/15/2023]
Abstract
Using machine learning techniques, we introduce a Markov state model (MSM) for a model glass former that reveals structural heterogeneities and their slow dynamics by coarse-graining the molecular dynamics into a low-dimensional feature space. The transition timescale between states is larger than the conventional structural relaxation time τ_{α}, but can be obtained from trajectories much shorter than τ_{α}. The learned map of states assigned to the particles corresponds to local excess Voronoi volume. These results resonate with classic free volume theories of the glass transition, singling out local packing fluctuations as one of the dominant slowly relaxing features.
Affiliation(s)
- Siavash Soltani
- Department of Materials Engineering, The University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
| | - Chad W Sinclair
- Department of Materials Engineering, The University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
| | - Jörg Rottler
- Department of Physics and Astronomy, The University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z1
- Stewart Blusson Quantum Matter Institute, The University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
|
34
|
Abstract
Multifunctional systems, such as molecular switches, exhibit multifunnel energy landscapes associated with the alternative functional states. In this contribution the multifunnel organization is decoded from dynamical signatures in the first passage time distribution between reactants and products. Characteristic relaxation rates are revealed by analyzing the kinetics as a function of the observation time scale, which scans the underlying distribution. Extracting the corresponding dynamical signatures provides direct insight into the organization of the molecular energy landscape, which will facilitate a rational design of target functionality. Examples are illustrated for multifunnel landscapes in biomolecular systems and an atomic cluster.
|
35
|
Cong X, Zhang X, Liang X, He X, Tang Y, Zheng X, Lu S, Zhang J, Chen T. Delineating the conformational landscape and intrinsic properties of the angiotensin II type 2 receptor using a computational study. Comput Struct Biotechnol J 2022; 20:2268-2279. [PMID: 35615027 PMCID: PMC9117689 DOI: 10.1016/j.csbj.2022.05.012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/09/2022] [Revised: 05/04/2022] [Accepted: 05/06/2022] [Indexed: 12/22/2022] Open
Abstract
As a key regulator of the renin-angiotensin system, the AngII type 2 receptor (AT2R), a class A G protein-coupled receptor (GPCR), plays a pivotal role in the homeostasis of the cardiovascular system. Compared with other GPCRs, AT2R has a unique antagonist-bound conformation, and its mechanism is still an enigma. Here, we applied combined dynamical and evolutionary approaches to investigate the conformational space and intrinsic properties of AT2R. With molecular dynamics simulations, Markov State Models, and statistical coupling analysis, we captured the conformational landscape of AT2R and identified its uniqueness from both dynamical and evolutionary viewpoints. A cryptic pocket was also discovered in the intermediate state during conformational transitions. These findings offer a deeper understanding of the AT2R mechanism at an atomic level and provide hints for the design of novel AT2R modulators.
Affiliation(s)
- Xiaoliang Cong
- Department of Cardiology, Shanghai Changzheng Hospital, the Second Affiliated Hospital of Naval Medical University, Shanghai 200003, China
| | - Xiaogang Zhang
- Department of Cardiology, Shanghai University of Medicine & Health Sciences Affiliated Zhoupu Hospital, Shanghai 201318, China
| | - Xin Liang
- Department of Cardiology, Shanghai Changzheng Hospital, the Second Affiliated Hospital of Naval Medical University, Shanghai 200003, China
| | - Xinheng He
- Medicinal Chemistry and Bioinformatics Centre, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
| | - Yehua Tang
- Department of Cardiology, Shanghai Changzheng Hospital, the Second Affiliated Hospital of Naval Medical University, Shanghai 200003, China
| | - Xing Zheng
- Department of Cardiology, Changhai Hospital, Naval Medical University, Shanghai 200433, China
| | - Shaoyong Lu
- Medicinal Chemistry and Bioinformatics Centre, Shanghai Jiao Tong University, School of Medicine, Shanghai 200025, China
- Corresponding authors.
| | - Jiayou Zhang
- Department of Cardiology, Shanghai Changzheng Hospital, the Second Affiliated Hospital of Naval Medical University, Shanghai 200003, China
- Corresponding authors.
| | - Ting Chen
- Department of Cardiology, Shanghai Changzheng Hospital, the Second Affiliated Hospital of Naval Medical University, Shanghai 200003, China
- Corresponding authors.
|
36
|
Ghorbani M, Prasad S, Klauda JB, Brooks BR. GraphVAMPNet, using graph neural networks and variational approach to Markov processes for dynamical modeling of biomolecules. J Chem Phys 2022; 156:184103. [PMID: 35568532 PMCID: PMC9094994 DOI: 10.1063/5.0085607] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/18/2022] [Accepted: 04/22/2022] [Indexed: 11/14/2022] Open
Abstract
Finding a low dimensional representation of data from long-timescale trajectories of biomolecular processes, such as protein folding or ligand-receptor binding, is of fundamental importance, and kinetic models, such as Markov modeling, have proven useful in describing the kinetics of these systems. Recently, an unsupervised machine learning technique called VAMPNet was introduced to learn the low dimensional representation and the linear dynamical model in an end-to-end manner. VAMPNet is based on the variational approach for Markov processes and relies on neural networks to learn the coarse-grained dynamics. In this paper, we combine VAMPNet and graph neural networks to generate an end-to-end framework to efficiently learn high-level dynamics and metastable states from the long-timescale molecular dynamics trajectories. This method bears the advantages of graph representation learning and uses graph message passing operations to generate an embedding for each datapoint, which is used in the VAMPNet to generate a coarse-grained dynamical model. This type of molecular representation results in a higher resolution and a more interpretable Markov model than the standard VAMPNet, enabling a more detailed kinetic study of the biomolecular processes. Our GraphVAMPNet approach is also enhanced with an attention mechanism to find the important residues for classification into different metastable states.
Affiliation(s)
| | - Samarjeet Prasad
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
| | - Jeffery B. Klauda
- Department of Chemical and Biomolecular Engineering, University of Maryland, College Park, Maryland 20742, USA
| | - Bernard R. Brooks
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland 20824, USA
| |
Collapse
|
37
|
Hoffmann M, Scherer M, Hempel T, Mardt A, de Silva B, Husic BE, Klus S, Wu H, Kutz N, Brunton SL, Noé F. Deeptime: a Python library for machine learning dynamical models from time series data. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac3de0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/31/2022] Open
Abstract
Generation and analysis of time-series data is relevant to many quantitative fields ranging from economics to fluid mechanics. In the physical sciences, structures such as metastable and coherent sets, slow relaxation processes, collective variables, dominant transition pathways or manifolds and channels of probability flow can be of great importance for understanding and characterizing the kinetic, thermodynamic and mechanistic properties of the system. Deeptime is a general purpose Python library offering various tools to estimate dynamical models based on time-series data including conventional linear learning methods, such as Markov state models (MSMs), Hidden Markov Models and Koopman models, as well as kernel and deep learning approaches such as VAMPnets and deep MSMs. The library is largely compatible with scikit-learn, having a range of Estimator classes for these different models, but in contrast to scikit-learn also provides deep Model classes, e.g. in the case of an MSM, which provide a multitude of analysis methods to compute interesting thermodynamic, kinetic and dynamical quantities, such as free energies, relaxation times and transition paths. The library is designed for ease of use as well as for easily maintainable and extensible code. In this paper we introduce the main features and structure of the deeptime software. Deeptime can be found at https://deeptime-ml.github.io/.
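To make the MSM terminology in the abstract above concrete, the following sketch shows what the simplest Markov state model estimate computes: transition counts at a lag time, row-normalized into a transition matrix, plus the implied relaxation timescale of a two-state model. It is written in plain Python and deliberately does not use deeptime's API; the function names and the toy trajectory are hypothetical, and the library's own estimators are considerably more elaborate than this illustration.

```python
import math


def estimate_transition_matrix(dtraj, n_states, lag=1):
    """Row-normalized transition counts of a discrete trajectory at lag `lag`."""
    counts = [[0] * n_states for _ in range(n_states)]
    for t in range(len(dtraj) - lag):
        counts[dtraj[t]][dtraj[t + lag]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that were never left get an all-zero row in this sketch.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def two_state_timescale(matrix, lag=1):
    """Implied timescale -lag / ln(lambda_2) of a 2x2 row-stochastic matrix.

    For a 2x2 stochastic matrix the eigenvalues are 1 and trace - 1.
    """
    lam = matrix[0][0] + matrix[1][1] - 1.0
    if not 0.0 < lam < 1.0:
        raise ValueError("second eigenvalue must lie in (0, 1)")
    return -lag / math.log(lam)


# Toy discrete trajectory over two states (made-up data).
dtraj = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
T = estimate_transition_matrix(dtraj, n_states=2, lag=1)
timescale = two_state_timescale(T, lag=1)
```

The timescale is returned in units of trajectory steps; multiplying by the physical time between saved frames would convert it to a relaxation time.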
|
38
|
Kasson PM. Modeling biomolecular kinetics with large-scale simulation. Curr Opin Struct Biol 2022; 72:95-102. [PMID: 34592698 PMCID: PMC9476681 DOI: 10.1016/j.sbi.2021.08.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2021] [Revised: 08/26/2021] [Accepted: 08/27/2021] [Indexed: 02/03/2023]
Abstract
The molecular details of biomolecular kinetics present a challenging estimation problem because the identities of relevant intermediates and the rates of exchange between them must be determined. These can be derived from prior knowledge, but in recent years, great advances have been made in the development and application of methods to systematically determine states and rates using biomolecular simulation. Doing this for biological systems of reasonable complexity requires substantial computational power, and contemporary methods leverage distributed computing or leadership-class computing resources to accomplish this. The result has been substantial insight into pressing contemporary problems, including structural activation of pandemic viruses. Here, we highlight recent developments in both methodology and exciting applications.
Affiliation(s)
- Peter M Kasson
  - Departments of Molecular Physiology and Biomedical Engineering, University of Virginia, Box 800886, Charlottesville, VA, 22908, USA; Department of Cell and Molecular Biology, Uppsala University, Box 256, Uppsala 75105, Sweden.
|
39
|
Kamenik AS, Linker SM, Riniker S. Enhanced sampling without borders: on global biasing functions and how to reweight them. Phys Chem Chem Phys 2022; 24:1225-1236. [PMID: 34935813 PMCID: PMC8768491 DOI: 10.1039/d1cp04809k] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/20/2021] [Accepted: 12/14/2021] [Indexed: 12/17/2022]
Abstract
Molecular dynamics (MD) simulations are a powerful tool to follow the time evolution of biomolecular motions at atomistic resolution. However, the high computational demand of these simulations limits the timescales of motions that can be observed. To resolve this issue, so-called enhanced sampling techniques have been developed, which extend conventional MD algorithms to speed up the simulation process. Here, we focus on techniques that apply global biasing functions. We provide a broad overview of established enhanced sampling methods and promising new advances. As the ultimate goal is to retrieve unbiased information from biased ensembles, we also discuss benefits and limitations of common reweighting schemes. In addition to concisely summarizing critical assumptions and implications, we highlight the general application opportunities as well as uncertainties of global enhanced sampling.
Affiliation(s)
- Anna S Kamenik
  - Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland.
- Stephanie M Linker
  - Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland.
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland.
|
40
|
Beyerle ER, Guenza MG. Identifying the leading dynamics of ubiquitin: A comparison between the tICA and the LE4PD slow fluctuations in amino acids' position. J Chem Phys 2021; 155:244108. [PMID: 34972386 DOI: 10.1063/5.0059688] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022] Open
Abstract
Molecular Dynamics (MD) simulations of proteins implicitly contain the information connecting the atomistic molecular structure and proteins' biologically relevant motion, where large-scale fluctuations are deemed to guide folding and function. In the complex multiscale processes described by MD trajectories, it is difficult to identify, separate, and study those large-scale fluctuations. This problem can be formulated as the need to identify a small number of collective variables that guide the slow kinetic processes. The most promising method among the ones used to study the slow leading processes in proteins' dynamics is time-lagged independent component analysis (tICA), which identifies the dominant components in a noisy signal. Recently, we developed an anisotropic Langevin approach for the dynamics of proteins, called the anisotropic Langevin Equation for Protein Dynamics or LE4PD-XYZ. This approach partitions the protein's MD dynamics into mostly uncorrelated, wavelength-dependent, diffusive modes. It associates with each mode a free-energy map, where one measures the spatial extension and the time evolution of the mode-dependent, slow dynamical fluctuations. Here, we compare the tICA modes' predictions with the collective LE4PD-XYZ modes. We observe that the two methods consistently identify the nature and extension of the slowest fluctuation processes. The tICA separates the leading processes in a smaller number of slow modes than the LE4PD does. The LE4PD provides time-dependent information at short times and a formal connection to the physics of the kinetic processes that are missing in the pure statistical analysis of tICA.
Affiliation(s)
- E R Beyerle
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
- M G Guenza
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
|
41
|
Mardt A, Noé F. Progress in deep Markov state modeling: Coarse graining and experimental data restraints. J Chem Phys 2021; 155:214106. [PMID: 34879670 DOI: 10.1063/5.0064668] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022] Open
Abstract
Recent advances in deep learning frameworks have established valuable tools for analyzing the long-timescale behavior of complex systems, such as proteins. In particular, the inclusion of physical constraints, e.g., time-reversibility, was a crucial step to make the methods applicable to biophysical systems. Furthermore, we advance the method by incorporating experimental observables into the model estimation showing that biases in simulation data can be compensated for. We further develop a new neural network layer in order to build a hierarchical model allowing for different levels of details to be studied. Finally, we propose an attention mechanism, which highlights important residues for the classification into different states. We demonstrate the new methodology on an ultralong molecular dynamics simulation of the Villin headpiece miniprotein.
Affiliation(s)
- Andreas Mardt
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
|
42
|
Carvalho HF, Ferrario V, Pleiss J. Molecular Mechanism of Methanol Inhibition in CALB-Catalyzed Alcoholysis: Analyzing Molecular Dynamics Simulations by a Markov State Model. J Chem Theory Comput 2021; 17:6570-6582. [PMID: 34494846 DOI: 10.1021/acs.jctc.1c00559] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Lipases are widely used enzymes that catalyze hydrolysis and alcoholysis of fatty acid esters. At high concentrations of small alcohols such as methanol or ethanol, many lipases are inhibited by the substrate. The molecular basis of the inhibition of Candida antarctica lipase B (CALB) by methanol was investigated by unbiased molecular dynamics (MD) simulations, and the substrate binding kinetics was analyzed by Markov state models (MSMs). The modeled fluxes of productive methanol binding at concentrations between 50 mM and 5.5 M were in good agreement with the experimental activity profile of CALB, with a peak at 300 mM. The kinetic and structural analysis uncovered the molecular basis of CALB inhibition. Beyond 300 mM, the kinetic bottleneck results from crowding of methanol in the substrate access channel, which is caused by the gradual formation of methanol patches close to Leu140 (helix α5), Leu278, and Ile285 (helix α10) at a distance of 4-5 Å from the active site. Our findings demonstrate the usefulness of unbiased MD simulations to study enzyme-substrate interactions at realistic substrate concentrations and the feasibility of scale-bridging by an MSM analysis to derive kinetic information.
Affiliation(s)
- Henrique F Carvalho
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
- Valerio Ferrario
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
- Jürgen Pleiss
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Allmandring 31, 70569 Stuttgart, Germany
|
43
|
Meshkin H, Zhu F. Toward Convergence in Free Energy Calculations for Protein Conformational Changes: A Case Study on the Thin Gate of Mhp1 Transporter. J Chem Theory Comput 2021; 17:6583-6596. [PMID: 34523931 DOI: 10.1021/acs.jctc.1c00585] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/25/2023]
Abstract
It has been challenging to obtain reliable free energies for protein conformational changes from all-atom molecular dynamics simulations, despite the availability of many enhanced sampling techniques. To alleviate the difficulties associated with the enormous complexity of the conformational space, here we propose a few practical strategies for such calculations, including (1) a stringent method to examine convergence by comparing independent simulations starting from different initial coordinates, (2) adoption of multistep schemes in which the complete conformational change consists of multiple transition steps, each sampled using a distinct reaction coordinate, and (3) application of boundary restraints to simplify the conformational space. We demonstrate these strategies on the conformational changes between the outward-facing and outward-occluded states of the Mhp1 membrane transporter, obtaining the equilibrium thermodynamics of the relevant metastable states, the kinetic rates between these states, and the reactive trajectories that reveal the atomic details of spontaneous transitions. Our approaches thus promise convergent and reliable calculations to examine intuition-based hypotheses and to eventually elucidate the underlying molecular mechanisms of reversible conformational changes in complex protein systems.
Affiliation(s)
- Hamed Meshkin
  - Department of Physics, Indiana University Purdue University Indianapolis, 402 N. Blackford Street, Indianapolis, Indiana 46202, United States
- Fangqiang Zhu
  - Department of Physics, Indiana University Purdue University Indianapolis, 402 N. Blackford Street, Indianapolis, Indiana 46202, United States
44
|
Glielmo A, Husic BE, Rodriguez A, Clementi C, Noé F, Laio A. Unsupervised Learning Methods for Molecular Simulation Data. Chem Rev 2021; 121:9722-9758. [PMID: 33945269 PMCID: PMC8391792 DOI: 10.1021/acs.chemrev.0c01195] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 11/05/2020] [Indexed: 12/21/2022]
Abstract
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations, in materials science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms for dimensionality reduction, density estimation, clustering, and kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
Affiliation(s)
- Aldo Glielmo
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
- Brooke E. Husic
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Alex Rodriguez
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
- Cecilia Clementi
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Frank Noé
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Alessandro Laio
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
|
45
|
Li C, Liu Z, Goonetilleke EC, Huang X. Temperature-dependent kinetic pathways of heterogeneous ice nucleation competing between classical and non-classical nucleation. Nat Commun 2021; 12:4954. [PMID: 34400646 PMCID: PMC8367957 DOI: 10.1038/s41467-021-25267-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 01/17/2021] [Accepted: 07/26/2021] [Indexed: 12/04/2022] Open
Abstract
Ice nucleation on surfaces plays a vital role in diverse areas, ranging from physics and cryobiology to atmospheric science. Compared to ice nucleation in the bulk, the water-surface interactions present in heterogeneous ice nucleation complicate the nucleation process, making heterogeneous ice nucleation less well understood, especially the relationship between the kinetics and the structures of the critical ice nucleus. Here we combine Markov State Models and transition path theory to elucidate the ensemble pathways of heterogeneous ice nucleation. Our Markov State Models reveal that the classical one-step and non-classical two-step nucleation pathways can surprisingly co-exist with comparable fluxes at T = 230 K. Interestingly, we find that the disordered mixing of rhombic and hexagonal ice leads to a favorable configurational entropy that stabilizes the critical nucleus, facilitating the non-classical pathway. In contrast, the favorable energetics promotes the formation of hexagonal ice, resulting in the classical pathway. Furthermore, we discover that, at elevated temperatures, the nucleation process prefers to proceed via the classical pathway, as opposed to the non-classical pathway, since the potential energy contributions override the configurational entropy compensation. This study provides insights into the mechanisms of heterogeneous ice nucleation and sheds light on the rational designs to control crystallization processes.
Affiliation(s)
- Chu Li
  - Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Zhuo Liu
  - Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
  - Institute for Advanced Study, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Eshani C Goonetilleke
  - Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xuhui Huang
  - Department of Chemistry, Center of Systems Biology and Human Health, State Key Laboratory of Molecular Neuroscience, The Hong Kong University of Science and Technology, Kowloon, Hong Kong
|
46
|
Hempel T, Del Razo MJ, Lee CT, Taylor BC, Amaro RE, Noé F. Independent Markov decomposition: Toward modeling kinetics of biomolecular complexes. Proc Natl Acad Sci U S A 2021; 118:e2105230118. [PMID: 34321356 PMCID: PMC8346863 DOI: 10.1073/pnas.2105230118] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022] Open
Abstract
To advance the mission of in silico cell biology, modeling the interactions of large and complex biological systems becomes increasingly relevant. The combination of molecular dynamics (MD) simulations and Markov state models (MSMs) has enabled the construction of simplified models of molecular kinetics on long timescales. Despite its success, this approach is inherently limited by the size of the molecular system. With increasing size of macromolecular complexes, the number of independent or weakly coupled subsystems increases, and the number of global system states increases exponentially, making the sampling of all distinct global states unfeasible. In this work, we present a technique called independent Markov decomposition (IMD) that leverages weak coupling between subsystems to compute a global kinetic model without requiring the sampling of all combinatorial states of subsystems. We give a theoretical basis for IMD and propose an approach for finding and validating such a decomposition. Using empirical few-state MSMs of ion channel models that are well established in electrophysiology, we demonstrate that IMD models can reproduce experimental conductance measurements with a major reduction in sampling compared with a standard MSM approach. We further show how to find the optimal partition of all-atom protein simulations into weakly coupled subunits.
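The exponential growth in global states described in this abstract can be made concrete with a small sketch. It assumes the limiting case of two fully independent subsystems, for which the joint transition matrix is the Kronecker product of the subsystem matrices; this is an illustration of that idealized case only, not the IMD procedure of the paper, which also covers weak coupling and validation. The function name `kron_transition_matrix` and the example matrices are hypothetical.

```python
def kron_transition_matrix(T_a, T_b):
    """Joint transition matrix of two independent Markov chains.

    The global state (a, b) is mapped to the index a * n_b + b, and the
    joint probability factorizes: P[(a, b) -> (a2, b2)] = T_a[a][a2] * T_b[b][b2].
    """
    n_a, n_b = len(T_a), len(T_b)
    T = [[0.0] * (n_a * n_b) for _ in range(n_a * n_b)]
    for a in range(n_a):
        for b in range(n_b):
            for a2 in range(n_a):
                for b2 in range(n_b):
                    T[a * n_b + b][a2 * n_b + b2] = T_a[a][a2] * T_b[b][b2]
    return T


# Two hypothetical two-state subsystems, each sampled on its own.
T_a = [[0.9, 0.1], [0.2, 0.8]]
T_b = [[0.5, 0.5], [0.3, 0.7]]
T_global = kron_transition_matrix(T_a, T_b)
print(len(T_global))  # number of global states: 2 * 2
```

With N independent two-state subsystems the global model has 2^N states, while only N small matrices have to be estimated from data.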
Affiliation(s)
- Tim Hempel
  - Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
- Mauricio J Del Razo
  - Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
  - Van't Hoff Institute for Molecular Sciences, University of Amsterdam, 1090 GD Amsterdam, The Netherlands
  - Korteweg-de Vries Institute for Mathematics, University of Amsterdam, 1090 GE Amsterdam, The Netherlands
  - Dutch Institute for Emergent Phenomena, 1090 GL Amsterdam, The Netherlands
- Christopher T Lee
  - Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, CA 92093
- Bryn C Taylor
  - Biomedical Sciences Graduate Program, University of California San Diego, La Jolla, CA 92093
- Rommie E Amaro
  - Department of Chemistry & Biochemistry, University of California San Diego, La Jolla, CA 92093
- Frank Noé
  - Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
  - Department of Physics, Freie Universität Berlin, 14195 Berlin, Germany
  - Department of Chemistry, Rice University, Houston, TX 77005
|
47
|
Suárez E, Wiewiora RP, Wehmeyer C, Noé F, Chodera JD, Zuckerman DM. What Markov State Models Can and Cannot Do: Correlation versus Path-Based Observables in Protein-Folding Models. J Chem Theory Comput 2021; 17:3119-3133. [PMID: 33904312 PMCID: PMC8127341 DOI: 10.1021/acs.jctc.0c01154] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 11/28/2022]
Abstract
Markov state models (MSMs) have been widely applied to study the kinetics and pathways of protein conformational dynamics based on statistical analysis of molecular dynamics (MD) simulations. These MSMs coarse-grain both configuration space and time in ways that limit what kinds of observables they can reproduce with high fidelity over different spatial and temporal resolutions. Despite their popularity, there is still limited understanding of which biophysical observables can be computed from these MSMs in a robust and unbiased manner, and which suffer from the space-time coarse-graining intrinsic in the MSM model. Most theoretical arguments and practical validity tests for MSMs rely on long-time equilibrium kinetics, such as the slowest relaxation time scales and experimentally observable time-correlation functions. Here, we perform an extensive assessment of the ability of well-validated protein folding MSMs to accurately reproduce path-based observables such as mean first-passage times (MFPTs) and transition path mechanisms compared to a direct trajectory analysis. We also assess a recently proposed class of history-augmented MSMs (haMSMs) that exploit additional information not accounted for in standard MSMs. We conclude with some practical guidance on the use of MSMs to study various problems in conformational dynamics of biomolecules. In brief, MSMs can accurately reproduce correlation functions slower than the lag time, but path-based observables can only be reliably reproduced if the lifetimes of states exceed the lag time, which is a much stricter requirement. Even in the presence of short-lived states, we find that haMSMs reproduce path-based observables more reliably.
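The two quantities contrasted in this abstract, state lifetimes and mean first-passage times, can both be read off an MSM transition matrix. The sketch below is a minimal plain-Python illustration under the standard Markov-chain definitions; the function names `state_lifetimes` and `mfpt`, the fixed-point solver, and the example matrix are assumptions for illustration and are not taken from the paper.

```python
def state_lifetimes(T, lag_time):
    """Mean dwell time of each state: lag_time / (1 - T[i][i])."""
    return [
        lag_time / (1.0 - T[i][i]) if T[i][i] < 1.0 else float("inf")
        for i in range(len(T))
    ]


def mfpt(T, target, lag_time, tol=1e-12, max_iter=1_000_000):
    """Mean first-passage time from every state to `target`.

    Solves m_i = lag_time + sum_{j != target} T[i][j] * m_j by fixed-point
    iteration, with m_target = 0. Assumes the target is reachable from
    every state, otherwise the iteration does not converge.
    """
    n = len(T)
    m = [0.0] * n
    for _ in range(max_iter):
        new = [
            0.0 if i == target
            else lag_time + sum(T[i][j] * m[j] for j in range(n) if j != target)
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new, m)) < tol:
            return new
        m = new
    raise RuntimeError("MFPT iteration did not converge")


# Hypothetical two-state MSM estimated at a lag time of 1 (arbitrary units).
T = [[0.9, 0.1], [0.2, 0.8]]
lifetimes = state_lifetimes(T, lag_time=1.0)
times_to_state_1 = mfpt(T, target=1, lag_time=1.0)
print(lifetimes, times_to_state_1)
```

Comparing each entry of `lifetimes` with the lag time gives a quick check of the criterion stated in the abstract: states whose lifetime does not exceed the lag time are the ones for which path-based observables are least reliable.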
Affiliation(s)
- Ernesto Suárez
  - Advanced Biomedical Computational Science, Frederick National Laboratory for Cancer Research, Frederick, MD 21702
- Rafal P. Wiewiora
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- John D. Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- Daniel M. Zuckerman
  - Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR 97239
|
48
|
Abstract
The ability to make sense of the massive amounts of high-dimensional data generated from molecular dynamics simulations is heavily dependent on the knowledge of a low-dimensional manifold (parameterized by a reaction coordinate or RC) that typically distinguishes between relevant metastable states, and which captures the relevant slow dynamics of interest. Methods based on machine learning and artificial intelligence have been proposed over the years to deal with learning such low-dimensional manifolds, but they are often criticized for a disconnect from more traditional and physically interpretable approaches. To deal with such concerns, in this work we propose a deep learning based state predictive information bottleneck approach to learn the RC from high-dimensional molecular simulation trajectories. We demonstrate analytically and numerically how the RC learnt in this approach is connected to the committor in chemical physics and can be used to accurately identify transition states. A crucial hyperparameter in this approach is the time delay or how far into the future the algorithm should make predictions about. Through careful comparisons for benchmark systems, we demonstrate that this hyperparameter choice gives useful control over how coarse-grained we want the metastable state classification of the system to be. We thus believe that this work represents a step forward in systematic application of deep learning based ideas to molecular simulations.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA
|
49
|
Beyerle ER, Guenza MG. Comparison between slow anisotropic LE4PD fluctuations and the principal component analysis modes of ubiquitin. J Chem Phys 2021; 154:124111. [PMID: 33810675 DOI: 10.1063/5.0041211] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
The biological function and folding mechanisms of proteins are often guided by large-scale slow motions, which involve crossing high energy barriers. In a simulation trajectory, these slow fluctuations are commonly identified using a principal component analysis (PCA). Despite the popularity of this method, a complete analysis of its predictions based on the physics of protein motion has been so far limited. This study formally connects the PCA to a Langevin model of protein dynamics and analyzes the contributions of energy barriers and hydrodynamic interactions to the slow PCA modes of motion. To do so, we introduce an anisotropic extension of the Langevin equation for protein dynamics, called the LE4PD-XYZ, which formally connects to the PCA "essential dynamics." The LE4PD-XYZ is an accurate coarse-grained diffusive method to model protein motion, which describes anisotropic fluctuations in the alpha carbons of the protein. The LE4PD accounts for hydrodynamic effects and mode-dependent free-energy barriers. This study compares large-scale anisotropic fluctuations identified by the LE4PD-XYZ to the mode-dependent PCA predictions, starting from a microsecond-long alpha carbon molecular dynamics atomistic trajectory of the protein ubiquitin. We observe that the inclusion of free-energy barriers and hydrodynamic interactions has important effects on the identification and timescales of ubiquitin's slow modes.
Affiliation(s)
- E R Beyerle
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
- M G Guenza
  - Institute for Fundamental Science and Department of Chemistry and Biochemistry, University of Oregon, Eugene, Oregon 97403, USA
|
50
|
Kieninger S, Keller BG. Path probability ratios for Langevin dynamics-Exact and approximate. J Chem Phys 2021; 154:094102. [PMID: 33685138 DOI: 10.1063/5.0038408] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/11/2022] Open
Abstract
Path reweighting is a principally exact method to estimate dynamic properties from biased simulations, provided that the path probability ratio matches the stochastic integrator used in the simulation. Previously reported path probability ratios match the Euler-Maruyama scheme for overdamped Langevin dynamics. Since molecular dynamics simulations use Langevin dynamics rather than overdamped Langevin dynamics, this severely impedes the application of path reweighting methods. Here, we derive the path probability ratio $M_L$ for Langevin dynamics propagated by a variant of the Langevin Leapfrog integrator. This new path probability ratio allows for exact reweighting of Langevin dynamics propagated by this integrator. We also show that a previously derived approximate path probability ratio $M_{\mathrm{approx}}$ differs from the exact $M_L$ only by $\mathcal{O}(\xi^4 \Delta t^4)$ and thus yields highly accurate dynamic reweighting results. ($\Delta t$ is the integration time step, and $\xi$ is the collision rate.) The results are tested, and the efficiency of path reweighting is explored using butane as an example.
Affiliation(s)
- S Kieninger
  - Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Arnimallee 22, D-14195 Berlin, Germany
- B G Keller
  - Department of Biology, Chemistry, Pharmacy, Freie Universität Berlin, Arnimallee 22, D-14195 Berlin, Germany
|