1
|
Javed R, Kapakayala AB, Nair NN. Buckets Instead of Umbrellas for Enhanced Sampling and Free Energy Calculations. J Chem Theory Comput 2024; 20:8450-8460. [PMID: 39344058 DOI: 10.1021/acs.jctc.4c00776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/01/2024]
Abstract
Umbrella sampling has been a workhorse for free energy calculations in molecular simulations for several decades. In conventional umbrella sampling, restraining bias potentials are strategically applied along one or several collective variables. Major drawbacks associated with this method are the requirement of a large number of bias windows and the poor sampling of the transverse coordinates. In this work, we propose an alternate formalism that departs from the traditional umbrella sampling to mitigate these issues, where we replace umbrella-type restraining bias potentials with bucket-type wall potentials. This modification permits one to formulate an efficient computational strategy leveraging wall potentials and metadynamics sampling. This new method, called "bucket sampling", can significantly reduce the computational cost of obtaining converged high-dimensional free energy surfaces. Extensions of the proposed method with temperature acceleration and replica exchange solute tempering are also demonstrated.
Collapse
Affiliation(s)
- Ramsha Javed
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India
| | - Anji Babu Kapakayala
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India
| | - Nisanth N Nair
- Department of Chemistry, Indian Institute of Technology Kanpur, Kanpur 208016, India
| |
Collapse
|
2
|
Odstrcil RE, Dutta P, Liu J. Enhanced Sampling for Conformational Changes and Molecular Mechanisms of Human NTHL1. J Phys Chem Lett 2024; 15:3206-3213. [PMID: 38483510 PMCID: PMC11059236 DOI: 10.1021/acs.jpclett.4c00161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 03/22/2024]
Abstract
The functionalities of proteins rely on protein conformational changes during many processes. Identification of the protein conformations and capturing transitions among different conformations are important but extremely challenging in both experiments and simulations. In this work, we develop a machine learning based approach to identify a reaction coordinate that accelerates the exploration of protein conformational changes in molecular simulations. We implement our approach to study the conformational changes of human NTHL1 during DNA repair. Our results identified three distinct conformations: open (stable), closed (unstable), and bundle (stable). The existence of the bundle conformation can rationalize recent experimental observations. Comparison with an NTHL1 mutant demonstrates that a closely packed cluster of positively charged residues in the linker could be a factor to search when screening for genetic abnormalities. Results will lead to a better modulation of the DNA repair pathway to protect against carcinogenesis.
Collapse
Affiliation(s)
- Ryan E. Odstrcil
- School of Mechanical and Materials Engineering, Washington State University, Pullman, Washington 99164, USA
| | - Prashanta Dutta
- School of Mechanical and Materials Engineering, Washington State University, Pullman, Washington 99164, USA
| | - Jin Liu
- School of Mechanical and Materials Engineering, Washington State University, Pullman, Washington 99164, USA
| |
Collapse
|
3
|
Ji X, Wang R, Wang H, Liu W. On committor functions in milestoning. J Chem Phys 2023; 159:244115. [PMID: 38153148 DOI: 10.1063/5.0180513] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2023] [Accepted: 12/07/2023] [Indexed: 12/29/2023] Open
Abstract
As an optimal one-dimensional reaction coordinate, the committor function not only describes the probability of a trajectory initiated at a phase space point first reaching the product state before reaching the reactant state but also preserves the kinetics when utilized to run a reduced dynamics model. However, calculating the committor function in high-dimensional systems poses significant challenges. In this paper, within the framework of milestoning, exact expressions for committor functions at two levels of coarse graining are given, including committor functions of phase space point to point (CFPP) and milestone to milestone (CFMM). When combined with transition kernels obtained from trajectory analysis, these expressions can be utilized to accurately and efficiently compute the committor functions. Furthermore, based on the calculated committor functions, an adaptive algorithm is developed to gradually refine the transition state region. Finally, two model examples are employed to assess the accuracy of these different formulations of committor functions.
Collapse
Affiliation(s)
- Xiaojun Ji
- Research Center for Mathematics and Interdisciplinary Sciences, Shandong University, Qingdao, Shandong 266237, People's Republic of China
- Frontiers Science Center for Nonlinear Expectations (Ministry of Education), Shandong University, Qingdao, Shandong 266237, People's Republic of China
| | - Ru Wang
- Qingdao Institute for Theoretical and Computational Sciences, Institute of Frontier and Interdisciplinary Science, Shandong University, Qingdao, Shandong 266237, People's Republic of China
| | - Hao Wang
- Qingdao Institute for Theoretical and Computational Sciences, Institute of Frontier and Interdisciplinary Science, Shandong University, Qingdao, Shandong 266237, People's Republic of China
| | - Wenjian Liu
- Qingdao Institute for Theoretical and Computational Sciences, Institute of Frontier and Interdisciplinary Science, Shandong University, Qingdao, Shandong 266237, People's Republic of China
| |
Collapse
|
4
|
Mouaffac L, Palacio-Rodriguez K, Pietrucci F. Optimal Reaction Coordinates and Kinetic Rates from the Projected Dynamics of Transition Paths. J Chem Theory Comput 2023; 19:5701-5711. [PMID: 37550088 DOI: 10.1021/acs.jctc.3c00158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 08/09/2023]
Abstract
Finding optimal reaction coordinates and predicting accurate kinetic rates for activated processes are two of the foremost challenges of molecular simulations. We introduce an algorithm that tackles the two problems at once: starting from a limited number of reactive molecular dynamics trajectories (transition paths), we automatically generate with a Monte Carlo approach a sequence of different reaction coordinates that progressively reduce the kinetic rate of their projected effective dynamics. Based on a variational principle, the minimal rate accurately approximates the exact one, and it corresponds to the optimal reaction coordinate. After benchmarking the method on an analytic double-well system, we apply it to complex atomistic systems: the interaction of carbon nanoparticles of different sizes in water.
Collapse
Affiliation(s)
- Line Mouaffac
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| | - Karen Palacio-Rodriguez
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| | - Fabio Pietrucci
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| |
Collapse
|
5
|
Magrino T, Huet L, Saitta AM, Pietrucci F. Critical Assessment of Data-Driven versus Heuristic Reaction Coordinates in Solution Chemistry. J Phys Chem A 2022; 126:8887-8900. [DOI: 10.1021/acs.jpca.2c07640] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022]
Affiliation(s)
- Théo Magrino
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, Muséum National d’Histoire Naturelle, CNRS UMR 7590, Paris 75005, France
| | - Léon Huet
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, Muséum National d’Histoire Naturelle, CNRS UMR 7590, Paris 75005, France
| | - A. Marco Saitta
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, Muséum National d’Histoire Naturelle, CNRS UMR 7590, Paris 75005, France
| | - Fabio Pietrucci
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, Muséum National d’Histoire Naturelle, CNRS UMR 7590, Paris 75005, France
| |
Collapse
|
6
|
Krivov SV. Additive eigenvectors as optimal reaction coordinates, conditioned trajectories, and time-reversible description of stochastic processes. J Chem Phys 2022; 157:014108. [DOI: 10.1063/5.0088061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
A fundamental way to analyze complex multidimensional stochastic dynamics is to describe it as diffusion on a free energy landscape—free energy as a function of reaction coordinates (RCs). For such a description to be quantitatively accurate, the RC should be chosen in an optimal way. The committor function is a primary example of an optimal RC for the description of equilibrium reaction dynamics between two states. Here, additive eigenvectors (addevs) are considered as optimal RCs to address the limitations of the committor. An addev master equation for a Markov chain is derived. A stationary solution of the equation describes a sub-ensemble of trajectories conditioned on having the same optimal RC for the forward and time-reversed dynamics in the sub-ensemble. A collection of such sub-ensembles of trajectories, called stochastic eigenmodes, can be used to describe/approximate the stochastic dynamics. A non-stationary solution describes the evolution of the probability distribution. However, in contrast to the standard master equation, it provides a time-reversible description of stochastic dynamics. It can be integrated forward and backward in time. The developed framework is illustrated on two model systems—unidirectional random walk and diffusion.
Collapse
Affiliation(s)
- Sergei V. Krivov
- University of Leeds, Astbury Center for Structural Molecular Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, United Kingdom
| |
Collapse
|
7
|
Domino effect in allosteric signaling of peptide binding. J Mol Biol 2022; 434:167661. [DOI: 10.1016/j.jmb.2022.167661] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Revised: 05/24/2022] [Accepted: 05/24/2022] [Indexed: 11/22/2022]
|
8
|
Louwerse MD, Sivak DA. Information Thermodynamics of the Transition-Path Ensemble. PHYSICAL REVIEW LETTERS 2022; 128:170602. [PMID: 35570424 DOI: 10.1103/physrevlett.128.170602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/21/2021] [Accepted: 03/10/2022] [Indexed: 06/15/2023]
Abstract
The reaction coordinate describing a transition between reactant and product is a fundamental concept in the theory of chemical reactions. Within transition-path theory, a quantitative definition of the reaction coordinate is found in the committor, which is the probability that a trajectory initiated from a given microstate first reaches the product before the reactant. Here we develop an information-theoretic origin for the committor and show how selecting transition paths from a long ergodic equilibrium trajectory induces entropy production which exactly equals the information that system dynamics provide about the reactivity of trajectories. This equality of entropy production and dynamical information generation also holds at the level of arbitrary individual coordinates, providing parallel measures of the coordinate's relevance to the reaction, each of which is maximized by the committor.
Collapse
Affiliation(s)
- Miranda D Louwerse
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia V5A1S6, Canada
| | - David A Sivak
- Department of Physics, Simon Fraser University, Burnaby, British Columbia V5A1S6, Canada
| |
Collapse
|
9
|
Abstract
The kinetics of a dynamical system dominated by two metastable states is examined from the perspective of the activated-dynamics reactive flux formalism, Markov state eigenvalue spectral decomposition, and committor-based transition path theory. Analysis shows that the different theoretical formulations are consistent, clarifying the significance of the inherent microscopic lag-times that are implicated, and that the most meaningful one-dimensional reaction coordinate in the region of the transition state is along the gradient of the committor in the multidimensional subspace of collective variables. It is shown that the familiar reactive flux activated dynamics formalism provides an effective route to calculate the transition rate in the case of a narrow sharp barrier but much less so in the case of a broad flat barrier. In this case, the standard reactive flux correlation function decays very slowly to the plateau value that corresponds to the transmission coefficient. Treating the committor function as a reaction coordinate does not alleviate all issues caused by the slow relaxation of the reactive flux correlation function. A more efficient activated dynamics simulation algorithm may be achieved from a modified reactive flux weighted by the committor. Simulation results on simple systems are used to illustrate the various conceptual points.
Collapse
Affiliation(s)
- Benoît Roux
- Department of Biochemistry and Molecular Biology, Department of Chemistry, The University of Chicago, 5735 S Ellis Ave., Chicago, Illinois 60637, USA
| |
Collapse
|
10
|
Kikutsuji T, Mori Y, Okazaki KI, Mori T, Kim K, Matubayasi N. Explaining reaction coordinates of alanine dipeptide isomerization obtained from deep neural networks using Explainable Artificial Intelligence (XAI). J Chem Phys 2022; 156:154108. [DOI: 10.1063/5.0087310] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
A method for obtaining appropriate reaction coordinates is required to identify transition states distinguishing product and reactant in complex molecular systems. Recently, abundant research has been devoted to obtaining reaction coordinates using artificial neural networks from deep learning literature, where many collective variables are typically utilized in the input layer. However, it is difficult to explain the details of which collective variables contribute to the predicted reaction coordinates owing to the complexity of the nonlinear functions in deep neural networks. To overcome this limitation, we used Explainable Artificial Intelligence (XAI) methods of the Local Interpretable Model-agnostic Explanation (LIME) and the game theory-based framework known as Shapley Additive exPlanations (SHAP). We demonstrated that XAI enables us to obtain the degree of contribution of each collective variable to reaction coordinates that is determined by nonlinear regressions with deep learning for the committor of the alanine dipeptide isomerization in vacuum. In particular, both LIME and SHAP provide important features to the predicted reaction coordinates, which are characterized by appropriate dihedral angles consistent with those previously reported from the committor test analysis. The present study offers an AI-aided framework to explain the appropriate reaction coordinates, which acquires considerable significance when the number of degrees of freedom increases.
Collapse
Affiliation(s)
| | | | - Kei-ichi Okazaki
- Department of Theoretical and Computational Molecular Science, Institute for Molecular Science, Japan
| | - Toshifumi Mori
- Kyushu University Institute for Materials Chemistry and Engineering, Japan
| | - Kang Kim
- Graduate School of Engineering Science, Osaka University - Toyonaka Campus, Japan
| | - Nobuyuki Matubayasi
- Division of Chemical Engineering, Graduate School of Engineering Science, Osaka University, Japan
| |
Collapse
|
11
|
Paul TK, Taraphder S. Nonlinear Reaction Coordinate of an Enzyme Catalyzed Proton Transfer Reaction. J Phys Chem B 2022; 126:1413-1425. [PMID: 35138854 DOI: 10.1021/acs.jpcb.1c08760] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Abstract
We present an in-depth study on the theoretical calculation of an optimum reaction coordinate as a linear or nonlinear combination of important collective variables (CVs) sampled from an ensemble of reactive transition paths for an intramolecular proton transfer reaction catalyzed by the enzyme human carbonic anhydrase (HCA) II. The linear models are optimized by likelihood maximization for a given number of CVs. The nonlinear models are based on an artificial neural network with the same number of CVs and optimized by minimizing the root-mean-square error in comparison to a training set of committor estimators generated for the given transition. The nonlinear reaction coordinate thus obtained yields the free energy of activation and rate constant as 9.46 kcal mol-1 and 1.25 × 106 s-1, respectively. These estimates are found to be in quantitative agreement with the known experimental results. We have also used an extended autoencoder model to show that a similar analysis can be carried out using a single CV only. The resultant free energies and kinetics of the reaction slightly overestimate the experimental data. The implications of these results are discussed using a detailed microkinetic scheme of the proton transfer reaction catalyzed by HCA II.
Collapse
Affiliation(s)
- Tanmoy Kumar Paul
- Department of Chemistry, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| | - Srabani Taraphder
- Department of Chemistry, Indian Institute of Technology Kharagpur, Kharagpur 721302, India
| |
Collapse
|
12
|
Abstract
We extend the nonparametric framework of reaction coordinate optimization to nonequilibrium ensembles of (short) trajectories. For example, we show how, starting from such an ensemble, one can obtain an equilibrium free-energy profile along the committor, which can be used to determine important properties of the dynamics exactly. A new adaptive sampling approach, the transition-state ensemble enrichment, is suggested, which samples the configuration space by "growing" committor segments toward each other starting from the boundary states. This framework is suggested as a general tool, alternative to the Markov state models, for a rigorous and accurate analysis of simulations of large biomolecular systems, as it has the following attractive properties. It is immune to the curse of dimensionality, does not require system-specific information, can approximate arbitrary reaction coordinates with high accuracy, and has sensitive and rigorous criteria to test optimality and convergence. The approaches are illustrated on a 50-dimensional model system and a realistic protein folding trajectory.
Collapse
Affiliation(s)
- Sergei V Krivov
- Astbury Center for Structural Molecular Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, U.K
| |
Collapse
|
13
|
Quoika PK, Fernández-Quintero ML, Podewitz M, Hofer F, Liedl KR. Implementation of the Freely Jointed Chain Model to Assess Kinetics and Thermodynamics of Thermosensitive Coil-Globule Transition by Markov States. J Phys Chem B 2021; 125:4898-4909. [PMID: 33942614 PMCID: PMC8154620 DOI: 10.1021/acs.jpcb.1c01946] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]
Abstract
![]()
We revived and implemented
a method developed by Kuhn in 1934,
originally only published in German, that is, the so-called “freely
jointed chain” model. This approach turned out to be surprisingly
useful for analyzing state-of-the-art computer simulations of the
thermosensitive coil–globule transition of N-Isopropylacrylamide 20-mer. Our atomistic computer simulations are
orders of magnitude longer than those of previous studies and lead
to a reliable description of thermodynamics and kinetics at many different
temperatures. The freely jointed chain model provides a coordinate
system, which allows us to construct a Markov state model of the conformational
transitions. Furthermore, this guarantees a reliable reconstruction
of the kinetics in back-and-forth directions. In addition, we obtain
a description of the high diversity and variability of both conformational
states. Thus, we gain a detailed understanding of the coil–globule
transition. Surprisingly, conformational entropy turns out to play
only a minor role in the thermodynamic balance of the process. Moreover,
we show that the radius of gyration is an unexpectedly unsuitable
coordinate to comprehend the transition kinetics because it does not
capture the high conformational diversity within the different states.
Consequently, the approach presented here allows for an exhaustive
description and resolution of the conformational ensembles of arbitrary
linear polymer chains.
Collapse
Affiliation(s)
- Patrick K Quoika
- Institute of General, Inorganic and Theoretical Chemistry, and Centre of Molecular Biosciences University of Innsbruck, A-6020 Innsbruck, Austria
| | - Monica L Fernández-Quintero
- Institute of General, Inorganic and Theoretical Chemistry, and Centre of Molecular Biosciences University of Innsbruck, A-6020 Innsbruck, Austria
| | - Maren Podewitz
- Institute of General, Inorganic and Theoretical Chemistry, and Centre of Molecular Biosciences University of Innsbruck, A-6020 Innsbruck, Austria
| | - Florian Hofer
- Institute of General, Inorganic and Theoretical Chemistry, and Centre of Molecular Biosciences University of Innsbruck, A-6020 Innsbruck, Austria
| | - Klaus R Liedl
- Institute of General, Inorganic and Theoretical Chemistry, and Centre of Molecular Biosciences University of Innsbruck, A-6020 Innsbruck, Austria
| |
Collapse
|
14
|
Abstract
We describe a nonparametric approach for accurate determination of the slowest relaxation eigenvectors of molecular dynamics. The approach is blind as it uses no system specific information. In particular, it does not require a functional form with many parameters to closely approximate eigenvectors, e.g., linear combinations of molecular descriptors or a deep neural network, and thus no extensive expertise with the system. We suggest a rigorous and sensitive validation/optimality criterion for an eigenvector. The criterion uses only eigenvector time series and can be used to validate eigenvectors computed by other approaches. The power of the approach is illustrated on long atomistic protein folding trajectories. The determined eigenvectors pass the validation test at a time scale of 0.2 ns, much shorter than alternative approaches.
Collapse
Affiliation(s)
- Sergei V Krivov
- Astbury Center for Structural Molecular Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, United Kingdom
| |
Collapse
|
15
|
Bubnis G, Grubmüller H. Sequential Water and Headgroup Merger: Membrane Poration Paths and Energetics from MD Simulations. Biophys J 2020; 119:2418-2430. [PMID: 33189685 PMCID: PMC7822740 DOI: 10.1016/j.bpj.2020.10.037] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2020] [Revised: 10/07/2020] [Accepted: 10/23/2020] [Indexed: 01/06/2023] Open
Abstract
Membrane topology changes such as poration, stalk formation, and hemifusion rupture are essential to cellular function, but their molecular details, energetics, and kinetics are still not fully understood. Here, we present a unified energetic and mechanistic picture of metastable pore defects in tensionless lipid membranes. We used an exhaustive committor analysis to test and select optimal reaction coordinates and also to determine the nucleation mechanism. These reaction coordinates were used to calculate free-energy landscapes that capture the full process and end states. The identified barriers agree with the committor analysis. To enable sufficient sampling of the complete transition path for our molecular dynamics simulations, we developed a “gizmo” potential biasing scheme. The simulations suggest that the essential step in the nucleation is the initial merger of lipid headgroups at the nascent pore center. To facilitate this event, an indentation pathway is energetically preferred to a hydrophobic defect. Continuous water columns that span the indentation were determined to be on-path transients that precede the nucleation barrier. This study gives a quantitative description of the nucleation mechanism and energetics of small metastable pores and illustrates a systematic approach to uncover the mechanisms of diverse cellular membrane remodeling processes.
Collapse
Affiliation(s)
- Greg Bubnis
- Department of Theoretical and Computational Biophysics, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany; Weill Institute for Neurosciences and Department of Neurology, University of California San Francisco, San Francisco, California.
| | - Helmut Grubmüller
- Department of Theoretical and Computational Biophysics, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany.
| |
Collapse
|
16
|
Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T. Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems. J Chem Theory Comput 2020; 16:4757-4775. [PMID: 32559068 PMCID: PMC8312194 DOI: 10.1021/acs.jctc.0c00355] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Machine learning encompasses tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Collapse
Affiliation(s)
- Paraskevi Gkeka
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
| | - Gabriel Stoltz
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
| | | | - Zineb Belkacemi
- Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
| | - Michele Ceriotti
- Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
| | - John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Aaron R Dinner
- Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
| | - Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
| | | | - Hervé Minoux
- Integrated Drug Discovery, Sanofi R&D, 94403 Vitry-sur-Seine, France
| | | | - Fabio Pietrucci
- UMR CNRS 7590, MNHN, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, 75005 Paris, France
| | - Ana Silveira
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Zofia Trstanova
- School of Mathematics, The University of Edinburgh, Edinburgh EH9 3FD, U.K
| | - Rafal Wiewiora
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Tony Lelièvre
- CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Matherials Project-Team, Inria Paris, 75012 Paris, France
| |
Collapse
|
17
|
Mori Y, Okazaki KI, Mori T, Kim K, Matubayasi N. Learning reaction coordinates via cross-entropy minimization: Application to alanine dipeptide. J Chem Phys 2020; 153:054115. [DOI: 10.1063/5.0009066] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022] Open
Affiliation(s)
- Yusuke Mori
- Division of Chemical Engineering, Department of Materials Engineering Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
| | | | - Toshifumi Mori
- Institute for Molecular Science, Okazaki, Aichi 444-8585, Japan
- The Graduate University for Advanced Studies, Okazaki, Aichi 444-8585, Japan
| | - Kang Kim
- Division of Chemical Engineering, Department of Materials Engineering Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
- Institute for Molecular Science, Okazaki, Aichi 444-8585, Japan
| | - Nobuyuki Matubayasi
- Division of Chemical Engineering, Department of Materials Engineering Science, Graduate School of Engineering Science, Osaka University, Toyonaka, Osaka 560-8531, Japan
| |
Collapse
|
18
|
Sigg D, Voelz VA, Carnevale V. Microcanonical coarse-graining of the kinetic Ising model. J Chem Phys 2020; 152:084104. [PMID: 32113343 PMCID: PMC7042020 DOI: 10.1063/1.5139228] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2019] [Accepted: 02/06/2020] [Indexed: 11/15/2022] Open
Abstract
We propose a scheme for coarse-graining the dynamics of the 2-D kinetic Ising model onto the microcanonical ensemble. At subcritical temperatures, 2-D and higher-dimensional Ising lattices possess two basins of attraction separated by a free energy barrier. Projecting onto the microcanonical ensemble has the advantage that the dependence of the crossing rate constant on environmental conditions can be obtained from a single Monte Carlo trajectory. Using various numerical methods, we computed the forward rate constants of coarse-grained representations of the Ising model and compared them with the true value obtained from brute force simulation. While coarse-graining preserves detailed balance, the computed rate constants for barrier heights between 5 kT and 9 kT were consistently 50% larger than the true value. Markovianity testing revealed loss of dynamical memory, which we propose accounts for coarse-graining error. Committor analysis did not support the alternative hypothesis that microcanonical projection is incompatible with an optimal reaction coordinate. The correct crossing rate constant was obtained by spectrally decomposing the diffusion coefficient near the free energy barrier and selecting the slowest (reactive) component. The spectral method also yielded the correct rate constant in the 3-D Ising lattice, where coarse-graining error was 6% and memory effects were diminished. We conclude that microcanonical coarse-graining supplemented by spectral analysis of short-term barrier fluctuations provides a comprehensive kinetic description of barrier crossing in a non-inertial continuous-time jump process.
Collapse
Affiliation(s)
| | - Vincent A. Voelz
- Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, USA
| | - Vincenzo Carnevale
- Institute for Computational Molecular Science, College of Science and Technology, Temple University, Philadelphia, Pennsylvania 19122, USA
| |
Collapse
|
19
|
Chiuchiù D, Ferrare J, Pigolotti S. Assembly of heteropolymers via a network of reaction coordinates. Phys Rev E 2019; 100:062502. [PMID: 31962425 DOI: 10.1103/physreve.100.062502] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/16/2019] [Indexed: 06/10/2023]
Abstract
In biochemistry, heteropolymers encoding biological information are assembled out of equilibrium by sequentially incorporating available monomers found in the environment. Current models of polymerization treat monomer incorporation as a sequence of discrete chemical reactions between intermediate metastable states. In this paper, we use ideas from reaction rate theory and describe nonequilibrium assembly of a heteropolymer via a continuous reaction coordinate. Our approach allows for estimating the copy error and incorporation speed from the Gibbs free energy landscape of the process. We apply our theory to several examples from a simple reaction characterized by a free energy barrier to more complex cases incorporating error correction mechanisms, such as kinetic proofreading.
Collapse
Affiliation(s)
- Davide Chiuchiù
- Biological Complexity Unit, Okinawa Institute for Science and Technology, 1919-1 Tancha, Onna, Kunigami-gun, Okinawa 904-0412, Japan
| | - James Ferrare
- Biological Complexity Unit, Okinawa Institute for Science and Technology, 1919-1 Tancha, Onna, Kunigami-gun, Okinawa 904-0412, Japan
- Tulane University, 6823 St. Charles Avenue, New Orleans, Lousiana 70118, USA
| | - Simone Pigolotti
- Biological Complexity Unit, Okinawa Institute for Science and Technology, 1919-1 Tancha, Onna, Kunigami-gun, Okinawa 904-0412, Japan
| |
Collapse
|
20
|
Bacci M, Caflisch A, Vitalis A. On the removal of initial state bias from simulation data. J Chem Phys 2019; 150:104105. [PMID: 30876362 DOI: 10.1063/1.5063556] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open
Abstract
Classical atomistic simulations of biomolecules play an increasingly important role in molecular life science. The structure of current computing architectures favors methods that run multiple trajectories at once without requiring extensive communication between them. Many advanced sampling strategies in the field fit this mold. These approaches often rely on an adaptive logic and create ensembles of comparatively short trajectories whose starting points are not distributed according to the correct Boltzmann weights. This type of bias is notoriously difficult to remove, and Markov state models (MSMs) are one of the few strategies available for recovering the correct kinetics and thermodynamics from these ensembles of trajectories. In this contribution, we analyze the performance of MSMs in the thermodynamic reweighting task for a hierarchical set of systems. We show that MSMs can be rigorous tools to recover the correct equilibrium distribution for systems of sufficiently low dimensionality. This is conditional upon not tampering with local flux imbalances found in the data. For a real-world application, we find that a pure likelihood-based inference of the transition matrix produces the best results. The removal of the bias is incomplete, however, and for this system, all tested MSMs are outperformed by an alternative albeit less general approach rooted in the ideas of statistical resampling. We conclude by formulating some recommendations for how to address the reweighting issue in practice.
Collapse
Affiliation(s)
- Marco Bacci
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
| | - Amedeo Caflisch
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
| | - Andreas Vitalis
- University of Zurich, Department of Biochemistry, Winterthurerstrasse 190, CH-8057 Zurich, Switzerland
| |
Collapse
|
21
|
Andryushchenko VA, Chekmarev SF. Modeling of Multicolor Single-Molecule Förster Resonance Energy-Transfer Experiments on Protein Folding. J Phys Chem B 2018; 122:10678-10685. [PMID: 30383961 DOI: 10.1021/acs.jpcb.8b07737] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
Abstract
Using a coarse-grained, Cα-model of BBL protein, a multicolor single-molecule Förster resonance energy transfer (FRET) experiment is modeled. Three fluorophores are introduced, which, for simplicity, are associated with Cα beads. Two fluorophores are placed at the ends of protein chain and the third one at the middle of the chain. The free-energy surfaces (FESs) depending on the interfluorophore distances and on the FRET efficiencies corresponding to these distances have been constructed and compared with the FESs depending on the conventional collective variables, such as the fraction of native contacts and radius of gyration. It has been found that multicolor experiments can successfully resolve all essential BBL states that are revealed by the conventional FESs. The resolution of these states with the FRET-efficiency histogram is found to be successful if the energy transfer is measured between the fluorophores at the BBL ends. We also show that, although the present model construct of BBL is very simple, it captures some characteristic features of the single-molecule FRET experiments, such as the pattern of the FRET-efficiency histograms and their evolution with the denaturant concentration.
Collapse
Affiliation(s)
- Vladimir A Andryushchenko
- Institute of Thermophysics , SB RAS , 630090 Novosibirsk , Russia.,Department of Physics , Novosibirsk State University , 630090 Novosibirsk , Russia
| | - Sergei F Chekmarev
- Institute of Thermophysics , SB RAS , 630090 Novosibirsk , Russia.,Department of Physics , Novosibirsk State University , 630090 Novosibirsk , Russia
| |
Collapse
|
22
|
Lindahl V, Lidmar J, Hess B. Riemann metric approach to optimal sampling of multidimensional free-energy landscapes. Phys Rev E 2018; 98:023312. [PMID: 30253489 DOI: 10.1103/physreve.98.023312] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/06/2018] [Indexed: 06/08/2023]
Abstract
Exploring the free-energy landscape along reaction coordinates or system parameters λ is central to many studies of high-dimensional model systems in physics, e.g., large molecules or spin glasses. In simulations this usually requires sampling conformational transitions or phase transitions, but efficient sampling is often difficult to attain due to the roughness of the energy landscape. For Boltzmann distributions, crossing rates decrease exponentially with free-energy barrier heights. Thus, exponential acceleration can be achieved in simulations by applying an artificial bias along λ tuned such that a flat target distribution is obtained. A flat distribution is, however, an ambiguous concept unless a proper metric is used and is generally suboptimal. Here we propose a multidimensional Riemann metric, which takes the local diffusion into account, and redefine uniform sampling such that it is invariant under nonlinear coordinate transformations. We use the metric in combination with the accelerated weight histogram method, a free-energy calculation and sampling method, to adaptively optimize sampling toward the target distribution prescribed by the metric. We demonstrate that for complex problems, such as molecular dynamics simulations of DNA base-pair opening, sampling uniformly according to the metric, which can be calculated without significant computational overhead, improves sampling efficiency by 50%-70%.
Collapse
Affiliation(s)
- Viveca Lindahl
- Department of Physics and Swedish e-Science Research Center, KTH Royal Institute of Technology, 10691 Stockholm, Sweden
| | - Jack Lidmar
- Department of Physics and Swedish e-Science Research Center, KTH Royal Institute of Technology, 10691 Stockholm, Sweden
| | - Berk Hess
- Department of Physics and Swedish e-Science Research Center, KTH Royal Institute of Technology, 10691 Stockholm, Sweden
| |
Collapse
|
23
|
Krivov SV. Protein Folding Free Energy Landscape along the Committor - the Optimal Folding Coordinate. J Chem Theory Comput 2018; 14:3418-3427. [PMID: 29791148 DOI: 10.1021/acs.jctc.8b00101] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Recent advances in simulation and experiment have led to dramatic increases in the quantity and complexity of produced data, which makes the development of automated analysis tools very important. A powerful approach to analyze dynamics contained in such data sets is to describe/approximate it by diffusion on a free energy landscape - free energy as a function of reaction coordinates (RC). For the description to be quantitatively accurate, RCs should be chosen in an optimal way. Recent theoretical results show that such an optimal RC exists; however, determining it for practical systems is a very difficult unsolved problem. Here we describe a solution to this problem. We describe an adaptive nonparametric approach to accurately determine the optimal RC (the committor) for an equilibrium trajectory of a realistic system. In contrast to alternative approaches, which require a functional form with many parameters to approximate an RC and thus extensive expertise with the system, the suggested approach is nonparametric and can approximate any RC with high accuracy without system specific information. To avoid overfitting for a realistically sampled system, the approach performs RC optimization in an adaptive manner by focusing optimization on less optimized spatiotemporal regions of the RC. The power of the approach is illustrated on a long equilibrium atomistic folding simulation of HP35 protein. We have determined the optimal folding RC - the committor, which was confirmed by passing a stringent committor validation test. It allowed us to determine a first quantitatively accurate protein folding free energy landscape. We have confirmed the recent theoretical results that diffusion on such a free energy profile can be used to compute exactly the equilibrium flux, the mean first passage times, and the mean transition path times between any two points on the profile. We have shown that the mean squared displacement along the optimal RC grows linear with time as for simple diffusion. The free energy profile allowed us to obtain a direct rigorous estimate of the pre-exponential factor for the folding dynamics.
Collapse
Affiliation(s)
- Sergei V Krivov
- Astbury Center for Structural Molecular Biology , University of Leeds , Leeds LS2 9JT , United Kingdom
| |
Collapse
|
24
|
Klein M, Krivov SV, Ferrer AJ, Luo L, Samuel AD, Karplus M. Exploratory search during directed navigation in C. elegans and Drosophila larva. eLife 2017; 6. [PMID: 29083306 PMCID: PMC5662291 DOI: 10.7554/elife.30503] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/20/2017] [Accepted: 10/11/2017] [Indexed: 11/23/2022] Open
Abstract
Many organisms—from bacteria to nematodes to insect larvae—navigate their environments by biasing random movements. In these organisms, navigation in isotropic environments can be characterized as an essentially diffusive and undirected process. In stimulus gradients, movement decisions are biased to drive directed navigation toward favorable environments. How does directed navigation in a gradient modulate random exploration either parallel or orthogonal to the gradient? Here, we introduce methods originally used for analyzing protein folding trajectories to study the trajectories of the nematode Caenorhabditis elegans and the Drosophila larva in isotropic environments, as well as in thermal and chemical gradients. We find that the statistics of random exploration in any direction are little affected by directed movement along a stimulus gradient. A key constraint on the behavioral strategies of these organisms appears to be the preservation of their capacity to continuously explore their environments in all directions even while moving toward favorable conditions.
Collapse
Affiliation(s)
- Mason Klein
- Department of Physics, University of Miami, Coral Gables, United States
| | - Sergei V Krivov
- Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, United Kingdom
| | - Anggie J Ferrer
- Department of Physics, University of Miami, Coral Gables, United States
| | - Linjiao Luo
- Key Laboratory of Modern Acoustics, Ministry of Education, Department of Physics, Nanjing University, Nanjing, China
| | - Aravinthan Dt Samuel
- Center for Brain Science, Department of Physics, Harvard University, Cambridge, United States
| | - Martin Karplus
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, United States.,Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, Strasbourg, France
| |
Collapse
|
25
|
Noé F, Clementi C. Collective variables for the study of long-time kinetics from molecular trajectories: theory and methods. Curr Opin Struct Biol 2017; 43:141-147. [PMID: 28327454 DOI: 10.1016/j.sbi.2017.02.006] [Citation(s) in RCA: 100] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2016] [Accepted: 02/20/2017] [Indexed: 12/23/2022]
Abstract
Collective variables are an important concept to study high-dimensional dynamical systems, such as molecular dynamics of macromolecules, liquids, or polymers, in particular to define relevant metastable states and state-transition or phase-transition. Over the past decade, a rigorous mathematical theory has been formulated to define optimal collective variables to characterize slow dynamical processes. Here we review recent developments, including a variational principle to find optimal approximations to slow collective variables from simulation data, and algorithms such as the time-lagged independent component analysis. Using these concepts, a distance metric can be defined that quantifies how slowly molecular conformations interconvert. Extensions and open questions are discussed.
Collapse
Affiliation(s)
- Frank Noé
- Department of Mathematics and Computer Science, FU Berlin, Arnimallee 6, 14195 Berlin, Germany.
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, and Department of Chemistry, Rice University, 6100 Main Street, Houston, TX 77005, United States.
| |
Collapse
|
26
|
Meloni R, Camilloni C, Tiana G. Properties of low-dimensional collective variables in the molecular dynamics of biopolymers. Phys Rev E 2016; 94:052406. [PMID: 27967023 DOI: 10.1103/physreve.94.052406] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/21/2016] [Indexed: 11/07/2022]
Abstract
The description of the dynamics of a complex, high-dimensional system in terms of a low-dimensional set of collective variables Y can be fruitful if the low-dimensional representation satisfies a Langevin equation with drift and diffusion coefficients that depend only on Y. We present a computational scheme to evaluate whether a given collective variable provides a faithful low-dimensional representation of the dynamics of a high-dimensional system. The scheme is based on the framework of a finite-difference Langevin equation, similar to that used for molecular-dynamics simulations. This allows one to calculate the drift and diffusion coefficients in any point of the full-dimensional system. The width of the distribution of drift and diffusion coefficients in an ensemble of microscopic points at the same value of Y indicates to what extent the dynamics of Y is described by a simple Langevin equation. Using a simple protein model, we show that collective variables often used to describe biopolymers display a non-negligible width both in the drift and in the diffusion coefficients. We also show that the associated effective force is compatible with the equilibrium free energy calculated from a microscopic sampling, but it results in markedly different dynamical properties.
Collapse
Affiliation(s)
- Roberto Meloni
- Department of Physics, Università degli Studi di Milano, and INFN, via Celoria 16, 20133 Milano, Italy
| | - Carlo Camilloni
- Department of Chemistry and Institute for Advanced Study, Technische Universität München, Lichtenbergstrasse 4, 85747 Garching, Germany
| | - Guido Tiana
- Center for Complexity and Biosystems and Department of Physics, Università degli Studi di Milano, and INFN, via Celoria 16, 20133 Milano, Italy
| |
Collapse
|