1
Molina-Taborda A, Cossio P, Lopez-Acevedo O, Gabrié M. Active Learning of Boltzmann Samplers and Potential Energies with Quantum Mechanical Accuracy. J Chem Theory Comput 2024; 20:8833-8843. [PMID: 39370622 DOI: 10.1021/acs.jctc.4c00506] [Indexed: 10/08/2024]
Abstract
Extracting consistent statistics between relevant free energy minima of a molecular system is essential for physics, chemistry, and biology. Molecular dynamics (MD) simulations can aid in this task but are computationally expensive, especially for systems that require quantum accuracy. To overcome this challenge, we developed an approach combining enhanced sampling with deep generative models and active learning of a machine learning potential (MLP). We introduce an adaptive Markov chain Monte Carlo framework that enables the training of one normalizing flow (NF) and one MLP per state, achieving rapid convergence toward the Boltzmann distribution. Leveraging the trained NF and MLP models, we compute thermodynamic observables such as free energy differences and optical spectra. We apply this method to study the isomerization of an ultrasmall silver nanocluster belonging to a set of systems with diverse applications in the fields of medicine and catalysis.
Affiliation(s)
- Ana Molina-Taborda
- Biophysics of Tropical Diseases Max Planck Tandem Group, University of Antioquia UdeA, 050010 Medellin, Colombia
- Grupo de Física Atómica y Molecular, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, 050010 Medellin, Colombia
- Center for Computational Biology, Flatiron Institute, 10010 New York, New York, United States
- Pilar Cossio
- Center for Computational Biology, Flatiron Institute, 10010 New York, New York, United States
- Center for Computational Mathematics, Flatiron Institute, 10010 New York, New York, United States
- Olga Lopez-Acevedo
- Biophysics of Tropical Diseases Max Planck Tandem Group, University of Antioquia UdeA, 050010 Medellin, Colombia
- Grupo de Física Atómica y Molecular, Instituto de Física, Facultad de Ciencias Exactas y Naturales, Universidad de Antioquia UdeA, 050010 Medellin, Colombia
- Marylou Gabrié
- CMAP, CNRS, École polytechnique, Institut Polytechnique de Paris, 91120 Palaiseau, France
2
Mitchell AR, Rotskoff GM. Committor Guided Estimates of Molecular Transition Rates. J Chem Theory Comput 2024. [PMID: 39420582 DOI: 10.1021/acs.jctc.4c00997] [Indexed: 10/19/2024]
Abstract
The probability that a configuration of a physical system reacts, or transitions from one metastable state to another, is quantified by the committor function. This function contains richly detailed mechanistic information about transition pathways, but a full parametrization of the committor requires the construction of a high-dimensional function, a generically challenging task. Recent efforts to leverage neural networks as a means to solve high-dimensional partial differential equations, often called "physics-informed" machine learning, have brought the committor into computational reach. Here, we build on the semigroup approach to learning the committor and assess its utility for predicting dynamical quantities such as transition rates. We show that a careful reframing of the objective function and improved adaptive sampling strategies provide highly accurate representations of the committor. Furthermore, by directly applying the Hill relation, we show that these committors provide accurate transition rates for molecular systems.
Affiliation(s)
- Andrew R Mitchell
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Grant M Rotskoff
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
3
Mehdi S, Tiwary P. Thermodynamics-inspired explanations of artificial intelligence. Nat Commun 2024; 15:7859. [PMID: 39251574 PMCID: PMC11385982 DOI: 10.1038/s41467-024-51970-x] [Received: 10/10/2023] [Accepted: 08/20/2024] [Indexed: 09/11/2024] Open
Abstract
In recent years, predictive machine learning models have gained prominence across various scientific domains. However, their black-box nature necessitates establishing trust in them before accepting their predictions as accurate. One promising strategy involves employing explanation techniques that elucidate the rationale behind a model's predictions in a way that humans can understand. However, assessing the degree of human interpretability of these explanations is a nontrivial challenge. In this work, we introduce interpretation entropy as a universal solution for evaluating the human interpretability of any linear model. Using this concept and drawing inspiration from classical thermodynamics, we present Thermodynamics-inspired Explainable Representations of AI and other black-box Paradigms, a method for generating optimally human-interpretable explanations in a model-agnostic manner. We demonstrate the wide-ranging applicability of this method by explaining predictions from various black-box model architectures across diverse domains, including molecular simulations, text, and image classification.
Affiliation(s)
- Shams Mehdi
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park, 20742, USA
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park, 20742, USA
- University of Maryland Institute for Health Computing, Bethesda, Maryland, 20852, USA
4
Devergne T, Huet L, Pietrucci F, Saitta AM. Efficient machine learning approach for accurate free-energy profiles and kinetic rates. Phys Rev E 2024; 110:L033301. [PMID: 39425316 DOI: 10.1103/physreve.110.l033301] [Received: 11/06/2023] [Accepted: 08/01/2024] [Indexed: 10/21/2024]
Abstract
The computational exploration of reactive processes is challenging due to the requirement of thorough sampling across the free energy landscape using accurate ab initio methods. To address these constraints, machine learning potentials are employed, yet their training for this kind of problem is still a laborious and tedious task. In this study, we present an efficient approach to train these potentials by cleverly using a single batch of unbiased trajectories that avoid the pitfalls of trajectories artificially biased along a suboptimal collective variable. This strategy, when integrated with current enhanced sampling techniques, allows to obtain free energy profiles and kinetic rates of ab initio quality, yet dramatically reducing the computational cost.
Affiliation(s)
- Timothée Devergne
- Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, UMR 7590 CNRS, Sorbonne Université, Muséum National d'Histoire Naturelle - Paris 75005, France; Atomistic Simulations, Italian Institute of Technology, 16142 Genoa, Italy; and Computational Statistics and Machine Learning, Italian Institute of Technology, 16142 Genoa, Italy
5
Singh AN, Limmer DT. Splitting probabilities as optimal controllers of rare reactive events. J Chem Phys 2024; 161:054113. [PMID: 39101534 DOI: 10.1063/5.0203840] [Received: 02/16/2024] [Accepted: 07/10/2024] [Indexed: 08/06/2024] Open
Abstract
The committor constitutes the primary quantity of interest within chemical kinetics as it is understood to encode the ideal reaction coordinate for a rare reactive event. We show the generative utility of the committor in that it can be used explicitly to produce a reactive trajectory ensemble that exhibits numerically exact statistics as that of the original transition path ensemble. This is done by relating a time-dependent analog of the committor that solves a generalized bridge problem to the splitting probability that solves a boundary value problem under a bistable assumption. By invoking stochastic optimal control and spectral theory, we derive a general form for the optimal controller of a bridge process that connects two metastable states expressed in terms of the splitting probability. This formalism offers an alternative perspective into the role of the committor and its gradients in that they encode force fields that guarantee reactivity, generating trajectories that are statistically identical to the way that a system would react autonomously.
Affiliation(s)
- Aditya N Singh
- Department of Chemistry, University of California, Berkeley, California 94720, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- David T Limmer
- Department of Chemistry, University of California, Berkeley, California 94720, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Kavli Energy Nanoscience Institute at Berkeley, Berkeley, California 94720, USA
6
Rosa-Raíces JL, Limmer DT. Variational time reversal for free-energy estimation in nonequilibrium steady states. Phys Rev E 2024; 110:024120. [PMID: 39295045 DOI: 10.1103/physreve.110.024120] [Received: 06/03/2024] [Accepted: 07/22/2024] [Indexed: 09/21/2024]
Abstract
Studying the structure of systems in nonequilibrium steady states necessitates tools that quantify population shifts and associated deformations of equilibrium free-energy landscapes under persistent currents. Within the framework of stochastic thermodynamics, we establish a variant of the Kawasaki-Crooks equality that relates nonequilibrium free-energy corrections in overdamped Langevin systems to heat dissipation statistics along time-reversed relaxation trajectories computable with molecular simulation. Using stochastic control theory, we arrive at a general variational approach to evaluate the Kawasaki-Crooks equality and use it to estimate distribution functions of order parameters in specific models of driven and active matter, attaining substantial improvement in accuracy over simple perturbative methods.
Affiliation(s)
- David T Limmer
- Department of Chemistry, University of California, Berkeley, California 94720, USA
- Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Chemical Science Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Kavli Energy NanoScience Institute, Berkeley, California 94720, USA
7
Ohnuki J, Okazaki KI. Integration of AlphaFold with Molecular Dynamics for Efficient Conformational Sampling of Transporter Protein NarK. J Phys Chem B 2024. [PMID: 39066727 DOI: 10.1021/acs.jpcb.4c02726] [Indexed: 07/30/2024]
Abstract
Transporter proteins carry their substrate across the cell membrane by changing their conformation. Thus, conformational dynamics are crucial for transport function. However, clarifying the complete transport cycle is challenging even with the current structural biology approach. Molecular dynamics (MD) simulation is a computational approach that can provide the time-resolved conformational dynamics of transporter proteins in atomic details but suffers from a high computational cost. Here, we integrate state-of-the-art protein structure prediction AI, AlphaFold2 (AF2), with MD simulation to reduce the computational cost. Focusing on the transporter protein NarK, we first show that AF2 sampled broad conformations of NarK, including the inward-open, occluded, and outward-open states. We also applied the coevolution-informed mutation in AF2, identifying state-shifting mutations. Then, we show that MD simulations from AF2-generated outward-open conformation, which is experimentally unresolved, captured the essence of the conformational state. We also found that MD simulations from AF2-generated intermediates showed transient dynamics like a transition state connecting two conformational states. This study paves the way for efficient conformational sampling of transporter proteins.
Affiliation(s)
- Jun Ohnuki
- Research Center for Computational Science, Institute for Molecular Science, National Institutes of Natural Sciences, Okazaki, Aichi 444-8585, Japan
- Graduate Institute for Advanced Studies, SOKENDAI, Okazaki, Aichi 444-8585, Japan
- Kei-Ichi Okazaki
- Research Center for Computational Science, Institute for Molecular Science, National Institutes of Natural Sciences, Okazaki, Aichi 444-8585, Japan
- Graduate Institute for Advanced Studies, SOKENDAI, Okazaki, Aichi 444-8585, Japan
8
Zhang P, Nde J, Eliaz Y, Jennings N, Cieplak P, Cheung MS. Chemistry-informed Machine Learning Explains Calcium-binding Proteins' Fuzzy Shape for Communicating Changes in the Atomic States of Calcium Ions. arXiv 2024:arXiv:2407.17017v1. [PMID: 39108291 PMCID: PMC11302678] [Indexed: 08/13/2024]
Abstract
Proteins' fuzziness are features for communicating changes in cell signaling instigated by binding with secondary messengers, such as calcium ions, associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. Binding with the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners depending on the circumstances they transmit, but it is unclear whether the limited experimental data available can be used to train models to accurately predict the charges of calcium-binding protein variants. Here, we developed a chemistry-informed, machine-learning algorithm that implements a game theoretic approach to explain the output of a machine-learning model without the prerequisite of an excessively large database for high-performance prediction of atomic charges. We used the ab initio electronic structure data representing calcium ions and the structures of the disordered segments of calcium-binding peptides with surrounding water molecules to train several explainable models. Network theory was used to extract the topological features of atomic interactions in the structurally complex data dictated by the coordination chemistry of a calcium ion, a potent indicator of its charge state in protein. With our designs, we provided a framework of explainable machine learning model to annotate atomic charges of calcium ions in calcium-binding proteins with domain knowledge in response to the chemical changes in an environment based on the limited size of scientific data in a genome space.
Affiliation(s)
- Pengzhi Zhang
- Center for Bioinformatics and Computational Biology, Houston Methodist Research Institute, Houston, TX, USA
- Jules Nde
- Department of Physics, University of Washington, Seattle, WA, USA
- Yossi Eliaz
- Department of Physics, University of Houston, Houston, TX, USA
- Computer Science Department, HIT Holon Institute of Technology, Holon, Israel
- Piotr Cieplak
- Sanford Burnham Prebys Medical Discovery Institute, La Jolla, CA, USA
- Margaret S Cheung
- Department of Physics, University of Washington, Seattle, WA, USA
- Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, Richland, WA, USA
9
Tänzel V, Jäger M, Wolf S. Learning Protein-Ligand Unbinding Pathways via Single-Parameter Community Detection. J Chem Theory Comput 2024; 20:5058-5067. [PMID: 38865714 DOI: 10.1021/acs.jctc.4c00250] [Indexed: 06/14/2024]
Abstract
Understanding the dynamics of biomolecular complexes, e.g., of protein-ligand (un)binding, requires the comprehension of paths such systems take between metastable states. In MD simulations, paths are usually not observable per se, but they need to be inferred from simulation trajectories. Here, we present a novel approach to cluster trajectories based on a community detection algorithm that necessitates only the definition of a single parameter. The unbinding of the streptavidin-biotin complex is used as a benchmark system and the A2a adenosine receptor in complex with the inhibitor ZM241385 as an elaborate application. We demonstrate how such clusters of trajectories correspond to pathways and how the approach helps in the identification of reaction coordinates for a considered (un)binding process.
Affiliation(s)
- Victor Tänzel
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
- Miriam Jäger
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
- Steffen Wolf
- Biomolecular Dynamics, Institute of Physics, University of Freiburg, Freiburg 79104, Germany
10
Ghysbrecht S, Keller BG. Thermal isomerization rates in retinal analogues using Ab-Initio molecular dynamics. J Comput Chem 2024; 45:1390-1403. [PMID: 38414274 DOI: 10.1002/jcc.27332] [Received: 12/19/2023] [Revised: 01/31/2024] [Accepted: 02/02/2024] [Indexed: 02/29/2024]
Abstract
For a detailed understanding of chemical processes in nature and industry, we need accurate models of chemical reactions in complex environments. While Eyring transition state theory is commonly used for modeling chemical reactions, it is most accurate for small molecules in the gas phase. A wide range of alternative rate theories exist that can better capture reactions involving complex molecules and environmental effects. However, they require that the chemical reaction is sampled by molecular dynamics simulations. This is a formidable challenge since the accessible simulation timescales are many orders of magnitude smaller than typical timescales of chemical reactions. To overcome these limitations, rare event methods involving enhanced molecular dynamics sampling are employed. In this work, thermal isomerization of retinal is studied using tight-binding density functional theory. Results from transition state theory are compared to those obtained from enhanced sampling. Rates obtained from dynamical reweighting using infrequent metadynamics simulations were in close agreement with those from transition state theory. Meanwhile, rates obtained from application of Kramers' rate equation to a sampled free energy profile along a torsional dihedral reaction coordinate were found to be up to three orders of magnitude higher. This discrepancy raises concerns about applying rate methods to one-dimensional reaction coordinates in chemical reactions.
Affiliation(s)
- Simon Ghysbrecht
- Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
- Bettina G Keller
- Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
11
Zhang J, Zhang O, Bonati L, Hou T. Combining Transition Path Sampling with Data-Driven Collective Variables through a Reactivity-Biased Shooting Algorithm. J Chem Theory Comput 2024; 20:4523-4532. [PMID: 38801759 DOI: 10.1021/acs.jctc.4c00423] [Indexed: 05/29/2024]
Abstract
Rare event sampling is a central problem in modern computational chemistry research. Among the existing methods, transition path sampling (TPS) can generate unbiased representations of reaction processes. However, its efficiency depends on the ability to generate reactive trial paths, which in turn depends on the quality of the shooting algorithm used. We propose a new algorithm based on the shooting success rate, i.e., reactivity, measured as a function of a reduced set of collective variables (CVs). These variables are extracted with a machine learning approach directly from TPS simulations, using a multitask objective function. Iteratively, this workflow significantly improves the shooting efficiency without any prior knowledge of the process. In addition, the optimized CVs can be used with biased enhanced sampling methodologies to accurately reconstruct the free energy profiles. We tested the method on three different systems: a two-dimensional toy model, conformational transitions of alanine dipeptide, and hydrolysis of acetyl chloride in bulk water. In the latter, we integrated our workflow with an active learning scheme to learn a reactive machine learning-based potential, which allowed us to study the mechanism and free energy profile with an ab initio-like accuracy.
Affiliation(s)
- Jintu Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Atomistic Simulations, Italian Institute of Technology, Genova 16152, Italy
- Odin Zhang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Luigi Bonati
- Atomistic Simulations, Italian Institute of Technology, Genova 16152, Italy
- TingJun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
- State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
12
Vervust W, Zhang DT, Ghysels A, Roet S, van Erp TS, Riccardi E. PyRETIS 3: Conquering rare and slow events without boundaries. J Comput Chem 2024; 45:1224-1234. [PMID: 38345082 DOI: 10.1002/jcc.27319] [Received: 12/11/2023] [Revised: 01/16/2024] [Accepted: 01/18/2024] [Indexed: 04/19/2024]
Abstract
We present and discuss the advancements made in PyRETIS 3, the third instalment of our Python library for an efficient and user-friendly rare event simulation, focused to execute molecular simulations with replica exchange transition interface sampling (RETIS) and its variations. Apart from a general rewiring of the internal code towards a more modular structure, several recently developed sampling strategies have been implemented. These include recently developed Monte Carlo moves to increase path decorrelation and convergence rate, and new ensemble definitions to handle the challenges of long-lived metastable states and transitions with unbounded reactant and product states. Additionally, the post-analysis software PyVisa is now embedded in the main code, allowing fast use of machine-learning algorithms for clustering and visualising collective variables in the simulation data.
Affiliation(s)
- Wouter Vervust
- IBiTech-BioMMedA Group, Ghent University, Ghent, Belgium
- Daniel T Zhang
- Department of Chemistry, Norwegian University of Science and Technology, Trondheim, Norway
- An Ghysels
- IBiTech-BioMMedA Group, Ghent University, Ghent, Belgium
- Sander Roet
- Department of Chemistry, Utrecht University, Utrecht, The Netherlands
- Titus S van Erp
- Department of Chemistry, Norwegian University of Science and Technology, Trondheim, Norway
- Enrico Riccardi
- Department of Energy Resources, University of Stavanger, Stavanger, Norway
13
Tiwary P. Modeling prebiotic chemistries with quantum accuracy at classical costs. Proc Natl Acad Sci U S A 2024; 121:e2408742121. [PMID: 38809708 PMCID: PMC11161769 DOI: 10.1073/pnas.2408742121] [Indexed: 05/31/2024] Open
Affiliation(s)
- Pratyush Tiwary
- Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
- University of Maryland Institute for Health Computing, Bethesda, MD 20852
14
Keller BG, Bolhuis PG. Dynamical Reweighting for Biased Rare Event Simulations. Annu Rev Phys Chem 2024; 75:137-162. [PMID: 38941527 DOI: 10.1146/annurev-physchem-083122-124538] [Indexed: 06/30/2024]
Abstract
Dynamical reweighting techniques aim to recover the correct molecular dynamics from a simulation at a modified potential energy surface. They are important for unbiasing enhanced sampling simulations of molecular rare events. Here, we review the theoretical frameworks of dynamical reweighting for modified potentials. Based on an overview of kinetic models with increasing level of detail, we discuss techniques to reweight two-state dynamics, multistate dynamics, and path integrals. We explore the natural link to transition path sampling and how the effect of nonequilibrium forces can be reweighted. We end by providing an outlook on how dynamical reweighting integrates with techniques for optimizing collective variables and with modern potential energy surfaces.
Affiliation(s)
- Bettina G Keller
- Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
- Peter G Bolhuis
- Van 't Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, The Netherlands
15
Mous S, Poitevin F, Hunter MS, Asthagiri DN, Beck TL. Structural biology in the age of X-ray free-electron lasers and exascale computing. Curr Opin Struct Biol 2024; 86:102808. [PMID: 38547555 DOI: 10.1016/j.sbi.2024.102808] [Received: 11/09/2023] [Revised: 02/07/2024] [Accepted: 03/07/2024] [Indexed: 05/19/2024]
Abstract
Serial femtosecond X-ray crystallography has emerged as a powerful method for investigating biomolecular structure and dynamics. With the new generation of X-ray free-electron lasers, which generate ultrabright X-ray pulses at megahertz repetition rates, we can now rapidly probe ultrafast conformational changes and charge movement in biomolecules. Over the last year, another innovation has been the deployment of Frontier, the world's first exascale supercomputer. Synergizing extremely high repetition rate X-ray light sources and exascale computing has the potential to accelerate discovery in biomolecular sciences. Here we outline our perspective on each of these remarkable innovations individually, and the opportunities and challenges in yoking them within an integrated research infrastructure.
Affiliation(s)
- Sandra Mous
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, 94025, CA, USA
- Frédéric Poitevin
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, 94025, CA, USA
- Mark S Hunter
- Linac Coherent Light Source, SLAC National Accelerator Laboratory, Menlo Park, 94025, CA, USA
- Dilipkumar N Asthagiri
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, 37830-6012, TN, USA
- Thomas L Beck
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, 37830-6012, TN, USA
16
Kang P, Trizio E, Parrinello M. Computing the committor with the committor to study the transition state ensemble. Nat Comput Sci 2024; 4:451-460. [PMID: 38839932 DOI: 10.1038/s43588-024-00645-0] [Received: 01/12/2024] [Accepted: 05/14/2024] [Indexed: 06/07/2024]
Abstract
The study of the kinetic bottlenecks that hinder the rare transitions between long-lived metastable states is a major challenge in atomistic simulations. Here we propose a method to explore the transition state ensemble, which is the distribution of configurations that the system passes through as it translocates from one metastable basin to another. We base our method on the committor function and the variational principle that it obeys. We find its minimum through a self-consistent procedure that starts from information limited to the initial and final states. Right from the start, our procedure allows the sampling of very many transition state configurations. With the help of the variational principle, we perform a detailed analysis of the transition state ensemble, ranking quantitatively the degrees of freedom mostly involved in the transition and enabling a systematic approach for the interpretation of simulation results and the construction of efficient physics-informed collective variables.
Affiliation(s)
- Peilin Kang
- Atomistic Simulations, Italian Institute of Technology, Genova, Italy
- Enrico Trizio
- Atomistic Simulations, Italian Institute of Technology, Genova, Italy
- Department of Materials Science, Università di Milano-Bicocca, Milano, Italy
17
David R, Tuñón I, Laage D. Competing Reaction Mechanisms of Peptide Bond Formation in Water Revealed by Deep Potential Molecular Dynamics and Path Sampling. J Am Chem Soc 2024; 146:14213-14224. [PMID: 38739765 DOI: 10.1021/jacs.4c03445] [Indexed: 05/16/2024]
Abstract
The formation of an amide bond is an essential step in the synthesis of materials and drugs, and in the assembly of amino acids to form peptides. The mechanism of this reaction has been studied extensively, in particular to understand how it can be catalyzed, but a representation capable of explaining all the experimental data is still lacking. Numerical simulation should provide the necessary molecular description, but the solvent involvement poses a number of challenges. Here, we combine the efficiency and accuracy of neural network potential-based reactive molecular dynamics with the extensive and unbiased exploration of reaction pathways provided by transition path sampling. Using microsecond-scale simulations at the density functional theory level, we show that this method reveals the presence of two competing distinct mechanisms for peptide bond formation between alanine esters in aqueous solution. We describe how both reaction pathways, via a general base catalysis mechanism and via direct cleavage of the tetrahedral intermediate respectively, change with pH. This result contrasts with the conventional mechanism involving a single pathway in which only the barrier heights are affected by pH. We show that this new proposal involving two competing mechanisms is consistent with the experimental data, and we discuss the implications for peptide bond formation under prebiotic conditions and in the ribosome. Our work shows that integrating deep potential molecular dynamics with path sampling provides a powerful approach for exploring complex chemical mechanisms.
Affiliation(s)
- Rolf David
- PASTEUR, Department of Chemistry, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
| | - Iñaki Tuñón
- Departamento de Química Física, Universitat de Valencia, Burjassot, 46100 Valencia, Spain
| | - Damien Laage
- PASTEUR, Department of Chemistry, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 75005 Paris, France
|
18
|
France-Lanord A, Vroylandt H, Salanne M, Rotenberg B, Saitta AM, Pietrucci F. Data-Driven Path Collective Variables. J Chem Theory Comput 2024; 20:3069-3084. [PMID: 38619076 DOI: 10.1021/acs.jctc.4c00123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Identifying optimal collective variables to model transformations using atomic-scale simulations is a long-standing challenge. We propose a new method for the generation, optimization, and comparison of collective variables that can be thought of as a data-driven generalization of the path collective variable concept. It consists of a kernel ridge regression of the committor probability, which encodes a transformation's progress. The resulting collective variable is one-dimensional, interpretable, and differentiable, making it appropriate for enhanced sampling simulations requiring biasing. We demonstrate the validity of the method on two different applications: a precipitation model and the association of Li+ and F- in water. For the former, we show that global descriptors such as the permutation invariant vector allow reaching an accuracy far from the one achieved via simpler, more intuitive variables. For the latter, we show that information correlated with the transformation mechanism is contained in the first solvation shell only and that inertial effects prevent the derivation of optimal collective variables from the atomic positions only.
Affiliation(s)
- Arthur France-Lanord
- Institut des Sciences du Calcul et des Données, ISCD, Sorbonne Université, F-75005 Paris, France
- Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Sorbonne Université, F-75005 Paris, France
| | - Hadrien Vroylandt
- Institut des Sciences du Calcul et des Données, ISCD, Sorbonne Université, F-75005 Paris, France
| | - Mathieu Salanne
- Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, 4 Place Jussieu, F-75005 Paris, France
- Institut Universitaire de France (IUF), 75231 Paris, France
| | - Benjamin Rotenberg
- Physicochimie des Électrolytes et Nanosystèmes Interfaciaux, Sorbonne Université, CNRS, 4 Place Jussieu, F-75005 Paris, France
| | - A Marco Saitta
- Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Sorbonne Université, F-75005 Paris, France
| | - Fabio Pietrucci
- Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Sorbonne Université, F-75005 Paris, France
|
19
|
Ghamari D, Covino R, Faccioli P. Sampling a Rare Protein Transition Using Quantum Annealing. J Chem Theory Comput 2024; 20:3322-3334. [PMID: 38587482 DOI: 10.1021/acs.jctc.3c01174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
Simulating spontaneous structural rearrangements in macromolecules with classical molecular dynamics is an outstanding challenge. Conventional supercomputers can access time intervals of up to tens of μs, while many key events occur on exponentially longer time scales. Path sampling techniques have the advantage of focusing the computational power on barrier-crossing trajectories, but generating uncorrelated transition paths that explore diverse conformational regions remains a problem. We employ a hybrid path-sampling paradigm that addresses this issue by generating trial transition paths using a quantum annealing (QA) machine. We first employ a classical computer to perform an uncharted exploration of the conformational space. The data set generated in this exploration is then postprocessed using a path integral-based method to yield a coarse-grained network representation of the reactive kinetics. By resorting to a quantum annealer, quantum superposition can be exploited to encode all of the transition pathways in the initial quantum state, thus potentially solving the path exploration problem. Furthermore, each QA cycle yields a completely uncorrelated trial trajectory. We previously validated this scheme on a prototypically simple transition, which could be extensively characterized on a desktop computer. Here, we scale up in complexity and perform an all-atom simulation of a protein conformational transition that occurs on the millisecond time scale, obtaining results that match those of the Anton special-purpose supercomputer. Despite limitations due to the available quantum annealers, our study highlights how realistic biomolecular simulations provide potentially impactful new ground for applying, testing, and advancing quantum technologies.
Affiliation(s)
- Danial Ghamari
- Physics Department, Trento University, Via Sommarive 14, Povo 38123, Trento, Italy
- INFN-TIFPA, Via Sommarive 14, Povo 38123, Trento, Italy
| | - Roberto Covino
- Frankfurt Institute for Advanced Studies, Ruth-Moufang-Straße 1 Frankfurt am Main, Frankfurt D-60438, Germany
- Department of Biochemistry, University of Bayreuth, Universitätsstraße 30, Bayreuth 95447, Germany
| | - Pietro Faccioli
- INFN-TIFPA, Via Sommarive 14, Povo 38123, Trento, Italy
- Bicocca Quantum Technology Center and Physics Department, University of Milan Bicocca, Piazza della Scienza 2/A, Milan 20126, Italy
|
20
|
Falkner S, Coretti A, Dellago C. Enhanced Sampling of Configuration and Path Space in a Generalized Ensemble by Shooting Point Exchange. Phys Rev Lett 2024; 132:128001. [PMID: 38579233 DOI: 10.1103/physrevlett.132.128001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 12/15/2023] [Accepted: 02/09/2024] [Indexed: 04/07/2024]
Abstract
The computer simulation of many molecular processes is complicated by long timescales caused by rare transitions between long-lived states. Here, we propose a new approach to simulate such rare events, which combines transition path sampling with enhanced exploration of configuration space. The method relies on exchange moves between configuration and trajectory space, carried out based on a generalized ensemble. This scheme substantially enhances the efficiency of the transition path sampling simulations, particularly for systems with multiple transition channels, and yields information on thermodynamics, kinetics and reaction coordinates of molecular processes without distorting their dynamics. The method is illustrated using the isomerization of proline in the KPTP tetrapeptide.
|
21
|
Lelièvre T, Pigeon T, Stoltz G, Zhang W. Analyzing Multimodal Probability Measures with Autoencoders. J Phys Chem B 2024; 128:2607-2631. [PMID: 38466759 DOI: 10.1021/acs.jpcb.3c07075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Finding collective variables to describe some important coarse-grained information on physical systems, in particular metastable states, remains a key issue in molecular dynamics. Recently, machine learning techniques have been intensively used to complement and possibly bypass expert knowledge in order to construct collective variables. Our focus here is on neural network approaches based on autoencoders. We study some relevant mathematical properties of the loss function considered for training autoencoders and provide physical interpretations based on conditional variances and minimum energy paths. We also consider various extensions in order to better describe physical systems, by incorporating more information on transition states at saddle points, and/or allowing for multiple decoders in order to describe several transition paths. Our results are illustrated on toy two-dimensional systems and on alanine dipeptide.
Affiliation(s)
- Tony Lelièvre
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
| | - Thomas Pigeon
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
- IFP Energies Nouvelles, Rond-Point de l'Echangeur de Solaize, BP 3, 69360 Solaize, France
| | - Gabriel Stoltz
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
| | - Wei Zhang
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Zuse Institute Berlin, Takustraße 7, 14195 Berlin, Germany
|
22
|
Zou Z, Tiwary P. Enhanced Sampling of Crystal Nucleation with Graph Representation Learnt Variables. J Phys Chem B 2024. [PMID: 38502931 DOI: 10.1021/acs.jpcb.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024]
Abstract
In this study, we present a graph neural network (GNN)-based learning approach using an autoencoder setup to derive low-dimensional variables from features observed in experimental crystal structures. These variables are then biased in enhanced sampling to observe state-to-state transitions and reliable thermodynamic weights. In our approach, we used simple convolution and pooling methods. To verify the effectiveness of our protocol, we examined the nucleation of various allotropes and polymorphs of iron and glycine in their molten states. Our graph latent variables, when biased in well-tempered metadynamics, consistently show transitions between states and achieve accurate thermodynamic rankings in agreement with experiments, both of which are indicators of dependable sampling. This underscores the strength and promise of our GNN variables for improved sampling. The protocol shown here should be applicable for other systems and other sampling methods.
Affiliation(s)
- Ziyue Zou
- Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
| | - Pratyush Tiwary
- Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
- Institute for Physical Science and Technology, University of Maryland, College Park 20742, Maryland, United States
- University of Maryland Institute for Health Computing, Rockville, Maryland 20852, United States
|
23
|
Beck TL, Carloni P, Asthagiri DN. All-Atom Biomolecular Simulation in the Exascale Era. J Chem Theory Comput 2024; 20:1777-1782. [PMID: 38382017 DOI: 10.1021/acs.jctc.3c01276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Exascale supercomputers have opened the door to dynamic simulations, facilitated by AI/ML techniques, that model biomolecular motions over unprecedented length and time scales. This new capability holds the potential to revolutionize our understanding of fundamental biological processes. Here we report on some of the major advances that were discussed at a recent CECAM workshop in Pisa, Italy, on the topic with a primary focus on atomic-level simulations. First, we highlight examples of current large-scale biomolecular simulations and the future possibilities enabled by crossing the exascale threshold. Next, we discuss challenges to be overcome in optimizing the usage of these powerful resources. Finally, we close by listing several grand challenge problems that could be investigated with this new computer architecture.
Affiliation(s)
- Thomas L Beck
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| | - Paolo Carloni
- INM-9/IAS-5 Computational Biomedicine, Forschungszentrum Jülich, Wilhelm-Johnen-Straße, D-54245 Jülich, Germany
- Department of Physics, RWTH Aachen University, D-52078 Aachen, Germany
| | - Dilipkumar N Asthagiri
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
|
24
|
Lei ZC, Wang X, Yang L, Qu H, Sun Y, Yang Y, Li W, Zhang WB, Cao XY, Fan C, Li G, Wu J, Tian ZQ. What can molecular assembly learn from catalysed assembly in living organisms? Chem Soc Rev 2024; 53:1892-1914. [PMID: 38230701 DOI: 10.1039/d3cs00634d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/18/2024]
Abstract
Molecular assembly is the process of organizing individual molecules into larger structures and complex systems. The self-assembly approach is predominantly utilized in creating artificial molecular assemblies, and was believed to be the primary mode of molecular assembly in living organisms as well. However, it has been shown that the assembly of many biological complexes is "catalysed" by other molecules, rather than relying solely on self-assembly. In this review, we summarize these catalysed-assembly (catassembly) phenomena in living organisms and systematically analyse their mechanisms. We then expand on these phenomena and discuss related concepts, including catalysed-disassembly and catalysed-reassembly. Catassembly proves to be an efficient and highly selective strategy for synergistically controlling and manipulating various noncovalent interactions, especially in hierarchical molecular assemblies. Overreliance on self-assembly may, to some extent, hinder the advancement of artificial molecular assembly with powerful features. Furthermore, inspired by the biological catassembly phenomena, we propose guidelines for designing artificial catassembly systems and developing characterization and theoretical methods, and review pioneering works along this new direction. Overall, this approach may broaden and deepen our understanding of molecular assembly, enabling the construction and control of intelligent assembly systems with advanced functionality.
Affiliation(s)
- Zhi-Chao Lei
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
| | - Xinchang Wang
- School of Electronic Science and Engineering, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, P. R. China
| | - Liulin Yang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
| | - Hang Qu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
| | - Yibin Sun
- Beijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
| | - Yang Yang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
| | - Wei Li
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
| | - Wen-Bin Zhang
- Beijing National Laboratory for Molecular Sciences, Key Laboratory of Polymer Chemistry & Physics of Ministry of Education, Center for Soft Matter Science and Engineering, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, P. R. China
| | - Xiao-Yu Cao
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
| | - Chunhai Fan
- School of Chemistry and Chemical Engineering, Frontiers Science, Center for Transformative Molecules and National Center for Translational Medicine, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
| | - Guohong Li
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, P. R. China
- University of Chinese Academy of Sciences, Beijing 100049, P. R. China
| | - Jiarui Wu
- Key Laboratory of Systems Biology, Center for Excellence in Molecular Cell Science, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, Shanghai, 200031, P. R. China
- School of Life Science and Technology, ShanghaiTech University, Shanghai, 201210, P. R. China
- Key Laboratory of Systems Health Science of Zhejiang Province, School of Life Science, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Hangzhou, 310024, P. R. China
| | - Zhong-Qun Tian
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China.
|
25
|
Fu H, Bian H, Shao X, Cai W. Collective Variable-Based Enhanced Sampling: From Human Learning to Machine Learning. J Phys Chem Lett 2024; 15:1774-1783. [PMID: 38329095 DOI: 10.1021/acs.jpclett.3c03542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/09/2024]
Abstract
Enhanced-sampling algorithms relying on collective variables (CVs) are extensively employed to study complex (bio)chemical processes that are not amenable to brute-force molecular simulations. The selection of appropriate CVs characterizing the slow movement modes is of paramount importance for reliable and efficient enhanced-sampling simulations. In this Perspective, we first review the application and limitations of CVs obtained from chemical and geometrical intuition. We also introduce path-sampling algorithms, which can identify path-like CVs in a high-dimensional free-energy space. Machine-learning algorithms offer a viable approach to finding suitable CVs by analyzing trajectories from preliminary simulations. We discuss both the performance of machine-learning-derived CVs in enhanced-sampling simulations of experimental models and the challenges involved in applying these CVs to realistic, complex molecular assemblies. Moreover, we provide a prospective view of the potential advancements of machine-learning algorithms for the development of CVs in the field of enhanced-sampling simulations.
Affiliation(s)
- Haohao Fu
- Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
| | - Hengwei Bian
- Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
| | - Xueguang Shao
- Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
| | - Wensheng Cai
- Research Center for Analytical Sciences, Frontiers Science Center for New Organic Matter, College of Chemistry, Nankai University, Tianjin Key Laboratory of Biosensing and Molecular Recognition, State Key Laboratory of Medicinal Chemical Biology, Tianjin 300071, China
- Haihe Laboratory of Sustainable Chemical Transformations, Tianjin 300192, China
|
26
|
Nicolle A, Deng S, Ihme M, Kuzhagaliyeva N, Ibrahim EA, Farooq A. Mixtures Recomposition by Neural Nets: A Multidisciplinary Overview. J Chem Inf Model 2024; 64:597-620. [PMID: 38284618 DOI: 10.1021/acs.jcim.3c01633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/30/2024]
Abstract
Artificial Neural Networks (ANNs) are transforming how we understand chemical mixtures, providing an expressive view of the chemical space and multiscale processes. Their hybridization with physical knowledge can bridge the gap between predictivity and understanding of the underlying processes. This overview explores recent progress in ANNs, particularly their potential in the 'recomposition' of chemical mixtures. Graph-based representations reveal patterns among mixture components, and deep learning models excel in capturing complexity and symmetries when compared to traditional Quantitative Structure-Property Relationship models. Key components, such as Hamiltonian networks and convolution operations, play a central role in representing multiscale mixtures. The integration of ANNs with Chemical Reaction Networks and Physics-Informed Neural Networks for inverse chemical kinetic problems is also examined. The combination of sensors with ANNs shows promise in optical and biomimetic applications. A common ground is identified in the context of statistical physics, where ANN-based methods iteratively adapt their models by blending their initial states with training data. The concept of mixture recomposition unveils a reciprocal inspiration between ANNs and reactive mixtures, highlighting learning behaviors influenced by the training environment.
Affiliation(s)
- Andre Nicolle
- Aramco Fuel Research Center, Rueil-Malmaison 92852, France
| | - Sili Deng
- Massachusetts Institute of Technology, Cambridge 02139, Massachusetts, United States
| | - Matthias Ihme
- Stanford University, Stanford 94305, California, United States
| | | | - Emad Al Ibrahim
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
| | - Aamir Farooq
- King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
|
27
|
Beck M, Covino R, Hänelt I, Müller-McNicoll M. Understanding the cell: Future views of structural biology. Cell 2024; 187:545-562. [PMID: 38306981 DOI: 10.1016/j.cell.2023.12.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/05/2023] [Accepted: 12/11/2023] [Indexed: 02/04/2024]
Abstract
Determining the structure and mechanisms of all individual functional modules of cells at high molecular detail has often been seen as equal to understanding how cells work. Recent technical advances have led to a flush of high-resolution structures of various macromolecular machines, but despite this wealth of detailed information, our understanding of cellular function remains incomplete. Here, we discuss present-day limitations of structural biology and highlight novel technologies that may enable us to analyze molecular functions directly inside cells. We predict that the progression toward structural cell biology will involve a shift toward conceptualizing a 4D virtual reality of cells using digital twins. These will capture cellular segments in a highly enriched molecular detail, include dynamic changes, and facilitate simulations of molecular processes, leading to novel and experimentally testable predictions. Transferring biological questions into algorithms that learn from the existing wealth of data and explore novel solutions may ultimately unveil how cells work.
Affiliation(s)
- Martin Beck
- Max Planck Institute of Biophysics, Max-von-Laue-Straße 3, 60438 Frankfurt am Main, Germany; Goethe University Frankfurt, Frankfurt, Germany.
| | - Roberto Covino
- Frankfurt Institute for Advanced Studies, Ruth-Moufang-Straße 1, 60438 Frankfurt am Main, Germany.
| | - Inga Hänelt
- Goethe University Frankfurt, Frankfurt, Germany.
|
28
|
Lazzeri G, Jung H, Bolhuis PG, Covino R. Molecular Free Energies, Rates, and Mechanisms from Data-Efficient Path Sampling Simulations. J Chem Theory Comput 2023; 19:9060-9076. [PMID: 37988412 PMCID: PMC10753783 DOI: 10.1021/acs.jctc.3c00821] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/28/2023] [Revised: 10/24/2023] [Accepted: 10/24/2023] [Indexed: 11/23/2023]
Abstract
Molecular dynamics is a powerful tool for studying the thermodynamics and kinetics of complex molecular events. However, these simulations can rarely sample the required time scales in practice. Transition path sampling overcomes this limitation by collecting unbiased trajectories and capturing the relevant events. Moreover, the integration of machine learning can boost the sampling while simultaneously learning a quantitative representation of the mechanism. Still, the resulting trajectories are by construction non-Boltzmann-distributed, preventing the calculation of free energies and rates. We developed an algorithm to approximate the equilibrium path ensemble from machine-learning-guided path sampling data. At the same time, our algorithm provides efficient sampling, mechanism, free energy, and rates of rare molecular events at a very moderate computational cost. We tested the method on the folding of the mini-protein chignolin. Our algorithm is straightforward and data-efficient, opening the door to applications in many challenging molecular systems.
Affiliation(s)
- Gianmarco Lazzeri
- Frankfurt Institute for Advanced Studies, Frankfurt am Main, 60438, Germany
- Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
| | - Hendrik Jung
- Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, Frankfurt am Main, 60438, Germany
| | - Peter G. Bolhuis
- Van’t Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, 1090GD, The Netherlands
| | - Roberto Covino
- Frankfurt Institute for Advanced Studies, Frankfurt am Main, 60438, Germany
- Goethe University Frankfurt, Frankfurt am Main, 60438, Germany
|
29
|
Hofmann H. All over or overall - Do we understand allostery? Curr Opin Struct Biol 2023; 83:102724. [PMID: 37898005 DOI: 10.1016/j.sbi.2023.102724] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 09/27/2023] [Accepted: 09/28/2023] [Indexed: 10/30/2023]
Abstract
Allostery is probably the most important concept in the regulation of cellular processes. Models to explain allostery are plenty. Each sheds light on different aspects but their entirety conveys an ambiguous feeling of comprehension and disappointment. Here, I discuss the most popular allostery models, their roots, similarities, and limitations. All of them are thermodynamic models. Naturally this bears a certain degree of redundancy, which forms the center of this review. After sixty years, many questions remain unanswered, mainly because our human longing for causality as base for understanding is not satisfied by thermodynamics alone. A description of allostery in terms of pathways, i.e., as a temporal chain of events, has been, and still is, a missing piece of the puzzle.
Affiliation(s)
- Hagen Hofmann
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Herzl St. 234, 76100 Rehovot, Israel.
|
30
|
Ngo VA, Lin YT, Perez D. Improving Estimation of the Koopman Operator with Kolmogorov-Smirnov Indicator Functions. J Chem Theory Comput 2023; 19:7187-7198. [PMID: 37800673 DOI: 10.1021/acs.jctc.3c00632] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2023]
Abstract
It has become common to perform kinetic analysis using approximate Koopman operators that transform high-dimensional timeseries of observables into ranked dynamical modes. The key to the practical success of the approach is the identification of a set of observables that form a good basis on which to expand the slow relaxation modes. Good observables are, however, difficult to identify a priori and suboptimal choices can lead to significant underestimations of characteristic time scales. Leveraging the representation of slow dynamics in terms of Hidden Markov Models (HMM), we propose a simple and computationally efficient clustering procedure to infer surrogate observables that form a good basis for slow modes. We apply the approach to an analytically solvable model system as well as on three protein systems of different complexities. We consistently demonstrate that the inferred indicator functions can significantly improve the estimation of the leading eigenvalues of Koopman operators and correctly identify key states and transition time scales of stochastic systems, even when good observables are not known a priori.
Affiliation(s)
- Van A Ngo
- Advanced Computing for Life Sciences and Engineering, Computing and Computational Sciences, National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| | - Yen Ting Lin
- Information Sciences Group (CCS-3), Computer, Computational and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Danny Perez
- Physics and Chemistry of Materials Group (T-1), Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87544, United States
| |
|
31
|
Bhendale M, Indra A, Singh JK. Does freezing induce self-assembly of polymers? A molecular dynamics study. Soft Matter 2023; 19:7570-7579. [PMID: 37751160 DOI: 10.1039/d3sm00892d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]
Abstract
This work investigates the freezing-induced self-assembly (FISA) of polyvinyl alcohol (PVA) and PVA-like polymers using molecular dynamics simulations. In particular, the effect of the degree of supercooling, degree of polymerization, polymer type, and initial local concentration on the FISA was studied. It was found that the preeminent factor responsible for FISA is not the diffusion of the polymers away from the nucleating ice front, but the increase in the polymer's local concentration upon freezing of the solvent (water). At a higher degree of supercooling, the polymers are engulfed by the growing ice front, impeding their diffusion into the supercooled solution and finally inhibiting their self-assembly. Conversely, at a relatively lower degree of supercooling, the rate of diffusion of the polymers into the supercooled solution is higher, which increases their local concentration and results in FISA. FISA was also observed to depend on the polymer-solvent interactions. Strongly favorable solute-solvent interactions hinder the self-assembly, whereas unfavorable solute-solvent interactions promote the self-assembly. The polymer and aggregate morphology were investigated using the radius of gyration, end-to-end distance, and asphericity analysis. This study brings molecular insights into the quintessential factors governing self-assembly via freezing of the solvent, which is a novel self-assembly technique especially suitable for biomedical applications.
Affiliation(s)
- Mangesh Bhendale
- Department of Chemical Engineering, Indian Institute of Technology, Kanpur, Uttar Pradesh 208016, India.
| | - Aindrila Indra
- Department of Chemical Engineering, Indian Institute of Technology, Kanpur, Uttar Pradesh 208016, India.
| | - Jayant K Singh
- Department of Chemical Engineering, Indian Institute of Technology, Kanpur, Uttar Pradesh 208016, India.
- Prescience Insilico Private Limited, 5th floor, Novel MSR Building, Marathalli, Bengaluru, Karnataka 560037, India
| |
|
32
|
Mouaffac L, Palacio-Rodriguez K, Pietrucci F. Optimal Reaction Coordinates and Kinetic Rates from the Projected Dynamics of Transition Paths. J Chem Theory Comput 2023; 19:5701-5711. [PMID: 37550088 DOI: 10.1021/acs.jctc.3c00158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023]
Abstract
Finding optimal reaction coordinates and predicting accurate kinetic rates for activated processes are two of the foremost challenges of molecular simulations. We introduce an algorithm that tackles the two problems at once: starting from a limited number of reactive molecular dynamics trajectories (transition paths), we automatically generate with a Monte Carlo approach a sequence of different reaction coordinates that progressively reduce the kinetic rate of their projected effective dynamics. Based on a variational principle, the minimal rate accurately approximates the exact one, and it corresponds to the optimal reaction coordinate. After benchmarking the method on an analytic double-well system, we apply it to complex atomistic systems: the interaction of carbon nanoparticles of different sizes in water.
Affiliation(s)
- Line Mouaffac
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| | - Karen Palacio-Rodriguez
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| | - Fabio Pietrucci
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, F-75005 Paris, France
| |
|
33
|
Yuan Y, Cui Q. Accurate and Efficient Multilevel Free Energy Simulations with Neural Network-Assisted Enhanced Sampling. J Chem Theory Comput 2023; 19:5394-5406. [PMID: 37527495 PMCID: PMC10810721 DOI: 10.1021/acs.jctc.3c00591] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 08/03/2023]
Abstract
Free energy differences (ΔF) are essential to quantitative characterization and understanding of chemical and biological processes. Their direct estimation with an accurate quantum mechanical potential is of great interest and yet impractical due to high computational cost and incompatibility with typical alchemical free energy protocols. One promising solution is the multilevel free energy simulation in which the estimate of ΔF at an inexpensive low level of theory is combined with the correction toward a higher level of theory. The poor configurational overlap generally expected between the two levels of theory, however, presents a major challenge. We overcome this challenge by using a deep neural network model and enhanced sampling simulations. An adversarial autoencoder is used to identify a low-dimensional (latent) space that compactly represents the degrees of freedom that encode the distinct distributions at the two levels of theory. Enhanced sampling in this latent space is then used to drive the sampling of configurations that predominantly contribute to the free energy correction. Results for both gas phase and condensed phase systems demonstrate that this data-driven approach offers high accuracy and efficiency with great potential for scalability to complex systems.
Affiliation(s)
- Yuchen Yuan
- Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
| | - Qiang Cui
- Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
- Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
- Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
| |
|
34
|
Strahan J, Guo SC, Lorpaiboon C, Dinner AR, Weare J. Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction. J Chem Phys 2023; 159:014110. [PMID: 37409704 PMCID: PMC10328561 DOI: 10.1063/5.0151309] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/20/2023] [Accepted: 06/02/2023] [Indexed: 07/07/2023]
Abstract
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics, such as the likelihood and average time of events (predictions). Here, we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a dataset of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Affiliation(s)
- John Strahan
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
| | - Spencer C. Guo
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
| | - Chatipat Lorpaiboon
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
| | - Aaron R. Dinner
- Department of Chemistry and James Franck Institute, University of Chicago, Chicago, Illinois 60637, USA
| | - Jonathan Weare
- Courant Institute of Mathematical Sciences, New York University, New York, New York 10012, USA
| |
|
35
|
Rydzewski J. Spectral Map: Embedding Slow Kinetics in Collective Variables. J Phys Chem Lett 2023; 14:5216-5220. [PMID: 37260045 PMCID: PMC10258851 DOI: 10.1021/acs.jpclett.3c01101] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 04/24/2023] [Accepted: 05/26/2023] [Indexed: 06/02/2023]
Abstract
The dynamics of physical systems that require high-dimensional representation can often be captured in a few meaningful degrees of freedom called collective variables (CVs). However, identifying CVs is challenging and constitutes a fundamental problem in physical chemistry. This problem is even more pronounced when CVs need to provide information about slow kinetics related to rare transitions between long-lived metastable states. To address this issue, we propose an unsupervised deep-learning method called spectral map. Our method constructs slow CVs by maximizing the spectral gap between slow and fast eigenvalues of a transition matrix estimated by an anisotropic diffusion kernel. We demonstrate our method in several high-dimensional reversible folding processes.
Affiliation(s)
- Jakub Rydzewski
- Institute of Physics, Faculty of Physics, Astronomy and Informatics, Nicolaus Copernicus University, Grudziadzka 5, 87-100 Toruń, Poland
| |
|