1. Hradiská H, Kurečka M, Beránek J, Tedeschi G, Višňovský V, Křenek A, Spiwok V. Acceleration of Molecular Simulations by Parametric Time-Lagged tSNE Metadynamics. J Phys Chem B 2024; 128:903-913. [PMID: 38237064 PMCID: PMC10839826 DOI: 10.1021/acs.jpcb.3c05669] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 12/22/2023] [Accepted: 12/28/2023] [Indexed: 02/02/2024]
Abstract
The potential of molecular simulations is limited by their computational costs. There is often a need to accelerate simulations using some of the enhanced sampling methods. Metadynamics applies a history-dependent bias potential that disfavors previously visited states. To apply metadynamics, it is necessary to select a few properties of the system, the collective variables (CVs), that can be used to define the bias potential. Over the past few years, there have been emerging opportunities for machine learning and, in particular, artificial neural networks within this domain. In this broad context, a specific unsupervised machine learning method was utilized, namely, parametric time-lagged t-distributed stochastic neighbor embedding (ptltSNE) to design CVs. The approach was tested on a Trp-cage trajectory (tryptophan cage) from the literature. The trajectory was used to generate a map of conformations, distinguish fast conformational changes from slow ones, and design CVs. Then, metadynamic simulations were performed. To accelerate the formation of the α-helix, we added the α-RMSD collective variable. This simulation led to one folding event in a 350 ns metadynamics simulation. To accelerate degrees of freedom not addressed by CVs, we performed parallel tempering metadynamics. This simulation led to 10 folding events in a 200 ns simulation with 32 replicas.
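The history-dependent bias mentioned in this abstract can be illustrated with a minimal sketch. It assumes the plain (non-well-tempered) textbook form of metadynamics in one collective variable, where the bias is a sum of Gaussian hills centred on previously visited CV values. The function name, hill height, hill width, and the visited values are hypothetical and are not taken from the paper.

```python
import math


def bias_potential(s, centers, height=1.0, width=0.1):
    """Sum of Gaussian hills deposited at previously visited CV values.

    s       -- current value of the collective variable
    centers -- CV values at which hills were deposited (the history)
    height  -- hill height (hypothetical value, arbitrary energy units)
    width   -- hill width sigma (hypothetical value, CV units)
    """
    return sum(
        height * math.exp(-((s - c) ** 2) / (2.0 * width ** 2))
        for c in centers
    )


# Hypothetical history of visited CV values.
visited = [0.0, 0.05, 0.1]

# The bias is large where the system has already been and close to zero
# far away, so previously visited states are disfavored.
print(bias_potential(0.05, visited))
print(bias_potential(1.0, visited))
```

In an actual simulation the hills would be added periodically along the trajectory and the forces derived from this bias would be applied to the atoms; the sketch only evaluates the potential.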
Affiliation(s)
- Helena Hradiská
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Martin Kurečka
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Jan Beránek
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Guglielmo Tedeschi
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
- Vladimír Višňovský
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Aleš Křenek
  - Institute of Computer Science, Masaryk University, Šumavská 416/15, Brno 602 00, Czech Republic
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology Prague, Technická 3, Prague 6 166 28, Czech Republic
2. Vymětal J, Vondrášek J. Iterative Landmark-Based Umbrella Sampling (ILBUS) Protocol for Sampling of Conformational Space of Biomolecules. J Chem Inf Model 2022; 62:4783-4798. [PMID: 36122323 DOI: 10.1021/acs.jcim.2c00370] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Computer simulations of biomolecules such as molecular dynamics often suffer from insufficient sampling. Due to limited computational resources, insufficient sampling prevents obtaining proper equilibrium distributions of observed properties. To deal with this problem, we proposed a simulation protocol for efficient resampling of collected off-equilibrium trajectories. These trajectories are utilized for the initial mapping of the conformational space, which is later properly resampled by the introduced Iterative Landmark-Based Umbrella Sampling (ILBUS) method. Reconstruction of static equilibrium properties is achieved by the multistate Bennett acceptance ratio (MBAR) method, which enables efficient use of simulated data. The ILBUS protocol is geometry-based and does not demand any additional collective variable or a dimensional-reduction technique. The only requirement is a set of suitably spaced reference conformations, which serve as landmarks in the mapped conformational space. Additionally, the ILBUS protocol encompasses an iterative process that optimizes the force constant used in the umbrella sampling simulation. Such tuning is an inherent feature of the protocol and does not need to be performed by the user in advance. Furthermore, even the simulations with suboptimal force constants can be used in estimates by MBAR. We demonstrate the feasibility and the performance of this approach in the study of the conformational landscape of the alanine dipeptide, met-enkephalin, and adenylate kinase.
Affiliation(s)
- Jiří Vymětal
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 542/2, 160 00 Praha 6, Czech Republic
- Jiří Vondrášek
  - Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Flemingovo náměstí 542/2, 160 00 Praha 6, Czech Republic
3. Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022]
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein-ligand binding, including allosteric effects, protein-protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- John Patterson
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Tyler Grear
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Theodore Frater
  - Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Donald J. Jacobs
  - Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
4. Spiwok V, Kurečka M, Křenek A. Collective Variable for Metadynamics Derived From AlphaFold Output. Front Mol Biosci 2022; 9:878133. [PMID: 35769910 PMCID: PMC9234394 DOI: 10.3389/fmolb.2022.878133] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/17/2022] [Accepted: 05/05/2022] [Indexed: 11/13/2022]
Abstract
AlphaFold is a neural network–based tool for the prediction of 3D structures of proteins. In CASP14, a blind structure prediction challenge, it performed significantly better than other competitors, making it the best available structure prediction tool. One of the outputs of AlphaFold is the probability profile of residue–residue distances. This makes it possible to score any conformation of the studied protein to express its compliance with the AlphaFold model. Here, we show how this score can be used to drive protein folding simulation by metadynamics and parallel tempering metadynamics. Using parallel tempering metadynamics, we simulated the folding of a mini-protein Trp-cage and β hairpin and predicted their folding equilibria. We observe the potential of the AlphaFold-based collective variable in applications beyond structure prediction, such as in structure refinement or prediction of the outcome of a mutation.
Affiliation(s)
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, Faculty of Food and Biochemical Technology, University of Chemistry and Technology, Prague, Czechia
  - *Correspondence: Vojtěch Spiwok,
- Martin Kurečka
  - Institute of Computer Science, Masaryk University, Brno, Czechia
- Aleš Křenek
  - Institute of Computer Science, Masaryk University, Brno, Czechia
5. Trapl D, Krupička M, Višňovský V, Hozzová J, Ol'ha J, Křenek A, Spiwok V. Property Map Collective Variable as a Useful Tool for a Force Field Correction. J Chem Inf Model 2022; 62:567-576. [PMID: 35112877 DOI: 10.1021/acs.jcim.1c00651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The accuracy of biomolecular simulations depends on the accuracy of an empirical molecular mechanics potential known as a force field: a set of parameters and expressions to estimate the potential from atomic coordinates. Accurate parametrization of force fields for small organic molecules is a challenge due to their high diversity. One of the possible approaches is to apply a correction to the existing force fields. Here, we propose an approach to estimate the density functional theory (DFT)-derived force field correction which is calculated during the run of molecular dynamics without significantly affecting its speed. Using the formula known as a property map collective variable, we approximate the force field correction by a weighted average of this force field correction calculated only for a small series of reference structures. To validate this method, we used seven AMBER force fields, and we show how it is possible to convert one force field to behave like the other one. We also present the force field correction for the important anticancer drug Imatinib as a use case example. Our method appears to be suitable for adjusting the force field for general drug-like molecules. We provide a pipeline that generates the correction; this pipeline is available at https://pmcvff-correction.cerit-sc.cz/.
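The weighted average over reference structures described in this abstract can be sketched as follows. This is a minimal illustration that assumes exponential weights exp(-λ·d²) and a squared Euclidean distance between coordinate tuples as a stand-in for the structural distance. The function name, the parameter `lam`, and the toy data are hypothetical, and the sketch is not the authors' pipeline.

```python
import math


def property_map(x, references, values, lam=1.0):
    """Weighted average of per-reference values, weighted by closeness to x.

    x          -- coordinates of the current structure (tuple of floats)
    references -- coordinates of the reference structures
    values     -- precomputed property (e.g. an energy correction) per reference
    lam        -- hypothetical sharpness parameter of the weights
    """
    weights = []
    for ref in references:
        # Squared Euclidean distance, used here as a stand-in for a
        # structural distance such as a mean square deviation.
        d2 = sum((a - b) ** 2 for a, b in zip(x, ref))
        weights.append(math.exp(-lam * d2))
    total = sum(weights)  # can underflow to zero if x is far from all references
    return sum(w * v for w, v in zip(weights, values)) / total


# Two hypothetical reference structures with known corrections.
refs = [(0.0,), (2.0,)]
corrections = [1.0, 3.0]

print(property_map((1.0,), refs, corrections))            # midway between the references
print(property_map((0.0,), refs, corrections, lam=50.0))  # dominated by the first reference
```

Because the expensive property is evaluated only for the reference structures, the per-step cost during a simulation reduces to distance calculations and a weighted sum.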
Affiliation(s)
- Dalibor Trapl
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology, Technická 5, Prague 6 166 28, Czech Republic
- Martin Krupička
  - Department of Organic Chemistry, University of Chemistry and Technology, Technická 5, Prague 6 166 28, Czech Republic
- Vladimír Višňovský
  - Institute of Computer Science, Masaryk University, Botanická 554/68a, Brno 602 00, Czech Republic
- Jana Hozzová
  - Institute of Computer Science, Masaryk University, Botanická 554/68a, Brno 602 00, Czech Republic
- Jaroslav Ol'ha
  - Institute of Computer Science, Masaryk University, Botanická 554/68a, Brno 602 00, Czech Republic
- Aleš Křenek
  - Institute of Computer Science, Masaryk University, Botanická 554/68a, Brno 602 00, Czech Republic
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology, Technická 5, Prague 6 166 28, Czech Republic
6. Glielmo A, Husic BE, Rodriguez A, Clementi C, Noé F, Laio A. Unsupervised Learning Methods for Molecular Simulation Data. Chem Rev 2021; 121:9722-9758. [PMID: 33945269 PMCID: PMC8391792 DOI: 10.1021/acs.chemrev.0c01195] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 11/05/2020] [Indexed: 12/21/2022]
Abstract
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations, in material science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms of dimensionality reduction, density estimation, and clustering, and kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
Affiliation(s)
- Aldo Glielmo
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
- Brooke E. Husic
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Alex Rodriguez
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
- Cecilia Clementi
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Frank Noé
  - Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
  - Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
  - Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
- Alessandro Laio
  - International School for Advanced Studies (SISSA) 34014 Trieste, Italy
  - International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
7. Musil F, Grisafi A, Bartók AP, Ortner C, Csányi G, Ceriotti M. Physics-Inspired Structural Representations for Molecules and Materials. Chem Rev 2021; 121:9759-9815. [PMID: 34310133 DOI: 10.1021/acs.chemrev.1c00021] [Citation(s) in RCA: 145] [Impact Index Per Article: 48.3] [Indexed: 12/11/2022]
Abstract
The first step in the construction of a regression model or a data-driven analysis, aiming to predict or elucidate the relationship between the atomic-scale structure of matter and its properties, involves transforming the Cartesian coordinates of the atoms into a suitable representation. The development of atomic-scale representations has played, and continues to play, a central role in the success of machine-learning methods for chemistry and materials science. This review summarizes the current understanding of the nature and characteristics of the most commonly used structural and chemical descriptions of atomistic structures, highlighting the deep underlying connections between different frameworks and the ideas that lead to computationally efficient and universally applicable models. It emphasizes the link between properties, structures, their physical chemistry, and their mathematical description, provides examples of recent applications to a diverse set of chemical and materials science problems, and outlines the open questions and the most promising research directions in the field.
Affiliation(s)
- Felix Musil
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Andrea Grisafi
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Albert P Bartók
  - Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Christoph Ortner
  - University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
8. Bernetti M, Bertazzo M, Masetti M. Data-Driven Molecular Dynamics: A Multifaceted Challenge. Pharmaceuticals (Basel) 2020; 13:E253. [PMID: 32961909 PMCID: PMC7557855 DOI: 10.3390/ph13090253] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/25/2020] [Revised: 09/14/2020] [Accepted: 09/16/2020] [Indexed: 12/18/2022]
Abstract
The big data concept is currently revolutionizing several fields of science including drug discovery and development. While opening up new perspectives for better drug design and related strategies, big data analysis strongly challenges our current ability to manage and exploit an extraordinarily large and possibly diverse amount of information. The recent renewal of machine learning (ML)-based algorithms is key in providing the proper framework for addressing this issue. In this respect, the impact on the exploitation of molecular dynamics (MD) simulations, which have recently reached mainstream status in computational drug discovery, can be remarkable. Here, we review the recent progress in the use of ML methods coupled to biomolecular simulations with potentially relevant implications for drug design. Specifically, we show how different ML-based strategies can be applied to the outcome of MD simulations for gaining knowledge and enhancing sampling. Finally, we discuss how intrinsic limitations of MD in accurately modeling biomolecular systems can be alleviated by including information coming from experimental data.
Affiliation(s)
- Mattia Bernetti
  - Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, I-34136 Trieste, Italy
- Martina Bertazzo
  - Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, I-16163 Genova, Italy
- Matteo Masetti
  - Department of Pharmacy and Biotechnology, Alma Mater Studiorum, Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
9. Gkeka P, Stoltz G, Barati Farimani A, Belkacemi Z, Ceriotti M, Chodera JD, Dinner AR, Ferguson AL, Maillet JB, Minoux H, Peter C, Pietrucci F, Silveira A, Tkatchenko A, Trstanova Z, Wiewiora R, Lelièvre T. Machine Learning Force Fields and Coarse-Grained Variables in Molecular Dynamics: Application to Materials and Biological Systems. J Chem Theory Comput 2020; 16:4757-4775. [PMID: 32559068 PMCID: PMC8312194 DOI: 10.1021/acs.jctc.0c00355] [Citation(s) in RCA: 82] [Impact Index Per Article: 20.5] [Indexed: 12/21/2022]
Abstract
Machine learning encompasses tools and algorithms that are now becoming popular in almost all scientific and technological fields. This is true for molecular dynamics as well, where machine learning offers promises of extracting valuable information from the enormous amounts of data generated by simulation of complex systems. We provide here a review of our current understanding of goals, benefits, and limitations of machine learning techniques for computational studies on atomistic systems, focusing on the construction of empirical force fields from ab initio databases and the determination of reaction coordinates for free energy computation and enhanced sampling.
Affiliation(s)
- Paraskevi Gkeka
  - Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
- Gabriel Stoltz
  - CERMICS, Ecole des Ponts, Marne-la-Vallée, France
  - Matherials Project-Team, Inria Paris, 75012 Paris, France
- Zineb Belkacemi
  - Integrated Drug Discovery, Sanofi R&D, 91385 Chilly-Mazarin, France
  - CERMICS, Ecole des Ponts, Marne-la-Vallée, France
- Michele Ceriotti
  - Laboratory of Computational Science and Modelling, Institute of Materials, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Aaron R Dinner
  - Department of Chemistry, The University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, 5640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Hervé Minoux
  - Integrated Drug Discovery, Sanofi R&D, 94403 Vitry-sur-Seine, France
- Fabio Pietrucci
  - UMR CNRS 7590, MNHN, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, Sorbonne Université, 75005 Paris, France
- Ana Silveira
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Zofia Trstanova
  - School of Mathematics, The University of Edinburgh, Edinburgh EH9 3FD, U.K.
- Rafal Wiewiora
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Tony Lelièvre
  - CERMICS, Ecole des Ponts, Marne-la-Vallée, France
  - Matherials Project-Team, Inria Paris, 75012 Paris, France
10. Spiwok V, Kříž P. Time-Lagged t-Distributed Stochastic Neighbor Embedding (t-SNE) of Molecular Simulation Trajectories. Front Mol Biosci 2020; 7:132. [PMID: 32714941 PMCID: PMC7344294 DOI: 10.3389/fmolb.2020.00132] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 03/06/2020] [Accepted: 06/03/2020] [Indexed: 11/30/2022]
Abstract
Molecular simulation trajectories represent high-dimensional data. Such data can be visualized by methods of dimensionality reduction. Non-linear dimensionality reduction methods are likely to be more efficient than linear ones due to the fact that motions of atoms are non-linear. Here we test a popular non-linear t-distributed Stochastic Neighbor Embedding (t-SNE) method on analysis of trajectories of 200 ns alanine dipeptide dynamics and 208 μs Trp-cage folding and unfolding. Furthermore, we introduce a time-lagged variant of t-SNE in order to focus on rarely occurring transitions in the molecular system. This time-lagged t-SNE efficiently separates states according to distance in time. Using this method it is possible to visualize key states of studied systems (e.g., unfolded and folded protein) as well as possible kinetic traps using a two-dimensional plot. Time-lagged t-SNE is a visualization method and other applications, such as clustering and free energy modeling, must be done with caution.
Affiliation(s)
- Vojtěch Spiwok
  - Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Czechia
- Pavel Kříž
  - Department of Mathematics, University of Chemistry and Technology, Prague, Czechia
11. Fabrizio A, Meyer B, Corminboeuf C. Machine learning models of the energy curvature vs particle number for optimal tuning of long-range corrected functionals. J Chem Phys 2020; 152:154103. [DOI: 10.1063/5.0005039] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 01/16/2023]
Affiliation(s)
- Alberto Fabrizio
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Benjamin Meyer
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
12. Bouvier B. Curvature as a Collective Coordinate in Enhanced Sampling Membrane Simulations. J Chem Theory Comput 2019; 15:6551-6561. [DOI: 10.1021/acs.jctc.9b00716] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/26/2022]
Affiliation(s)
- Benjamin Bouvier
  - Laboratoire de Glycochimie, des Antimicrobiens et des Agroressources, CNRS UMR7378/Université de Picardie Jules Verne, 10, rue Baudelocque, 80039 Amiens Cedex, France
13. Tribello GA, Gasparotto P. Using Dimensionality Reduction to Analyze Protein Trajectories. Front Mol Biosci 2019; 6:46. [PMID: 31275943 PMCID: PMC6593086 DOI: 10.3389/fmolb.2019.00046] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 03/15/2019] [Accepted: 05/31/2019] [Indexed: 11/24/2022]
Abstract
In recent years the analysis of molecular dynamics trajectories using dimensionality reduction algorithms has become commonplace. These algorithms seek to find a low-dimensional representation of a trajectory that is, according to a well-defined criterion, optimal. A number of different strategies for generating projections of trajectories have been proposed but little has been done to systematically compare how these various approaches fare when it comes to analysing trajectories for biomolecules in explicit solvent. In the following paper, we have thus analyzed a molecular dynamics trajectory of the C-terminal fragment of the immunoglobulin binding domain B1 of protein G of Streptococcus modeled in explicit solvent using a range of different dimensionality reduction algorithms. We have then tried to systematically compare the projections generated using each of these algorithms by using a clustering algorithm to find the positions and extents of the basins in the high-dimensional energy landscape. We find that no algorithm outshines all the others in terms of the quality of the projection it generates. Instead, all the algorithms do a reasonable job when it comes to building a projection that separates some of the configurations that lie in different basins. Having said that, however, all the algorithms struggle to project the basins because they all have a large intrinsic dimensionality.
Affiliation(s)
- Gareth A Tribello
  - Atomistic Simulation Centre, School of Mathematics and Physics, Queen's University Belfast, Belfast, United Kingdom
- Piero Gasparotto
  - Department of Physics and Astronomy, Thomas Young Centre, University College London, London, United Kingdom
14. Ceriotti M. Unsupervised machine learning in atomistic simulations, between predictions and understanding. J Chem Phys 2019; 150:150901. [PMID: 31005087 DOI: 10.1063/1.5091842] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Indexed: 12/22/2022]
Abstract
Automated analyses of the outcome of a simulation have been an important part of atomistic modeling since the early days, addressing the need of linking the behavior of individual atoms and the collective properties that are usually the final quantity of interest. Methods such as clustering and dimensionality reduction have been used to provide a simplified, coarse-grained representation of the structure and dynamics of complex systems from proteins to nanoparticles. In recent years, the rise of machine learning has led to an even more widespread use of these algorithms in atomistic modeling and to consider different classification and inference techniques as part of a coherent toolbox of data-driven approaches. This perspective briefly reviews some of the unsupervised machine-learning methods (those geared toward classification and coarse-graining of molecular simulations), seen in relation to the fundamental mathematical concepts that underlie all machine-learning techniques. It discusses the importance of using concise yet complete representations of atomic structures as the starting point of the analyses and highlights the risk of introducing preconceived biases when using machine learning to rationalize and understand structure-property relations. Supervised machine-learning techniques that explicitly attempt to predict the properties of a material given its structure are less susceptible to such biases. Current developments in the field suggest that using these two classes of approaches side-by-side and in a fully integrated mode, while keeping in mind the relations between the data analysis framework and the fundamental physical principles, will be key to realizing the full potential of machine learning to help understand the behavior of complex molecules and materials.
Affiliation(s)
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, Institute des Materiaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
15. Trapl D, Horvacanin I, Mareska V, Ozcelik F, Unal G, Spiwok V. Anncolvar: Approximation of Complex Collective Variables by Artificial Neural Networks for Analysis and Biasing of Molecular Simulations. Front Mol Biosci 2019; 6:25. [PMID: 31058167 PMCID: PMC6482212 DOI: 10.3389/fmolb.2019.00025] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/22/2019] [Accepted: 04/01/2019] [Indexed: 11/23/2022]
Abstract
The state of a molecular system can be described in terms of collective variables. These low-dimensional descriptors of molecular structure can be used to monitor the state of the simulation, to calculate free energy profiles or to accelerate rare events by a bias potential or a bias force. Frequent calculation of some complex collective variables may slow down the simulation or analysis of trajectories. Moreover, many collective variables cannot be explicitly calculated for newly sampled structures. In order to address this problem, we developed a new package called anncolvar. This package makes it possible to build and train an artificial neural network model that approximates a collective variable. It can be used to generate an input for the open-source enhanced sampling simulation PLUMED package, so the collective variable can be monitored and biased by methods available in this program. The computational efficiency and the accuracy of anncolvar are demonstrated on selected molecular systems (cyclooctane derivative, Trp-cage miniprotein) and selected collective variables (Isomap, molecular surface area).
Affiliation(s)
- Dalibor Trapl
- Department of Biochemistry and Microbiology, University of Chemistry and Technology in Prague, Prague, Czechia
| | - Izabela Horvacanin
- Department of Biochemistry and Microbiology, University of Chemistry and Technology in Prague, Prague, Czechia; Faculty of Science, University of Zagreb, Zagreb, Croatia
| | - Vaclav Mareska
- Department of Biochemistry and Microbiology, University of Chemistry and Technology in Prague, Prague, Czechia
| | - Furkan Ozcelik
- Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey
| | - Gozde Unal
- Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey
| | - Vojtech Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology in Prague, Prague, Czechia
|
16
|
Abstract
This chapter discusses the way in which dimensionality reduction algorithms such as diffusion maps and sketch-map can be used to analyze molecular dynamics trajectories. The first part discusses how these various algorithms function as well as practical issues such as landmark selection and how these algorithms can be used when the data to be analyzed comes from enhanced sampling trajectories. In the later part a comparison between the results obtained by applying various algorithms to two sets of sample data is performed and discussed. This section is then followed by a summary of how one algorithm in particular, sketch-map, has been applied to a range of problems. The chapter concludes with a discussion of the directions in which we believe this field is currently moving.
|
17
|
Abstract
This chapter discusses how the PLUMED plugin for molecular dynamics can be used to analyze and bias molecular dynamics trajectories. The chapter begins by introducing the notion of a collective variable and by then explaining how the free energy can be computed as a function of one or more collective variables. A number of practical issues mostly around periodic boundary conditions that arise when these types of calculations are performed using PLUMED are then discussed. Later parts of the chapter discuss how PLUMED can be used to perform enhanced sampling simulations that introduce simulation biases or multiple replicas of the system and Monte Carlo exchanges between these replicas. This section is then followed by a discussion on how free-energy surfaces and associated error bars can be extracted from such simulations by using weighted histogram and block averaging techniques.
Affiliation(s)
- Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati, Trieste, Italy.
| | - Gareth A Tribello
- Atomistic Simulation Centre, School of Mathematics and Physics, Queen's University Belfast, Belfast, UK.
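The block averaging mentioned in the abstract of this entry can be illustrated with a minimal Python sketch. It is not taken from PLUMED or from the chapter: the function names `block_average` and `correlated_series` are hypothetical, the AR(1) series is only a toy stand-in for a correlated collective-variable time series, and the reweighting needed for biased simulations is left out.

```python
import math
import random


def block_average(samples, n_blocks):
    """Return (mean, standard error) estimated from contiguous blocks."""
    if n_blocks < 2:
        raise ValueError("at least two blocks are needed for an error estimate")
    block_len = len(samples) // n_blocks
    if block_len == 0:
        raise ValueError("more blocks than samples")
    # Trailing samples that do not fill a whole block are discarded.
    block_means = [
        sum(samples[i * block_len:(i + 1) * block_len]) / block_len
        for i in range(n_blocks)
    ]
    mean = sum(block_means) / n_blocks
    var = sum((m - mean) ** 2 for m in block_means) / (n_blocks - 1)
    return mean, math.sqrt(var / n_blocks)


def correlated_series(n, phi=0.95, seed=0):
    """Toy AR(1) series (assumption: stands in for a correlated CV trace)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


if __name__ == "__main__":
    data = correlated_series(20000)
    # Short blocks underestimate the error; it levels off once blocks
    # are longer than the correlation time of the series.
    for n_blocks in (2000, 200, 20):
        mean, err = block_average(data, n_blocks)
        print(n_blocks, round(mean, 3), round(err, 3))
```

In practice the block length is increased until the error estimate stops growing; that plateau value is the one usually reported.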
|
18
|
Schuetz DA, Bernetti M, Bertazzo M, Musil D, Eggenweiler HM, Recanatini M, Masetti M, Ecker GF, Cavalli A. Predicting Residence Time and Drug Unbinding Pathway through Scaled Molecular Dynamics. J Chem Inf Model 2018; 59:535-549. [DOI: 10.1021/acs.jcim.8b00614] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Indexed: 12/19/2022]
Affiliation(s)
- Doris A. Schuetz
- Department of Pharmaceutical Chemistry, University of Vienna, UZA 2, Althanstrasse 14, 1090 Vienna, Austria
| | - Mattia Bernetti
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
| | - Martina Bertazzo
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
- Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
| | - Djordje Musil
- Discovery Technologies, Merck KGaA, Frankfurter Straße 250, 64293 Darmstadt, Germany
| | | | - Maurizio Recanatini
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
| | - Matteo Masetti
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
| | - Gerhard F. Ecker
- Department of Pharmaceutical Chemistry, University of Vienna, UZA 2, Althanstrasse 14, 1090 Vienna, Austria
| | - Andrea Cavalli
- Department of Pharmacy and Biotechnology, Alma Mater Studiorum—Università di Bologna, via Belmeloro 6, I-40126 Bologna, Italy
- Computational Sciences, Istituto Italiano di Tecnologia, via Morego 30, 16163 Genova, Italy
|
19
|
Xie T, Grossman JC. Hierarchical visualization of materials space with graph convolutional neural networks. J Chem Phys 2018; 149:174111. [DOI: 10.1063/1.5047803] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Indexed: 01/08/2023] Open
Affiliation(s)
- Tian Xie
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Jeffrey C. Grossman
- Department of Materials Science and Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
|
20
|
Chen W, Ferguson AL. Molecular enhanced sampling with autoencoders: On-the-fly collective variable discovery and accelerated free energy landscape exploration. J Comput Chem 2018; 39:2079-2102. [PMID: 30368832 DOI: 10.1002/jcc.25520] [Citation(s) in RCA: 113] [Impact Index Per Article: 18.8] [Received: 11/20/2017] [Accepted: 06/14/2018] [Indexed: 01/08/2023]
Abstract
Macromolecular and biomolecular folding landscapes typically contain high free energy barriers that impede efficient sampling of configurational space by standard molecular dynamics simulation. Biased sampling can artificially drive the simulation along prespecified collective variables (CVs), but success depends critically on the availability of good CVs associated with the important collective dynamical motions. Nonlinear machine learning techniques can identify such CVs but typically do not furnish an explicit relationship with the atomic coordinates necessary to perform biased sampling. In this work, we employ auto-associative artificial neural networks ("autoencoders") to learn nonlinear CVs that are explicit and differentiable functions of the atomic coordinates. Our approach offers substantial speedups in exploration of configurational space, and is distinguished from existing approaches by its capacity to simultaneously discover and directly accelerate along data-driven CVs. We demonstrate the approach in simulations of alanine dipeptide and Trp-cage, and have developed an open-source and freely available implementation within OpenMM. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801
| | - Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois, 61801; Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 W Green Street, Urbana, Illinois, 61801; Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, 600 South Mathews Avenue, Urbana, Illinois, 61801
|
21
|
Chen W, Tan AR, Ferguson AL. Collective variable discovery and enhanced sampling using autoencoders: Innovations in network architecture and error function design. J Chem Phys 2018; 149:072312. [PMID: 30134681 DOI: 10.1063/1.5023804] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 01/01/2023] Open
Abstract
Auto-associative neural networks ("autoencoders") present a powerful nonlinear dimensionality reduction technique to mine data-driven collective variables from molecular simulation trajectories. This technique furnishes explicit and differentiable expressions for the nonlinear collective variables, making it ideally suited for integration with enhanced sampling techniques for accelerated exploration of configurational space. In this work, we describe a number of sophistications of the neural network architectures to improve and generalize the process of interleaved collective variable discovery and enhanced sampling. We employ circular network nodes to accommodate periodicities in the collective variables, hierarchical network architectures to rank-order the collective variables, and generalized encoder-decoder architectures to support bespoke error functions for network training to incorporate prior knowledge. We demonstrate our approach in blind collective variable discovery and enhanced sampling of the configurational free energy landscapes of alanine dipeptide and Trp-cage using an open-source plugin developed for the OpenMM molecular simulation package.
Affiliation(s)
- Wei Chen
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
| | - Aik Rui Tan
- Department of Materials Science and Engineering, University of Illinois at Urbana-Champaign, 1304 West Green Street, Urbana, Illinois 61801, USA
| | - Andrew L Ferguson
- Department of Physics, University of Illinois at Urbana-Champaign, 1110 West Green Street, Urbana, Illinois 61801, USA
|
22
|
Pazúriková J, Křenek A, Spiwok V, Šimková M. Reducing the number of mean-square deviation calculations with floating close structure in metadynamics. J Chem Phys 2018; 146:115101. [PMID: 28330370 DOI: 10.1063/1.4978296] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 01/31/2023] Open
Abstract
Metadynamics is an important collective-coordinate-based enhanced sampling simulation method. Its performance depends significantly on the capability of collective coordinates to describe the studied molecular processes. Collective coordinates based on comparison with reference landmark structures can be used to enhance sampling in highly complex systems; however, they may slow down simulations due to the high number of structure-structure distance (e.g., mean-square deviation) calculations. Here we introduce an approximation of root-mean-square or mean-square deviation that significantly reduces the number of computationally expensive operations. We evaluate its accuracy and theoretical performance gain with metadynamics simulations on two molecular systems.
Affiliation(s)
- Jana Pazúriková
- Institute of Computer Science, Masaryk University, Brno, Czech Republic
| | - Aleš Křenek
- Institute of Computer Science, Masaryk University, Brno, Czech Republic
| | - Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Czech Republic
| | - Mária Šimková
- Institute of Computer Science, Masaryk University, Brno, Czech Republic
|
23
|
Wang J, Ferguson AL. Nonlinear machine learning in simulations of soft and biological materials. Molecular Simulation 2017. [DOI: 10.1080/08927022.2017.1400164] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 12/27/2022]
Affiliation(s)
- J. Wang
- Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
| | - A. L. Ferguson
- Department of Physics, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Department of Materials Science and Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL, USA
|
24
|
Galvelis R, Sugita Y. Neural Network and Nearest Neighbor Algorithms for Enhancing Sampling of Molecular Dynamics. J Chem Theory Comput 2017; 13:2489-2500. [PMID: 28437616 DOI: 10.1021/acs.jctc.7b00188] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Indexed: 12/20/2022]
Abstract
The free energy calculations of complex chemical and biological systems with molecular dynamics (MD) are inefficient due to multiple local minima separated by high-energy barriers. The minima can be escaped using an enhanced sampling method such as metadynamics, which applies a bias (i.e., importance sampling) along a set of collective variables (CVs), but the maximum number of CVs (or dimensions) is severely limited. We propose a high-dimensional bias potential method (NN2B) based on two machine learning algorithms: the nearest neighbor density estimator (NNDE) and the artificial neural network (ANN) for the bias potential approximation. The bias potential is constructed iteratively from short biased MD simulations accounting for correlation among CVs. Our method is capable of achieving ergodic sampling and calculating free energy of polypeptides with up to 8-dimensional bias potential.
Affiliation(s)
- Raimondas Galvelis
- RIKEN Theoretical Molecular Science Laboratory, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
| | - Yuji Sugita
- RIKEN Theoretical Molecular Science Laboratory, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; RIKEN Advanced Institute for Computational Science, Integrated Innovation Building 7F, 6-7-1 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; RIKEN iTHES, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; RIKEN Quantitative Biology Center, Integrated Innovation Building 7F, 6-7-1 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
|
25
|
Hashemian B, Millán D, Arroyo M. Charting molecular free-energy landscapes with an atlas of collective variables. J Chem Phys 2016; 145:174109. [DOI: 10.1063/1.4966262] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022] Open
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Daniel Millán
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Marino Arroyo
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
|
26
|
Bonomi M, Camilloni C, Vendruscolo M. Metadynamic metainference: Enhanced sampling of the metainference ensemble using metadynamics. Sci Rep 2016; 6:31232. [PMID: 27561930 PMCID: PMC4999896 DOI: 10.1038/srep31232] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Received: 03/01/2016] [Accepted: 07/11/2016] [Indexed: 01/23/2023] Open
Abstract
Accurate and precise structural ensembles of proteins and macromolecular complexes can be obtained with metainference, a recently proposed Bayesian inference method that integrates experimental information with prior knowledge and deals with all sources of errors in the data as well as with sample heterogeneity. The study of complex macromolecular systems, however, requires extensive conformational sampling, which represents a separate challenge. To address this challenge and to exhaustively and efficiently generate structural ensembles, we combine metainference with metadynamics and illustrate its application to the calculation of the free energy landscape of the alanine dipeptide.
Affiliation(s)
- Massimiliano Bonomi
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Carlo Camilloni
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- Department of Chemistry and Institute for Advanced Study, Technische Universität München, Lichtenbergstrasse 4, D-85747 Garching, Germany
| | - Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
|
27
|
Spiwok V, Oborský P, Pazúriková J, Křenek A, Králová B. Nonlinear vs. linear biasing in Trp-cage folding simulations. J Chem Phys 2015; 142:115101. [PMID: 25796266 DOI: 10.1063/1.4914828] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
Biased simulations have great potential for the study of slow processes, including protein folding. Atomic motions in molecules are nonlinear, which suggests that simulations with enhanced sampling of collective motions traced by nonlinear dimensionality reduction methods may perform better than linear ones. In this study, we compare an unbiased folding simulation of the Trp-cage miniprotein with metadynamics simulations using both linear (principal component analysis) and nonlinear (Isomap) low-dimensional embeddings as collective variables. Folding of the miniprotein was successfully simulated in a 200 ns simulation with both linear and nonlinear motion biasing. The folded state was correctly predicted as the free energy minimum in both simulations. We found that the advantage of linear motion biasing is that it can sample a larger conformational space, whereas the advantage of nonlinear motion biasing lies in slightly better resolution of the resulting free energy surface. In terms of sampling efficiency, both methods are comparable.
Affiliation(s)
- Vojtěch Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic
| | - Pavel Oborský
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic
| | - Jana Pazúriková
- Institute of Computer Science, Masaryk University, Botanická 554/68a, 602 00 Brno, Czech Republic
| | - Aleš Křenek
- Institute of Computer Science, Masaryk University, Botanická 554/68a, 602 00 Brno, Czech Republic
| | - Blanka Králová
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic
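To make the idea of a linear embedding used as a collective variable more concrete, here is a minimal Python sketch of principal component analysis by power iteration. It is an illustration under stated assumptions, not the procedure of the study: the helper names (`center`, `covariance`, `leading_component`, `linear_cv`) are hypothetical, the input is a small list of coordinate vectors rather than aligned Trp-cage frames, and the nonlinear (Isomap) counterpart is not shown.

```python
import math


def center(data):
    """Return the mean vector and the mean-centered copy of the data."""
    n, dim = len(data), len(data[0])
    mean = [sum(row[k] for row in data) / n for k in range(dim)]
    return mean, [[row[k] - mean[k] for k in range(dim)] for row in data]


def covariance(centered):
    """Sample covariance matrix of mean-centered data."""
    n, dim = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]


def leading_component(cov, n_iter=200):
    """Power iteration for the eigenvector with the largest eigenvalue."""
    dim = len(cov)
    # Asymmetric start vector, so it is unlikely to be orthogonal
    # to the leading eigenvector.
    vec = [1.0 / (k + 1) for k in range(dim)]
    for _ in range(n_iter):
        new = [sum(cov[i][j] * vec[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    return vec


def linear_cv(frame, mean, component):
    """Linear collective variable: projection onto the principal component."""
    return sum((x - m) * c for x, m, c in zip(frame, mean, component))


if __name__ == "__main__":
    frames = [[float(t), float(t)] for t in (-2, -1, 0, 1, 2)]
    mean, centered = center(frames)
    pc1 = leading_component(covariance(centered))
    print(pc1, linear_cv([1.0, 1.0], mean, pc1))
```

Because the projection is a linear function of the coordinates, its gradient is simply the component vector, which is what makes such a variable straightforward to bias.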
|
28
|
Pfaendtner J, Bonomi M. Efficient Sampling of High-Dimensional Free-Energy Landscapes with Parallel Bias Metadynamics. J Chem Theory Comput 2015; 11:5062-7. [DOI: 10.1021/acs.jctc.5b00846] [Citation(s) in RCA: 138] [Impact Index Per Article: 15.3] [Indexed: 01/30/2023]
Affiliation(s)
- Jim Pfaendtner
- Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
| | - Massimiliano Bonomi
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
29
|
Bussi G, Branduardi D. Free-Energy Calculations with Metadynamics: Theory and Practice. Reviews in Computational Chemistry 2015. [DOI: 10.1002/9781118889886.ch1] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 12/28/2022]
|
30
|
Hashemian B, Arroyo M. Topological obstructions in the way of data-driven collective variables. J Chem Phys 2015; 142:044102. [PMID: 25637964 DOI: 10.1063/1.4906425] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 01/11/2023] Open
Abstract
Nonlinear dimensionality reduction (NLDR) techniques are increasingly used to visualize molecular trajectories and to create data-driven collective variables for enhanced sampling simulations. The success of these methods relies on their ability to identify the essential degrees of freedom characterizing conformational changes. Here, we show that NLDR methods face serious obstacles when the underlying collective variables present periodicities, e.g., arising from proper dihedral angles. As a result, NLDR methods collapse very distant configurations, thus leading to misinterpretations and inefficiencies in enhanced sampling. Here, we identify this largely overlooked problem and discuss possible approaches to overcome it. We also characterize the geometry and topology of conformational changes of alanine dipeptide, a benchmark system for testing new methods to identify collective variables.
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
| | - Marino Arroyo
- LaCàN, Universitat Politècnica de Catalunya–BarcelonaTech, Barcelona, Spain
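The periodicity issue raised in the abstract of this entry can be seen in a very small Python example. It only illustrates why raw dihedral angles mislead distance-based methods and shows one common workaround on the input side (a cosine/sine encoding); it is an assumption of this sketch, not a reproduction of the remedies discussed in the paper, and all function names are hypothetical.

```python
import math


def raw_distance(phi_a, phi_b):
    """Naive distance that ignores the periodicity of the angle (radians)."""
    return abs(phi_a - phi_b)


def periodic_distance(phi_a, phi_b):
    """Shortest angular separation on the circle (radians)."""
    d = (phi_a - phi_b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def embed(phi):
    """Map an angle to a point on the unit circle."""
    return math.cos(phi), math.sin(phi)


def embedded_distance(phi_a, phi_b):
    """Euclidean (chord) distance between the two embedded angles."""
    (ca, sa), (cb, sb) = embed(phi_a), embed(phi_b)
    return math.hypot(ca - cb, sa - sb)


if __name__ == "__main__":
    a, b = math.radians(179.0), math.radians(-179.0)
    # The two conformations differ by 2 degrees, yet the raw difference
    # is close to 2*pi.
    print(raw_distance(a, b), periodic_distance(a, b), embedded_distance(a, b))
```

The encoding doubles the number of input features per angle, and it does not by itself remove the topological obstruction to mapping a periodic space onto a few non-periodic coordinates.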
|
31
|
Barducci A, Pfaendtner J, Bonomi M. Tackling sampling challenges in biomolecular simulations. Methods Mol Biol 2015; 1215:151-71. [PMID: 25330963 DOI: 10.1007/978-1-4939-1465-4_8] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/13/2022]
Abstract
Molecular dynamics (MD) simulations are a powerful tool to give an atomistic insight into the structure and dynamics of proteins. However, the time scales accessible in standard simulations, which often do not match those in which interesting biological processes occur, limit their predictive capabilities. Many advanced sampling techniques have been proposed over the years to overcome this limitation. This chapter focuses on metadynamics, a method based on the introduction of a time-dependent bias potential to accelerate sampling and recover equilibrium properties of a few descriptors that are able to capture the complexity of a process at a coarse-grained level. The theory of metadynamics and its combination with other popular sampling techniques such as the replica exchange method is briefly presented. Practical applications of these techniques to the study of the Trp-Cage miniprotein folding are also illustrated. The examples contain a guide for performing these calculations with PLUMED, a plugin to perform enhanced sampling simulations in combination with many popular MD codes.
Affiliation(s)
- Alessandro Barducci
- Laboratory of Statistical Biophysics, School of Basic Sciences, Ecole Polytechnique Fédérale de Lausanne, 1015, Lausanne, Switzerland
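The time-dependent bias potential described in the abstract of this entry can be sketched on a one-dimensional toy model. The Python code below is a minimal illustration under explicit assumptions, not the PLUMED workflow from the chapter: the double-well potential, the overdamped Langevin integrator, the reduced units, and every parameter value and function name are hypothetical choices made for this example, and the hill height is kept constant (no well-tempered rescaling).

```python
import math
import random

KT = 1.0     # thermal energy in reduced units (assumed)
DT = 0.005   # time step of the overdamped Langevin integrator (assumed)


def potential_force(x):
    """Force from the toy double well V(x) = 5 (x^2 - 1)^2 (barrier of 5 kT)."""
    return -20.0 * x * (x * x - 1.0)


def bias_energy(x, centers, height, width):
    """History-dependent bias: sum of Gaussian hills at visited CV values."""
    return sum(
        height * math.exp(-(x - c) ** 2 / (2.0 * width ** 2)) for c in centers
    )


def bias_force(x, centers, height, width):
    """Minus the derivative of bias_energy with respect to x."""
    return sum(
        height * (x - c) / width ** 2
        * math.exp(-(x - c) ** 2 / (2.0 * width ** 2))
        for c in centers
    )


def run_metadynamics(n_steps=8000, stride=50, height=0.5, width=0.2,
                     x0=-1.0, seed=1):
    """Propagate x and deposit one hill at the current x every `stride` steps."""
    rng = random.Random(seed)
    x, centers, traj = x0, [], []
    noise = math.sqrt(2.0 * KT * DT)
    for step in range(1, n_steps + 1):
        force = potential_force(x) + bias_force(x, centers, height, width)
        x += force * DT + noise * rng.gauss(0.0, 1.0)
        traj.append(x)
        if step % stride == 0:
            centers.append(x)
    return traj, centers


if __name__ == "__main__":
    traj, centers = run_metadynamics()
    print(len(centers), min(traj), max(traj))
```

Here the collective variable is the coordinate x itself; in a molecular simulation it would be a function of the atomic coordinates, and the negative of the accumulated bias would serve as a free energy estimate only after sufficient sampling.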
|
32
|
Ferrarotti MJ, Bottaro S, Pérez-Villa A, Bussi G. Accurate Multiple Time Step in Biased Molecular Simulations. J Chem Theory Comput 2014; 11:139-46. [DOI: 10.1021/ct5007086] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023]
Affiliation(s)
- Marco Jacopo Ferrarotti
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
| | - Sandro Bottaro
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
| | - Andrea Pérez-Villa
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
| | - Giovanni Bussi
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), via Bonomea 265, 34136 Trieste, Italy
|
33
|
Abstract
Voltage sensor domains (VSDs) are membrane-bound protein modules that confer voltage sensitivity to membrane proteins. VSDs sense changes in the transmembrane voltage and convert the electrical signal into a conformational change called activation. Activation involves a reorganization of the membrane protein charges that is detected experimentally as transient currents. These so-called gating currents have been investigated extensively within the theoretical framework of so-called discrete-state Markov models (DMMs), whereby activation is conceptualized as a series of transitions across a discrete set of states. Historically, the interpretation of DMM transition rates in terms of transition state theory has been instrumental in shaping our view of the activation process, whose free-energy profile is currently envisioned as composed of a few local minima separated by steep barriers. Here we use atomistic level modeling and well-tempered metadynamics to calculate the configurational free energy along a single transition from first principles. We show that this transition is intrinsically multidimensional and described by a rough free-energy landscape. Remarkably, a coarse-grained description of the system, based on the use of the gating charge as reaction coordinate, reveals a smooth profile with a single barrier, consistent with phenomenological models. Our results bridge the gap between microscopic and macroscopic descriptions of activation dynamics and show that choosing the gating charge as reaction coordinate masks the topological complexity of the network of microstates participating in the transition. Importantly, full characterization of the latter is a prerequisite to rationalize modulation of this process by lipids, toxins, drugs, and genetic mutations.
|
34
|
Spiwok V, Sucur Z, Hosek P. Enhanced sampling techniques in biomolecular simulations. Biotechnol Adv 2014; 33:1130-40. [PMID: 25482668 DOI: 10.1016/j.biotechadv.2014.11.011] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 08/31/2014] [Revised: 11/21/2014] [Accepted: 11/24/2014] [Indexed: 02/01/2023]
Abstract
Biomolecular simulations are routinely used in biochemistry and molecular biology research; however, they often fail to match expectations of their impact on pharmaceutical and biotech industry. This is caused by the fact that a vast amount of computer time is required to simulate short episodes from the life of biomolecules. Several approaches have been developed to overcome this obstacle, including application of massively parallel and special purpose computers or non-conventional hardware. Methodological approaches are represented by coarse-grained models and enhanced sampling techniques. These techniques can show how the studied system behaves in long time-scales on the basis of relatively short simulations. This review presents an overview of new simulation approaches, the theory behind enhanced sampling methods and success stories of their applications with a direct impact on biotechnology or drug design.
Affiliation(s)
- Vojtech Spiwok
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic.
| | - Zoran Sucur
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic
| | - Petr Hosek
- Department of Biochemistry and Microbiology, University of Chemistry and Technology, Prague, Technická 3, Prague 6 166 28, Czech Republic
|
35
|
Hashemian B, Millán D, Arroyo M. Modeling and enhanced sampling of molecular systems with smooth and nonlinear data-driven collective variables. J Chem Phys 2014; 139:214101. [PMID: 24320358 DOI: 10.1063/1.4830403] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 11/14/2022] Open
Abstract
Collective variables (CVs) are low-dimensional representations of the state of a complex system, which help us rationalize molecular conformations and sample free energy landscapes with molecular dynamics simulations. Given their importance, there is need for systematic methods that effectively identify CVs for complex systems. In recent years, nonlinear manifold learning has shown its ability to automatically characterize molecular collective behavior. Unfortunately, these methods fail to provide a differentiable function mapping high-dimensional configurations to their low-dimensional representation, as required in enhanced sampling methods. We introduce a methodology that, starting from an ensemble representative of molecular flexibility, builds smooth and nonlinear data-driven collective variables (SandCV) from the output of nonlinear manifold learning algorithms. We demonstrate the method with a standard benchmark molecule, alanine dipeptide, and show how it can be non-intrusively combined with off-the-shelf enhanced sampling methods, here the adaptive biasing force method. We illustrate how enhanced sampling simulations with SandCV can explore regions that were poorly sampled in the original molecular ensemble. We further explore the transferability of SandCV from a simpler system, alanine dipeptide in vacuum, to a more complex system, alanine dipeptide in explicit water.
Affiliation(s)
- Behrooz Hashemian
- LaCàN, Universitat Politècnica de Catalunya - BarcelonaTech, Campus Nord, 08034 Barcelona, Spain
|
36
|
Enhanced Sampling in Molecular Dynamics Using Metadynamics, Replica-Exchange, and Temperature-Acceleration. Entropy 2013. [DOI: 10.3390/e16010163] [Citation(s) in RCA: 291] [Impact Index Per Article: 26.5] [Indexed: 12/20/2022]
|
37
|
Duan M, Fan J, Li M, Han L, Huo S. Evaluation of Dimensionality-reduction Methods from Peptide Folding-unfolding Simulations. J Chem Theory Comput 2013; 9:2490-2497. [PMID: 23772182 DOI: 10.1021/ct400052y] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Abstract
Dimensionality reduction methods have been widely used to study the free energy landscapes and low-free energy pathways of molecular systems. It was shown that the non-linear dimensionality-reduction methods gave better embedding results than the linear methods, such as principal component analysis, in some simple systems. In this study, we have evaluated several non-linear methods (locally linear embedding, Isomap, and diffusion maps) as well as principal component analysis on the equilibrium folding/unfolding trajectory of the second β-hairpin of the B1 domain of streptococcal protein G. The CHARMM parm19 polar hydrogen potential function was used. A series of criteria that reflect different aspects of the embedding quality were employed in the evaluation. Our results show that principal component analysis is not worse than the non-linear methods on this complex system. There is no clear winner in all aspects of the evaluation. Each dimensionality-reduction method has its limitations in a certain aspect. We emphasize that a fair, informative assessment of an embedding result requires a combination of multiple evaluation criteria rather than any single one. Caution should be used when dimensionality-reduction methods are employed, especially when only a few of the top embedding dimensions are used to describe the free energy landscape.
Affiliation(s)
- Mojie Duan
- Gustaf H. Carlson School of Chemistry and Biochemistry, Clark University, Worcester, MA 01610 USA
|
38
|
Ceriotti M, Tribello GA, Parrinello M. Demonstrating the Transferability and the Descriptive Power of Sketch-Map. J Chem Theory Comput 2013; 9:1521-32. [DOI: 10.1021/ct3010563] [Citation(s) in RCA: 92] [Impact Index Per Article: 8.4] [Indexed: 01/28/2023]
Affiliation(s)
- Michele Ceriotti
- Physical and Theoretical Chemistry Laboratory, University of Oxford, South Parks Road, Oxford OX1 3QZ, United Kingdom
- Gareth A. Tribello
- Computational Science, Department of Chemistry and Applied Biosciences, ETH Zurich and Facoltà di Informatica, Istituto di Scienze Computazionali, Università della Svizzera Italiana, Via Giuseppe Buffi 13, CH-6900, Lugano, Switzerland
- Michele Parrinello
- Computational Science, Department of Chemistry and Applied Biosciences, ETH Zurich and Facoltà di Informatica, Istituto di Scienze Computazionali, Università della Svizzera Italiana, Via Giuseppe Buffi 13, CH-6900, Lugano, Switzerland
|
39
|
Sutto L, Marsili S, Gervasio FL. New advances in metadynamics. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1103] [Citation(s) in RCA: 94] [Impact Index Per Article: 7.8] [Indexed: 01/11/2023]
|
40
|
Using sketch-map coordinates to analyze and bias molecular dynamics simulations. Proc Natl Acad Sci U S A 2012; 109:5196-201. [PMID: 22427357 DOI: 10.1073/pnas.1201152109] [Citation(s) in RCA: 101] [Impact Index Per Article: 8.4] [Indexed: 11/18/2022]
Abstract
When examining complex problems, such as the folding of proteins, coarse grained descriptions of the system drive our investigation and help us to rationalize the results. Oftentimes collective variables (CVs), derived through some chemical intuition about the process of interest, serve this purpose. Because finding these CVs is the most difficult part of any investigation, we recently developed a dimensionality reduction algorithm, sketch-map, that can be used to build a low-dimensional map of a phase space of high-dimensionality. In this paper we discuss how these machine-generated CVs can be used to accelerate the exploration of phase space and to reconstruct free-energy landscapes. To do so, we develop a formalism in which high-dimensional configurations are no longer represented by low-dimensional position vectors. Instead, for each configuration we calculate a probability distribution, which has a domain that encompasses the entirety of the low-dimensional space. To construct a biasing potential, we exploit an analogy with metadynamics and use the trajectory to adaptively construct a repulsive, history-dependent bias from the distributions that correspond to the previously visited configurations. This potential forces the system to explore more of phase space by making it desirable to adopt configurations whose distributions do not overlap with the bias. We apply this algorithm to a small model protein and succeed in reproducing the free-energy surface that we obtain from a parallel tempering calculation.
|
41
|
Thomas PS, Somers MF, Hoekstra AW, Kroes GJ. Chebyshev high-dimensional model representation (Chebyshev-HDMR) potentials: application to reactive scattering of H2 from Pt(111) and Cu(111) surfaces. Phys Chem Chem Phys 2012; 14:8628-43. [DOI: 10.1039/c2cp40173h] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/21/2022]
|