51
|
Li Z, Meidani K, Yadav P, Barati Farimani A. Graph neural networks accelerated molecular dynamics. J Chem Phys 2022; 156:144103. [DOI: 10.1063/5.0083060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023] Open
Abstract
Molecular Dynamics (MD) simulation is a powerful tool for understanding the dynamics and structure of matter. Since the resolution of MD is atomic-scale, achieving long timescale simulations with femtosecond integration is very expensive. In each MD step, numerous iterative computations are performed to calculate energy based on different types of interaction and their corresponding spatial gradients. These repetitive computations can be learned and surrogated by a deep learning model, such as a Graph Neural Network (GNN). In this work, we developed a GNN Accelerated MD (GAMD) model that directly predicts forces, given the state of the system (atom positions, atom types), bypassing the evaluation of potential energy. By training the GNN on a variety of data sources (simulation data derived from classical MD and density functional theory), we show that GAMD can predict the dynamics of two typical molecular systems, Lennard-Jones system and water system, in the NVT ensemble with velocities regulated by a thermostat. We further show that GAMD’s learning and inference are agnostic to the scale, where it can scale to much larger systems at test time. We also perform a comprehensive benchmark test comparing our implementation of GAMD to production-level MD software, showing GAMD’s competitive performance on the large-scale simulation.
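The idea of predicting forces directly can be illustrated with a minimal sketch in which the integrator only ever calls a force function and never evaluates a potential energy. This is an illustration under stated assumptions, not the GAMD implementation: `velocity_verlet` and `harmonic_force` are hypothetical names, the thermostat used for the NVT ensemble is omitted, and a simple analytic force stands in for the trained GNN.

```python
def harmonic_force(pos):
    # Hypothetical stand-in for a learned force predictor: F = -x per coordinate.
    return [-x for x in pos]


def velocity_verlet(pos, vel, force_fn, mass, dt, n_steps):
    """Advance positions and velocities using only a force callback.

    No potential energy is evaluated anywhere; any callable that maps
    positions to forces (analytic or learned) can be passed as force_fn.
    """
    pos = list(pos)
    vel = list(vel)
    forces = force_fn(pos)
    for _ in range(n_steps):
        # Half-step velocity update with the current forces.
        vel = [v + 0.5 * dt * f / mass for v, f in zip(vel, forces)]
        # Full-step position update.
        pos = [x + dt * v for x, v in zip(pos, vel)]
        # Recompute forces at the new positions, then finish the velocity update.
        forces = force_fn(pos)
        vel = [v + 0.5 * dt * f / mass for v, f in zip(vel, forces)]
    return pos, vel
```

Because the integrator depends only on the `force_fn` callback, swapping the analytic force for a model that maps atom positions and types to forces would not require changing the loop.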
Affiliation(s)
- Zijie Li
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Kazem Meidani
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Prakarsh Yadav
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Amir Barati Farimani
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| |
|
52
|
Drug repurposing in silico screening platforms. Biochem Soc Trans 2022; 50:747-758. [PMID: 35285479 DOI: 10.1042/bst20200967] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 12/09/2021] [Revised: 02/08/2022] [Accepted: 02/21/2022] [Indexed: 12/15/2022]
Abstract
Over the last decade, for the first time, substantial efforts have been directed at the development of dedicated in silico platforms for drug repurposing, including initiatives targeting cancers and conditions as diverse as cryptosporidiosis, dengue, dental caries, diabetes, herpes, lupus, malaria, tuberculosis and Covid-19 related respiratory disease. This review outlines some of the exciting advances in the specific applications of in silico approaches to the challenge of drug repurposing and focuses particularly on where these efforts have resulted in the development of generic platform technologies of broad value to researchers involved in programmatic drug repurposing work. Recent advances in molecular docking methodologies and validation approaches, and their combination with machine learning or deep learning approaches are continually enhancing the precision of repurposing efforts. The meaningful integration of better understanding of molecular mechanisms with molecular pathway data and knowledge of disease networks is widening the scope for discovery of repurposing opportunities. The power of Artificial Intelligence is being gainfully exploited to advance progress in an integrated science that extends from the sub-atomic to the whole system level. There are many promising emerging developments but there are remaining challenges to be overcome in the successful integration of the new advances in useful platforms. In conclusion, the essential component requirements for development of powerful and well optimised drug repurposing screening platforms are discussed.
|
53
|
Mayr F, Harth M, Kouroudis I, Rinderle M, Gagliardi A. Machine Learning and Optoelectronic Materials Discovery: A Growing Synergy. J Phys Chem Lett 2022; 13:1940-1951. [PMID: 35188778 DOI: 10.1021/acs.jpclett.1c04223] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/14/2023]
Abstract
Novel optoelectronic materials have the potential to revolutionize the ongoing green transition by both providing more efficient photovoltaic (PV) devices and lowering energy consumption of devices like LEDs and sensors. The lead candidate materials for these applications are both organic semiconductors and more recently perovskites. This Perspective illustrates how novel machine learning techniques can help explore these materials, from speeding up ab initio calculations toward experimental guidance. Furthermore, based on existing work, perspectives around machine-learned molecular dynamics potentials, physically informed neural networks, and generative methods are outlined.
Affiliation(s)
- Felix Mayr
- Department of Electrical and Computer Engineering, Technical University of Munich, Hans-Piloty-Straße 1, 85748 Garching bei München, Germany
| | - Milan Harth
- Department of Electrical and Computer Engineering, Technical University of Munich, Hans-Piloty-Straße 1, 85748 Garching bei München, Germany
| | - Ioannis Kouroudis
- Department of Electrical and Computer Engineering, Technical University of Munich, Hans-Piloty-Straße 1, 85748 Garching bei München, Germany
| | - Michael Rinderle
- Department of Electrical and Computer Engineering, Technical University of Munich, Hans-Piloty-Straße 1, 85748 Garching bei München, Germany
| | - Alessio Gagliardi
- Department of Electrical and Computer Engineering, Technical University of Munich, Hans-Piloty-Straße 1, 85748 Garching bei München, Germany
- Munich Data Science Institute, Technical University of Munich, Walther-von-Dyck-Straße 10, 85748 Garching bei München, Germany
| |
|
54
|
Gokcan H, Isayev O. Learning molecular potentials with neural networks. WIREs Comput Mol Sci 2022. [DOI: 10.1002/wcms.1564] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/09/2022]
Affiliation(s)
- Hatice Gokcan
- Department of Chemistry, Mellon College of Science Carnegie Mellon University Pittsburgh Pennsylvania USA
| | - Olexandr Isayev
- Department of Chemistry, Mellon College of Science Carnegie Mellon University Pittsburgh Pennsylvania USA
| |
|
55
|
DeLyser MR, Noid WG. Coarse-grained models for local density gradients. J Chem Phys 2022; 156:034106. [DOI: 10.1063/5.0075291] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/31/2023] Open
Affiliation(s)
- Michael R. DeLyser
- Department of Chemistry, Penn State University, University Park, Pennsylvania 16802, USA
| | - W. G. Noid
- Department of Chemistry, Penn State University, University Park, Pennsylvania 16802, USA
| |
|
56
|
Empereur-Mot C, Capelli R, Perrone M, Caruso C, Doni G, Pavan GM. Automatic multi-objective optimization of coarse-grained lipid force fields using SwarmCG. J Chem Phys 2022; 156:024801. [PMID: 35032979 DOI: 10.1063/5.0079044] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 11/14/2022] Open
Abstract
The development of coarse-grained (CG) molecular models typically requires a time-consuming iterative tuning of parameters in order to have the approximated CG models behave correctly and consistently with, e.g., available higher-resolution simulation data and/or experimental observables. Automatic data-driven approaches are increasingly used to develop accurate models for molecular dynamics simulations. However, the parameters obtained via such automatic methods often make use of specifically designed interaction potentials and are typically poorly transferable to molecular systems or conditions other than those used for training them. Using a multi-objective approach in combination with an automatic optimization engine (SwarmCG), here, we show that it is possible to optimize CG models that are also transferable, obtaining optimized CG force fields (FFs). As a proof of concept, here, we use lipids for which we can avail reference experimental data (area per lipid and bilayer thickness) and reliable atomistic simulations to guide the optimization. Once the resolution of the CG models (mapping) is set as an input, SwarmCG optimizes the parameters of the CG lipid models iteratively and simultaneously against higher-resolution simulations (bottom-up) and experimental data (top-down references). Including different types of lipid bilayers in the training set in a parallel optimization guarantees the transferability of the optimized lipid FF parameters. We demonstrate that SwarmCG can reach satisfactory agreement with experimental data for different resolution CG FFs. We also obtain stimulating insights into the precision-resolution balance of the FFs. The approach is general and can be effectively used to develop new FFs and to improve the existing ones.
Affiliation(s)
- Charly Empereur-Mot
- Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Polo Universitario Lugano, Campus Est, Via la Santa 1, 6962 Lugano-Viganello, Switzerland
| | - Riccardo Capelli
- Politecnico di Torino, Department of Applied Science and Technology, Corso Duca degli Abruzzi 24, Torino 10129, Italy
| | - Mattia Perrone
- Politecnico di Torino, Department of Applied Science and Technology, Corso Duca degli Abruzzi 24, Torino 10129, Italy
| | - Cristina Caruso
- Politecnico di Torino, Department of Applied Science and Technology, Corso Duca degli Abruzzi 24, Torino 10129, Italy
| | - Giovanni Doni
- Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Polo Universitario Lugano, Campus Est, Via la Santa 1, 6962 Lugano-Viganello, Switzerland
| | - Giovanni M Pavan
- Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Polo Universitario Lugano, Campus Est, Via la Santa 1, 6962 Lugano-Viganello, Switzerland
| |
|
57
|
Xu P, Mou X, Guo Q, Fu T, Ren H, Wang G, Li Y, Li G. Coarse-grained molecular dynamics study based on TorchMD. Chin J Chem Phys 2021. [DOI: 10.1063/1674-0068/cjcp2110218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Peijun Xu
- Liaoning Normal University, Dalian 116029, China
| | - Xiaohong Mou
- Liaoning Normal University, Dalian 116029, China
| | - Qiuhan Guo
- Liaoning Normal University, Dalian 116029, China
| | - Ting Fu
- Pharmacy Department of Affiliated Zhongshan Hospital of Dalian University, Dalian 116001, China
| | - Hong Ren
- Department of Ophthalmology Aerospace Center Hospital, Beijing 100049, China
| | - Guiyan Wang
- Dalian Ocean University, Dalian 116029, China
| | - Yan Li
- Dalian Institute of Chemical Physics, State Key Laboratory of Molecular Reaction Dynamics, Dalian 116023, China
| | - Guohui Li
- Dalian Institute of Chemical Physics, State Key Laboratory of Molecular Reaction Dynamics, Dalian 116023, China
| |
|
58
|
Thaler S, Zavadlav J. Learning neural network potentials from experimental data via Differentiable Trajectory Reweighting. Nat Commun 2021; 12:6884. [PMID: 34824254 PMCID: PMC8617111 DOI: 10.1038/s41467-021-27241-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 06/02/2021] [Accepted: 11/09/2021] [Indexed: 11/09/2022] Open
Abstract
In molecular dynamics (MD), neural network (NN) potentials trained bottom-up on quantum mechanical data have seen tremendous success recently. Top-down approaches that learn NN potentials directly from experimental data have received less attention, typically facing numerical and computational challenges when backpropagating through MD simulations. We present the Differentiable Trajectory Reweighting (DiffTRe) method, which bypasses differentiation through the MD simulation for time-independent observables. Leveraging thermodynamic perturbation theory, we avoid exploding gradients and achieve around 2 orders of magnitude speed-up in gradient computation for top-down learning. We show effectiveness of DiffTRe in learning NN potentials for an atomistic model of diamond and a coarse-grained model of water based on diverse experimental observables including thermodynamic, structural and mechanical properties. Importantly, DiffTRe also generalizes bottom-up structural coarse-graining methods such as iterative Boltzmann inversion to arbitrary potentials. The presented method constitutes an important milestone towards enriching NN potentials with experimental data, particularly when accurate bottom-up data is unavailable.
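The reweighting step from thermodynamic perturbation theory can be sketched as follows. This is a minimal illustration under stated assumptions, not the DiffTRe code: `reweighted_average` is a hypothetical name, the inputs are plain per-configuration energies under a reference and a perturbed potential, and the gradient computation that the method builds on top of these weights is not shown. Each stored configuration receives a weight proportional to exp(-beta * (U_new - U_ref)), and the observable is averaged with the normalized weights.

```python
import math


def reweighted_average(observable, u_ref, u_new, beta):
    """Estimate <O> under a perturbed potential from reference samples.

    observable, u_ref, u_new: equal-length sequences holding, for each stored
    configuration, the observable value and its energy under the reference
    and perturbed potentials. beta is 1 / (k_B * T) in matching units.
    """
    # Log of the Boltzmann-factor ratio for each configuration.
    log_w = [-beta * (un - ur) for un, ur in zip(u_new, u_ref)]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Effective sample size: equals N when both potentials agree and
    # shrinks as the perturbed potential drifts away from the reference.
    n_eff = 1.0 / sum(w * w for w in weights)
    average = sum(w * o for w, o in zip(weights, observable))
    return average, n_eff
```

A small effective sample size signals that the stored trajectory no longer represents the perturbed potential well, in which case a fresh trajectory would be needed.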
Affiliation(s)
- Stephan Thaler
- Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Munich, Germany.
| | - Julija Zavadlav
- Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Munich, Germany.
- Munich Data Science Institute, Technical University of Munich, Munich, Germany.
| |
|
59
|
Dhamankar S, Webb MA. Chemically specific coarse-graining of polymers: Methods and prospects. J Polym Sci 2021. [DOI: 10.1002/pol.20210555] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 12/14/2022]
Affiliation(s)
- Satyen Dhamankar
- Department of Chemical and Biological Engineering Princeton University Princeton New Jersey USA
| | - Michael A. Webb
- Department of Chemical and Biological Engineering Princeton University Princeton New Jersey USA
| |
|
60
|
Kim J, Park S, Min D, Kim W. Comprehensive Survey of Recent Drug Discovery Using Deep Learning. Int J Mol Sci 2021; 22:9983. [PMID: 34576146 PMCID: PMC8470987 DOI: 10.3390/ijms22189983] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 08/24/2021] [Revised: 09/09/2021] [Accepted: 09/10/2021] [Indexed: 02/07/2023] Open
Abstract
Drug discovery based on artificial intelligence has been in the spotlight recently as it significantly reduces the time and cost required for developing novel drugs. With the advancement of deep learning (DL) technology and the growth of drug-related data, numerous deep-learning-based methodologies are emerging at all steps of drug development processes. In particular, pharmaceutical chemists have faced significant issues with regard to selecting and designing potential drugs for a target of interest to enter preclinical testing. The two major challenges are prediction of interactions between drugs and druggable targets and generation of novel molecular structures suitable for a target of interest. Therefore, we reviewed recent deep-learning applications in drug-target interaction (DTI) prediction and de novo drug design. In addition, we introduce a comprehensive summary of a variety of drug and protein representations, DL models, and commonly used benchmark datasets or tools for model training and testing. Finally, we present the remaining challenges for the promising future of DL-based DTI prediction and de novo drug design.
Affiliation(s)
- Jintae Kim
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
| | - Sera Park
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
| | - Dongbo Min
- Computer Vision Lab, Department of Computer Science and Engineering, Ewha Womans University, Seoul 03760, Korea
| | - Wankyu Kim
- KaiPharm Co., Ltd., Seoul 03759, Korea; (J.K.); (S.P.)
- System Pharmacology Lab, Department of Life Sciences, Ewha Womans University, Seoul 03760, Korea
| |
|
61
|
Greener JG, Jones DT. Differentiable molecular simulation can learn all the parameters in a coarse-grained force field for proteins. PLoS One 2021; 16:e0256990. [PMID: 34473813 PMCID: PMC8412298 DOI: 10.1371/journal.pone.0256990] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/19/2021] [Accepted: 08/19/2021] [Indexed: 11/26/2022] Open
Abstract
Finding optimal parameters for force fields used in molecular simulation is a challenging and time-consuming task, partly due to the difficulty of tuning multiple parameters at once. Automatic differentiation presents a general solution: run a simulation, obtain gradients of a loss function with respect to all the parameters, and use these to improve the force field. This approach takes advantage of the deep learning revolution whilst retaining the interpretability and efficiency of existing force fields. We demonstrate that this is possible by parameterising a simple coarse-grained force field for proteins, based on training simulations of up to 2,000 steps learning to keep the native structure stable. The learned potential matches chemical knowledge and PDB data, can fold and reproduce the dynamics of small proteins, and shows ability in protein design and model scoring applications. Problems in applying differentiable molecular simulation to all-atom models of proteins are discussed along with possible solutions and the variety of available loss functions. The learned potential, simulation scripts and training code are made available at https://github.com/psipred/cgdms.
Affiliation(s)
- Joe G. Greener
- Department of Computer Science, University College London, London, United Kingdom
| | - David T. Jones
- Department of Computer Science, University College London, London, United Kingdom
| |
|
62
|
Chen Y, Krämer A, Charron NE, Husic BE, Clementi C, Noé F. Machine learning implicit solvation for molecular dynamics. J Chem Phys 2021; 155:084101. [PMID: 34470360 DOI: 10.1063/5.0059915] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Indexed: 12/21/2022] Open
Abstract
Accurate modeling of the solvent environment for biological molecules is crucial for computational biology and drug design. A popular approach to achieve long simulation time scales for large system sizes is to incorporate the effect of the solvent in a mean-field fashion with implicit solvent models. However, a challenge with existing implicit solvent models is that they often lack accuracy or certain physical properties compared to explicit solvent models as the many-body effects of the neglected solvent molecules are difficult to model as a mean field. Here, we leverage machine learning (ML) and multi-scale coarse graining (CG) in order to learn implicit solvent models that can approximate the energetic and thermodynamic properties of a given explicit solvent model with arbitrary accuracy, given enough training data. Following the previous ML-CG models CGnet and CGSchnet, we introduce ISSNet, a graph neural network, to model the implicit solvent potential of mean force. ISSNet can learn from explicit solvent simulation data and be readily applied to molecular dynamics simulations. We compare the solute conformational distributions under different solvation treatments for two peptide systems. The results indicate that ISSNet models can outperform widely used generalized Born and surface area models in reproducing the thermodynamics of small protein systems with respect to explicit solvent. The success of this novel method demonstrates the potential benefit of applying machine learning methods in accurate modeling of solvent effects for in silico research and biomedical applications.
Affiliation(s)
- Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
| | | | - Brooke E Husic
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
| | - Cecilia Clementi
- Department of Physics, Rice University, Houston, Texas 77005, USA
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
| |
|
63
|
Clark AE, Adams H, Hernandez R, Krylov AI, Niklasson AMN, Sarupria S, Wang Y, Wild SM, Yang Q. The Middle Science: Traversing Scale In Complex Many-Body Systems. ACS Cent Sci 2021; 7:1271-1287. [PMID: 34471670 PMCID: PMC8393217 DOI: 10.1021/acscentsci.1c00685] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 05/04/2023]
Abstract
A roadmap is developed that integrates simulation methodology and data science methods to target new theories that traverse the multiple length- and time-scale features of many-body phenomena.
Affiliation(s)
- Aurora E. Clark
- Department of Chemistry, Washington State University, Pullman, Washington 99163, United States
| | - Henry Adams
- Department of Mathematics, Colorado State University, Fort Collins, Colorado 80523, United States
| | - Rigoberto Hernandez
- Departments of Chemistry, Chemical and Biomolecular Engineering, and Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, United States
| | - Anna I. Krylov
- Department of Chemistry, University of Southern California, Los Angeles, California 90089, United States
| | - Anders M. N. Niklasson
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Sapna Sarupria
- Department of Chemical and Biomolecular Engineering, Center for Optical Materials Science and Engineering Technologies (COMSET), Clemson University, Clemson, South Carolina 29670, United States
- Department of Chemistry, University of Minnesota, Minneapolis, Minnesota 55455, United States
| | - Yusu Wang
- Halıcıoğlu Data Science Institute, University of California, San Diego, La Jolla, California 92093, United States
| | - Stefan M. Wild
- Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Qian Yang
- Computer Science and Engineering Department, University of Connecticut, Storrs, Connecticut 06269-4155, United States
| |
|
64
|
Glielmo A, Husic BE, Rodriguez A, Clementi C, Noé F, Laio A. Unsupervised Learning Methods for Molecular Simulation Data. Chem Rev 2021; 121:9722-9758. [PMID: 33945269 PMCID: PMC8391792 DOI: 10.1021/acs.chemrev.0c01195] [Citation(s) in RCA: 116] [Impact Index Per Article: 38.7] [Received: 11/05/2020] [Indexed: 12/21/2022]
Abstract
Unsupervised learning is becoming an essential tool to analyze the increasingly large amounts of data produced by atomistic and molecular simulations, in material science, solid state physics, biophysics, and biochemistry. In this Review, we provide a comprehensive overview of the methods of unsupervised learning that have been most commonly used to investigate simulation data and indicate likely directions for further developments in the field. In particular, we discuss feature representation of molecular systems and present state-of-the-art algorithms for dimensionality reduction, density estimation, clustering, and kinetic models. We divide our discussion into self-contained sections, each discussing a specific method. In each section, we briefly touch upon the mathematical and algorithmic foundations of the method, highlight its strengths and limitations, and describe the specific ways in which it has been used, or can be used, to analyze molecular simulation data.
Affiliation(s)
- Aldo Glielmo
- International School for Advanced Studies (SISSA) 34014 Trieste, Italy
| | - Brooke E. Husic
- Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
| | - Alex Rodriguez
- International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
| | - Cecilia Clementi
- Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
- Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
| | - Frank Noé
- Freie Universität Berlin, Department of Mathematics and Computer Science, 14195 Berlin, Germany
- Freie Universität Berlin, Department for Physics, 14195 Berlin, Germany
- Rice University Houston, Department of Chemistry, Houston, Texas 77005, United States
| | - Alessandro Laio
- International School for Advanced Studies (SISSA) 34014 Trieste, Italy
- International Centre for Theoretical Physics (ICTP), Condensed Matter and Statistical Physics Section, 34100 Trieste, Italy
| |
|
65
|
Lindorff-Larsen K, Kragelund BB. On the potential of machine learning to examine the relationship between sequence, structure, dynamics and function of intrinsically disordered proteins. J Mol Biol 2021; 433:167196. [PMID: 34390736 DOI: 10.1016/j.jmb.2021.167196] [Citation(s) in RCA: 33] [Impact Index Per Article: 11.0] [Received: 06/02/2021] [Revised: 08/03/2021] [Accepted: 08/04/2021] [Indexed: 11/29/2022]
Abstract
Intrinsically disordered proteins (IDPs) constitute a broad set of proteins with few uniting and many diverging properties. IDPs, and intrinsically disordered regions (IDRs) interspersed between folded domains, are generally characterized as having no persistent tertiary structure; instead they interconvert between a large number of different and often expanded structures. IDPs and IDRs are involved in an enormously wide range of biological functions and reveal novel mechanisms of interactions, and while they defy the common structure-function paradigm of folded proteins, their structural preferences and dynamics are important for their function. We here discuss open questions in the field of IDPs and IDRs, focusing on areas where machine learning and other computational methods play a role. We discuss computational methods that aim to predict transiently formed local and long-range structure, including methods for integrative structural biology. We discuss the many different ways in which IDPs and IDRs can bind to other molecules, both via short linear motifs and in the formation of larger dynamic complexes such as biomolecular condensates. We discuss how experiments are providing insight into such complexes and may enable more accurate predictions. Finally, we discuss the role of IDPs in disease and how new methods are needed to interpret the mechanistic effects of genomic variants in IDPs.
Affiliation(s)
- Kresten Lindorff-Larsen
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
| | - Birthe B Kragelund
- Structural Biology and NMR Laboratory & Linderstrøm-Lang Centre for Protein Science, Department of Biology, University of Copenhagen. Ole Maaløes Vej 5, DK-2200 Copenhagen N, Denmark.
| |
|
66
|
Wang J, Charron N, Husic B, Olsson S, Noé F, Clementi C. Multi-body effects in a coarse-grained protein force field. J Chem Phys 2021; 154:164113. [PMID: 33940848 DOI: 10.1063/5.0041022] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/23/2023] Open
Abstract
The use of coarse-grained (CG) models is a popular approach to study complex biomolecular systems. By reducing the number of degrees of freedom, a CG model can explore long time- and length-scales inaccessible to computational models at higher resolution. If a CG model is designed by formally integrating out some of the system's degrees of freedom, one expects multi-body interactions to emerge in the effective CG model's energy function. In practice, it has been shown that the inclusion of multi-body terms indeed improves the accuracy of a CG model. However, no general approach has been proposed to systematically construct a CG effective energy that includes arbitrary orders of multi-body terms. In this work, we propose a neural network based approach to address this point and construct a CG model as a multi-body expansion. By applying this approach to a small protein, we evaluate the relative importance of the different multi-body terms in the definition of an accurate model. We observe a slow convergence in the multi-body expansion, where up to five-body interactions are needed to reproduce the free energy of an atomistic model.
Affiliation(s)
- Jiang Wang
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
| | - Nicholas Charron
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
| | - Brooke Husic
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | - Frank Noé
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
| | - Cecilia Clementi
- Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, USA
| |
|
67
|
Ma Z, Wang S, Kim M, Liu K, Chen CL, Pan W. Transfer learning of memory kernels for transferable coarse-graining of polymer dynamics. Soft Matter 2021; 17:5864-5877. [PMID: 34096961 DOI: 10.1039/d1sm00364j] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 06/12/2023]
Abstract
The present work concerns the transferability of coarse-grained (CG) modeling in reproducing the dynamic properties of the reference atomistic systems across a range of parameters. In particular, we focus on implicit-solvent CG modeling of polymer solutions. The CG model is based on the generalized Langevin equation, where the memory kernel plays the critical role in determining the dynamics in all time scales. Thus, we propose methods for transfer learning of memory kernels. The key ingredient of our methods is Gaussian process regression. By integration with the model order reduction via proper orthogonal decomposition and the active learning technique, the transfer learning can be practically efficient and requires minimum training data. Through two example polymer solution systems, we demonstrate the accuracy and efficiency of the proposed transfer learning methods in the construction of transferable memory kernels. The transferability allows for out-of-sample predictions, even in the extrapolated domain of parameters. Built on the transferable memory kernels, the CG models can reproduce the dynamic properties of polymers in all time scales at different thermodynamic conditions (such as temperature and solvent viscosity) and for different systems with varying concentrations and lengths of polymers.
Affiliation(s)
- Zhan Ma
- Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
| | - Shu Wang
- Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
| | - Minhee Kim
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Kaibo Liu
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Chun-Long Chen
- Physical Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
| | - Wenxiao Pan
- Department of Mechanical Engineering, University of Wisconsin-Madison, Madison, WI 53706, USA.
| |
|
68
|
Koutsoukos S, Philippi F, Malaret F, Welton T. A review on machine learning algorithms for the ionic liquid chemical space. Chem Sci 2021; 12:6820-6843. [PMID: 34123314 PMCID: PMC8153233 DOI: 10.1039/d1sc01000j] [Citation(s) in RCA: 46] [Impact Index Per Article: 15.3] [Received: 02/19/2021] [Accepted: 04/28/2021] [Indexed: 01/05/2023] Open
Abstract
There are thousands of papers published every year investigating the properties and possible applications of ionic liquids. Industrial use of these exceptional fluids requires adequate understanding of their physical properties, in order to create the ionic liquid that will optimally suit the application. Computational property prediction arose from the urgent need to minimise the time and cost that would be required to experimentally test different combinations of ions. This review discusses the use of machine learning algorithms as property prediction tools for ionic liquids (either as standalone methods or in conjunction with molecular dynamics simulations), presents common problems of training datasets and proposes ways that could lead to more accurate and efficient models.
Affiliation(s)
- Spyridon Koutsoukos
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
| | - Frederik Philippi
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
| | - Francisco Malaret
- Department of Chemical Engineering, Imperial College London South Kensington Campus London SW7 2AZ UK
| | - Tom Welton
- Department of Chemistry, Molecular Sciences Research Hub, Imperial College London White City Campus London W12 0BZ UK
| |
|
69
|
Doerr S, Majewski M, Pérez A, Krämer A, Clementi C, Noe F, Giorgino T, De Fabritiis G. TorchMD: A Deep Learning Framework for Molecular Simulations. J Chem Theory Comput 2021; 17:2355-2363. [PMID: 33729795 PMCID: PMC8486166 DOI: 10.1021/acs.jctc.0c01343] [Citation(s) in RCA: 70] [Impact Index Per Article: 23.3] [Received: 12/29/2020] [Indexed: 11/28/2022]
Abstract
Molecular dynamics simulations provide a mechanistic description of molecules by relying on empirical potentials. The quality and transferability of such potentials can be improved leveraging data-driven models derived with machine learning approaches. Here, we present TorchMD, a framework for molecular simulations with mixed classical and machine learning potentials. All force computations including bond, angle, dihedral, Lennard-Jones, and Coulomb interactions are expressed as PyTorch arrays and operations. Moreover, TorchMD enables learning and simulating neural network potentials. We validate it using standard Amber all-atom simulations, learning an ab initio potential, performing an end-to-end training, and finally learning and simulating a coarse-grained model for protein folding. We believe that TorchMD provides a useful tool set to support molecular simulations of machine learning potentials. Code and data are freely available at github.com/torchmd.
Affiliation(s)
| | - Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, 14195 Berlin, Germany
| | - Cecilia Clementi
- Department of Physics, Freie Universität, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, 77005 Texas, United States
| | - Frank Noe
- Department of Mathematics and Computer Science, Freie Universität, 14195 Berlin, Germany
- Department of Physics, Freie Universität, 14195 Berlin, Germany
- Department of Chemistry, Rice University, Houston, 77005 Texas, United States
| | - Toni Giorgino
- Biophysics Institute, National Research Council (CNR-IBF), 20133 Milano, Italy
- Department of Biosciences, Università degli Studi di Milano, 20133 Milano, Italy
| | - Gianni De Fabritiis
- Acellera, 08005 Barcelona, Spain
- Computational Science Laboratory, Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
| |
|
70
|
Al-Samir S, Itel F, Hegermann J, Gros G, Tsiavaliaris G, Endeward V. O2 permeability of lipid bilayers is low, but increases with membrane cholesterol. Cell Mol Life Sci 2021; 78:7649-7662. [PMID: 34694438 PMCID: PMC8629883 DOI: 10.1007/s00018-021-03974-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/22/2021] [Revised: 09/06/2021] [Accepted: 10/12/2021] [Indexed: 11/18/2022]
Abstract
Oxygen on its transport route from lung to tissue mitochondria has to cross several cell membranes. The permeability value of membranes for O2 (PO2), although of fundamental importance, is controversial. Previous studies by mostly indirect methods diverge between 0.6 and 125 cm/s. Here, we use a most direct approach by observing transmembrane O2 fluxes out of 100 nm liposomes at defined transmembrane O2 gradients in a stopped-flow system. Due to the small size of the liposomes intra- as well as extraliposomal diffusion processes do not affect the overall kinetics of the O2 release process. We find, for cholesterol-free liposomes, the unexpectedly low PO2 value of 0.03 cm/s at 35 °C. This PO2 would present a serious obstacle to O2 entering or leaving the erythrocyte. Cholesterol turns out to be a novel major modifier of PO2, able to increase PO2 by an order of magnitude. With a membrane cholesterol of 45 mol% as it occurs in erythrocytes, PO2 rises to 0.2 cm/s at 35 °C. This PO2 is just sufficient to ensure complete O2 loading during passage of erythrocytes through the lung's capillary bed under the conditions of rest as well as maximal exercise.
Affiliation(s)
- Samer Al-Samir
- AG Vegetative Physiologie 4220, Zentrum Physiologie, Medizinische Hochschule Hannover, 30625, Hannover, Germany
| | - Fabian Itel
- Empa, Swiss Federal Laboratories for Materials Science and Technology, Lerchenfeldstr. 5, CH-9014, St. Gallen, Switzerland
| | - Jan Hegermann
- Abteilung Funktionelle und Angewandte Anatomie, Elektronenmikroskopie 8840, Medizinische Hochschule Hannover, 30625, Hannover, Germany
| | - Gerolf Gros
- AG Vegetative Physiologie 4220, Zentrum Physiologie, Medizinische Hochschule Hannover, 30625, Hannover, Germany
| | - Georgios Tsiavaliaris
- Abteilung Biophysikalische Chemie 4350, Medizinische Hochschule Hannover, 30625, Hannover, Germany
| | - Volker Endeward
- AG Vegetative Physiologie 4220, Zentrum Physiologie, Medizinische Hochschule Hannover, 30625, Hannover, Germany
| |
|
71
|
Chen M. Collective variable-based enhanced sampling and machine learning. Eur Phys J B 2021; 94:211. [PMID: 34697536 PMCID: PMC8527828 DOI: 10.1140/epjb/s10051-021-00220-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 10/03/2021] [Indexed: 05/14/2023]
Abstract
Collective variable-based enhanced sampling methods have been widely used to study thermodynamic properties of complex systems. Efficiency and accuracy of these enhanced sampling methods are affected by two factors: constructing appropriate collective variables for enhanced sampling and generating accurate free energy surfaces. Recently, many machine learning techniques have been developed to improve the quality of collective variables and the accuracy of free energy surfaces. Although machine learning has achieved great successes in improving enhanced sampling methods, there are still many challenges and open questions. In this perspective, we shall review recent developments on integrating machine learning techniques and collective variable-based enhanced sampling approaches. We also discuss challenges and future research directions including generating kinetic information, exploring high-dimensional free energy surfaces, and efficiently sampling all-atom configurations.
Affiliation(s)
- Ming Chen
- Department of Chemistry, Purdue University, West Lafayette, IN 47907 USA
| |
|