1
Nikitin A, Wang F. Simulation of Linear and Cyclic Alkanes with Second-Order Møller-Plesset Perturbation Theory through Adaptive Force Matching. J Chem Theory Comput 2024; 20:5241-5249. [PMID: 38848512 PMCID: PMC11209940 DOI: 10.1021/acs.jctc.4c00509] [Received: 04/16/2024] [Revised: 05/28/2024] [Accepted: 05/29/2024] [Indexed: 06/09/2024]
Abstract
Predicting ensemble properties, such as density and heat of vaporization, of small hydrocarbons is challenging due to the dispersion-dominated weak interactions between these molecules. With the adaptive force matching (AFM) method, the bonded and short-range nonbonded interactions are fitted to second-order Møller-Plesset perturbation theory (MP2) references computed with the def2-TZVP basis set. The dispersion is modeled using symmetry adapted perturbation theory (SAPT) at MP4 accuracy using the def2-TZVPD basis set. A new charge matrix decomposition technique is described to obtain partial charges in AFM. Although the models developed do not have any empirical parameters, several properties of the resulting models are compared with experiments as validations. The density, heat of vaporization, pressure dependence of density, diffusion constants, and surface tensions all show quantitative agreement with experiments. Although the density shows a very small systematic error, which could be due to missing three-body dispersion, the heat of vaporization agrees with experiments to within 0.5%. The paper shows that AFM can be used as a reliable tool to enable simulations at post-Hartree-Fock quality at the cost of molecular mechanics force fields.
Affiliation(s)
- Alexei Nikitin
  - Department of Chemistry and Biochemistry, University of Arkansas, Fayetteville, Arkansas 72701, United States
- Feng Wang
  - Department of Chemistry and Biochemistry, University of Arkansas, Fayetteville, Arkansas 72701, United States
2
Wang D, Tiwary P. Augmenting Human Expertise in Weighted Ensemble Simulations through Deep Learning based Information Bottleneck. ARXIV 2024:arXiv:2406.14839v1. [PMID: 38947925 PMCID: PMC11213147] [Indexed: 07/02/2024]
Abstract
The weighted ensemble (WE) method stands out as a widely used segment-based sampling technique renowned for its rigorous treatment of kinetics. The WE framework typically involves initially mapping the configuration space onto a low-dimensional collective variable (CV) space and then partitioning it into bins. The efficacy of WE simulations heavily depends on the selection of CVs and binning schemes. The recently proposed State Predictive Information Bottleneck (SPIB) method has emerged as a promising tool for automatically constructing CVs from data and guiding enhanced sampling in an iterative manner. In this work, we advance this data-driven pipeline by incorporating prior expert knowledge. Our hybrid approach combines SPIB-learned CVs to enhance sampling in explored regions with expert-based CVs to guide exploration in regions of interest, synergizing the strengths of both methods. Through benchmarking on alanine dipeptide and chignolin systems, we demonstrate that our hybrid approach effectively guides WE simulations to sample states of interest, and reduces run-to-run variances. Moreover, our integration of the SPIB model also enhances the analysis and interpretation of WE simulation data by effectively identifying metastable states and pathways, and offering direct visualization of dynamics.
Affiliation(s)
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Pratyush Tiwary
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - University of Maryland Institute for Health Computing, Bethesda 20852, USA
3
Mayorga LS, Masone D. The Secret Ballet Inside Multivesicular Bodies. ACS NANO 2024; 18:15651-15660. [PMID: 38830824 DOI: 10.1021/acsnano.4c01590] [Indexed: 06/05/2024]
Abstract
Lipid bilayers possess the capacity for self-assembly due to the amphipathic nature of lipid molecules, which have both hydrophobic and hydrophilic regions. When confined, lipid bilayers exhibit astonishing versatility in their forms, adopting diverse shapes that are challenging to observe through experimental means. Exploiting this adaptability, lipid structures motivate the development of bio-inspired mechanomaterials and integrated nanobio-interfaces that could seamlessly merge with biological entities, ultimately bridging the gap between synthetic and biological systems. In this work, we demonstrate how, in numerical simulations of multivesicular bodies, a fascinating evolution unfolds from an initial semblance of order toward states of higher entropy over time. We observe dynamic rearrangements in confined vesicles that reveal unexpected limit shapes of distinct geometric patterns. We identify five structures as the basic building blocks that systematically repeat under various conditions of size and composition. Moreover, we observe more complex and less frequent shapes that emerge in confined spaces. Our results provide insights into the dynamics of multivesicular systems, offering a richer understanding of how confined lipid bodies spontaneously self-organize.
Affiliation(s)
- Luis S Mayorga
  - Instituto de Histología y Embriología de Mendoza (IHEM), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Cuyo (UNCuyo), 5500 Mendoza, Argentina
  - Facultad de Ciencias Exactas y Naturales, Universidad Nacional de Cuyo (UNCuyo), 5500 Mendoza, Argentina
- Diego Masone
  - Instituto de Histología y Embriología de Mendoza (IHEM), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Universidad Nacional de Cuyo (UNCuyo), 5500 Mendoza, Argentina
  - Facultad de Ingeniería, Universidad Nacional de Cuyo (UNCuyo), 5500 Mendoza, Argentina
4
Noid WG, Szukalo RJ, Kidder KM, Lesniewski MC. Rigorous Progress in Coarse-Graining. Annu Rev Phys Chem 2024; 75:21-45. [PMID: 38941523 DOI: 10.1146/annurev-physchem-062123-010821] [Indexed: 06/30/2024]
Abstract
Low-resolution coarse-grained (CG) models provide remarkable computational and conceptual advantages for simulating soft materials. In principle, bottom-up CG models can reproduce all structural and thermodynamic properties of atomically detailed models that can be observed at the resolution of the CG model. This review discusses recent progress in developing theory and computational methods for achieving this promise. We first briefly review variational approaches for parameterizing interaction potentials and their relationship to machine learning methods. We then discuss recent approaches for simultaneously improving both the transferability and thermodynamic properties of bottom-up models by rigorously addressing the density and temperature dependence of these potentials. We also briefly discuss exciting progress in modeling high-resolution observables with low-resolution CG models. More generally, we highlight the essential role of the bottom-up framework not only for fundamentally understanding the limitations of prior CG models but also for developing robust computational methods that resolve these limitations in practice.
Affiliation(s)
- W G Noid
  - Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania, USA
- Ryan J Szukalo
  - Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania, USA
  - Current affiliation: Department of Chemistry, Princeton University, Princeton, New Jersey, USA
- Katherine M Kidder
  - Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania, USA
- Maria C Lesniewski
  - Department of Chemistry, Pennsylvania State University, University Park, Pennsylvania, USA
5
Althorpe SC. Path Integral Simulations of Condensed-Phase Vibrational Spectroscopy. Annu Rev Phys Chem 2024; 75:397-420. [PMID: 38941531 DOI: 10.1146/annurev-physchem-090722-124705] [Indexed: 06/30/2024]
Abstract
Recent theoretical and algorithmic developments have improved the accuracy with which path integral dynamics methods can include nuclear quantum effects in simulations of condensed-phase vibrational spectra. Such methods are now understood to be approximations to the delocalized classical Matsubara dynamics of smooth Feynman paths, which dominate the dynamics of systems such as liquid water at room temperature. Focusing mainly on simulations of liquid water and hexagonal ice, we explain how the recently developed quasicentroid molecular dynamics (QCMD), fast-QCMD, and temperature-elevated path integral coarse-graining simulations (Te PIGS) methods generate classical dynamics on potentials of mean force obtained by averaging over quantum thermal fluctuations. These new methods give very close agreement with one another, and the Te PIGS method has recently yielded excellent agreement with experimentally measured vibrational spectra for liquid water, ice, and the liquid-air interface. We also discuss the limitations of such methods.
Affiliation(s)
- Stuart C Althorpe
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
6
Masella M, Léonforté F. The multi-scale polarizable pseudo-particle solvent coarse-grained approach: From NaCl salt solutions to polyelectrolyte hydration. J Chem Phys 2024; 160:204902. [PMID: 38780384 DOI: 10.1063/5.0194968] [Received: 12/29/2023] [Accepted: 04/22/2024] [Indexed: 05/25/2024]
Abstract
We discuss key parameters that affect the reliability of hybrid simulations in the aqueous phase based on an efficient multi-scale coarse-grained polarizable pseudo-particle approach, denoted as pppl, to model the solvent water, whereas solutes are modeled using an all atom polarizable force field. Among those parameters, the extension of the solvent domain (SD) at the solute vicinity (domain in which each solvent particle corresponds to a single water molecule) and the magnitude of solute/solvent short range polarization damping effects are shown to be pivotal to model NaCl salty aqueous solutions and the hydration of charged systems, such as the hydrophobic polyelectrolyte polymer that we have recently investigated [Masella et al., J. Chem. Phys. 155, 114903 (2021)]. Strong short range damping is pivotal to simulate aqueous salt NaCl solutions at moderate concentration (up to 1.0M). The SD extension (as well as short range damping) has a weak effect on the polymer conformation; however, it plays a pivotal role in computing accurate polymer/solvent interaction energies. As the pppl approach is up to two orders of magnitude computationally more efficient than all atom polarizable force field methods, our results show it to be an efficient alternative route to investigate the equilibrium properties of complex charged molecular systems in extended chemical environments.
Affiliation(s)
- Michel Masella
  - Laboratoire de Biologie Structurale et Radiobiologie, Service de Bioénergétique, Biologie Structurale et Mécanismes, Institut de Biologie et de Technologies de Saclay, CEA Saclay, F-91191 Gif sur Yvette Cedex, France
- Fabien Léonforté
  - L'Oréal Group, Research and Innovation, Aulnay-Sous-Bois, France
7
Duignan TT. The Potential of Neural Network Potentials. ACS PHYSICAL CHEMISTRY AU 2024; 4:232-241. [PMID: 38800721 PMCID: PMC11117678 DOI: 10.1021/acsphyschemau.4c00004] [Received: 01/18/2024] [Revised: 03/04/2024] [Accepted: 03/05/2024] [Indexed: 05/29/2024]
Abstract
In the next half-century, physical chemistry will likely undergo a profound transformation, driven predominantly by the combination of recent advances in quantum chemistry and machine learning (ML). Specifically, equivariant neural network potentials (NNPs) are a breakthrough new tool that is already enabling us to simulate systems at the molecular scale with unprecedented accuracy and speed, relying on nothing but fundamental physical laws. The continued development of this approach will realize Paul Dirac's 80-year-old vision of using quantum mechanics to unify physics with chemistry and provide invaluable tools for understanding materials science, biology, earth sciences, and beyond. The era of highly accurate and efficient first-principles molecular simulations will provide a wealth of training data that can be used to build automated computational methodologies, using tools such as diffusion models, for the design and optimization of systems at the molecular scale. Large language models (LLMs) will also evolve into increasingly indispensable tools for literature review, coding, idea generation, and scientific writing.
8
Bag S, Meinel MK, Müller-Plathe F. Synthetic Force-Field Database for Training Machine Learning Models to Predict Mobility-Preserving Coarse-Grained Molecular-Simulation Potentials. J Chem Theory Comput 2024; 20:3046-3060. [PMID: 38593205 DOI: 10.1021/acs.jctc.4c00242] [Indexed: 04/11/2024]
Abstract
Balancing accuracy and efficiency is a common problem in molecular simulation. This tradeoff is evident in coarse-grained molecular dynamics simulation, which prioritizes efficiency, and all-atom molecular simulation, which prioritizes accuracy. Despite continuous efforts, creating a coarse-grained model that accurately captures both the system's structure and dynamics remains elusive. In this article, we present a data-driven approach for constructing coarse-grained models that aim to describe both the structure and dynamics of the system equally well. While the development of machine learning models is well-received in the scientific community, the significance of dataset creation for these models is often overlooked. However, data-driven approaches cannot progress without a robust dataset. To address this, we construct a database of synthetic coarse-grained potentials generated from unphysical all-atom models. A neural network is trained with the generated database to predict the coarse-grained potentials of real liquids. We evaluate their quality by calculating the combined loss of structural and dynamical accuracy upon coarse-graining. When we compare our machine learning-based coarse-grained potential with the one from iterative Boltzmann inversion, the machine learning prediction turns out better for all eight hydrocarbon liquids we studied. As all-atom surfaces turn more nonspherical, both ways of coarse-graining degrade. Still, the neural network outperforms iterative Boltzmann inversion in constructing good quality coarse-grained models for such cases. The synthetic database and the developed machine learning models are freely available to the community, and we believe that our approach will generate interest in efficiently deriving accurate coarse-grained models for liquids.
Affiliation(s)
- Saientan Bag
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany
- Melissa K Meinel
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany
- Florian Müller-Plathe
  - Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Peter-Grünberg-Str. 8, 64287 Darmstadt, Germany
9
Wu Z, Zhou T. Structural Coarse-Graining via Multiobjective Optimization with Differentiable Simulation. J Chem Theory Comput 2024; 20:2605-2617. [PMID: 38483262 DOI: 10.1021/acs.jctc.3c01348] [Indexed: 03/27/2024]
Abstract
In the realm of multiscale molecular simulations, structure-based coarse-graining is a prominent approach for creating efficient coarse-grained (CG) representations of soft matter systems, such as polymers. This involves optimizing CG interactions by matching static correlation functions of the corresponding degrees of freedom in all-atom (AA) models. Here, we present a versatile method, namely, differentiable coarse-graining (DiffCG), which combines multiobjective optimization and differentiable simulation. The DiffCG approach is capable of constructing robust CG models by iteratively optimizing the effective potentials to simultaneously match multiple target properties. We demonstrate our approach by concurrently optimizing bonded and nonbonded potentials of a CG model of polystyrene (PS) melts. The resulting CG-PS model effectively reproduces both the structural characteristics, such as the equilibrium probability distribution of microscopic degrees of freedom and the thermodynamic pressure of the AA counterpart. More importantly, leveraging the multiobjective optimization capability, we develop a precise and efficient CG model for PS melts that is transferable across a wide range of temperatures, i.e., from 400 to 600 K. It is achieved via optimizing a pairwise potential with nonlinear temperature dependence in the CG model to simultaneously match target data from AA-MD simulations at multiple thermodynamic states. The temperature transferable CG-PS model demonstrates its ability to accurately predict the radial distribution functions and density at different temperatures, including those that are not included in the target thermodynamic states. Our work opens up a promising route for developing accurate and transferable CG models of complex soft-matter systems through multiobjective optimization with differentiable simulation.
Affiliation(s)
- Zhenghao Wu
  - Department of Chemistry, Xi'an Jiaotong-Liverpool University, Suzhou, Jiangsu 215123, P. R. China
- Tianhang Zhou
  - College of Carbon Neutrality Future Technology, State Key Laboratory of Heavy Oil Processing, China University of Petroleum (Beijing), Beijing 102249, P. R. China
10
Hsu T, Sadigh B, Bulatov V, Zhou F. Score Dynamics: Scaling Molecular Dynamics with Picoseconds Time Steps via Conditional Diffusion Model. J Chem Theory Comput 2024; 20:2335-2348. [PMID: 38489243 DOI: 10.1021/acs.jctc.3c01361] [Indexed: 03/17/2024]
Abstract
We propose score dynamics (SD), a general framework for learning accelerated evolution operators with large timesteps from molecular dynamics (MD) simulations. SD is centered around scores or derivatives of the transition log-probability with respect to the dynamical degrees of freedom. The latter play the same role as force fields in MD but are used in denoising diffusion probability models to generate discrete transitions of the dynamical variables in an SD time step, which can be orders of magnitude larger than a typical MD time step. In this work, we construct graph neural network-based SD models of realistic molecular systems that are evolved with 10 ps timesteps. We demonstrate the efficacy of SD with case studies of the alanine dipeptide and short alkanes in aqueous solution. Both equilibrium predictions derived from the stationary distributions of the conditional probability and kinetic predictions for the transition rates and transition paths are in good agreement with MD. Our current SD implementation is about 2 orders of magnitude faster than the MD counterpart for the systems studied in this work. Open challenges and possible future remedies to improve SD are also discussed.
Affiliation(s)
- Tim Hsu
  - Lawrence Livermore National Laboratory, Livermore, California 94551, United States
- Babak Sadigh
  - Lawrence Livermore National Laboratory, Livermore, California 94551, United States
- Vasily Bulatov
  - Lawrence Livermore National Laboratory, Livermore, California 94551, United States
- Fei Zhou
  - Lawrence Livermore National Laboratory, Livermore, California 94551, United States
11
Nie Y, Zheng Z, Li C, Zhan H, Kou L, Gu Y, Lü C. Resolving the dynamic properties of entangled linear polymers in non-equilibrium coarse grain simulation with a priori scaling factors. NANOSCALE 2024. [PMID: 38494916 DOI: 10.1039/d3nr06185j] [Indexed: 03/19/2024]
Abstract
The molecular weight of polymers can influence the material properties, but the molecular weight at the experiment level sometimes can be a huge burden for property prediction with full-atomic simulations. The traditional bottom-up coarse grain (CG) simulation can reduce the computation cost. However, the dynamic properties predicted by the CG simulation can deviate from the full-atomic simulation result. Usually, in CG simulations, the diffusion is faster and the viscosity and modulus are much lower. The fast dynamics in CG are usually solved by a posteriori scaling on time, temperature, or potential modifications, which usually have poor transferability to other non-fitted physical properties because of a lack of fundamental physics. In this work, a priori scaling factors were calculated by the loss of degrees of freedom and implemented in the iterative Boltzmann inversion. According to the simulation results on 3 different CG levels at different temperatures and loading rates, such a priori scaling factors can help in reproducing some dynamic properties of polycaprolactone in CG simulation more accurately, such as heat capacity, Young's modulus, and viscosity, while maintaining the accuracy in the structural distribution prediction. The transferability of entropy-enthalpy compensation and a dissipative particle dynamics thermostat is also presented for comparison. The proposed method reveals the huge potential for developing customized CG thermostats and offers a simple way to rebuild multiphysics CG models for polymers with good transferability.
Affiliation(s)
- Yihan Nie
  - College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
- Zhuoqun Zheng
  - School of Astronautics, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
- Chengkai Li
  - School of Materials Science and Engineering, Taiyuan University of Science and Technology, Taiyuan 030024, China
- Haifei Zhan
  - College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
  - School of Mechanical, Medical and Process Engineering, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
  - Center for Materials Science, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
- Liangzhi Kou
  - School of Mechanical, Medical and Process Engineering, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
  - Center for Materials Science, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
- Yuantong Gu
  - School of Mechanical, Medical and Process Engineering, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
  - Center for Materials Science, Queensland University of Technology (QUT), Brisbane QLD 4001, Australia
- Chaofeng Lü
  - Faculty of Mechanical Engineering & Mechanics, Ningbo University, Ningbo 315211, China
  - College of Civil Engineering and Architecture, Zhejiang University, Hangzhou 310058, China
12
Izvekov S, Kroonblawd MP, Larentzos JP, Brennan JK, Rice BM. Maximum Entropy Theory of Multiscale Coarse-Graining via Matching Thermodynamic Forces: Application to a Molecular Crystal (TATB). J Phys Chem B 2024. [PMID: 38489758 DOI: 10.1021/acs.jpcb.3c07078] [Indexed: 03/17/2024]
Abstract
The MSCG/FM (multiscale coarse-graining via force-matching) approach is an efficient supervised machine learning method to develop microscopically informed coarse-grained (CG) models. We present a theory based on the principle of maximum entropy (PME) enveloping the existing MSCG/FM approaches. This theory views the MSCG/FM method as a special case of matching the thermodynamic forces from the extended ensemble described by the set of thermodynamic (relevant) system coordinates. This set may include CG coordinates, the stress tensor, applied external fields, and so forth, and may be characterized by nonequilibrium conditions. Following the presentation of the theory, we discuss the consistent matching of both bonded and nonbonded interactions. The proposed PME formulation is used as a starting point to extend the MSCG/FM method to the constant strain ensemble, which together with the explicit matching of the bonded forces is better suited for coarse-graining anisotropic media at a submolecular resolution. The theory is demonstrated by performing the fine coarse-graining of crystalline 1,3,5-triamino-2,4,6-trinitrobenzene (TATB), a well-known insensitive molecular energetic material, which exhibits highly anisotropic mechanical properties.
Affiliation(s)
- Sergei Izvekov
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Matthew P Kroonblawd
  - Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- James P Larentzos
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- John K Brennan
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Betsy M Rice
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
13
Zhang H, Liu S, You J, Liu C, Zheng S, Lu Z, Wang T, Zheng N, Shao B. Overcoming the barrier of orbital-free density functional theory for molecular systems using deep learning. NATURE COMPUTATIONAL SCIENCE 2024; 4:210-223. [PMID: 38467870 DOI: 10.1038/s43588-024-00605-8] [Received: 10/16/2023] [Accepted: 02/07/2024] [Indexed: 03/13/2024]
Abstract
Orbital-free density functional theory (OFDFT) is a quantum chemistry formulation that has a lower cost scaling than the prevailing Kohn-Sham DFT, which is increasingly desired for contemporary molecular research. However, its accuracy is limited by the kinetic energy density functional, which is notoriously hard to approximate for non-periodic molecular systems. Here we propose M-OFDFT, an OFDFT approach capable of solving molecular systems using a deep learning functional model. We build the essential non-locality into the model, which is made affordable by the concise density representation as expansion coefficients under an atomic basis. With techniques to address unconventional learning challenges therein, M-OFDFT achieves a comparable accuracy to Kohn-Sham DFT on a wide range of molecules untouched by OFDFT before. More attractively, M-OFDFT extrapolates well to molecules much larger than those seen in training, which unleashes the appealing scaling of OFDFT for studying large molecules including proteins, representing an advancement of the accuracy-efficiency trade-off frontier in quantum chemistry.
Affiliation(s)
- He Zhang
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
  - Microsoft Research AI4Science, Beijing, China
- Siyuan Liu
  - Microsoft Research AI4Science, Beijing, China
- Chang Liu
  - Microsoft Research AI4Science, Beijing, China
- Ziheng Lu
  - Microsoft Research AI4Science, Beijing, China
- Tong Wang
  - Microsoft Research AI4Science, Beijing, China
- Nanning Zheng
  - National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, National Engineering Research Center for Visual Information and Applications, and Institute of Artificial Intelligence and Robotics, Xi'an Jiaotong University, Xi'an, China
- Bin Shao
  - Microsoft Research AI4Science, Beijing, China
14
Célerse F, Wodrich MD, Vela S, Gallarati S, Fabregat R, Juraskova V, Corminboeuf C. From Organic Fragments to Photoswitchable Catalysts: The OFF-ON Structural Repository for Transferable Kernel-Based Potentials. J Chem Inf Model 2024; 64:1201-1212. [PMID: 38319296 PMCID: PMC10900300 DOI: 10.1021/acs.jcim.3c01953] [Received: 12/07/2023] [Revised: 01/18/2024] [Accepted: 01/22/2024] [Indexed: 02/07/2024]
Abstract
Structurally and conformationally diverse databases are needed to train accurate neural networks or kernel-based potentials capable of exploring the complex free energy landscape of flexible functional organic molecules. Curating such databases for species beyond "simple" drug-like compounds or molecules composed of well-defined building blocks (e.g., peptides) is challenging as it requires thorough chemical space mapping and evaluation of both chemical and conformational diversities. Here, we introduce the OFF-ON (organic fragments from organocatalysts that are non-modular) database, a repository of 7869 equilibrium and 67,457 nonequilibrium geometries of organic compounds and dimers aimed at describing conformationally flexible functional organic molecules, with an emphasis on photoswitchable organocatalysts. The relevance of this database is then demonstrated by training a local kernel regression model on a low-cost semiempirical baseline and comparing it with a PBE0-D3 reference for several known catalysts, notably the free energy surfaces of exemplary photoswitchable organocatalysts. Our results demonstrate that the OFF-ON data set offers reliable predictions for simulating the conformational behavior of virtually any (photoswitchable) organocatalyst or organic compound composed of H, C, N, O, F, and S atoms, thereby opening a computationally feasible route to explore complex free energy surfaces in order to rationalize and predict catalytic behavior.
Affiliation(s)
- Frédéric Célerse
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Matthew D. Wodrich
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
| | - Sergi Vela
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Simone Gallarati
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Raimon Fabregat
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Veronika Juraskova
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
| | - Clémence Corminboeuf
- Laboratory for Computational Molecular Design (LCMD), Institute of Chemical Sciences and Engineering, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Center for Competence in Research-Catalysis (NCCR-Catalysis), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
| |
|
15
|
DeLuca M, Sensale S, Lin PA, Arya G. Prediction and Control in DNA Nanotechnology. ACS Applied Bio Materials 2024; 7:626-645. [PMID: 36880799 DOI: 10.1021/acsabm.2c01045] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
Abstract
DNA nanotechnology is a rapidly developing field that uses DNA as a building material for nanoscale structures. Key to the field's development has been the ability to accurately describe the behavior of DNA nanostructures using simulations and other modeling techniques. In this Review, we present various aspects of prediction and control in DNA nanotechnology, including the various scales of molecular simulation, statistical mechanics, kinetic modeling, continuum mechanics, and other prediction methods. We also address the current uses of artificial intelligence and machine learning in DNA nanotechnology. We discuss how experiments and modeling are synergistically combined to provide control over device behavior, allowing scientists to design molecular structures and dynamic devices with confidence that they will function as intended. Finally, we identify processes and scenarios where DNA nanotechnology lacks sufficient prediction ability and suggest possible solutions to these weak areas.
Affiliation(s)
- Marcello DeLuca
- Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, North Carolina 27708, United States
| | - Sebastian Sensale
- Department of Physics, Cleveland State University, Cleveland, Ohio 44115, United States
| | - Po-An Lin
- Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, North Carolina 27708, United States
| | - Gaurav Arya
- Thomas Lord Department of Mechanical Engineering and Materials Science, Duke University, Durham, North Carolina 27708, United States
| |
|
16
|
Christians LF, Halingstad EV, Kram E, Okolovitch EM, Pak AJ. Formalizing Coarse-Grained Representations of Anisotropic Interactions at Multimeric Protein Interfaces Using Virtual Sites. J Phys Chem B 2024; 128:1394-1406. [PMID: 38316012 DOI: 10.1021/acs.jpcb.3c07023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]
Abstract
Molecular simulations of biomacromolecules that assemble into multimeric complexes remain a challenge due to computationally inaccessible length and time scales. Low-resolution and implicit-solvent coarse-grained modeling approaches using traditional nonbonded interactions (both pairwise and spherically isotropic) have been able to partially address this gap. However, these models may fail to capture the complex anisotropic interactions present at macromolecular interfaces unless higher-order interaction potentials are incorporated at the expense of computational cost. In this work, we introduce an alternate and systematic approach to represent directional interactions at protein-protein interfaces by using virtual sites restricted to pairwise interactions. We show that virtual site interaction parameters can be optimized within a relative entropy minimization framework by using only information from known statistics between coarse-grained sites. We compare our virtual site models to traditional coarse-grained models using two case studies of multimeric protein assemblies and find that the virtual site models predict pairwise correlations with higher fidelity and, more importantly, assembly behavior that is morphologically consistent with experiments. Our study underscores the importance of anisotropic interaction representations and paves the way for more accurate yet computationally efficient coarse-grained simulations of macromolecular assembly in future research.
Affiliation(s)
- Luc F Christians
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Ethan V Halingstad
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Emiel Kram
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Evan M Okolovitch
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
| | - Alexander J Pak
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
- Quantitative Biosciences and Engineering Program, Colorado School of Mines, Golden, Colorado 80401, United States
- Materials Science Program, Colorado School of Mines, Golden, Colorado 80401, United States
| |
|
17
|
Kurnikov IV, Pereyaslavets L, Kamath G, Sakipov SN, Voronina E, Butin O, Illarionov A, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Ivahnenko I, Chen Y, Lock CB, Levitt M, Kornberg RD, Fain B. Neural Network Corrections to Intermolecular Interaction Terms of a Molecular Force Field Capture Nuclear Quantum Effects in Calculations of Liquid Thermodynamic Properties. J Chem Theory Comput 2024; 20:1347-1357. [PMID: 38240485 PMCID: PMC11042917 DOI: 10.1021/acs.jctc.3c00921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
We incorporate nuclear quantum effects (NQE) in condensed matter simulations by introducing short-range neural network (NN) corrections to the ab initio fitted molecular force field ARROW. Force field NN corrections are fitted to average interaction energies and forces of molecular dimers, which are simulated using the Path Integral Molecular Dynamics (PIMD) technique with restrained centroid positions. The NN-corrected force field allows reproduction of the NQE for computed liquid water and methane properties such as density, radial distribution function (RDF), heat of evaporation (HVAP), and solvation free energy. Accounting for NQE through molecular force field corrections circumvents the need for explicit computationally expensive PIMD simulations in accurate calculations of the properties of chemical and biological systems. The accuracy and locality of pairwise NN NQE corrections indicate that this approach could be applicable to complex heterogeneous systems, such as proteins.
Affiliation(s)
- Igor V Kurnikov
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Leonid Pereyaslavets
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ganesh Kamath
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Serzhan N Sakipov
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ekaterina Voronina
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Oleg Butin
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Alexey Illarionov
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Igor Leontyev
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Grzegorz Nawrocki
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Mikhail Darkhovskiy
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Michael Olevanov
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ilya Ivahnenko
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - YuChun Chen
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Christopher B Lock
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
| | - Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
| | - Roger D Kornberg
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
| | - Boris Fain
- InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| |
|
18
|
Brown BP, Stein RA, Meiler J, Mchaourab HS. Approximating Projections of Conformational Boltzmann Distributions with AlphaFold2 Predictions: Opportunities and Limitations. J Chem Theory Comput 2024; 20:1434-1447. [PMID: 38215214 PMCID: PMC10867840 DOI: 10.1021/acs.jctc.3c01081] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/29/2023] [Revised: 12/13/2023] [Accepted: 12/13/2023] [Indexed: 01/14/2024]
Abstract
Protein thermodynamics is intimately tied to biological function and can enable processes such as signal transduction, enzyme catalysis, and molecular recognition. The relative free energies of conformations that contribute to these functional equilibria evolved for the physiology of the organism. Despite the importance of these equilibria for understanding biological function and developing treatments for disease, computational and experimental methods capable of quantifying the energetic determinants of these equilibria are limited to systems of modest size. Recently, it has been demonstrated that the artificial intelligence system AlphaFold2 can be manipulated to produce structurally valid protein conformational ensembles. Here, we extend these studies and explore the extent to which AlphaFold2 contact distance distributions can approximate projections of the conformational Boltzmann distributions. For this purpose, we examine the joint probability distributions of inter-residue contact distances along functionally relevant collective variables of several protein systems. Our studies suggest that AlphaFold2 normalized contact distance distributions can correlate with conformation probabilities obtained with other methods but that they suffer from peak broadening. We also find that the AlphaFold2 contact distance distributions can be sensitive to point mutations. Overall, we anticipate that our findings will be valuable as the community seeks to model the thermodynamics of conformational changes in large biomolecular systems.
Affiliation(s)
- Benjamin P. Brown
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37232, United States
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, Tennessee 37232, United States
| | - Richard A. Stein
- Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, Tennessee 37232, United States
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, United States
| | - Jens Meiler
- Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37232, United States
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, Tennessee 37232, United States
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, SAC 04103, Germany
| | - Hassane S. Mchaourab
- Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, Tennessee 37232, United States
- Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, Tennessee 37232, United States
| |
|
19
|
Kapil V, Kovács DP, Csányi G, Michaelides A. First-principles spectroscopy of aqueous interfaces using machine-learned electronic and quantum nuclear effects. Faraday Discuss 2024; 249:50-68. [PMID: 37799072 PMCID: PMC10845015 DOI: 10.1039/d3fd00113j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 07/18/2023] [Indexed: 10/07/2023]
Abstract
Vibrational spectroscopy is a powerful approach to visualising interfacial phenomena. However, extracting structural and dynamical information from vibrational spectra is a challenge that requires first-principles simulations, including non-Condon and quantum nuclear effects. We address this challenge by developing a machine-learning enhanced first-principles framework to speed up predictive modelling of infrared, Raman, and sum-frequency generation spectra. Our approach uses machine learning potentials that encode quantum nuclear effects to efficiently generate quantum trajectories using simple molecular dynamics. In addition, we reformulate bulk and interfacial selection rules to express them unambiguously in terms of the derivatives of polarisation and polarisabilities of the whole system and predict these derivatives efficiently using fully-differentiable machine learning models of dielectric response tensors. We demonstrate our framework's performance by predicting the IR, Raman, and sum-frequency generation spectra of liquid water, ice and the water-air interface, achieving near-quantitative agreement with experiments at nearly the same computational efficiency as pure classical methods. Finally, to aid the experimental discovery of new phases of nanoconfined water, we predict the temperature-dependent vibrational spectra of monolayer water across the solid-hexatic-liquid phase transition.
Affiliation(s)
- Venkat Kapil
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| | | | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, UK
| | - Angelos Michaelides
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK.
| |
|
20
|
Del Razo MJ, Crommelin D, Bolhuis PG. Data-driven dynamical coarse-graining for condensed matter systems. J Chem Phys 2024; 160:024108. [PMID: 38193550 DOI: 10.1063/5.0177553] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 12/05/2023] [Indexed: 01/10/2024]
Abstract
Simulations of condensed matter systems often focus on the dynamics of a few distinguished components but require integrating the full system. A prime example is a molecular dynamics simulation of a (macro)molecule in a solution, where the molecule(s) and the solvent dynamics need to be integrated, rendering the simulations computationally costly and often unfeasible for physically/biologically relevant time scales. Standard coarse-graining approaches can reproduce equilibrium distributions and structural features but do not properly include the dynamics. In this work, we develop a general data-driven coarse-graining methodology inspired by the Mori-Zwanzig formalism, which shows that macroscopic systems with a large number of degrees of freedom can be described by a few relevant variables and additional noise and memory terms. Our coarse-graining method consists of numerical integrators for the distinguished components, where the noise and interaction terms with other system components are substituted by a random variable sampled from a data-driven model. The model is parameterized using data from multiple short-time full-system simulations, and then it is used to run long-time simulations. When our methodology is applied to three systems (a distinguished particle under a harmonic and a bistable potential and a dimer with two metastable configurations), the resulting coarse-grained models are capable of reproducing not only the equilibrium distributions but also the dynamic behavior due to temporal correlations and memory effects. Remarkably, our method even reproduces the transition dynamics between metastable states, which is challenging to capture correctly. Our approach is not constrained to specific dynamics and can be extended to systems beyond Langevin dynamics, and, in principle, even to non-equilibrium dynamics.
Affiliation(s)
- Mauricio J Del Razo
- Department of Mathematics and Computer Science, Freie Universität Berlin, Berlin, Germany
- Van't Hoff Institute for Molecular Sciences, University of Amsterdam, PO Box 94157, 1090GD Amsterdam, The Netherlands
- Korteweg-de Vries Institute for Mathematics, University of Amsterdam, PO Box 94248, 1090GD Amsterdam, The Netherlands
- Dutch Institute for Emergent Phenomena, University of Amsterdam, Amsterdam, The Netherlands
| | - Daan Crommelin
- Korteweg-de Vries Institute for Mathematics, University of Amsterdam, PO Box 94248, 1090GD Amsterdam, The Netherlands
- Centrum Wiskunde & Informatica, 1098 XG Amsterdam, The Netherlands
| | - Peter G Bolhuis
- Van't Hoff Institute for Molecular Sciences, University of Amsterdam, PO Box 94157, 1090GD Amsterdam, The Netherlands
| |
|
21
|
Zong T, Liu X, Zhang X, Yang Q. Efficient characterization of double-cross-linked networks in hydrogels using data-inspired coarse-grained molecular dynamics model. J Chem Phys 2024; 160:024115. [PMID: 38197443 DOI: 10.1063/5.0180847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/12/2023] [Accepted: 12/14/2023] [Indexed: 01/11/2024]
Abstract
The network structure within polymers significantly influences their mechanical properties, including their strength, toughness, and fatigue resistance. All-atom molecular dynamics (AAMD) simulations offer a method to investigate the energy dissipation mechanism within polymers during deformation and fracture; such an approach is, however, computationally inefficient when used to analyze polymers with complex network structures, such as the common chemically double-networked hydrogels. Alternatively, coarse-grained molecular dynamics (CGMD) models, which reduce the computational degrees of freedom by concentrating a set of adjacent atoms into a coarse-grained bead, can be employed. In CGMD simulations, a coarse-grained force field (CGFF) is a critical factor affecting the simulation accuracy. In this paper, we propose a data-based method for predicting the CGFF parameters to improve the simulation efficiency of complex cross-linked networks in polymers. Here, we use a typical chemically double-networked hydrogel as an example. An artificial neural network was selected, and it was trained with the tensile stress-strain data from the CGMD simulations using different CGFF parameters. The CGMD simulations using the predicted CGFF parameters show good agreement with the AAMD simulations and are almost fifty times faster. The data-inspired CGMD model presented here broadens the applicability of molecular dynamics simulations to cross-linked polymers and has the potential to provide insights that will aid the design of polymers with desirable mechanical properties.
Affiliation(s)
- Ting Zong
- Beijing University of Technology, Beijing 100124, China
| | - Xia Liu
- Beijing University of Technology, Beijing 100124, China
| | - Xingyu Zhang
- Beijing University of Technology, Beijing 100124, China
| | | |
|
22
|
Kanada R, Tokuhisa A, Nagasaka Y, Okuno S, Amemiya K, Chiba S, Bekker GJ, Kamiya N, Kato K, Okuno Y. Enhanced Coarse-Grained Molecular Dynamics Simulation with a Smoothed Hybrid Potential Using a Neural Network Model. J Chem Theory Comput 2024; 20:7-17. [PMID: 38148034 DOI: 10.1021/acs.jctc.3c00889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
In all-atom (AA) molecular dynamics (MD) simulations, the rugged energy profile of the force field makes it challenging to reproduce spontaneous structural changes in biomolecules within a reasonable calculation time. Existing coarse-grained (CG) models, in which the energy profile is set to a global minimum around the initial structure, are unsuitable for exploring, without any bias, the structural dynamics between metastable states far away from the initial structure. In this study, we developed a new hybrid potential composed of an artificial intelligence (AI) potential and a minimal CG potential related to the statistical bond length and excluded volume interactions to accelerate the transition dynamics while maintaining the protein character. The AI potential is trained by energy matching using a diverse structural ensemble sampled via multicanonical (Mc) MD simulation and the corresponding AA force field energy, the profile of which is smoothed by energy minimization. By applying the new methodology to chignolin and TrpCage, we showed that the AI potential can predict the AA energy with high accuracy, as indicated by a correlation coefficient (R-value) between the true and predicted energies exceeding 0.89. In addition, we successfully demonstrated that CGMD simulation based on the smoothed hybrid potential can significantly enhance the transition dynamics between various metastable states while preserving protein properties compared to those obtained with conventional CGMD and AAMD.
Affiliation(s)
- Ryo Kanada
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
| | | | | | | | | | - Shuntaro Chiba
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
| | - Gert-Jan Bekker
- Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
| | - Narutoshi Kamiya
- Graduate School of Information Science, University of Hyogo, Kobe, Hyogo 650-0047, Japan
| | - Koichiro Kato
- Graduate School of Engineering, Kyushu University, Fukuoka 819-0395, Japan
- Center for Molecular System, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
| | - Yasushi Okuno
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
- Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
| |
|
23
|
Coste A, Slejko E, Zavadlav J, Praprotnik M. Developing an Implicit Solvation Machine Learning Model for Molecular Simulations of Ionic Media. J Chem Theory Comput 2024; 20:411-420. [PMID: 38118122 PMCID: PMC10782447 DOI: 10.1021/acs.jctc.3c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/04/2023] [Accepted: 12/04/2023] [Indexed: 12/22/2023]
Abstract
Molecular dynamics (MD) simulations of biophysical systems require accurate modeling of their native environment, i.e., aqueous ionic solution, as it critically impacts the structure and function of biomolecules. On the other hand, the models should be computationally efficient to enable simulations of large spatiotemporal scales. Here, we present the deep implicit solvation model for sodium chloride solutions that satisfies both requirements. Owing to the use of the neural network potential, the model can capture the many-body potential of mean force, while the implicit water treatment renders the model inexpensive. We demonstrate our approach first for pure ionic solutions with concentrations ranging from physiological to 2 M. We then extend the model to capture the effective ion interactions both in the vicinity of and far away from a DNA molecule. In both cases, the structural properties are in good agreement with all-atom MD, showcasing a general methodology for the efficient and accurate modeling of ionic media.
Affiliation(s)
- Amaury Coste
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
| | - Ema Slejko
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
| | - Julija Zavadlav
- Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Garching Near Munich DE-85748, Germany
| | - Matej Praprotnik
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
| |
|
24
|
Kehrein J, Sotriffer C. Molecular Dynamics Simulations for Rationalizing Polymer Bioconjugation Strategies: Challenges, Recent Developments, and Future Opportunities. ACS Biomater Sci Eng 2024; 10:51-74. [PMID: 37466304 DOI: 10.1021/acsbiomaterials.3c00636] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/20/2023]
Abstract
The covalent modification of proteins with polymers is a well-established method for improving the pharmacokinetic properties of therapeutically valuable biologics. The conjugated polymer chains of the resulting hybrid represent highly flexible macromolecular structures. As the dynamics of such systems remain rather elusive for established experimental techniques from the field of protein structure elucidation, molecular dynamics simulations have proven to be a valuable tool for studying such conjugates at an atomistic level, thereby complementing experimental studies. With a focus on new developments, this review aims to provide researchers from the polymer bioconjugation field with a concise and up-to-date overview of such approaches. After introducing basic principles of molecular dynamics simulations, as well as methods for and potential pitfalls in modeling bioconjugates, the review illustrates how these computational techniques have contributed to the understanding of bioconjugates and bioconjugation strategies in the recent past and how they may lead to a more rational design of novel bioconjugates in the future.
Affiliation(s)
- Josef Kehrein
- Institute of Pharmacy and Food Chemistry, University of Würzburg, Würzburg 97074, Germany
| | - Christoph Sotriffer
- Institute of Pharmacy and Food Chemistry, University of Würzburg, Würzburg 97074, Germany
| |
|
25
|
Singh S, Sahani H. Current Advancement and Future Prospects: Biomedical Nanoengineering. Curr Radiopharm 2024; 17:120-137. [PMID: 38058099 DOI: 10.2174/0118744710274376231123063135] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 10/19/2023] [Accepted: 10/27/2023] [Indexed: 12/08/2023]
Abstract
Recent advancements in biomedicine have seen a significant reliance on nanoengineering, as traditional methods often fall short in harnessing the unique attributes of biomaterials. Nanoengineering has emerged as a valuable approach to enhance and enrich the performance and functionalities of biomaterials, driving research and development in the field. This review emphasizes the most prevalent biomaterials used in biomedicine, including polymers, nanocomposites, and metallic materials, and explores the pivotal role of nanoengineering in developing biomedical treatments and processes. Particularly, the review highlights research focused on gaining an in-depth understanding of material properties and effectively enhancing material performance through molecular dynamics simulations, all from a nanoengineering perspective.
Affiliation(s)
- Sonia Singh
- Institute of Pharmaceutical Research, GLA University, 17 km Stone, NH-2, Mathura-Delhi Road Mathura, Chaumuhan, Uttar Pradesh, 281406, India
| | - Hrishika Sahani
- Lifecell International Pvt. Ltd., NSP Office, Pearls Business Park, 8th Floor Office No-804, Netaji Subhash Palace Delhi, 110034, India
| |
|
26
|
Airas J, Ding X, Zhang B. Transferable Implicit Solvation via Contrastive Learning of Graph Neural Networks. ACS Central Science 2023; 9:2286-2297. [PMID: 38161379 PMCID: PMC10755853 DOI: 10.1021/acscentsci.3c01160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 10/26/2023] [Accepted: 10/31/2023] [Indexed: 01/03/2024]
Abstract
Implicit solvent models are essential for molecular dynamics simulations of biomolecules, striking a balance between computational efficiency and biological realism. Efforts are underway to develop accurate and transferable implicit solvent models and coarse-grained (CG) force fields in general, guided by a bottom-up approach that matches the CG energy function with the potential of mean force (PMF) defined by the finer system. However, practical challenges arise due to the lack of analytical expressions for the PMF and algorithmic limitations in parameterizing CG force fields. To address these challenges, a machine learning-based approach is proposed, utilizing graph neural networks (GNNs) to represent the solvation free energy and potential contrasting for parameter optimization. We demonstrate the effectiveness of the approach by deriving a transferable GNN implicit solvent model using 600,000 atomistic configurations of six proteins obtained from explicit solvent simulations. The GNN model provides solvation free energy estimations much more accurately than state-of-the-art implicit solvent models, reproducing configurational distributions of explicit solvent simulations. We also demonstrate the reasonable transferability of the GNN model outside of the training data. Our study offers valuable insights for deriving systematically improvable implicit solvent models and CG force fields from a bottom-up perspective.
Affiliation(s)
- Justin Airas
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Xinqiang Ding
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States
- Bin Zhang
- Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139-4307, United States

27
Loose T, Sahrmann PG, Qu TS, Voth GA. Coarse-Graining with Equivariant Neural Networks: A Path Toward Accurate and Data-Efficient Models. J Phys Chem B 2023; 127:10564-10572. [PMID: 38033234 PMCID: PMC10726966 DOI: 10.1021/acs.jpcb.3c05928] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/01/2023] [Revised: 10/30/2023] [Accepted: 11/09/2023] [Indexed: 12/02/2023]
Abstract
Machine learning has recently entered the mainstream of coarse-grained (CG) molecular modeling and simulation. While a variety of methods for incorporating deep learning into these models exist, many of them involve training neural networks to act directly as the CG force field. This has several benefits, of which the most significant is accuracy. Neural networks can inherently incorporate multibody effects during the calculation of CG forces, and a well-trained neural network force field outperforms pairwise basis sets generated from essentially any methodology. However, this comes at a significant cost. First, these models are typically slower than pairwise force fields, even when accounting for specialized hardware, which accelerates the training and integration of such networks. The second cost, and the focus of this paper, is the need for a considerable amount of data to train such force fields. It is common to use tens of microseconds of molecular dynamics data to train a single CG model, which approaches the point of eliminating the CG model's usefulness in the first place. As we investigate in this work, this "data-hunger" trap from neural networks for predicting molecular energies and forces can be remediated in part by incorporating equivariant convolutional operations. We demonstrate that, for CG water, networks that incorporate equivariant convolutional operations can produce functional models using data sets as small as a single frame of reference data, while networks without these operations cannot.
Affiliation(s)
- Thomas S. Qu
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States

28
Fonseca G, Poltavsky I, Tkatchenko A. Force Field Analysis Software and Tools (FFAST): Assessing Machine Learning Force Fields under the Microscope. J Chem Theory Comput 2023; 19:8706-8717. [PMID: 38011895 PMCID: PMC10720330 DOI: 10.1021/acs.jctc.3c00985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023]
Abstract
As the sophistication of machine learning force fields (MLFF) increases to match the complexity of extended molecules and materials, so does the need for tools to properly analyze and assess the practical performance of MLFFs. To go beyond average error metrics and into a complete picture of a model's applicability and limitations, we developed FFAST (force field analysis software and tools): a cross-platform software package designed to gain detailed insights into a model's performance and limitations, complete with an easy-to-use graphical user interface. The software allows the user to gauge the performance of any molecular force field, such as popular state-of-the-art MLFF models, on various popular data set types, providing general prediction error overviews, outlier detection mechanisms, atom-projected errors, and more. It has a 3D visualizer to find and picture problematic configurations, atoms, or clusters in a large data set. In this paper, the example of the MACE and NequIP models is used on two data sets of interest [stachyose and docosahexaenoic acid (DHA)] to illustrate the use cases of the software. With this, it was found that carbons and oxygens involved in or near glycosidic bonds inside the stachyose molecule present increased prediction errors. In addition, prediction errors on DHA rise as the molecule folds, especially for the carboxylic group at the edge of the molecule. We emphasize the need for a systematic assessment of MLFF models for ensuring their successful application to the study of dynamics of molecules and materials.
Affiliation(s)
- Gregory Fonseca
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg

29
Drici N. The influence of the hydrogen-bond network on the structure and dynamics of the RAPRKKG heptapeptide and its mutants. J Mol Graph Model 2023; 125:108598. [PMID: 37586130 DOI: 10.1016/j.jmgm.2023.108598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2023] [Revised: 08/02/2023] [Accepted: 08/08/2023] [Indexed: 08/18/2023]
Abstract
The structural behaviour of the RAPRKKG heptapeptide after individual or multiple mutations was inspected through molecular dynamics simulation. The nature of the mutations provided information on the flexibility of the heptapeptide and on how water molecules establish hydrogen bonds with it. The structural behaviour of the wild-type and the mutated structures was measured through the analysis of protein‒protein and protein‒solvent hydrogen bonds. The conformational behaviours of the different structures were analysed through free energy landscape analysis. The flexibility characteristics of the mutants seem to depend on the reorganization of water molecules and their static or dynamic behaviour around amino acid side chains.
Affiliation(s)
- Nedjoua Drici
- University of Mostaganem, Abdelhamid Ibn Badis, Faculty of Exact Sciences and Informatics, Chemin des cretes ex INES, Mostaganem, 27000, Algeria; Laboratoire de Chimie Physique Macromoleculaire LCPM, University of Oran1 Ahmed benbella, Oran, 31000, Algeria.

30
Gao P, Zhang Q, Keely D, Cleveland DW, Ye Y, Zheng W, Shen M, Yu H. Molecular Graph-Based Deep Learning Algorithm Facilitates an Imaging-Based Strategy for Rapid Discovery of Small Molecules Modulating Biomolecular Condensates. J Med Chem 2023; 66:15084-15093. [PMID: 37937963 PMCID: PMC10810226 DOI: 10.1021/acs.jmedchem.3c00490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2023]
Abstract
Biomolecular condensates are proposed to cause diseases, such as cancer and neurodegeneration, by concentrating proteins at abnormal subcellular loci. Imaging-based compound screens have been used to identify small molecules that reverse or promote biomolecular condensates. However, limitations of conventional imaging-based methods restrict the screening scale. Here, we used a graph convolutional network (GCN)-based computational approach and identified small molecule candidates that reduce the nuclear liquid-liquid phase separation of TAR DNA-binding protein 43 (TDP-43), an essential protein that undergoes phase transition in neurodegenerative diseases. We demonstrated that the GCN-based deep learning algorithm is suitable for spatial information extraction from the molecular graph. Thus, this is a promising method to identify small molecule candidates with novel scaffolds. Furthermore, we validated that these candidates do not affect the normal splicing function of TDP-43. Taken together, a combination of an imaging-based screen and a GCN-based deep learning method dramatically improves the speed and accuracy of the compound screen for biomolecular condensates.
Affiliation(s)
- Peng Gao
- The National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), MD 20850, USA
- Qi Zhang
- The National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), MD 20850, USA
- Devin Keely
- Center for Alzheimer’s and Neurodegenerative Diseases, Department of Molecular Biology, Peter O’Donnell Jr. Brain Institute, UT Southwestern Medical Center, TX, 75287, USA
- Don W. Cleveland
- Department of Cellular and Molecular Medicine, UC San Diego, CA, 92093, USA
- Yihong Ye
- National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), MD 20850, USA
- Wei Zheng
- The National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), MD 20850, USA
- Min Shen
- The National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), MD 20850, USA
- Haiyang Yu
- Center for Alzheimer’s and Neurodegenerative Diseases, Department of Molecular Biology, Peter O’Donnell Jr. Brain Institute, UT Southwestern Medical Center, TX, 75287, USA

31
Jones MS, Shmilovich K, Ferguson AL. DiAMoNDBack: Diffusion-Denoising Autoregressive Model for Non-Deterministic Backmapping of Cα Protein Traces. J Chem Theory Comput 2023; 19:7908-7923. [PMID: 37906711 DOI: 10.1021/acs.jctc.3c00840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2023]
Abstract
Coarse-grained molecular models of proteins permit access to length and time scales unattainable by all-atom models and the simulation of processes that occur on long time scales, such as aggregation and folding. The reduced resolution realizes computational accelerations, but an atomistic representation can be vital for a complete understanding of mechanistic details. Backmapping is the process of restoring all-atom resolution to coarse-grained molecular models. In this work, we report DiAMoNDBack (Diffusion-denoising Autoregressive Model for Non-Deterministic Backmapping) as an autoregressive denoising diffusion probability model to restore all-atom details to coarse-grained protein representations retaining only Cα coordinates. The autoregressive generation process proceeds from the protein N-terminus to C-terminus in a residue-by-residue fashion conditioned on the Cα trace and previously backmapped backbone and side-chain atoms within the local neighborhood. The local and autoregressive nature of our model makes it transferable between proteins. The stochastic nature of the denoising diffusion process means that the model generates a realistic ensemble of backbone and side-chain all-atom configurations consistent with the coarse-grained Cα trace. We train DiAMoNDBack over 65k+ structures from the Protein Data Bank (PDB) and validate it in applications to a hold-out PDB test set, intrinsically disordered protein structures from the Protein Ensemble Database (PED), molecular dynamics simulations of fast-folding mini-proteins from DE Shaw Research, and coarse-grained simulation data. We achieve state-of-the-art reconstruction performance in terms of correct bond formation, avoidance of side-chain clashes, and the diversity of the generated side-chain configurational states. We make the DiAMoNDBack model publicly available as a free and open-source Python package.
Affiliation(s)
- Michael S Jones
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Kirill Shmilovich
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
- Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States

32
Navarro C, Majewski M, De Fabritiis G. Top-Down Machine Learning of Coarse-Grained Protein Force Fields. J Chem Theory Comput 2023; 19:7518-7526. [PMID: 37874270 PMCID: PMC10777392 DOI: 10.1021/acs.jctc.3c00638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Indexed: 10/25/2023]
Abstract
Developing accurate and efficient coarse-grained representations of proteins is crucial for understanding their folding, function, and interactions over extended time scales. Our methodology involves simulating proteins with molecular dynamics and utilizing the resulting trajectories to train a neural network potential through differentiable trajectory reweighting. Remarkably, this method requires only the native conformation of proteins, eliminating the need for labeled data derived from extensive simulations or memory-intensive end-to-end differentiable simulations. Once trained, the model can be employed to run parallel molecular dynamics simulations and sample folding events for proteins both within and beyond the training distribution, showcasing its extrapolation capabilities. By applying Markov state models, native-like conformations of the simulated proteins can be predicted from the coarse-grained simulations. Owing to its theoretical transferability and ability to use solely experimental static structures as training data, we anticipate that this approach will prove advantageous for developing new protein force fields and further advancing the study of protein dynamics, folding, and interactions.
Affiliation(s)
- Carles Navarro
- Acellera Labs, Doctor Trueta 183, 08005 Barcelona, Spain
- Gianni De Fabritiis
- Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003 Barcelona, Spain
- Acellera Ltd., Devonshire House 582, Middlesex HA7 1JS, United Kingdom
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain

33
Bernardi A, Bennett WFD, He S, Jones D, Kirshner D, Bennion BJ, Carpenter TS. Advances in Computational Approaches for Estimating Passive Permeability in Drug Discovery. Membranes 2023; 13:851. [PMID: 37999336 PMCID: PMC10673305 DOI: 10.3390/membranes13110851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Revised: 10/19/2023] [Accepted: 10/21/2023] [Indexed: 11/25/2023]
Abstract
Passive permeation of cellular membranes is a key feature of many therapeutics. The relevance of passive permeability spans all biological systems as they all employ biomembranes for compartmentalization. A variety of computational techniques are currently utilized and under active development to facilitate the characterization of passive permeability. These methods include lipophilicity relations, molecular dynamics simulations, and machine learning, which vary in accuracy, complexity, and computational cost. This review briefly introduces the underlying theories, such as the prominent inhomogeneous solubility diffusion model, and covers a number of recent applications. Various machine-learning applications, which have demonstrated good potential for high-volume, data-driven permeability predictions, are also discussed. Due to the confluence of novel computational methods and next-generation exascale computers, we anticipate an exciting future for computationally driven permeability predictions.
Affiliation(s)
- Timothy S. Carpenter
- Lawrence Livermore National Laboratory, Livermore, CA 94550, USA; (A.B.); (W.F.D.B.); (S.H.); (D.J.); (D.K.); (B.J.B.)

34
Borges-Araújo L, Patmanidis I, Singh AP, Santos LHS, Sieradzan AK, Vanni S, Czaplewski C, Pantano S, Shinoda W, Monticelli L, Liwo A, Marrink SJ, Souza PCT. Pragmatic Coarse-Graining of Proteins: Models and Applications. J Chem Theory Comput 2023; 19:7112-7135. [PMID: 37788237 DOI: 10.1021/acs.jctc.3c00733] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/05/2023]
Abstract
The molecular details involved in the folding, dynamics, organization, and interaction of proteins with other molecules are often difficult to assess by experimental techniques. Consequently, computational models play an ever-increasing role in the field. However, biological processes involving large-scale protein assemblies or long time scale dynamics are still computationally expensive to study in atomistic detail. For these applications, employing coarse-grained (CG) modeling approaches has become a key strategy. In this Review, we provide an overview of what we call pragmatic CG protein models, which are strategies combining, at least in part, a physics-based implementation and a top-down experimental approach to their parametrization. In particular, we focus on CG models in which most protein residues are represented by at least two beads, allowing these models to retain some degree of chemical specificity. A description of the main modern pragmatic protein CG models is provided, including a review of the most recent applications and an outlook on future perspectives in the field.
Affiliation(s)
- Luís Borges-Araújo
- Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Ilias Patmanidis
- Department of Chemistry, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
- Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Akhil P Singh
- Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
- Lucianna H S Santos
- Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Adam K Sieradzan
- Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Stefano Vanni
- Department of Biology, University of Fribourg, Chemin du Musée 10, Fribourg CH-1700, Switzerland
- Institut de Pharmacologie Moléculaire et Cellulaire, Université Côte d'Azur, Inserm, CNRS, 06560 Valbonne, France
- Cezary Czaplewski
- Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Sergio Pantano
- Biomolecular Simulations Group, Institut Pasteur de Montevideo, Montevideo 11400, Uruguay
- Wataru Shinoda
- Research Institute for Interdisciplinary Science, Okayama University, 3-1-1 Tsushima-naka, Kita, Okayama 700-8530, Japan
- Luca Monticelli
- Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France
- Adam Liwo
- Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, 80-308 Gdansk, Poland
- Siewert J Marrink
- Groningen Biomolecular Sciences and Biotechnology Institute and Zernike Institute for Advanced Materials, University of Groningen, Nijenborgh 7, 9747 AG Groningen, The Netherlands
- Paulo C T Souza
- Molecular Microbiology and Structural Biochemistry (MMSB, UMR 5086), CNRS, University of Lyon, 7 Passage du Vercors, 69007 Lyon, France

35
Peng Y, Pak AJ, Durumeric AEP, Sahrmann PG, Mani S, Jin J, Loose TD, Beiter J, Voth GA. OpenMSCG: A Software Tool for Bottom-Up Coarse-Graining. J Phys Chem B 2023; 127:8537-8550. [PMID: 37791670 PMCID: PMC10577682 DOI: 10.1021/acs.jpcb.3c04473] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/03/2023] [Revised: 09/05/2023] [Indexed: 10/05/2023]
Abstract
The "bottom-up" approach to coarse-graining, for building accurate and efficient computational models to simulate large-scale and complex phenomena and processes, is an important approach in computational chemistry, biophysics, and materials science. As one example, the Multiscale Coarse-Graining (MS-CG) approach to developing CG models can be rigorously derived using statistical mechanics applied to fine-grained, i.e., all-atom simulation data for a given system. Under a number of circumstances, a systematic procedure, such as MS-CG modeling, is particularly valuable. Here, we present the development of the OpenMSCG software, a modularized open-source software that provides a collection of successful and widely applied bottom-up CG methods, including Boltzmann Inversion (BI), Force-Matching (FM), Ultra-Coarse-Graining (UCG), Relative Entropy Minimization (REM), Essential Dynamics Coarse-Graining (EDCG), and Heterogeneous Elastic Network Modeling (HeteroENM). OpenMSCG is a high-performance and comprehensive toolset that can be used to derive CG models from large-scale fine-grained simulation data in file formats from common molecular dynamics (MD) software packages, such as GROMACS, LAMMPS, and NAMD. OpenMSCG is modularized in the Python programming framework, which allows users to create and customize modeling "recipes" for reproducible results, thus greatly improving the reliability, reproducibility, and sharing of bottom-up CG models and their applications.
Affiliation(s)
- Yuxing Peng
- NVIDIA Corporation, 2788 San Tomas Expressway, Santa Clara, California 95051, United States
- Alexander J. Pak
- Department of Chemical and Biological Engineering, Colorado School of Mines, Golden, Colorado 80401, United States
- Patrick G. Sahrmann
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Sriramvignesh Mani
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Jaehyeok Jin
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Jeriann Beiter
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
- Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States

36
Ivanov M, Posysoev M, Lyubartsev AP. Coarse-Grained Modeling Using Neural Networks Trained on Structural Data. J Chem Theory Comput 2023; 19:6704-6717. [PMID: 37712507 PMCID: PMC10569054 DOI: 10.1021/acs.jctc.3c00516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Indexed: 09/16/2023]
Abstract
We propose a method of bottom-up coarse-graining, in which interactions within a coarse-grained model are determined by an artificial neural network trained on structural data obtained from multiple atomistic simulations. The method uses ideas of the inverse Monte Carlo approach, relating changes in the neural network weights with changes in average structural properties, such as radial distribution functions. As a proof of concept, we demonstrate the method on a system interacting by a Lennard-Jones potential modeled by a simple linear network and a single-site coarse-grained model of methanol-water solutions. In the latter case, we implement a nonlinear neural network with intermediate layers trained by atomistic simulations carried out at different methanol concentrations. We show that such a network acts as a transferable potential at the coarse-grained resolution for a wide range of methanol concentrations, including those not included in the training set.
Affiliation(s)
- Mikhail Ivanov
- Department of Materials and Environmental Chemistry, Stockholm University, SE-106 91 Stockholm, Sweden
- Maksim Posysoev
- Department of Materials and Environmental Chemistry, Stockholm University, SE-106 91 Stockholm, Sweden
- Alexander P. Lyubartsev
- Department of Materials and Environmental Chemistry, Stockholm University, SE-106 91 Stockholm, Sweden

37
Lederer J, Gastegger M, Schütt KT, Kampffmeyer M, Müller KR, Unke OT. Automatic identification of chemical moieties. Phys Chem Chem Phys 2023; 25:26370-26379. [PMID: 37750554 PMCID: PMC10548786 DOI: 10.1039/d3cp03845a] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/11/2023] [Accepted: 08/18/2023] [Indexed: 09/27/2023]
Abstract
In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or be learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.
Affiliation(s)
- Jonas Lederer
- Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Michael Gastegger
- Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Kristof T Schütt
- Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Michael Kampffmeyer
- Department of Physics and Technology, UiT The Arctic University of Norway, 9019 Tromsø, Norway
- Klaus-Robert Müller
- Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Google Deepmind, Germany
- Department of Artificial Intelligence, Korea University, Seoul 136-713, Korea
- Max Planck Institut für Informatik, 66123 Saarbrücken, Germany
- Oliver T Unke
- Berlin Institute of Technology (TU Berlin), 10587 Berlin, Germany.
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Germany
- Google Deepmind, Germany

38
Conflitti P, Raniolo S, Limongelli V. Perspectives on Ligand/Protein Binding Kinetics Simulations: Force Fields, Machine Learning, Sampling, and User-Friendliness. J Chem Theory Comput 2023; 19:6047-6061. [PMID: 37656199 PMCID: PMC10536999 DOI: 10.1021/acs.jctc.3c00641] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Indexed: 09/02/2023]
Abstract
Computational techniques applied to drug discovery have gained considerable popularity for their ability to filter potentially active drugs from inactive ones, reducing the time scale and costs of preclinical investigations. The main focus of these studies has historically been the search for compounds endowed with high affinity for a specific molecular target to ensure the formation of stable and long-lasting complexes. Recent evidence has also correlated the in vivo drug efficacy with its binding kinetics, thus opening new fascinating scenarios for ligand/protein binding kinetic simulations in drug discovery. The present article examines the state of the art in the field, providing a brief summary of the most popular and advanced ligand/protein binding kinetics techniques and evaluating their current limitations and the potential solutions to reach more accurate kinetic models. Particular emphasis is put on the need for a paradigm change in the present methodologies toward ligand and protein parametrization, the force field problem, characterization of the transition states, the sampling issue, and algorithms' performance, user-friendliness, and data openness.
Affiliation(s)
- Paolo Conflitti
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Stefano Raniolo
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Vittorio Limongelli
- Faculty of Biomedical Sciences, Euler Institute, Università della Svizzera italiana (USI), 6900 Lugano, Switzerland
- Department of Pharmacy, University of Naples “Federico II”, 80131 Naples, Italy

39
Arts M, Garcia Satorras V, Huang CW, Zügner D, Federici M, Clementi C, Noé F, Pinsler R, van den Berg R. Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics. J Chem Theory Comput 2023; 19:6151-6159. [PMID: 37688551 DOI: 10.1021/acs.jctc.3c00702] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 09/11/2023]
Abstract
Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields, and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several protein simulations for systems up to 56 amino acids, reproducing the CG equilibrium distribution and preserving the dynamics of all-atom simulations such as protein folding events.
Affiliation(s)
- Marloes Arts
  - Department of Computer Science, University of Copenhagen, Universitetsparken 1, Copenhagen 2100, Denmark
- Victor Garcia Satorras
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Chin-Wei Huang
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
- Daniel Zügner
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Marco Federici
  - Informatics Institute, University of Amsterdam, Science Park 904, Amsterdam 1098 XH, The Netherlands
- Cecilia Clementi
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
  - Department of Physics, Freie Universität Berlin, Arnimallee 12, Berlin 14195, Germany
- Frank Noé
  - AI4Science, Microsoft Research, Karl-Liebknecht-Straße 32, Berlin 10178, Germany
- Robert Pinsler
  - AI4Science, Microsoft Research, 21 Station Road, Cambridge CB1 2FB, U.K.
- Rianne van den Berg
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, Amsterdam 1118 CZ, The Netherlands
40
Faure Beaulieu Z, Nicholas TC, Gardner JLA, Goodwin AL, Deringer VL. Coarse-grained versus fully atomistic machine learning for zeolitic imidazolate frameworks. Chem Commun (Camb) 2023; 59:11405-11408. [PMID: 37668310 PMCID: PMC10513772 DOI: 10.1039/d3cc02265j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 08/22/2023] [Indexed: 09/06/2023]
Abstract
Zeolitic imidazolate frameworks are widely thought of as being analogous to inorganic AB2 phases. We test the validity of this assumption by comparing simplified and fully atomistic machine-learning models for local environments in ZIFs. Our work addresses the central question to what extent chemical information can be "coarse-grained" in hybrid framework materials.
Affiliation(s)
- Zoé Faure Beaulieu
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Thomas C Nicholas
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- John L A Gardner
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Andrew L Goodwin
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Volker L Deringer
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
41
Majewski M, Pérez A, Thölke P, Doerr S, Charron NE, Giorgino T, Husic BE, Clementi C, Noé F, De Fabritiis G. Machine learning coarse-grained potentials of protein thermodynamics. Nat Commun 2023; 14:5739. [PMID: 37714883 PMCID: PMC10504246 DOI: 10.1038/s41467-023-41343-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 19.0] [Received: 06/02/2023] [Accepted: 08/29/2023] [Indexed: 09/17/2023] Open
Abstract
A generalized understanding of protein dynamics is an unsolved scientific problem, the solution of which is critical to the interpretation of the structure-function relationships that govern essential biological processes. Here, we approach this problem by constructing coarse-grained molecular potentials based on artificial neural networks and grounded in statistical mechanics. For training, we build a unique dataset of unbiased all-atom molecular dynamics simulations of approximately 9 ms for twelve different proteins with multiple secondary structure arrangements. The coarse-grained models are capable of accelerating the dynamics by more than three orders of magnitude while preserving the thermodynamics of the systems. Coarse-grained simulations identify relevant structural states in the ensemble with comparable energetics to the all-atom systems. Furthermore, we show that a single coarse-grained potential can integrate all twelve proteins and can capture experimental structural features of mutated proteins. These results indicate that machine learning coarse-grained potentials could provide a feasible approach to simulate and understand protein dynamics.
Affiliation(s)
- Maciej Majewski
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
  - Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Adrià Pérez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
  - Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Philipp Thölke
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
- Stefan Doerr
  - Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
- Nicholas E Charron
  - Department of Physics, Rice University, Houston, TX, 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
  - Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
- Toni Giorgino
  - Biophysics Institute, National Research Council (CNR-IBF), 20133, Milan, Italy
- Brooke E Husic
  - Department of Mathematics and Computer Science, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
  - Lewis Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, 08540, USA
  - Princeton Center for Theoretical Science, Princeton University, Princeton, NJ, 08540, USA
  - Center for the Physics of Biological Function, Princeton University, Princeton, NJ, 08540, USA
- Cecilia Clementi
  - Department of Physics, Rice University, Houston, TX, 77005, USA
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, 77005, USA
  - Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
  - Department of Chemistry, Rice University, Houston, TX, 77005, USA
- Frank Noé
  - Department of Physics, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
  - Department of Mathematics and Computer Science, FU Berlin, Arnimallee 12, 14195, Berlin, Germany
  - Department of Chemistry, Rice University, Houston, TX, 77005, USA
  - Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178, Berlin, Germany
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), Carrer Dr. Aiguader 88, 08003, Barcelona, Spain
  - Acellera Labs, Doctor Trueta 183, 08005, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010, Barcelona, Spain
42
Danielsson A, Samsonov SA, Liwo A, Sieradzan AK. Extension of the SUGRES-1P Coarse-Grained Model of Polysaccharides to Heparin. J Chem Theory Comput 2023; 19:6023-6036. [PMID: 37587433 PMCID: PMC10500997 DOI: 10.1021/acs.jctc.3c00511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Indexed: 08/18/2023]
Abstract
Heparin is an unbranched periodic polysaccharide composed of negatively charged monomers and involved in key biological processes, including anticoagulation, angiogenesis, and inflammation. Its structure and dynamics have been studied extensively using experimental as well as theoretical approaches. The conventional approach of computational chemistry applied to the analysis of biomolecules is all-atom molecular dynamics, which captures the interactions of individual atoms by solving Newton's equation of motion. An alternative is molecular dynamics simulations using coarse-grained models of biomacromolecules, which offer a reduction of the representation and consequently enable us to extend the time and size scale of simulations by orders of magnitude. In this work, we extend the UNIfied COarse-gRaiNed (UNICORN) model of biological macromolecules developed in our laboratory to heparin. We carried out extensive tests to estimate the optimal weights of energy terms of the effective energy function as well as the optimal Debye-Hückel screening factor for electrostatic interactions. We applied the model to study unbound heparin molecules of polymerization degree ranging from 6 to 68 residues. We compare the obtained coarse-grained heparin conformations with models obtained from X-ray diffraction studies of heparin. The SUGRES-1P force field was able to accurately predict the general shape and global characteristics of heparin molecules.
Affiliation(s)
- Annemarie Danielsson
  - Faculty of Chemistry, University of Gdansk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
- Sergey A. Samsonov
  - Faculty of Chemistry, University of Gdansk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
- Adam Liwo
  - Faculty of Chemistry, University of Gdansk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
- Adam K. Sieradzan
  - Faculty of Chemistry, University of Gdansk, ul. Wita Stwosza 63, 80-308 Gdańsk, Poland
43
Liu S, Wang C, Latham AP, Ding X, Zhang B. OpenABC enables flexible, simplified, and efficient GPU accelerated simulations of biomolecular condensates. PLoS Comput Biol 2023; 19:e1011442. [PMID: 37695778 PMCID: PMC10513381 DOI: 10.1371/journal.pcbi.1011442] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/21/2023] [Revised: 09/21/2023] [Accepted: 08/19/2023] [Indexed: 09/13/2023] Open
Abstract
Biomolecular condensates are important structures in various cellular processes but are challenging to study using traditional experimental techniques. In silico simulations with residue-level coarse-grained models strike a balance between computational efficiency and chemical accuracy. They could offer valuable insights by connecting the emergent properties of these complex systems with molecular sequences. However, existing coarse-grained models often lack easy-to-follow tutorials and are implemented in software that is not optimal for condensate simulations. To address these issues, we introduce OpenABC, a software package that greatly simplifies the setup and execution of coarse-grained condensate simulations with multiple force fields using Python scripting. OpenABC seamlessly integrates with the OpenMM molecular dynamics engine, enabling efficient simulations with performance on a single GPU that rivals the speed achieved by hundreds of CPUs. We also provide tools that convert coarse-grained configurations to all-atom structures for atomistic simulations. We anticipate that OpenABC will significantly facilitate the adoption of in silico simulations by a broader community to investigate the structural and dynamical properties of condensates.
Affiliation(s)
- Shuming Liu
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Cong Wang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Andrew P. Latham
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, California, United States of America
- Xinqiang Ding
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Bin Zhang
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
44
Wellawatte GP, Hocky GM, White AD. Neural potentials of proteins extrapolate beyond training data. J Chem Phys 2023; 159:085103. [PMID: 37642255 PMCID: PMC10474891 DOI: 10.1063/5.0147240] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/20/2023] [Accepted: 07/31/2023] [Indexed: 08/31/2023] Open
Abstract
We evaluate neural network (NN) coarse-grained (CG) force fields compared to traditional CG molecular mechanics force fields. We conclude that NN force fields are able to extrapolate and sample from unseen regions of the free energy surface when trained with limited data. Our results come from 88 NN force fields trained on different combinations of clustered free energy surfaces from four protein mapped trajectories. We used a statistical measure named total variation similarity to assess the agreement between reference free energy surfaces from mapped atomistic simulations and CG simulations from trained NN force fields. Our conclusions support the hypothesis that NN CG force fields trained with samples from one region of the proteins' free energy surface can, indeed, extrapolate to unseen regions. Additionally, the force matching error was found to only be weakly correlated with a force field's ability to reconstruct the correct free energy surface.
Affiliation(s)
- Geemi P. Wellawatte
  - Department of Chemistry, University of Rochester, Rochester, New York 14627, USA
- Glen M. Hocky
  - Department of Chemistry, Simons Center for Computational Physical Chemistry, New York University, New York, New York 10003, USA
- Andrew D. White
  - Department of Chemical Engineering, University of Rochester, Rochester, New York 14627, USA
45
Zaporozhets I, Clementi C. Multibody Terms in Protein Coarse-Grained Models: A Top-Down Perspective. J Phys Chem B 2023; 127:6920-6927. [PMID: 37499123 DOI: 10.1021/acs.jpcb.3c04493] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/29/2023]
Abstract
Coarse-grained models allow computational investigation of biomolecular processes occurring on long time and length scales, intractable with atomistic simulation. Traditionally, many coarse-grained models rely mostly on pairwise interaction potentials. However, the decimation of degrees of freedom should, in principle, lead to a complex many-body effective interaction potential. In this work, we use experimental data on mutant stability to parametrize coarse-grained models for two proteins with and without many-body terms. We demonstrate that many-body terms are necessary to reproduce quantitatively the effects of point mutations on protein stability, particularly to implicitly take into account the effect of the solvent.
Affiliation(s)
- Iryna Zaporozhets
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
- Cecilia Clementi
  - Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Physics, Freie Universität, Arnimallee 12, Berlin 14195, Germany
46
Brown BP, Stein RA, Meiler J, Mchaourab H. Approximating conformational Boltzmann distributions with AlphaFold2 predictions. bioRxiv 2023:2023.08.06.552168. [PMID: 37609301 PMCID: PMC10441281 DOI: 10.1101/2023.08.06.552168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/24/2023]
Abstract
Protein dynamics are intimately tied to biological function and can enable processes such as signal transduction, enzyme catalysis, and molecular recognition. The relative free energies of conformations that contribute to these functional equilibria are evolved for the physiology of the organism. Despite the importance of these equilibria for understanding biological function and developing treatments for disease, the computational and experimental methods capable of quantifying them are limited to systems of modest size. Here, we demonstrate that AlphaFold2 contact distance distributions can approximate conformational Boltzmann distributions, which we evaluate through examination of the joint probability distributions of inter-residue contact distances along functionally relevant collective variables of several protein systems. Further, we show that contact distance probability distributions generated by AlphaFold2 are sensitive to point mutations; thus, AF2 can predict the structural effects of mutations in some systems. We anticipate that our approach will be a valuable tool to model the thermodynamics of conformational changes in large biomolecular systems.
Affiliation(s)
- Benjamin P. Brown
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37232, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN 37232, USA
  - Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, TN 37232, USA
- Richard A. Stein
  - Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, TN 37232, USA
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, TN 37232, USA
  - Center for Structural Biology, Vanderbilt University, Nashville, TN 37232, USA
  - Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, TN 37232, USA
  - Institute for Drug Discovery, Leipzig University Medical School, Leipzig, SAC 04103, Germany
- Hassane Mchaourab
  - Center for Structural Biology, Vanderbilt University, Nashville, TN 37232, USA
  - Center for Applied AI in Protein Dynamics, Vanderbilt University, Nashville, TN 37232, USA
  - Department of Molecular Physiology and Biophysics, Vanderbilt University School of Medicine, Nashville, TN 37232, USA
47
Sahrmann P, Loose TD, Durumeric AEP, Voth GA. Utilizing Machine Learning to Greatly Expand the Range and Accuracy of Bottom-Up Coarse-Grained Models through Virtual Particles. J Chem Theory Comput 2023; 19:4402-4413. [PMID: 36802592 PMCID: PMC10373655 DOI: 10.1021/acs.jctc.2c01183] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/22/2022] [Indexed: 02/22/2023]
Abstract
Coarse-grained (CG) models parametrized using atomistic reference data, i.e., "bottom up" CG models, have proven useful in the study of biomolecules and other soft matter. However, the construction of highly accurate, low resolution CG models of biomolecules remains challenging. We demonstrate in this work how virtual particles, CG sites with no atomistic correspondence, can be incorporated into CG models within the context of relative entropy minimization (REM) as latent variables. The methodology presented, variational derivative relative entropy minimization (VD-REM), enables optimization of virtual particle interactions through a gradient descent algorithm aided by machine learning. We apply this methodology to the challenging case of a solvent-free CG model of a 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC) lipid bilayer and demonstrate that introduction of virtual particles captures solvent-mediated behavior and higher-order correlations which REM alone cannot capture in a more standard CG model based only on the mapping of collections of atoms to the CG sites.
Affiliation(s)
- Patrick G. Sahrmann
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Timothy D. Loose
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Aleksander E. P. Durumeric
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
- Gregory A. Voth
  - Department of Chemistry, Chicago Center for Theoretical Chemistry, James Franck Institute, and Institute for Biophysical Dynamics, The University of Chicago, Chicago, Illinois 60637, United States
48
Topel M, Ejaz A, Squires A, Ferguson AL. Learned Reconstruction of Protein Folding Trajectories from Noisy Single-Molecule Time Series. J Chem Theory Comput 2023; 19:4654-4667. [PMID: 36701162 DOI: 10.1021/acs.jctc.2c00920] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
Single-molecule Förster resonance energy transfer (smFRET) is an experimental methodology to track the real-time dynamics of molecules using fluorescent probes to follow one or more intramolecular distances. These distances provide a low-dimensional representation of the full atomistic dynamics. Under mild technical conditions, Takens' Delay Embedding Theorem guarantees that the full three-dimensional atomistic dynamics of a system are diffeomorphic (i.e., related by a smooth and invertible transformation) to a time-delayed embedding of one or more scalar observables. Appealing to these theoretical guarantees, we employ manifold learning, artificial neural networks, and statistical mechanics to learn from molecular simulation training data the a priori unknown transformation between the atomic coordinates and delay-embedded intramolecular distances accessible to smFRET. This learned transformation may then be used to reconstruct atomistic coordinates from smFRET time series data. We term this approach Single-molecule TAkens Reconstruction (STAR). We have previously applied STAR to reconstruct molecular configurations of a C24H50 polymer chain and the mini-protein Chignolin with accuracies better than 0.2 nm from simulated smFRET data under noise free and high time resolution conditions. In the present work, we investigate the role of signal-to-noise ratio, data volume, and time resolution in simulated smFRET data to assess the performance of STAR under conditions more representative of experimental realities. We show that STAR can reconstruct the Chignolin and Villin mini-proteins to accuracies of 0.12 and 0.42 nm, respectively, and place bounds on these conditions for accurate reconstructions. These results demonstrate that it is possible to reconstruct dynamical trajectories of protein folding from time series in noisy, time binned, experimentally measurable observables and lay the foundations for the application of STAR to real experimental data.
Affiliation(s)
- Maximilian Topel
  - Department of Physics, University of Chicago, Chicago, Illinois 60637, United States
- Ayesha Ejaz
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Allison Squires
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
- Andrew L Ferguson
  - Pritzker School of Molecular Engineering, University of Chicago, Chicago, Illinois 60637, United States
49
Zhang H, Saravanan KM, Zhang JZH. DeepBindGCN: Integrating Molecular Vector Representation with Graph Convolutional Neural Networks for Protein-Ligand Interaction Prediction. Molecules 2023; 28:4691. [PMID: 37375246 DOI: 10.3390/molecules28124691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Revised: 06/08/2023] [Accepted: 06/09/2023] [Indexed: 06/29/2023] Open
Abstract
The core of large-scale drug virtual screening is to select high-affinity binders accurately and efficiently from large libraries of small molecules in which non-binders are usually dominant. The binding affinity is significantly influenced by the protein pocket, ligand spatial information, and residue types/atom types. Here, we used the pocket residues or ligand atoms as the nodes and constructed edges with the neighboring information to comprehensively represent the protein pocket or ligand information. Moreover, the model with pre-trained molecular vectors performed better than the one-hot representation. The main advantage of DeepBindGCN is that it is independent of docking conformation and concisely keeps the spatial information and physical-chemical features. Using TIPE3 and PD-L1 dimer as proof-of-concept examples, we proposed a screening pipeline integrating DeepBindGCN and other methods to identify strong-binding-affinity compounds. It is the first time a non-complex-dependent model has achieved a root mean square error (RMSE) value of 1.4190 and a Pearson r value of 0.7584 on the PDBbind v.2016 core set, thereby showing prediction power comparable to that of state-of-the-art affinity prediction models that rely upon the 3D complex. DeepBindGCN provides a powerful tool to predict the protein-ligand interaction and can be used in many important large-scale virtual screening application scenarios.
Affiliation(s)
- Haiping Zhang
  - Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Konda Mani Saravanan
  - Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
- John Z H Zhang
  - Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
  - School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
50
Bhatia H, Aydin F, Carpenter TS, Lightstone FC, Bremer PT, Ingólfsson HI, Nissley DV, Streitz FH. The confluence of machine learning and multiscale simulations. Curr Opin Struct Biol 2023; 80:102569. [PMID: 36966691 DOI: 10.1016/j.sbi.2023.102569] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/14/2022] [Revised: 01/31/2023] [Accepted: 02/08/2023] [Indexed: 06/04/2023]
Abstract
Multiscale modeling has a long history of use in structural biology, as computational biologists strive to overcome the time- and length-scale limits of atomistic molecular dynamics. Contemporary machine learning techniques, such as deep learning, have promoted advances in virtually every field of science and engineering and are revitalizing the traditional notions of multiscale modeling. Deep learning has found success in various approaches for distilling information from fine-scale models, such as building surrogate models and guiding the development of coarse-grained potentials. However, perhaps its most powerful use in multiscale modeling is in defining latent spaces that enable efficient exploration of conformational space. This confluence of machine learning and multiscale simulation with modern high-performance computing promises a new era of discovery and innovation in structural biology.
Affiliation(s)
- Harsh Bhatia
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA. https://twitter.com/@harshbhatia85
- Fikret Aydin
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Timothy S Carpenter
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Felice C Lightstone
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Peer-Timo Bremer
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Helgi I Ingólfsson
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Dwight V Nissley
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, MD, 21701, USA
- Frederick H Streitz
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA