1
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of range-separated machine learning force fields. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with errors of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
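As a rough illustration of how per-atom asymptotic parameters turn into long-range energies, the sketch below sums a point-charge Coulomb term and an undamped C6 dispersion term over all atom pairs between two fragments. It is a simplified stand-in, not the authors' model: multipoles are truncated at the monopole, polarization and damping are ignored, the geometric-mean rule for mixed C6 coefficients is an assumption, and the function name and input layout are hypothetical.

```python
import math

# Standard conversion constant for Coulomb's law in kcal*Angstrom/(mol*e^2).
COULOMB_KCAL = 332.0637


def long_range_energy(frag_a, frag_b):
    """Return (electrostatic, dispersion) energies in kcal/mol.

    Each fragment is a list of atoms given as (x, y, z, charge, c6), with
    coordinates in Angstrom, charge in e, and c6 in kcal*Angstrom^6/mol.
    """
    e_es = 0.0
    e_disp = 0.0
    for xa, ya, za, qa, c6a in frag_a:
        for xb, yb, zb, qb, c6b in frag_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            # Monopole-monopole electrostatics (higher multipoles omitted).
            e_es += COULOMB_KCAL * qa * qb / r
            # Assumed geometric-mean combination rule for the pair coefficient.
            c6_ab = math.sqrt(c6a * c6b)
            # Leading-order, undamped dispersion.
            e_disp -= c6_ab / r**6
    return e_es, e_disp
```

In a workflow like the one described, the charges and C6 values would come from the trained network rather than being supplied by hand; here they are plain inputs so the sketch stays self-contained.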
Affiliation(s)
- Zheng Cheng
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
  - School of Mathematical Sciences, Peking University, Beijing 100871, China
  - DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
  - DP Technology, Beijing 100080, P. R. China
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
  - School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
2
Wang Y, Inizan TJ, Liu C, Piquemal JP, Ren P. Incorporating Neural Networks into the AMOEBA Polarizable Force Field. J Phys Chem B 2024; 128:2381-2388. [PMID: 38445577 PMCID: PMC10985787 DOI: 10.1021/acs.jpcb.3c08166] [Indexed: 03/07/2024]
Abstract
Neural network potentials (NNPs) offer significant promise to bridge the gap between the accuracy of quantum mechanics and the efficiency of molecular mechanics in molecular simulation. Most NNPs rely on a locality assumption that ensures the model's transferability and scalability, and thus lack a treatment of long-range interactions, which are essential for molecular systems in the condensed phase. Here we present an integrated hybrid model, AMOEBA+NN, which combines the AMOEBA potential for the short- and long-range noncovalent atomic interactions and an NNP to capture the remaining local covalent contributions. The AMOEBA+NN model was trained on the conformational energy of the ANI-1x data set and tested on several external data sets ranging from small molecules to tetrapeptides. The hybrid model demonstrated substantial improvements over the baseline models in terms of accuracy as the molecule size increased, suggesting its potential as a next-generation approach for chemically accurate molecular simulations.
Affiliation(s)
- Yanxing Wang
  - Department of Biomedical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
- Théo Jaffrelot Inizan
  - Sorbonne Université, Laboratoire de Chimie Théorique, UMR 7616 CNRS, Paris 75005, France
- Chengwen Liu
  - Department of Biomedical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
- Jean-Philip Piquemal
  - Sorbonne Université, Laboratoire de Chimie Théorique, UMR 7616 CNRS, Paris 75005, France
- Pengyu Ren
  - Department of Biomedical Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
3
Ding Y, Huang J. Implementation and Validation of an OpenMM Plugin for the Deep Potential Representation of Potential Energy. Int J Mol Sci 2024; 25:1448. [PMID: 38338727 PMCID: PMC10855459 DOI: 10.3390/ijms25031448] [Received: 12/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024]
Abstract
Machine learning potentials, particularly the deep potential (DP) model, have revolutionized molecular dynamics (MD) simulations, striking a balance between accuracy and computational efficiency. To facilitate the DP model's integration with the popular MD engine OpenMM, we have developed a versatile OpenMM plugin. This plugin supports a range of applications, from conventional MD simulations to alchemical free energy calculations and hybrid DP/MM simulations. Our extensive validation tests encompassed energy conservation in microcanonical ensemble simulations, fidelity in canonical ensemble generation, and the evaluation of the structural, transport, and thermodynamic properties of bulk water. The introduction of this plugin is expected to significantly expand the application scope of DP models within the MD simulation community, representing a major advancement in the field.
Affiliation(s)
- Ye Ding
  - College of Life Sciences, Zhejiang University, Hangzhou 310027, China
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
- Jing Huang
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
4
Chen J, Yu K. PhyNEO: A Neural-Network-Enhanced Physics-Driven Force Field Development Workflow for Bulk Organic Molecule and Polymer Simulations. J Chem Theory Comput 2024; 20:253-265. [PMID: 38118076 DOI: 10.1021/acs.jctc.3c01045] [Indexed: 12/22/2023]
Abstract
An accurate, generalizable, and transferable force field plays a crucial role in the molecular dynamics simulations of organic polymers and biomolecules. Conventional empirical force fields often fail to capture precise intermolecular interactions because they neglect important physics, such as polarization, charge penetration, and many-body dispersion. Moreover, the parameterization of these force fields relies heavily on top-down fitting, limiting their transferability to new systems where experimental data are often unavailable. To address these challenges, we introduce a general and fully ab initio force field construction strategy, named PhyNEO. It features a hybrid approach that combines physics-driven and data-driven methods and is able to generate a bulk potential with chemical accuracy using only quantum chemistry data of very small clusters. Careful separation of long-/short-range interactions and of nonbonding/bonding interactions is the key to the success of PhyNEO. With this strategy, we mitigate the limitations of purely data-driven methods in long-range interactions, thus largely increasing the data efficiency and the scalability of machine learning models. The new approach is thoroughly tested on poly(ethylene oxide) and polyethylene glycol systems, giving superior accuracy in both microscopic and bulk properties compared to conventional force fields. This work thus offers a promising framework for the development of advanced force fields in a wide range of organic molecular systems.
Affiliation(s)
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
  - Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
5
Chen Z, Yang W. Development of a machine learning finite-range nonlocal density functional. J Chem Phys 2024; 160:014105. [PMID: 38180254 DOI: 10.1063/5.0179149] [Received: 09/29/2023] [Accepted: 12/12/2023] [Indexed: 01/06/2024]
Abstract
Kohn-Sham density functional theory has been the most popular method in electronic structure calculations. To fulfill the increasing accuracy requirements, new approximate functionals are needed to address key issues in existing approximations. It is well known that nonlocal components are crucial. Current nonlocal functionals mostly require orbital dependence, such as in Hartree-Fock exchange and many-body perturbation correlation energy, which leads to higher computational costs. Deviating from this pathway, we describe functional nonlocality in a new approach. By partitioning the total density into atom-centered local densities, a many-body expansion is proposed. This many-body expansion can be truncated at one-body contributions if a base functional is used and an energy correction is approximated. The contribution from each atom-centered local density is a single finite-range nonlocal functional that is universal for all atoms. We then use machine learning to develop this universal atom-centered functional. Parameters in this functional are determined by fitting to data that are produced by high-level theories. Extensive tests on several different test sets, which include reaction energies, reaction barrier heights, and non-covalent interaction energies, show that the new functional, with only the density as the basic variable, can produce results comparable to the best-performing double-hybrid functionals at a lower computational cost. For example, for the thermochemistry test set selected from the GMTKN55 database, the BLYP-based machine learning functional gives a weighted total mean absolute deviation of 3.33 kcal/mol, while DSD-BLYP-D3(BJ) gives 3.28 kcal/mol. This opens a new pathway to nonlocal functional development and applications.
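The one-body truncation described in this abstract can be written schematically as follows. The notation is assumed for illustration and is not taken from the paper: $\rho_A$ denotes the atom-centered local density of atom $A$, $E^{\mathrm{base}}$ the base functional, and $\varepsilon^{(n)}$ the $n$-body terms of the correction.

```latex
\begin{align}
\rho(\mathbf{r}) &= \sum_{A} \rho_A(\mathbf{r}), \\
E[\rho] &= E^{\mathrm{base}}[\rho] + \Delta E[\rho], \\
\Delta E[\rho] &= \sum_{A} \varepsilon^{(1)}[\rho_A]
  + \sum_{A<B} \varepsilon^{(2)}[\rho_A, \rho_B] + \cdots
  \;\approx\; \sum_{A} \varepsilon^{(1)}[\rho_A].
\end{align}
```

Under this reading, the same one-body functional $\varepsilon^{(1)}$ is applied to every atom, which matches the abstract's description of a single universal atom-centered functional fitted by machine learning.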
Affiliation(s)
- Zehua Chen
  - Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
- Weitao Yang
  - Department of Chemistry and Department of Physics, Duke University, Durham, North Carolina 27708, USA
6
Mohanty S, Stevenson J, Browning AR, Jacobson L, Leswing K, Halls MD, Afzal MAF. Development of scalable and generalizable machine learned force field for polymers. Sci Rep 2023; 13:17251. [PMID: 37821501 PMCID: PMC10567837 DOI: 10.1038/s41598-023-43804-5] [Received: 05/22/2023] [Accepted: 09/28/2023] [Indexed: 10/13/2023]
Abstract
Understanding and predicting the properties of polymers is vital to developing tailored polymer molecules for desired applications. Classical force fields may fail to capture key properties, for example, the transport properties of certain polymer systems such as polyethylene glycol. As a solution, we present an alternative potential energy surface, a charge recursive neural network (QRNN) model trained on DFT calculations made on smaller atomic clusters that generalizes well to oligomers comprising larger atomic clusters or longer chains. We demonstrate the validity of the polymer QRNN workflow by modeling the oligomers of ethylene glycol. We apply two rounds of active learning (addition of new training clusters based on current model performance) and implement a novel model training approach that uses partial charges from a semi-empirical method. Our QRNN model for polymers produces stable molecular dynamics (MD) trajectories and captures the dynamics of polymer chains, as indicated by the striking agreement with experimental values. The model can treat much larger systems than DFT simulations allow while providing a more accurate force field than classical force fields, offering a promising avenue for large-scale molecular simulations of polymeric systems.
7
Wang X, Li J, Yang L, Chen F, Wang Y, Chang J, Chen J, Feng W, Zhang L, Yu K. DMFF: An Open-Source Automatic Differentiable Platform for Molecular Force Field Development and Molecular Dynamics Simulation. J Chem Theory Comput 2023; 19:5897-5909. [PMID: 37589304 DOI: 10.1021/acs.jctc.2c01297] [Indexed: 08/18/2023]
Abstract
In the simulation of molecular systems, the underlying force field (FF) model plays an extremely important role in determining the reliability of the simulation. However, the quality of the state-of-the-art molecular force fields is still unsatisfactory in many cases, and the FF parameterization process largely relies on human experience, which is not scalable. To address this issue, we introduce DMFF, an open-source molecular FF development platform based on an automatic differentiation technique. DMFF serves as a powerful tool for both top-down and bottom-up FF development. Using DMFF, both energies/forces and thermodynamic quantities such as ensemble averages and free energies can be evaluated in a differentiable way, realizing an automatic, yet highly flexible FF optimization workflow. DMFF also eases the evaluation of forces and virial tensors for complicated advanced FFs, helping the fast validation of new models in molecular dynamics simulation. DMFF has been released as an open-source package under the LGPL-3.0 license and is available at https://github.com/deepmodeling/DMFF.
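To make the idea of evaluating energies "in a differentiable way" concrete, the sketch below implements a minimal forward-mode automatic differentiation with dual numbers and applies it to a Lennard-Jones pair energy. It is not DMFF's API or implementation; the `Dual` class, `lj_energy`, and `grad_wrt` are hypothetical names chosen for illustration. The same energy function yields both the derivative with respect to distance (the negative of the force) and the derivatives with respect to the parameters, which is what a gradient-based force field fit needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its derivative with respect to one seeded input."""
    val: float
    der: float = 0.0

    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.der - o.der)

    def __mul__(self, other):
        o = Dual.lift(other)
        # Product rule.
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__

    def __truediv__(self, other):
        o = Dual.lift(other)
        # Quotient rule.
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / o.val**2)

    def __pow__(self, n):
        # Power rule for a constant integer exponent.
        return Dual(self.val**n, n * self.val**(n - 1) * self.der)


def lj_energy(r, epsilon, sigma):
    """Lennard-Jones pair energy; works on floats or Dual numbers."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def grad_wrt(which, r, epsilon, sigma):
    """Derivative of lj_energy with respect to the argument named `which`."""
    args = {"r": r, "epsilon": epsilon, "sigma": sigma}
    seeded = {k: Dual(v, 1.0 if k == which else 0.0) for k, v in args.items()}
    return lj_energy(**seeded).der
```

For example, `-grad_wrt("r", 1.2, 0.5, 1.0)` gives the radial force at that separation, while `grad_wrt("epsilon", ...)` and `grad_wrt("sigma", ...)` give the parameter gradients an optimizer would use. Production tools typically rely on reverse-mode differentiation, which scales better when there are many parameters; forward mode is used here only because it fits in a few lines.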
Affiliation(s)
- Jichen Li
  - DP Technology, Beijing 100080, P. R. China
- Lan Yang
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen, Guangdong 518055, P. R. China
- Junmin Chen
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen, Guangdong 518055, P. R. China
- Wei Feng
  - DP Technology, Beijing 100080, P. R. China
- Linfeng Zhang
  - AI for Science Institute, Beijing 100080, P. R. China
- Kuang Yu
  - Tsinghua-Berkeley Shenzhen Institute, Shenzhen, Guangdong 518055, P. R. China
  - Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong 518055, P. R. China
8
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology. Its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and to facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results; however, great untapped potential lies in the integration of ML techniques into multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of coarse-grained (CG) simulations. In this Perspective, we present the pioneering recent research efforts in this direction and discuss how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse-graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
  - Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Niki Vergadou
  - Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
9
Wang Y, Walker BD, Liu C, Ren P. An Efficient Approach to Large-Scale Ab Initio Conformational Energy Profiles of Small Molecules. Molecules 2022; 27:8567. [PMID: 36500658 PMCID: PMC9738817 DOI: 10.3390/molecules27238567] [Received: 10/29/2022] [Revised: 11/19/2022] [Accepted: 11/27/2022] [Indexed: 12/12/2022]
Abstract
Accurate conformational energetics of molecules are of great significance for understanding many chemical properties. They are also fundamental for high-quality parameterization of force fields. Traditionally, accurate conformational profiles are obtained with density functional theory (DFT) methods. However, obtaining a reliable energy profile can be time-consuming when the molecules are relatively large or when there are many molecules of interest. Furthermore, incorporating data-driven deep learning methods into force field development creates a strong need for high-quality geometry and energy data. To this end, we compared several possible alternatives to the traditional DFT methods for conformational scans, including the semi-empirical method GFN2-xTB and the neural network potential ANI-2x. It was found that a sequential protocol of geometry optimization with the semi-empirical method and single-point energy calculation with high-level DFT methods can provide satisfactory conformational energy profiles hundreds of times faster in terms of optimization.
Affiliation(s)
- Pengyu Ren
  - Department of Biomedical Engineering, The University of Texas at Austin, Austin, TX 78712, USA
10
Zhang Y, Xia J, Jiang B. REANN: A PyTorch-based end-to-end multi-functional deep neural network package for molecular, reactive, and periodic systems. J Chem Phys 2022; 156:114801. [DOI: 10.1063/5.0080766] [Indexed: 01/01/2023]
Abstract
In this work, we present a general-purpose deep neural network package for representing energies, forces, dipole moments, and polarizabilities of atomistic systems. This so-called recursively embedded atom neural network model takes advantage of both the physically inspired atomic descriptor-based neural networks and the message-passing-based neural networks. Implemented in the PyTorch framework, the training process is parallelized on both the central processing unit and the graphics processing unit with high efficiency and low memory use, and all hyperparameters can be optimized automatically. We demonstrate the state-of-the-art accuracy, high efficiency, scalability, and universality of this package by learning not only energies (with or without forces) but also dipole moment vectors and polarizability tensors in various molecular, reactive, and periodic systems. An interface between a trained model and LAMMPS is provided for large-scale molecular dynamics simulations. We hope that this open-source toolbox will allow for future method development and applications of machine-learned potential energy surfaces and quantum-chemical properties of molecules, reactions, and materials.
Affiliation(s)
- Yaolong Zhang
  - School of Chemistry and Materials Science, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
- Junfan Xia
  - School of Chemistry and Materials Science, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
- Bin Jiang
  - School of Chemistry and Materials Science, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China