1
Giese TJ, Zeng J, Lerew L, McCarthy E, Tao Y, Ekesan Ş, York DM. Software Infrastructure for Next-Generation QM/MM-ΔMLP Force Fields. J Phys Chem B 2024; 128:6257-6271. [PMID: 38905451 DOI: 10.1021/acs.jpcb.4c01466] [Indexed: 06/23/2024]
Abstract
We present software infrastructure for the design and testing of new quantum mechanical/molecular mechanical and machine-learning potential (QM/MM-ΔMLP) force fields for a wide range of applications. The software integrates Amber's molecular dynamics simulation capabilities with fast, approximate quantum models in the xtb package and machine-learning potential corrections in DeePMD-kit. The xtb package implements the recently developed density-functional tight-binding QM models with multipolar electrostatics and density-dependent dispersion (GFN2-xTB), and the interface with Amber enables their use in periodic boundary QM/MM simulations with linear-scaling QM/MM particle-mesh Ewald electrostatics. The accuracy of the semiempirical models is enhanced by including machine-learning correction potentials (ΔMLPs) enabled through an interface with the DeePMD-kit software. The goal of this paper is to present and validate the implementation of this software infrastructure in molecular dynamics and free energy simulations. The utility of the new infrastructure is demonstrated in proof-of-concept example applications. The software elements presented here are open source and freely available. Their interface provides a powerful enabling technology for the design of new QM/MM-ΔMLP models for studying a wide range of problems, including biomolecular reactivity and protein-ligand binding.
Affiliation(s)
- Timothy J Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Lauren Lerew
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Erika McCarthy
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Yujun Tao
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Şölen Ekesan
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
- Darrin M York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
2
Tao Y, Giese TJ, Ekesan Ş, Zeng J, Aradi B, Hourahine B, Aktulga HM, Götz AW, Merz KM, York DM. Amber free energy tools: Interoperable software for free energy simulations using generalized quantum mechanical/molecular mechanical and machine learning potentials. J Chem Phys 2024; 160:224104. [PMID: 38856060 DOI: 10.1063/5.0211276] [Received: 03/29/2024] [Accepted: 05/15/2024] [Indexed: 06/11/2024] Open
Abstract
We report the development and testing of new integrated cyberinfrastructure for performing free energy simulations with generalized hybrid quantum mechanical/molecular mechanical (QM/MM) and machine learning potentials (MLPs) in Amber. The Sander molecular dynamics program has been extended to leverage fast, density-functional tight-binding models implemented in the DFTB+ and xTB packages, and an interface to the DeePMD-kit software enables the use of MLPs. The software is integrated through application program interfaces that circumvent the need to perform "system calls" and enable the incorporation of long-range Ewald electrostatics into the external software's self-consistent field procedure. The infrastructure provides access to QM/MM models that may serve as the foundation for QM/MM-ΔMLP potentials, which supplement the semiempirical QM/MM model with a MLP correction trained to reproduce ab initio QM/MM energies and forces. Efficient optimization of minimum free energy pathways is enabled through a new surface-accelerated finite-temperature string method implemented in the FE-ToolKit package. Furthermore, we interfaced Sander with the i-PI software by implementing the socket communication protocol used in the i-PI client-server model. The new interface with i-PI allows for the treatment of nuclear quantum effects with semiempirical QM/MM-ΔMLP models. The modular interoperable software is demonstrated on proton transfer reactions in guanine-thymine mispairs in a B-form deoxyribonucleic acid helix. The current work represents a considerable advance in the development of modular software for performing free energy simulations of chemical reactions that are important in a wide range of applications.
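The QM/MM-ΔMLP construction described in this abstract can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the paper: the semiempirical (SE) QM/MM energy is supplemented by a learned correction, which is trained to reproduce the difference between the ab initio and semiempirical QM/MM energies (and, through the gradient, the forces).

```latex
E_{\text{QM/MM-}\Delta\text{MLP}}(\mathbf{R})
  = E_{\text{SE QM/MM}}(\mathbf{R}) + \Delta E_{\text{MLP}}(\mathbf{R}),
\qquad
\Delta E_{\text{MLP}}(\mathbf{R})
  \approx E_{\text{ab initio QM/MM}}(\mathbf{R}) - E_{\text{SE QM/MM}}(\mathbf{R}),
\qquad
\mathbf{F}_i = -\nabla_{\mathbf{R}_i} E_{\text{QM/MM-}\Delta\text{MLP}}(\mathbf{R})
```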
Affiliation(s)
- Yujun Tao
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Timothy J Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Şölen Ekesan
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Bálint Aradi
- Bremen Center for Computational Materials Science, University of Bremen, D-28334 Bremen, Germany
- Ben Hourahine
- SUPA, Department of Physics, University of Strathclyde, Glasgow G4 0NG, United Kingdom
- Hasan Metin Aktulga
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, USA
- Andreas W Götz
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, USA
- Kenneth M Merz
- Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, USA
- Darrin M York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
3
Zinovjev K, Hedges L, Montagud Andreu R, Woods C, Tuñón I, van der Kamp MW. emle-engine: A Flexible Electrostatic Machine Learning Embedding Package for Multiscale Molecular Dynamics Simulations. J Chem Theory Comput 2024; 20:4514-4522. [PMID: 38804055 PMCID: PMC11171281 DOI: 10.1021/acs.jctc.4c00248] [Received: 02/26/2024] [Revised: 05/17/2024] [Accepted: 05/20/2024] [Indexed: 05/29/2024]
Abstract
We present in this work the emle-engine package (https://github.com/chemle/emle-engine)─the implementation of a new machine learning embedding scheme for hybrid machine learning potential/molecular-mechanics (ML/MM) dynamics simulations. The package is based on an embedding scheme that uses a physics-based model of the electronic density and induction with a handful of tunable parameters derived from in vacuo properties of the subsystem to be embedded. This scheme is completely independent of the in vacuo potential and requires only the positions of the atoms of the machine learning subsystem and the positions and partial charges of the molecular mechanics environment. These characteristics allow emle-engine to be employed in existing QM/MM software. We demonstrate that the implemented electrostatic machine learning embedding scheme (named EMLE) is stable in enhanced sampling molecular dynamics simulations. Through the calculation of free energy surfaces of alanine dipeptide in water with two different ML options for the in vacuo potential and three embedding models, we test the performance of EMLE. When compared to the reference DFT/MM surface, the EMLE embedding is clearly superior to the MM one based on fixed partial charges. The configurational dependence of the electronic density and the inclusion of the induction energy introduced by the EMLE model leads to a systematic reduction in the average error of the free energy surface when compared to MM embedding. By enabling the usage of EMLE embedding in practical ML/MM simulations, emle-engine will make it possible to accurately model systems and processes that feature significant variations in the charge distribution of the ML subsystem and/or the interacting environment.
Affiliation(s)
- Kirill Zinovjev
- Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
- Lester Hedges
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
- Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
- Christopher Woods
- Research Software Engineering, Advanced Computing Research Centre, 31 Great George Street, Bristol BS1 5QD, U.K.
- Iñaki Tuñón
- Departamento de Química Física, Universidad de Valencia, 46100 Burjassot, Spain
- Marc W. van der Kamp
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, U.K.
4
Lei YK, Yagi K, Sugita Y. Learning QM/MM potential using equivariant multiscale model. J Chem Phys 2024; 160:214109. [PMID: 38828815 DOI: 10.1063/5.0205123] [Received: 02/24/2024] [Accepted: 05/09/2024] [Indexed: 06/05/2024] Open
Abstract
The machine learning (ML) method emerges as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses an equivariant model, which can generate a high-order tensor covariant with rotations as a base model. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
Affiliation(s)
- Yao-Kun Lei
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Kiyoshi Yagi
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- Yuji Sugita
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
5
Tao Y, Giese TJ, York DM. Electronic and Nuclear Quantum Effects on Proton Transfer Reactions of Guanine-Thymine (G-T) Mispairs Using Combined Quantum Mechanical/Molecular Mechanical and Machine Learning Potentials. Molecules 2024; 29:2703. [PMID: 38893576 PMCID: PMC11173453 DOI: 10.3390/molecules29112703] [Received: 05/01/2024] [Revised: 05/30/2024] [Accepted: 06/04/2024] [Indexed: 06/21/2024] Open
Abstract
Rare tautomeric forms of nucleobases can lead to Watson-Crick-like (WC-like) mispairs in DNA, but the process of proton transfer is fast and difficult to detect experimentally. NMR studies show evidence for the existence of short-time WC-like guanine-thymine (G-T) mispairs; however, the mechanism of proton transfer and the degree to which nuclear quantum effects play a role are unclear. We use a B-DNA helix exhibiting a wGT mispair as a model system to study tautomerization reactions. We perform ab initio (PBE0/6-31G*) quantum mechanical/molecular mechanical (QM/MM) simulations to examine the free energy surface for tautomerization. We demonstrate that while the ab initio QM/MM simulations are accurate, considerable sampling is required to achieve high precision in the free energy barriers. To address this problem, we develop a QM/MM machine learning potential correction (QM/MM-ΔMLP) that is able to improve the computational efficiency, greatly extend the accessible time scales of the simulations, and enable practical application of path integral molecular dynamics to examine nuclear quantum effects. We find that the inclusion of nuclear quantum effects has only a modest effect on the mechanistic pathway but leads to a considerable lowering of the free energy barrier for the GT*⇌G*T equilibrium. Our results enable a rationalization of observed experimental data and the prediction of populations of rare tautomeric forms of nucleobases and rates of their interconversion in B-DNA.
6
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
- Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida
7
Yan Z, Wei D, Li X, Chung LW. Accelerating reliable multiscale quantum refinement of protein-drug systems enabled by machine learning. Nat Commun 2024; 15:4181. [PMID: 38755151 PMCID: PMC11099068 DOI: 10.1038/s41467-024-48453-4] [Received: 08/31/2023] [Accepted: 04/24/2024] [Indexed: 05/18/2024] Open
Abstract
Biomacromolecule structures are essential for drug development and biocatalysis. Quantum refinement (QR) methods, which employ reliable quantum mechanics (QM) methods in crystallographic refinement, showed promise in improving the structural quality or even correcting the structure of biomacromolecules. However, vast computational costs and complex quantum mechanics/molecular mechanics (QM/MM) setups limit QR applications. Here we incorporate robust machine learning potentials (MLPs) in multiscale ONIOM(QM:MM) schemes to describe the core parts (e.g., drugs/inhibitors), replacing the expensive QM method. Additionally, two levels of MLPs are combined for the first time to overcome MLP limitations. Our unique MLPs+ONIOM-based QR methods achieve QM-level accuracy with significantly higher efficiency. Furthermore, our refinements provide computational evidence for the existence of bonded and nonbonded forms of the Food and Drug Administration (FDA)-approved drug nirmatrelvir in one SARS-CoV-2 main protease structure. This study highlights that powerful MLPs accelerate QRs for reliable protein-drug complexes, promote broader QR applications and provide more atomistic insights into drug development.
Affiliation(s)
- Zeyin Yan
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
- Dacong Wei
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
- Xin Li
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
- Lung Wa Chung
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
8
Wan K, He J, Shi X. Construction of High Accuracy Machine Learning Interatomic Potential for Surface/Interface of Nanomaterials-A Review. Adv Mater 2024; 36:e2305758. [PMID: 37640376 DOI: 10.1002/adma.202305758] [Received: 06/15/2023] [Revised: 08/24/2023] [Indexed: 08/31/2023]
Abstract
The inherent discontinuity and unique dimensional attributes of nanomaterial surfaces and interfaces bestow them with various exceptional properties. These properties, however, also introduce difficulties for both experimental and computational studies. The advent of machine learning interatomic potential (MLIP) addresses some of the limitations associated with empirical force fields, presenting a valuable avenue for accurate simulations of these surfaces/interfaces of nanomaterials. Central to this approach is the idea of capturing the relationship between system configuration and potential energy, leveraging the proficiency of machine learning (ML) to precisely approximate high-dimensional functions. This review offers an in-depth examination of MLIP principles and their execution and elaborates on their applications in the realm of nanomaterial surface and interface systems. The prevailing challenges faced by this potent methodology are also discussed.
Affiliation(s)
- Kaiwei Wan
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jianxin He
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xinghua Shi
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
9
Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
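The two-dimensional Müller-Brown potential mentioned in this abstract has a simple closed form, a sum of four anisotropic Gaussians. The following is a minimal sketch of how reference data for a fitting exercise might be generated. It assumes the parameter values commonly quoted in the literature and is not copied from the tutorial; the function names, grid size, and grid ranges are hypothetical choices for illustration.

```python
import math

# Commonly quoted Müller-Brown parameters (an assumption; not taken from the tutorial).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)


def muller_brown(x, y):
    """Evaluate the Müller-Brown potential at the point (x, y)."""
    total = 0.0
    for Ak, ak, bk, ck, xk, yk in zip(A, a, b, c, X0, Y0):
        dx, dy = x - xk, y - yk
        total += Ak * math.exp(ak * dx * dx + bk * dx * dy + ck * dy * dy)
    return total


def make_grid_dataset(nx=21, ny=21, x_range=(-1.5, 1.0), y_range=(-0.5, 2.0)):
    """Return a list of ((x, y), energy) pairs on a regular grid.

    The grid ranges are hypothetical defaults that cover the three minima.
    """
    data = []
    for i in range(nx):
        x = x_range[0] + (x_range[1] - x_range[0]) * i / (nx - 1)
        for j in range(ny):
            y = y_range[0] + (y_range[1] - y_range[0]) * j / (ny - 1)
            data.append(((x, y), muller_brown(x, y)))
    return data


if __name__ == "__main__":
    # Energy near the deepest minimum of the surface.
    print(muller_brown(-0.558, 1.442))
    print(len(make_grid_dataset()))
```

With these parameters the potential evaluates to roughly -146.7 near its deepest minimum at about (-0.558, 1.442). The resulting list of points and energies is the kind of input a regression model such as an FNN or GPR would be fitted to.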
Affiliation(s)
- Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Ryan Snyder
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Jia-Ning Wang
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Chance Lander
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Carly Wickizer
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Richard Van
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Laboratory of Computational Biology, National Heart, Lung and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
- Andrew Chesney
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Yuanfei Xue
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuezhi Mao
- Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
- Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
- Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
10
Giese TJ, Ekesan Ş, McCarthy E, Tao Y, York DM. Surface-Accelerated String Method for Locating Minimum Free Energy Paths. J Chem Theory Comput 2024; 20:2058-2073. [PMID: 38367218 PMCID: PMC11059188 DOI: 10.1021/acs.jctc.3c01401] [Indexed: 02/19/2024]
Abstract
We present a surface-accelerated string method (SASM) to efficiently optimize low-dimensional reaction pathways from the sampling performed with expensive quantum mechanical/molecular mechanical (QM/MM) Hamiltonians. The SASM accelerates the convergence of the path using the aggregate sampling obtained from the current and previous string iterations, whereas approaches like the string method in collective variables (SMCV) or the modified string method in collective variables (MSMCV) update the path only from the sampling obtained from the current iteration. Furthermore, the SASM decouples the number of images used to perform sampling from the number of synthetic images used to represent the path. The path is optimized on the current best estimate of the free energy surface obtained from all available sampling, and the proposed set of new simulations is not restricted to being located along the optimized path. Instead, the umbrella potential placement is chosen to extend the range of the free energy surface and improve the quality of the free energy estimates near the path. In this manner, the SASM is shown to improve the exploration for a minimum free energy pathway in regions where the free energy surface is relatively flat. Furthermore, it improves the quality of the free energy profile when the string is discretized with too few images. We compare the SASM, SMCV, and MSMCV using 3 QM/MM applications: a ribozyme methyltransferase reaction using 2 reaction coordinates, the 2'-O-transphosphorylation reaction of Hammerhead ribozyme using 3 reaction coordinates, and a tautomeric reaction in B-DNA using 5 reaction coordinates. We show that SASM converges the paths using roughly 3 times less sampling than the SMCV and MSMCV methods. All three algorithms have been implemented in the FE-ToolKit package made freely available.
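The abstract notes that the number of images used to represent the path is decoupled from the number of images used for sampling. One generic ingredient of string-type methods that makes this possible is resampling a path to an arbitrary number of points equally spaced in arc length. The sketch below shows only that generic step using piecewise-linear interpolation. It is not the SASM algorithm or the FE-ToolKit implementation, and the function name `resample_path` is hypothetical.

```python
import math


def resample_path(points, n_images):
    """Resample a piecewise-linear path to n_images points equally spaced in arc length.

    points   : sequence of coordinate tuples (at least two, in any dimension)
    n_images : number of output points (at least two)
    """
    # Cumulative arc length at each input point.
    cum = [0.0]
    for p, q in zip(points, points[1:]):
        cum.append(cum[-1] + math.dist(p, q))
    total = cum[-1]

    resampled = []
    seg = 0  # index of the segment that contains the current target length
    for i in range(n_images):
        s = total * i / (n_images - 1)
        # Advance to the segment whose end lies at or beyond the target length.
        while seg < len(points) - 2 and cum[seg + 1] < s:
            seg += 1
        span = cum[seg + 1] - cum[seg]
        t = 0.0 if span == 0.0 else (s - cum[seg]) / span
        p, q = points[seg], points[seg + 1]
        resampled.append(tuple(pi + t * (qi - pi) for pi, qi in zip(p, q)))
    return resampled


if __name__ == "__main__":
    # A coarse 3-point path in two collective variables, resampled to 5 images.
    coarse = [(0.0, 0.0), (1.0, 0.0), (1.0, 3.0)]
    for image in resample_path(coarse, 5):
        print(image)
```

Because the output count is a free parameter, a path estimated from a small number of simulations can be represented by as many synthetic images as desired. In practice a smoother interpolant than the linear one used here may be preferable.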
Affiliation(s)
- Timothy J. Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Şölen Ekesan
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Erika McCarthy
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Yujun Tao
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Darrin M. York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
11
Nam K, Shao Y, Major DT, Wolf-Watz M. Perspectives on Computational Enzyme Modeling: From Mechanisms to Design and Drug Development. ACS Omega 2024; 9:7393-7412. [PMID: 38405524 PMCID: PMC10883025 DOI: 10.1021/acsomega.3c09084] [Received: 11/14/2023] [Revised: 01/15/2024] [Accepted: 01/19/2024] [Indexed: 02/27/2024]
Abstract
Understanding enzyme mechanisms is essential for unraveling the complex molecular machinery of life. In this review, we survey the field of computational enzymology, highlighting key principles governing enzyme mechanisms and discussing ongoing challenges and promising advances. Over the years, computer simulations have become indispensable in the study of enzyme mechanisms, with the integration of experimental and computational exploration now established as a holistic approach to gain deep insights into enzymatic catalysis. Numerous studies have demonstrated the power of computer simulations in characterizing reaction pathways, transition states, substrate selectivity, product distribution, and dynamic conformational changes for various enzymes. Nevertheless, significant challenges remain in investigating the mechanisms of complex multistep reactions, large-scale conformational changes, and allosteric regulation. Beyond mechanistic studies, computational enzyme modeling has emerged as an essential tool for computer-aided enzyme design and the rational discovery of covalent drugs for targeted therapies. Overall, enzyme design/engineering and covalent drug development can greatly benefit from our understanding of the detailed mechanisms of enzymes, such as protein dynamics, entropy contributions, and allostery, as revealed by computational studies. Such a convergence of different research approaches is expected to continue, creating synergies in enzyme research. This review, by outlining the ever-expanding field of enzyme research, aims to provide guidance for future research directions and facilitate new developments in this important and evolving field.
Affiliation(s)
- Kwangho Nam
- Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019-5251, United States
- Dan T. Major
- Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
Collapse
|
12
|
Chung Y, Green WH. Machine learning from quantum chemistry to predict experimental solvent effects on reaction rates. Chem Sci 2024; 15:2410-2424. [PMID: 38362410 PMCID: PMC10866337 DOI: 10.1039/d3sc05353a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 01/04/2024] [Indexed: 02/17/2024] Open
Abstract
Fast and accurate prediction of solvent effects on reaction rates is crucial for kinetic modeling, chemical process design, and high-throughput solvent screening. Despite recent advances in machine learning, a scarcity of reliable data has hindered the development of predictive models that are generalizable to diverse reactions and solvents. In this work, we generate a large set of data with the COSMO-RS method for over 28 000 neutral reactions and 295 solvents and train a machine learning model to predict the solvation free energy and solvation enthalpy of activation (ΔΔG‡solv, ΔΔH‡solv) for a solution phase reaction. On unseen reactions, the model achieves mean absolute errors of 0.71 and 1.03 kcal mol⁻¹ for ΔΔG‡solv and ΔΔH‡solv, respectively, relative to the COSMO-RS calculations. The model also provides reliable predictions of relative rate constants within a factor of 4 when tested on experimental data. The presented model can provide nearly instantaneous predictions of kinetic solvent effects or relative rate constants for a broad range of neutral closed-shell or free radical reactions and solvents based only on atom-mapped reaction SMILES and solvent SMILES strings.
Affiliation(s)
- Yunsie Chung
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
- William H Green
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
13
|
Ding Y, Huang J. Implementation and Validation of an OpenMM Plugin for the Deep Potential Representation of Potential Energy. Int J Mol Sci 2024; 25:1448. [PMID: 38338727 PMCID: PMC10855459 DOI: 10.3390/ijms25031448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024] Open
Abstract
Machine learning potentials, particularly the deep potential (DP) model, have revolutionized molecular dynamics (MD) simulations, striking a balance between accuracy and computational efficiency. To facilitate the DP model's integration with the popular MD engine OpenMM, we have developed a versatile OpenMM plugin. This plugin supports a range of applications, from conventional MD simulations to alchemical free energy calculations and hybrid DP/MM simulations. Our extensive validation tests encompassed energy conservation in microcanonical ensemble simulations, fidelity in canonical ensemble generation, and the evaluation of the structural, transport, and thermodynamic properties of bulk water. The introduction of this plugin is expected to significantly expand the application scope of DP models within the MD simulation community, representing a major advancement in the field.
Affiliation(s)
- Ye Ding
  - College of Life Sciences, Zhejiang University, Hangzhou 310027, China
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
- Jing Huang
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
|
14
|
Ding Y, Huang J. DP/MM: A Hybrid Model for Zinc-Protein Interactions in Molecular Dynamics. J Phys Chem Lett 2024; 15:616-627. [PMID: 38198685 DOI: 10.1021/acs.jpclett.3c03158] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/12/2024]
Abstract
Zinc-containing proteins are vital for many biological processes, yet accurately modeling them using classical force fields is hindered by complicated polarization and charge transfer effects. This study introduces DP/MM, a hybrid force field scheme that utilizes a deep potential model to correct the atomic forces of zinc ions and their coordinated atoms, elevating them from MM to QM levels of accuracy. Trained on the difference between MM and QM atomic forces across diverse zinc coordination groups, the DP/MM model faithfully reproduces structural characteristics of zinc coordination during simulations, such as the tetrahedral coordination of Cys4 and Cys3His1 groups. Furthermore, DP/MM allows water exchange in the zinc coordination environment. With its unique blend of accuracy, efficiency, flexibility, and transferability, DP/MM serves as a valuable tool for studying structures and dynamics of zinc-containing proteins and also represents a pioneering approach in the evolving landscape of machine learning potentials for molecular modeling.
Affiliation(s)
- Ye Ding
  - College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310027, China
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
- Jing Huang
  - School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang 310024, China
|
15
|
Bass L, Elder LH, Folescu DE, Forouzesh N, Tolokh IS, Karpatne A, Onufriev AV. Improving the Accuracy of Physics-Based Hydration-Free Energy Predictions by Machine Learning the Remaining Error Relative to the Experiment. J Chem Theory Comput 2024; 20:396-410. [PMID: 38149593 PMCID: PMC10950260 DOI: 10.1021/acs.jctc.3c00981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
The accuracy of computational models of water is key to atomistic simulations of biomolecules. We propose a computationally efficient way to improve the accuracy of the prediction of hydration-free energies (HFEs) of small molecules: the remaining errors of the physics-based models relative to the experiment are predicted and mitigated by machine learning (ML) as a postprocessing step. Specifically, the trained graph convolutional neural network attempts to identify the "blind spots" in the physics-based model predictions, where the complex physics of aqueous solvation is poorly accounted for, and partially corrects for them. The strategy is explored for five classical solvent models representing various accuracy/speed trade-offs, from the fast analytical generalized Born (GB) to the popular TIP3P explicit solvent model; experimental HFEs of small neutral molecules from the FreeSolv set are used for the training and testing. For all of the models, the ML correction reduces the resulting root-mean-square error relative to the experiment for HFEs of small molecules, without significant overfitting and with negligible computational overhead. For example, on the test set, the relative accuracy improvement is 47% for the fast analytical GB, making it, after the ML correction, almost as accurate as uncorrected TIP3P. For the TIP3P model, the accuracy improvement is about 39%, bringing the ML-corrected model's accuracy below the 1 kcal/mol threshold. In general, the relative benefit of the ML corrections is smaller for more accurate physics-based models, reaching the lower limit of about 20% relative accuracy gain compared with that of the physics-based treatment alone. The proposed strategy of using ML to learn the remaining error of physics-based models offers a distinct advantage over training ML alone directly on reference HFEs: it preserves the correct overall trend, even well outside of the training set.
Affiliation(s)
- Lewis Bass
  - Department of Computer Engineering, Virginia Tech, Blacksburg, Virginia 24061, United States
- Luke H Elder
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Dan E Folescu
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Mathematics, Virginia Tech, Blacksburg, Virginia 24061, United States
- Negin Forouzesh
  - Department of Computer Science, California State University, Los Angeles, California 90032, United States
- Igor S Tolokh
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Anuj Karpatne
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
- Alexey V Onufriev
  - Department of Computer Science, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Department of Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
  - Center for Soft Matter and Biological Physics, Virginia Tech, Blacksburg, Virginia 24061, United States
|
16
|
Yang J, Cong Y, Li Y, Li H. Machine Learning Approach Based on a Range-Corrected Deep Potential Model for Efficient Vibrational Frequency Computation. J Chem Theory Comput 2023; 19:6366-6374. [PMID: 37652890 DOI: 10.1021/acs.jctc.3c00386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
Because a vibrational spectrum is an ensemble average, its simulation can be time-consuming with high-accuracy methods. We present a machine learning approach based on the range-corrected deep potential (DPRc) model to improve the computing efficiency. The DPRc method divides the system into a "probe region" and a "solvent region"; "solvent-solvent" interactions are not counted in the neural network. We applied the approach to two systems: formic acid C═O stretching and MeCN C≡N stretching vibrational frequency shifts in water. All data sets were prepared using the quantum vibration perturbation approach. Effects of different region divisions, one-body correction, cut range, and training data size were tested. The model with a single-molecule "probe region" showed stable accuracy; it ran roughly 10 times faster than regular deep potential and reduced the training time by a factor of about four. The approach is efficient, easy to apply, and extendable to calculating various spectra.
Affiliation(s)
- Jitai Yang
  - Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
- Yang Cong
  - Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
- You Li
  - Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
- Hui Li
  - Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
|
17
|
Yuan Y, Cui Q. Accurate and Efficient Multilevel Free Energy Simulations with Neural Network-Assisted Enhanced Sampling. J Chem Theory Comput 2023; 19:5394-5406. [PMID: 37527495 PMCID: PMC10810721 DOI: 10.1021/acs.jctc.3c00591] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Free energy differences (ΔF) are essential to quantitative characterization and understanding of chemical and biological processes. Their direct estimation with an accurate quantum mechanical potential is of great interest and yet impractical due to high computational cost and incompatibility with typical alchemical free energy protocols. One promising solution is the multilevel free energy simulation in which the estimate of ΔF at an inexpensive low level of theory is combined with the correction toward a higher level of theory. The poor configurational overlap generally expected between the two levels of theory, however, presents a major challenge. We overcome this challenge by using a deep neural network model and enhanced sampling simulations. An adversarial autoencoder is used to identify a low-dimensional (latent) space that compactly represents the degrees of freedom that encode the distinct distributions at the two levels of theory. Enhanced sampling in this latent space is then used to drive the sampling of configurations that predominantly contribute to the free energy correction. Results for both gas phase and condensed phase systems demonstrate that this data-driven approach offers high accuracy and efficiency with great potential for scalability to complex systems.
Affiliation(s)
- Yuchen Yuan
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
- Qiang Cui
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
|
18
|
Zeng J, Zhang D, Lu D, Mo P, Li Z, Chen Y, Rynik M, Huang L, Li Z, Shi S, Wang Y, Ye H, Tuo P, Yang J, Ding Y, Li Y, Tisi D, Zeng Q, Bao H, Xia Y, Huang J, Muraoka K, Wang Y, Chang J, Yuan F, Bore SL, Cai C, Lin Y, Wang B, Xu J, Zhu JX, Luo C, Zhang Y, Goodall REA, Liang W, Singh AK, Yao S, Zhang J, Wentzcovitch R, Han J, Liu J, Jia W, York DM, E W, Car R, Zhang L, Wang H. DeePMD-kit v2: A software package for deep potential models. J Chem Phys 2023; 159:054801. [PMID: 37526163 PMCID: PMC10445636 DOI: 10.1063/5.0155600] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 04/21/2023] [Accepted: 07/03/2023] [Indexed: 08/02/2023] Open
Abstract
DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The current version of DeePMD-kit offers numerous advanced features, such as DeepPot-SE, attention-based and hybrid descriptors, the ability to fit tensorial properties, type embedding, model deviation, DP-range correction, DP long range, graphics processing unit support for customized operators, model compression, non-von Neumann molecular dynamics, and improved usability, including documentation, compiled binary packages, graphical user interfaces, and application programming interfaces. This article presents an overview of the current major version of the DeePMD-kit package, highlighting its features and technical details. Additionally, this article presents a comprehensive procedure for conducting molecular dynamics as a representative application, benchmarks the accuracy and efficiency of different models, and discusses ongoing developments.
Affiliation(s)
- Jinzhe Zeng
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Denghui Lu
  - HEDPS, CAPT, College of Engineering, Peking University, Beijing 100871, People’s Republic of China
- Pinghui Mo
  - College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
- Zeyu Li
  - Yuanpei College, Peking University, Beijing 100871, People’s Republic of China
- Yixiao Chen
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08540, USA
- Marián Rynik
  - Department of Experimental Physics, Comenius University, Mlynská Dolina F2, 842 48 Bratislava, Slovakia
- Li’ang Huang
  - Center for Quantum Information, Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, People’s Republic of China
- Shaochen Shi
  - ByteDance Research, Zhonghang Plaza, No. 43, North 3rd Ring West Road, Haidian District, Beijing, People’s Republic of China
- Haotian Ye
  - Yuanpei College, Peking University, Beijing 100871, People’s Republic of China
- Ping Tuo
  - AI for Science Institute, Beijing 100080, People’s Republic of China
- Jiabin Yang
  - Baidu, Inc., Beijing, People’s Republic of China
- Yifan Li
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Qiyu Zeng
  - Department of Physics, National University of Defense Technology, Changsha, Hunan 410073, People’s Republic of China
- Yu Xia
  - ByteDance Research, Zhonghang Plaza, No. 43, North 3rd Ring West Road, Haidian District, Beijing, People’s Republic of China
- Koki Muraoka
  - Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Yibo Wang
  - DP Technology, Beijing 100080, People’s Republic of China
- Fengbo Yuan
  - DP Technology, Beijing 100080, People’s Republic of China
- Sigbjørn Løland Bore
  - Hylleraas Centre for Quantum Molecular Sciences and Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, 0315 Oslo, Norway
- Yinnian Lin
  - Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, People’s Republic of China
- Bo Wang
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, People’s Republic of China
- Jiayan Xu
  - School of Chemistry and Chemical Engineering, Queen’s University Belfast, Belfast BT9 5AG, United Kingdom
- Jia-Xin Zhu
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, People’s Republic of China
- Chenxing Luo
  - Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York 10027, USA
- Yuzhi Zhang
  - DP Technology, Beijing 100080, People’s Republic of China
- Wenshuo Liang
  - DP Technology, Beijing 100080, People’s Republic of China
- Anurag Kumar Singh
  - Department of Data Science, Indian Institute of Technology, Palakkad, Kerala, India
- Sikai Yao
  - DP Technology, Beijing 100080, People’s Republic of China
- Jingchao Zhang
  - NVIDIA AI Technology Center (NVAITC), Santa Clara, California 95051, USA
- Jiequn Han
  - Center for Computational Mathematics, Flatiron Institute, New York, New York 10010, USA
- Jie Liu
  - College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
- Darrin M. York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Roberto Car
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Han Wang
  - Author to whom correspondence should be addressed:
|
19
|
Bhatia H, Aydin F, Carpenter TS, Lightstone FC, Bremer PT, Ingólfsson HI, Nissley DV, Streitz FH. The confluence of machine learning and multiscale simulations. Curr Opin Struct Biol 2023; 80:102569. [PMID: 36966691 DOI: 10.1016/j.sbi.2023.102569] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/14/2022] [Revised: 01/31/2023] [Accepted: 02/08/2023] [Indexed: 06/04/2023]
Abstract
Multiscale modeling has a long history of use in structural biology, as computational biologists strive to overcome the time- and length-scale limits of atomistic molecular dynamics. Contemporary machine learning techniques, such as deep learning, have promoted advances in virtually every field of science and engineering and are revitalizing the traditional notions of multiscale modeling. Deep learning has found success in various approaches for distilling information from fine-scale models, such as building surrogate models and guiding the development of coarse-grained potentials. However, perhaps its most powerful use in multiscale modeling is in defining latent spaces that enable efficient exploration of conformational space. This confluence of machine learning and multiscale simulation with modern high-performance computing promises a new era of discovery and innovation in structural biology.
Affiliation(s)
- Harsh Bhatia
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA. https://twitter.com/@harshbhatia85
- Fikret Aydin
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Timothy S Carpenter
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Felice C Lightstone
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Peer-Timo Bremer
  - Computing Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Helgi I Ingólfsson
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
- Dwight V Nissley
  - RAS Initiative, The Cancer Research Technology Program, Frederick National Laboratory, Frederick, MD, 21701, USA
- Frederick H Streitz
  - Physical and Life Sciences (PLS) Directorate, Lawrence Livermore National Laboratory, Livermore, CA, 94550, USA
|
20
|
Wang JN, Xue Y, Li P, Pan X, Wang M, Shao Y, Mo Y, Mei Y. Perspective: Reference-Potential Methods for the Study of Thermodynamic Properties in Chemical Processes: Theory, Applications, and Pitfalls. J Phys Chem Lett 2023; 14:4866-4875. [PMID: 37196031 PMCID: PMC10840091 DOI: 10.1021/acs.jpclett.3c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2023]
Abstract
In silico investigations of enzymatic reactions and chemical reactions in condensed phases often suffer from formidable computational costs due to the large number of degrees of freedom and the enormous volume of important phase space. Usually, accuracy is traded for efficiency by lowering the reliability of the Hamiltonians employed or reducing the sampling time. Reference-potential methods (RPMs) offer an alternative approach to reaching high simulation accuracy without much loss of efficiency. In this Perspective, we summarize the idea of RPMs and showcase some recent applications. Most importantly, the pitfalls of these methods are also discussed, and remedies for these pitfalls are presented.
Affiliation(s)
- Jia-Ning Wang
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuanfei Xue
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Pengfei Li
  - Single Particle, LLC, San Diego 92127, California, United States
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman 73019, Oklahoma, United States
- Meiting Wang
  - School of Medical Engineering, Xinxiang Medical University, Xinxiang 453003, Henan, China
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman 73019, Oklahoma, United States
- Yan Mo
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan 030006, Shanxi, China
- Ye Mei
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan 030006, Shanxi, China
|
21
|
Giese TJ, York DM. Estimation of frequency factors for the calculation of kinetic isotope effects from classical and path integral free energy simulations. J Chem Phys 2023; 158:174105. [PMID: 37125722 PMCID: PMC10154067 DOI: 10.1063/5.0147218] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2023] [Accepted: 04/17/2023] [Indexed: 05/02/2023] Open
Abstract
We use the modified Bigeleisen-Mayer equation to compute kinetic isotope effect values for non-enzymatic phosphoryl transfer reactions from classical and path integral molecular dynamics umbrella sampling. The modified form of the Bigeleisen-Mayer equation consists of a ratio of imaginary mode vibrational frequencies and a contribution arising from the isotopic substitution's effect on the activation free energy, which can be computed from path integral simulation. In the present study, we describe a practical method for estimating the frequency ratio correction directly from umbrella sampling in a manner that does not require normal mode analysis of many geometry optimized structures. Instead, the method relates the frequency ratio to the change in the mass weighted coordinate representation of the minimum free energy path at the transition state induced by isotopic substitution. The method is applied to the calculation of 16/18O and 32/34S primary kinetic isotope effect values for six non-enzymatic phosphoryl transfer reactions. We demonstrate that the results are consistent with the analysis of geometry optimized transition state ensembles using the traditional Bigeleisen-Mayer equation. The method thus presents a new practical tool to enable facile calculation of kinetic isotope effect values for complex chemical reactions in the condensed phase.
Affiliation(s)
- Timothy J. Giese
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Darrin M. York
  - Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
|
22
|
Morado J, Mortenson PN, Nissink JWM, Essex JW, Skylaris CK. Does a Machine-Learned Potential Perform Better Than an Optimally Tuned Traditional Force Field? A Case Study on Fluorohydrins. J Chem Inf Model 2023; 63:2810-2827. [PMID: 37071825 PMCID: PMC10170518 DOI: 10.1021/acs.jcim.2c01510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/20/2023]
Abstract
We present a comparative study that evaluates the performance of a machine learning potential (ANI-2x), a conventional force field (GAFF), and an optimally tuned GAFF-like force field in the modeling of a set of 10 γ-fluorohydrins that exhibit a complex interplay between intra- and intermolecular interactions in determining conformer stability. To benchmark the performance of each molecular model, we evaluated their energetic, geometric, and sampling accuracies relative to quantum-mechanical data. This benchmark involved conformational analysis both in the gas phase and chloroform solution. We also assessed the performance of the aforementioned molecular models in estimating nuclear spin-spin coupling constants by comparing their predictions to experimental data available in chloroform. The results and discussion presented in this study demonstrate that ANI-2x tends to predict stronger-than-expected hydrogen bonding and overstabilize global minima and shows problems related to inadequate description of dispersion interactions. Furthermore, while ANI-2x is a viable model for modeling in the gas phase, conventional force fields still play an important role, especially for condensed-phase simulations. Overall, this study highlights the strengths and weaknesses of each model, providing guidelines for the use and future development of force fields and machine learning potentials.
Affiliation(s)
- João Morado
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Paul N Mortenson
  - Astex Pharmaceuticals, 436 Cambridge Science Park, Milton Road, Cambridge CB4 0QA, United Kingdom
- J Willem M Nissink
  - Computational Chemistry, Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
- Chris-Kriton Skylaris
  - School of Chemistry, University of Southampton, Highfield, Southampton SO17 1BJ, United Kingdom
|
23
|
Abstract
Advances in machine learned interatomic potentials (MLIPs), such as those using neural networks, have resulted in short-range models that can infer interaction energies with near ab initio accuracy and orders of magnitude reduced computational cost. For many-atom systems, including macromolecules, biomolecules, and condensed matter, model accuracy can become reliant on the description of short- and long-range physical interactions. The latter terms can be difficult to incorporate into an MLIP framework. Recent research has produced numerous models with considerations for nonlocal electrostatic and dispersion interactions, leading to a large range of applications that can be addressed using MLIPs. In light of this, we present a Perspective focused on key methodologies and models being used where the presence of nonlocal physics and chemistry is crucial for describing system properties. The strategies covered include MLIPs augmented with dispersion corrections, electrostatics calculated with charges predicted from atomic environment descriptors, the use of self-consistency and message passing iterations to propagate nonlocal system information, and charges obtained via equilibration schemes. We aim to provide a pointed discussion to support the development of machine learning-based interatomic potentials for systems where contributions from only nearsighted terms are deficient.
Affiliation(s)
- Dylan M Anstine
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
|
24
|
Zeng J, Tao Y, Giese TJ, York DM. QDπ: A Quantum Deep Potential Interaction Model for Drug Discovery. J Chem Theory Comput 2023; 19:1261-1275. [PMID: 36696673 PMCID: PMC9992268 DOI: 10.1021/acs.jctc.2c01172] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/27/2023]
Abstract
We report QDπ-v1.0 for modeling the internal energy of drug molecules containing H, C, N, and O atoms. The QDπ model is in the form of a quantum mechanical/machine learning potential correction (QM/Δ-MLP) that uses a fast third-order self-consistent density-functional tight-binding (DFTB3/3OB) model that is corrected to a quantitatively high level of accuracy through a deep-learning potential (DeepPot-SE). The model has the advantage that it is able to properly treat electrostatic interactions and handle changes in charge/protonation states. The model is trained against reference data computed at the ωB97X/6-31G* level (as in the ANI-1x data set) and compared to several other approximate semiempirical and machine learning potentials (ANI-1x, ANI-2x, DFTB3, MNDO/d, AM1, PM6, GFN1-xTB, and GFN2-xTB). The QDπ model is demonstrated to be accurate for a wide range of intra- and intermolecular interactions (despite its intended use as an internal energy model) and has been shown to perform exceptionally well for relative protonation/deprotonation energies and tautomers. An example application to model reactions involved in RNA strand cleavage catalyzed by protein and nucleic acid enzymes illustrates that QDπ has average errors less than 0.5 kcal/mol, whereas the other models compared have errors over an order of magnitude greater. Taken together, this makes QDπ highly attractive as a potential force field model for drug discovery.
Affiliation(s)
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Yujun Tao
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Timothy J. Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Darrin M. York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
25
Zhou B, Zhou Y, Xie D. Accelerated Quantum Mechanics/Molecular Mechanics Simulations via Neural Networks Incorporated with Mechanical Embedding Scheme. J Chem Theory Comput 2023; 19:1157-1169. [PMID: 36724190 DOI: 10.1021/acs.jctc.2c01131] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/02/2023]
Abstract
A powerful tool to study the mechanism of reactions in solutions or enzymes is to perform ab initio quantum mechanical/molecular mechanical (QM/MM) molecular dynamics (MD) simulations. However, the computational cost is too high due to the explicit electronic structure calculations at every time step of the simulation. A neural network (NN) method can accelerate the QM/MM-MD simulations, but it has long been a problem to accurately describe the QM/MM electrostatic coupling by NN in the electrostatic embedding (EE) scheme. In this work, we developed a new method to accelerate QM/MM calculations in the mechanical embedding (ME) scheme. The potentials and partial point charges of QM atoms are first learned in vacuo by the embedded atom neural networks (EANN) approach. MD simulations are then performed on this EANN/MM potential energy surface (PES) to obtain free energy (FE) profiles for reactions, in which the QM/MM electrostatic coupling is treated in the ME scheme. Finally, a weighted thermodynamic perturbation (wTP) corrects the FE profiles in the ME scheme to the EE scheme. For two reactions in water and one in methanol, our simulations reproduced the B3LYP/MM free energy profiles within 0.5 kcal/mol with a speed-up of 30-60-fold. The results show that the strategy of combining EANN potential in the ME scheme with the wTP correction is efficient and reliable for chemical reaction simulations in liquid. Another advantage of our method is that the QM PES is independent of the MM subsystem, so it can be applied to various MM environments as demonstrated by an SN2 reaction studied in water and methanol individually, which used the same EANN PES. The free energy profiles are in excellent accordance with the results obtained from B3LYP/MM-MD simulations. In the future, this method will be applied to the reactions of enzymes and their variants.
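The last step in this abstract, correcting a free energy obtained on a cheap potential toward a more expensive one, rests on thermodynamic perturbation. The sketch below shows only the plain, unweighted Zwanzig exponential-averaging estimator; it is not the authors' wTP implementation, which additionally carries the umbrella-sampling weight of each sampled configuration. The function name, the kcal/mol units, and the default temperature are assumptions made for illustration.

```python
import math

# Boltzmann constant in kcal/(mol*K); the unit choice is an assumption of this sketch.
KB_KCAL_PER_MOL_K = 0.0019872041


def perturbation_free_energy(delta_u, temperature=298.15):
    """Estimate A_high - A_low by exponential averaging (Zwanzig relation).

    delta_u holds the energy gaps U_high - U_low (kcal/mol), evaluated on
    configurations sampled with the low-level potential.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    beta = 1.0 / (KB_KCAL_PER_MOL_K * temperature)
    # Factor out the smallest gap so the exponentials cannot overflow.
    shift = min(delta_u)
    total = sum(math.exp(-beta * (du - shift)) for du in delta_u)
    return shift - math.log(total / len(delta_u)) / beta
```

When every gap is identical the estimate equals that gap; otherwise the exponential average is dominated by the smallest gaps, which is why such corrections converge poorly when the two potentials sample different regions.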
Affiliation(s)
- Boyi Zhou
- Institute of Theoretical and Computational Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
- Yanzi Zhou
- Institute of Theoretical and Computational Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China
- Daiqian Xie
- Institute of Theoretical and Computational Chemistry, Key Laboratory of Mesoscopic Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, China; Hefei National Laboratory, Hefei 230088, China
26
Abstract
This work presents a variant of an electrostatic embedding scheme that allows the embedding of arbitrary machine learned potentials trained on molecular systems in vacuo. The scheme is based on physically motivated models of electronic density and polarizability, resulting in a generic model without relying on an exhaustive training set. The scheme only requires in vacuo single point QM calculations to provide training densities and molecular dipolar polarizabilities. As an example, the scheme is applied to create an embedding model for the QM7 data set using Gaussian Process Regression with only 445 reference atomic environments. The model was tested on the SARS-CoV-2 protease complex with PF-00835231, resulting in a predicted embedding energy RMSE of 2 kcal/mol, compared to explicit DFT/MM calculations.
Affiliation(s)
- Kirill Zinovjev
- Departament de Química Física, Universitat de València, 46100 Burjassot, Spain
27
Ding Y, Yu K, Huang J. Data science techniques in biomolecular force field development. Curr Opin Struct Biol 2023; 78:102502. [PMID: 36462448 DOI: 10.1016/j.sbi.2022.102502] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/03/2022] [Revised: 10/18/2022] [Accepted: 10/25/2022] [Indexed: 12/03/2022]
Abstract
Recent advances in data science are impacting the development of classical force fields. Here we review some ideas and techniques from data science that have been used in force field development, including database construction, atom typing, and machine learning potentials. We highlight how new tools such as active learning and automatic differentiation are facilitating the generation of target data and the direct fitting with macroscopic observables. Philosophical changes on how force field models should be built and used are also discussed. It's inspiring that more accurate biomolecular force fields can be developed with the aid of data science techniques.
Affiliation(s)
- Ye Ding
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China
- Kuang Yu
- Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong, 518055, China
- Jing Huang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou, Zhejiang, 310024, China; Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang, 310024, China
28
Csizi K, Reiher M. Universal QM/MM approaches for general nanoscale applications. WIREs Computational Molecular Science 2023. [DOI: 10.1002/wcms.1656] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Affiliation(s)
- Markus Reiher
- Laboratorium für Physikalische Chemie, ETH Zürich, Zürich, Switzerland
29
Cignoni E, Cupellini L, Mennucci B. Machine Learning Exciton Hamiltonians in Light-Harvesting Complexes. J Chem Theory Comput 2023; 19:965-977. [PMID: 36701385 PMCID: PMC9933434 DOI: 10.1021/acs.jctc.2c01044] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 01/27/2023]
Abstract
We propose a machine learning (ML)-based strategy for an inexpensive calculation of excitonic properties of light-harvesting complexes (LHCs). The strategy uses classical molecular dynamics simulations of LHCs in their natural environment in combination with ML prediction of the excitonic Hamiltonian of the embedded aggregate of pigments. The proposed ML model can reproduce the effects of geometrical fluctuations together with those due to electrostatic and polarization interactions between the pigments and the protein. The training is performed on the chlorophylls of the major LHC of plants, but we demonstrate that the model is able to extrapolate well beyond the initial training set. Moreover, the accuracy in predicting the effects of the environment is tested on the simulation of the small changes observed in the absorption spectra of the wild-type and a mutant of a minor LHC.
30
Nam K, Wolf-Watz M. Protein dynamics: The future is bright and complicated! Structural Dynamics (Melville, N.Y.) 2023; 10:014301. [PMID: 36865927 PMCID: PMC9974214 DOI: 10.1063/4.0000179] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/31/2023] [Accepted: 02/03/2023] [Indexed: 06/18/2023]
Abstract
Biological life depends on motion, and this manifests itself in proteins that display motion over a formidable range of time scales, spanning from femtosecond vibrations of atoms at enzymatic transition states all the way to slow domain motions occurring on microseconds to milliseconds. An outstanding challenge in contemporary biophysics and structural biology is a quantitative understanding of the linkages among protein structure, dynamics, and function. These linkages are becoming increasingly explorable due to conceptual and methodological advances. In this Perspective article, we will point toward future directions of the field of protein dynamics with an emphasis on enzymes. Research questions in the field are becoming increasingly complex, such as the mechanistic understanding of high-order interaction networks in allosteric signal propagation through a protein matrix, or the connection between local and collective motions. In analogy to the solution to the "protein folding problem," we argue that the way forward to understanding these and other important questions lies in the successful integration of experiment and computation, while utilizing the present rapid expansion of sequence and structure space. Looking forward, the future is bright, and we are in a period where we are on the doorstep of, at least in part, comprehending the importance of dynamics for biological function.
Affiliation(s)
- Kwangho Nam
- Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, USA
31
Giese TJ, Zeng J, York DM. Multireference Generalization of the Weighted Thermodynamic Perturbation Method. J Phys Chem A 2022; 126:8519-8533. [PMID: 36301936 PMCID: PMC9771595 DOI: 10.1021/acs.jpca.2c06201] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
We describe the generalized weighted thermodynamic perturbation (gwTP) method for estimating the free energy surface of an expensive "high-level" potential energy function from the umbrella sampling performed with multiple inexpensive "low-level" reference potentials. The gwTP method is a generalization of the weighted thermodynamic perturbation (wTP) method developed by Li and co-workers [J. Chem. Theory Comput. 2018, 14, 5583-5596] that uses a single "low-level" reference potential. The gwTP method offers new possibilities in model design whereby the sampling generated from several low-level potentials may be combined (e.g., specific reaction parameter models that might have variable accuracy at different stages of a multistep reaction). The gwTP method is especially well suited for use with machine learning potentials (MLPs) that are trained against computationally expensive ab initio quantum mechanical/molecular mechanical (QM/MM) energies and forces using active learning procedures that naturally produce multiple distinct neural network potentials. Simulations can be performed with greater sampling using the fast MLPs and then corrected to the ab initio level using gwTP. The capabilities of the gwTP method are demonstrated by creating reference potentials based on the MNDO/d and DFTB2/MIO semiempirical models supplemented with the "range-corrected deep potential" (DPRc). The DPRc parameters are trained to ab initio QM/MM data, and the potentials are used to calculate the free energy surface of stepwise mechanisms for nonenzymatic RNA 2'-O-transesterification model reactions. The extended sampling made possible by the reference potentials allows one to identify unequilibrated portions of the simulations that are not always evident from the short time scale commonly used with ab initio QM/MM potentials. We show that the reference potential approach can yield more accurate ab initio free energy predictions than the wTP method or what can be reasonably afforded from explicit ab initio QM/MM sampling.
Affiliation(s)
- Timothy J. Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
- Darrin M. York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
32
Snyder R, Kim B, Pan X, Shao Y, Pu J. Facilitating ab initio QM/MM free energy simulations by Gaussian process regression with derivative observations. Phys Chem Chem Phys 2022; 24:25134-25143. [PMID: 36222412 PMCID: PMC11095978 DOI: 10.1039/d2cp02820d] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
In combined quantum mechanical and molecular mechanical (QM/MM) free energy simulations, how to synthesize the accuracy of ab initio (AI) methods with the speed of semiempirical (SE) methods for a cost-effective QM treatment remains a long-standing challenge. In this work, we present a machine-learning-facilitated method for obtaining AI/MM-quality free energy profiles through efficient SE/MM simulations. In particular, we use Gaussian process regression (GPR) to learn the energy and force corrections needed for SE/MM to match with AI/MM results during molecular dynamics simulations. Force matching is enabled in our model by including energy derivatives into the observational targets through the extended-kernel formalism. We demonstrate the effectiveness of this method on the solution-phase SN2 Menshutkin reaction using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively. Trained on only 80 configurations sampled along the minimum free energy path (MFEP), the resulting GPR model reduces the average energy error in AM1/MM from 18.2 to 5.8 kcal mol⁻¹ for the 4000-sample testing set with the average force error on the QM atoms decreased from 14.6 to 3.7 kcal mol⁻¹ Å⁻¹. Free energy sampling with the GPR corrections applied (AM1-GPR/MM) produces a free energy barrier of 14.4 kcal mol⁻¹ and a reaction free energy of -34.1 kcal mol⁻¹, in closer agreement with the AI/MM benchmarks and experimental results.
Affiliation(s)
- Ryan Snyder
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA
- Bryant Kim
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA
- Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, OK 73019, USA
- Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, OK 73019, USA
- Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA
33
Hofstetter A, Böselt L, Riniker S. Graph-convolutional neural networks for (QM)ML/MM molecular dynamics simulations. Phys Chem Chem Phys 2022; 24:22497-22512. [PMID: 36106790 DOI: 10.1039/d2cp02931f] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
To accurately study chemical reactions in the condensed phase or within enzymes, both a quantum-mechanical description and sufficient configurational sampling are required to reach converged estimates. Here, quantum mechanics/molecular mechanics (QM/MM) molecular dynamics (MD) simulations play an important role, providing QM accuracy for the region of interest at a decreased computational cost. However, QM/MM simulations are still too expensive to study large systems on longer time scales. Recently, machine learning (ML) models have been proposed to replace the QM description. The main limitation of these models lies in the accurate description of long-range interactions present in condensed-phase systems. To overcome this issue, a recent workflow has been introduced combining a semi-empirical method (i.e. density functional tight binding (DFTB)) and a high-dimensional neural network potential (HDNNP) in a Δ-learning scheme. This approach has been shown to be capable of correctly incorporating long-range interactions within a cutoff of 1.4 nm. One of the promising alternative approaches to efficiently take long-range effects into account is the development of graph-convolutional neural networks (GCNNs) for the prediction of the potential-energy surface. In this work, we investigate the use of GCNN models - with and without a Δ-learning scheme - for (QM)ML/MM MD simulations. We show that the Δ-learning approach using a GCNN and DFTB as a baseline achieves competitive performance on our benchmarking set of solutes and chemical reactions in water. This method is additionally validated by performing prospective (QM)ML/MM MD simulations of retinoic acid in water and S-adenosylmethionine interacting with cytosine in water. The results indicate that the Δ-learning GCNN model is a valuable alternative for the (QM)ML/MM MD simulations of condensed-phase systems.
Affiliation(s)
- Albert Hofstetter
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Lennard Böselt
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
- Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
34
Watanabe N, Yamamoto M, Murata M, Vavricka CJ, Ogino C, Kondo A, Araki M. Comprehensive Machine Learning Prediction of Extensive Enzymatic Reactions. J Phys Chem B 2022; 126:6762-6770. [PMID: 36053051 DOI: 10.1021/acs.jpcb.2c03287] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/04/2023]
Abstract
New enzyme functions exist within the increasing number of unannotated protein sequences. Novel enzyme discovery is necessary to expand the pathways that can be accessed by metabolic engineering for the biosynthesis of functional compounds. Accordingly, various machine learning models have been developed to predict enzymatic reactions. However, the ability to predict unknown reactions that are not included in the training data has not been clarified. In order to cover uncertain and unknown reactions, a wider range of reaction types must be demonstrated by the models. Here, we establish 16 expanded enzymatic reaction prediction models developed using various machine learning algorithms, including deep neural networks. Improvements in prediction performance over that of our previous study indicate that the updated methods are more effective for the prediction of enzymatic reactions. Overall, the deep neural network model trained with combined substrate-enzyme-product information exhibits the highest prediction accuracy, with Macro F1 scores up to 0.966 and with robust prediction of unknown enzymatic reactions that are not included in the training data. This model can predict more extensive enzymatic reactions in comparison to previously reported models. This study will facilitate the discovery of new enzymes for the production of useful substances.
Affiliation(s)
- Naoki Watanabe
- Department of Chemical Science and Engineering Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
- Masaki Yamamoto
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan
- Masahiro Murata
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan
- Christopher J Vavricka
- Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
- Chiaki Ogino
- Department of Chemical Science and Engineering Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan
- Akihiko Kondo
- Department of Chemical Science and Engineering Graduate School of Engineering, Kobe University, 1-1 Rokkodai-cho, Nada, Kobe, Hyogo 657-8501, Japan; Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan
- Michihiro Araki
- Graduate School of Medicine, Kyoto University, 54 Kawahara-cho, Shogoin Sakyo-ku, Kyoto 606-8507, Japan; Graduate School of Science, Technology and Innovation, Kobe University, 1-1 Rokkodai-cho, Nada-ku, Kobe 657-8501, Japan; National Institutes of Biomedical Innovation, Health and Nutrition, National Institute of Health and Nutrition, 1-23-1 Toyama, Shinjuku-ku, Tokyo 162-8638, Japan; National Cerebral and Cardiovascular Center, 6-1 Kishibe-Shinmachi, Suita, Osaka 564-8565, Japan
35
Liao K, Dong S, Cheng Z, Li W, Li S. Combined fragment-based machine learning force field with classical force field and its application in the NMR calculations of macromolecules in solutions. Phys Chem Chem Phys 2022; 24:18559-18567. [PMID: 35916054 DOI: 10.1039/d2cp02192g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
We have developed a combined fragment-based machine learning (ML) force field and molecular mechanics (MM) force field for simulating the structures of macromolecules in solution, and we then compute their NMR chemical shifts with the generalized energy-based fragmentation (GEBF) approach at the level of density functional theory (DFT). In this work, we first construct a Gaussian approximation potential based on GEBF subsystems of macromolecules for MD simulations and then a GEBF-based neural network (GEBF-NN) with a deep potential model for the studied macromolecule. Then, we develop a GEBF-NN/MM force field for macromolecules in solution by combining the GEBF-NN force field for the solute molecule and the ff14SB force field for solvent molecules. Using the GEBF-NN/MM MD simulation to generate snapshot structures of solute/solvent clusters, we then perform the NMR calculations with the GEBF approach at the DFT level to calculate NMR chemical shifts of the solute molecule. Taking a heptamer of oligopyridine-dicarboxamides in chloroform solution as an example, our results show that the GEBF-NN force field is quite accurate for this heptamer in comparison with the reference DFT results. For this heptamer in chloroform solution, both the GEBF-NN/MM and classical MD simulations could lead to helical structures from the same initial extended structure. The GEBF-DFT NMR results indicate that the GEBF-NN/MM force field could lead to more accurate NMR chemical shifts on hydrogen atoms in comparison with the experimental NMR results. Therefore, the GEBF-NN/MM force field could be employed for predicting more accurate dynamical behaviors than the classical force field for complex systems in solution.
Affiliation(s)
- Kang Liao
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China
- Shiyu Dong
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China
- Zheng Cheng
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China
- Wei Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China
- Shuhua Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China
36
Manathunga M, Götz AW, Merz KM. Computer-aided drug design, quantum-mechanical methods for biological problems. Curr Opin Struct Biol 2022; 75:102417. [PMID: 35779437 DOI: 10.1016/j.sbi.2022.102417] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/11/2022] [Revised: 05/14/2022] [Accepted: 05/16/2022] [Indexed: 11/28/2022]
Abstract
Quantum chemistry enables the study of systems with chemical accuracy (<1 kcal/mol from experiment) but is restricted to a handful of atoms due to its computational expense. This has led to ongoing interest in optimizing and simplifying these methods while retaining accuracy. Implementing quantum mechanical (QM) methods on modern hardware such as multiple GPUs is one example of how the field is optimizing performance. Multiscale approaches like the so-called QM/molecular mechanical method are gaining popularity in drug discovery because they focus the application of QM methods on the region of choice (e.g., the binding site), while using efficient MM models to represent less relevant areas. The creation of simplified QM methods is another example, including the use of machine learning to create ultra-fast and accurate QM models. Herein, we summarize recent advancements in the development of optimized QM methods that enhance our ability to use these methods in computer aided drug discovery.
Affiliation(s)
- Madushanka Manathunga
- Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, United States. https://twitter.com/@MaduManathunga
- Andreas W Götz
- San Diego Supercomputer Center, University of California San Diego, La Jolla, CA 92093, United States. https://twitter.com/@awgoetz
- Kenneth M Merz
- Department of Chemistry and Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI 48824, United States
37
Giese TJ, Zeng J, Ekesan Ş, York DM. Combined QM/MM, Machine Learning Path Integral Approach to Compute Free Energy Profiles and Kinetic Isotope Effects in RNA Cleavage Reactions. J Chem Theory Comput 2022; 18:4304-4317. [PMID: 35709391 DOI: 10.1021/acs.jctc.2c00151] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Abstract
We present a fast, accurate, and robust approach for determination of free energy profiles and kinetic isotope effects for RNA 2'-O-transphosphorylation reactions with inclusion of nuclear quantum effects. We apply a deep potential range correction (DPRc) for combined quantum mechanical/molecular mechanical (QM/MM) simulations of reactions in the condensed phase. The method uses the second-order density-functional tight-binding method (DFTB2) as a fast, approximate base QM model. The DPRc model modifies the DFTB2 QM interactions and applies short-range corrections to the QM/MM interactions to reproduce ab initio DFT (PBE0/6-31G*) QM/MM energies and forces. The DPRc thus enables both QM and QM/MM interactions to be tuned to high accuracy, and the QM/MM corrections are designed to smoothly vanish at a specified cutoff boundary (6 Å in the present work). The computational speed-up afforded by the QM/MM+DPRc model enables free energy profiles to be calculated that include rigorous long-range QM/MM interactions under periodic boundary conditions and nuclear quantum effects through a path integral approach using a new interface between the AMBER and i-PI software. The approach is demonstrated through the calculation of free energy profiles of a native RNA cleavage model reaction and reactions involving thio-substitutions, which are important experimental probes of the mechanism. The DFTB2+DPRc QM/MM free energy surfaces agree very closely with the PBE0/6-31G* QM/MM results, and they are vastly superior to the DFTB2 QM/MM surfaces with and without weighted thermodynamic perturbation corrections. ¹⁸O and ³⁴S primary kinetic isotope effects are compared, and the influence of nuclear quantum effects on the free energy profiles is examined.
Collapse
Affiliation(s)
- Timothy J Giese
- Laboratory for Biomolecular Simulation Research, Center for Integrative Proteomics Research and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
| | - Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Center for Integrative Proteomics Research and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
| | - Şölen Ekesan
- Laboratory for Biomolecular Simulation Research, Center for Integrative Proteomics Research and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
| | - Darrin M York
- Laboratory for Biomolecular Simulation Research, Center for Integrative Proteomics Research and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, United States
| |
|
38
|
Cao L, Zeng J, Wang B, Zhu T, Zhang JZH. Ab initio neural network MD simulation of thermal decomposition of a high energy material CL-20/TNT. Phys Chem Chem Phys 2022; 24:11801-11811. [PMID: 35506927 DOI: 10.1039/d2cp00710j] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/12/2022]
Abstract
CL-20 (2,4,6,8,10,12-hexanitro-2,4,6,8,10,12-hexaazaisowurtzitane, also known as HNIW) is one of the most powerful energetic materials. However, its high sensitivity to environmental stimuli greatly reduces its safety and severely limits its application. In this work, ab initio based neural network potential (NNP) energy surfaces for both β-CL-20 and CL-20/TNT co-crystals were constructed. To accurately simulate the thermal decomposition processes of these two crystal systems, reactive molecular dynamics simulations based on the NNPs were performed. Many important intermediate species and their associated reaction paths during the decomposition were identified in the simulations, and direct results on the detonation temperatures of both systems were provided. The simulations also showed clearly that 2,4,6-trinitrotoluene (TNT) molecules in the co-crystal act as a buffer to slow down the chain reactions triggered by nitrogen dioxide, and this effect is more significant at lower temperatures. Specifically, the addition of TNT molecules in the CL-20/TNT co-crystal introduces intermolecular hydrogen bonds between CL-20 and TNT molecules in the system, thereby increasing the thermal stability of the co-crystal. The current reactive molecular dynamics simulation is based on the NNP, which accelerates ab initio molecular dynamics (AIMD) simulation by more than 3 orders of magnitude while preserving the accuracy of density functional theory (DFT) calculations. This enabled us to perform longer-time simulations at more realistic temperatures that traditional AIMD methods cannot achieve. With the advantage of the NNP in its powerful fitting ability and transferability, the NNP-based MD simulation can be widely applied to energetic material systems.
Affiliation(s)
- Liqun Cao
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
| | - Jinzhe Zeng
- Department of Chemistry and Chemical Biology, Institute for Quantitative Biomedicine, Rutgers, the State University of New Jersey, Piscataway 08854-8076, NJ, USA
| | - Bo Wang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China.
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China
| | - John Z H Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, 200062, China; Department of Chemistry, New York University, New York 10003, USA; Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong, China; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi, 030006, China
| |
|
39
|
Lier B, Poliak P, Marquetand P, Westermayr J, Oostenbrink C. BuRNN: Buffer Region Neural Network Approach for Polarizable-Embedding Neural Network/Molecular Mechanics Simulations. J Phys Chem Lett 2022; 13:3812-3818. [PMID: 35467875 PMCID: PMC9082612 DOI: 10.1021/acs.jpclett.2c00654] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 03/04/2022] [Accepted: 04/18/2022] [Indexed: 06/14/2023]
Abstract
Hybrid quantum mechanics/molecular mechanics (QM/MM) simulations have advanced the field of computational chemistry tremendously. However, they require the partitioning of a system into two different regions that are treated at different levels of theory, which can cause artifacts at the interface. Furthermore, they are still limited by high computational costs of quantum chemical calculations. In this work, we develop the buffer region neural network (BuRNN), an alternative approach to existing QM/MM schemes, which introduces a buffer region that experiences full electronic polarization by the inner QM region to minimize artifacts. The interactions between the QM and the buffer region are described by deep neural networks (NNs), which leads to the high computational efficiency of this hybrid NN/MM scheme while retaining quantum chemical accuracy. We demonstrate the BuRNN approach by performing NN/MM simulations of the hexa-aqua iron complex.
Affiliation(s)
- Bettina Lier
- Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, 1190 Vienna, Austria
| | - Peter Poliak
- Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, 1190 Vienna, Austria
- Department of Chemical Physics, Institute of Physical Chemistry and Chemical Physics, Faculty of Chemical and Food Technology, Slovak University of Technology in Bratislava, Radlinského 9, 812 37 Bratislava, Slovakia
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, University of Vienna, Währingerstraße 17, 1090 Vienna, Austria
| | - Julia Westermayr
- Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, U.K.
| | - Chris Oostenbrink
- Institute for Molecular Modeling and Simulation, Department of Material Sciences and Process Engineering, University of Natural Resources and Life Sciences, Vienna, Muthgasse 18, 1190 Vienna, Austria
| |
|
40
|
Gómez-Flores CL, Maag D, Kansari M, Vuong VQ, Irle S, Gräter F, Kubař T, Elstner M. Accurate Free Energies for Complex Condensed-Phase Reactions Using an Artificial Neural Network Corrected DFTB/MM Methodology. J Chem Theory Comput 2022; 18:1213-1226. [PMID: 34978438 DOI: 10.1021/acs.jctc.1c00811] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 11/28/2022]
Abstract
Semiempirical methods like density functional tight-binding (DFTB) allow extensive phase space sampling, making it possible to generate free energy surfaces of complex reactions in condensed-phase environments. Such a high efficiency often comes at the cost of reduced accuracy, which may be improved by developing a specific reaction parametrization (SRP) for the particular molecular system. Thiol-disulfide exchange is a nucleophilic substitution reaction that occurs in a large class of proteins. Its proper description requires a high-level ab initio method, while DFT-GGA and hybrid functionals were shown to be inadequate, and so is DFTB due to its DFT-GGA descent. We develop an SRP for thiol-disulfide exchange based on an artificial neural network (ANN) implementation in the DFTB+ software and compare its performance to that of a standard SRP approach applied to DFTB. As an application, we use both new DFTB-SRPs as components of a QM/MM scheme to investigate thiol-disulfide exchange in two molecular complexes: a solvated model system and a blood protein. Demonstrating the strengths of the methodology, highly accurate free energy surfaces are generated at a low cost, as the augmentation of DFTB with an ANN only adds a small computational overhead.
Affiliation(s)
- Claudia L Gómez-Flores
- Institute of Physical Chemistry, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
| | - Denis Maag
- Institute of Physical Chemistry, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
| | - Mayukh Kansari
- Institute of Physical Chemistry, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
| | - Van-Quan Vuong
- Bredesen Center for Interdisciplinary Research and Graduate Education, University of Tennessee, Knoxville, Tennessee 37996, United States
| | - Stephan Irle
- Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States; National Virtual Biotechnology Laboratory, U.S. Department of Energy, Washington, DC 20585, United States
| | - Frauke Gräter
- Heidelberg Institute for Theoretical Studies, 69118 Heidelberg, Germany
| | - Tomáš Kubař
- Institute of Physical Chemistry, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
| | - Marcus Elstner
- Institute of Physical Chemistry, Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany; Institute of Biological Interfaces (IBG-2), Karlsruhe Institute of Technology, 76131 Karlsruhe, Germany
| |
|
41
|
Xue Y, Wang JN, Hu W, Zheng J, Li Y, Pan X, Mo Y, Shao Y, Wang L, Mei Y. Affordable Ab Initio Path Integral for Thermodynamic Properties via Molecular Dynamics Simulations Using Semiempirical Reference Potential. J Phys Chem A 2021; 125:10677-10685. [PMID: 34894680 PMCID: PMC9108008 DOI: 10.1021/acs.jpca.1c07727] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022]
Abstract
Path integral molecular dynamics (PIMD) is becoming a routinely applied method for incorporating the nuclear quantum effect in computer simulations. However, direct PIMD simulations at an ab initio level of theory are formidably expensive. Using the protonated 1,8-bis(dimethylamino)naphthalene molecule as an example, we show in this work that the computational expense for the intramolecular proton transfer between the two nitrogen atoms can be remarkably reduced by implementing the idea of reference-potential methods. The simulation time can be easily extended to a scale of nanoseconds while maintaining the accuracy on an ab initio level of theory for thermodynamic properties. In addition, postprocessing can be carried out in parallel on massive computer nodes. A 545-fold reduction in the total CPU time can be achieved in this way as compared to a direct PIMD simulation at the same ab initio level of theory.
Affiliation(s)
- Yuanfei Xue
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200062, China
| | - Jia-Ning Wang
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200062, China
| | - Wenxin Hu
- The Computer Center, School of Data Science & Engineering, East China Normal University, Shanghai 200062, China
| | - Jun Zheng
- The Computer Center, School of Data Science & Engineering, East China Normal University, Shanghai 200062, China
| | - Yongle Li
- Department of Physics, International Center of Quantum and Molecular Structure, and Shanghai Key Laboratory of High Temperature Superconductors, Shanghai University, Shanghai 200444, China
| | - Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
| | - Yan Mo
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200062, China; NYU–ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, Oklahoma 73019, United States
| | - Lu Wang
- Department of Chemistry and Chemical Biology, Institute for Quantitative Biomedicine, Rutgers University, Piscataway, New Jersey 08854, United States
| | - Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200062, China; NYU–ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| |
|