1
|
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of range-separated machine learning force fields. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with errors of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
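As a rough illustration of how long-range energies follow from per-atom asymptotic parameters, the sketch below evaluates point-charge electrostatics and undamped C6 dispersion between two fragments. It is a simplification and not the authors' model: the abstract describes full multipole moments and polarizabilities, while the function name `long_range_energy`, the geometric-mean combining rule for C6, and the conversion constant are assumptions made for illustration.

```python
import numpy as np

# Approximate Coulomb constant in kcal*Angstrom/(mol*e^2); assumed here for unit conversion.
COULOMB_KCAL = 332.0637


def long_range_energy(pos_a, pos_b, q_a, q_b, c6_a, c6_b):
    """Return (electrostatic, dispersion) energies between fragments A and B.

    pos_*: (n, 3) coordinates in Angstrom
    q_*:   (n,) atomic charges in e (hypothetical stand-in for predicted multipoles)
    c6_*:  (n,) atomic C6 coefficients in kcal/mol*Angstrom^6 (hypothetical inputs)
    """
    # All intermolecular distances r_ij between atoms i in A and j in B.
    r = np.linalg.norm(pos_a[:, None, :] - pos_b[None, :, :], axis=-1)
    # Point-charge Coulomb sum (monopole term only).
    e_elec = COULOMB_KCAL * np.sum(np.outer(q_a, q_b) / r)
    # Pair C6 from a geometric-mean combining rule (an assumption), no damping.
    c6_pair = np.sqrt(np.outer(c6_a, c6_b))
    e_disp = -np.sum(c6_pair / r**6)
    return e_elec, e_disp
```

In a production force field these sums would include higher multipoles and short-range damping; the sketch only shows the asymptotic pairwise form.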
Affiliation(s)
- Zheng Cheng
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- AI for Science Institute, Beijing 100084, P. R. China
| | - Hangrui Bi
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- DP Technology, Beijing 100080, P. R. China
| | - Siyuan Liu
- DP Technology, Beijing 100080, P. R. China
| | - Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
| | - Alston J Misquitta
- School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
| | - Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
|
2
|
Chen Z, Yang W. Development of a machine learning finite-range nonlocal density functional. J Chem Phys 2024; 160:014105. [PMID: 38180254 DOI: 10.1063/5.0179149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Accepted: 12/12/2023] [Indexed: 01/06/2024] Open
Abstract
Kohn-Sham density functional theory has been the most popular method in electronic structure calculations. To fulfill the increasing accuracy requirements, new approximate functionals are needed to address key issues in existing approximations. It is well known that nonlocal components are crucial. Current nonlocal functionals mostly require orbital dependence, such as in Hartree-Fock exchange and many-body perturbation correlation energy, which, however, leads to higher computational costs. Deviating from this pathway, we describe functional nonlocality in a new approach. By partitioning the total density into atom-centered local densities, a many-body expansion is proposed. This many-body expansion can be truncated at one-body contributions if a base functional is used and an energy correction is approximated. The contribution from each atom-centered local density is a single finite-range nonlocal functional that is universal for all atoms. We then use machine learning to develop this universal atom-centered functional. Parameters in this functional are determined by fitting to data that are produced by high-level theories. Extensive tests on several different test sets, which include reaction energies, reaction barrier heights, and non-covalent interaction energies, show that the new functional, with only the density as the basic variable, can produce results comparable to the best-performing double-hybrid functionals at a lower computational cost. For example, for the thermochemistry test set selected from the GMTKN55 database, the BLYP-based machine learning functional gives a weighted total mean absolute deviation of 3.33 kcal/mol, while DSD-BLYP-D3(BJ) gives 3.28 kcal/mol. This opens a new pathway to nonlocal functional development and applications.
Affiliation(s)
- Zehua Chen
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
| | - Weitao Yang
- Department of Chemistry and Department of Physics, Duke University, Durham, North Carolina 27708, USA
|
3
|
Zhang P, Yang W. Toward a general neural network force field for protein simulations: Refining the intramolecular interaction in protein. J Chem Phys 2023; 159:024118. [PMID: 37431910 PMCID: PMC10481389 DOI: 10.1063/5.0142280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/12/2023] [Accepted: 06/22/2023] [Indexed: 07/12/2023] Open
Abstract
Molecular dynamics (MD) is an extremely powerful, highly effective, and widely used approach to understanding the nature of chemical processes in atomic details for proteins. The accuracy of results from MD simulations is highly dependent on force fields. Currently, molecular mechanical (MM) force fields are mainly utilized in MD simulations because of their low computational cost. Quantum mechanical (QM) calculation has high accuracy, but it is exceedingly time consuming for protein simulations. Machine learning (ML) provides the capability for generating accurate potential at the QM level without increasing much computational effort for specific systems that can be studied at the QM level. However, the construction of general machine learned force fields, needed for broad applications and large and complex systems, is still challenging. Here, general and transferable neural network (NN) force fields based on CHARMM force fields, named CHARMM-NN, are constructed for proteins by training NN models on 27 fragments partitioned from the residue-based systematic molecular fragmentation (rSMF) method. The NN for each fragment is based on atom types and uses new input features that are similar to MM inputs, including bonds, angles, dihedrals, and non-bonded terms, which enhance the compatibility of CHARMM-NN to MM MD and enable the implementation of CHARMM-NN force fields in different MD programs. While the main part of the energy of the protein is based on rSMF and NN, the nonbonded interactions between the fragments and with water are taken from the CHARMM force field through mechanical embedding. The validations of the method for dipeptides on geometric data, relative potential energies, and structural reorganization energies demonstrate that the CHARMM-NN local minima on the potential energy surface are very accurate approximations to QM, showing the success of CHARMM-NN for bonded interactions. 
However, the MD simulations on peptides and proteins indicate that more accurate methods to represent protein-water interactions in fragments and non-bonded interactions between fragments should be considered in the future improvement of CHARMM-NN, which can increase the accuracy of approximation beyond the current mechanical embedding QM/MM level.
Affiliation(s)
- Pan Zhang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
| | - Weitao Yang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
|
4
|
Chen WK, Wang SR, Liu XY, Fang WH, Cui G. Nonadiabatic Derivative Couplings Calculated Using Information of Potential Energy Surfaces without Wavefunctions: Ab Initio and Machine Learning Implementations. Molecules 2023; 28:4222. [PMID: 37241962 DOI: 10.3390/molecules28104222] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/16/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
In this work, we implemented an approximate algorithm for calculating nonadiabatic coupling matrix elements (NACMEs) of a polyatomic system with ab initio methods and machine learning (ML) models. Utilizing this algorithm, one can calculate NACMEs using only the information of potential energy surfaces (PESs), i.e., energies, and gradients as well as Hessian matrix elements. We used a realistic system, namely CH2NH, to compare NACMEs calculated by this approximate PES-based algorithm and the accurate wavefunction-based algorithm. Our results show that this approximate PES-based algorithm can give very accurate results comparable to the wavefunction-based algorithm except at energetically degenerate points, i.e., conical intersections. We also tested a machine learning (ML)-trained model with this approximate PES-based algorithm, which also supplied similarly accurate NACMEs but more efficiently. The advantage of this PES-based algorithm is its significant potential to combine with electronic structure methods that do not implement wavefunction-based algorithms, low-scaling energy-based fragment methods, etc., and in particular efficient ML models, to compute NACMEs. The present work could encourage further research on nonadiabatic processes of large systems simulated by ab initio nonadiabatic dynamics simulation methods in which NACMEs are always required.
Affiliation(s)
- Wen-Kai Chen
- Hebei Key Laboratory of Inorganic Nano-Materials, College of Chemistry and Materials Science, Hebei Normal University, Shijiazhuang 050024, China
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| | - Sheng-Rui Wang
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| | - Xiang-Yang Liu
- College of Chemistry and Material Science, Sichuan Normal University, Chengdu 610068, China
| | - Wei-Hai Fang
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Hefei National Laboratory, Hefei 230088, China
| | - Ganglong Cui
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Hefei National Laboratory, Hefei 230088, China
|
5
|
Zhu Q, Ge Y, Li W, Ma J. Treating Polarization Effects in Charged and Polar Bio-Molecules Through Variable Electrostatic Parameters. J Chem Theory Comput 2023; 19:396-411. [PMID: 36592097 DOI: 10.1021/acs.jctc.2c01130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2023]
Abstract
Polarization plays important roles in charged and hydrogen-bonded systems. Much effort, ranging from the construction of physics-based models to quantum mechanics (QM)-based and machine learning (ML)-assisted models, has been devoted to incorporating the polarization effect into conventional force fields at different levels, such as atomistic and coarse-grained (CG). The application of polarizable force fields or polarization models has been limited by two aspects, namely, computational cost and transferability. Unlike physics-based models, QM-based approaches require no predetermined parameters. Taking advantage of both the accuracy of QM calculations and the efficiency of molecular mechanics (MM) and ML, polarization effects can be treated more efficiently while maintaining QM accuracy. The computational cost can be reduced with variable electrostatic parameters, such as the charge, dipole, and electronic dielectric constant, with the help of linear-scaling fragmentation-based QM calculations and ML models. Polarization and entropy effects on the prediction of partition coefficients of druglike molecules are demonstrated using explicit or implicit all-atom molecular dynamics simulations and machine learning-assisted models. Directions and challenges for future development are also envisioned.
Affiliation(s)
- Qiang Zhu
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
| | - Yang Ge
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
| | - Wei Li
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
| | - Jing Ma
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, P. R. China
|
6
|
Liu J, He X. Recent advances in quantum fragmentation approaches to complex molecular and condensed-phase systems. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Affiliation(s)
- Jinfeng Liu
- Department of Basic Medicine and Clinical Pharmacy China Pharmaceutical University Nanjing China
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering East China Normal University Shanghai China
| | - Xiao He
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Frontiers Science Center of Molecule Intelligent Syntheses, School of Chemistry and Molecular Engineering East China Normal University Shanghai China
- New York University‐East China Normal University Center for Computational Chemistry New York University Shanghai Shanghai China
|
7
|
Liao K, Dong S, Cheng Z, Li W, Li S. Combined fragment-based machine learning force field with classical force field and its application in the NMR calculations of macromolecules in solutions. Phys Chem Chem Phys 2022; 24:18559-18567. [PMID: 35916054 DOI: 10.1039/d2cp02192g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
We have developed a combined fragment-based machine learning (ML) force field and molecular mechanics (MM) force field for simulating the structures of macromolecules in solution, whose NMR chemical shifts are then computed with the generalized energy-based fragmentation (GEBF) approach at the level of density functional theory (DFT). In this work, we first construct a Gaussian approximation potential based on GEBF subsystems of macromolecules for MD simulations and then a GEBF-based neural network (GEBF-NN) with the deep potential model for the studied macromolecule. Then, we develop a GEBF-NN/MM force field for macromolecules in solution by combining the GEBF-NN force field for the solute molecule and the ff14SB force field for solvent molecules. Using the GEBF-NN/MM MD simulation to generate snapshot structures of solute/solvent clusters, we then perform the NMR calculations with the GEBF approach at the DFT level to calculate NMR chemical shifts of the solute molecule. Taking a heptamer of oligopyridine-dicarboxamides in chloroform solution as an example, our results show that the GEBF-NN force field is quite accurate for this heptamer when compared with the reference DFT results. For this heptamer in chloroform solution, both the GEBF-NN/MM and classical MD simulations lead to helical structures from the same initial extended structure. The GEBF-DFT NMR results indicate that the GEBF-NN/MM force field leads to more accurate NMR chemical shifts on hydrogen atoms when compared with the experimental NMR results. Therefore, the GEBF-NN/MM force field could be employed for predicting more accurate dynamical behaviors than the classical force field for complex systems in solution.
Affiliation(s)
- Kang Liao
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
| | - Shiyu Dong
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
| | - Zheng Cheng
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
| | - Wei Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
| | - Shuhua Li
- School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
|
8
|
Xu M, Zhu T, Zhang JZH. Automated Construction of Neural Network Potential Energy Surface: The Enhanced Self-Organizing Incremental Neural Network Deep Potential Method. J Chem Inf Model 2021; 61:5425-5437. [PMID: 34752095 DOI: 10.1021/acs.jcim.1c01125] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
In recent years, the use of deep learning (neural network) potential energy surface (NNPES) in molecular dynamics simulation has experienced explosive growth as it can be as accurate as quantum chemistry methods while being as efficient as classical mechanic methods. However, the development of NNPES is highly nontrivial. In particular, it has been troubling to construct a dataset that is as small as possible yet can cover the target chemical space. In this work, an ESOINN-DP method is developed, which has the enhanced self-organizing incremental neural network (ESOINN) and a newly proposed error indicator at its core. With ESOINN-DP, one can construct the NNPES with little human intervention, and this method ensures that the constructed reference dataset covers the target chemical space with minimum redundancy. The performance of the ESOINN-DP method has been well validated by developing neural network potential energy surfaces for water clusters, tripeptides, and by de-redundancy of a sub-dataset of the ANI-1 database. We believe that the ESOINN-DP method provides a novel idea for the construction of NNPES and, especially, the reference datasets, and it can be used for molecular dynamics (MD) simulations of various gas-phase and condensed-phase chemical systems.
Affiliation(s)
- Mingyuan Xu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
| | - Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| | - John Z H Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China; Department of Chemistry, New York University, New York, New York 10003, United States; Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
|
9
|
Cheng Z, Du J, Zhang L, Ma J, Li W, Li S. Building quantum mechanics quality force fields of proteins with the generalized energy-based fragmentation approach and machine learning. Phys Chem Chem Phys 2021; 24:1326-1337. [PMID: 34718360 DOI: 10.1039/d1cp03934b] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/27/2022]
Abstract
We combined our generalized energy-based fragmentation (GEBF) approach and machine learning (ML) technique to construct quantum mechanics (QM) quality force fields for proteins. In our scheme, the training sets for a protein are only constructed from its small subsystems, which capture all short-range interactions in the target system. The energy of a given protein is expressed as the summation of atomic contributions from QM calculations of various subsystems, corrected by long-range Coulomb and van der Waals interactions. With the Gaussian approximation potential (GAP) method, our protocol can automatically generate training sets with high efficiency. To facilitate the construction of training sets for proteins, we store all trained subsystem data in a library. If subsystems in the library are detected in a new protein, corresponding datasets can be directly reused as a part of the training set on this new protein. With two polypeptides, 4ZNN and 1XQ8 segment, as examples, the energies and forces predicted by GEBF-GAP are in good agreement with those from conventional QM calculations, and dihedral angle distributions from GEBF-GAP molecular dynamics (MD) simulations can also well reproduce those from ab initio MD simulations. In addition, with the training set generated from GEBF-GAP, we also demonstrate that GEBF-ML force fields constructed by neural network (NN) methods can also show QM quality. Therefore, the present work provides an efficient and systematic way to build QM quality force fields for biological systems.
Affiliation(s)
- Zheng Cheng
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
| | - Jiahui Du
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
| | - Lei Zhang
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
| | - Jing Ma
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
| | - Wei Li
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
| | - Shuhua Li
- Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, School of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, 210023, P. R. China.
|
10
|
Westermayr J, Marquetand P. Machine Learning for Electronically Excited States of Molecules. Chem Rev 2021; 121:9873-9926. [PMID: 33211478 PMCID: PMC8391943 DOI: 10.1021/acs.chemrev.0c00749] [Citation(s) in RCA: 162] [Impact Index Per Article: 54.0] [Received: 07/17/2020] [Indexed: 12/11/2022]
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
| | - Philipp Marquetand
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria
|
12
|
Husic BE, Charron NE, Lemm D, Wang J, Pérez A, Majewski M, Krämer A, Chen Y, Olsson S, de Fabritiis G, Noé F, Clementi C. Coarse graining molecular dynamics with graph neural networks. J Chem Phys 2020; 153:194101. [PMID: 33218238 PMCID: PMC7671749 DOI: 10.1063/5.0026133] [Citation(s) in RCA: 71] [Impact Index Per Article: 17.8] [Received: 08/21/2020] [Accepted: 10/27/2020] [Indexed: 11/14/2022] Open
Abstract
Coarse graining enables the investigation of molecular dynamics for larger systems and at longer timescales than is possible at an atomic resolution. However, a coarse graining model must be formulated such that the conclusions we draw from it are consistent with the conclusions we would draw from a model at a finer level of detail. It has been proved that a force matching scheme defines a thermodynamically consistent coarse-grained model for an atomistic system in the variational limit. Wang et al. [ACS Cent. Sci. 5, 755 (2019)] demonstrated that the existence of such a variational limit enables the use of a supervised machine learning framework to generate a coarse-grained force field, which can then be used for simulation in the coarse-grained space. Their framework, however, requires the manual input of molecular features to machine learn the force field. In the present contribution, we build upon the advance of Wang et al. and introduce a hybrid architecture for the machine learning of coarse-grained force fields that learn their own features via a subnetwork that leverages continuous filter convolutions on a graph neural network architecture. We demonstrate that this framework succeeds at reproducing the thermodynamics for small biomolecular systems. Since the learned molecular representations are inherently transferable, the architecture presented here sets the stage for the development of machine-learned, coarse-grained force fields that are transferable across molecular systems.
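The force matching objective mentioned in this abstract can be sketched as a mean squared difference between atomistic forces mapped onto the coarse-grained beads and the forces predicted by a candidate coarse-grained model. This is a minimal illustration only: the names `force_matching_loss` and `harmonic_force` are hypothetical, and a toy harmonic model stands in for the graph neural network described by the authors.

```python
import numpy as np


def force_matching_loss(cg_coords, mapped_forces, cg_force_fn):
    """Mean squared force-matching error over a set of frames.

    cg_coords:     (n_frames, n_beads, 3) coarse-grained coordinates
    mapped_forces: (n_frames, n_beads, 3) atomistic forces mapped to the beads
    cg_force_fn:   callable taking (n_beads, 3) coordinates and returning forces
    """
    predicted = np.stack([cg_force_fn(x) for x in cg_coords])
    # Squared norm of the force residual per frame, averaged over frames.
    return np.mean(np.sum((mapped_forces - predicted) ** 2, axis=(1, 2)))


def harmonic_force(k):
    """Toy coarse-grained model (an assumption): F = -k * x for every bead."""
    def force(x):
        return -k * x
    return force
```

Minimizing this loss over the model parameters (here only `k`) is the supervised-learning step; in the toy case the loss vanishes when `k` matches the spring constant that generated the reference forces.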
Affiliation(s)
| | | | - Dominik Lemm
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
| | | | - Adrià Pérez
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
| | - Maciej Majewski
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr. Aiguader 88, Barcelona, Spain
| | - Andreas Krämer
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
| | | | - Simon Olsson
- Department of Mathematics and Computer Science, Freie Universität, Berlin, Germany
|
13
|
Westermayr J, Marquetand P. Deep learning for UV absorption spectra with SchNarc: First steps toward transferability in chemical compound space. J Chem Phys 2020; 153:154112. [DOI: 10.1063/5.0021915] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 02/06/2023] Open
Affiliation(s)
- J. Westermayr
  - Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- P. Marquetand
  - Faculty of Chemistry, Institute of Theoretical Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
  - Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
  - Faculty of Chemistry, Data Science @ Uni Vienna, University of Vienna, Währinger Str. 29, 1090 Vienna, Austria
14
Westermayr J, Marquetand P. Machine learning and excited-state molecular dynamics. Mach Learn Sci Technol 2020. [DOI: 10.1088/2632-2153/ab9c3e] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/21/2022]
15
Cheng Z, Zhao D, Ma J, Li W, Li S. An On-the-Fly Approach to Construct Generalized Energy-Based Fragmentation Machine Learning Force Fields of Complex Systems. J Phys Chem A 2020; 124:5007-5014. [DOI: 10.1021/acs.jpca.0c04526] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/25/2023]
Affiliation(s)
- Zheng Cheng
  - Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Dongbo Zhao
  - Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
  - Kuang Yaming Honors School, Nanjing University, Nanjing 210023, People’s Republic of China
- Jing Ma
  - Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Wei Li
  - Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Shuhua Li
  - Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
16
Westermayr J, Gastegger M, Marquetand P. Combining SchNet and SHARC: The SchNarc Machine Learning Approach for Excited-State Dynamics. J Phys Chem Lett 2020; 11:3828-3834. [PMID: 32311258 PMCID: PMC7246974 DOI: 10.1021/acs.jpclett.0c00527] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 02/17/2020] [Accepted: 04/20/2020] [Indexed: 05/26/2023]
Abstract
In recent years, deep learning has become a part of our everyday life and is revolutionizing quantum chemistry as well. In this work, we show how deep learning can be used to advance the research field of photochemistry by learning all important properties (multiple energies, forces, and different couplings) for photodynamics simulations. We simplify such simulations substantially by (i) a phase-free training skipping costly preprocessing of raw quantum chemistry data; (ii) rotationally covariant nonadiabatic couplings, which can either be trained or (iii) alternatively be approximated from only ML potentials, their gradients, and Hessians; and (iv) incorporating spin-orbit couplings. As the deep-learning method, we employ SchNet with its automatically determined representation of molecular structures and extend it for multiple electronic states. In combination with the molecular dynamics program SHARC, our approach termed SchNarc is tested on two polyatomic molecules and paves the way toward efficient photodynamics simulations of complex systems.
Affiliation(s)
- Julia Westermayr
  - Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
- Michael Gastegger
  - Machine Learning Group, Technical University of Berlin, 10587 Berlin, Germany
- Philipp Marquetand
  - Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
  - Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Str. 17, 1090 Vienna, Austria
  - Data Science @ Uni Vienna, University of Vienna, Währinger Str. 29, 1090 Vienna, Austria
17
Wang Z, Han Y, Li J, He X. Combining the Fragmentation Approach and Neural Network Potential Energy Surfaces of Fragments for Accurate Calculation of Protein Energy. J Phys Chem B 2020; 124:3027-3035. [PMID: 32208716 DOI: 10.1021/acs.jpcb.0c01370] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/16/2022]
Abstract
Accurate and efficient all-atom quantum mechanical (QM) calculations for biomolecules still present a challenge to computational physicists and chemists. In this study, an extensible generalized molecular fractionation with a conjugate caps method combined with neural networks (NN-GMFCC) is developed for efficient QM calculation of protein energy. In the NN-GMFCC scheme, the total energy of a given protein is calculated by taking a proper combination of the high-precision neural network potential energies of all capped residues and overlapping conjugate caps. In addition, the two-body interaction energies of residue pairs are calculated by molecular mechanics (MM). With reference to the GMFCC/MM calculation at the ωB97XD/6-31G* level, the overall mean unsigned errors of the energy deviations and atomic force root-mean-squared errors calculated by NN-GMFCC are only 2.01 kcal/mol and 0.68 kcal/mol/Å, respectively, for 14 proteins (containing up to 13,728 atoms). Meanwhile, the NN-GMFCC approach is about 4 orders of magnitude faster than the GMFCC/MM method. The NN-GMFCC method could be systematically improved by inclusion of two-body QM interaction and multibody electronic polarization effect. Moreover, the NN-GMFCC approach can also be applied to other macromolecular systems such as DNA/RNA, and it is capable of providing a powerful and efficient approach for exploration of structures and functions of proteins with QM accuracy.
Affiliation(s)
- Zhilong Wang
  - Key Laboratory of Thin Film and Micro Fabrication, Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Yanqiang Han
  - Key Laboratory of Thin Film and Micro Fabrication, Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Jinjin Li
  - Key Laboratory of Thin Film and Micro Fabrication, Ministry of Education, Department of Micro/Nano-electronics, Shanghai Jiao Tong University, Shanghai 200240, China
- Xiao He
  - Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
18
Wang J, Olsson S, Wehmeyer C, Pérez A, Charron NE, de Fabritiis G, Noé F, Clementi C. Machine Learning of Coarse-Grained Molecular Dynamics Force Fields. ACS Cent Sci 2019; 5:755-767. [PMID: 31139712 PMCID: PMC6535777 DOI: 10.1021/acscentsci.8b00913] [Citation(s) in RCA: 199] [Impact Index Per Article: 39.8] [Received: 12/09/2018] [Indexed: 05/17/2023]
Abstract
Atomistic or ab initio molecular dynamics simulations are widely used to predict thermodynamics and kinetics and relate them to molecular structure. A common approach to go beyond the time- and length-scales accessible with such computationally expensive simulations is the definition of coarse-grained molecular models. Existing coarse-graining approaches define an effective interaction potential to match defined properties of high-resolution models or experimental data. In this paper, we reformulate coarse-graining as a supervised machine learning problem. We use statistical learning theory to decompose the coarse-graining error and cross-validation to select and compare the performance of different models. We introduce CGnets, a deep learning approach, that learns coarse-grained free energy functions and can be trained by a force-matching scheme. CGnets maintain all physically relevant invariances and allow one to incorporate prior physics knowledge to avoid sampling of unphysical structures. We show that CGnets can capture all-atom explicit-solvent free energy surfaces with models using only a few coarse-grained beads and no solvent, while classical coarse-graining methods fail to capture crucial features of the free energy surface. Thus, CGnets are able to capture multibody terms that emerge from the dimensionality reduction.
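The force-matching scheme mentioned in this abstract can be illustrated with a minimal sketch. It is not the CGnets implementation: in place of a neural network it assumes a hypothetical two-bead system with a harmonic bond potential, and the names `cg_forces`, `force_matching_loss`, `k`, and `r0` are illustrative only. The loss is the mean squared difference between the forces derived from the coarse-grained potential and reference forces, which here are synthetic stand-ins for forces mapped from a finer-grained model.

```python
import numpy as np

def cg_forces(coords, k, r0):
    """Forces on two CG beads from a harmonic bond U = 0.5 * k * (r - r0)**2.

    coords: array of shape (n_frames, 2, 3) with bead positions.
    Returns an array of the same shape holding -dU/dx for each bead.
    """
    diff = coords[:, 1] - coords[:, 0]                # bond vectors, (n_frames, 3)
    r = np.linalg.norm(diff, axis=1, keepdims=True)   # bond lengths, (n_frames, 1)
    unit = diff / r
    f1 = -k * (r - r0) * unit                         # force on bead 1
    return np.stack([-f1, f1], axis=1)                # bead 0 feels the opposite force

def force_matching_loss(coords, ref_forces, k, r0):
    """Mean over frames of the squared norm of (model forces - reference forces)."""
    diff = cg_forces(coords, k, r0) - ref_forces
    return float(np.mean(np.sum(diff ** 2, axis=(1, 2))))

# Synthetic demonstration: reference forces come from known parameters
# (k = 2.0, r0 = 1.5), standing in for mapped fine-grained forces.
rng = np.random.default_rng(0)
coords = rng.normal(size=(100, 2, 3))
ref_forces = cg_forces(coords, k=2.0, r0=1.5)

loss_true = force_matching_loss(coords, ref_forces, k=2.0, r0=1.5)
loss_wrong = force_matching_loss(coords, ref_forces, k=1.0, r0=1.5)
```

In this toy setting the loss vanishes at the parameters that generated the reference forces and is positive elsewhere; a learned model would replace the harmonic potential with a trainable energy function and minimize the same kind of loss.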
Affiliation(s)
- Jiang Wang
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
- Simon Olsson
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Christoph Wehmeyer
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Adrià Pérez
  - Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
- Nicholas E. Charron
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Physics, Rice University, Houston, Texas 77005, United States
- Gianni de Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Dr Aiguader 88, 08003 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain
- Frank Noé
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
- Cecilia Clementi
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States
  - Department of Chemistry, Rice University, Houston, Texas 77005, United States
  - Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
  - Department of Physics, Rice University, Houston, Texas 77005, United States