1
Teng C, Wang Y, Bao JL. Physical Prior Mean Function-Driven Gaussian Processes Search for Minimum-Energy Reaction Paths with a Climbing-Image Nudged Elastic Band: A General Method for Gas-Phase, Interfacial, and Bulk-Phase Reactions. J Chem Theory Comput 2024; 20:4308-4324. [PMID: 38720441 DOI: 10.1021/acs.jctc.4c00291] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
The climbing-image nudged elastic band (CI-NEB) method serves as an indispensable tool for computational chemists, offering insight into minimum-energy reaction paths (MEPs) by delineating both transition states (TSs) and intermediate nonstationary structures along reaction coordinates. However, executing CI-NEB calculations for reactions with extensive reaction coordinate spans necessitates a large number of images to ensure a reliable convergence of the MEPs and TS structures, presenting a computationally demanding optimization challenge, even with mildly costly electronic-structure methods. In this study, we advocate for the utilization of physically inspired prior mean function-based Gaussian processes (GPs) to expedite MEP exploration and TS optimization via the CI-NEB method. By incorporating reliable prior physical approximations into potential energy surface (PES) modeling, we demonstrate enhanced efficiency in multidimensional CI-NEB optimization with surrogate-based optimizers. Our physically informed GP approach not only outperforms traditional nonsurrogate-based optimizers in optimization efficiency but also on-the-fly learns the reaction path valley during optimization, culminating in significant advancements. The surrogate PES derived from our optimization exhibits high accuracy compared to true PES references, aligning with our emphasis on leveraging reliable physical priors for robust and efficient posterior mean learning in GPs. Through a systematic benchmark study encompassing various reaction pathways, including gas-phase, bulk-phase, and interfacial/surface reactions, our physical GPs consistently demonstrate superior efficiency and reliability. For instance, they outperform the popular fast inertial relaxation engine optimizer by approximately a factor of 10, showcasing their versatility and efficacy in exploring reaction mechanisms and surface reaction PESs.
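The role of a prior mean function in a GP surrogate can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the squared-exponential kernel, the one-dimensional toy energy, and the names `rbf_kernel`, `gp_posterior_mean`, `true_energy`, and `cheap_prior` are all assumptions chosen for illustration. The GP only has to learn the residual between the reference data and the prior, and its prediction falls back to the prior far from the training data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.5, variance=1.0):
    """Squared-exponential kernel between point sets a (n, d) and b (m, d)."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return variance * np.exp(-0.5 * sq_dist / length_scale**2)

def gp_posterior_mean(x_train, y_train, x_query, prior_mean, noise=1e-8):
    """Posterior mean m(x*) + k*^T (K + noise*I)^-1 (y - m(X))."""
    k_train = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_query = rbf_kernel(x_query, x_train)
    residual = y_train - prior_mean(x_train)  # the GP models only this part
    weights = np.linalg.solve(k_train, residual)
    return prior_mean(x_query) + k_query @ weights

# Hypothetical "expensive" energy and a cheaper physical approximation to it.
def true_energy(x):
    return 0.5 * x[:, 0] ** 2 + 0.1 * np.sin(3.0 * x[:, 0])

def cheap_prior(x):
    return 0.5 * x[:, 0] ** 2

x_train = np.linspace(-2.0, 2.0, 9).reshape(-1, 1)
y_train = true_energy(x_train)

# One query inside the sampled region, one far outside it.
x_query = np.array([[0.3], [10.0]])
prediction = gp_posterior_mean(x_train, y_train, x_query, cheap_prior)
fit_at_train = gp_posterior_mean(x_train, y_train, x_train, cheap_prior)
```

With a zero prior mean the far-away prediction would decay to zero; with a physically motivated prior it decays to the prior's estimate instead, which is the behavior the abstract relies on.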
Affiliation(s)
- Chong Teng
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Yang Wang
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
2
Fang W, Zhu YC, Cheng Y, Hao YP, Richardson JO. Robust Gaussian Process Regression Method for Efficient Tunneling Pathway Optimization: Application to Surface Processes. J Chem Theory Comput 2024; 20:3766-3778. [PMID: 38708859 PMCID: PMC11099967 DOI: 10.1021/acs.jctc.4c00158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Revised: 04/17/2024] [Accepted: 04/17/2024] [Indexed: 05/07/2024]
Abstract
Simulation of surface processes is a key part of computational chemistry that offers atomic-scale insights into mechanisms of heterogeneous catalysis, diffusion dynamics, and quantum tunneling phenomena. The most common theoretical approaches involve optimization of reaction pathways, including semiclassical tunneling pathways (called instantons). The computational effort can be demanding, especially for instanton optimizations with an ab initio electronic structure. Recently, machine learning has been applied to accelerate reaction-pathway optimization, showing great potential for a wide range of applications. However, previous methods still suffer from numerical and efficiency issues and were not designed for condensed-phase reactions. We propose an improved framework based on Gaussian process regression for general transformed coordinates, which has improved efficiency and numerical stability, and we propose a descriptor that combines internal and Cartesian coordinates suitable for modeling surface processes. We demonstrate with 11 instanton optimizations in three representative systems that the improved approach makes ab initio instanton optimization significantly cheaper, such that it becomes not much more expensive than a classical transition-state theory rate calculation.
Affiliation(s)
- Wei Fang
  - Department of Chemistry, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Fudan University, Shanghai 200438, P. R. China
  - Laboratory of Physical Chemistry, ETH Zürich, Zürich 8093, Switzerland
  - State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, P. R. China
- Yu-Cheng Zhu
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, Frontier Science Center for Nano-optoelectronics and School of Physics, Peking University, Beijing 100871, China
- Yihan Cheng
  - State Key Laboratory for Artificial Microstructure and Mesoscopic Physics, Frontier Science Center for Nano-optoelectronics and School of Physics, Peking University, Beijing 100871, China
- Yi-Ping Hao
  - State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, P. R. China
- Jeremy O. Richardson
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Zürich 8093, Switzerland
3
Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
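For readers unfamiliar with the two-dimensional Müller-Brown potential mentioned in the abstract, the sketch below evaluates it and its analytic gradient on a grid, producing the kind of energy and force data a simple FNN or GPR model could be fitted to. The parameter values are the commonly quoted ones for this potential; the function names, grid ranges, and grid size are assumptions made here and are not taken from the tutorial itself.

```python
import numpy as np

# Commonly quoted Müller-Brown parameters (four exponential terms).
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
x0 = np.array([1.0, 0.0, -0.5, -1.0])
y0 = np.array([0.0, 0.5, 1.5, 1.0])

def muller_brown(x, y):
    """Potential energy at (x, y); accepts scalars or arrays of equal shape."""
    dx = np.asarray(x, dtype=float)[..., None] - x0
    dy = np.asarray(y, dtype=float)[..., None] - y0
    return np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2), axis=-1)

def muller_brown_gradient(x, y):
    """Analytic gradient (dV/dx, dV/dy), stacked along the last axis."""
    dx = np.asarray(x, dtype=float)[..., None] - x0
    dy = np.asarray(y, dtype=float)[..., None] - y0
    term = A * np.exp(a * dx**2 + b * dx * dy + c * dy**2)
    grad_x = np.sum(term * (2.0 * a * dx + b * dy), axis=-1)
    grad_y = np.sum(term * (b * dx + 2.0 * c * dy), axis=-1)
    return np.stack([grad_x, grad_y], axis=-1)

# Hypothetical training grid: energies as labels, negative gradients as forces.
xs = np.linspace(-1.5, 1.0, 21)
ys = np.linspace(-0.5, 2.0, 21)
grid_x, grid_y = np.meshgrid(xs, ys)
energies = muller_brown(grid_x, grid_y)
forces = -muller_brown_gradient(grid_x, grid_y)
```

Training on forces as well as energies roughly triples the information per sampled point in two dimensions, which is one reason gradient data are attractive for surrogate models.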
Affiliation(s)
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Ryan Snyder
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Jia-Ning Wang
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Chance Lander
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Carly Wickizer
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Richard Van
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
- Andrew Chesney
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Yuanfei Xue
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuezhi Mao
  - Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
- Ye Mei
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
4
Xu W, Zhao Y, Chen J, Wan Z, Yan D, Zhang X, Zhang R. A Q-learning method based on coarse-to-fine potential energy surface for locating transition state and reaction pathway. J Comput Chem 2024; 45:487-497. [PMID: 37966714 DOI: 10.1002/jcc.27259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 10/25/2023] [Accepted: 11/02/2023] [Indexed: 11/16/2023]
Abstract
The transition state (TS) on the potential energy surface (PES) plays a key role in determining the kinetics and thermodynamics of chemical reactions. Inspired by the fact that the dynamics of complex systems are always driven by rare but significant transition events, we herein propose a TS search method based on the Q-learning algorithm. Appropriate reward functions are set for a given PES to optimize the reaction pathway through continuous trial and error, and the TS can then be obtained from the optimized reaction pathway. The validity of this Q-learning method, with reasonable settings of the Q-value table including actions, states, learning rate, greedy rate, discount rate, and so on, is exemplified with two two-dimensional potential functions. In applications of the Q-learning method to two chemical reactions, it is demonstrated that the method can predict a TS and reaction pathway consistent with those from ab initio calculations. Notably, the PES must be well prepared before using the Q-learning method, and a coarse-to-fine PES scanning scheme is thus introduced to save computational time while maintaining the accuracy of the Q-learning prediction. This work offers a simple and reliable Q-learning method to search for all possible TSs and reaction pathways of a chemical reaction, which may be a new option for effectively exploring the PES in an extensive search manner.
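The tabular Q-learning rule that this kind of method builds on can be sketched as follows. This is the generic textbook update, not the authors' code: the reward design, the state and action encoding on the PES, and the names `q_update` and `epsilon_greedy` are assumptions, and `epsilon` here denotes the probability of taking a random exploratory action, which may differ from the paper's definition of the greedy rate.

```python
import numpy as np

def q_update(q_table, state, action, reward, next_state,
             learning_rate=0.1, discount=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = np.max(q_table[next_state])
    td_target = reward + discount * best_next
    q_table[state, action] += learning_rate * (td_target - q_table[state, action])
    return q_table[state, action]

def epsilon_greedy(q_table, state, epsilon, rng):
    """Pick a random action with probability epsilon, else the best-known one."""
    if rng.random() < epsilon:
        return int(rng.integers(q_table.shape[1]))
    return int(np.argmax(q_table[state]))

# Toy table: 4 states (e.g. grid points on a PES), 2 actions (e.g. moves).
q = np.zeros((4, 2))
first = float(q_update(q, state=0, action=1, reward=1.0, next_state=1,
                       learning_rate=0.5))
q[1, 0] = 2.0  # pretend state 1 has already been found to be valuable
second = float(q_update(q, state=0, action=1, reward=0.0, next_state=1,
                        learning_rate=0.5))
rng = np.random.default_rng(0)
greedy_choice = epsilon_greedy(q, state=1, epsilon=0.0, rng=rng)
```

In a PES setting, states would index scanned grid points and the reward would be derived from the energies there, which is why the abstract stresses that the PES must be prepared before the Q-learning step.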
Affiliation(s)
- Wenjun Xu
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Yanling Zhao
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Jialu Chen
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Zhongyu Wan
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
- Dadong Yan
  - Department of Physics, Beijing Normal University, Beijing, China
- Xinghua Zhang
  - School of Science, Beijing Jiaotong University, Beijing, China
- Ruiqin Zhang
  - Department of Physics, City University of Hong Kong, Hong Kong SAR, China
5
Teng C, Huang D, Donahue E, Bao JL. Exploring torsional conformer space with physical prior mean function-driven meta-Gaussian processes. J Chem Phys 2023; 159:214111. [PMID: 38051097 DOI: 10.1063/5.0176709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 11/12/2023] [Indexed: 12/07/2023]
Abstract
We present a novel approach for systematically exploring the conformational space of small molecules with multiple internal torsions. Identifying unique conformers through a systematic conformational search is important for obtaining accurate thermodynamic functions (e.g., free energy), encompassing contributions from the ensemble of all local minima. Traditional geometry optimizers focus on one structure at a time, lacking transferability from the local potential-energy surface (PES) around a specific minimum to optimize other conformers. In this work, we introduce a physics-driven meta-Gaussian processes (meta-GPs) method that not only enables efficient exploration of target PES for locating local minima but, critically, incorporates physical surrogates that can be applied universally across the optimization of all conformers of the same molecule. Meta-GPs construct surrogate PESs based on the optimization history of prior conformers, dynamically selecting the most suitable prior mean function (representing prior knowledge in Bayesian learning) as a function of the optimization progress. We systematically benchmarked the performance of multiple GP variants for brute-force conformational search of amino acids. Our findings highlight the superior performance of meta-GPs in terms of efficiency, comprehensiveness of conformer discovery, and the distribution of conformers compared to conventional non-surrogate optimizers and other non-meta-GPs. Furthermore, we demonstrate that by concurrently optimizing, training GPs on the fly, and learning PESs, meta-GPs exhibit the capacity to generate high-quality PESs in the torsional space without extensive training data. This represents a promising avenue for physics-based transfer learning via meta-GPs with adaptive priors in exploring torsional conformer space.
Affiliation(s)
- Chong Teng
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Daniel Huang
  - Department of Computer Science, San Francisco State University, San Francisco, California 94132, USA
- Elizabeth Donahue
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
6
Chang YC, Li YP. Integrating Chemical Information into Reinforcement Learning for Enhanced Molecular Geometry Optimization. J Chem Theory Comput 2023. [PMID: 38012608 DOI: 10.1021/acs.jctc.3c00696] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2023]
Abstract
Geometry optimization is a crucial step in computational chemistry, and the efficiency of optimization algorithms plays a pivotal role in reducing computational costs. In this study, we introduce a novel reinforcement-learning-based optimizer that surpasses traditional methods in terms of efficiency. What sets our model apart is its ability to incorporate chemical information into the optimization process. By exploring different state representations that integrate gradients, displacements, primitive type labels, and additional chemical information from the SchNet model, our reinforcement learning optimizer achieves exceptional results. It demonstrates an average reduction of about 50% or more in optimization steps compared to the conventional optimization algorithms that we examined when dealing with challenging initial geometries. Moreover, the reinforcement learning optimizer exhibits promising transferability across various levels of theory, emphasizing its versatility and potential for enhancing molecular geometry optimization. This research highlights the significance of leveraging reinforcement learning algorithms to harness chemical knowledge, paving the way for future advancements in computational chemistry.
Affiliation(s)
- Yu-Cheng Chang
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sect. 4, Roosevelt Road, Taipei 10617, Taiwan
- Yi-Pei Li
  - Department of Chemical Engineering, National Taiwan University, No. 1, Sect. 4, Roosevelt Road, Taipei 10617, Taiwan
  - Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), Academia Sinica, No. 128, Sec. 2, Academia Road, Taipei 11529, Taiwan
7
Shajan A, Manathunga M, Götz AW, Merz KM. Geometry Optimization: A Comparison of Different Open-Source Geometry Optimizers. J Chem Theory Comput 2023; 19:7533-7541. [PMID: 37870541 DOI: 10.1021/acs.jctc.3c00188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2023]
Abstract
Based on a series of energy minimizations with starting structures obtained from the Baker test set of 30 organic molecules, a comparison is made between various open-source geometry optimization codes that are interfaced with the open-source QUantum Interaction Computational Kernel (QUICK) program for gradient and energy calculations. The findings demonstrate how the choice of the coordinate system influences the optimization process to reach an equilibrium structure. With fewer steps, internal coordinates outperform Cartesian coordinates, while the choice of the initial Hessian and Hessian update method in quasi-Newton approaches made by different optimization algorithms also contributes to the rate of convergence. Furthermore, an available open-source machine learning method based on Gaussian process regression (GPR) was evaluated for energy minimizations over surrogate potential energy surfaces with both Cartesian and internal coordinates with internal coordinates outperforming Cartesian. Overall, geomeTRIC and DL-FIND with their default optimization method as well as with the GPR-based model using Hartree-Fock theory with the 6-31G** basis set needed a comparable number of geometry optimization steps to the approach of Baker using a unit matrix as the initial Hessian to reach the optimized geometry. On the other hand, the Berny and Sella offerings in ASE outperformed the other algorithms. Based on this, we recommend using the file-based approaches, ASE/Berny and ASE/Sella, for large-scale optimization efforts, while if using a single executable is preferable, we now distribute QUICK integrated with DL-FIND.
Affiliation(s)
- Akhil Shajan
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Madushanka Manathunga
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Andreas W Götz
  - San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093-0505, United States
- Kenneth M Merz
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
8
Snyder R, Kim B, Pan X, Shao Y, Pu J. Bridging semiempirical and ab initio QM/MM potentials by Gaussian process regression and its sparse variants for free energy simulation. J Chem Phys 2023; 159:054107. [PMID: 37530109 PMCID: PMC10400118 DOI: 10.1063/5.0156327] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 04/28/2023] [Accepted: 07/10/2023] [Indexed: 08/03/2023]
Abstract
Free energy simulations that employ combined quantum mechanical and molecular mechanical (QM/MM) potentials at ab initio QM (AI) levels are computationally highly demanding. Here, we present a machine-learning-facilitated approach for obtaining AI/MM-quality free energy profiles at the cost of efficient semiempirical QM/MM (SE/MM) methods. Specifically, we use Gaussian process regression (GPR) to learn the potential energy corrections needed for an SE/MM level to match an AI/MM target along the minimum free energy path (MFEP). Force modification using gradients of the GPR potential allows us to improve configurational sampling and update the MFEP. To adaptively train our model, we further employ the sparse variational GP (SVGP) and streaming sparse GPR (SSGPR) methods, which efficiently incorporate previous sample information without significantly increasing the training data size. We applied the QM-(SS)GPR/MM method to the solution-phase SN2 Menshutkin reaction, NH₃ + CH₃Cl → CH₃NH₃⁺ + Cl⁻, using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively. For 4000 configurations sampled along the MFEP, the iteratively optimized AM1-SSGPR-4/MM model reduces the energy error in AM1/MM from 18.2 to 4.4 kcal/mol. Although not explicitly fitting forces, our method also reduces the key internal force errors from 25.5 to 11.1 kcal/mol/Å and from 30.2 to 10.3 kcal/mol/Å for the N-C and C-Cl bonds, respectively. Compared to the uncorrected simulations, the AM1-SSGPR-4/MM method lowers the predicted free energy barrier from 28.7 to 11.7 kcal/mol and decreases the reaction free energy from -12.4 to -41.9 kcal/mol, bringing these results into closer agreement with their AI/MM and experimental benchmarks.
Affiliation(s)
- Ryan Snyder
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N Blackford St., Indianapolis, Indiana 46202, USA
- Bryant Kim
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N Blackford St., Indianapolis, Indiana 46202, USA
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, Oklahoma 73019, USA
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, Oklahoma 73019, USA
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N Blackford St., Indianapolis, Indiana 46202, USA
9
Fdez Galván I, Lindh R. Smooth Things Come in Threes: A Diabatic Surrogate Model for Conical Intersection Optimization. J Chem Theory Comput 2023. [PMID: 37192531 DOI: 10.1021/acs.jctc.3c00389] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2023]
Abstract
The optimization of conical intersection structures is complicated by the nondifferentiability of the adiabatic potential energy surfaces. In this work, we build a pseudodiabatic surrogate model, based on Gaussian process regression, formed by three smooth and differentiable surfaces that can adequately reproduce the adiabatic surfaces. Using this model with the restricted variance optimization method results in a notable decrease of the overall computational effort required to obtain minimum energy crossing points.
Affiliation(s)
- Ignacio Fdez Galván
  - Department of Chemistry-BMC, Uppsala University, P.O. Box 576, SE-75123 Uppsala, Sweden
- Roland Lindh
  - Department of Chemistry-BMC, Uppsala University, P.O. Box 576, SE-75123 Uppsala, Sweden
  - Uppsala Center for Computational Chemistry (UC3), Uppsala University, P.O. Box 576, SE-75123 Uppsala, Sweden
10
Weser O, Hein-Janke B, Mata RA. Automated handling of complex chemical structures in Z-matrix coordinates-The chemcoord library. J Comput Chem 2023; 44:710-726. [PMID: 36541725 DOI: 10.1002/jcc.27029] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/11/2022] [Revised: 10/05/2022] [Accepted: 10/09/2022] [Indexed: 12/24/2022]
Abstract
In this work, we present a fully automated method for the construction of chemically meaningful sets of hierarchical nonredundant internal coordinates (ICs; also commonly denoted as Z-matrices) from the Cartesian coordinates of a molecular system. Particular focus is placed on avoiding ill-definitions of angles and dihedrals due to linear arrangements of atoms, to consistently guarantee a well-defined transformation to Cartesian coordinates, even after structural changes. The representations thus obtained are particularly well suited for pathway construction in double-ended methods for transition state search and optimizations with nonlinear constraints. Analytical gradients for the transformation between the coordinate systems were derived for analytical geometry optimizations purely in Z-matrix coordinates. The geometry optimization was coupled with a Symbolic Algebra package to support arbitrary nonlinear constraints in Z-matrix coordinates, while retaining analytical energy gradient conversion. The difference to the commonly used nonhierarchical IC transformations is discussed. Sample applications are provided for a number of common chemical reactions and illustrative examples.
Affiliation(s)
- Oskar Weser
  - Electronic Structure Theory Department, Max-Planck-Institute for Solid State Research, Stuttgart, Germany
  - Institute of Physical Chemistry, University of Goettingen, Goettingen, Germany
- Björn Hein-Janke
  - Institute of Physical Chemistry, University of Goettingen, Goettingen, Germany
- Ricardo A Mata
  - Institute of Physical Chemistry, University of Goettingen, Goettingen, Germany
11
Deffner M, Weise MP, Zhang H, Mücke M, Proppe J, Franco I, Herrmann C. Learning Conductance: Gaussian Process Regression for Molecular Electronics. J Chem Theory Comput 2023; 19:992-1002. [PMID: 36692968 DOI: 10.1021/acs.jctc.2c00648] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/25/2023]
Abstract
Experimental studies of charge transport through single molecules often rely on break junction setups, where molecular junctions are repeatedly formed and broken while measuring the conductance, leading to a statistical distribution of conductance values. Modeling this experimental situation and the resulting conductance histograms is challenging for theoretical methods, as computations need to capture structural changes in experiments, including the statistics of junction formation and rupture. This type of extensive structural sampling implies that even when evaluating conductance from computationally efficient electronic structure methods, which typically are of reduced accuracy, the evaluation of conductance histograms is too expensive to be a routine task. Highly accurate quantum transport computations are only computationally feasible for a few selected conformations and thus necessarily ignore the rich conformational space probed in experiments. To overcome these limitations, we investigate the potential of machine learning for modeling conductance histograms, in particular by Gaussian process regression. We show that by selecting specific structural parameters as features, Gaussian process regression can be used to efficiently predict the zero-bias conductance from molecular structures, reducing the computational cost of simulating conductance histograms by an order of magnitude. This enables the efficient calculation of conductance histograms even on the basis of computationally expensive first-principles approaches by effectively reducing the number of necessary charge transport calculations, paving the way toward their routine evaluation.
Affiliation(s)
- Michael Deffner
  - Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
  - The Hamburg Centre for Ultrafast Imaging, Hamburg 22761, Germany
- Marc Philipp Weise
  - Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
- Haitao Zhang
  - Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
- Maike Mücke
  - Institute of Physical Chemistry, Georg-August University, Göttingen 37077, Germany
- Jonny Proppe
  - Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
- Ignacio Franco
  - Departments of Chemistry and Physics, University of Rochester, Rochester, New York 14627-0216, United States
- Carmen Herrmann
  - Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
  - The Hamburg Centre for Ultrafast Imaging, Hamburg 22761, Germany
12
Teng C, Huang D, Bao JL. A spur to molecular geometry optimization: Gradient-enhanced universal kriging with on-the-fly adaptive ab initio prior mean functions in curvilinear coordinates. J Chem Phys 2023; 158:024112. [PMID: 36641392 DOI: 10.1063/5.0133675] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 12/27/2022]
Abstract
We present a molecular geometry optimization algorithm based on the gradient-enhanced universal kriging (GEUK) formalism with ab initio prior mean functions, which incorporates prior physical knowledge to surrogate-based optimization. In this formalism, we have demonstrated the advantage of allowing the prior mean functions to be adaptive during geometry optimization over a pre-fixed choice of prior functions. Our implementation is general and flexible in two senses. First, the optimizations on the surrogate surface can be in both Cartesian coordinates and curvilinear coordinates. We explore four representative curvilinear coordinates in this work, including the redundant Coulombic coordinates, the redundant internal coordinates, the non-redundant delocalized internal coordinates, and the non-redundant hybrid delocalized internal Z-matrix coordinates. We show that our GEUK optimizer accelerates geometry optimization as compared to conventional non-surrogate-based optimizers in internal coordinates. We further showcase the power of the GEUK with on-the-fly adaptive priors for efficient optimizations of challenging molecules (Criegee intermediates) with a high-accuracy electronic structure method (the coupled-cluster method). Second, we present the usage of internal coordinates under the complete curvilinear scheme. A complete curvilinear scheme performs both surrogate potential-energy surface (PES) fitting and structure optimization entirely in the curvilinear coordinates. Our benchmark indicates that the complete curvilinear scheme significantly reduces the cost of structure minimization on the surrogate compared to the incomplete curvilinear scheme, which fits the surrogate PES in curvilinear coordinates partially and optimizes a structure in Cartesian coordinates through curvilinear coordinates via the chain rule.
Affiliation(s)
- Chong Teng
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Daniel Huang
  - Department of Computer Science, San Francisco State University, San Francisco, California 94132, USA
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
13
|
Heinen S, von Rudorff GF, von Lilienfeld OA. Transition state search and geometry relaxation throughout chemical compound space with quantum machine learning. J Chem Phys 2022; 157:221102. [PMID: 36546806 DOI: 10.1063/5.0112856] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
Abstract
We use energies and forces predicted within response operator based quantum machine learning (OQML) to perform geometry optimization and transition state search calculations with legacy optimizers but without the need for subsequent re-optimization with quantum chemistry methods. For randomly sampled initial coordinates of small organic query molecules, we report systematic improvement of equilibrium and transition state geometry output as training set sizes increase. Out-of-sample SN2 reactant complexes and transition state geometries have been predicted using the LBFGS and the QST2 algorithms with a root-mean-square deviation (RMSD) of 0.16 and 0.4 Å, respectively, after training on up to 200 reactant complex relaxations and transition state search trajectories from the QMrxn20 dataset. For geometry optimizations, we have also considered relaxation paths up to 5595 constitutional isomers with sum formula C7H10O2 from the QM9 database. Using the resulting OQML models with an LBFGS optimizer reproduces the minimum geometry with an RMSD of 0.14 Å, only using ∼6000 training points obtained from normal mode sampling along the optimization paths of the training compounds without the need for active learning. For converged equilibrium and transition state geometries, subsequent vibrational normal mode frequency analysis indicates deviation from MP2 reference results by on average 14 and 26 cm⁻¹, respectively. While the numerical cost for OQML predictions is negligible in comparison to density functional theory or MP2, the number of steps until convergence is typically larger in either case. The success rate for reaching convergence, however, improves systematically with training set size, underscoring OQML's potential for universal applicability.
Affiliation(s)
- Stefan Heinen
  - University of Vienna, Faculty of Physics, Kolingasse 14-16, AT-1090 Wien, Austria

14
Jiang T, Fang W, Alavi A, Chen J. General Analytical Nuclear Forces and Molecular Potential Energy Surface from Full Configuration Interaction Quantum Monte Carlo. J Chem Theory Comput 2022; 18:7233-7242. [PMID: 36326847 DOI: 10.1021/acs.jctc.2c00440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/05/2022]
Abstract
The full configuration interaction quantum Monte Carlo (FCIQMC) method is a state-of-the-art stochastic electronic structure method, providing a methodology to compute FCI-level state energies of molecular systems within a quantum chemical basis. However, especially to probe dynamics at the FCIQMC level, it is necessary to devise more efficient schemes to produce nuclear forces and potential energy surfaces (PES) from FCIQMC. In this work, we derive the general formula for nuclear forces from FCIQMC, and clarify the different contributions to the total force. This method to obtain FCIQMC forces eliminates previous restrictions and can be used with the frozen core approximation and free selection of orbitals, making it promising for more efficient nuclear force calculations. After some numerical checks of this procedure on the binding curve of the N2 molecule, we use the FCIQMC energy and force to obtain the full-dimensional ground state PES of the water molecule via Gaussian process regression. The new water FCIQMC PES can be used as the basis for H2O ground state nuclear dynamics, structure optimization, and rotation-vibrational spectrum calculation.
Affiliation(s)
- Tonghuan Jiang
  - School of Physics, Peking University, Beijing 100871, P. R. China
- Wei Fang
  - State Key Laboratory of Molecular Reaction Dynamics and Center for Theoretical Computational Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, P. R. China
  - Department of Chemistry, Fudan University, Shanghai 200438, P. R. China
- Ali Alavi
  - Max Planck Institute for Solid State Research, Heisenbergstrasse 1, 70569 Stuttgart, Germany
  - University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Ji Chen
  - School of Physics, Peking University, Beijing 100871, P. R. China
  - Collaborative Innovation Center of Quantum Matter, Beijing 100871, P. R. China
  - Interdisciplinary Institute of Light-Element Quantum Materials and Research Center for Light-Element Advanced Materials, Peking University, Beijing 100871, P. R. China
  - Frontiers Science Center for Nano-Optoelectronics, Peking University, Beijing 100871, P. R. China

15
Snyder R, Kim B, Pan X, Shao Y, Pu J. Facilitating ab initio QM/MM free energy simulations by Gaussian process regression with derivative observations. Phys Chem Chem Phys 2022; 24:25134-25143. [PMID: 36222412 PMCID: PMC11095978 DOI: 10.1039/d2cp02820d] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/16/2023]
Abstract
In combined quantum mechanical and molecular mechanical (QM/MM) free energy simulations, how to synthesize the accuracy of ab initio (AI) methods with the speed of semiempirical (SE) methods for a cost-effective QM treatment remains a long-standing challenge. In this work, we present a machine-learning-facilitated method for obtaining AI/MM-quality free energy profiles through efficient SE/MM simulations. In particular, we use Gaussian process regression (GPR) to learn the energy and force corrections needed for SE/MM to match with AI/MM results during molecular dynamics simulations. Force matching is enabled in our model by including energy derivatives into the observational targets through the extended-kernel formalism. We demonstrate the effectiveness of this method on the solution-phase SN2 Menshutkin reaction using AM1/MM and B3LYP/6-31+G(d,p)/MM as the base and target levels, respectively. Trained on only 80 configurations sampled along the minimum free energy path (MFEP), the resulting GPR model reduces the average energy error in AM1/MM from 18.2 to 5.8 kcal mol⁻¹ for the 4000-sample testing set with the average force error on the QM atoms decreased from 14.6 to 3.7 kcal mol⁻¹ Å⁻¹. Free energy sampling with the GPR corrections applied (AM1-GPR/MM) produces a free energy barrier of 14.4 kcal mol⁻¹ and a reaction free energy of -34.1 kcal mol⁻¹, in closer agreement with the AI/MM benchmarks and experimental results.
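As an illustrative aside (not taken from the cited paper), the idea of including energy derivatives among the observational targets can be sketched in one dimension. The squared-exponential kernel, the function names `extended_kernel` and `fit_predict`, the length scale, and the jitter value are all assumptions chosen for this toy example; a real QM/MM application would work with many-atom descriptors and tuned hyperparameters.

```python
import numpy as np

def extended_kernel(xa, xb, length):
    """Joint covariance of [f(xa), f'(xa)] with [f(xb), f'(xb)] for an RBF kernel."""
    d = xa[:, None] - xb[None, :]
    k = np.exp(-0.5 * d**2 / length**2)              # cov(f(xa), f(xb))
    k_fd = k * d / length**2                         # cov(f(xa), f'(xb))
    k_df = -k * d / length**2                        # cov(f'(xa), f(xb))
    k_dd = k * (1.0 / length**2 - d**2 / length**4)  # cov(f'(xa), f'(xb))
    return np.block([[k, k_fd], [k_df, k_dd]])

def fit_predict(x_train, e_train, g_train, x_query, length=1.0, jitter=1e-10):
    """Zero-mean GP posterior mean for values and derivatives at x_query."""
    n = len(x_train)
    K = extended_kernel(x_train, x_train, length) + jitter * np.eye(2 * n)
    alpha = np.linalg.solve(K, np.concatenate([e_train, g_train]))
    pred = extended_kernel(x_query, x_train, length) @ alpha
    m = len(x_query)
    return pred[:m], pred[m:]  # predicted values, predicted derivatives

# Toy "correction" to learn: value sin(x) with derivative cos(x).
x_train = np.linspace(0.0, 3.0, 4)
e_pred, g_pred = fit_predict(x_train, np.sin(x_train), np.cos(x_train), x_train)
```

Because both values and derivatives enter the training targets, the posterior mean reproduces both at the training points, which is the sense in which such a model can match forces as well as energies.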
Affiliation(s)
- Ryan Snyder
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA.
- Bryant Kim
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA.
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, OK 73019, USA.
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, 101 Stephenson Pkwy, Norman, OK 73019, USA.
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, 402 N. Blackford St., Indianapolis, IN 46202, USA.

16
Westermayr J, Chaudhuri S, Jeindl A, Hofmann OT, Maurer RJ. Long-range dispersion-inclusive machine learning potentials for structure search and optimization of hybrid organic-inorganic interfaces. Digital Discovery 2022; 1:463-475. [PMID: 36091414 PMCID: PMC9358753 DOI: 10.1039/d2dd00016d] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/25/2022] [Accepted: 06/03/2022] [Indexed: 12/16/2022]
Abstract
The computational prediction of the structure and stability of hybrid organic-inorganic interfaces provides important insights into the measurable properties of electronic thin film devices, coatings, and catalyst surfaces and plays an important role in their rational design. However, the rich diversity of molecular configurations and the important role of long-range interactions in such systems make it difficult to use machine learning (ML) potentials to facilitate structure exploration that otherwise requires computationally expensive electronic structure calculations. We present an ML approach that enables fast, yet accurate, structure optimizations by combining two different types of deep neural networks trained on high-level electronic structure data. The first model is a short-ranged interatomic ML potential trained on local energies and forces, while the second is an ML model of effective atomic volumes derived from atoms-in-molecules partitioning. The latter can be used to connect short-range potentials to well-established density-dependent long-range dispersion correction methods. For two systems, specifically gold nanoclusters on diamond (110) surfaces and organic π-conjugated molecules on silver (111) surfaces, we train models on sparse structure relaxation data from density functional theory and show the ability of the models to deliver highly efficient structure optimizations and semi-quantitative energy predictions of adsorption structures.
Affiliation(s)
- Julia Westermayr
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, UK
- Shayantan Chaudhuri
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, UK
  - Centre for Doctoral Training in Diamond Science and Technology, University of Warwick, Coventry CV4 7AL, UK
- Andreas Jeindl
  - Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
- Oliver T Hofmann
  - Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria

17
Teng C, Wang Y, Huang D, Martin K, Tristan JB, Bao JL. Dual-Level Training of Gaussian Processes with Physically Inspired Priors for Geometry Optimizations. J Chem Theory Comput 2022; 18:5739-5754. [PMID: 35939760 DOI: 10.1021/acs.jctc.2c00546] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/28/2022]
Abstract
Gaussian process (GP) regression has been recently developed as an effective method in molecular geometry optimization. The prior mean function is one of the crucial parts of the GP. We design and validate two types of physically inspired prior mean functions: force-field-based priors and posterior-type priors. In this work, we implement a dual-level training (DLT) optimizer for the posterior-type priors. The DLT optimizers can be considered as a class of optimization algorithms that belong to the delta-machine learning paradigm but with several major differences compared to the previously proposed algorithms in the same paradigm. In the first level of the DLT, we incorporate the classical mechanical descriptions of the equilibrium geometries into the prior function, which enhances the performance of the GP optimizer as compared to the one using a constant (or zero) prior. In the second level, we utilize the surrogate potential energy surfaces (PESs), which incorporate the physics learned in the first-level training, as the prior function to refine the model performance further. We find that the force-field-based priors and posterior-type priors reduce the overall optimization steps by a factor of 2-3 when compared to the limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimizer as well as the constant-prior GP optimizer proposed in previous works. We also demonstrate the potential of recovering the real PESs with GP with a force-field prior. This work shows the importance of including domain knowledge as an ingredient in the GP, which offers a potentially robust learning model for molecular geometry optimization and for exploring molecular PESs.
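As an illustrative aside (not the authors' implementation), the role of a prior mean function in GP regression can be sketched on a one-dimensional toy potential. The GP is fitted to the residual between the data and the prior, so the prediction reverts to the prior, rather than to zero, away from the training points. The Morse-like `true_energy`, the `harmonic_prior`, the kernel, and all parameter values are assumptions made for this sketch only.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1D coordinate arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior_mean(x_train, y_train, x_query, prior_mean, length=1.0, noise=1e-8):
    """Posterior mean m(x*) + k(x*, X) K^-1 (y - m(X)) for a given prior mean m."""
    K = rbf(x_train, x_train, length) + noise * np.eye(len(x_train))
    residual = y_train - prior_mean(x_train)   # what the prior fails to explain
    weights = np.linalg.solve(K, residual)
    return prior_mean(x_query) + rbf(x_query, x_train, length) @ weights

def true_energy(r):
    """Toy Morse-like potential with its minimum at r = 1 (hypothetical target)."""
    return (1.0 - np.exp(-(r - 1.0))) ** 2

def harmonic_prior(r):
    """Cheap physical guess: harmonic well around r = 1 (hypothetical prior)."""
    return (r - 1.0) ** 2

def zero_prior(r):
    """Uninformative prior mean."""
    return np.zeros_like(r)

x_train = np.array([0.8, 1.0, 1.2])
y_train = true_energy(x_train)
x_far = np.array([5.0])  # far outside the sampled region
far_physical = gp_posterior_mean(x_train, y_train, x_far, harmonic_prior, length=0.3)
far_zero = gp_posterior_mean(x_train, y_train, x_far, zero_prior, length=0.3)
```

Both models interpolate the three training energies, but at `x_far` the first falls back to the harmonic estimate while the second falls back to zero, which is one way to see why an informative prior can matter in sparsely sampled regions.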
Affiliation(s)
- Chong Teng
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Yang Wang
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Daniel Huang
  - Department of Computer Science, San Francisco State University, San Francisco, California 94132, United States
- Katherine Martin
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Jean-Baptiste Tristan
  - Department of Computer Science, Boston College, Chestnut Hill, Massachusetts 02467, United States
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, United States

18
Guo X, Fang L, Xu Y, Duan W, Rinke P, Todorović M, Chen X. Molecular Conformer Search with Low-Energy Latent Space. J Chem Theory Comput 2022; 18:4574-4585. [PMID: 35696366 PMCID: PMC9281398 DOI: 10.1021/acs.jctc.2c00290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Identifying low-energy conformers with quantum mechanical accuracy for molecules with many degrees of freedom is challenging. In this work, we use the molecular dihedral angles as features and explore the possibility of performing molecular conformer search in a latent space with a generative model named variational auto-encoder (VAE). We bias the VAE towards low-energy molecular configurations to generate more informative data. In this way, we can effectively build a reliable energy model for the low-energy potential energy surface. After the energy model has been built, we extract local-minimum conformations and refine them with structure optimization. We have tested and benchmarked our low-energy latent-space (LOLS) structure search method on organic molecules with 5-9 searching dimensions. Our results agree with previous studies.
Affiliation(s)
- Xiaomi Guo
  - State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
  - Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Lincan Fang
  - Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Yong Xu
  - State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
  - Frontier Science Center for Quantum Information, Beijing 100084, China
  - RIKEN Center for Emergent Matter Science (CEMS), Wako, Saitama 351-0198, Japan
- Wenhui Duan
  - State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China
  - Frontier Science Center for Quantum Information, Beijing 100084, China
  - Institute for Advanced Study, Tsinghua University, Beijing 100084, China
- Patrick Rinke
  - Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Milica Todorović
  - Department of Mechanical and Materials Engineering, University of Turku, FI-20014 Turku, Finland
- Xi Chen
  - Department of Applied Physics, Aalto University, Espoo 00076, Finland

19
Abstract
Recent work has demonstrated the promise of using machine-learned surrogates, in particular, Gaussian process (GP) surrogates, in reducing the number of electronic structure calculations (ESCs) needed to perform surrogate-model-based (SMB) geometry optimization. In this paper, we study geometry meta-optimization with GP surrogates where an SMB optimizer additionally learns from its past "experience" performing geometry optimization. To validate this idea, we start with the simplest setting where a geometry meta-optimizer learns from previous optimizations of the same molecule with different initial-guess geometries. We give empirical evidence that geometry meta-optimization with GP surrogates is effective and requires less tuning compared to SMB optimization with GP surrogates on the ANI-1 dataset of off-equilibrium initial structures of small organic molecules. Unlike SMB optimization where a surrogate should be immediately useful for optimizing a given geometry, a surrogate in geometry meta-optimization has more flexibility because it can distribute its ESC savings across a set of geometries. Indeed, we find that GP surrogates that preserve rotational invariance provide increased marginal ESC savings across geometries. As a more stringent test, we also apply geometry meta-optimization to conformational search on a hand-constructed dataset of hydrocarbons and alcohols. We observe that while SMB optimization and geometry meta-optimization do save on ESCs, they also tend to miss higher energy conformers compared to standard geometry optimization. We believe that further research into characterizing the divergence between GP surrogates and potential energy surfaces is critical not only for advancing geometry meta-optimization but also for exploring the potential of machine-learned surrogates in geometry optimization in general.
Affiliation(s)
- Daniel Huang
  - Department of Computer Science, San Francisco State University, San Francisco, California 94132, USA
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Jean-Baptiste Tristan
  - Department of Computer Science, Boston College, Chestnut Hill, Massachusetts 02467, USA

20
Chen M, Xu Z, Zhao J, Zhu Y, Shao Z. Nonparametric identification of batch process using two-dimensional kernel-based Gaussian process regression. Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2021.117372] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]

21
Allotey J, Butler KT, Thiyagalingam J. Entropy-based active learning of graph neural network surrogate models for materials properties. J Chem Phys 2021; 155:174116. [PMID: 34742215 DOI: 10.1063/5.0065694] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/14/2022]
Abstract
Graph neural networks trained on experimental or calculated data are becoming an increasingly important tool in computational materials science. Networks once trained are able to make highly accurate predictions at a fraction of the cost of experiments or first-principles calculations of comparable accuracy. However, these networks typically rely on large databases of labeled experiments to train the model. In scenarios where data are scarce or expensive to obtain, this can be prohibitive. By building a neural network that provides confidence on the predicted properties, we are able to develop an active learning scheme that can reduce the amount of labeled data required by identifying the areas of chemical space where the model is most uncertain. We present a scheme for coupling a graph neural network with a Gaussian process to featurize solid-state materials and predict properties including a measure of confidence in the prediction. We then demonstrate that this scheme can be used in an active learning context to speed up the training of the model by selecting the optimal next experiment for obtaining a data label. Our active learning scheme can double the rate at which the performance of the model on a test dataset improves with additional data compared to choosing the next sample at random. This type of uncertainty quantification and active learning has the potential to open up new areas of materials science, where data are scarce and expensive to obtain, to the transformative power of graph neural networks.
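As an illustrative aside (not the scheme of the cited paper, which couples a graph neural network with a Gaussian process), uncertainty-driven active learning can be sketched with a plain GP on one-dimensional features: the next sample to label is the pool candidate with the largest predictive variance. For a Gaussian predictive distribution the differential entropy increases monotonically with the variance, so this choice also maximizes predictive entropy. The names `predictive_variance` and `select_next`, the kernel, and the noise level are assumptions of this sketch.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel between two 1D feature arrays."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def predictive_variance(x_labeled, x_pool, length=1.0, noise=1e-6):
    """GP predictive variance at each pool point given the labeled inputs."""
    K = rbf(x_labeled, x_labeled, length) + noise * np.eye(len(x_labeled))
    K_star = rbf(x_pool, x_labeled, length)
    solved = np.linalg.solve(K, K_star.T)           # shape (n_labeled, n_pool)
    return 1.0 - np.sum(K_star * solved.T, axis=1)  # prior variance is 1 for this kernel

def select_next(x_labeled, x_pool, length=1.0):
    """Index of the pool candidate the model is least certain about."""
    return int(np.argmax(predictive_variance(x_labeled, x_pool, length)))

x_labeled = np.array([0.0, 1.0])
x_pool = np.array([0.1, 0.5, 3.0])
next_index = select_next(x_labeled, x_pool)  # picks the candidate farthest from the data
```

Note that the GP variance depends only on where labels exist, not on their values, so the selection can be made before any new measurement or calculation is run.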
Affiliation(s)
- Johannes Allotey
  - School of Physics, University of Bristol, Bristol BS8 1TL, United Kingdom
- Keith T Butler
  - Scientific Machine Learning Research Group, Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council, Didcot OX11 0DQ, United Kingdom
- Jeyan Thiyagalingam
  - Scientific Machine Learning Research Group, Scientific Computing Department, Rutherford Appleton Laboratory, Science and Technology Facilities Council, Didcot OX11 0DQ, United Kingdom

22
Westermayr J, Marquetand P. Machine Learning for Electronically Excited States of Molecules. Chem Rev 2021; 121:9873-9926. [PMID: 33211478 PMCID: PMC8391943 DOI: 10.1021/acs.chemrev.0c00749] [Citation(s) in RCA: 167] [Impact Index Per Article: 55.7] [Received: 07/17/2020] [Indexed: 12/11/2022]
Abstract
Electronically excited states of molecules are at the heart of photochemistry, photophysics, as well as photobiology and also play a role in material science. Their theoretical description requires highly accurate quantum chemical calculations, which are computationally expensive. In this review, we focus on not only how machine learning is employed to speed up such excited-state simulations but also how this branch of artificial intelligence can be used to advance this exciting research field in all its aspects. Discussed applications of machine learning for excited states include excited-state dynamics simulations, static calculations of absorption spectra, as well as many others. In order to put these studies into context, we discuss the promises and pitfalls of the involved machine learning techniques. Since the latter are mostly based on quantum chemistry calculations, we also provide a short introduction into excited-state electronic structure methods and approaches for nonadiabatic dynamics simulations and describe tricks and problems when using them in machine learning for excited states of molecules.
Affiliation(s)
- Julia Westermayr
  - Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
- Philipp Marquetand
  - Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
  - Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Strasse 17, 1090 Vienna, Austria
  - Data Science @ Uni Vienna, University of Vienna, Währinger Strasse 29, 1090 Vienna, Austria

24
Born D, Kästner J. Geometry Optimization in Internal Coordinates Based on Gaussian Process Regression: Comparison of Two Approaches. J Chem Theory Comput 2021; 17:5955-5967. [PMID: 34378918 DOI: 10.1021/acs.jctc.1c00517] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Geometry optimization based on Gaussian process regression (GPR) was extended to internal coordinates. We used delocalized internal coordinates composed of distances and several types of angles and compared two methods of including them. In both cases, the GPR surrogate surface is trained on geometries in internal coordinates. In one case, it predicts the gradient in Cartesian coordinates and in the other, in internal coordinates. We tested both methods on a set of 30 small molecules and one larger Rh complex taken from the study of a catalytic mechanism. The former method is slightly more efficient, while the latter method is somewhat more robust. Both methods reduce the number of required optimization steps compared to GPR in Cartesian coordinates or the standard L-BFGS optimizer. We found it advantageous to use automatically adjusted hyperparameters to optimize them.
Affiliation(s)
- Daniel Born
  - Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
- Johannes Kästner
  - Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany

25
Westermayr J, Gastegger M, Schütt KT, Maurer RJ. Perspective on integrating machine learning into computational chemistry and materials science. J Chem Phys 2021; 154:230903. [PMID: 34241249 DOI: 10.1063/5.0047760] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Indexed: 01/12/2023]
Abstract
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties, be they observable, such as molecular polarizabilities, or not, such as atomic charges. As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.
Affiliation(s)
- Julia Westermayr
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Kristof T Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Reinhard J Maurer
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom

26
Xu J, Cao XM, Hu P. Perspective on computational reaction prediction using machine learning methods in heterogeneous catalysis. Phys Chem Chem Phys 2021; 23:11155-11179. [PMID: 33972971 DOI: 10.1039/d1cp01349a] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/17/2022]
Abstract
Heterogeneous catalysis plays a significant role in the modern chemical industry. Towards the rational design of novel catalysts, understanding reactions over surfaces is the most essential aspect. Typical industrial catalytic processes such as syngas conversion and methane utilisation can generate a large reaction network comprising thousands of intermediates and reaction pairs. This complexity not only arises from the permutation of transformations between species but also from the extra reaction channels offered by distinct surface sites. Despite the success in investigating surface reactions at the atomic scale, the huge computational expense of ab initio methods hinders the exploration of such complicated reaction networks. With the proliferation of catalysis studies, machine learning as an emerging tool can take advantage of the accumulated reaction data to emulate the output of ab initio methods towards swift reaction prediction. Here, we briefly summarise the conventional workflow of reaction prediction, including reaction network generation, ab initio thermodynamics and microkinetic modelling. An overview of the frequently used regression models in machine learning is presented. As a promising alternative to full ab initio calculations, machine learning interatomic potentials are highlighted. Furthermore, we survey applications assisted by these methods for accelerating reaction prediction, exploring reaction networks, and computational catalyst design. Finally, we envisage future directions in computationally investigating reactions and implementing machine learning algorithms in heterogeneous catalysis.
Collapse
Affiliation(s)
- Jiayan Xu
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China. and School of Chemistry and Chemical Engineering, Queen's University Belfast, Belfast BT9 5AG, UK
| | - Xiao-Ming Cao
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China.
| | - P Hu
- Key Laboratory for Advanced Materials and Joint International Research Laboratory of Precision Chemistry and Molecular Engineering, Feringa Nobel Prize Scientist Joint Research Center, Frontiers Science Center for Materiobiology and Dynamic Chemistry, Centre for Computational Chemistry and Research Institute of Industrial Catalysis, School of Chemistry and Molecular Engineering, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, P. R. China. and School of Chemistry and Chemical Engineering, Queen's University Belfast, Belfast BT9 5AG, UK
27
Fang L, Makkonen E, Todorović M, Rinke P, Chen X. Efficient Amino Acid Conformer Search with Bayesian Optimization. J Chem Theory Comput 2021; 17:1955-1966. [PMID: 33577313 PMCID: PMC8023666 DOI: 10.1021/acs.jctc.0c00648] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 01/10/2023]
Abstract
Finding low-energy molecular conformers is challenging due to the high dimensionality of the search space and the computational cost of accurate quantum chemical methods for determining conformer structures and energies. Here, we combine active-learning Bayesian optimization (BO) algorithms with quantum chemistry methods to address this challenge. Using cysteine as an example, we show that our procedure is both efficient and accurate. After only 1000 single-point calculations and approximately 80 structure relaxations, which is less than 10% computational cost of the current fastest method, we have found the low-energy conformers in good agreement with experimental measurements and reference calculations. To test the transferability of our method, we also repeated the conformer search of serine, tryptophan, and aspartic acid. The results agree well with previous conformer search studies.
Affiliation(s)
- Lincan Fang
- Department of Applied Physics, Aalto University, AALTO 00076, Finland
- Esko Makkonen
- Department of Applied Physics, Aalto University, AALTO 00076, Finland
- Milica Todorović
- Department of Applied Physics, Aalto University, AALTO 00076, Finland
- Patrick Rinke
- Department of Applied Physics, Aalto University, AALTO 00076, Finland
- Xi Chen
- Department of Applied Physics, Aalto University, AALTO 00076, Finland
28
Fdez Galván I, Raggi G, Lindh R. Restricted-Variance Constrained, Reaction Path, and Transition State Molecular Optimizations Using Gradient-Enhanced Kriging. J Chem Theory Comput 2020; 17:571-582. [PMID: 33382621 PMCID: PMC7871327 DOI: 10.1021/acs.jctc.0c01163] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 11/30/2022]
Abstract
Gaussian process regression has recently been explored as an alternative to standard surrogate models in molecular equilibrium geometry optimization. In particular, the gradient-enhanced Kriging approach in association with internal coordinates, restricted-variance optimization, and an efficient and fast estimate of hyperparameters has demonstrated performance on par or better than standard methods. In this report, we extend the approach to constrained optimizations and transition states and benchmark it for a set of reactions. We compare the performance of the newly developed method with the standard techniques in the location of transition states and in constrained optimizations, both isolated and in the context of reaction path computation. The results show that the method outperforms the current standard in efficiency as well as in robustness.
Affiliation(s)
- Gerardo Raggi
- Department of Chemistry - BMC, Uppsala University, Uppsala 75123, Sweden
- Roland Lindh
- Department of Chemistry - BMC, Uppsala University, Uppsala 75123, Sweden
29
Garijo del Río E, Kaappa S, Garrido Torres JA, Bligaard T, Jacobsen KW. Machine learning with bond information for local structure optimizations in surface science. J Chem Phys 2020; 153:234116. [DOI: 10.1063/5.0033778] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/28/2022]
Affiliation(s)
- Sami Kaappa
- Department of Physics, Technical University of Denmark, Kgs. Lyngby, Denmark
- José A. Garrido Torres
- SUNCAT Center for Interface Science and Catalysis, Department of Chemical Engineering, Stanford University, Stanford, California 94305, USA
- SLAC National Accelerator Laboratory, 2575 Sand Hill Road, Menlo Park, California 94025, USA
- Columbia Electrochemical Energy Center, Department of Chemical Engineering, Columbia University, New York, New York 10027, USA
- Thomas Bligaard
- SUNCAT Center for Interface Science and Catalysis, Department of Chemical Engineering, Stanford University, Stanford, California 94305, USA
- Department of Energy Conversion and Storage, Technical University of Denmark, Kgs. Lyngby, Denmark
30
Bahlke MP, Mogos N, Proppe J, Herrmann C. Exchange Spin Coupling from Gaussian Process Regression. J Phys Chem A 2020; 124:8708-8723. [DOI: 10.1021/acs.jpca.0c05983] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/13/2022]
Affiliation(s)
- Marc Philipp Bahlke
- Department of Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
- Natnael Mogos
- Department of Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
- Jonny Proppe
- Institute of Physical Chemistry, Georg-August University, Tammannstr. 6, 37077 Göttingen, Germany
- Carmen Herrmann
- Department of Chemistry, University of Hamburg, Martin-Luther-King-Platz 6, 20146 Hamburg, Germany
31
Meyer R, Weichselbaum M, Hauser AW. Machine Learning Approaches toward Orbital-free Density Functional Theory: Simultaneous Training on the Kinetic Energy Density Functional and Its Functional Derivative. J Chem Theory Comput 2020; 16:5685-5694. [PMID: 32786898 PMCID: PMC7482319 DOI: 10.1021/acs.jctc.0c00580] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 02/08/2023]
Abstract
Orbital-free approaches might offer a way to boost the applicability of density functional theory by orders of magnitude in system size. An important ingredient for this endeavor is the kinetic energy density functional. Snyder et al. [Phys. Rev. Lett. 2012, 108, 253002] presented a machine learning approximation for this functional achieving chemical accuracy on a one-dimensional model system. However, a poor performance with respect to the functional derivative, a crucial element in iterative energy minimization procedures, enforced the application of a computationally expensive projection method. In this work we circumvent this issue by including the functional derivative into the training of various machine learning models. Besides kernel ridge regression, the original method of choice, we also test the performance of convolutional neural network techniques borrowed from the field of image recognition.
Affiliation(s)
- Ralf Meyer
- Institute of Experimental Physics, Graz University of Technology, Petersgasse 16, 8010 Graz, Austria
- Manuel Weichselbaum
- Institute of Experimental Physics, Graz University of Technology, Petersgasse 16, 8010 Graz, Austria
- Andreas W Hauser
- Institute of Experimental Physics, Graz University of Technology, Petersgasse 16, 8010 Graz, Austria
32
Denzel A, Kästner J. Hessian Matrix Update Scheme for Transition State Search Based on Gaussian Process Regression. J Chem Theory Comput 2020; 16:5083-5089. [DOI: 10.1021/acs.jctc.0c00348] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/31/2023]
Affiliation(s)
- Alexander Denzel
- Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
- Johannes Kästner
- Institute for Theoretical Chemistry, University of Stuttgart, Pfaffenwaldring 55, 70569 Stuttgart, Germany
33
Raggi G, Galván IF, Ritterhoff CL, Vacher M, Lindh R. Restricted-Variance Molecular Geometry Optimization Based on Gradient-Enhanced Kriging. J Chem Theory Comput 2020; 16:3989-4001. [PMID: 32374164 PMCID: PMC7304864 DOI: 10.1021/acs.jctc.0c00257] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 11/30/2022]
Abstract
Machine learning techniques, specifically gradient-enhanced Kriging (GEK), have been implemented for molecular geometry optimization. GEK-based optimization has many advantages compared to conventional (step-restricted second-order truncated expansion) molecular optimization methods. In particular, the surrogate model given by GEK can have multiple stationary points, will smoothly converge to the exact model as the number of sample points increases, and contains an explicit expression for the expected error of the model function at an arbitrary point. Machine learning is, however, associated with abundance of data, contrary to the situation desired for efficient geometry optimizations. In this paper, we demonstrate how the GEK procedure can be utilized in a fashion such that in the presence of few data points, the surrogate surface will in a robust way guide the optimization to a minimum of a potential energy surface. In this respect, the GEK procedure will be used to mimic the behavior of a conventional second-order scheme but retaining the flexibility of the superior machine learning approach. Moreover, the expected error will be used in the optimizations to facilitate restricted-variance optimizations. A procedure which relates the eigenvalues of the approximate guessed Hessian with the individual characteristic lengths, used in the GEK model, reduces the number of empirical parameters to optimize to two: the value of the trend function and the maximum allowed variance. These parameters are determined using the extended Baker (e-Baker) and part of the Baker transition-state (Baker-TS) test suites as a training set. The so-created optimization procedure is tested using the e-Baker, full Baker-TS, and S22 test suites, at the density functional theory and second-order Møller–Plesset levels of approximation. The results show that the new method is generally of similar or better performance than a state-of-the-art conventional method, even for cases where no significant improvement was expected.
Affiliation(s)
- Gerardo Raggi
- Department of Chemistry-BMC, Uppsala University, 751 23 Uppsala, Sweden
- Christian L Ritterhoff
- Department of Chemistry-BMC, Uppsala University, 751 23 Uppsala, Sweden; Faculty of Science, Universität Erlangen-Nürnberg, 91054 Erlangen, Germany
- Morgane Vacher
- Department of Chemistry-Ångström Laboratory, Uppsala University, 751 21 Uppsala, Sweden; Université de Nantes, CNRS, CEISAM UMR 6230, F-44000 Nantes, France
- Roland Lindh
- Department of Chemistry-BMC, Uppsala University, 751 23 Uppsala, Sweden