1. Ma S, Cao Y, Shi YF, Shang C, He L, Liu ZP. Data-driven discovery of active phosphine ligand space for cross-coupling reactions. Chem Sci 2024; 15:13359-13368. [PMID: 39183919 PMCID: PMC11339946 DOI: 10.1039/d4sc02327g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 07/18/2024] [Indexed: 08/27/2024]
Abstract
The design of highly active catalysts is a main theme in organic chemistry, but it still relies heavily on expert experience. Herein, powered by machine-learning global structure exploration, we forge a Metal-Phosphine Catalyst Database (MPCD) with a meticulously designed ligand replacement energy metric, a key descriptor of the metal-ligand interactions. It pushes the rational design of organometallic catalysts to a quantitative era, where a ±10 kJ mol⁻¹ window of relative ligand binding strength, a so-called active ligand space (ALS), is identified for highly effective catalyst screening. We highlight the chemical interpretability and effectiveness of ALS for various C-N, C-C and C-S cross-coupling reactions via a Sabatier-principle-based volcano plot and demonstrate its predictive power in discovering low-cost ligands for catalyzing Suzuki cross-coupling involving aryl chloride. The advent of the MPCD provides a new data-driven route for speeding up organometallic catalysis and other applications.
Affiliation(s)
- Sicong Ma
  - State Key Laboratory of Metal Organic Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
- Yanwei Cao
  - State Key Laboratory for Oxo Synthesis and Selective Oxidation, Lanzhou Institute of Chemical Physics (LICP), Chinese Academy of Sciences, Lanzhou 730000, China
- Yun-Fei Shi
  - Collaborative Innovation Center of Chemistry for Energy Materials (IChem), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Cheng Shang
  - Collaborative Innovation Center of Chemistry for Energy Materials (IChem), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Lin He
  - State Key Laboratory for Oxo Synthesis and Selective Oxidation, Lanzhou Institute of Chemical Physics (LICP), Chinese Academy of Sciences, Lanzhou 730000, China
- Zhi-Pan Liu
  - State Key Laboratory of Metal Organic Chemistry, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
  - Collaborative Innovation Center of Chemistry for Energy Materials (IChem), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
2. Manzhos S, Chen QG, Lee WY, Heejoo Y, Ihara M, Chueh CC. Computational Investigation of the Potential and Limitations of Machine Learning with Neural Network Circuits Based on Synaptic Transistors. J Phys Chem Lett 2024; 15:6974-6985. [PMID: 38941557 PMCID: PMC11247485 DOI: 10.1021/acs.jpclett.4c01413] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Synaptic transistors have been proposed to implement neuron activation functions of neural networks (NNs). While promising to enable compact, fast, inexpensive, and energy-efficient dedicated NN circuits, they also have limitations compared to digital NNs (realized as codes for digital processors), including shape choices of the activation function using particular types of transistor implementation, and instabilities due to noise and other factors present in analog circuits. We present a computational study of the effects of these factors on NN performance and find that, while accuracy competitive with traditional NNs can be realized for many applications, there is high sensitivity to the instability in the shape of the activation function, suggesting that, when highly accurate NNs are required, high-precision circuitry should be developed beyond what has been reported for synaptic transistors to date.
Affiliation(s)
- Sergei Manzhos
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Qun Gao Chen
  - Department of Chemical Engineering and Biotechnology, National Taipei University of Technology, Taipei 106, Taiwan
- Wen-Ya Lee
  - Department of Chemical Engineering and Biotechnology, National Taipei University of Technology, Taipei 106, Taiwan
- Yoon Heejoo
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Manabu Ihara
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Chu-Chen Chueh
  - Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
3. Spencer RJ, Zhanserkeev AA, Yang EL, Steele RP. The Near-Sightedness of Many-Body Interactions in Anharmonic Vibrational Couplings. J Am Chem Soc 2024; 146:15376-15392. [PMID: 38771156 DOI: 10.1021/jacs.4c03198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2024]
Abstract
Couplings between vibrational motions are driven by electronic interactions, and these couplings carry special significance in vibrational energy transfer, multidimensional spectroscopy experiments, and simulations of vibrational spectra. In this investigation, the many-body contributions to these couplings are analyzed computationally in the context of clathrate-like alkali metal cation hydrates, including Cs+(H2O)20, Rb+(H2O)20, and K+(H2O)20, using both analytic and quantum-chemistry potential energy surfaces. Although the harmonic spectra and one-dimensional anharmonic spectra depend strongly on these many-body interactions, the mode-pair couplings were, perhaps surprisingly, found to be dominated by one-body effects, even in cases of couplings to low-frequency modes that involved the motion of multiple water molecules. The origin of this effect was traced mainly to geometric distortion within water monomers and cancellation of many-body effects in differential couplings, and the effect was also shown to be agnostic to the identity of the ion. These outcomes provide new understanding of vibrational couplings and suggest the possibility of improved computational methods for the simulation of infrared and Raman spectra.
Affiliation(s)
- Ryan J Spencer
  - Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Asylbek A Zhanserkeev
  - Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Emily L Yang
  - Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
- Ryan P Steele
  - Department of Chemistry and Henry Eyring Center for Theoretical Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
4. Weike N, Fritsch F, Eisfeld W. Compensation States Approach in the Hybrid Diabatization Scheme: Extension to Multidimensional Data and Properties. J Phys Chem A 2024; 128:4353-4368. [PMID: 38748493 DOI: 10.1021/acs.jpca.4c01134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024]
Abstract
The diabatization of reactive systems for more than just a couple of states is a very demanding problem and generally requires advanced diabatization techniques. Especially for dissociative processes, the drastic changes in the adiabatic wave functions often would require large diabatic state bases, which quickly become impractical. Recently, we addressed this problem by the compensation states approach developed in the context of our hybrid diabatization scheme. This scheme utilizes wave function as well as energy data in combination with a diabatic potential model. In regions where the initial diabatic state basis becomes insufficient for an appropriate representation of the adiabatic states, new model states are generated. The new model states compensate for the state space not spanned by the initial diabatic basis. Such a compensation state is obtained by projecting the initial diabatic state space out of the adiabatic wave function. This yields a very efficient basis representation of the electronic Hamiltonian. The present work presents two new aspects. First, it is shown how other operators like the spin-orbit operator in the framework of the Effective Relativistic Coupling by Asymptotic Representation (ERCAR) can be evaluated in this compact model state space without losing the correct wave function information and accuracy. Second, the extension of the approach to multidimensional potential energy surface models is presented for methyl iodide including the C-I dissociation coordinate and the angular H3C-I bending coordinates.
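The projection step described in this abstract can be written compactly. The following is only a sketch: it assumes the $N$ initial diabatic states $|\phi_i^{d}\rangle$ are orthonormal, and the symbols are illustrative choices rather than the notation of the paper.

```latex
\hat{P} = \sum_{i=1}^{N} |\phi_i^{d}\rangle\langle\phi_i^{d}| , \qquad
|\tilde{\phi}^{c}\rangle = \bigl(\hat{1} - \hat{P}\bigr)\,|\psi^{a}\rangle , \qquad
|\phi^{c}\rangle = \frac{|\tilde{\phi}^{c}\rangle}{\sqrt{\langle\tilde{\phi}^{c}|\tilde{\phi}^{c}\rangle}}
```

Under that assumption, $\langle\phi_i^{d}|\phi^{c}\rangle = 0$ for every $i$, so the compensation state $|\phi^{c}\rangle$ carries only the part of the adiabatic state $|\psi^{a}\rangle$ that the initial diabatic basis does not span.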
Affiliation(s)
- Nicole Weike
  - Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
- Fabian Fritsch
  - Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
- Wolfgang Eisfeld
  - Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
5. Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
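For orientation, the two-dimensional Müller-Brown potential named in this abstract can be evaluated in a few lines. The sketch below uses the commonly quoted parameter set for that model surface and a plain grid search to locate its deepest minimum; the function names `muller_brown` and `grid_minimum` and the grid limits are illustrative choices, not code taken from the tutorial.

```python
import math

# Commonly quoted Müller-Brown parameters (four exponential terms).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)


def muller_brown(x, y):
    """Return the Müller-Brown potential energy at the point (x, y)."""
    total = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        total += A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return total


def grid_minimum(n=201, xlim=(-1.5, 1.2), ylim=(-0.2, 2.0)):
    """Scan a uniform n-by-n grid and return (energy, x, y) of the lowest point."""
    best = None
    for i in range(n):
        x = xlim[0] + (xlim[1] - xlim[0]) * i / (n - 1)
        for j in range(n):
            y = ylim[0] + (ylim[1] - ylim[0]) * j / (n - 1)
            e = muller_brown(x, y)
            if best is None or e < best[0]:
                best = (e, x, y)
    return best


if __name__ == "__main__":
    energy, x_min, y_min = grid_minimum()
    print(f"lowest grid point: V({x_min:.3f}, {y_min:.3f}) = {energy:.2f}")
```

Points sampled from such a surface would be the training data for the FNN and GPR fits the abstract describes; the fitting itself is left out here.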
Affiliation(s)
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Ryan Snyder
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Jia-Ning Wang
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Chance Lander
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Carly Wickizer
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Richard Van
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
- Andrew Chesney
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Yuanfei Xue
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuezhi Mao
  - Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
- Ye Mei
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
6. Weike N, Eisfeld W. The effective relativistic coupling by asymptotic representation approach for molecules with multiple relativistic atoms. J Chem Phys 2024; 160:064104. [PMID: 38341788 DOI: 10.1063/5.0191529] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Accepted: 01/18/2024] [Indexed: 02/13/2024]
Abstract
The Effective Relativistic Coupling by Asymptotic Representation (ERCAR) approach is a method to generate fully coupled diabatic potential energy surfaces (PESs) including relativistic effects, especially spin-orbit coupling. The spin-orbit coupling of a full molecule is determined only by the atomic states of selected relativistically treated atoms. The full molecular coupling effect is obtained by a diabatization with respect to asymptotic states, resulting in the correct geometry dependence of the spin-orbit effect. The ERCAR approach has been developed over the last decade and initially only for molecules with a single relativistic atom. This work presents its extension to molecules with more than a single relativistic atom using the iodine molecule as a proof-of-principle example. The theory for the general multiple atomic ERCAR approach is given. In this case, the diabatic basis is defined at the asymptote where all relativistic atoms are separated from the remaining molecular fragment. The effective spin-orbit operator is then a sum of spin-orbit operators acting on isolated relativistic atoms. PESs for the iodine molecule are developed within the new approach and it is shown that the resulting fine structure states are in good agreement with spin-orbit ab initio calculations.
Affiliation(s)
- Nicole Weike
  - Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
- Wolfgang Eisfeld
  - Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
7. Iyengar SS, Ricard TC, Zhu X. Reformulation of All ONIOM-Type Molecular Fragmentation Approaches and Many-Body Theories Using Graph-Theory-Based Projection Operators: Applications to Dynamics, Molecular Potential Surfaces, Machine Learning, and Quantum Computing. J Phys Chem A 2024; 128:466-478. [PMID: 38180503 DOI: 10.1021/acs.jpca.3c05630] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Abstract
We present a graph-theory-based reformulation of all ONIOM-based molecular fragmentation methods. We discuss applications to (a) accurate post-Hartree-Fock AIMD that can be conducted at DFT cost for medium-sized systems, (b) hybrid DFT condensed-phase studies at the cost of pure density functionals, (c) reduced cost on-the-fly large basis gas-phase AIMD and condensed-phase studies, (d) post-Hartree-Fock-level potential surfaces at DFT cost to obtain quantum nuclear effects, and (e) novel transfer machine learning protocols derived from these measures. Additionally, in previous work, the unifying strategy discussed here has been used to construct new quantum computing algorithms. Thus, we conclude that this reformulation is robust and accurate.
Affiliation(s)
- Srinivasan S Iyengar
  - Department of Chemistry, Department of Physics, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Timothy C Ricard
  - Department of Chemistry, Department of Physics, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Xiao Zhu
  - Department of Chemistry, Department of Physics, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
8. Manzhos S, Ihara M. A controlled study of the effect of deviations from symmetry of the potential energy surface (PES) on the accuracy of the vibrational spectrum computed with collocation. J Chem Phys 2023; 159:211103. [PMID: 38038200 DOI: 10.1063/5.0182373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Accepted: 11/12/2023] [Indexed: 12/02/2023]
Abstract
Symmetry, in particular permutational symmetry, of a potential energy surface (PES) is a useful property in quantum chemical calculations. It facilitates, in particular, state labelling and identification of degenerate states. In many practically important applications, however, these issues are unimportant. The imposition of exact symmetry and the perception that it is necessary create additional methodological requirements narrowing or complicating algorithmic choices that are thereby biased against methods and codes that by default do not incorporate symmetry, including most off-the-shelf machine learning methods that cannot be directly used if exact symmetry is demanded. By introducing symmetric and unsymmetric errors into the PES of H2CO in a controlled way and computing the vibrational spectrum with collocation using symmetric and nonsymmetric collocation point sets, we show that when the deviations from an ideal PES are random, imposition of exact symmetry does not bring any practical advantages. Moreover, a calculation ignoring symmetry may be more accurate. We also compare machine-learned PESs with and without symmetrization and demonstrate that there is no advantage of imposing exact symmetry for the accuracy of the vibrational spectrum.
Affiliation(s)
- Sergei Manzhos
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Manabu Ihara
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
9. Ricard TC, Zhu X, Iyengar SS. Capturing Weak Interactions in Surface Adsorbate Systems at Coupled Cluster Accuracy: A Graph-Theoretic Molecular Fragmentation Approach Improved through Machine Learning. J Chem Theory Comput 2023. [PMID: 38019639 DOI: 10.1021/acs.jctc.3c00955] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/01/2023]
Abstract
The accurate and efficient study of the interactions of organic matter with the surface of water is critical to a wide range of applications. For example, environmental studies have found that acidic polyfluorinated alkyl substances, especially perfluorooctanoic acid (PFOA), have spread throughout the environment and bioaccumulate into human populations residing near contaminated watersheds, leading to many systemic maladies. Thus, the study of the interactions of PFOA with water surfaces became important for the mitigation of their activity as pollutants and threats to public health. However, theoretical study of the interactions of such organic adsorbates on the surface of water, and their bulk concerted properties, often necessitates the use of ab initio methods to properly incorporate the long-range electronic properties that govern these extended systems. Notable theoretical treatments of "on-water" reactions thus far have employed hybrid DFT and semilocal DFT, but the interactions involved are weak interactions that may be best described using post-Hartree-Fock theory. Here, we aim to demonstrate the utility of a graph-theoretic approach to molecular fragmentation that accurately captures the critical "weak" interactions while maintaining an efficient ab initio treatment of the long-range periodic interactions that underpin the physics of extended systems. We apply this graph-theoretical treatment to study PFOA on the surface of water as a model system for the study of weak interactions seen in the wide range of surface interactions and reactions. The approach divides a system into a set of vertices, that are then connected through edges, faces, and higher order graph theoretic objects known as simplexes, to represent a collection of locally interacting subsystems. These subsystems are then used to construct ab initio molecular dynamics simulations and for computing multidimensional potential energy surfaces. 
To further improve the computational efficiency of our graph theoretic fragmentation method, we use a recently developed transfer learning protocol to construct the full system potential energy from a family of neural networks each designed to accurately model the behavior of individual simplexes. We use a unique multidimensional clustering algorithm, based on the k-means clustering methodology, to define our training space for each separate simplex. These models are used to extrapolate the energies for molecular dynamics trajectories at PFOA water interfaces, at less than one-tenth the cost of a regular molecular fragmentation-based dynamics calculation, with excellent agreement with coupled-cluster-level full-system potential energies.
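The idea of using k-means to choose a training set can be illustrated with a minimal sketch. This is plain Lloyd's k-means followed by picking the sample nearest each centroid; it is not the authors' multidimensional clustering algorithm, and the function names `kmeans` and `representatives` are hypothetical.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((pi - qi) ** 2 for pi, qi in zip(p, q))


def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                dim = len(members[0])
                new_centroids.append(
                    [sum(m[d] for m in members) / len(members) for d in range(dim)]
                )
            else:
                new_centroids.append(centroids[c])  # leave an empty cluster in place
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


def representatives(points, centroids):
    """Index of the sample closest to each centroid (candidate training geometries)."""
    return [min(range(len(points)), key=lambda i: sq_dist(points[i], c)) for c in centroids]
```

In a setting like the one described, each point would be a geometry descriptor for one simplex, and only the representative geometries would be sent to the expensive electronic structure calculation.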
Affiliation(s)
- Timothy C Ricard
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Xiao Zhu
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Srinivasan S Iyengar
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
10. Manzhos S, Ihara M. Neural Network with Optimal Neuron Activation Functions Based on Additive Gaussian Process Regression. J Phys Chem A 2023; 127:7823-7835. [PMID: 37698519 DOI: 10.1021/acs.jpca.3c02949] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 09/13/2023]
Abstract
Feed-forward neural networks (NNs) are a staple machine learning method widely used in many areas of science and technology, including physical chemistry, computational chemistry, and materials informatics. While even a single-hidden-layer NN is a universal approximator, its expressive power is limited by the use of simple neuron activation functions (such as sigmoid functions) that are typically the same for all neurons. More flexible neuron activation functions would allow the use of fewer neurons and layers and thereby save computational cost and improve expressive power. We show that additive Gaussian process regression (GPR) can be used to construct optimal neuron activation functions that are individual to each neuron. An approach is also introduced that avoids nonlinear fitting of neural network parameters by defining them with rules. The resulting method combines the advantage of robustness of a linear regression with the higher expressive power of an NN. We demonstrate the approach by fitting the potential energy surfaces of the water molecule and formaldehyde. Without requiring any nonlinear optimization, the additive-GPR-based approach outperforms a conventional NN in the high-accuracy regime, where a conventional NN suffers more from overfitting.
Affiliation(s)
- Sergei Manzhos
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Manabu Ihara
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
11. Arab F, Nazari F, Illas F. Artificial Neural Network-Derived Unified Six-Dimensional Potential Energy Surface for Tetra Atomic Isomers of the Biogenic [H, C, N, O] System. J Chem Theory Comput 2023; 19:1186-1196. [PMID: 36735891 PMCID: PMC9979606 DOI: 10.1021/acs.jctc.2c00915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Recognition of different structural patterns in different potential energy surface regions, such as in isomerizing quasilinear tetra atomic molecules, is important for understanding the details of underlying physics and chemistry. In this respect, using three variants of artificial neural networks (ANNs), we investigated the six-dimensional (6-D) singlet potential energy surfaces (PES) of tetra atomic isomers of the biogenic [H, C, N, O] system. At first, we constructed a separate ANN potential for each of the studied isomers. In the next step, a comparative assessment of the separate ANN models led to the setting up of a unified 6-D singlet PES equally and accurately describing all studied isomers. The constructed unified model yields relative energies comparable to those obtained either from the gold standard CCSD(T) method or from separate ANNs for each of the studied isomers. The accuracy of the unified singlet PES is on the order of 10⁻⁴ hartree (0.1 kcal/mol). The developed PES in this work captures the main features of nonlinear and quasilinear tetra atomic isomers of this biogenic system.
Affiliation(s)
- Fatemeh Arab
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Fariba Nazari
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
  - Center of Climate Change and Global Warming, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Francesc Illas
  - Departament de Ciència de Materials i Química Física & Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain
12. Manzhos S, Ihara M. The loss of the property of locality of the kernel in high-dimensional Gaussian process regression on the example of the fitting of molecular potential energy surfaces. J Chem Phys 2023; 158:044111. [PMID: 36725493 DOI: 10.1063/5.0136156] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]
Abstract
Kernel-based methods, including Gaussian process regression (GPR) and generally kernel ridge regression, have been finding increasing use in computational chemistry, including the fitting of potential energy surfaces and density functionals in high-dimensional feature spaces. Kernels of the Matérn family, such as Gaussian-like kernels (basis functions), are often used which allow imparting to them the meaning of covariance functions and formulating GPR as an estimator of the mean of a Gaussian distribution. The notion of locality of the kernel is critical for this interpretation. It is also critical to the formulation of multi-zeta type basis functions widely used in computational chemistry. We show, on the example of fitting of molecular potential energy surfaces of increasing dimensionality, the practical disappearance of the property of locality of a Gaussian-like kernel in high dimensionality. We also formulate a multi-zeta approach to the kernel and show that it significantly improves the quality of regression in low dimensionality but loses any advantage in high dimensionality, which is attributed to the loss of the property of locality.
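A small numerical experiment gives a feel for the loss of locality described in this abstract. The sketch below is an illustration under stated assumptions, not the paper's setup: it draws uniform random points in a unit hypercube instead of PES data, uses an isotropic Gaussian kernel, and sets the length scale from the mean squared pair distance. The function name `kernel_spread` is hypothetical.

```python
import math
import random


def kernel_spread(dim, n_points=50, seed=0):
    """Spread (max minus min) of Gaussian kernel values over all distinct point pairs.

    A spread near 1 means the kernel separates near pairs from far pairs (local).
    A small spread means every pair looks about equally similar (non-local).
    """
    rng = random.Random(seed)
    pts = [[rng.random() for _ in range(dim)] for _ in range(n_points)]
    # Squared distances for every unordered pair of distinct points.
    sq = [
        sum((a - b) ** 2 for a, b in zip(p, q))
        for i, p in enumerate(pts)
        for q in pts[i + 1:]
    ]
    mean_sq = sum(sq) / len(sq)  # assumed heuristic for the squared length scale
    k = [math.exp(-d / (2.0 * mean_sq)) for d in sq]
    return max(k) - min(k)


if __name__ == "__main__":
    for dim in (2, 10, 100):
        print(dim, round(kernel_spread(dim), 3))
```

As the dimension grows, pair distances concentrate around their mean, so the kernel values crowd into a narrow band and the kernel stops distinguishing neighbours from distant points.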
Affiliation(s)
- Sergei Manzhos
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
- Manabu Ihara
  - School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan
13. Manzhos S, Tsuda S, Ihara M. Machine learning in computational chemistry: interplay between (non)linearity, basis sets, and dimensionality. Phys Chem Chem Phys 2023; 25:1546-1555. [PMID: 36562317 DOI: 10.1039/d2cp04155c] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]
Abstract
Machine learning (ML) based methods and tools have now firmly established themselves in physical chemistry and in particular in theoretical and computational chemistry and in materials chemistry. The generality of popular ML techniques such as neural networks or kernel methods (Gaussian process and kernel ridge regression and their flavors) permitted their application to diverse problems from prediction of properties of functional materials (catalysts, solid state ionic conductors, etc.) from descriptors to the building of interatomic potentials (where ML is currently routinely used in applications) and electron density functionals. These ML techniques are assumed to have superior expressive power of nonlinear methods, and are often used "as is", with concepts such as "non-parametric" or "deep learning" used without a clear justification for their need or advantage over simpler and more robust alternatives. In this Perspective, we highlight some interrelations between popular ML techniques and traditional linear regressions and basis expansions and demonstrate that in certain regimes (such as a very high dimensionality) these approximations might collapse. We also discuss ways to recover the expressive power of a nonlinear approach and to help select hyperparameters with the help of high-dimensional model representation and to obtain elements of insight while preserving the generality of the method.
Affiliation(s)
- Sergei Manzhos
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan.
| | - Shunsaku Tsuda
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan.
| | - Manabu Ihara
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Ookayama 2-12-1, Meguro-ku, Tokyo 152-8552, Japan.
| |
|
14
|
Muther T, Dahaghi AK, Syed FI, Van Pham V. Physical laws meet machine intelligence: current developments and future directions. Artif Intell Rev 2022. [DOI: 10.1007/s10462-022-10329-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
15
|
Zhu X, Iyengar SS. Graph Theoretic Molecular Fragmentation for Multidimensional Potential Energy Surfaces Yield an Adaptive and General Transfer Machine Learning Protocol. J Chem Theory Comput 2022; 18:5125-5144. [PMID: 35994592 DOI: 10.1021/acs.jctc.1c01241] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 12/19/2022]
Abstract
Over a series of publications we have introduced a graph-theoretic description for molecular fragmentation. Here, a system is divided into a set of nodes, or vertices, that are then connected through edges, faces, and higher-order simplexes to represent a collection of spatially overlapping and locally interacting subsystems. Each such subsystem is treated at two levels of electronic structure theory, and the result is used to construct many-body expansions that are then embedded within an ONIOM-scheme. These expansions converge rapidly with many-body order (or graphical rank) of subsystems and have been previously used for ab initio molecular dynamics (AIMD) calculations and for computing multidimensional potential energy surfaces. Specifically, in all these cases we have shown that CCSD and MP2 level AIMD trajectories and potential surfaces may be obtained at density functional theory cost. The approach has been demonstrated for gas-phase studies, for condensed phase electronic structure, and also for basis set extrapolation-based AIMD. Recently, this approach has also been used to derive new quantum-computing algorithms that enormously reduce the quantum circuit depth in a circuit-based computation of correlated electronic structure. In this publication, we introduce (a) a family of neural networks that act in parallel to represent, efficiently, the post-Hartree-Fock electronic structure energy contributions for all simplexes (fragments), and (b) a new k-means-based tessellation strategy to glean training data for high-dimensional molecular spaces and minimize the extent of training needed to construct this family of neural networks. The approach is particularly useful when coupled cluster accuracy is desired and when fragment sizes grow in order to capture nonlocal interactions accurately. 
The unique multidimensional k-means tessellation/clustering algorithm used to determine our training data for all fragments is shown to be extremely efficient and reduces the needed training to only 10% of data for all fragments to obtain accurate neural networks for each fragment. These fully connected dense neural networks are then used to extrapolate the potential energy surface for all molecular fragments, and these are then combined as per our graph-theoretic procedure to transfer the learning process to a full system energy for the entire AIMD trajectory at less than one-tenth the cost as compared to a regular fragmentation-based AIMD calculation.
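The idea of using k-means clustering to pick a small, representative training subset (about 10% of the sampled geometries) can be sketched as follows. This is only a minimal illustration, not the authors' tessellation algorithm: the function names, the random stand-in "geometries", and the rule of keeping the sample nearest each centroid are all assumptions made here.

```python
import numpy as np

def kmeans_centroids(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns the k cluster centroids."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points (fancy indexing returns a copy).
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        # Distances from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
    return centroids

def select_training_indices(points, fraction=0.1, seed=0):
    """Hypothetical helper: keep the data point closest to each centroid."""
    k = max(1, int(round(fraction * len(points))))
    centroids = kmeans_centroids(points, k, seed=seed)
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    # argmin over points for each centroid; np.unique drops accidental repeats.
    return np.unique(dists.argmin(axis=0))

# Stand-in data: 200 random 3D vectors playing the role of fragment geometries.
rng = np.random.default_rng(1)
geometries = rng.normal(size=(200, 3))
train_idx = select_training_indices(geometries, fraction=0.1)
```

Only the geometries indexed by `train_idx` would then need the expensive electronic structure calculation; the rest would be predicted by the fitted model.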
Affiliation(s)
- Xiao Zhu
- Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington 47405, Indiana, United States
| | - Srinivasan S Iyengar
- Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington 47405, Indiana, United States
| |
|
16
|
Dai J, Krems RV. Quantum Gaussian process model of potential energy surface for a polyatomic molecule. J Chem Phys 2022; 156:184802. [PMID: 35568545 DOI: 10.1063/5.0088821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
With gates of a quantum computer designed to encode multi-dimensional vectors, projections of quantum computer states onto specific qubit states can produce kernels of reproducing kernel Hilbert spaces. We show that quantum kernels obtained with a fixed ansatz implementable on current quantum computers can be used for accurate regression models of global potential energy surfaces (PESs) for polyatomic molecules. To obtain accurate regression models, we apply Bayesian optimization to maximize marginal likelihood by varying the parameters of the quantum gates. This yields Gaussian process models with quantum kernels. We illustrate the effect of qubit entanglement in the quantum kernels and explore the generalization performance of quantum Gaussian processes by extrapolating global six-dimensional PESs in the energy domain.
Affiliation(s)
- J Dai
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Stewart Blusson Quantum Matter Institute, Vancouver, British Columbia V6T 1Z4, Canada
| | - R V Krems
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Stewart Blusson Quantum Matter Institute, Vancouver, British Columbia V6T 1Z4, Canada
| |
|
17
|
Zhang L, Zhang S, Owens A, Yurchenko SN, Dral PO. VIB5 database with accurate ab initio quantum chemical molecular potential energy surfaces. Sci Data 2022; 9:84. [PMID: 35277513 PMCID: PMC8917215 DOI: 10.1038/s41597-022-01185-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/02/2021] [Accepted: 01/19/2022] [Indexed: 11/09/2022] Open
Abstract
High-level ab initio quantum chemical (QC) molecular potential energy surfaces (PESs) are crucial for accurately simulating molecular rotation-vibration spectra. Machine learning (ML) can help alleviate the cost of constructing such PESs, but requires access to the original ab initio PES data, namely potential energies computed on high-density grids of nuclear geometries. In this work, we present a new structured PES database called VIB5, which contains high-quality ab initio data on 5 small polyatomic molecules of astrophysical significance (CH3Cl, CH4, SiH4, CH3F, and NaOH). The VIB5 database is based on previously used PESs, which, however, are either publicly unavailable or lacking key information to make them suitable for ML applications. The VIB5 database provides tens of thousands of grid points for each molecule with theoretical best estimates of potential energies along with their constituent energy correction terms and a data-extraction script. In addition, new complementary QC calculations of energies and energy gradients have been performed to provide a consistent database, which, e.g., can be used for gradient-based ML methods.
Measurement(s): potential energy surfaces. Technology Type(s): quantum chemistry computational methods.
Affiliation(s)
- Lina Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China
| | - Shuang Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China
| | - Alec Owens
- Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT, London, United Kingdom.
| | - Sergei N Yurchenko
- Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT, London, United Kingdom
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, 361005, China.
| |
|
18
|
Theoretical Description of Water from Single-Molecule to Condensed Phase: a Review of Recent Progress on Potential Energy Surfaces and Molecular Dynamics. Chin J Chem Phys 2022. [DOI: 10.1063/1674-0068/cjcp2201005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/14/2022]
|
19
|
Manzhos S, Ihara M. Computational vibrational spectroscopy of molecule-surface interactions: what is still difficult and what can be done about it. Phys Chem Chem Phys 2022; 24:15158-15172. [DOI: 10.1039/d2cp01389d] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/21/2022]
Abstract
Interactions of molecules with solid surfaces are responsible for key functionalities for a range of currently actively pursued technologies, including heterogeneous catalysis for synthesis or decomposition of molecules, sensitization, surface...
|
20
|
Yang Z, Chen H, Chen M. Representing Globally Accurate Reactive Potential Energy Surfaces with Complex Topography by Combining Gaussian Process Regression and Neural Network. Phys Chem Chem Phys 2022; 24:12827-12836. [DOI: 10.1039/d2cp00719c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
There has been increasing attention in using machine learning technologies, such as neural network (NN) and Gaussian process regression (GPR), to model multidimensional potential energy surfaces (PESs). NN PES features...
|
21
|
Omodemi O, Sprouse S, Herbert D, Kaledin M, Kaledin AL. On the Cartesian Representation of the Molecular Polarizability Tensor Surface by Polynomial Fitting to Ab Initio Data. J Chem Theory Comput 2021; 18:37-45. [PMID: 34958587 DOI: 10.1021/acs.jctc.1c01015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
We describe an approach to constructing an analytic Cartesian representation of the molecular dipole polarizability tensor surface in terms of polynomials in interatomic distances with a training set of ab initio data points obtained from a molecular dynamics (MD) simulation or by any other available means. The proposed formulation is based on a perturbation treatment of the unmodified point dipole polarizability model of Applequist [ J. Am. Chem. Soc. 1972, 94, 2952] and is shown here to be, by construction (i) free of short-range or other singularities or discontinuities, (ii) symmetric and translationally invariant, and (iii) nonreliant on a body-fixed coordinate system. Permutational invariance of like nuclei is demonstrated to be readily applicable, making this approach useful for highly fluxional and reactive systems. Derivation of the method is described in detail, adding brief didactic numerical examples of H2 and H2O and concluding with an MD simulation of the Raman spectrum of H5O2+ at 300 K with the polarizability tensor fitted to CCSD(T)/aug-cc-pVTZ data obtained using the HBB-4B potential [ J. Chem. Phys. 2005, 122, 044308].
Affiliation(s)
- Oluwaseun Omodemi
- Department of Chemistry & Biochemistry, Kennesaw State University, 370 Paulding Avenue NW, Box # 1203, Kennesaw, Georgia 30144, United States
| | - Sarah Sprouse
- Department of Chemistry & Biochemistry, Kennesaw State University, 370 Paulding Avenue NW, Box # 1203, Kennesaw, Georgia 30144, United States
| | - Destyni Herbert
- Department of Chemistry & Biochemistry, Kennesaw State University, 370 Paulding Avenue NW, Box # 1203, Kennesaw, Georgia 30144, United States
| | - Martina Kaledin
- Department of Chemistry & Biochemistry, Kennesaw State University, 370 Paulding Avenue NW, Box # 1203, Kennesaw, Georgia 30144, United States
| | - Alexey L Kaledin
- Cherry L. Emerson Center for Scientific Computation, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
| |
|
22
|
Asnaashari K, Krems RV. Gradient domain machine learning with composite kernels: improving the accuracy of PES and force fields for large molecules. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/ac3845] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022] Open
Abstract
The generalization accuracy of machine learning models of potential energy surfaces (PES) and force fields (FF) for large polyatomic molecules can be improved either by increasing the number of training points or by improving the models. In order to build accurate models based on expensive ab initio calculations, much of recent work has focused on the latter. In particular, it has been shown that gradient domain machine learning (GDML) models produce accurate results for high-dimensional molecular systems with a small number of ab initio calculations. The present work extends GDML to models with composite kernels built to maximize inference from a small number of molecular geometries. We illustrate that GDML models can be improved by increasing the complexity of underlying kernels through a greedy search algorithm using Bayesian information criterion as the model selection metric. We show that this requires including anisotropy into kernel functions and produces models with significantly smaller generalization errors. The results are presented for ethanol, uracil, malonaldehyde and aspirin. For aspirin, the model with composite kernels trained by forces at 1000 randomly sampled molecular geometries produces a global 57-dimensional PES with the mean absolute accuracy 0.177 kcal mol−1 (61.9 cm−1) and FFs with the mean absolute error 0.457 kcal mol−1 Å−1.
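A greedy kernel search scored by the Bayesian information criterion can be sketched in a few lines. This is a heavily simplified illustration, not the GDML procedure of the paper: it fits energies only (no forces), uses 1D inputs and isotropic kernels with fixed hyperparameters, counts one parameter per base kernel, and adopts the common convention BIC = -2 ln L + k ln n (lower is better). All function names and the toy data are assumptions.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel on 1D inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length**2)

def linear(a, b):
    """Linear (dot-product) kernel on 1D inputs."""
    return a[:, None] * b[None, :]

BASE_KERNELS = {"rbf": rbf, "lin": linear}

def log_marginal_likelihood(K, y, noise=1e-4):
    """Zero-mean GP log marginal likelihood with Gaussian noise variance `noise`."""
    n = len(y)
    L = np.linalg.cholesky(K + noise * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    # log|K| = 2 * sum(log(diag(L))), so -0.5 * log|K| = -sum(log(diag(L))).
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

def bic(log_lik, n_params, n_points):
    """BIC in the 'lower is better' convention."""
    return -2.0 * log_lik + n_params * np.log(n_points)

def greedy_kernel_search(x, y, max_terms=3):
    """Grow a composite kernel by sums/products while the BIC keeps improving."""
    n = len(y)
    cur_name, cur_K, cur_score = None, None, np.inf
    for n_terms in range(1, max_terms + 1):
        candidates = []
        for name, fn in BASE_KERNELS.items():
            K_base = fn(x, x)
            if cur_K is None:
                candidates.append((name, K_base))
            else:
                candidates.append((f"({cur_name} + {name})", cur_K + K_base))
                candidates.append((f"({cur_name} * {name})", cur_K * K_base))
        scored = [(bic(log_marginal_likelihood(K, y), n_terms, n), name, K)
                  for name, K in candidates]
        score, name, K = min(scored, key=lambda t: t[0])
        if score >= cur_score:  # no improvement: stop growing the kernel
            break
        cur_name, cur_K, cur_score = name, K, score
    return cur_name, cur_score

# Toy 1D "energies": an oscillation plus a linear trend.
x = np.linspace(-3.0, 3.0, 30)
y = np.sin(x) + 0.5 * x
best_kernel, best_bic = greedy_kernel_search(x, y)
```

In a realistic setting the hyperparameters of each candidate would be optimized before scoring, and the parameter count in the BIC penalty would reflect them.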
|
23
|
DiRisio RJ, Lu F, McCoy AB. GPU-Accelerated Neural Network Potential Energy Surfaces for Diffusion Monte Carlo. J Phys Chem A 2021; 125:5849-5859. [PMID: 34165989 DOI: 10.1021/acs.jpca.1c03709] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/09/2023]
Abstract
Diffusion Monte Carlo (DMC) provides a powerful method for understanding the vibrational landscape of molecules that are not well-described by conventional methods. The most computationally demanding step of these calculations is the evaluation of the potential energy. In this work, a general approach is developed in which a neural network potential energy surface is trained by using data generated from a small-scale DMC calculation. Once trained, the neural network can be evaluated by using highly parallelizable calls to a graphics processing unit (GPU). The power of this approach is demonstrated for DMC simulations on H2O, CH5+, and (H2O)2. The need to include permutation symmetry in the neural network potentials is explored and incorporated into the molecular descriptors of CH5+ and (H2O)2. It is shown that the zero-point energies and wave functions obtained by using the neural network potentials are nearly identical to the results obtained when using the potential energy surfaces that were used to train the neural networks at a substantial savings in the computational requirements of the simulations.
Affiliation(s)
- Ryan J DiRisio
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Fenris Lu
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| | - Anne B McCoy
- Department of Chemistry, University of Washington, Seattle, Washington 98195, United States
| |
|
24
|
Affiliation(s)
- Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| |
|
25
|
Allen AEA, Dusson G, Ortner C, Csányi G. Atomic permutationally invariant polynomials for fitting molecular force fields. Mach Learn Sci Technol 2021. [DOI: 10.1088/2632-2153/abd51e] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022]
|
26
|
Quintas-Sánchez E, Dawes R. Spectroscopy and Scattering Studies Using Interpolated Ab Initio Potentials. Annu Rev Phys Chem 2021; 72:399-421. [PMID: 33503385 DOI: 10.1146/annurev-physchem-090519-051837] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
Abstract
The Born-Oppenheimer potential energy surface (PES) has come a long way since its introduction in the 1920s, both conceptually and in predictive power for practical applications. Nevertheless, nearly 100 years later, and despite astonishing advances in computational power, the state-of-the-art first-principles prediction of observables related to spectroscopy and scattering dynamics is surprisingly limited. For example, the water dimer, (H2O)2, with only six nuclei and 20 electrons, still presents a formidable challenge for full-dimensional variational calculations of bound states and is considered out of reach for rigorous scattering calculations. The extremely poor scaling of the most rigorous quantum methods is fundamental; however, recent progress in development of approximate methodologies has opened the door to fairly routine high-quality predictions, unthinkable 20 years ago. In this review, in relation to the workflow of spectroscopy and/or scattering studies, we summarize progress and challenges in the component areas of electronic structure calculations, PES fitting, and quantum dynamical calculations.
Affiliation(s)
- Ernesto Quintas-Sánchez
- Department of Chemistry, Missouri University of Science and Technology, Rolla, Missouri 65409, USA
| | - Richard Dawes
- Department of Chemistry, Missouri University of Science and Technology, Rolla, Missouri 65409, USA
| |
|
27
|
Manzhos S, Carrington T. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem Rev 2020; 121:10187-10217. [PMID: 33021368 DOI: 10.1021/acs.chemrev.0c00665] [Citation(s) in RCA: 119] [Impact Index Per Article: 29.8] [Indexed: 12/16/2022]
Abstract
We review progress in neural network (NN)-based methods for the construction of interatomic potentials from discrete samples (such as ab initio energies) for applications in classical and quantum dynamics including reaction dynamics and computational spectroscopy. The main focus is on methods for building molecular potential energy surfaces (PES) in internal coordinates that explicitly include all many-body contributions, even though some of the methods we review limit the degree of coupling, due either to a desire to limit computational cost or to limited data. Explicit and direct treatment of all many-body contributions is only practical for sufficiently small molecules, which are therefore our primary focus. This includes small molecules on surfaces. We consider direct, single NN PES fitting as well as more complex methods that impose structure (such as a multibody representation) on the PES function, either through the architecture of one NN or by using multiple NNs. We show how NNs are effective in building representations with low-dimensional functions including dimensionality reduction. We consider NN-based approaches to build PESs in the sums-of-product form important for quantum dynamics, ways to treat symmetry, and issues related to sampling data distributions and the relation between PES errors and errors in observables. We highlight combinations of NNs with other ideas such as permutationally invariant polynomials or sums of environment-dependent atomic contributions, which have recently emerged as powerful tools for building highly accurate PESs for relatively large molecular and reactive systems.
Affiliation(s)
- Sergei Manzhos
- Centre Énergie Matériaux Télécommunications, Institut National de la Recherche Scientifique, 1650, Boulevard Lionel-Boulet, Varennes, Québec City, Québec J3X 1S2, Canada
| | - Tucker Carrington
- Chemistry Department, Queen's University, Kingston Ontario K7L 3N6, Canada
| |
|
28
|
Sugisawa H, Ida T, Krems RV. Gaussian process model of 51-dimensional potential energy surface for protonated imidazole dimer. J Chem Phys 2020; 153:114101. [DOI: 10.1063/5.0023492] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/30/2022] Open
Affiliation(s)
- Hiroki Sugisawa
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
| | - Tomonori Ida
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
| | - R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Stewart Blusson Quantum Matter Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
| |
|
29
|
Williams DMG, Eisfeld W. Complete Nuclear Permutation Inversion Invariant Artificial Neural Network (CNPI-ANN) Diabatization for the Accurate Treatment of Vibronic Coupling Problems. J Phys Chem A 2020; 124:7608-7621. [DOI: 10.1021/acs.jpca.0c05991] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/30/2022]
Affiliation(s)
- David M. G. Williams
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| | - Wolfgang Eisfeld
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| |
|
30
|
Dral PO, Owens A, Dral A, Csányi G. Hierarchical machine learning of potential energy surfaces. J Chem Phys 2020; 152:204110. [DOI: 10.1063/5.0006498] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/20/2022] Open
Affiliation(s)
- Pavlo O. Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| | - Alec Owens
- Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT London, United Kingdom
| | - Alexey Dral
- BigData Team, 1A Tormoznoye Shosse Off 17, Yaroslavl, Yaroslavl 150022, Russian Federation
| | - Gábor Csányi
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| |
|
31
|
Manzhos S. Machine learning for the solution of the Schrödinger equation. Mach Learn Sci Technol 2020. [DOI: 10.1088/2632-2153/ab7d30] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Indexed: 11/11/2022]
|
32
|
Abstract
As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
| |
|
33
|
Dai J, Krems RV. Interpolation and Extrapolation of Global Potential Energy Surfaces for Polyatomic Systems by Gaussian Processes with Composite Kernels. J Chem Theory Comput 2020; 16:1386-1395. [DOI: 10.1021/acs.jctc.9b00700] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 11/28/2022]
Affiliation(s)
- J. Dai
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
| | - R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
| |
|
34
|
Schröder M. Transforming high-dimensional potential energy surfaces into a canonical polyadic decomposition using Monte Carlo methods. J Chem Phys 2020; 152:024108. [DOI: 10.1063/1.5140085] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 11/15/2022] Open
Affiliation(s)
- Markus Schröder
- Theoretische Chemie, Physikalisch-Chemisches Institut, Universität Heidelberg, Im Neuenheimer Feld 229, D-69120 Heidelberg, Germany
| |
|
35
|
|
36
|
Flynn SW, Mandelshtam VA. Sampling general distributions with quasi-regular grids: Application to the vibrational spectra calculations. J Chem Phys 2019; 151:241105. [DOI: 10.1063/1.5134677] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/15/2022] Open
Affiliation(s)
- Shane W. Flynn
- Department of Chemistry, University of California, Irvine, California 92697, USA
| | | |
|
37
|
Brown SE. From ab initio data to high-dimensional potential energy surfaces: A critical overview and assessment of the development of permutationally invariant polynomial potential energy surfaces for single molecules. J Chem Phys 2019; 151:194111. [PMID: 31757150 DOI: 10.1063/1.5123999] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
The representation of high-dimensional potential energy surfaces by way of the many-body expansion and permutationally invariant polynomials has become a well-established tool for improving the resolution and extending the scope of molecular simulations. The high level of accuracy that can be attained by these potential energy functions (PEFs) is due in large part to their specificity: for each term in the many-body expansion, a species-specific training set must be generated at the desired level of theory and a number of fits attempted in order to obtain a robust and reliable PEF. In this work, we attempt to characterize the numerical aspects of the fitting problem, addressing questions which are of simultaneous practical and fundamental importance. These include concrete illustrations of the nonconvexity of the problem, the ill-conditionedness of the linear system to be solved and possible need for regularization, the sensitivity of the solutions to the characteristics of the training set, and limitations of the approach with respect to accuracy and the types of molecules that can be treated. In addition, we introduce a general approach to the generation of training set configurations based on the familiar harmonic approximation and evaluate the possible benefits to the use of quasirandom sequences for sampling configuration space in this context. Using sulfate as a case study, the findings are largely generalizable and expected to ultimately facilitate the efficient development of PIP-based many-body PEFs for general systems via automation.
Affiliation(s)
- Sandra E Brown
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| |
|
38
|
Schran C, Behler J, Marx D. Automated Fitting of Neural Network Potentials at Coupled Cluster Accuracy: Protonated Water Clusters as Testing Ground. J Chem Theory Comput 2019; 16:88-99. [DOI: 10.1021/acs.jctc.9b00805] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Indexed: 12/20/2022]
Affiliation(s)
- Christoph Schran
- Lehrstuhl für Theoretische Chemie, Ruhr−Universität Bochum, 44780 Bochum, Germany
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstrasse 6, 37077 Göttingen, Germany
| | - Dominik Marx
- Lehrstuhl für Theoretische Chemie, Ruhr−Universität Bochum, 44780 Bochum, Germany
| |
|
39
|
Karandashev K, Vaníček J. A combined on-the-fly/interpolation procedure for evaluating energy values needed in molecular simulations. J Chem Phys 2019; 151:174116. [PMID: 31703487 DOI: 10.1063/1.5124469] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
We propose an algorithm for molecular dynamics or Monte Carlo simulations that uses an interpolation procedure to estimate potential energy values from energies and gradients evaluated previously at points of a simplicial mesh. We chose an interpolation procedure that is exact for harmonic systems and considered two possible mesh types: Delaunay triangulation and an alternative anisotropic triangulation designed to improve performance in anharmonic systems. The mesh is generated and updated on the fly during the simulation. The procedure is tested on two-dimensional quartic oscillators and on the path integral Monte Carlo evaluation of the HCN/DCN equilibrium isotope effect.
Affiliation(s)
- Konstantin Karandashev
- Laboratory of Theoretical Physical Chemistry, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
| | - Jiří Vaníček
- Laboratory of Theoretical Physical Chemistry, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland
| |
|
40
|
Williams DMG, Viel A, Eisfeld W. Diabatic neural network potentials for accurate vibronic quantum dynamics: The test case of planar NO3. J Chem Phys 2019; 151:164118. [DOI: 10.1063/1.5125851] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/14/2022] Open
Affiliation(s)
- David M. G. Williams
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| | - Alexandra Viel
- Univ Rennes, CNRS, IPR (Institut de Physique de Rennes) - UMR 6251, F-35000 Rennes, France
| | - Wolfgang Eisfeld
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| |
|
41
|
Panadés-Barrueta RL, Martínez-Núñez E, Peláez D. Specific Reaction Parameter Multigrid POTFIT (SRP-MGPF): Automatic Generation of Sum-of-Products Form Potential Energy Surfaces for Quantum Dynamical Calculations. Front Chem 2019; 7:576. [PMID: 31475138 PMCID: PMC6702682 DOI: 10.3389/fchem.2019.00576] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 06/04/2019] [Accepted: 07/30/2019] [Indexed: 11/13/2022] Open
Abstract
We present Specific Reaction Parameter Multigrid POTFIT (SRP-MGPF), an automated methodology for the generation of global potential energy surfaces (PES) and molecular property surfaces (e.g., dipole, polarizabilities) using a single random geometry as input. The SRP-MGPF workflow integrates: (i) a fully automated procedure for the global topographical characterization of an (intermolecular) PES based on the Transition State Search Using Chemical Dynamical Simulations (TSSCDS) family of methods; (ii) the global optimization of the parameters of a semiempirical Hamiltonian in order to reproduce a given level of electronic structure theory; and (iii) a tensor decomposition algorithm which turns the resulting SRP-PES into sum of products (Tucker) form with the Multigrid POTFIT algorithm. The latter is necessary for quantum dynamical studies within the Multiconfiguration Time-Dependent Hartree (MCTDH) quantum dynamics method. To demonstrate our approach, we have applied our methodology to the cis-trans isomerization reaction in HONO in full dimensionality (6D). The resulting SRP-PES has been validated through classical on-the-fly dynamical calculations, calculations of the lowest vibrational eigenstates of HONO, and high-energy wavepacket propagations.
Affiliation(s)
- Ramón L. Panadés-Barrueta
- Laboratoire de Physique des Lasers, Atomes et Molécules (PhLAM), Université de Lille, Villeneuve-d'Ascq, France
| | - Emilio Martínez-Núñez
- Departamento de Química Física, Facultade de Química, Universidade de Santiago de Compostela, Santiago de Compostela, Spain
| | - Daniel Peláez
- Laboratoire de Physique des Lasers, Atomes et Molécules (PhLAM), Université de Lille, Villeneuve-d'Ascq, France
| |
|
42
|
Ma S, Shang C, Liu ZP. Heterogeneous catalysis from structure to activity via SSW-NN method. J Chem Phys 2019. [DOI: 10.1063/1.5113673] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 11/14/2022] Open
Affiliation(s)
- Sicong Ma
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
| | - Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
| | - Zhi-Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
| |
|
43
|
Brorsen KR. Reproducing global potential energy surfaces with continuous-filter convolutional neural networks. J Chem Phys 2019; 150:204104. [DOI: 10.1063/1.5093908] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 12/22/2022] Open
Affiliation(s)
- Kurt R. Brorsen
- Department of Chemistry, University of Missouri, Columbia, Missouri 65203, USA
| |
|
44
|
Christensen AS, Faber FA, von Lilienfeld OA. Operators in quantum machine learning: Response properties in chemical space. J Chem Phys 2019; 150:064105. [DOI: 10.1063/1.5053562] [Citation(s) in RCA: 69] [Impact Index Per Article: 13.8] [Indexed: 11/14/2022] Open
Affiliation(s)
| | - Felix A. Faber
- Department of Chemistry, University of Basel, Basel, Switzerland
| | | |
|
45
|
McConnell SR, Kästner J. Instanton rate constant calculations using interpolated potential energy surfaces in nonredundant, rotationally and translationally invariant coordinates. J Comput Chem 2019; 40:866-874. [PMID: 30677168 DOI: 10.1002/jcc.25770] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 07/06/2018] [Revised: 11/25/2018] [Accepted: 11/27/2018] [Indexed: 11/07/2022]
Abstract
A trivial flaw in the utilization of artificial neural networks in interpolating chemical potential energy surfaces (PES) whose descriptors are Cartesian coordinates is their dependence on simple translations and rotations of the molecule under consideration. A different set of descriptors can be chosen to circumvent this problem: internuclear distances, inverse internuclear distances, or z-matrix coordinates are three such descriptors. The objective is to use an interpolated PES in instanton rate constant calculations; hence, information on the energy, gradient, and Hessian is required at coordinates in the vicinity of the tunneling path. Instanton theory relies on smoothly fitted Hessians; therefore, we use energy, gradients, and Hessians in the training procedure. A major challenge is presented in the proper back-transformation of the output gradients and Hessians from internal coordinates to Cartesian coordinates. We perform comparisons between our method, a previous approach, and on-the-fly rate constant calculations on the hydrogen abstraction from methanol and on the hydrogen addition to isocyanic acid. © 2018 Wiley Periodicals, Inc.
Affiliation(s)
- Sean R McConnell
- Institute for Theoretical Chemistry, University of Stuttgart, 70569, Stuttgart, Germany
| | - Johannes Kästner
- Institute for Theoretical Chemistry, University of Stuttgart, 70569, Stuttgart, Germany
| |
|
46
|
Abstract
This article discusses applications of Bayesian machine learning for quantum molecular dynamics.
Affiliation(s)
- R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, Canada
| |
|
47
|
Golub P, Manzhos S. Kinetic energy densities based on the fourth order gradient expansion: performance in different classes of materials and improvement via machine learning. Phys Chem Chem Phys 2018; 21:378-395. [PMID: 30525136 DOI: 10.1039/c8cp06433d] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Indexed: 11/21/2022]
Abstract
We study the performance of fourth-order gradient expansions of the kinetic energy density (KED) in semi-local kinetic energy functionals depending on the density-dependent variables. The formal fourth-order expansion is convergent for periodic systems and small molecules but does not improve over the second-order expansion (the Thomas-Fermi term plus one-ninth of the von Weizsäcker term). Linear fitting of the expansion coefficients somewhat improves on the formal expansion. The tuning of the fourth-order expansion coefficients allows for better reproducibility of the Kohn-Sham kinetic energy density than the tuning of the second-order expansion coefficients alone. The possibility of a much more accurate match with the Kohn-Sham kinetic energy density by using neural networks (NN) trained using the terms of the fourth-order expansion as density-dependent variables is demonstrated. We obtain ultra-low fitting errors without overfitting of NN parameters. Small single-hidden-layer neural networks can provide good accuracy in separate KED fits of each compound, while for joint fitting of KEDs of multiple compounds, multiple hidden layers were required to achieve good fit quality. The critical issue of data distribution is highlighted. We also show the critical role of pseudopotentials in the performance of the expansion: when the valence density decays too rapidly at the nucleus, as it does with some pseudopotentials, numerical instabilities can arise.
Affiliation(s)
- Pavlo Golub
- Department of Mechanical Engineering, National University of Singapore, Block EA #07-08, 9 Engineering Drive 1, Singapore 117576.
| | | |
|
48
|
Williams DMG, Eisfeld W. Neural network diabatization: A new ansatz for accurate high-dimensional coupled potential energy surfaces. J Chem Phys 2018; 149:204106. [DOI: 10.1063/1.5053664] [Citation(s) in RCA: 36] [Impact Index Per Article: 6.0] [Indexed: 11/15/2022] Open
Affiliation(s)
- David M. G. Williams
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| | - Wolfgang Eisfeld
- Theoretische Chemie, Universität Bielefeld, Postfach 100131, D-33501 Bielefeld, Germany
| |
|
49
|
Inverse Multiquadratic Functions as the Basis for the Rectangular Collocation Method to Solve the Vibrational Schrödinger Equation. MATHEMATICS 2018. [DOI: 10.3390/math6110253] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/16/2022]
Abstract
We explore the use of inverse multiquadratic (IMQ) functions as basis functions when solving the vibrational Schrödinger equation with the rectangular collocation method. The quality of the vibrational spectrum of formaldehyde (in six dimensions) is compared to that obtained using Gaussian basis functions when using different numbers of width-optimized IMQ functions. The effects of the ratio of the number of collocation points to the number of basis functions and of the choice of the IMQ exponent are studied. We show that the IMQ basis can be used with parameters where the IMQ function is not integrable. We find that the quality of the spectrum with IMQ basis functions is somewhat lower than that with a Gaussian basis when the basis size is large, and for a range of IMQ exponents. The IMQ functions are, however, advantageous when a small number of functions is used or with a small number of collocation points (e.g., when using square collocation).
|
50
|
Petty C, Spada RFK, Machado FBC, Poirier B. Accurate rovibrational energies of ozone isotopologues up to J = 10 utilizing artificial neural networks. J Chem Phys 2018; 149:024307. [DOI: 10.1063/1.5036602] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/14/2022] Open
Affiliation(s)
- Corey Petty
- Departamento de Química, Instituto Tecnológico de Aeronáutica, São José dos Campos, 12.228-900, SP, Brazil
| | - Rene F. K. Spada
- Departamento de Física, Instituto Tecnológico de Aeronáutica, São José dos Campos, 12.228-900, SP, Brazil
| | - Francisco B. C. Machado
- Departamento de Química, Instituto Tecnológico de Aeronáutica, São José dos Campos, 12.228-900, SP, Brazil
| | - Bill Poirier
- Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, Texas 79409, USA
| |
|