1
Chen Y, Pios SV, Gelin MF, Chen L. Accelerating Molecular Vibrational Spectra Simulations with a Physically Informed Deep Learning Model. J Chem Theory Comput 2024; 20:4703-4710. [PMID: 38825857 DOI: 10.1021/acs.jctc.4c00173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/04/2024]
Abstract
In recent years, machine learning (ML) surrogate models have emerged as an indispensable tool to accelerate simulations of physical and chemical processes. However, there is still a lack of ML models that can accurately predict molecular vibrational spectra. Here, we present a highly efficient multitask ML surrogate model termed Vibrational Spectra Neural Network (VSpecNN), to accurately calculate infrared (IR) and Raman spectra based on dipole moments and polarizabilities obtained on-the-fly via ML-enhanced molecular dynamics simulations. The methodology is applied to pyrazine, a prototypical polyatomic chromophore. The VSpecNN-predicted energies are well within the chemical accuracy (1 kcal/mol), and the errors for VSpecNN-predicted forces are only half of those obtained from a popular high-performance ML model. Compared to the ab initio reference, the VSpecNN-predicted frequencies of IR and Raman spectra differ only by less than 5.87 cm⁻¹, and the intensities of IR spectra and the depolarization ratios of Raman spectra are well reproduced. The VSpecNN model developed in this work highlights the importance of constructing highly accurate neural network potentials for predicting molecular vibrational spectra.
Affiliation(s)
- Maxim F Gelin
  - School of Science, Hangzhou Dianzi University, Hangzhou 310018, China
2
Vijayanathan M, Vadakkepat AK, Mahendran KR, Sharaf A, Frandsen KEH, Bandyopadhyay D, Pillai MR, Soniya EV. Structural and mechanistic insights into Quinolone Synthase to address its functional promiscuity. Commun Biol 2024; 7:566. [PMID: 38745065 PMCID: PMC11093982 DOI: 10.1038/s42003-024-06152-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Accepted: 04/07/2024] [Indexed: 05/16/2024] Open
Abstract
Quinolone synthase from Aegle marmelos (AmQNS) is a type III polyketide synthase that yields therapeutically effective quinolone and acridone compounds. Addressing the structural and molecular underpinnings of AmQNS and its substrate interaction in terms of its high selectivity and specificity can aid in the development of numerous novel compounds. This paper presents a high-resolution AmQNS crystal structure and explains its mechanistic role in synthetic selectivity. Additionally, we provide a model framework to comprehend structural constraints on ketide insertion and postulate that AmQNS's steric and electrostatic selectivity plays a role in its ability to bind to various core substrates, resulting in its synthetic diversity. AmQNS prefers quinolone synthesis and can accommodate large substrates because of its wide active site entrance. However, our research suggests that acridone is exclusively synthesized in the presence of high malonyl-CoA concentrations. Potential implications of functionally relevant residue mutations were also investigated, which will assist in harnessing the benefits of mutations for targeted polyketide production. The pharmaceutical industry stands to gain from these findings as they expand the pool of potential drug candidates, and these methodologies can also be applied to additional promising enzymes.
Affiliation(s)
- Mallika Vijayanathan
  - Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
  - Department of Plant and Environment Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark
- Abhinav Koyamangalath Vadakkepat
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore, India
  - Department of Molecular and Cell Biology, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 7HB, UK
- Kozhinjampara R Mahendran
  - Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
- Abdoallah Sharaf
  - SequAna Core Facility, Department of Biology, University of Konstanz, Konstanz, Germany
  - Genetic Department, Faculty of Agriculture, Ain Shams University, Cairo, 11241, Egypt
- Kristian E H Frandsen
  - Department of Plant and Environment Sciences, University of Copenhagen, 1871, Frederiksberg C, Denmark
- Debashree Bandyopadhyay
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- M Radhakrishna Pillai
  - Cancer Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
- Eppurath Vasudevan Soniya
  - Transdisciplinary Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram, 695014, India
3
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci Adv 2024; 10:eadn4397. [PMID: 38579003 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T Unke
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Martin Stöhr
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Stefan Ganscha
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Thomas Unterthiner
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Hartmut Maennel
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Sergii Kashubin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Daniel Ahlin
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
  - BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
- Leonardo Medrano Sandonas
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Joshua T Berryman
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Klaus-Robert Müller
  - Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
  - BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
4
Montes de Oca-Estévez MJ, Valdés Á, Prosmiti R. A kernel-based machine learning potential and quantum vibrational state analysis of the cationic Ar hydride (Ar2H+). Phys Chem Chem Phys 2024; 26:7060-7071. [PMID: 38345626 DOI: 10.1039/d3cp05865d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
One of the most fascinating discoveries in recent years, in the cold and low pressure regions of the universe, was the detection of ArH+ and HeH+ species. The identification of such noble gas-containing molecules in space is the key to understanding noble gas chemistry. In the present work, we discuss the possibility of [Ar2H]+ existence as a potentially detectable molecule in the interstellar medium, providing new data on possible astronomical pathways and energetics of this compound. As a first step, a data-driven approach is proposed to construct a full 3D machine-learning potential energy surface (ML-PES) via the reproducing kernel Hilbert space (RKHS) method. The training and testing data sets are generated from CCSD(T)/CBS[56] computations, while a validation protocol is introduced to ensure the quality of the potential. In turn, the resulting ML-PES is employed to compute vibrational levels and molecular spectroscopic constants for the cation. In this way, the most common isotopologue in ISM, [36Ar2H]+, was characterized for the first time, while simultaneously, comparisons with previously reported values available for [40Ar2H]+ are discussed. Our present data could serve as a benchmark for future studies on this system, as well as on higher-order cationic Ar-hydrides of astrophysical interest.
Affiliation(s)
- María Judit Montes de Oca-Estévez
  - Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain
  - Atelgraphics S.L., Mota de Cuervo 42, 28043, Madrid, Spain
- Álvaro Valdés
  - Escuela de Física, Universidad Nacional de Colombia, Sede Medellín, A. A., 3840, Medellín, Colombia
- Rita Prosmiti
  - Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain
5
Langer MF, Frank JT, Knoop F. Stress and heat flux via automatic differentiation. J Chem Phys 2023; 159:174105. [PMID: 37921248 DOI: 10.1063/5.0155760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2023] [Accepted: 09/25/2023] [Indexed: 11/04/2023] Open
Abstract
Machine-learning potentials provide computationally efficient and accurate approximations of the Born-Oppenheimer potential energy surface. This potential determines many materials properties and simulation techniques usually require its gradients, in particular forces and stress for molecular dynamics, and heat flux for thermal transport properties. Recently developed potentials feature high body order and can include equivariant semi-local interactions through message-passing mechanisms. Due to their complex functional forms, they rely on automatic differentiation (AD), overcoming the need for manual implementations or finite-difference schemes to evaluate gradients. This study discusses how to use AD to efficiently obtain forces, stress, and heat flux for such potentials, and provides a model-independent implementation. The method is tested on the Lennard-Jones potential, and then applied to predict cohesive properties and thermal conductivity of tin selenide using an equivariant message-passing neural network potential.
Affiliation(s)
- Marcel F Langer
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - The NOMAD Laboratory at the Fritz Haber Institute of the Max Planck Society and Humboldt University, Berlin, Germany
- J Thorben Frank
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Florian Knoop
  - Theoretical Physics Division, Department of Physics, Chemistry and Biology (IFM), Linköping University, SE-581 83 Linköping, Sweden
6
Konings M, Harvey JN, Loreau J. Machine Learning Representations of the Three Lowest Adiabatic Electronic Potential Energy Surfaces for the ArH2+ Reactive System. J Phys Chem A 2023; 127:8083-8094. [PMID: 37748085 DOI: 10.1021/acs.jpca.3c04015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2023]
Abstract
In this work, we present Gaussian process regression machine learning representations of the three lowest coupled 2A' adiabatic electronic potential energy surfaces of the ArH2+ reactive system in full dimensionality. Additionally, the nonadiabatic coupling matrix elements were calculated. These adiabatic potentials and their nonadiabatic couplings are necessary ingredients in the theoretical investigation of the nonadiabatic reaction dynamics of the Ar + H2+ → ArH+ + H and Ar+ + H2 → ArH+ + H reactions, as well as the competing charge transfer process, Ar + H2+ ↔ Ar+ + H2. Accurate ab initio electronic structure calculations (ic-MRCI+Q/aug-cc-pVQZ), whereby the effect of spin-orbit coupling in Ar+ has been accounted for through the state interaction method, serve as input for the machine learning training process. The potential energy surfaces are fitted with high accuracies, with root-mean-square errors on the order of 10⁻⁷ eV for the three surfaces, which meet the requirements for chemical dynamics at low temperature. It was found that quite a large number of training points (of the order of 5000 ab initio points) are needed in order to achieve these accuracies due to the complex topography of these electronic surfaces.
Affiliation(s)
- Maarten Konings
  - Division of Quantum Chemistry and Physical Chemistry, Department of Chemistry, KU Leuven, Celestijnenlaan 200F, 3001 Leuven, Belgium
- Jeremy N Harvey
  - Division of Quantum Chemistry and Physical Chemistry, Department of Chemistry, KU Leuven, Celestijnenlaan 200F, 3001 Leuven, Belgium
- Jérôme Loreau
  - Division of Quantum Chemistry and Physical Chemistry, Department of Chemistry, KU Leuven, Celestijnenlaan 200F, 3001 Leuven, Belgium
7
Wang Y, Xu C, Li Z, Barati Farimani A. Denoise Pretraining on Nonequilibrium Molecules for Accurate and Transferable Neural Potentials. J Chem Theory Comput 2023; 19:5077-5087. [PMID: 37390120 PMCID: PMC10413865 DOI: 10.1021/acs.jctc.3c00289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Indexed: 07/02/2023]
Abstract
Recent advances in equivariant graph neural networks (GNNs) have made deep learning amenable to developing fast surrogate models to expensive ab initio quantum mechanics (QM) approaches for molecular potential predictions. However, building accurate and transferable potential models using GNNs remains challenging, as the data are greatly limited by the expensive computational costs and level of theory of QM methods, especially for large and complex molecular systems. In this work, we propose denoise pretraining on nonequilibrium molecular conformations to achieve more accurate and transferable GNN potential predictions. Specifically, atomic coordinates of sampled nonequilibrium conformations are perturbed by random noises, and GNNs are pretrained to denoise the perturbed molecular conformations which recovers the original coordinates. Rigorous experiments on multiple benchmarks reveal that pretraining significantly improves the accuracy of neural potentials. Furthermore, we show that the proposed pretraining approach is model-agnostic, as it improves the performance of different invariant and equivariant GNNs. Notably, our models pretrained on small molecules demonstrate remarkable transferability, improving performance when fine-tuned on diverse molecular systems, including different elements, charged molecules, biomolecules, and larger systems. These results highlight the potential for leveraging denoise pretraining approaches to build more generalizable neural potentials for complex molecular systems.
Affiliation(s)
- Yuyang Wang
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Changwen Xu
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Zijie Li
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Amir Barati Farimani
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
8
Sit MK, Das S, Samanta K. Semiclassical Dynamics on Machine-Learned Coupled Multireference Potential Energy Surfaces: Application to the Photodissociation of the Simplest Criegee Intermediate. J Phys Chem A 2023; 127:2376-2387. [PMID: 36856588 DOI: 10.1021/acs.jpca.2c07229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2023]
Abstract
Determination of high-dimensional potential energy surfaces (PESs) and nonadiabatic couplings have always been quite challenging. To this end, machine learning (ML) models, trained with a finite set of ab initio data, allow accurate prediction of such properties. To express the PESs in terms of atomic contributions is the cornerstone of any ML based technique because it can be easily scaled to large systems. In this work, we have constructed high fidelity PESs and nonadiabatic coupling terms at the CASSCF level of ab initio data using a machine learning technique, namely, kernel-ridge regression. Additional MRCI-level calculations were carried out to assess the quality of the PESs. We use these machine-learned PESs and nonadiabatic couplings to simulate excited-state molecular dynamics based on Tully's fewest-switches surface hopping method (FSSH). FSSH is a semiclassical method in which nuclei move on the PESs due to the electrons according to the laws of classical mechanics. Nonadiabatic effects are taken into account in terms of transitions between PESs. We apply this scheme to study the O-O photodissociation of the simplest Criegee intermediate (CH2OO). The FSSH trajectories were initiated on the lowest optically bright singlet excited state (S2) and propagated along the three most important internal coordinates, namely, O-O and C-O bond distances and the COO bond angle. Some of the trajectories end up on energetically lower PESs as a result of radiationless transfer through conical intersections. All of the trajectories lead to the dissociation of the O-O bond due to the dissociative nature of the excited PESs through one of the two dissociative channels. The simulation reveals that there is about 88.4% probability of dissociation through the lower channel leading to the H2CO (X1A1) and O (1D) products, whereas there is only 11.6% probability of dissociation through the upper channel leading to H2CO (a3A″) and O (3P) products.
Affiliation(s)
- Mahesh K Sit
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
- Subhasish Das
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
- Kousik Samanta
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
9
Pinheiro M, Zhang S, Dral PO, Barbatti M. WS22 database, Wigner Sampling and geometry interpolation for configurationally diverse molecular datasets. Sci Data 2023; 10:95. [PMID: 36792601 PMCID: PMC9931705 DOI: 10.1038/s41597-023-01998-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/31/2022] [Accepted: 02/01/2023] [Indexed: 02/17/2023] Open
Abstract
Multidimensional surfaces of quantum chemical properties, such as potential energies and dipole moments, are common targets for machine learning, requiring the development of robust and diverse databases extensively exploring molecular configurational spaces. Here we composed the WS22 database covering several quantum mechanical (QM) properties (including potential energies, forces, dipole moments, polarizabilities, HOMO, and LUMO energies) for ten flexible organic molecules of increasing complexity and with up to 22 atoms. This database consists of 1.18 million equilibrium and non-equilibrium geometries carefully sampled from Wigner distributions centered at different equilibrium conformations (either at the ground or excited electronic states) and further augmented with interpolated structures. The diversity of our datasets is demonstrated by visualizing the geometries distribution with dimensionality reduction as well as via comparison of statistical features of the QM properties with those available in existing datasets. Our sampling targets broader quantum mechanical distribution of the configurational space than provided by commonly used sampling through classical molecular dynamics, upping the challenge for machine learning models.
Affiliation(s)
- Max Pinheiro
  - Aix Marseille University, CNRS, ICR, Marseille, France
- Shuang Zhang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, China
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen, China
- Mario Barbatti
  - Aix Marseille University, CNRS, ICR, Marseille, France
  - Institut Universitaire de France, 75231, Paris, France
10
Prediction of interaction energy for rare gas dimers using machine learning approaches. J Chem Sci 2023. [DOI: 10.1007/s12039-023-02131-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/17/2023]
11
Arab F, Nazari F, Illas F. Artificial Neural Network-Derived Unified Six-Dimensional Potential Energy Surface for Tetra Atomic Isomers of the Biogenic [H, C, N, O] System. J Chem Theory Comput 2023; 19:1186-1196. [PMID: 36735891 PMCID: PMC9979606 DOI: 10.1021/acs.jctc.2c00915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Recognition of different structural patterns in different potential energy surface regions, such as in isomerizing quasilinear tetra atomic molecules, is important for understanding the details of underlying physics and chemistry. In this respect, using three variants of artificial neural networks (ANNs), we investigated the six-dimensional (6-D) singlet potential energy surfaces (PES) of tetra atomic isomers of the biogenic [H, C, N, O] system. At first, we constructed a separate ANN potential for each of the studied isomers. In the next step, a comparative assessment of the separate ANN models led to the setting up of a unified 6-D singlet PES equally and accurately describing all studied isomers. The constructed unified model yields relative energies comparable to those obtained either from the gold standard CCSD(T) method or from separate ANNs for each of the studied isomers. The accuracy of the unified singlet PES is on the order of 10⁻⁴ Hartrees (0.1 kcal/mol). The developed PES in this work captures the main features of nonlinear and quasilinear tetra atomic isomers of this biogenic system.
Affiliation(s)
- Fatemeh Arab
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Fariba Nazari
  - Department of Chemistry, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
  - Center of Climate Change and Global Warming, Institute for Advanced Studies in Basic Sciences, Zanjan 45137-66731, Iran
- Francesc Illas
  - Departament de Ciència de Materials i Química Física & Institut de Química Teòrica i Computacional (IQTCUB), Universitat de Barcelona, C/Martí i Franquès 1, 08028 Barcelona, Spain
12
Käser S, Richardson JO, Meuwly M. Transfer Learning for Affordable and High-Quality Tunneling Splittings from Instanton Calculations. J Chem Theory Comput 2022; 18:6840-6850. [DOI: 10.1021/acs.jctc.2c00790] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
13
Kiss O, Tacchino F, Vallecorsa S, Tavernelli I. Quantum neural networks force fields generation. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac7d3c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
Accurate molecular force fields are of paramount importance for the efficient implementation of molecular dynamics techniques at large scales. In the last decade, machine learning (ML) methods have demonstrated impressive performances in predicting accurate values for energy and forces when trained on finite size ensembles generated with ab initio techniques. At the same time, quantum computers have recently started to offer new viable computational paradigms to tackle such problems. On the one hand, quantum algorithms may notably be used to extend the reach of electronic structure calculations. On the other hand, quantum ML is also emerging as an alternative and promising path to quantum advantage. Here we follow this second route and establish a direct connection between classical and quantum solutions for learning neural network (NN) potentials. To this end, we design a quantum NN architecture and apply it successfully to different molecules of growing complexity. The quantum models exhibit larger effective dimension with respect to classical counterparts and can reach competitive performances, thus pointing towards potential quantum advantages in natural science applications via quantum ML.
14
Li Z, Meidani K, Yadav P, Barati Farimani A. Graph neural networks accelerated molecular dynamics. J Chem Phys 2022; 156:144103. [DOI: 10.1063/5.0083060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023] Open
Abstract
Molecular Dynamics (MD) simulation is a powerful tool for understanding the dynamics and structure of matter. Since the resolution of MD is atomic-scale, achieving long timescale simulations with femtosecond integration is very expensive. In each MD step, numerous iterative computations are performed to calculate energy based on different types of interaction and their corresponding spatial gradients. These repetitive computations can be learned and surrogated by a deep learning model, such as a Graph Neural Network (GNN). In this work, we developed a GNN Accelerated MD (GAMD) model that directly predicts forces, given the state of the system (atom positions, atom types), bypassing the evaluation of potential energy. By training the GNN on a variety of data sources (simulation data derived from classical MD and density functional theory), we show that GAMD can predict the dynamics of two typical molecular systems, Lennard-Jones system and water system, in the NVT ensemble with velocities regulated by a thermostat. We further show that GAMD’s learning and inference are agnostic to the scale, where it can scale to much larger systems at test time. We also perform a comprehensive benchmark test comparing our implementation of GAMD to production-level MD software, showing GAMD’s competitive performance on the large-scale simulation.
Affiliation(s)
- Zijie Li
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Kazem Meidani
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Prakarsh Yadav
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Amir Barati Farimani
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
15
Saito K, Hashimoto Y, Takayanagi T. Ring-Polymer Molecular Dynamics Calculations of Thermal Rate Coefficients and Branching Ratios for the Interstellar H3+ + CO → H2 + HCO+/HOC+ Reaction and Its Deuterated Analogue. J Phys Chem A 2021; 125:10750-10756. [PMID: 34918514 DOI: 10.1021/acs.jpca.1c09160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
The reaction between H3+ and CO is important in understanding the H3+ destruction mechanism in the interstellar medium. In this work, thermal rate coefficients for the H3+ + CO and D3+ + CO reactions are calculated using ring-polymer molecular dynamics (RPMD) on a high-level machine-learning potential energy surface. The RPMD results agree well with the classical molecular dynamics results, where nuclear quantum effects are completely ignored, whereas the agreement between the RPMD results and the previous quasi-classical trajectory is good only at low temperatures. The calculated [HCO+]/[HOC+] product branching ratios decrease as the temperature increases, and the product branching is exclusively determined by the initial collisional orientation, which governs the formation of an ion-dipole complex, H3+···CO or H3+···OC, that dissociates into H2 + HCO+ or H2 + HOC+, respectively, via a direct mechanism. However, the contribution of the indirect mechanism via the rearrangement between H3+···CO and H3+···OC increases as the temperature increases, although its absolute fraction is small.
Affiliation(s)
- Kohei Saito
  - Department of Chemistry, Saitama University, Shimo-Okubo 255, Sakura-ku, Saitama 338-8570, Japan
- Yu Hashimoto
  - Department of Chemistry, Saitama University, Shimo-Okubo 255, Sakura-ku, Saitama 338-8570, Japan
- Toshiyuki Takayanagi
  - Department of Chemistry, Saitama University, Shimo-Okubo 255, Sakura-ku, Saitama 338-8570, Japan
16
|
Gerrits N. Accurate Simulations of the Reaction of H2 on a Curved Pt Crystal through Machine Learning. J Phys Chem Lett 2021; 12:12157-12164. [PMID: 34918518 PMCID: PMC8724818 DOI: 10.1021/acs.jpclett.1c03395] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/15/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]
Abstract
Theoretical studies on molecule-metal surface reactions have so far been limited to small surface unit cells due to computational costs. Here, for the first time, molecular dynamics simulations on very large surface unit cells at the level of density functional theory are performed, allowing a direct comparison to experiments performed on a curved crystal. Specifically, the reaction of D2 on a curved Pt crystal is investigated with a neural network potential (NNP). The developed NNP is also accurate for surface unit cells considerably larger than those that have been included in the training data, allowing dynamical simulations on very large surface unit cells that otherwise would have been intractable. Important and complex aspects of the reaction mechanism are discovered, such as diffusion and a shadow effect of the step. Furthermore, conclusions from simulations on smaller surface unit cells cannot always be transferred to larger surface unit cells, limiting the applicability of theoretical studies of smaller surface unit cells to heterogeneous catalysts with small defect densities.
Affiliation(s)
- Nick Gerrits
  - Leiden Institute of Chemistry, Leiden University, Gorlaeus Laboratories, P.O. Box 9502, 2300 RA Leiden, The Netherlands
  - Research Group PLASMANT, Department of Chemistry, University of Antwerp, Universiteitsplein 1, BE-2610 Wilrijk, Antwerp, Belgium
|
17
|
Unke OT, Chmiela S, Gastegger M, Schütt KT, Sauceda HE, Müller KR. SpookyNet: Learning force fields with electronic degrees of freedom and nonlocal effects. Nat Commun 2021; 12:7273. [PMID: 34907176 PMCID: PMC8671403 DOI: 10.1038/s41467-021-27504-0] [Citation(s) in RCA: 87] [Impact Index Per Article: 29.0] [Received: 07/12/2021] [Accepted: 11/16/2021] [Indexed: 01/12/2023] Open
Abstract
Machine-learned force fields combine the accuracy of ab initio methods with the efficiency of conventional force fields. However, current machine-learned force fields typically ignore electronic degrees of freedom, such as the total charge or spin state, and assume chemical locality, which is problematic when molecules have inconsistent electronic states, or when nonlocal effects play a significant role. This work introduces SpookyNet, a deep neural network for constructing machine-learned force fields with explicit treatment of electronic degrees of freedom and nonlocality, modeled via self-attention in a transformer architecture. Chemically meaningful inductive biases and analytical corrections built into the network architecture allow it to properly model physical limits. SpookyNet improves upon the current state-of-the-art (or achieves similar performance) on popular quantum chemistry data sets. Notably, it is able to generalize across chemical and conformational space and can leverage the learned chemical insights, e.g. by predicting unknown spin states, thus helping to close a further important remaining gap for today's machine learning models in quantum chemistry.
Affiliation(s)
- Oliver T Unke
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623, Berlin, Germany
- Stefan Chmiela
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623, Berlin, Germany
- Kristof T Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- Huziel E Sauceda
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - BASLEARN, BASF-TU joint Lab, Technische Universität Berlin, 10587, Berlin, Germany
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany
  - BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Google Research, Brain team, Berlin, Germany
|
18
|
Xu M, Zhu T, Zhang JZH. Automated Construction of Neural Network Potential Energy Surface: The Enhanced Self-Organizing Incremental Neural Network Deep Potential Method. J Chem Inf Model 2021; 61:5425-5437. [PMID: 34752095 DOI: 10.1021/acs.jcim.1c01125] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Abstract
In recent years, the use of deep learning (neural network) potential energy surface (NNPES) in molecular dynamics simulation has experienced explosive growth as it can be as accurate as quantum chemistry methods while being as efficient as classical mechanic methods. However, the development of NNPES is highly nontrivial. In particular, it has been troubling to construct a dataset that is as small as possible yet can cover the target chemical space. In this work, an ESOINN-DP method is developed, which has the enhanced self-organizing incremental neural network (ESOINN) and a newly proposed error indicator at its core. With ESOINN-DP, one can construct the NNPES with little human intervention, and this method ensures that the constructed reference dataset covers the target chemical space with minimum redundancy. The performance of the ESOINN-DP method has been well validated by developing neural network potential energy surfaces for water clusters, tripeptides, and by de-redundancy of a sub-dataset of the ANI-1 database. We believe that the ESOINN-DP method provides a novel idea for the construction of NNPES and, especially, the reference datasets, and it can be used for molecular dynamics (MD) simulations of various gas-phase and condensed-phase chemical systems.
Affiliation(s)
- Mingyuan Xu
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- Tong Zhu
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- John Z H Zhang
  - Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
|
19
|
Pinheiro M, Ge F, Ferré N, Dral PO, Barbatti M. Choosing the right molecular machine learning potential. Chem Sci 2021; 12:14396-14413. [PMID: 34880991 PMCID: PMC8580106 DOI: 10.1039/d1sc03564a] [Citation(s) in RCA: 45] [Impact Index Per Article: 15.0] [Received: 06/29/2021] [Accepted: 09/14/2021] [Indexed: 11/21/2022] Open
Abstract
Quantum-chemistry simulations based on potential energy surfaces of molecules provide invaluable insight into the physicochemical processes at the atomistic level and yield such important observables as reaction rates and spectra. Machine learning potentials promise to significantly reduce the computational cost and hence enable otherwise unfeasible simulations. However, the surging number of such potentials begs the question of which one to choose or whether we still need to develop yet another one. Here, we address this question by evaluating the performance of popular machine learning potentials in terms of accuracy and computational cost. In addition, we deliver structured information for non-specialists in machine learning to guide them through the maze of acronyms, recognize each potential's main features, and judge what they could expect from each one.
Affiliation(s)
- Max Pinheiro
  - Aix Marseille University, CNRS, ICR Marseille France
- Fuchun Ge
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
- Nicolas Ferré
  - Aix Marseille University, CNRS, ICR Marseille France
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University China
- Mario Barbatti
  - Aix Marseille University, CNRS, ICR Marseille France
  - Institut Universitaire de France 75231 Paris France
|
20
|
Abstract
We demonstrate that a program synthesis approach based on a linear code representation can be used to generate algorithms that approximate the ground-state solutions of one-dimensional time-independent Schrödinger equations constructed with bound polynomial potential energy surfaces (PESs). Here, an algorithm is constructed as a linear series of instructions operating on a set of input vectors, matrices, and constants that define the problem characteristics, such as the PES. Discrete optimization is performed using simulated annealing in order to identify sequences of code-lines, operating on the program inputs that can reproduce the expected ground-state wavefunctions ψ(x) for a set of target PESs. The outcome of this optimization is not simply a mathematical function approximating ψ(x) but is, instead, a complete algorithm that converts the input vectors describing the system into a ground-state solution of the Schrödinger equation. These initial results point the way toward an alternative route for developing novel algorithms for quantum chemistry applications.
Affiliation(s)
- Scott Habershon
  - Department of Chemistry, University of Warwick, Coventry CV4 7AL, United Kingdom
|
21
|
Unke O, Chmiela S, Sauceda HE, Gastegger M, Poltavsky I, Schütt KT, Tkatchenko A, Müller KR. Machine Learning Force Fields. Chem Rev 2021; 121:10142-10186. [PMID: 33705118 PMCID: PMC8391964 DOI: 10.1021/acs.chemrev.0c01111] [Citation(s) in RCA: 360] [Impact Index Per Article: 120.0] [Received: 10/12/2020] [Indexed: 12/27/2022]
Abstract
In recent years, the use of machine learning (ML) in computational chemistry has enabled numerous advances previously out of reach due to the computational complexity of traditional electronic-structure methods. One of the most promising applications is the construction of ML-based force fields (FFs), with the aim to narrow the gap between the accuracy of ab initio methods and the efficiency of classical FFs. The key idea is to learn the statistical relation between chemical structure and potential energy without relying on a preconceived notion of fixed chemical bonds or knowledge about the relevant interactions. Such universal ML approximations are in principle only limited by the quality and quantity of the reference data used to train them. This review gives an overview of applications of ML-FFs and the chemical insights that can be obtained from them. The core concepts underlying ML-FFs are described in detail, and a step-by-step guide for constructing and testing them from scratch is given. The text concludes with a discussion of the challenges that remain to be overcome by the next generation of ML-FFs.
Affiliation(s)
- Oliver T. Unke
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Stefan Chmiela
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Huziel E. Sauceda
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
  - BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- Igor Poltavsky
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Kristof T. Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Alexandre Tkatchenko
  - Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Klaus-Robert Müller
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - BIFOLD-Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
  - Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
  - Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
  - Google Research, Brain Team, Berlin, Germany
|
22
|
Young TA, Johnston-Wood T, Deringer VL, Duarte F. A transferable active-learning strategy for reactive molecular force fields. Chem Sci 2021; 12:10944-10955. [PMID: 34476072 PMCID: PMC8372546 DOI: 10.1039/d1sc01825f] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/31/2021] [Accepted: 07/04/2021] [Indexed: 11/25/2022] Open
Abstract
Predictive molecular simulations require fast, accurate and reactive interatomic potentials. Machine learning offers a promising approach to construct such potentials by fitting energies and forces to high-level quantum-mechanical data, but doing so typically requires considerable human intervention and data volume. Here we show that, by leveraging hierarchical and active learning, accurate Gaussian Approximation Potential (GAP) models can be developed for diverse chemical systems in an autonomous manner, requiring only hundreds to a few thousand energy and gradient evaluations on a reference potential-energy surface. The approach uses separate intra- and inter-molecular fits and employs a prospective error metric to assess the accuracy of the potentials. We demonstrate applications to a range of molecular systems with relevance to computational organic chemistry: ranging from bulk solvents, a solvated metal ion and a metallocage onwards to chemical reactivity, including a bifurcating Diels-Alder reaction in the gas phase and non-equilibrium dynamics (a model SN2 reaction) in explicit solvent. The method provides a route to routinely generating machine-learned force fields for reactive molecular systems.
Affiliation(s)
- Tom A Young
  - Chemistry Research Laboratory, University of Oxford Mansfield Road Oxford OX1 3TA UK
- Tristan Johnston-Wood
  - Chemistry Research Laboratory, University of Oxford Mansfield Road Oxford OX1 3TA UK
- Volker L Deringer
  - Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford Oxford OX1 3QR UK
- Fernanda Duarte
  - Chemistry Research Laboratory, University of Oxford Mansfield Road Oxford OX1 3TA UK
|
23
|
Vazquez-Salazar LI, Boittier ED, Unke OT, Meuwly M. Impact of the Characteristics of Quantum Chemical Databases on Machine Learning Prediction of Tautomerization Energies. J Chem Theory Comput 2021; 17:4769-4785. [PMID: 34288675 DOI: 10.1021/acs.jctc.1c00363] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/27/2022]
Abstract
An essential aspect for adequate predictions of chemical properties by machine learning models is the database used for training them. However, studies that analyze how the content and structure of the databases used for training impact the prediction quality are scarce. In this work, we analyze and quantify the relationships learned by a machine learning model (Neural Network) trained on five different reference databases (QM9, PC9, ANI-1E, ANI-1, and ANI-1x) to predict tautomerization energies from molecules in Tautobase. For this, characteristics such as the number of heavy atoms in a molecule, number of atoms of a given element, bond composition, or initial geometry on the quality of the predictions are considered. The results indicate that training on a chemically diverse database is crucial for obtaining good results and also that conformational sampling can partly compensate for limited coverage of chemical diversity. The overall best-performing reference database (ANI-1x) performs on average by 1 kcal/mol better than PC9, which, however, contains about 2 orders of magnitude fewer reference structures. On the other hand, PC9 is chemically more diverse by a factor of ∼5 as quantified by the number of atom-in-molecule-based fragments (amons) it contains compared with the ANI family of databases. A quantitative measure for deficiencies is the Kullback-Leibler divergence between reference and target distributions. It is explicitly demonstrated that when certain types of bonds need to be covered in the target database (Tautobase) but are undersampled in the reference databases, the resulting predictions are poor. Examples of this include the poor performance of all databases analyzed to predict C(sp2)-C(sp2) double bonds close to heteroatoms and azoles containing N-N and N-O bonds. 
Analysis of the results with a Tree MAP algorithm provides deeper understanding of specific deficiencies in predicting tautomerization energies by the reference datasets due to inadequate coverage of chemical space. Capitalizing on this information can be used to either improve existing databases or generate new databases of sufficient diversity for a range of machine learning (ML) applications in chemistry.
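The Kullback-Leibler divergence mentioned in this abstract can be illustrated with a minimal sketch. The bond-type histograms below are invented for illustration and are not data from the paper; the direction of the divergence (target distribution P against reference distribution Q) and the small floor `eps` used for empty reference bins are likewise assumptions, not the authors' stated procedure.

```python
import math

def kl_divergence(p_counts, q_counts, eps=1e-12):
    """Discrete D_KL(P || Q) in nats, from two histograms over the same bins.

    p_counts: counts for the target distribution P (hypothetical here).
    q_counts: counts for the reference distribution Q (hypothetical here).
    eps: floor applied to Q so that empty reference bins do not divide by zero
         (an assumption of this sketch).
    """
    if len(p_counts) != len(q_counts):
        raise ValueError("histograms must share the same bins")
    p_total = float(sum(p_counts))
    q_total = float(sum(q_counts))
    kl = 0.0
    for p_c, q_c in zip(p_counts, q_counts):
        p = p_c / p_total
        q = max(q_c / q_total, eps)
        if p > 0.0:  # bins with p = 0 contribute nothing
            kl += p * math.log(p / q)
    return kl

# Invented bond-type histograms: one target set and two candidate reference sets.
target = [40, 30, 20, 10]
well_covered = [400, 300, 200, 100]   # same proportions as the target
undersampled = [450, 350, 195, 5]     # last bond type is rare in the reference

print(kl_divergence(target, well_covered))
print(kl_divergence(target, undersampled))
```

In this toy case the reference set that undersamples one bond type yields the larger divergence, which is the kind of coverage deficiency the abstract describes.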
Affiliation(s)
- Eric D Boittier
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Oliver T Unke
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
  - DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
|
24
|
Miksch AM, Morawietz T, Kästner J, Urban A, Artrith N. Strategies for the construction of machine-learning potentials for accurate and efficient atomic-scale simulations. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abfd96] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 11/12/2022] Open
Abstract
Recent advances in machine-learning interatomic potentials have enabled the efficient modeling of complex atomistic systems with an accuracy that is comparable to that of conventional quantum-mechanics based methods. At the same time, the construction of new machine-learning potentials can seem a daunting task, as it involves data-science techniques that are not yet common in chemistry and materials science. Here, we provide a tutorial-style overview of strategies and best practices for the construction of artificial neural network (ANN) potentials. We illustrate the most important aspects of (a) data collection, (b) model selection, (c) training and validation, and (d) testing and refinement of ANN potentials on the basis of practical examples. Current research in the areas of active learning and delta learning are also discussed in the context of ANN potentials. This tutorial review aims at equipping computational chemists and materials scientists with the required background knowledge for ANN potential construction and application, with the intention to accelerate the adoption of the method, so that it can facilitate exciting research that would otherwise be challenging with conventional strategies.
|
25
|
Makoś MZ, Verma N, Larson EC, Freindorf M, Kraka E. Generative adversarial networks for transition state geometry prediction. J Chem Phys 2021; 155:024116. [PMID: 34266275 DOI: 10.1063/5.0055094] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 11/14/2022] Open
Abstract
This work introduces a novel application of generative adversarial networks (GANs) for the prediction of starting geometries in transition state (TS) searches based on the geometries of reactants and products. The multi-dimensional potential energy space of a chemical reaction often complicates the location of a starting TS geometry, leading to the correct TS combining reactants and products in question. The proposed TS-GAN efficiently maps the space between reactants and products and generates reliable TS guess geometries, and it can be easily combined with any quantum chemical software package performing geometry optimizations. The TS-GAN was trained and applied to generate TS guess structures for typical chemical reactions, such as hydrogen migration, isomerization, and transition metal-catalyzed reactions. The performance of the TS-GAN was directly compared to that of classical approaches, proving its high accuracy and efficiency. The current TS-GAN can be extended to any dataset that contains sufficient chemical reactions for training. The software is freely available for training, experimentation, and prediction at https://github.com/ekraka/TS-GAN.
Affiliation(s)
- Małgorzata Z Makoś
  - Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, USA
- Niraj Verma
  - Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, USA
- Eric C Larson
  - Computer Science Department, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, USA
- Marek Freindorf
  - Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, USA
- Elfi Kraka
  - Computational and Theoretical Chemistry Group (CATCO), Department of Chemistry, Southern Methodist University, 3215 Daniel Avenue, Dallas, Texas 75275-0314, USA
|
26
|
Westermayr J, Gastegger M, Schütt KT, Maurer RJ. Perspective on integrating machine learning into computational chemistry and materials science. J Chem Phys 2021; 154:230903. [PMID: 34241249 DOI: 10.1063/5.0047760] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Indexed: 01/12/2023] Open
Abstract
Machine learning (ML) methods are being used in almost every conceivable area of electronic structure theory and molecular simulation. In particular, ML has become firmly established in the construction of high-dimensional interatomic potentials. Not a day goes by without another proof of principle being published on how ML methods can represent and predict quantum mechanical properties (be they observable, such as molecular polarizabilities, or not, such as atomic charges). As ML is becoming pervasive in electronic structure theory and molecular simulation, we provide an overview of how atomistic computational modeling is being transformed by the incorporation of ML approaches. From the perspective of the practitioner in the field, we assess how common workflows to predict structure, dynamics, and spectroscopy are affected by ML. Finally, we discuss how a tighter and lasting integration of ML methods with computational chemistry and materials science can be achieved and what it will mean for research practice, software development, and postgraduate training.
Affiliation(s)
- Julia Westermayr
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
- Michael Gastegger
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Kristof T Schütt
  - Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Reinhard J Maurer
  - Department of Chemistry, University of Warwick, Gibbet Hill Road, Coventry CV4 7AL, United Kingdom
|
27
|
Käser S, Boittier ED, Upadhyay M, Meuwly M. Transfer Learning to CCSD(T): Accurate Anharmonic Frequencies from Machine Learning Models. J Chem Theory Comput 2021; 17:3687-3699. [PMID: 33960787 DOI: 10.1021/acs.jctc.1c00249] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/21/2022]
Abstract
The calculation of the anharmonic modes of small- to medium-sized molecules for assigning experimentally measured frequencies to the corresponding type of molecular motions is computationally challenging at sufficiently high levels of quantum chemical theory. Here, a practical and affordable way to calculate coupled-cluster quality anharmonic frequencies using second-order vibrational perturbation theory (VPT2) from machine-learned models is presented. The approach, referenced as "NN + VPT2", uses a high-dimensional neural network (PhysNet) to learn potential energy surfaces (PESs) at different levels of theory from which harmonic and VPT2 frequencies can be efficiently determined. The NN + VPT2 approach is applied to eight small- to medium-sized molecules (H2CO, trans-HONO, HCOOH, CH3OH, CH3CHO, CH3NO2, CH3COOH, and CH3CONH2) and frequencies are reported from NN-learned models at the MP2/aug-cc-pVTZ, CCSD(T)/aug-cc-pVTZ, and CCSD(T)-F12/aug-cc-pVTZ-F12 levels of theory. For the largest molecules and at the highest levels of theory, transfer learning (TL) is used to determine the necessary full-dimensional, near-equilibrium PESs. Overall, NN + VPT2 yields anharmonic frequencies to within 20 cm⁻¹ of experimentally determined frequencies for close to 90% of the modes for the highest quality PES available and to within 10 cm⁻¹ for more than 60% of the modes. For the MP2 PESs only ∼60% of the NN + VPT2 frequencies were within 20 cm⁻¹ of the experiment, with outliers up to ∼150 cm⁻¹, compared to the experiment. It is also demonstrated that the approach provides correct assignments for strongly interacting modes such as the OH bending and the OH torsional modes in formic acid monomer and the CO-stretch and OH-bend mode in acetic acid.
Affiliation(s)
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Eric D Boittier
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Meenu Upadhyay
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
|
28
|
Takayanagi T. Application of Reaction Path Search Calculations to Potential Energy Surface Fits. J Phys Chem A 2021; 125:3994-4002. [PMID: 33915053 DOI: 10.1021/acs.jpca.1c01512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
There has been significant progress in recent years in the use of machine learning techniques to model high-dimensional reactive potential energy surfaces using large-scale data obtained from ab initio electronic structure calculations. In these methods, the strategy used to gather data becomes a key issue as the molecular size increases. In this work, we examine the applicability of the reaction path search algorithm implemented in the Global Reaction Route Mapping (GRRM) code as a data-gathering approach. The electronic energies and gradients sampled by using the GRRM calculation are directly used in potential energy surface fitting to a permutationally invariant polynomial function. This simple approach was applied to the HNS and HCNO reaction systems, and we found that the fitted potential energy surfaces reasonably reproduce the features of the electronic structure calculations used in the GRRM calculations. This suggests that the GRRM sampling scheme can be used to construct an initial potential energy surface.
Affiliation(s)
- Toshiyuki Takayanagi
  - Department of Chemistry, Saitama University, Shimo-Okubo 255, Saitama City, Saitama 338-8570, Japan
|
29
|
Saito K, Sugiura Y, Miyazaki T, Takahashi Y, Takayanagi T. Quantum calculations of the photoelectron spectra of the OH-·NH3 anion: implications for OH + NH3 → H2O + NH2 reaction dynamics. Phys Chem Chem Phys 2021; 23:6950-6958. [PMID: 33729225 DOI: 10.1039/d0cp06514e] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/09/2023]
Abstract
We present the results of quantum dynamics calculations for analyzing the experimentally measured photoelectron spectra of the OH-·NH3 anion complex. Detachment of an excess electron of OH-·NH3 initially produces a molecular arrangement, which is close to the transition-state structure of the neutral OH + NH3 → H2O + NH2 hydrogen abstraction reaction due to the Franck-Condon principle, and thus finally leads to the OH + NH3 or H2O + NH2 asymptotic channel. We used both the path integral method and the reduced-dimensionality quantum wave packet method to simulate the photoelectron spectra of the OH-·NH3 anion. The calculated spectra were found to be in qualitative agreement with the experimental spectra. It was found that the photodetached complex mainly dissociates into the OH + NH3 channel; however, we found that the hydrogen exchange process also contributes to the photodetachment spectra.
Affiliation(s)
- Kohei Saito
- Department of Chemistry, Saitama University, Shimo-Okubo 255, Sakura-ku, Saitama City, Saitama 338-8570, Japan.
|
30
|
Affiliation(s)
- Kazuo Takatsuka
- Fukui Institute for Fundamental Chemistry, Kyoto University, 606-8103 Kyoto, Japan
- Yasuki Arasaki
- Fukui Institute for Fundamental Chemistry, Kyoto University, 606-8103 Kyoto, Japan
|
31
|
Panadés-Barrueta RL, Peláez D. Low-rank sum-of-products finite-basis-representation (SOP-FBR) of potential energy surfaces. J Chem Phys 2020; 153:234110. [PMID: 33353311 DOI: 10.1063/5.0027143] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023] Open
Abstract
The sum-of-products finite-basis-representation (SOP-FBR) approach for the automated multidimensional fit of potential energy surfaces (PESs) is presented. In its current implementation, the method yields a PES in the so-called Tucker sum-of-products form, but it is not restricted to this specific ansatz. The novelty of our algorithm lies in the fact that the fit is performed in terms of a direct product of a Schmidt basis, also known as natural potentials. These encode in a non-trivial way all the physics of the problem and, hence, circumvent the usual extra ad hoc and a posteriori adjustments (e.g., damping functions) of the fitted PES. Moreover, we avoid the intermediate refitting stage common to other tensor-decomposition methods, typically used in the context of nuclear quantum dynamics. The resulting SOP-FBR PES is analytical and differentiable ad infinitum. Our ansatz is fully general and can be used in combination with most (molecular) dynamics codes. In particular, it has been interfaced and extensively tested with the Heidelberg implementation of the multiconfiguration time-dependent Hartree quantum dynamical software package.
Affiliation(s)
- Ramón L Panadés-Barrueta
- Laboratoire de Physique des Lasers, Atomes et Molécules (PhLAM), Université Lille 1, Villeneuve d'Ascq Cedex, France
- Daniel Peláez
- Institut des Sciences Moléculaires d'Orsay (ISMO) - UMR 8214, Bât. 520, Université Paris-Saclay, 91405 Orsay Cedex, France
|
32
|
Koner D, Meuwly M. Permutationally Invariant, Reproducing Kernel-Based Potential Energy Surfaces for Polyatomic Molecules: From Formaldehyde to Acetone. J Chem Theory Comput 2020; 16:5474-5484. [DOI: 10.1021/acs.jctc.0c00535] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 01/27/2023]
Affiliation(s)
- Debasish Koner
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
- Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, 4056 Basel, Switzerland
|
33
|
Exploring the Mechanism of Catalysis with the Unified Reaction Valley Approach (URVA)—A Review. Catalysts 2020. [DOI: 10.3390/catal10060691] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/15/2022] Open
Abstract
The unified reaction valley approach (URVA) differs from mainstream mechanistic studies, as it describes a chemical reaction via the reaction path and the surrounding reaction valley on the potential energy surface from the van der Waals region to the transition state and far out into the exit channel, where the products are located. The key feature of URVA is the focus on the curving of the reaction path. Moving along the reaction path, any electronic structure change of the reacting molecules is registered by a change in their normal vibrational modes and their coupling with the path, which recovers the curvature of the reaction path. This leads to a unique curvature profile for each chemical reaction, with curvature minima reflecting minimal change and curvature maxima marking the location of important chemical events such as bond breaking/forming, charge polarization and transfer, rehybridization, etc. A unique decomposition of the path curvature into internal coordinate components provides comprehensive insights into the origins of the chemical changes taking place. After presenting the theoretical background of URVA, we discuss its application to four diverse catalytic processes: (i) the Rh-catalyzed methanol carbonylation (the Monsanto process); (ii) the Sharpless epoxidation of allylic alcohols (transition to heterogeneous catalysis); (iii) the Au(I)-assisted [3,3]-sigmatropic rearrangement of allyl acetate; and (iv) the Bacillus subtilis chorismate mutase-catalyzed Claisen rearrangement. We show how URVA leads to a new protocol for fine-tuning of existing catalysts and the design of new efficient and eco-friendly catalysts. At the end of this article the pURVA software is introduced. The overall goal of this article is to introduce to the chemical community a new protocol for fine-tuning existing catalytic reactions while aiding in the design of modern and environmentally friendly catalysts.
|
34
|
Dral PO, Owens A, Dral A, Csányi G. Hierarchical machine learning of potential energy surfaces. J Chem Phys 2020; 152:204110. [DOI: 10.1063/5.0006498] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 12/20/2022] Open
Affiliation(s)
- Pavlo O. Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Alec Owens
- Department of Physics and Astronomy, University College London, Gower Street, WC1E 6BT London, United Kingdom
- Alexey Dral
- BigData Team, 1A Tormoznoye Shosse Off 17, Yaroslavl, Yaroslavl 150022, Russian Federation
- Gábor Csányi
- Department of Engineering, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
|