1. Shirani H, Hashemianzadeh SM. Quantum-level machine learning calculations of Levodopa. Comput Biol Chem 2024; 112:108146. [PMID: 39067350 DOI: 10.1016/j.compbiolchem.2024.108146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 07/30/2024]
Abstract
Many drug molecules contain functional groups that give rise to a torsional barrier for rotation around the bond linking the fragments. In medicinal chemistry and the pharmaceutical sciences, including drug design studies, accurate calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML method to predict the PESs of the antiparkinsonian drug molecule L-3,4-dihydroxyphenylalanine (Levodopa). The electronic energies and structural parameters calculated by density functional theory (DFT) using the ωB97X functional and all possible Pople basis sets indicated that the 6-31G(d) basis set, when used with the ωB97X functional, behaves similarly to the ANI-1x model. An investigation of the vibrational frequencies showed a linear correlation between the DFT and ML data. All ANI-1x calculations were completed in a very short computing time. From this perspective, we expect the ANI-1x dataset applied in this work to be efficient and effective in computational structure-based drug design studies.
Affiliation(s)
- Hossein Shirani
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran

2. Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
  - Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
  - Department of Chemistry, University of Florida, Gainesville, Florida

3. Chaney G, Golov A, van Roekeghem A, Carrasco J, Mingo N. Two-Step Growth Mechanism of the Solid Electrolyte Interphase in Argyrodyte/Li-Metal Contacts. ACS Appl Mater Interfaces 2024. [PMID: 38699998 DOI: 10.1021/acsami.4c02548] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/05/2024]
Abstract
The structure and growth of the solid electrolyte interphase (SEI) region between an electrolyte and an electrode is one of the most fundamental yet least understood phenomena in solid-state batteries. We present an atomistic simulation of the SEI growth for one of the currently promising solid electrolytes (Li6PS5Cl), based on ab initio-trained machine learning interatomic potentials, for over 30,000 atoms over 10 ns, well beyond the capabilities of conventional molecular dynamics. This unveils a two-step growth mechanism: a Li-argyrodite chemical reaction leading to the formation of an amorphous phase, followed by a kinetically slower crystallization of the reaction products into a 5Li2S·Li3P·LiCl solid solution. The simulation results support the recent, experimentally founded hypothesis of an indirect pathway of electrolyte reduction. These findings shed light on the intricate processes governing SEI evolution, providing a valuable foundation for the design and optimization of next-generation solid-state batteries.
Affiliation(s)
- Gracie Chaney
  - Université Grenoble Alpes, CEA, LITEN, 17 Rue des Martyrs, Grenoble 38054, France
- Andrey Golov
  - Centre for Cooperative Research on Alternative Energies (CIC EnergiGUNE), Basque Research and Technology Alliance (BRTA), Alava Technology Park, Albert Einstein 48, Vitoria-Gasteiz 01510, Spain
- Javier Carrasco
  - Centre for Cooperative Research on Alternative Energies (CIC EnergiGUNE), Basque Research and Technology Alliance (BRTA), Alava Technology Park, Albert Einstein 48, Vitoria-Gasteiz 01510, Spain
  - Ikerbasque, Basque Foundation for Science, Plaza Euskadi 5, Bilbao 48009, Spain
- Natalio Mingo
  - Université Grenoble Alpes, CEA, LITEN, 17 Rue des Martyrs, Grenoble 38054, France

4. Ge F, Wang R, Qu C, Zheng P, Nandi A, Conte R, Houston PL, Bowman JM, Dral PO. Tell Machine Learning Potentials What They Are Needed For: Simulation-Oriented Training Exemplified for Glycine. J Phys Chem Lett 2024; 15:4451-4460. [PMID: 38626460 DOI: 10.1021/acs.jpclett.4c00746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2024]
Abstract
Machine learning potentials (MLPs) are widely applied as an efficient alternative way to represent potential energy surfaces (PESs) in many chemical simulations. The MLPs are often evaluated with the root-mean-square errors on the test set drawn from the same distribution as the training data. Here, we systematically investigate the relationship between such test errors and the simulation accuracy with MLPs on an example of a full-dimensional, global PES for the glycine amino acid. Our results show that the errors in the test set do not unambiguously reflect the MLP performance in different simulation tasks, such as relative conformer energies, barriers, vibrational levels, and zero-point vibrational energies. We also offer an easily accessible solution for improving the MLP quality in a simulation-oriented manner, yielding the most precise relative conformer energies and barriers. This solution also passed the stringent test by diffusion Monte Carlo simulations.
Affiliation(s)
- Fuchun Ge
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Ran Wang
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Peikun Zheng
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
  - Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
- Riccardo Conte
  - Dipartimento di Chimica, Università degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Pavlo O Dral
  - State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China

5. Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field, the training of system-specific MLPs for reactive systems, with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to also be useful to other young researchers in the community. Our tutorial begins by introducing simple feedforward neural network (FNN) and kernel-based (Gaussian process regression, GPR) models, which are fitted to the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features are fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
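As a rough illustration of the first step described in this abstract (fitting a kernel model to the two-dimensional Müller-Brown potential), the following is a minimal sketch and not the tutorial's own code. It assumes the commonly quoted Müller-Brown parameters, and the names `muller_brown`, `rbf_kernel`, and `gpr_fit`, the grid, the kernel length scale, and the noise term are all illustrative choices. Only the GPR posterior mean with a fixed RBF kernel is implemented, using NumPy alone.

```python
import numpy as np

# Commonly quoted Müller-Brown parameters (an assumption of this sketch,
# not values taken from the tutorial itself).
A = np.array([-200.0, -100.0, -170.0, 15.0])
a = np.array([-1.0, -1.0, -6.5, 0.7])
b = np.array([0.0, 0.0, 11.0, 0.6])
c = np.array([-10.0, -10.0, -6.5, 0.7])
X0 = np.array([1.0, 0.0, -0.5, -1.0])
Y0 = np.array([0.0, 0.5, 1.5, 1.0])


def muller_brown(xy):
    """Müller-Brown energy for points xy of shape (n, 2)."""
    xy = np.atleast_2d(np.asarray(xy, dtype=float))
    dx = xy[:, :1] - X0  # shape (n, 4): one column per Gaussian term
    dy = xy[:, 1:] - Y0
    return np.sum(A * np.exp(a * dx**2 + b * dx * dy + c * dy**2), axis=1)


def rbf_kernel(p, q, length_scale):
    """Squared-exponential kernel between point sets p (n, 2) and q (m, 2)."""
    d2 = np.sum((p[:, None, :] - q[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)


def gpr_fit(train_xy, train_e, length_scale=0.3, noise=1e-6):
    """Return a function giving the GPR posterior mean (fixed hyperparameters)."""
    mean = train_e.mean()
    K = rbf_kernel(train_xy, train_xy, length_scale)
    K[np.diag_indices_from(K)] += noise  # small jitter for numerical stability
    weights = np.linalg.solve(K, train_e - mean)

    def predict(xy):
        xy = np.atleast_2d(np.asarray(xy, dtype=float))
        return mean + rbf_kernel(xy, train_xy, length_scale) @ weights

    return predict


# Train on a regular 15 x 15 grid covering the usual plotting window.
gx, gy = np.meshgrid(np.linspace(-1.5, 1.0, 15), np.linspace(-0.5, 2.0, 15))
train_xy = np.column_stack([gx.ravel(), gy.ravel()])
train_e = muller_brown(train_xy)
predict = gpr_fit(train_xy, train_e)
train_rmse = np.sqrt(np.mean((predict(train_xy) - train_e) ** 2))
print("training RMSE:", train_rmse)
```

In practice one would also optimize the kernel hyperparameters and fit forces alongside energies; both are omitted here to keep the sketch short.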
Affiliation(s)
- Xiaoliang Pan
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Ryan Snyder
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Jia-Ning Wang
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Chance Lander
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Carly Wickizer
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Richard Van
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
  - Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
- Andrew Chesney
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Yuanfei Xue
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuezhi Mao
  - Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
- Ye Mei
  - State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
  - Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Yihan Shao
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA

6. Kurnikov IV, Pereyaslavets L, Kamath G, Sakipov SN, Voronina E, Butin O, Illarionov A, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Ivahnenko I, Chen Y, Lock CB, Levitt M, Kornberg RD, Fain B. Neural Network Corrections to Intermolecular Interaction Terms of a Molecular Force Field Capture Nuclear Quantum Effects in Calculations of Liquid Thermodynamic Properties. J Chem Theory Comput 2024; 20:1347-1357. [PMID: 38240485 PMCID: PMC11042917 DOI: 10.1021/acs.jctc.3c00921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/19/2024]
Abstract
We incorporate nuclear quantum effects (NQE) in condensed matter simulations by introducing short-range neural network (NN) corrections to the ab initio fitted molecular force field ARROW. Force field NN corrections are fitted to average interaction energies and forces of molecular dimers, which are simulated using the Path Integral Molecular Dynamics (PIMD) technique with restrained centroid positions. The NN-corrected force field allows reproduction of the NQE for computed liquid water and methane properties such as density, radial distribution function (RDF), heat of evaporation (HVAP), and solvation free energy. Accounting for NQE through molecular force field corrections circumvents the need for explicit computationally expensive PIMD simulations in accurate calculations of the properties of chemical and biological systems. The accuracy and locality of pairwise NN NQE corrections indicate that this approach could be applicable to complex heterogeneous systems, such as proteins.
Affiliation(s)
- Igor V Kurnikov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Leonid Pereyaslavets
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ganesh Kamath
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Serzhan N Sakipov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ekaterina Voronina
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Oleg Butin
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Alexey Illarionov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor Leontyev
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Grzegorz Nawrocki
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Mikhail Darkhovskiy
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Michael Olevanov
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ilya Ivahnenko
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- YuChun Chen
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Christopher B Lock
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
- Michael Levitt
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Roger D Kornberg
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Boris Fain
  - InterX Inc., (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States

7. Spronk SA, Glick ZL, Metcalf DP, Sherrill CD, Cheney DL. A quantum chemical interaction energy dataset for accurately modeling protein-ligand interactions. Sci Data 2023; 10:619. [PMID: 37699937 PMCID: PMC10497680 DOI: 10.1038/s41597-023-02443-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 08/03/2023] [Indexed: 09/14/2023] Open
Abstract
Fast and accurate calculation of intermolecular interaction energies is desirable for understanding many chemical and biological processes, including the binding of small molecules to proteins. The Splinter ["Symmetry-adapted perturbation theory (SAPT0) protein-ligand interaction"] dataset has been created to facilitate the development and improvement of methods for performing such calculations. Molecular fragments representing commonly found substructures in proteins and small-molecule ligands were paired into >9000 unique dimers, assembled into numerous configurations using an approach designed to adequately cover the breadth of the dimers' potential energy surfaces while enhancing sampling in favorable regions. Approximately 1.5 million configurations of these dimers were randomly generated, and a structurally diverse subset of these was minimized to obtain an additional ~80 thousand local and global minima. For all >1.6 million configurations, SAPT0 calculations were performed with two basis sets to complete the dataset. It is expected that Splinter will be a useful benchmark dataset for training and testing various methods for the calculation of intermolecular interaction energies.
Affiliation(s)
- Steven A Spronk
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA
- Zachary L Glick
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Derek P Metcalf
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- C David Sherrill
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Daniel L Cheney
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA

8. Peralta A, Odriozola G. A Neural-Network-Optimized Hydrogen Peroxide Pairwise Additive Model for Classical Simulations. J Chem Theory Comput 2023; 19:4172-4181. [PMID: 37306692 PMCID: PMC10921400 DOI: 10.1021/acs.jctc.3c00287] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/13/2023] [Indexed: 06/13/2023]
Abstract
We have developed an all-atom pairwise additive model for hydrogen peroxide using an optimization procedure based on artificial neural networks (ANNs). The model is based on experimental molecular geometry and includes a dihedral potential that hinders the cis-type configuration and allows for crossing the trans one, defined between the planes that have the two oxygen atoms and each hydrogen. The model's parametrization is achieved by training simple ANNs to minimize a target function that measures the differences between various thermodynamic and transport properties and the corresponding experimental values. Finally, we evaluated a range of properties for the optimized model and its mixtures with SPC/E water, including bulk-liquid properties (density, thermal expansion coefficient, adiabatic compressibility, etc.) and properties of systems at equilibrium (vapor and liquid density, vapor pressure and composition, surface tension, etc.). Overall, we obtained good agreement with experimental data.
Affiliation(s)
- Alvaro Ramos Peralta
  - Área de Física de Procesos Irreversibles, División de Ciencias Básicas e Ingeniería, Universidad Autónoma Metropolitana-Azcapotzalco, Av. San Pablo 180, 02200 Ciudad de México, Mexico
- Gerardo Odriozola
  - Área de Física de Procesos Irreversibles, División de Ciencias Básicas e Ingeniería, Universidad Autónoma Metropolitana-Azcapotzalco, Av. San Pablo 180, 02200 Ciudad de México, Mexico

9. Ruth M, Gerbig D, Schreiner PR. Machine Learning for Bridging the Gap between Density Functional Theory and Coupled Cluster Energies. J Chem Theory Comput 2023. [PMID: 37418619 DOI: 10.1021/acs.jctc.3c00274] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/09/2023]
Abstract
Accurate electronic energies and properties are crucial for successful reaction design and mechanistic investigations. Computing energies and properties of molecular structures has proven extremely useful, and, with increasing computational power, the limits of high-level approaches (such as coupled cluster theory) are expanding to ever larger systems. However, because scaling is highly unfavorable, these methods are still not universally applicable to larger systems. To address the need for fast and accurate electronic energies of larger systems, we created a database of around 8000 small organic monomers (2000 dimers) optimized at the B3LYP-D3(BJ)/cc-pVTZ level of theory. This database also includes single-point energies computed at various levels of theory, including PBE1PBE, ωB97X, M06-2X, revTPSS, B3LYP, and BP86, for density functional theory as well as DLPNO-CCSD(T) and CCSD(T) for coupled cluster theory, all in conjunction with a cc-pVTZ basis. We used this database to train machine learning models based on graph neural networks using two different graph representations. Our models are able to make energy predictions from B3LYP-D3(BJ)/cc-pVTZ inputs to CCSD(T)/cc-pVTZ outputs with a mean absolute error of 0.78 and to DLPNO-CCSD(T)/cc-pVTZ with a mean absolute error of 0.50 and 0.18 kcal mol⁻¹ for monomers and dimers, respectively. The model for dimers was further validated on the S22 database, and the monomer model was tested on challenging systems, including those with highly conjugated or functionally complex molecules.
Affiliation(s)
- Marcel Ruth
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Dennis Gerbig
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany
- Peter R Schreiner
  - Institute of Organic Chemistry, Justus Liebig University, Heinrich-Buff-Ring 17, 35392 Giessen, Germany

10. Akher FB, Shu Y, Varga Z, Bhaumik S, Truhlar DG. Parametrically Managed Activation Function for Fitting a Neural Network Potential with Physical Behavior Enforced by a Low-Dimensional Potential. J Phys Chem A 2023. [PMID: 37307218 DOI: 10.1021/acs.jpca.3c02627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Machine-learned representations of potential energy surfaces generated in the output layer of a feedforward neural network are becoming increasingly popular. One difficulty with neural network output is that it is often unreliable in regions where training data is missing or sparse. Human-designed potentials often build in proper extrapolation behavior by choice of functional form. Because machine learning is very efficient, it is desirable to learn how to add human intelligence to machine-learned potentials in a convenient way. One example is the well-understood feature of interaction potentials that they vanish when subsystems are too far separated to interact. In this article, we present a way to add a new kind of activation function to a neural network to enforce low-dimensional constraints. In particular, the activation function depends parametrically on all of the input variables. We illustrate the use of this step by showing how it can force an interaction potential to go to zero at large subsystem separations without either inputting a specific functional form for the potential or adding data to the training set in the asymptotic region of geometries where the subsystems are separated. In the process of illustrating this, we present an improved set of potential energy surfaces for the 14 lowest ³A′ states of O₃. The method is more general than this example, and it may be used to add other low-dimensional knowledge or lower-level knowledge to machine-learned potentials. In addition to the O₃ example, we present a greater-generality method called parametrically managed diabatization by deep neural network (PM-DDNN) that is an improvement on our previously presented permutationally restrained diabatization by deep neural network (PR-DDNN).
Affiliation(s)
- Farideh Badichi Akher
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Yinan Shu
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Zoltan Varga
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Suman Bhaumik
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Donald G Truhlar
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States

11. Anstine D, Isayev O. Generative Models as an Emerging Paradigm in the Chemical Sciences. J Am Chem Soc 2023; 145:8736-8750. [PMID: 37052978 PMCID: PMC10141264 DOI: 10.1021/jacs.2c13467] [Citation(s) in RCA: 32] [Impact Index Per Article: 32.0] [Received: 12/17/2022] [Indexed: 04/14/2023]
Abstract
Traditional computational approaches to design chemical species are limited by the need to compute properties for a vast number of candidates, e.g., by discriminative modeling. Therefore, inverse design methods aim to start from the desired property and optimize a corresponding chemical structure. From a machine learning viewpoint, the inverse design problem can be addressed through so-called generative modeling. Mathematically, discriminative models are defined by learning the probability distribution function of properties given the molecular or material structure. In contrast, a generative model seeks to exploit the joint probability of a chemical species with target characteristics. The overarching idea of generative modeling is to implement a system that produces novel compounds that are expected to have a desired set of chemical features, effectively sidestepping issues found in the forward design process. In this contribution, we overview and critically analyze popular generative algorithms like generative adversarial networks, variational autoencoders, flow, and diffusion models. We highlight key differences between each of the models, provide insights into recent success stories, and discuss outstanding challenges for realizing generative modeling discovered solutions in chemical applications.
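The distinction this abstract draws between discriminative and generative models can be written compactly. The notation below is an assumption of this note, not the authors': x stands for a chemical structure, y for its property (or set of properties), and y* for a design target.

```latex
% Discriminative (forward) model: property distribution given a structure
p(y \mid x)

% Generative model: joint distribution over structures and properties
p(x, y) = p(x \mid y)\, p(y) = p(y \mid x)\, p(x)

% Inverse design: draw structures conditioned on the target property y^{*}
x \sim p(x \mid y = y^{*}) = \frac{p(x, y^{*})}{p(y^{*})}
```

Read this way, forward design evaluates the first expression for many candidate structures, whereas a generative model samples candidates directly from the last one.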
Affiliation(s)
- Dylan M. Anstine
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States

12.
Abstract
Advances in machine-learned interatomic potentials (MLIPs), such as those using neural networks, have resulted in short-range models that can infer interaction energies with near ab initio accuracy and orders of magnitude reduced computational cost. For many-atom systems, including macromolecules, biomolecules, and condensed matter, model accuracy can become reliant on the description of short- and long-range physical interactions. The latter terms can be difficult to incorporate into an MLIP framework. Recent research has produced numerous models with considerations for nonlocal electrostatic and dispersion interactions, leading to a large range of applications that can be addressed using MLIPs. In light of this, we present a Perspective focused on key methodologies and models being used where the presence of nonlocal physics and chemistry is crucial for describing system properties. The strategies covered include MLIPs augmented with dispersion corrections, electrostatics calculated with charges predicted from atomic environment descriptors, the use of self-consistency and message passing iterations to propagate nonlocal system information, and charges obtained via equilibration schemes. We aim to provide a pointed discussion to support the development of machine learning-based interatomic potentials for systems where contributions from only nearsighted terms are deficient.
Affiliation(s)
- Dylan M Anstine
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States

13. Li Y, Zhai Y, Li H. MLRNet: Combining the Physics-Motivated Potential Models with Neural Networks for Intermolecular Potential Energy Surface Construction. J Chem Theory Comput 2023; 19:1421-1431. [PMID: 36826225 DOI: 10.1021/acs.jctc.2c01049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2023]
Abstract
A physics-based machine learning model called MLRNet has been developed to construct the high-accuracy two-body intermolecular potential energy surface (IPES). The outputs of the neural network are integrated into the physically realistic Morse/long-range (MLR) function, which ensures that the MLRNet has meaningful extrapolation at both short and long ranges and solves the asymptotic problem in common neural network potential (NNP) models. The neural network representation of the MLR parameters is more flexible and more efficient than the polynomial expansion in the conventional mdMLR model, especially for systems containing nonrigid monomer(s). The present work illustrates the basic framework of the current MLRNet model, including (i) how to combine the physically meaningful MLR function with different possible NN structures, (ii) the preservation of permutation symmetry, and (iii) the predetermination of the long-range function uLR. We choose two realistic systems to demonstrate the performance of MLRNet: the three-dimensional IPES of CO2-He including the CO2 antisymmetric vibration Q3 and the six-dimensional IPES of the H2O-Ar system. In both cases, the fitting errors of the MLRNet are several times smaller than those of the conventional mdMLR model. Both short-range and long-range extrapolation tests were performed to illustrate the extrapolation ability of the MLRNet and its damping function version. Moreover, for the 6-D H2O-Ar system, the MLRNet only needs 1596 trainable parameters, which is almost equal to the number needed for the 5-D mdMLR model (1509) and half that needed for the PIP-NN model (3501) within similar accuracy, which illustrates the model efficiency in high-dimensional IPES fitting.
Affiliation(s)
- You Li
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| | - Yu Zhai
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| | - Hui Li
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| |
|
14
|
Fedik N, Zubatyuk R, Kulichenko M, Lubbers N, Smith JS, Nebgen B, Messerly R, Li YW, Boldyrev AI, Barros K, Isayev O, Tretiak S. Extending machine learning beyond interatomic potentials for predicting molecular properties. Nat Rev Chem 2022; 6:653-672. [PMID: 37117713 DOI: 10.1038/s41570-022-00416-3] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Accepted: 07/15/2022] [Indexed: 11/09/2022]
Abstract
Machine learning (ML) is becoming a method of choice for modelling complex chemical processes and materials. ML provides a surrogate model trained on a reference dataset that can be used to establish a relationship between a molecular structure and its chemical properties. This Review highlights developments in the use of ML to evaluate chemical properties such as partial atomic charges, dipole moments, spin and electron densities, and chemical bonding, as well as to obtain a reduced quantum-mechanical description. We overview several modern neural network architectures, their predictive capabilities, generality and transferability, and illustrate their applicability to various chemical properties. We emphasize that learned molecular representations resemble quantum-mechanical analogues, demonstrating the ability of the models to capture the underlying physics. We also discuss how ML models can describe non-local quantum effects. Finally, we conclude by compiling a list of available ML toolboxes, summarizing the unresolved challenges and presenting an outlook for future development. The observed trends demonstrate that this field is evolving towards physics-based models augmented by ML, which is accompanied by the development of new methods and the rapid growth of user-friendly ML frameworks for chemistry.
|
15
|
Dardzinski D, Yu M, Moayedpour S, Marom N. Best practices for first-principles simulations of epitaxial inorganic interfaces. J Phys Condens Matter 2022; 34:233002. [PMID: 35193122 DOI: 10.1088/1361-648x/ac577b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 02/22/2022] [Indexed: 06/14/2023]
Abstract
At an interface between two materials, physical properties and functionalities may be achieved that would not exist in either material alone. Epitaxial inorganic interfaces are at the heart of semiconductor, spintronic, and quantum devices. First-principles simulations based on density functional theory (DFT) can help elucidate the electronic and magnetic properties of interfaces and relate them to the structure and composition at the atomistic scale. Furthermore, DFT simulations can predict the structure and properties of candidate interfaces and guide experimental efforts in promising directions. However, DFT simulations of interfaces can be technically elaborate and computationally expensive. To help researchers embarking on such simulations, this review covers best practices for first-principles simulations of epitaxial inorganic interfaces, including DFT methods, interface model construction, interface structure prediction, and analysis and visualization tools.
Affiliation(s)
- Derek Dardzinski
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
| | - Maituo Yu
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
| | - Saeed Moayedpour
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
| | - Noa Marom
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
- Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15213, United States of America
| |
|
16
|
Oliveira AF, Da Silva JLF, Quiles MG. Molecular Property Prediction and Molecular Design Using a Supervised Grammar Variational Autoencoder. J Chem Inf Model 2022; 62:817-828. [PMID: 35174705 DOI: 10.1021/acs.jcim.1c01573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The most common applications of machine learning (ML) algorithms dealing with small molecules usually fall within two distinct domains, namely, the prediction of molecular properties and the design of novel molecules with some desirable property. Here we unite these applications under a single molecular representation and ML algorithm by modifying the grammar variational autoencoder (GVAE) model with the incorporation of property information into its training procedure, thus creating a supervised GVAE (SGVAE). Results indicate that the biased latent space generated by this approach can successfully be used to predict the molecular properties of the input molecules, produce novel and unique molecules with some desired property, and also estimate the properties of randomly sampled molecules. We illustrate these possibilities by sampling novel molecules from the latent space with specific values of the lowest unoccupied molecular orbital (LUMO) energy after training the model using the QM9 data set. Furthermore, the trained model is also used to predict the properties of a hold-out set, and the resulting mean absolute error (MAE) shows values close to chemical accuracy for the dipole moment and atomization energies, even outperforming ML models designed exclusively to predict molecular properties using SMILES as the molecular representation. Therefore, these results show that the proposed approach is a viable way to provide generative ML models with molecular property information in a way that the generation of novel molecules is likely to achieve better results, with the benefit that these new molecules can also have their molecular properties accurately predicted.
Affiliation(s)
- André F Oliveira
- Associate Laboratory for Computing and Applied Mathematics, National Institute for Space Research, P.O. Box 515, 12227-010, São José dos Campos, SP, Brazil
| | - Juarez L F Da Silva
- São Carlos Institute of Chemistry, University of São Paulo, P.O. Box 780, 13560-970, São Carlos, SP, Brazil
| | - Marcos G Quiles
- Institute of Science and Technology, Federal University of São Paulo, 12247-014, São José dos Campos, SP, Brazil
| |
|