101
Manzhos S, Carrington T. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem Rev 2020; 121:10187-10217. [PMID: 33021368 DOI: 10.1021/acs.chemrev.0c00665] [Citation(s) in RCA: 119] [Impact Index Per Article: 29.8] [Indexed: 12/16/2022]
Abstract
We review progress in neural network (NN)-based methods for the construction of interatomic potentials from discrete samples (such as ab initio energies) for applications in classical and quantum dynamics including reaction dynamics and computational spectroscopy. The main focus is on methods for building molecular potential energy surfaces (PES) in internal coordinates that explicitly include all many-body contributions, even though some of the methods we review limit the degree of coupling, due either to a desire to limit computational cost or to limited data. Explicit and direct treatment of all many-body contributions is only practical for sufficiently small molecules, which are therefore our primary focus. This includes small molecules on surfaces. We consider direct, single NN PES fitting as well as more complex methods that impose structure (such as a multibody representation) on the PES function, either through the architecture of one NN or by using multiple NNs. We show how NNs are effective in building representations with low-dimensional functions including dimensionality reduction. We consider NN-based approaches to build PESs in the sums-of-product form important for quantum dynamics, ways to treat symmetry, and issues related to sampling data distributions and the relation between PES errors and errors in observables. We highlight combinations of NNs with other ideas such as permutationally invariant polynomials or sums of environment-dependent atomic contributions, which have recently emerged as powerful tools for building highly accurate PESs for relatively large molecular and reactive systems.
Affiliation(s)
- Sergei Manzhos
- Centre Énergie Matériaux Télécommunications, Institut National de la Recherche Scientifique, 1650, Boulevard Lionel-Boulet, Varennes, Québec City, Québec J3X 1S2, Canada
- Tucker Carrington
- Chemistry Department, Queen's University, Kingston Ontario K7L 3N6, Canada
102
Nigam J, Pozdnyakov S, Ceriotti M. Recursive evaluation and iterative contraction of N-body equivariant features. J Chem Phys 2020; 153:121101. [PMID: 33003734 DOI: 10.1063/5.0021116] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 12/17/2022]
Abstract
Mapping an atomistic configuration to a symmetrized N-point correlation of a field associated with the atomic positions (e.g., an atomic density) has emerged as an elegant and effective solution to represent structures as the input of machine-learning algorithms. While it has become clear that low-order density correlations do not provide a complete representation of an atomic environment, the exponential increase in the number of possible N-body invariants makes it difficult to design a concise and effective representation. We discuss how to exploit recursion relations between equivariant features of different order (generalizations of N-body invariants that provide a complete representation of the symmetries of improper rotations) to compute high-order terms efficiently. In combination with the automatic selection of the most expressive combination of features at each order, this approach provides a conceptual and practical framework to generate systematically improvable, symmetry adapted representations for atomistic machine learning.
Affiliation(s)
- Jigyasa Nigam
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Sergey Pozdnyakov
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
103
Westermayr J, Marquetand P. Machine learning and excited-state molecular dynamics. Machine Learning: Science and Technology 2020. [DOI: 10.1088/2632-2153/ab9c3e] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/21/2022]
104
Chen MS, Zuehlsdorff TJ, Morawietz T, Isborn CM, Markland TE. Exploiting Machine Learning to Efficiently Predict Multidimensional Optical Spectra in Complex Environments. J Phys Chem Lett 2020; 11:7559-7568. [PMID: 32808797 DOI: 10.1021/acs.jpclett.0c02168] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Indexed: 05/26/2023]
Abstract
The excited-state dynamics of chromophores in complex environments determine a range of vital biological and energy capture processes. Time-resolved, multidimensional optical spectroscopies provide a key tool to investigate these processes. Although theory has the potential to decode these spectra in terms of the electronic and atomistic dynamics, the need for large numbers of excited-state electronic structure calculations severely limits first-principles predictions of multidimensional optical spectra for chromophores in the condensed phase. Here, we leverage the locality of chromophore excitations to develop machine learning models to predict the excited-state energy gap of chromophores in complex environments for efficiently constructing linear and multidimensional optical spectra. By analyzing the performance of these models, which span a hierarchy of physical approximations, across a range of chromophore-environment interaction strengths, we provide strategies for the construction of machine learning models that greatly accelerate the calculation of multidimensional optical spectra from first principles.
Affiliation(s)
- Michael S Chen
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Tim J Zuehlsdorff
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
- Tobias Morawietz
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Christine M Isborn
- Chemistry and Chemical Biology, University of California Merced, Merced, California 95343, United States
- Thomas E Markland
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
105
Low K, Kobayashi R, Izgorodina EI. The effect of descriptor choice in machine learning models for ionic liquid melting point prediction. J Chem Phys 2020; 153:104101. [DOI: 10.1063/5.0016289] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 12/16/2022]
Affiliation(s)
- Kaycee Low
- Monash Computational Chemistry Group, Monash University, 17 Rainforest Walk, Clayton, VIC 3800, Australia
- Rika Kobayashi
- ANU Supercomputer Facility, Leonard Huxley Building 56, Mills Road, Canberra, ACT 2601, Australia
- Ekaterina I. Izgorodina
- Monash Computational Chemistry Group, Monash University, 17 Rainforest Walk, Clayton, VIC 3800, Australia
106
Boattini E, Bezem N, Punnathanam SN, Smallenburg F, Filion L. Modeling of many-body interactions between elastic spheres through symmetry functions. J Chem Phys 2020; 153:064902. [DOI: 10.1063/5.0015606] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/21/2022]
Affiliation(s)
- Emanuele Boattini
- Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, The Netherlands
- Nina Bezem
- Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, The Netherlands
- Sudeep N. Punnathanam
- Department of Chemical Engineering, Indian Institute of Science, Bangalore 560012, Karnataka, India
- Frank Smallenburg
- Université Paris-Saclay, CNRS, Laboratoire de Physique des Solides, 91405 Orsay, France
- Laura Filion
- Soft Condensed Matter, Debye Institute for Nanomaterials Science, Utrecht University, Utrecht, The Netherlands
107
Yanxon H, Zagaceta D, Wood BC, Zhu Q. Neural network potential from bispectrum components: A case study on crystalline silicon. J Chem Phys 2020; 153:054118. [PMID: 32770884 DOI: 10.1063/5.0014677] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
Abstract
In this article, we present a systematic study on developing machine learning force fields (MLFFs) for crystalline silicon. While the mainstream approach to fitting an MLFF is to use a small and localized training set from molecular dynamics simulations, such a set is unlikely to cover the global features of the potential energy surface. To remedy this issue, we used randomly generated symmetrical crystal structures to train a more general Si-MLFF. Furthermore, we performed substantial benchmarks among different choices of material descriptors and regression techniques on two different sets of silicon data. Our results show that neural network potential fitting with bispectrum coefficients as descriptors is a feasible method for obtaining accurate and transferable MLFFs.
Affiliation(s)
- Howard Yanxon
- Department of Physics and Astronomy, University of Nevada, Las Vegas, Nevada 89154, USA
- David Zagaceta
- Department of Physics and Astronomy, University of Nevada, Las Vegas, Nevada 89154, USA
- Brandon C Wood
- Materials Science Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Qiang Zhu
- Department of Physics and Astronomy, University of Nevada, Las Vegas, Nevada 89154, USA
108
Pattnaik P, Raghunathan S, Kalluri T, Bhimalapuram P, Jawahar CV, Priyakumar UD. Machine Learning for Accurate Force Calculations in Molecular Dynamics Simulations. J Phys Chem A 2020; 124:6954-6967. [DOI: 10.1021/acs.jpca.0c03926] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 12/13/2022]
Affiliation(s)
- Punyaslok Pattnaik
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- Shampa Raghunathan
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- Tarun Kalluri
- Center for Visual Information Technology, KCIS, International Institute of Information Technology, Hyderabad 500 032, India
- Prabhakar Bhimalapuram
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
- C. V. Jawahar
- Center for Visual Information Technology, KCIS, International Institute of Information Technology, Hyderabad 500 032, India
- U. Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
109
Marchwiany ME, Birowska M, Popielski M, Majewski JA, Jastrzębska AM. Surface-Related Features Responsible for Cytotoxic Behavior of MXenes Layered Materials Predicted with Machine Learning Approach. Materials (Basel) 2020; 13:E3083. [PMID: 32664304 PMCID: PMC7412046 DOI: 10.3390/ma13143083] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/19/2020] [Revised: 07/06/2020] [Accepted: 07/08/2020] [Indexed: 12/16/2022]
Abstract
To speed up the implementation of two-dimensional materials in the development of potential biomedical applications, the toxicological aspects toward human health need to be addressed. Due to time-consuming and expensive analysis, only part of the continuously expanding family of 2D materials can be tested in vitro. Machine learning methods can help by extracting new insights from available biological data sets and providing further guidance for experimental studies. This study identifies the most relevant highly surface-specific features that might be responsible for cytotoxic behavior of 2D materials, especially MXenes. In particular, two factors, namely, the presence of transition metal oxides and lithium atoms on the surface, are identified as cytotoxicity-generating features. The developed machine learning model succeeds in predicting toxicity for other 2D MXenes, previously not tested in vitro, and hence is able to complement the existing knowledge coming from in vitro studies. Thus, we claim that it might be one of the solutions for reducing the number of toxicological studies needed and for minimizing failures in future biological applications.
Affiliation(s)
- Maciej E. Marchwiany
- Interdisciplinary Centre for Mathematical and Computational Modelling (ICM), University of Warsaw, Pawińskiego 5a, 02-106 Warsaw, Poland
- Magdalena Birowska
- Faculty of Physics, University of Warsaw, Pasteura 5, 00-092 Warsaw, Poland
- Mariusz Popielski
- Faculty of Physics, University of Warsaw, Pasteura 5, 00-092 Warsaw, Poland
- Jacek A. Majewski
- Faculty of Physics, University of Warsaw, Pasteura 5, 00-092 Warsaw, Poland
- Agnieszka M. Jastrzębska
- Faculty of Materials Science and Engineering, Warsaw University of Technology, Wołoska 141, 02-507 Warsaw, Poland
110
Kato K, Masuda T, Watanabe C, Miyagawa N, Mizouchi H, Nagase S, Kamisaka K, Oshima K, Ono S, Ueda H, Tokuhisa A, Kanada R, Ohta M, Ikeguchi M, Okuno Y, Fukuzawa K, Honma T. High-Precision Atomic Charge Prediction for Protein Systems Using Fragment Molecular Orbital Calculation and Machine Learning. J Chem Inf Model 2020; 60:3361-3368. [PMID: 32496771 DOI: 10.1021/acs.jcim.0c00273] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Indexed: 01/14/2023]
Abstract
Here, we have constructed neural network-based models that predict atomic partial charges with high accuracy at low computational cost. The models were trained using high-quality data acquired from quantum mechanics calculations using the fragment molecular orbital method. We have succeeded in obtaining highly accurate atomic partial charges for three representative molecular systems of proteins, including one large biomolecule (approx. 2000 atoms). The novelty of our approach is the ability to take into account the electronic polarization in the system, which is a system-dependent phenomenon, being important in the field of drug design. Our high-precision models are useful for the prediction of atomic partial charges and expected to be widely applicable in structure-based drug designs such as structural optimization, high-speed and high-precision docking, and molecular dynamics calculations.
Affiliation(s)
- Koichiro Kato
- Science Solutions Division, Mizuho Information & Research Institute, Inc., 2-3 Kanda Nishiki-cho, Chiyoda, Tokyo 101-8443, Japan; Department of Applied Chemistry, Graduate School of Engineering, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan; Center for Molecular Systems (CMS), Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Tomohide Masuda
- Pharmaceutical Research Laboratories, Toray Industries, Inc., 6-10-1 Tebiro, Kamakura, Kanagawa 248-8555, Japan
- Chiduru Watanabe
- Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Naoki Miyagawa
- Science Solutions Division, Mizuho Information & Research Institute, Inc., 2-3 Kanda Nishiki-cho, Chiyoda, Tokyo 101-8443, Japan
- Hideo Mizouchi
- Science Solutions Division, Mizuho Information & Research Institute, Inc., 2-3 Kanda Nishiki-cho, Chiyoda, Tokyo 101-8443, Japan
- Shumpei Nagase
- Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; Masuda Keizai Kenkyusho, Y.K., Hillsidemasuda, 1-1-15 Teraya, Tsurumi-ku, Yokohama-shi, Kanagawa 230-0015, Japan
- Kikuko Kamisaka
- Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Kanji Oshima
- Biotechnology Research Laboratories, Kaneka Corporation, 1-8 Miyamae-cho, Takasago-cho, Takasago, Hyogo 676-8688, Japan
- Satoshi Ono
- Discovery Technology Laboratories, Innovative Research Division, Mitsubishi Tanabe Pharma Corporation, 1000 Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan
- Hiroshi Ueda
- Pharmaceutical Research Laboratories, Toray Industries, Inc., 6-10-1 Tebiro, Kamakura, Kanagawa 248-8555, Japan
- Atsushi Tokuhisa
- RIKEN Cluster for Science and Technology Hub, 6-3-5 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; RIKEN Center for Computational Science, 6-3-5 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan; RIKEN Medical Sciences Innovation Hub Program, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Ryo Kanada
- RIKEN Cluster for Science and Technology Hub, 6-3-5 Minatojima-minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
- Masateru Ohta
- Drug Development Data Intelligence Platform Group, Medical Science Innovation Hub Program, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Mitsunori Ikeguchi
- Graduate School of Medical Life Science, Yokohama City University, 1-7-29 Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Yasushi Okuno
- RIKEN Medical Sciences Innovation Hub Program, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; Graduate School of Medicine, Kyoto University, 53 Shogoin-Kawahara-cho, Sakyo-ku, Kyoto 606-8507, Japan; RIKEN Compass to Healthy Life Research Complex Program, RIKEN, 6-7-1 Minatojima Minami-machi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
- Kaori Fukuzawa
- School of Pharmacy and Pharmaceutical Sciences, Hoshi University, 2-4-41 Ebara, Shinagawa-ku, Tokyo 142-8501, Japan
- Teruki Honma
- Center for Biosystems Dynamics Research, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
111
Devereux C, Smith JS, Huddleston KK, Barros K, Zubatyuk R, Isayev O, Roitberg AE. Extending the Applicability of the ANI Deep Learning Molecular Potential to Sulfur and Halogens. J Chem Theory Comput 2020; 16:4192-4202. [PMID: 32543858 DOI: 10.1021/acs.jctc.0c00121] [Citation(s) in RCA: 153] [Impact Index Per Article: 38.3] [Indexed: 11/30/2022]
Abstract
Machine learning (ML) methods have become powerful, predictive tools in a wide range of applications, such as facial recognition and autonomous vehicles. In the sciences, computational chemists and physicists have been using ML for the prediction of physical phenomena, such as atomistic potential energy surfaces and reaction pathways. Transferable ML potentials, such as ANI-1x, have been developed with the goal of accurately simulating organic molecules containing the chemical elements H, C, N, and O. Here, we provide an extension of the ANI-1x model. The new model, dubbed ANI-2x, is trained to three additional chemical elements: S, F, and Cl. Additionally, ANI-2x underwent torsional refinement training to better predict molecular torsion profiles. These new features open a wide range of new applications within organic chemistry and drug development. These seven elements (H, C, N, O, F, Cl, and S) make up ∼90% of drug-like molecules. To show that these additions do not sacrifice accuracy, we have tested this model across a range of organic molecules and applications, including the COMP6 benchmark, dihedral rotations, conformer scoring, and nonbonded interactions. ANI-2x is shown to accurately predict molecular energies compared to density functional theory with a ∼10⁶ factor speedup and a negligible slowdown compared to ANI-1x and shows subchemical accuracy across most of the COMP6 benchmark. The resulting model is a valuable tool for drug development which can potentially replace both quantum calculations and classical force fields for a myriad of applications.
Affiliation(s)
- Christian Devereux
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Justin S Smith
- Center for Non-Linear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States; Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Kate K Huddleston
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Roman Zubatyuk
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
112
Gao X, Ramezanghorbani F, Isayev O, Smith JS, Roitberg AE. TorchANI: A Free and Open Source PyTorch-Based Deep Learning Implementation of the ANI Neural Network Potentials. J Chem Inf Model 2020; 60:3408-3415. [DOI: 10.1021/acs.jcim.0c00451] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Indexed: 11/30/2022]
Affiliation(s)
- Xiang Gao
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Farhad Ramezanghorbani
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Justin S. Smith
- Center for Nonlinear Studies and Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
- Adrian E. Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
113
Muratov EN, Bajorath J, Sheridan RP, Tetko IV, Filimonov D, Poroikov V, Oprea TI, Baskin II, Varnek A, Roitberg A, Isayev O, Curtarolo S, Fourches D, Cohen Y, Aspuru-Guzik A, Winkler DA, Agrafiotis D, Cherkasov A, Tropsha A. QSAR without borders. Chem Soc Rev 2020; 49:3525-3564. [PMID: 32356548 PMCID: PMC8008490 DOI: 10.1039/d0cs00098a] [Citation(s) in RCA: 327] [Impact Index Per Article: 81.8] [Indexed: 12/15/2022]
Abstract
Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure-activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.
Affiliation(s)
- Eugene N Muratov
- UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.
114
Abstract
As the quantum chemistry (QC) community embraces machine learning (ML), the number of new methods and applications based on the combination of QC and ML is surging. In this Perspective, a view of the current state of affairs in this new and exciting research field is offered, challenges of using machine learning in quantum chemistry applications are described, and potential future developments are outlined. Specifically, examples of how machine learning is used to improve the accuracy and accelerate quantum chemical research are shown. Generalization and classification of existing techniques are provided to ease the navigation in the sea of literature and to guide researchers entering the field. The emphasis of this Perspective is on supervised machine learning.
Affiliation(s)
- Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, Department of Chemistry, and College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
115
Metcalf DP, Koutsoukas A, Spronk SA, Claus BL, Loughney DA, Johnson SR, Cheney DL, Sherrill CD. Approaches for machine learning intermolecular interaction energies and application to energy components from symmetry adapted perturbation theory. J Chem Phys 2020; 152:074103. [DOI: 10.1063/1.5142636] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 12/11/2022]
Affiliation(s)
- Derek P. Metcalf
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Alexios Koutsoukas
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Steven A. Spronk
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Brian L. Claus
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Deborah A. Loughney
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Stephen R. Johnson
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Daniel L. Cheney
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- C. David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
116
Christensen AS, Bratholm LA, Faber FA, Anatole von Lilienfeld O. FCHL revisited: Faster and more accurate quantum machine learning. J Chem Phys 2020; 152:044107. [DOI: 10.1063/1.5126701] [Citation(s) in RCA: 117] [Impact Index Per Article: 29.3] [Indexed: 01/15/2023]
Affiliation(s)
- Anders S. Christensen
- Department of Chemistry, National Center for Computational Design and Discovery of Novel Materials (MARVEL), Institute of Physical Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Lars A. Bratholm
- School of Mathematics, University of Bristol, Bristol BS8 1TW, United Kingdom
- School of Chemistry, University of Bristol, Bristol BS8 1TS, United Kingdom
- Felix A. Faber
- Department of Chemistry, National Center for Computational Design and Discovery of Novel Materials (MARVEL), Institute of Physical Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- O. Anatole von Lilienfeld
- Department of Chemistry, National Center for Computational Design and Discovery of Novel Materials (MARVEL), Institute of Physical Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
117
Shao Y, Hellström M, Mitev PD, Knijff L, Zhang C. PiNN: A Python Library for Building Atomic Neural Networks of Molecules and Materials. J Chem Inf Model 2020; 60:1184-1193. [DOI: 10.1021/acs.jcim.9b00994] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Indexed: 12/12/2022]
Affiliation(s)
- Yunqi Shao
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Matti Hellström
- Software for Chemistry and Materials B.V., De Boelelaan 1083, 1081HV Amsterdam, The Netherlands
- Pavlin D. Mitev
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Lisanne Knijff
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
- Chao Zhang
- Department of Chemistry-Ångström Laboratory, Uppsala University, Lägerhyddsvägen 1, P.O. Box 538, 75121 Uppsala, Sweden
118
Chen G, Shen Z, Iyer A, Ghumman UF, Tang S, Bi J, Chen W, Li Y. Machine-Learning-Assisted De Novo Design of Organic Molecules and Polymers: Opportunities and Challenges. Polymers (Basel) 2020; 12:E163. [PMID: 31936321 PMCID: PMC7023065 DOI: 10.3390/polym12010163] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Received: 12/01/2019] [Revised: 12/27/2019] [Accepted: 01/02/2020] [Indexed: 12/18/2022]
Abstract
Organic molecules and polymers have a broad range of applications in biomedical, chemical, and materials science fields. Traditional design approaches for organic molecules and polymers are mainly experimentally driven, guided by experience, intuition, and conceptual insights. Though they have been successfully applied to discover many important materials, these methods face significant challenges due to the tremendous demand for new materials and the vast design space of organic molecules and polymers. Accelerated and inverse materials design is an ideal solution to these challenges. With advancements in high-throughput computation, artificial intelligence (especially machine learning, ML), and the growth of materials databases, ML-assisted materials design is emerging as a promising tool to foster breakthroughs in many areas of materials science and engineering. To date, using ML-assisted approaches, the quantitative structure property/activity relation for material property prediction can be established more accurately and efficiently. In addition, materials design can be revolutionized and accelerated much faster than ever, through ML-enabled molecular generation and inverse molecular design. In this perspective, we review the recent progress in ML-guided design of organic molecules and polymers, highlight several successful examples, and examine future opportunities in biomedical, chemical, and materials science fields. We further discuss the relevant challenges to solve in order to fully realize the potential of ML-assisted materials design for organic molecules and polymers. In particular, this study summarizes publicly available materials databases, feature representations for organic molecules, open-source tools for feature generation, methods for molecular generation, and ML models for prediction of material properties, which serve as a tutorial for researchers who have little prior experience with ML and want to apply it in various applications. Last but not least, it draws insights into the current limitations of ML-guided design of organic molecules and polymers. We anticipate that ML-assisted materials design for organic molecules and polymers will be the driving force in the near future, to meet the tremendous demand for new materials with tailored properties in different fields.
Affiliation(s)
- Guang Chen
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Zhiqiang Shen
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Akshay Iyer
- Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Umar Farooq Ghumman
- Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Shan Tang
- State Key Laboratory of Structural Analysis for Industrial Equipment, Department of Engineering Mechanics, and International Research Center for Computational Mechanics, Dalian University of Technology, Dalian 116023, China
| | - Jinbo Bi
- Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Wei Chen
- Department of Mechanical Engineering, Northwestern University, Evanston, IL 60208, USA
| | - Ying Li
- Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
- Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
|
119
|
Wang X, Gao J. Atomic partial charge predictions for furanoses by random forest regression with atom type symmetry function. RSC Adv 2020; 10:666-673. [PMID: 35494472 PMCID: PMC9048215 DOI: 10.1039/c9ra09337k] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/10/2019] [Accepted: 12/18/2019] [Indexed: 01/04/2023]
Abstract
Furanoses that are components for many important biomolecules have complicated conformational spaces due to the flexible ring and exo-cyclic moieties. Machine learning algorithms, which require descriptors as structural inputs, can be used to efficiently compute conformational adaptive (CA) charges to capture the electrostatic potential variations caused by the conformational changes in the molecular mechanics (MM) calculations. In the present study, we introduced atom type symmetry function (ATSF) developed based on atom centered symmetry function (ACSF) for describing conformations for furanoses, in which atoms were categorized by atom types defined by their properties or connectivity in classic molecular mechanics (MM) force field parameters to generate a suitable coordinate size. Random forest regression (RFR) models with ATSF showed improvements for predicting CA charges and dipole moments for furanoses compared to those with ACSF and atom name symmetry functions where atoms were categorized by their unique atom names. The CA charges predicted by RFR models with ATSF showed more comparable reproductions of the carbohydrate-water and carbohydrate-protein interactions computed with RESP charges individually derived from QM calculations than the ensemble-averaged atomic charge sets commonly employed in molecular mechanics force fields, suggesting that the predicted CA charges were capable of including electrostatic variations in their dynamic charge values. Improvements by ATSF showed that categorizing atoms by atom types introduced chemical structural perceptions to descriptors and produced a suitable coordinate size in ATSF to capture key structural features for furanoses. This categorizing scheme also allows ATSF to be readily adopted by other biomolecules thanks to the broad implementations of MM force fields.
Affiliation(s)
- Xiaocong Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
| | - Jun Gao
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, China
|
120
|
Groenenboom MC, Moffat TP, Schwarz KA. Halide-induced Step Faceting and Dissolution Energetics from Atomistic Machine Learned Potentials on Cu(100). The Journal of Physical Chemistry. C, Nanomaterials and Interfaces 2020; 124:10.1021/acs.jpcc.0c00683. [PMID: 34194601 PMCID: PMC8240506 DOI: 10.1021/acs.jpcc.0c00683] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 05/08/2023]
Abstract
Adsorbates impact the surface stability and reactivity of metallic electrodes, affecting the corrosion, dissolution, and deposition behavior. Here, we use density functional theory (DFT) and DFT-based Behler-Parrinello neural networks (BPNN) to investigate the geometries, surface formation energies, and atom removal energies of stepped and kinked surfaces vicinal to Cu(100) with a c(2×2) Cl adlayer. DFT calculations indicate that the stable structures for the adsorbate-free vicinal surfaces favor steps with <110> orientation, while the addition of the c(2×2) Cl adlayer leads to <100> step facets, in agreement with scanning tunneling microscopy (STM) observations. The BPNN calculations produce energies in good agreement with DFT results (root mean square error of 1.3 meV/atom for a randomly chosen set of structures excluded from the training set). We draw three conclusions from the BPNN calculations. First, Cl on the upper <100> step edges occupies the three fold hollow sites (as opposed to the four-fold sites on the terraces), congruent with deviations of the STM height profile for the adsorbate at the upper step edge. Second, disruptions in the continuity of the halide overlayer at the steps result in significant long-range step-step interactions. Third, anisotropic metal dissolution and deposition energetics arise from phase shifts of the c(2×2) adlayer at orthogonal <100> steps. This DFT-BPNN approach offers an effective strategy for tackling large-scale surface structure challenges with atomic-level accuracy.
|
121
|
Gastegger M, Marquetand P. Molecular Dynamics with Neural Network Potentials. Machine Learning Meets Quantum Physics 2020. [DOI: 10.1007/978-3-030-40245-7_12] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/24/2022]
|
122
|
Profitt TA, Pearson JK. A shared-weight neural network architecture for predicting molecular properties. Phys Chem Chem Phys 2019; 21:26175-26183. [PMID: 31750845 DOI: 10.1039/c9cp03103k] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 01/08/2023]
Abstract
Quantum chemical methods scale poorly with increasing molecular size, and machine learning models have emerged as a promising, computationally efficient alternative. We present a shared-weight neural network architecture based on modified atom-centered symmetry functions (ACSFs) and show that it performs similarly to the more computationally expensive per-element neural networks of previous work with ACSFs. The model achieves chemically accurate predictions, with a mean absolute error as low as 0.63 kcal mol⁻¹ on energy predictions in the QM9 data set. Additionally, we show that it can reliably predict atomic forces.
Affiliation(s)
- Trevor A Profitt
- Department of Chemistry, University of Prince Edward Island, Charlottetown, PE, Canada.
|
123
|
Himanen L, Geurts A, Foster AS, Rinke P. Data-Driven Materials Science: Status, Challenges, and Perspectives. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2019; 6:1900808. [PMID: 31728276 PMCID: PMC6839624 DOI: 10.1002/advs.201900808] [Citation(s) in RCA: 149] [Impact Index Per Article: 29.8] [Received: 04/08/2019] [Revised: 06/20/2019] [Indexed: 05/06/2023]
Abstract
Data-driven science is heralded as a new paradigm in materials science. In this field, data is the new resource, and knowledge is extracted from materials datasets that are too big or complex for traditional human reasoning, typically with the intent to discover new or improved materials or materials phenomena. Multiple factors, including the open science movement, national funding, and progress in information technology, have fueled its development. Such related tools as materials databases, machine learning, and high-throughput methods are now established as parts of the materials research toolset. However, there are a variety of challenges that impede progress in data-driven materials science: data veracity, integration of experimental and computational data, data longevity, standardization, and the gap between industrial interests and academic efforts. In this perspective article, the historical development and current state of data-driven materials science, building from the early evolution of open science to the rapid expansion of materials data infrastructures, are discussed. Key successes and challenges so far are also reviewed, providing a perspective on the future development of the field.
Affiliation(s)
- Lauri Himanen
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
| | - Amber Geurts
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Department of Management Studies, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- TNO, Netherlands Organization for Applied Scientific Research, Expertise Center for Strategy and Policy, Anna van Beurenplein 1, 2595 DA The Hague, Netherlands
| | - Adam Stuart Foster
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Graduate School Materials Science in Mainz, Staudinger Weg 9, 55128 Mainz, Germany
- WPI Nano Life Science Institute (WPI‐NanoLSI), Kanazawa University, Kakuma‐machi, Kanazawa 920‐1192, Japan
| | - Patrick Rinke
- Department of Applied Physics, Aalto University, P.O. Box 11100, 00076 Aalto, Espoo, Finland
- Theoretical Chemistry and Catalysis Research Centre, Technische Universität München, Lichtenbergstr. 4, D‐85747 Garching, Germany
|
124
|
Helfrecht BA, Semino R, Pireddu G, Auerbach SM, Ceriotti M. A new kind of atlas of zeolite building blocks. J Chem Phys 2019; 151:154112. [DOI: 10.1063/1.5119751] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/13/2022]
Affiliation(s)
- Benjamin A. Helfrecht
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Rocio Semino
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Institut Charles Gerhardt Montpellier UMR 5253 CNRS, Université de Montpellier, Place E. Bataillon, 34095 Montpellier Cedex 05, France
| | - Giovanni Pireddu
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Dipartimento di Chimica e Farmacia, Università degli Studi di Sassari, Via Vienna 2, 01700 Sassari, Italy
| | - Scott M. Auerbach
- Department of Chemistry and Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, Massachusetts 01003, USA
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
125
|
Dick S, Fernandez-Serra M. Learning from the density to correct total energy and forces in first principle simulations. J Chem Phys 2019; 151:144102. [DOI: 10.1063/1.5114618] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 01/19/2023]
Affiliation(s)
- Sebastian Dick
- Physics and Astronomy Department, Stony Brook University, Stony Brook, New York 11794-3800, USA and Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-3800, USA
| | - Marivi Fernandez-Serra
- Physics and Astronomy Department, Stony Brook University, Stony Brook, New York 11794-3800, USA and Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-3800, USA
|
126
|
Zhang Y, Hu C, Jiang B. Embedded Atom Neural Network Potentials: Efficient and Accurate Machine Learning with a Physically Inspired Representation. J Phys Chem Lett 2019; 10:4962-4967. [PMID: 31397157 DOI: 10.1021/acs.jpclett.9b02037] [Citation(s) in RCA: 122] [Impact Index Per Article: 24.4] [Indexed: 05/24/2023]
Abstract
We propose a simple, but efficient and accurate, machine learning (ML) model for developing a high-dimensional potential energy surface. This so-called embedded atom neural network (EANN) approach is inspired by the well-known empirical embedded atom method (EAM) model used in the condensed phase. It simply replaces the scalar embedded atom density in EAM with a Gaussian-type orbital based density vector and represents the complex relationship between the embedded density vector and atomic energy by neural networks. We demonstrate that the EANN approach is as accurate as several established ML models in representing both large molecular and extended periodic systems, yet with far fewer parameters and configurations. It is highly efficient as it implicitly contains the three-body information without an explicit sum of the conventional costly angular descriptors. With high accuracy and efficiency, EANN potentials can vastly accelerate molecular dynamics and spectroscopic simulations in complex systems at the ab initio level.
Affiliation(s)
- Yaolong Zhang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Ce Hu
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Bin Jiang
- Hefei National Laboratory for Physical Science at the Microscale, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui 230026, China
|
127
|
Herr JE, Koh K, Yao K, Parkhill J. Compressing physics with an autoencoder: Creating an atomic species representation to improve machine learning models in the chemical sciences. J Chem Phys 2019; 151:084103. [DOI: 10.1063/1.5108803] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023]
Affiliation(s)
- John E. Herr
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - Kevin Koh
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - Kun Yao
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - John Parkhill
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
|
128
|
Nudejima T, Ikabata Y, Seino J, Yoshikawa T, Nakai H. Machine-learned electron correlation model based on correlation energy density at complete basis set limit. J Chem Phys 2019; 151:024104. [DOI: 10.1063/1.5100165] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 12/21/2022]
Affiliation(s)
- Takuro Nudejima
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Yasuhiro Ikabata
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Junji Seino
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- PRESTO, Japan Science and Technology Agency, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
| | - Takeshi Yoshikawa
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
| | - Hiromi Nakai
- Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Waseda Research Institute for Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
- Elements Strategy Initiative for Catalysts and Batteries (ESICB), Kyoto University, Katsura, Kyoto 615-8520, Japan
|
129
|
Gao H, Wang J, Sun J. Improve the performance of machine-learning potentials by optimizing descriptors. J Chem Phys 2019; 150:244110. [DOI: 10.1063/1.5097293] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Affiliation(s)
- Hao Gao
- National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
| | - Junjie Wang
- National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
| | - Jian Sun
- National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing 210093, China
|
130
|
Abbott AS, Turney JM, Zhang B, Smith DGA, Altarawy D, Schaefer HF. PES-Learn: An Open-Source Software Package for the Automated Generation of Machine Learning Models of Molecular Potential Energy Surfaces. J Chem Theory Comput 2019; 15:4386-4398. [DOI: 10.1021/acs.jctc.9b00312] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 12/12/2022]
Affiliation(s)
- Adam S. Abbott
- Center for Computational Quantum Chemistry, The University of Georgia, Athens, Georgia 30602, United States
| | - Justin M. Turney
- Center for Computational Quantum Chemistry, The University of Georgia, Athens, Georgia 30602, United States
| | - Boyi Zhang
- Center for Computational Quantum Chemistry, The University of Georgia, Athens, Georgia 30602, United States
| | - Daniel G. A. Smith
- Molecular Sciences Software Institute, Virginia Tech, Blacksburg, Virginia 24061, United States
| | - Doaa Altarawy
- Molecular Sciences Software Institute, Virginia Tech, Blacksburg, Virginia 24061, United States
- Computer and Systems Engineering Department, Alexandria University, Alexandria, Egypt
| | - Henry F. Schaefer
- Center for Computational Quantum Chemistry, The University of Georgia, Athens, Georgia 30602, United States
|
131
|
Singraber A, Morawietz T, Behler J, Dellago C. Parallel Multistream Training of High-Dimensional Neural Network Potentials. J Chem Theory Comput 2019; 15:3075-3092. [PMID: 30995035 DOI: 10.1021/acs.jctc.8b01092] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Indexed: 11/29/2022]
Abstract
Over the past years high-dimensional neural network potentials (HDNNPs), fitted to accurately reproduce ab initio potential energy surfaces, have become a powerful tool in chemistry, physics and materials science. Here, we focus on the training of the neural networks that lies at the heart of the HDNNP method. We present an efficient approach for optimizing the weight parameters of the neural network via multistream Kalman filtering, using potential energies and forces as reference data. In this procedure, the choice of the free parameters of the Kalman filter can have a significant impact on the fit quality. Carrying out a large parameter study, we determine optimal settings and demonstrate how to optimize training results of HDNNPs. Moreover, we illustrate our HDNNP training approach by revisiting previously presented fits for water and developing a new potential for copper sulfide. This material, accessible in computer simulations so far only via first-principles methods, forms a particularly complex solid structure at low temperatures and undergoes a phase transition to a superionic state upon heating. Analyzing MD simulations carried out with the Cu2S HDNNP, we confirm that the underlying ab initio reference method indeed reproduces this behavior.
Affiliation(s)
- Andreas Singraber
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, Vienna, Austria
| | - Tobias Morawietz
- Department of Chemistry, Stanford University, Stanford, California 94305, United States
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, Vienna, Austria
|
132
|
Willatt MJ, Musil F, Ceriotti M. Atom-density representations for machine learning. J Chem Phys 2019; 150:154110. [DOI: 10.1063/1.5090481] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Indexed: 12/20/2022]
Affiliation(s)
- Michael J. Willatt
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Félix Musil
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- National Center for Computational Design and Discovery of Novel Materials (MARVEL), Lausanne, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
133
|
Jackson NE, Webb MA, de Pablo JJ. Recent advances in machine learning towards multiscale soft materials design. Curr Opin Chem Eng 2019. [DOI: 10.1016/j.coche.2019.03.005] [Citation(s) in RCA: 70] [Impact Index Per Article: 14.0] [Indexed: 01/01/2023]
|
134
|
Singraber A, Behler J, Dellago C. Library-Based LAMMPS Implementation of High-Dimensional Neural Network Potentials. J Chem Theory Comput 2019; 15:1827-1840. [DOI: 10.1021/acs.jctc.8b00770] [Citation(s) in RCA: 114] [Impact Index Per Article: 22.8] [Indexed: 11/29/2022]
Affiliation(s)
- Andreas Singraber
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Jörg Behler
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstraße 6, 37077 Göttingen, Germany
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
|
135
|
Quantum-Chemical Insights from Interpretable Atomistic Neural Networks. Explainable AI: Interpreting, Explaining and Visualizing Deep Learning 2019. [DOI: 10.1007/978-3-030-28954-6_17] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/13/2022]
|
136
|
Samanta A. Representing local atomic environment using descriptors based on local correlations. J Chem Phys 2018; 149:244102. [PMID: 30599737 DOI: 10.1063/1.5055772] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Statistical learning of material properties is an emerging topic of research and has been tremendously successful in areas such as representing complex energy landscapes as well as in technologically relevant areas, like identification of better catalysts and electronic materials. However, analysis of large data sets to efficiently learn characteristic features of a complex energy landscape, for example, depends on the ability of descriptors to effectively screen different local atomic environments. Thus, discovering appropriate descriptors of bulk or defect properties and the functional dependence of such properties on these descriptors remains a difficult and tedious process. To this end, we develop a framework to generate descriptors based on many-body correlations that can effectively capture intrinsic geometric features of the local environment of an atom. These descriptors are based on the spectrum of two-body, three-body, four-body, and higher order correlations between an atom and its neighbors and are evaluated by calculating the corresponding two-body, three-body, and four-body overlap integrals. They are invariant to global translation, global rotation, reflection, and permutations of atomic indices. By systematically testing the ability to capture the local atomic environment, it is shown that the local correlation descriptors are able to successfully reconstruct structures containing 10-25 atoms which was previously not possible.
Affiliation(s)
- Amit Samanta
- Physics Division, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
|
137
|
Schütt KT, Kessel P, Gastegger M, Nicoli KA, Tkatchenko A, Müller KR. SchNetPack: A Deep Learning Toolbox For Atomistic Systems. J Chem Theory Comput 2018; 15:448-455. [PMID: 30481453 DOI: 10.1021/acs.jctc.8b00908] [Citation(s) in RCA: 178] [Impact Index Per Article: 29.7] [Indexed: 01/30/2023]
Abstract
SchNetPack is a toolbox for the development and application of deep neural networks that predict potential energy surfaces and other quantum-chemical properties of molecules and materials. It contains basic building blocks of atomistic neural networks, manages their training, and provides simple access to common benchmark datasets. This allows for an easy implementation and evaluation of new models. For now, SchNetPack includes implementations of (weighted) atom-centered symmetry functions and the deep tensor neural network SchNet, as well as ready-to-use scripts that allow one to train these models on molecule and material datasets. Based on the PyTorch deep learning framework, SchNetPack allows one to efficiently apply the neural networks to large datasets with millions of reference calculations, as well as parallelize the model across multiple GPUs. Finally, SchNetPack provides an interface to the Atomic Simulation Environment in order to make trained models easily accessible to researchers that are not yet familiar with neural networks.
Affiliation(s)
- K T Schütt
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - P Kessel
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - M Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - K A Nicoli
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - A Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| | - K-R Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, South Korea
- Max-Planck-Institut für Informatik, Saarbrücken, Germany
|
138
|
Jindal S, Bulusu SS. A transferable artificial neural network model for atomic forces in nanoparticles. J Chem Phys 2018; 149:194101. [DOI: 10.1063/1.5043247] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022]
Affiliation(s)
- Shweta Jindal
- Discipline of Chemistry, Indian Institute of Technology Indore, Simrol, Indore 453552, India
| | - Satya S. Bulusu
- Discipline of Chemistry, Indian Institute of Technology Indore, Simrol, Indore 453552, India
|
139
|
Krykunov M, Woo TK. Bond Type Restricted Property Weighted Radial Distribution Functions for Accurate Machine Learning Prediction of Atomization Energies. J Chem Theory Comput 2018; 14:5229-5237. [PMID: 30148628 DOI: 10.1021/acs.jctc.8b00788] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
Understanding the performance of machine learning algorithms is essential for designing more accurate and efficient statistical models. It is not always possible to unravel the reasoning of neural networks. Here, we propose a method for calculating machine learning kernels in closed and analytic form by combining atomic property weighted radial distribution function (AP-RDF) descriptor with a Gaussian kernel. This allowed us to analyze and improve the performance of the Bag-of-Bonds descriptor when the bond type restriction is included in AP-RDF. The improvement is achieved for the prediction of molecular atomization energies (MAE = 1.7 kcal/mol for QM7 data set) and is due to the incorporation of a tensor product into the kernel, which captures the multidimensional representation of the AP-RDF. On the other hand, the numerical version of the AP-RDF is a constant size descriptor, making it more computationally efficient than Bag-of-Bonds. We have also discussed a connection between molecular quantum similarity and machine learning kernels with first-principles kinds of descriptors.
Affiliation(s)
- Mykhaylo Krykunov
- Department of Chemistry and Biomolecular Science, University of Ottawa, Ottawa K1N 6N5, Canada
| | - Tom K Woo
- Department of Chemistry and Biomolecular Science, University of Ottawa, Ottawa K1N 6N5, Canada
|
140
|
Meldgaard SA, Kolsbjerg EL, Hammer B. Machine learning enhanced global optimization by clustering local environments to enable bundled atomic energies. J Chem Phys 2018; 149:134104. [DOI: 10.1063/1.5048290] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022]
Affiliation(s)
- Søren A. Meldgaard
- Department of Physics and Astronomy and Interdisciplinary Nanoscience Center (iNANO), Aarhus University, 8000 Aarhus, Denmark
- Esben L. Kolsbjerg
- Department of Physics and Astronomy and Interdisciplinary Nanoscience Center (iNANO), Aarhus University, 8000 Aarhus, Denmark
- Bjørk Hammer
- Department of Physics and Astronomy and Interdisciplinary Nanoscience Center (iNANO), Aarhus University, 8000 Aarhus, Denmark
|
141
|
Rostami S, Amsler M, Ghasemi SA. Optimized symmetry functions for machine-learning interatomic potentials of multicomponent systems. J Chem Phys 2018; 149:124106. [DOI: 10.1063/1.5040005] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 11/15/2022]
Affiliation(s)
- Samare Rostami
- Institute for Advanced Studies in Basic Sciences, P.O. Box 45195-1159, Zanjan, Iran
- Maximilian Amsler
- Laboratory of Atomic and Solid State Physics, Cornell University, Ithaca, New York 14853, USA
- S. Alireza Ghasemi
- Institute for Advanced Studies in Basic Sciences, P.O. Box 45195-1159, Zanjan, Iran
|
142
|
Imbalzano G, Anelli A, Giofré D, Klees S, Behler J, Ceriotti M. Automatic selection of atomic fingerprints and reference configurations for machine-learning potentials. J Chem Phys 2018; 148:241730. [DOI: 10.1063/1.5024611] [Citation(s) in RCA: 163] [Impact Index Per Article: 27.2] [Indexed: 01/04/2023]
Affiliation(s)
- Giulio Imbalzano
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Andrea Anelli
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Daniele Giofré
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Sinja Klees
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Jörg Behler
- Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Universität Göttingen, Institut für Physikalische Chemie, Theoretische Chemie, Tammannstr. 6, 37077 Göttingen, Germany
- Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
143
|
Rupp M, von Lilienfeld OA, Burke K. Guest Editorial: Special Topic on Data-Enabled Theoretical Chemistry. J Chem Phys 2018; 148:241401. [DOI: 10.1063/1.5043213] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 01/17/2023]
Affiliation(s)
- Matthias Rupp
- Fritz Haber Institute of the Max Planck Society, Faradayweg 4-6, 14195 Berlin, Germany
- O. Anatole von Lilienfeld
- Department of Chemistry, Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials, University of Basel, 4056 Basel, Switzerland
- Kieron Burke
- Departments of Chemistry and Physics, University of California, Irvine, California 92697, USA
|
144
|
Willatt MJ, Musil F, Ceriotti M. Feature optimization for atomistic machine learning yields a data-driven construction of the periodic table of the elements. Phys Chem Chem Phys 2018; 20:29661-29668. [DOI: 10.1039/c8cp05921g] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 12/16/2022]
Abstract
By representing elements as points in a low-dimensional chemical space it is possible to improve the performance of a machine-learning model for a chemically-diverse dataset. The resulting coordinates are reminiscent of the main groups of the periodic table.
Affiliation(s)
- Michael J. Willatt
- National Center for Computational Design and Discovery of Novel Materials (MARVEL), Laboratory of Computational Science and Modelling, Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne
- Félix Musil
- National Center for Computational Design and Discovery of Novel Materials (MARVEL), Laboratory of Computational Science and Modelling, Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne
- Michele Ceriotti
- National Center for Computational Design and Discovery of Novel Materials (MARVEL), Laboratory of Computational Science and Modelling, Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne
|