51
Klawohn S, Darby JP, Kermode JR, Csányi G, Caro MA, Bartók AP. Gaussian approximation potentials: Theory, software implementation and application examples. J Chem Phys 2023; 159:174108. [PMID: 37929869 DOI: 10.1063/5.0160898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Accepted: 09/12/2023] [Indexed: 11/07/2023] Open
Abstract
Gaussian Approximation Potentials (GAPs) are a class of Machine Learned Interatomic Potentials routinely used to model materials and molecular systems on the atomic scale. The software implementation provides the means for both fitting models using ab initio data and using the resulting potentials in atomic simulations. Details of the GAP theory, algorithms and software are presented, together with detailed usage examples to help new and existing users. We review some recent developments to the GAP framework, including Message Passing Interface parallelisation of the fitting code enabling its use on thousands of central processing unit cores and compression of descriptors to eliminate the poor scaling with the number of different chemical elements.
Affiliation(s)
- Sascha Klawohn
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- James P Darby
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- James R Kermode
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Miguel A Caro
  - Department of Chemistry and Materials Science, Aalto University, 02150 Espoo, Finland
- Albert P Bartók
  - Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
52
Huguenin-Dumittan K, Loche P, Haoran N, Ceriotti M. Physics-Inspired Equivariant Descriptors of Nonbonded Interactions. J Phys Chem Lett 2023; 14:9612-9618. [PMID: 37862712 PMCID: PMC10626632 DOI: 10.1021/acs.jpclett.3c02375] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2023] [Accepted: 10/13/2023] [Indexed: 10/22/2023]
Abstract
One essential ingredient in many machine learning (ML) based methods for atomistic modeling of materials and molecules is the use of locality. While allowing better system-size scaling, this systematically neglects long-range (LR) effects such as electrostatic or dispersion interactions. We present an extension of the long distance equivariant (LODE) framework that can handle diverse LR interactions in a consistent way and seamlessly integrates with preexisting methods by building new sets of atom centered features. We provide a direct physical interpretation of these using the multipole expansion, which allows for simpler and more efficient implementations. The framework is applied to simple toy systems as proof of concept and a heterogeneous set of molecular dimers to push the method to its limits. By generalizing LODE to arbitrary asymptotic behaviors, we provide a coherent approach to treat arbitrary two- and many-body nonbonded interactions in the data-driven modeling of matter.
Affiliation(s)
- Kevin K. Huguenin-Dumittan
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Philip Loche
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Ni Haoran
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
53
Illarionov A, Sakipov S, Pereyaslavets L, Kurnikov IV, Kamath G, Butin O, Voronina E, Ivahnenko I, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Cherniavskyi YK, Lock C, Greenslade S, Sankaranarayanan SKRS, Kurnikova MG, Potoff J, Kornberg RD, Levitt M, Fain B. Combining Force Fields and Neural Networks for an Accurate Representation of Chemically Diverse Molecular Interactions. J Am Chem Soc 2023; 145:23620-23629. [PMID: 37856313 PMCID: PMC10623557 DOI: 10.1021/jacs.3c07628] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Indexed: 10/21/2023]
Abstract
A key goal of molecular modeling is the accurate reproduction of the true quantum mechanical potential energy of arbitrary molecular ensembles with a tractable classical approximation. The challenges are that analytical expressions found in general purpose force fields struggle to faithfully represent the intermolecular quantum potential energy surface at close distances and in strong interaction regimes; that the more accurate neural network approximations do not capture crucial physics concepts, e.g., nonadditive inductive contributions and application of electric fields; and that the ultra-accurate narrowly targeted models have difficulty generalizing to the entire chemical space. We therefore designed a hybrid wide-coverage intermolecular interaction model consisting of an analytically polarizable force field combined with a short-range neural network correction for the total intermolecular interaction energy. Here, we describe the methodology and apply the model to accurately determine the properties of water, the free energy of solvation of neutral and charged molecules, and the binding free energy of ligands to proteins. The correction is subtyped for distinct chemical species to match the underlying force field, to segment and reduce the amount of quantum training data, and to increase accuracy and computational speed. For the systems considered, the hybrid ab initio parametrized Hamiltonian reproduces the two-body dimer quantum mechanics (QM) energies to within 0.03 kcal/mol and the nonadditive many-molecule contributions to within 2%. Simulations of molecular systems using this interaction model run at speeds of several nanoseconds per day.
Affiliation(s)
- Alexey Illarionov
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Serzhan Sakipov
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Leonid Pereyaslavets
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor V. Kurnikov
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ganesh Kamath
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Oleg Butin
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Ekaterina Voronina
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Lomonosov MSU, Skobeltsyn Institute of Nuclear Physics, Moscow, 119991, Russia
- Ilya Ivahnenko
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Igor Leontyev
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Grzegorz Nawrocki
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Mikhail Darkhovskiy
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Michael Olevanov
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Lomonosov MSU, Dept. of Physics, Moscow, 119991, Russia
- Yevhen K. Cherniavskyi
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Christopher Lock
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
  - Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
- Sean Greenslade
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Subramanian KRS Sankaranarayanan
  - Center for Nanoscale Materials, Argonne National Lab, Argonne, Illinois 604391, United States
  - Department of Mechanical and Industrial Engineering, University of Illinois, Chicago, Illinois 60607, United States
- Maria G. Kurnikova
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Jeffrey Potoff
  - Department of Chemical Engineering and Materials Science, Wayne State University, Detroit, Michigan 48202, United States
- Roger D. Kornberg
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94304, United States
- Michael Levitt
  - Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94304, United States
- Boris Fain
  - InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
54
Witt WC, van der Oord C, Gelžinytė E, Järvinen T, Ross A, Darby JP, Ho CH, Baldwin WJ, Sachs M, Kermode J, Bernstein N, Csányi G, Ortner C. ACEpotentials.jl: A Julia implementation of the atomic cluster expansion. J Chem Phys 2023; 159:164101. [PMID: 37870138 DOI: 10.1063/5.0158783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 08/25/2023] [Indexed: 10/24/2023] Open
Abstract
We introduce ACEpotentials.jl, a Julia-language software package that constructs interatomic potentials from quantum mechanical reference data using the Atomic Cluster Expansion [R. Drautz, Phys. Rev. B 99, 014104 (2019)]. As the latter provides a complete description of atomic environments, including invariance to overall translation and rotation as well as permutation of like atoms, the resulting potentials are systematically improvable and data efficient. Furthermore, the descriptor's expressiveness enables use of a linear model, facilitating rapid evaluation and straightforward application of Bayesian techniques for active learning. We summarize the capabilities of ACEpotentials.jl and demonstrate its strengths (simplicity, interpretability, robustness, performance) on a selection of prototypical atomistic modelling workflows.
Affiliation(s)
- William C Witt
  - Department of Materials Science and Metallurgy, University of Cambridge, Cambridge, United Kingdom
- Cas van der Oord
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Elena Gelžinytė
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Teemu Järvinen
  - Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia V6T 1Z2, Canada
- Andres Ross
  - Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia V6T 1Z2, Canada
- James P Darby
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Cheuk Hin Ho
  - Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia V6T 1Z2, Canada
- William J Baldwin
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Matthias Sachs
  - School of Mathematics, University of Birmingham, Birmingham B15 2TT, United Kingdom
- James Kermode
  - Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Noam Bernstein
  - Center for Materials Physics and Technology, U.S. Naval Research Laboratory, Washington, District of Columbia 20375, USA
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
- Christoph Ortner
  - Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia V6T 1Z2, Canada
55
Taniguchi T, Hosokawa M, Asahi T. Graph Comparison of Molecular Crystals in Band Gap Prediction Using Neural Networks. ACS Omega 2023; 8:39481-39489. [PMID: 37901497 PMCID: PMC10601046 DOI: 10.1021/acsomega.3c05224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 10/03/2023] [Indexed: 10/31/2023]
Abstract
In materials informatics, the representation of the material structure is essential to obtaining better prediction results, and graph representation has attracted much attention in recent years. Molecular crystals can be represented as either molecular graphs or crystal graphs, but which representation is more effective has not been examined. In this study, we compared the accuracy of molecular and crystal graphs for band gap prediction. The prediction accuracies obtained using crystal graphs were better than those obtained using molecular graphs. While this result is not surprising, error analysis showed quantitatively that the error of the crystal graph was 0.4 times that of the molecular graph, with moderate correlation. The novelty of this study lies in the comparison of molecular crystal representations and in the quantitative evaluation of the contribution of crystal structures to the band gap.
Affiliation(s)
- Takuya Taniguchi
  - Center for Data Science, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
- Mayuko Hosokawa
  - Department of Advanced Science and Engineering, Graduate School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-Ku, Tokyo 169-8555, Japan
- Toru Asahi
  - Department of Advanced Science and Engineering, Graduate School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-Ku, Tokyo 169-8555, Japan
56
Nguyen TH, Le KM, Nguyen LH, Truong TN. Atom-Based Machine Learning Model for Quantitative Property-Structure Relationship of Electronic Properties of Fusenes and Substituted Fusenes. ACS Omega 2023; 8:38441-38451. [PMID: 37867641 PMCID: PMC10586267 DOI: 10.1021/acsomega.3c05212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 09/15/2023] [Indexed: 10/24/2023]
Abstract
This study presents the development of machine-learning-based quantitative structure-property relationship (QSPR) models for predicting electron affinity, ionization potential, and band gap of fusenes from different chemical classes. Three variants of the atom-based Weisfeiler-Lehman (WL) graph kernel method were used together with a Gaussian process regressor (GPR). The data pool comprises polycyclic aromatic hydrocarbons (PAHs), thienoacenes, cyano-substituted PAHs, and nitro-substituted PAHs computed with density functional theory (DFT) at the B3LYP-D3/6-31+G(d) level of theory. The results demonstrate that the GPR/WL kernel methods can accurately predict the electronic properties of PAHs and their derivatives with root-mean-square deviations of 0.15 eV. We also demonstrate the effectiveness of the active learning protocol for the GPR/WL kernel pipeline, particularly for data sets with greater diversity. The interpretation of the model in terms of the contributions of individual atoms to the predicted electronic properties provides reasons for the success of our previous degree of π-orbital overlap model.
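To make the graph-kernel idea in this abstract concrete, the following is a minimal sketch of a plain Weisfeiler-Lehman subtree kernel on atom-labelled graphs. It is not one of the three variants used in the paper, and the function names `wl_features` and `wl_kernel`, the single refinement iteration, and the two toy heavy-atom chains are all assumptions made for illustration. In a pipeline like the one described, kernel values of this kind would serve as the covariance between molecules in a Gaussian process regressor.

```python
from collections import Counter

def wl_features(adjacency, labels, iterations=1):
    """Count Weisfeiler-Lehman subtree labels of one atom-labelled graph."""
    current = dict(labels)
    counts = Counter(current.values())
    for _ in range(iterations):
        refined = {}
        for atom, neighbours in adjacency.items():
            # New label = own label plus the sorted multiset of neighbour labels.
            around = ",".join(sorted(current[n] for n in neighbours))
            refined[atom] = f"{current[atom]}({around})"
        current = refined
        counts.update(current.values())
    return counts

def wl_kernel(features_a, features_b):
    """Dot product of two label-count vectors."""
    shared = features_a.keys() & features_b.keys()
    return sum(features_a[label] * features_b[label] for label in shared)

# Hypothetical heavy-atom chains (illustration only): C-C-C and C-C-S.
chain = {0: [1], 1: [0, 2], 2: [1]}
ccc = wl_features(chain, {0: "C", 1: "C", 2: "C"})
ccs = wl_features(chain, {0: "C", 1: "C", 2: "S"})

# prints: 14 8 8
print(wl_kernel(ccc, ccc), wl_kernel(ccc, ccs), wl_kernel(ccs, ccs))
```

Because every atom contributes its own refined label to the count vector, a kernel built this way can be decomposed into per-atom terms, which is one plausible route to the atom-level interpretation mentioned in the abstract.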
Affiliation(s)
- Tuan H. Nguyen
  - Faculty of Chemical Engineering, Ho Chi Minh City University of Technology, 268 Ly Thuong Kiet Street, District 10, Ho Chi Minh City 7000000, Vietnam
- Khang M. Le
  - Faculty of Chemistry, VNUHCM-University of Science, 227 Nguyen Van Cu Street, Ho Chi Minh City 700000, Vietnam
- Lam H. Nguyen
  - Faculty of Chemistry, VNUHCM-University of Science, 227 Nguyen Van Cu Street, Ho Chi Minh City 700000, Vietnam
  - Institute for Computational Science and Technology, Ho Chi Minh City 700000, Vietnam
- Thanh N. Truong
  - Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
57
Priyadharsan RR, Timothy RA, Thomas JM, Jeyakumar TC, Rajaram R, Louis H. Investigating the structure, bonding, and energy decomposition analysis of group 10 transition metal carbonyls with substituted terminal germanium chalcogenides [M(CO)3GeX] (M = Ni, Pd, and Pt; X = O, S, Se, and Te) complexes: insight from first-principles calculations. J Mol Model 2023; 29:344. [PMID: 37847395 DOI: 10.1007/s00894-023-05745-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/28/2023] [Accepted: 10/02/2023] [Indexed: 10/18/2023]
Abstract
CONTEXT This research focused on the theoretical investigation of transition metal carbonyls [M(CO)4] and the corresponding terminal germanium chalcogenide complexes [M(CO)3GeX], where M represents Ni, Pd, and Pt and X represents O, S, Se, and Te; the complexes are labeled 1-15. The parent complexes M(CO)4 (M = Ni, Pd, Pt) are numbered 1, 6, and 11, and substituting one of the CO ligands in 1, 6, and 11 with a GeX ligand (X = O, S, Se, or Te) results in the substituted complexes 2-5, 7-10, and 12-15. Substituting the CO ligand slightly alters the bond angles. Specifically, the ∠CMC bond angles for [Ni] complexes range from 111.9° to 112.2°, for [Pd] complexes from 111.4° to 111.7°, and for [Pt] complexes from 112.4° to 112.8°. These findings indicate a minor deviation from the tetrahedral geometry due to the influence of the new GeX ligand. Similarly, there is a slight change in the geometry of the metal complexes, where the ∠GeMC angles for [Ni] complexes are between 106.7° and 106.9°, for [Pd] complexes between 107.2° and 107.5°, and for [Pt] complexes between 105.9° and 106.4°. Among the substituted GeX complexes, those containing GeTe exhibit a higher natural bond orbital (NBO) contribution from the Ge atom than from the M atom. Based on these observations, it can be inferred that GeX acts as an effective sigma donor in contrast to carbonyl compounds. Energy decomposition analysis (EDA) results are reported for the M-CO bond in 1, 6, and 11 and for the M-GeX bond in the other [M(CO)3(GeX)] complexes, where M = Ni, Pd, and Pt. The percentage contributions of ΔEelstat and ΔEorb behave almost identically for all ligands in each metal complex.
METHODS Density functional theory (DFT) calculations were conducted at the B3LYP/gen/6-31G*/LanL2DZ level of theory to examine transition metal carbonyls [M(CO)4] and the corresponding terminal germanium chalcogenide complexes [M(CO)3GeX], where M represents Ni, Pd, and Pt, and X represents O, S, Se, and Te (labeled 1-15), using the Gaussian 09W and GaussView 6.0.16 software packages. Post-processing computational code such as multi-wave function was employed for results analysis and visualization.
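The two sets of angles quoted in this abstract can be cross-checked against each other with a purely geometric relation. The sketch below assumes an idealized C3v geometry, with Ge on the threefold axis and three equivalent M-C bonds; this symmetry is an assumption of the example, not a statement from the abstract, and the function name `cmc_angle_from_gemc` is hypothetical. Under that assumption, placing the three M-C bonds at polar angle β = ∠GeMC and azimuths 0°, 120°, and 240° gives cos(∠CMC) = (3cos²β − 1)/2.

```python
import math

def cmc_angle_from_gemc(gemc_deg):
    """Return the C-M-C angle (degrees) implied by a Ge-M-C angle.

    Assumption: ideal C3v symmetry (Ge on the threefold axis, three
    equivalent M-C bonds), so cos(CMC) = (3*cos(GeMC)**2 - 1) / 2.
    """
    cos_beta = math.cos(math.radians(gemc_deg))
    cos_alpha = (3.0 * cos_beta ** 2 - 1.0) / 2.0
    return math.degrees(math.acos(cos_alpha))

# Midpoints of the Ge-M-C ranges quoted in the abstract (degrees).
gemc_midpoints = {"Ni": 106.8, "Pd": 107.35, "Pt": 106.15}

for metal, gemc in gemc_midpoints.items():
    cmc = cmc_angle_from_gemc(gemc)
    print(f"{metal}: GeMC = {gemc:.2f} deg -> implied CMC = {cmc:.1f} deg")
```

With the range midpoints as input, the implied ∠CMC values come out at roughly 112.0° (Ni), 111.5° (Pd), and 112.6° (Pt), each inside the corresponding range quoted above. The ideal tetrahedral angle of about 109.47° is the fixed point of the relation, which is why a ∠GeMC below that value goes together with a ∠CMC above it.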
Affiliation(s)
- R Rameshbabu Priyadharsan
  - PG & Research Department of Chemistry, The American College (Autonomous), Madurai, Tamil Nadu, India
- Rawlings A Timothy
  - Computational and Bio-Simulation Research Group, University of Calabar, Calabar, Nigeria
- Jisha Mary Thomas
  - Department of Chemistry, Pondicherry University, Puducherry, 605014, India
- Rajendran Rajaram
  - Department of Chemistry, Madanapalle Institute of Technology and Science, Angallu (V), Madanapalle, Andhra Pradesh, 517325, India
- Hitler Louis
  - Computational and Bio-Simulation Research Group, University of Calabar, Calabar, Nigeria
  - Centre for Herbal Pharmacology and Environmental Sustainability, Chettinad Hospital and Research Institute, Chettinad Academy of Research and Education, Kelambakkam, 603103, Tamil Nadu, India
58
Liu X, Wang W, Pérez-Ríos J. Molecular dynamics-driven global potential energy surfaces: Application to the AlF dimer. J Chem Phys 2023; 159:144103. [PMID: 37811831 DOI: 10.1063/5.0169080] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/23/2023] [Accepted: 09/20/2023] [Indexed: 10/10/2023] Open
Abstract
In this work, we present a full-dimensional potential energy surface for AlF-AlF. We apply a general machine learning approach for full-dimensional potential energy surfaces, employing an active learning scheme trained on ab initio points, whose number grows with the accuracy required. The training points are selected based on molecular dynamics simulations, choosing the most suitable configurations for different collision energies and mapping the most relevant part of the potential energy landscape of the system. The present approach does not require long-range information and is entirely general. As a result, it is possible to provide the full-dimensional AlF-AlF potential energy surface, requiring ≲0.01% of the configurations to be calculated ab initio. Furthermore, we analyze the general properties of the AlF-AlF system, finding critical differences with other reported results on CaF or bi-alkali dimers.
Affiliation(s)
- Xiangyue Liu
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
- Weiqi Wang
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
- Jesús Pérez-Ríos
  - Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794, USA
  - Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794-3800, USA
59
Korolev V, Protsenko P. Accurate, interpretable predictions of materials properties within transformer language models. Patterns (New York, N.Y.) 2023; 4:100803. [PMID: 37876904 PMCID: PMC10591138 DOI: 10.1016/j.patter.2023.100803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 06/06/2023] [Accepted: 07/04/2023] [Indexed: 10/26/2023]
Abstract
Property prediction accuracy has long been a key parameter of machine learning in materials informatics. Accordingly, advanced models showing state-of-the-art performance turn into highly parameterized black boxes missing interpretability. Here, we present an elegant way to make their reasoning transparent. Human-readable text-based descriptions automatically generated within a suite of open-source tools are proposed as materials representation. Transformer language models pretrained on 2 million peer-reviewed articles take as input well-known terms such as chemical composition, crystal symmetry, and site geometry. Our approach outperforms crystal graph networks by classifying four out of five analyzed properties if one considers all available reference data. Moreover, fine-tuned text-based models show high accuracy in the ultra-small data limit. Explanations of their internal machinery are produced using local interpretability techniques and are faithful and consistent with domain expert rationales. This language-centric framework makes accurate property predictions accessible to people without artificial-intelligence expertise.
Affiliation(s)
- Vadim Korolev
  - Department of Chemistry, Lomonosov Moscow State University, 119991 Moscow, Russia
- Pavel Protsenko
  - Department of Chemistry, Lomonosov Moscow State University, 119991 Moscow, Russia
60
Zhang Y, Jiang B. Universal machine learning for the response of atomistic systems to external fields. Nat Commun 2023; 14:6424. [PMID: 37827998 PMCID: PMC10570356 DOI: 10.1038/s41467-023-42148-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/16/2023] [Accepted: 10/01/2023] [Indexed: 10/14/2023] Open
Abstract
Machine learned interatomic interaction potentials have enabled efficient and accurate molecular simulations of closed systems. However, external fields, which can greatly change the chemical structure and/or reactivity, have been seldom included in current machine learning models. This work proposes a universal field-induced recursively embedded atom neural network (FIREANN) model, which integrates a pseudo field vector-dependent feature into atomic descriptors to represent system-field interactions with rigorous rotational equivariance. This "all-in-one" approach correlates various response properties like dipole moment and polarizability with the field-dependent potential energy in a single model, very suitable for spectroscopic and dynamics simulations in molecular and periodic systems in the presence of electric fields. Especially for periodic systems, we find that FIREANN can overcome the intrinsic multiple-value issue of the polarization by training atomic forces only. These results validate the universality and capability of the FIREANN method for efficient first-principles modeling of complicated systems in strong external fields.
Affiliation(s)
- Yaolong Zhang
  - Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China
  - École Polytechnique Fédérale de Lausanne, 1015, Lausanne, Switzerland
- Bin Jiang
  - Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, 230026, China
  - Hefei National Laboratory, University of Science and Technology of China, Hefei, 230088, China
61
Hermann J, Spencer J, Choo K, Mezzacapo A, Foulkes WMC, Pfau D, Carleo G, Noé F. Ab initio quantum chemistry with neural-network wavefunctions. Nat Rev Chem 2023; 7:692-709. [PMID: 37558761 DOI: 10.1038/s41570-023-00516-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/16/2023] [Indexed: 08/11/2023]
Abstract
Deep learning methods outperform human capabilities in pattern recognition and data processing problems and now have an increasingly important role in scientific discovery. A key application of machine learning in molecular science is to learn potential energy surfaces or force fields from ab initio solutions of the electronic Schrödinger equation using data sets obtained with density functional theory, coupled cluster or other quantum chemistry (QC) methods. In this Review, we discuss a complementary approach using machine learning to aid the direct solution of QC problems from first principles. Specifically, we focus on quantum Monte Carlo methods that use neural-network ansatzes to solve the electronic Schrödinger equation, in first and second quantization, computing ground and excited states and generalizing over multiple nuclear configurations. Although still in their infancy, these methods can already generate virtually exact solutions of the electronic Schrödinger equation for small systems and rival advanced conventional QC methods for systems with up to a few dozen electrons.
Affiliation(s)
- Jan Hermann
  - Microsoft Research AI4Science, Berlin, Germany
  - FU Berlin, Department of Mathematics and Computer Science, Berlin, Germany
- Kenny Choo
  - Department of Physics, University of Zurich, Zurich, Switzerland
  - IBM Quantum, IBM Research Zurich, Ruschlikon, Switzerland
- W M C Foulkes
  - Imperial College London, Department of Physics, London, UK
- David Pfau
  - DeepMind, London, UK
  - Imperial College London, Department of Physics, London, UK
- Frank Noé
  - Microsoft Research AI4Science, Berlin, Germany
  - FU Berlin, Department of Mathematics and Computer Science, Berlin, Germany
  - FU Berlin, Department of Physics, Berlin, Germany
  - Department of Chemistry, Rice University, Houston, TX, USA
62
Li Y, Jiang JW. Vacancy defects impede the transition from peapods to diamond: a neuroevolution machine learning study. Phys Chem Chem Phys 2023; 25:25629-25638. [PMID: 37721136 DOI: 10.1039/d3cp03862a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/19/2023]
Abstract
Exploration of novel carbon allotropes has been a central subject in materials science, in which carbon peapods hold great potential as a precursor for the development of new carbon allotropes. To enable precise large-scale molecular dynamics simulations, we develop a highly accurate, low-cost machine-learned potential (MLP) for carbon materials using the neuroevolution potential framework. Based on the MLP, we investigate the structural transitions of peapod arrays under high-temperature and high-pressure conditions and reveal the impact of vacancy defects. Defects promote the transition from the ordered crystalline structure to the disordered amorphous structure in peapods at low temperatures, while impeding the transition to the ordered diamond structure. Benefiting from the accurate MLP, we are able to reproduce the experimentally observed carbon structures in numerical simulations. We build a diagram summarizing all the structures that appear in the compression simulation of peapod arrays at various temperatures. The present work not only reveals the underlying mechanism of structural transitions from carbon peapods into various functional carbon materials, but also provides a highly accurate, low-cost interatomic potential that should be valuable in the exploration of novel carbon allotropes.
Affiliation(s)
- Yu Li
- Shanghai Key Laboratory of Mechanics in Energy Engineering, Shanghai Institute of Applied Mathematics and Mechanics, Shanghai Frontier Science Center of Mechanoinformatics, School of Mechanics and Engineering Science, Shanghai University, Shanghai 200072, P. R. China.
| | - Jin-Wu Jiang
- Shanghai Key Laboratory of Mechanics in Energy Engineering, Shanghai Institute of Applied Mathematics and Mechanics, Shanghai Frontier Science Center of Mechanoinformatics, School of Mechanics and Engineering Science, Shanghai University, Shanghai 200072, P. R. China.
- Zhejiang Laboratory, Hangzhou 311100, China
|
63
|
Goscinski A, Principe VP, Fraux G, Kliavinek S, Helfrecht BA, Loche P, Ceriotti M, Cersonsky RK. scikit-matter: A Suite of Generalisable Machine Learning Methods Born out of Chemistry and Materials Science. Open Research Europe 2023; 3:81. [PMID: 38234865 PMCID: PMC10792272 DOI: 10.12688/openreseurope.15789.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Accepted: 09/01/2023] [Indexed: 08/11/2024]
Abstract
Easy-to-use libraries such as scikit-learn have accelerated the adoption and application of machine learning (ML) workflows and data-driven methods. While many of the algorithms implemented in these libraries originated in specific scientific fields, they have gained in popularity in part because of their generalisability across multiple domains. Over the past two decades, researchers in the chemical and materials science community have put forward general-purpose machine learning methods. The deployment of these methods into workflows of other domains, however, is often burdensome due to the entanglement with domain-specific functionalities. We present the Python library scikit-matter that targets domain-agnostic implementations of methods developed in the computational chemical and materials science community, following the scikit-learn API and coding guidelines to promote usability and interoperability with existing workflows.
Affiliation(s)
- Alexander Goscinski
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Victor Paul Principe
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Guillaume Fraux
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Sergei Kliavinek
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Benjamin Aaron Helfrecht
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Philip Loche
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Rose Kathleen Cersonsky
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
|
64
|
Goscinski A, Principe VP, Fraux G, Kliavinek S, Helfrecht BA, Loche P, Ceriotti M, Cersonsky RK. scikit-matter: A Suite of Generalisable Machine Learning Methods Born out of Chemistry and Materials Science. Open Research Europe 2023; 3:81. [PMID: 38234865 PMCID: PMC10792272 DOI: 10.12688/openreseurope.15789.2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Accepted: 09/01/2023] [Indexed: 01/19/2024]
Abstract
Easy-to-use libraries such as scikit-learn have accelerated the adoption and application of machine learning (ML) workflows and data-driven methods. While many of the algorithms implemented in these libraries originated in specific scientific fields, they have gained in popularity in part because of their generalisability across multiple domains. Over the past two decades, researchers in the chemical and materials science community have put forward general-purpose machine learning methods. The deployment of these methods into workflows of other domains, however, is often burdensome due to the entanglement with domain-specific functionalities. We present the Python library scikit-matter that targets domain-agnostic implementations of methods developed in the computational chemical and materials science community, following the scikit-learn API and coding guidelines to promote usability and interoperability with existing workflows.
Affiliation(s)
- Alexander Goscinski
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Victor Paul Principe
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Guillaume Fraux
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Sergei Kliavinek
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Benjamin Aaron Helfrecht
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
- Pacific Northwest National Laboratory, Richland, WA, 99352, USA
| | - Philip Loche
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
| | - Rose Kathleen Cersonsky
- Laboratory of Computational Science and Modeling (COSMO), Institute of Materials, Ecole Polytechnique Federale de Lausanne, Lausanne, Vaud, 1015, Switzerland
- Department of Chemical and Biological Engineering, University of Wisconsin-Madison, Madison, Wisconsin, 53706, USA
|
65
|
Karcz MJ, Messina L, Kawasaki E, Rajaonson S, Bathellier D, Nastar M, Schuler T, Bourasseau E. Semi-supervised generative approach to chemical disorder: application to point-defect formation in uranium-plutonium mixed oxides. Phys Chem Chem Phys 2023; 25:23069-23080. [PMID: 37605928 DOI: 10.1039/d3cp02790b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2023]
Abstract
Chemical disorder has a major impact on the characterization of the atomic-scale properties of highly complex chemical compounds, such as the properties of point defects. Due to the vast number of possible atomic configurations, the study of such properties becomes intractable if treated with direct sampling. In this work, we propose an alternative approach, in which samples are selected based on the local atomic composition around the defect, and the defect formation energy is obtained as a function of this local composition with a reduced computational cost. We apply this approach to (U, Pu)O2 nuclear fuels. The formation-energy distribution is computed using machine-learning generative methods, and used to investigate the impact of chemical disorder and the range of influence of local composition on the defect properties. The predicted distributions are then used to calculate the concentration of thermal defects. This approach allows, for the first time, the computation of the latter property with a physically meaningful exploration of the configuration space, and opens the way to a more efficient determination of physico-chemical properties in other chemically disordered compounds such as high-entropy alloys.
Affiliation(s)
- Maciej J Karcz
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
- Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France
| | - Luca Messina
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
| | - Eiji Kawasaki
- Université Paris-Saclay, CEA, LIST, F-91120, Palaiseau, France
| | - Serenah Rajaonson
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
| | - Didier Bathellier
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
| | - Maylise Nastar
- Université Paris-Saclay, CEA, Service de Recherche en Corrosion et Comportement des Matériaux, SRMP, F-91191 Gif-sur-Yvette, France
| | - Thomas Schuler
- Université Paris-Saclay, CEA, Service de Recherche en Corrosion et Comportement des Matériaux, SRMP, F-91191 Gif-sur-Yvette, France
| | - Emeric Bourasseau
- CEA, DES, IRESNE, DEC, Cadarache, F-13108 Saint-Paul-Lez-Durance, France.
|
66
|
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
| | - Karl N. Kirschner
- Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
|
67
|
Stenczel TK, El-Machachi Z, Liepuoniute G, Morrow JD, Bartók AP, Probert MIJ, Csányi G, Deringer VL. Machine-learned acceleration for molecular dynamics in CASTEP. J Chem Phys 2023; 159:044803. [PMID: 37497818 DOI: 10.1063/5.0155621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 07/03/2023] [Indexed: 07/28/2023] Open
Abstract
Machine learning (ML) methods are of rapidly growing interest for materials modeling, and yet, the use of ML interatomic potentials for new systems is often more demanding than that of established density-functional theory (DFT) packages. Here, we describe computational methodology to combine the CASTEP first-principles simulation software with the on-the-fly fitting and evaluation of ML interatomic potential models. Our approach is based on regular checking against DFT reference data, which provides a direct measure of the accuracy of the evolving ML model. We discuss the general framework and the specific solutions implemented, and we present an example application to high-temperature molecular-dynamics simulations of carbon nanostructures. The code is freely available for academic research.
Affiliation(s)
- Tamás K Stenczel
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| | - Zakariya El-Machachi
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| | - Guoda Liepuoniute
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| | - Joe D Morrow
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| | - Albert P Bartók
- Department of Physics, University of Warwick, Coventry CV4 7AL, United Kingdom
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Matt I J Probert
- School of Physics, Engineering and Technology, University of York, York YO10 5DD, United Kingdom
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| | - Volker L Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
|
68
|
Kirschbaum T, von Seggern B, Dzubiella J, Bande A, Noé F. Machine Learning Frontier Orbital Energies of Nanodiamonds. J Chem Theory Comput 2023; 19:4461-4473. [PMID: 37053438 DOI: 10.1021/acs.jctc.2c01275] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 04/15/2023]
Abstract
Nanodiamonds have a wide range of applications including catalysis, sensing, tribology, and biomedicine. To leverage nanodiamond design via machine learning, we introduce the new data set ND5k, consisting of 5089 diamondoid and nanodiamond structures and their frontier orbital energies. ND5k structures are optimized via tight-binding density functional theory (DFTB) and their frontier orbital energies are computed using density functional theory (DFT) with the PBE0 hybrid functional. From this data set we derive a qualitative design suggestion for nanodiamonds in photocatalysis. We also compare recent machine learning models for predicting frontier orbital energies of structures similar to those they were trained on (interpolation on ND5k), and we test their ability to extrapolate predictions to larger structures. For both the interpolation and extrapolation tasks, we find the best performance using the equivariant message passing neural network PaiNN. The second-best results are achieved with a message passing neural network using a tailored set of atomic descriptors proposed here.
Affiliation(s)
- Thorren Kirschbaum
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
| | - Börries von Seggern
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
- Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Arnimallee 22, 14195 Berlin, Germany
| | - Joachim Dzubiella
- Institute of Physics, Albert-Ludwigs-Universität Freiburg, Hermann-Herder-Straße 3, 79104 Freiburg im Breisgau, Germany
| | - Annika Bande
- Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Hahn-Meitner-Platz 1, 14109 Berlin, Germany
| | - Frank Noé
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, 10178 Berlin, Germany
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195 Berlin, Germany
- Department of Chemistry, Rice University, 6100 Main Street, Houston, Texas 77005, United States
|
69
|
Crippa M, Cardellini A, Caruso C, Pavan GM. Detecting dynamic domains and local fluctuations in complex molecular systems via timelapse neighbors shuffling. Proc Natl Acad Sci U S A 2023; 120:e2300565120. [PMID: 37467266 PMCID: PMC10372573 DOI: 10.1073/pnas.2300565120] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/11/2023] [Accepted: 05/25/2023] [Indexed: 07/21/2023] Open
Abstract
It is known that the behavior of many complex systems is controlled by local dynamic rearrangements or fluctuations occurring within them. Complex molecular systems, composed of many molecules interacting with each other in a Brownian storm, are no exception. Despite the rise of machine learning and of sophisticated structural descriptors, detecting local fluctuations and collective transitions in complex dynamic ensembles often remains difficult. Here, we show a machine learning framework based on a descriptor which we name Local Environments and Neighbors Shuffling (LENS), that allows identifying dynamic domains and detecting local fluctuations in a variety of systems in an abstract and efficient way. By tracking how much the microscopic surrounding of each molecular unit changes over time in terms of neighbor individuals, LENS allows characterizing the global (macroscopic) dynamics of molecular systems in phase transition, in phase coexistence, as well as of systems intrinsically characterized by local fluctuations (e.g., defects). Statistical analysis of the LENS time series data extracted from molecular dynamics trajectories of, for example, liquid-like, solid-like, or dynamically diverse complex molecular systems allows tracking in an efficient way the presence of different dynamic domains and of local fluctuations emerging within them. The approach is found to be robust, versatile, and applicable independently of the features of the system, provided simply that a trajectory containing information on the relative motion of the interacting units is available. We envisage that "such a LENS" will constitute a precious basis for exploring the dynamic complexity of a variety of systems and, given its abstract definition, not necessarily of molecular ones.
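The neighbor-tracking idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `neighbor_sets` and `neighbor_shuffling`, the plain distance cutoff without periodic boundaries, and the normalisation (symmetric difference of the two neighbor sets divided by the sum of their sizes) are all assumptions made here for illustration.

```python
import numpy as np


def neighbor_sets(positions, cutoff):
    """For each particle, return the set of indices of other particles within `cutoff`.

    Assumption: open boundaries (no periodic images) and a plain Euclidean cutoff.
    """
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    n = len(positions)
    not_self = ~np.eye(n, dtype=bool)
    return [set(np.flatnonzero((dist[i] < cutoff) & not_self[i]).tolist())
            for i in range(n)]


def neighbor_shuffling(trajectory, cutoff):
    """Per-particle neighbor-change signal between consecutive frames.

    trajectory: array of shape (n_frames, n_particles, 3).
    Returns an array of shape (n_frames - 1, n_particles) with values in [0, 1]:
    0 means the neighbor identities did not change, 1 means they changed completely.
    """
    sets = [neighbor_sets(frame, cutoff) for frame in trajectory]
    n_frames, n_particles = trajectory.shape[:2]
    signal = np.zeros((n_frames - 1, n_particles))
    for t in range(n_frames - 1):
        for i in range(n_particles):
            before, after = sets[t][i], sets[t + 1][i]
            total = len(before) + len(after)
            if total > 0:  # a particle with no neighbors in either frame stays at 0
                # Neighbors gained or lost, normalised by the hypothetical choice above.
                signal[t, i] = len(before ^ after) / total
    return signal
```

Calling `neighbor_shuffling(trajectory, cutoff)` on a trajectory array gives one time series per particle; clustering or thresholding those series is one way the "dynamic domains" mentioned above could be separated, though the choice of analysis is left open here.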
Affiliation(s)
- Martina Crippa
- Department of Applied Science and Technology, Politecnico di Torino, Torino 10129, Italy
| | - Annalisa Cardellini
- Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Lugano-Viganello 6962, Switzerland
| | - Cristina Caruso
- Department of Applied Science and Technology, Politecnico di Torino, Torino 10129, Italy
| | - Giovanni M. Pavan
- Department of Applied Science and Technology, Politecnico di Torino, Torino 10129, Italy
- Department of Innovative Technologies, University of Applied Sciences and Arts of Southern Switzerland, Lugano-Viganello 6962, Switzerland
|
70
|
Darby JP, Kovács DP, Batatia I, Caro MA, Hart GLW, Ortner C, Csányi G. Tensor-Reduced Atomic Density Representations. Physical Review Letters 2023; 131:028001. [PMID: 37505943 DOI: 10.1103/physrevlett.131.028001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 04/18/2023] [Indexed: 07/30/2023]
Abstract
Density-based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modeling, and the visualization and analysis of material datasets. The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed dimensional space in a learnable way. By exploiting symmetry, we recast this approach as tensor factorization of the standard neighbour-density-based descriptors and, using a new notation, identify connections to existing compression algorithms. In doing so, we form a compact tensor-reduced representation of the local atomic environment whose size does not depend on the number of chemical elements, is systematically convergable, and therefore remains applicable to a wide range of data analysis and regression tasks.
Affiliation(s)
- James P Darby
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Dávid P Kovács
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Ilyes Batatia
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
- ENS Paris-Saclay, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
| | - Miguel A Caro
- Department of Electrical Engineering and Automation, Aalto University, FIN-02150 Espoo, Finland
| | - Gus L W Hart
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, 84602, USA
| | - Christoph Ortner
- Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia, Canada V6T 1Z2
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
|
71
|
Chapman J, Hsu T, Chen X, Heo TW, Wood BC. Quantifying disorder one atom at a time using an interpretable graph neural network paradigm. Nat Commun 2023; 14:4030. [PMID: 37419927 DOI: 10.1038/s41467-023-39755-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Accepted: 06/26/2023] [Indexed: 07/09/2023] Open
Abstract
Quantifying the level of atomic disorder within materials is critical to understanding how evolving local structural environments dictate performance and durability. Here, we leverage graph neural networks to define a physically interpretable metric for local disorder, called SODAS. This metric encodes the diversity of the local atomic configurations as a continuous spectrum between the solid and liquid phases, quantified against a distribution of thermal perturbations. We apply this methodology to four prototypical examples with varying levels of disorder: (1) grain boundaries, (2) solid-liquid interfaces, (3) polycrystalline microstructures, and (4) tensile failure/fracture. We also compare SODAS to several commonly used methods. Using elemental aluminum as a case study, we show how our paradigm can track the spatio-temporal evolution of interfaces, incorporating a mathematically defined description of the spatial boundary between order and disorder. We further show how to extract physics-preserved gradients from our continuous disorder fields, which may be used to understand and predict materials performance and failure. Overall, our framework provides a simple and generalizable pathway to quantify the relationship between complex local atomic structure and coarse-grained materials phenomena.
Affiliation(s)
- James Chapman
- Department of Mechanical Engineering, Boston University, Boston, MA, USA.
- Materials Science Division, Lawrence Livermore National Laboratory, Livermore, CA, USA.
| | - Tim Hsu
- Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA.
| | - Xiao Chen
- Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Tae Wook Heo
- Materials Science Division, Lawrence Livermore National Laboratory, Livermore, CA, USA
| | - Brandon C Wood
- Materials Science Division, Lawrence Livermore National Laboratory, Livermore, CA, USA.
|
72
|
Kabylda A, Vassilev-Galindo V, Chmiela S, Poltavsky I, Tkatchenko A. Efficient interatomic descriptors for accurate machine learning force fields of extended molecules. Nat Commun 2023; 14:3562. [PMID: 37322039 DOI: 10.1038/s41467-023-39214-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 05/17/2023] [Indexed: 06/17/2023] Open
Abstract
Machine learning force fields (MLFFs) are gradually evolving towards enabling molecular dynamics simulations of molecules and materials with ab initio accuracy but at a small fraction of the computational cost. However, several challenges remain to be addressed to enable predictive MLFF simulations of realistic molecules, including: (1) developing efficient descriptors for non-local interatomic interactions, which are essential to capture long-range molecular fluctuations, and (2) reducing the dimensionality of the descriptors to enhance the applicability and interpretability of MLFFs. Here we propose an automatized approach to substantially reduce the number of interatomic descriptor features while preserving the accuracy and increasing the efficiency of MLFFs. To simultaneously address the two stated challenges, we illustrate our approach on the example of the global GDML MLFF. We found that non-local features (atoms separated by as far as 15 Å in studied systems) are crucial to retain the overall accuracy of the MLFF for peptides, DNA base pairs, fatty acids, and supramolecular complexes. Interestingly, the number of required non-local features in the reduced descriptors becomes comparable to the number of local interatomic features (those below 5 Å). These results pave the way to constructing global molecular MLFFs whose cost increases linearly, instead of quadratically, with system size.
Affiliation(s)
- Adil Kabylda
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
|
73
|
Patel R, Colmenares S, Webb MA. Sequence Patterning, Morphology, and Dispersity in Single-Chain Nanoparticles: Insights from Simulation and Machine Learning. ACS Polymers Au 2023; 3:284-294. [PMID: 37334192 PMCID: PMC10273411 DOI: 10.1021/acspolymersau.3c00007] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 03/02/2023] [Revised: 05/15/2023] [Accepted: 05/15/2023] [Indexed: 06/20/2023]
Abstract
Single-chain nanoparticles (SCNPs) are intriguing materials inspired by proteins that consist of a single precursor polymer chain that has collapsed into a stable structure. In many prospective applications, such as catalysis, the utility of a single-chain nanoparticle will intricately depend on the formation of a mostly specific structure or morphology. However, it is not generally well understood how to reliably control the morphology of single-chain nanoparticles. To address this knowledge gap, we simulate the formation of 7680 distinct single-chain nanoparticles from precursor chains that span a wide range of, in principle, tunable patterning characteristics of cross-linking moieties. Using a combination of molecular simulation and machine learning analyses, we show how the overall fraction of functionalization and blockiness of cross-linking moieties biases the formation of certain local and global morphological characteristics. Importantly, we illustrate and quantify the dispersity of morphologies that arise due to the stochastic nature of collapse from a well-defined sequence as well as from the ensemble of sequences that correspond to a given specification of precursor parameters. Moreover, we also examine the efficacy of precise sequence control in achieving morphological outcomes in different regimes of precursor parameters. Overall, this work critically assesses how precursor chains might be feasibly tailored to achieve given SCNP morphologies and provides a platform to pursue future sequence-based design.
|
74
|
Staub R, Gantzer P, Harabuchi Y, Maeda S, Varnek A. Challenges for Kinetics Predictions via Neural Network Potentials: A Wilkinson's Catalyst Case. Molecules 2023; 28:4477. [PMID: 37298952 DOI: 10.3390/molecules28114477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 05/23/2023] [Accepted: 05/26/2023] [Indexed: 06/12/2023] Open
Abstract
Ab initio kinetic studies are important to understand and design novel chemical reactions. While the Artificial Force Induced Reaction (AFIR) method provides a convenient and efficient framework for kinetic studies, accurate explorations of reaction path networks incur high computational costs. In this article, we are investigating the applicability of Neural Network Potentials (NNP) to accelerate such studies. For this purpose, we are reporting a novel theoretical study of ethylene hydrogenation with a transition metal complex inspired by Wilkinson's catalyst, using the AFIR method. The resulting reaction path network was analyzed by the Generative Topographic Mapping method. The network's geometries were then used to train a state-of-the-art NNP model, to replace expensive ab initio calculations with fast NNP predictions during the search. This procedure was applied to run the first NNP-powered reaction path network exploration using the AFIR method. We discovered that such explorations are particularly challenging for general purpose NNP models, and we identified the underlying limitations. In addition, we are proposing to overcome these challenges by complementing NNP models with fast semiempirical predictions. The proposed solution offers a generally applicable framework, laying the foundations to further accelerate ab initio kinetic studies with Machine Learning Force Fields, and ultimately explore larger systems that are currently inaccessible.
Affiliation(s)
- Ruben Staub
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21, Nishi 10, Kita-ku, Sapporo 001-0021, Japan
| | - Philippe Gantzer
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21, Nishi 10, Kita-ku, Sapporo 001-0021, Japan
| | - Yu Harabuchi
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21, Nishi 10, Kita-ku, Sapporo 001-0021, Japan
- Japan Science and Technology Agency (JST), ERATO Maeda Artificial Intelligence in Chemical Reaction Design and Discovery Project, Kita 10, Nishi 8, Kita-ku, Sapporo 060-0810, Japan
- Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10, Nishi 8, Kita-ku, Sapporo 060-0810, Japan
| | - Satoshi Maeda
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21, Nishi 10, Kita-ku, Sapporo 001-0021, Japan
- Japan Science and Technology Agency (JST), ERATO Maeda Artificial Intelligence in Chemical Reaction Design and Discovery Project, Kita 10, Nishi 8, Kita-ku, Sapporo 060-0810, Japan
- Department of Chemistry, Faculty of Science, Hokkaido University, Kita 10, Nishi 8, Kita-ku, Sapporo 060-0810, Japan
- Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), 1-1 Namiki, Tsukuba 305-0044, Japan
| | - Alexandre Varnek
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21, Nishi 10, Kita-ku, Sapporo 001-0021, Japan
- Laboratory of Chemoinformatics, UMR 7140, CNRS, University of Strasbourg, 67081 Strasbourg, France
75
Aithani L, Alcaide E, Bartunov S, Cooper CDO, Doré AS, Lane TJ, Maclean F, Rucktooa P, Shaw RA, Skerratt SE. Advancing structural biology through breakthroughs in AI. Curr Opin Struct Biol 2023; 80:102601. [PMID: 37182397 DOI: 10.1016/j.sbi.2023.102601] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 11/24/2022] [Revised: 03/06/2023] [Accepted: 04/03/2023] [Indexed: 05/16/2023]
Abstract
The past century has witnessed an exponential increase in our atomic-level understanding of molecular and cellular mechanisms from a structural perspective, with multiple landmark achievements contributing to the field. This, coupled with recent and continuing breakthroughs in artificial intelligence methods such as AlphaFold2, and enhanced computational power, is enabling our understanding of protein structure and function at unprecedented levels of accuracy and predictivity. Here, we describe some of the major recent advances across these fields, and describe, as these technologies coalesce, the potential to utilise our enhanced knowledge of intricate cellular and molecular systems to discover novel therapeutics to alleviate human suffering.
Affiliation(s)
- Laksh Aithani
- CHARM Therapeutics Ltd., The Stanley Building, 7 St. Pancras Square, London, N1C 4AG, UK.
| | - Eric Alcaide
- CHARM Therapeutics Ltd., The Stanley Building, 7 St. Pancras Square, London, N1C 4AG, UK
| | - Sergey Bartunov
- CHARM Therapeutics Ltd., The Stanley Building, 7 St. Pancras Square, London, N1C 4AG, UK
| | - Christopher D O Cooper
- CHARM Therapeutics Ltd., B900, Babraham Research Campus, Babraham, Cambridge, CB22 3AT, UK
| | - Andrew S Doré
- CHARM Therapeutics Ltd., B900, Babraham Research Campus, Babraham, Cambridge, CB22 3AT, UK
| | - Thomas J Lane
- CHARM Therapeutics Ltd., B900, Babraham Research Campus, Babraham, Cambridge, CB22 3AT, UK
| | - Finlay Maclean
- CHARM Therapeutics Ltd., The Stanley Building, 7 St. Pancras Square, London, N1C 4AG, UK
| | - Prakash Rucktooa
- CHARM Therapeutics Ltd., B900, Babraham Research Campus, Babraham, Cambridge, CB22 3AT, UK
| | - Robert A Shaw
- CHARM Therapeutics Ltd., The Stanley Building, 7 St. Pancras Square, London, N1C 4AG, UK
| | - Sarah E Skerratt
- CHARM Therapeutics Ltd., B900, Babraham Research Campus, Babraham, Cambridge, CB22 3AT, UK.
76
Guidarelli Mattioli F, Sciortino F, Russo J. Are Neural Network Potentials Trained on Liquid States Transferable to Crystal Nucleation? A Test on Ice Nucleation in the mW Water Model. J Phys Chem B 2023; 127:3894-3901. [PMID: 37075256 PMCID: PMC10165654 DOI: 10.1021/acs.jpcb.3c00693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Revised: 04/06/2023] [Indexed: 04/21/2023]
Abstract
Neural network potentials (NNPs) are increasingly being used to study processes that happen on long time scales. A typical example is crystal nucleation, whose rate is controlled by the occurrence of a rare fluctuation, i.e., the appearance of the critical nucleus. Because the properties of this nucleus are far from those of the bulk crystal, it is not yet clear whether NNPs trained on equilibrium liquid states can accurately describe nucleation processes. So far, nucleation studies with NNPs have been limited to ab initio models whose nucleation properties are unknown, preventing an accurate comparison. Here we train an NNP on the mW model of water, a classical three-body potential whose nucleation time scale is accessible in standard simulations. We show that an NNP trained only on a small number of liquid state points can reproduce with great accuracy the nucleation rates and free energy barriers of the original model, computed from both spontaneous and biased trajectories, strongly supporting the use of NNPs to study nucleation events.
Affiliation(s)
| | | | - John Russo
- Sapienza University of Rome, Piazzale Aldo Moro 2, 00185 Rome, Italy
77
Shayesteh Zadeh A, Khan SA, Vandervelden C, Peters B. Site-Averaged Ab Initio Kinetics: Importance Learning for Multistep Reactions on Amorphous Supports. J Chem Theory Comput 2023; 19:2873-2886. [PMID: 37093705 DOI: 10.1021/acs.jctc.3c00160] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/25/2023]
Abstract
Single-atom centers on amorphous supports include catalysts for polymerization, partial oxidation, metathesis, hydrogenolysis, and more. The disordered environment makes each site different, and the kinetics exponentially magnifies these differences to make ab initio site-averaged kinetics calculations extremely difficult. This work extends the importance learning algorithm for efficient and precise site-averaged kinetics estimates to ab initio calculations and multistep reaction mechanisms. Specifically, we calculate site-averaged proton transfer relaxation rates on an ensemble of cluster models representing Brønsted acid sites on silica-alumina. We include direct and water-assisted proton transfer pathways and simultaneously estimate the water adsorption and activation enthalpies for forward and backward proton transfers. We use density functional theory (DFT) to obtain a site-averaged rate, somewhat like a turnover frequency, for the proton transfer relaxation rate. Finally, we show that importance learning can provide orders-of-magnitude acceleration over standard sampling methods for site-averaged rate calculations in cases where the rate is dominated by a few highly active sites.
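The claim that the kinetics exponentially magnifies site-to-site differences can be illustrated with a minimal sketch. It assumes simple Arrhenius rates with a common prefactor; the barrier values, the prefactor, and the function name `site_averaged_rate` are hypothetical and are not taken from the paper, and the sketch does not implement the importance learning algorithm itself.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)

def site_averaged_rate(barriers_kj_mol, temperature_k, prefactor=1.0e13):
    """Return the arithmetic mean of Arrhenius rates and the per-site rates.

    Assumption: every site shares the same prefactor (in s^-1).
    """
    rates = [prefactor * math.exp(-ea / (R * temperature_k))
             for ea in barriers_kj_mol]
    return sum(rates) / len(rates), rates

# Hypothetical activation barriers (kJ/mol); one site is unusually active.
barriers = [60.0, 95.0, 100.0, 105.0, 110.0]
mean_rate, rates = site_averaged_rate(barriers, temperature_k=300.0)

# Fraction of the total rate contributed by the single most active site.
top_share = max(rates) / sum(rates)
print(f"site-averaged rate: {mean_rate:.3e} s^-1")
print(f"share from most active site: {top_share:.6f}")
```

With these made-up numbers, one site out of five carries essentially the whole average, which is the regime in which uniform sampling of sites becomes inefficient.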
Affiliation(s)
- Armin Shayesteh Zadeh
- Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
| | - Salman A Khan
- Delaware Energy Institute (DEI), University of Delaware, Newark, Delaware 19711, United States
| | | | - Baron Peters
- Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
78
Jinnouchi R, Minami S, Karsai F, Verdi C, Kresse G. Proton Transport in Perfluorinated Ionomer Simulated by Machine-Learned Interatomic Potential. J Phys Chem Lett 2023; 14:3581-3588. [PMID: 37018477 DOI: 10.1021/acs.jpclett.3c00293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Polymers are a class of materials that are highly challenging to treat with first-principles methods. Here, we present an application of machine-learned interatomic potentials to predict structural and dynamical properties of dry and hydrated perfluorinated ionomers. An improved active-learning algorithm using a small number of descriptors allows the efficient construction of an accurate and transferable model for this multielemental amorphous polymer. Molecular dynamics simulations accelerated by the machine-learned potentials accurately reproduce the heterogeneous hydrophilic and hydrophobic domains formed in this material as well as proton and water diffusion coefficients under a variety of humidity conditions. Our results reveal pronounced contributions of Grotthuss chains consisting of two to three water molecules to the high proton mobility under strongly humidified conditions.
Affiliation(s)
- Ryosuke Jinnouchi
- Toyota Central R&D Laboratories, Inc., 41-1 Yokomichi, Nagakute, Aichi 480-1192, Japan
| | - Saori Minami
- Toyota Central R&D Laboratories, Inc., 41-1 Yokomichi, Nagakute, Aichi 480-1192, Japan
| | - Ferenc Karsai
- VASP Software GmbH, Sensengasse 8, 1090 Vienna, Austria
| | - Carla Verdi
- University of Vienna, Faculty of Physics, Computational Materials Physics, Kolingasse 14-16, 1090 Vienna, Austria
| | - Georg Kresse
- VASP Software GmbH, Sensengasse 8, 1090 Vienna, Austria
- University of Vienna, Faculty of Physics, Computational Materials Physics, Kolingasse 14-16, 1090 Vienna, Austria
79
Durumeric AEP, Charron NE, Templeton C, Musil F, Bonneau K, Pasos-Trejo AS, Chen Y, Kelkar A, Noé F, Clementi C. Machine learned coarse-grained protein force-fields: Are we there yet? Curr Opin Struct Biol 2023; 79:102533. [PMID: 36731338 PMCID: PMC10023382 DOI: 10.1016/j.sbi.2023.102533] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Received: 11/01/2022] [Revised: 12/14/2022] [Accepted: 12/18/2022] [Indexed: 02/04/2023]
Abstract
The successful recent application of machine learning methods to scientific problems includes the learning of flexible and accurate atomic-level force-fields for materials and biomolecules from quantum chemical data. In parallel, the machine learning of force-fields at coarser resolutions is rapidly gaining relevance as an efficient way to represent the higher-body interactions needed in coarse-grained force-fields to compensate for the omitted degrees of freedom. Coarse-grained models are important for the study of systems at time and length scales exceeding those of atomistic simulations. However, the development of transferable coarse-grained models via machine learning still presents significant challenges. Here, we discuss recent developments in this field and current efforts to address the remaining challenges.
Affiliation(s)
- Aleksander E P Durumeric
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Nicholas E Charron
- Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA
| | - Clark Templeton
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/pbrun03
| | - Félix Musil
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/FelixMusil
| | - Klara Bonneau
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Aldo S Pasos-Trejo
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/sayeg84
| | - Yaoyi Chen
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany. https://twitter.com/hello_yaoyi
| | - Atharva Kelkar
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany
| | - Frank Noé
- Microsoft Research AI4Science, Karl-Liebknecht Str. 32, Berlin, 10178, Berlin, Germany; Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA. https://twitter.com/FrankNoeBerlin
| | - Cecilia Clementi
- Department of Physics, Freie Universität Berlin, Arnimallee 12, 14195, Berlin, Germany; Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Chemistry, Rice University, 6100 Main Street, Houston, 77005, Texas, USA; Department of Physics and Astronomy, Rice University, 6100 Main Street, Houston, 77005, Texas, USA.
80
Sahre MJ, von Rudorff GF, von Lilienfeld OA. Quantum Alchemy Based Bonding Trends and Their Link to Hammett's Equation and Pauling's Electronegativity Model. J Am Chem Soc 2023; 145:5899-5908. [PMID: 36862462 DOI: 10.1021/jacs.2c13393] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/03/2023]
Abstract
We present an intuitive and general analytical approximation estimating the energy of covalent single and double bonds between participating atoms in terms of their respective nuclear charges with just three parameters, $E_{AB} \approx a - b Z_A Z_B + c\,(Z_A^{7/3} + Z_B^{7/3})$. The functional form of our expression models an alchemical atomic energy decomposition between participating atoms A and B. After calibration, reasonably accurate bond dissociation energy estimates are obtained for hydrogen-saturated diatomics composed of p-block elements coming from the same row 2 ≤ n ≤ 4 in the periodic table. Corresponding changes in bond dissociation energies due to substitution of atom B by C can be obtained via simple formulas. While being of different functional form and origin, our model is as simple and accurate as Pauling's well-known electronegativity model. Analysis indicates that the model's response in covalent bonding to variation in nuclear charge is near-linear, which is consistent with Hammett's equation.
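As a worked illustration of how a substitution formula follows from the three-parameter expression, the difference between bonds A-C and A-B can be written out directly. This is a sketch that assumes the same calibrated parameters $a$, $b$, $c$ apply to both bonds; the formulas given in the paper itself may be stated differently.

```latex
\begin{align}
E_{AB} &\approx a - b\,Z_A Z_B + c\left(Z_A^{7/3} + Z_B^{7/3}\right), \\
E_{AC} &\approx a - b\,Z_A Z_C + c\left(Z_A^{7/3} + Z_C^{7/3}\right), \\
\Delta E_{B \to C} = E_{AC} - E_{AB}
       &\approx -\,b\,Z_A\left(Z_C - Z_B\right)
          + c\left(Z_C^{7/3} - Z_B^{7/3}\right).
\end{align}
```

Under that assumption the constant $a$ and the $Z_A^{7/3}$ term cancel, so the change depends on atom A only through the product term $b\,Z_A (Z_C - Z_B)$.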
Affiliation(s)
- Michael J Sahre
- Faculty of Physics, University of Vienna, Vienna, 1090, Austria; Vienna Doctoral School in Chemistry (DoSChem), University of Vienna, Vienna, 1090, Austria
| | | | - O Anatole von Lilienfeld
- Vector Institute for Artificial Intelligence, Toronto, M5S 1M1, Canada; Departments of Chemistry, Materials Science and Engineering, and Physics, University of Toronto, St. George Campus, Toronto, M5R 0A3, Canada; Machine Learning Group, Technische Universität Berlin and Institute for the Foundations of Learning and Data, Berlin, 10587, Germany
81
Guidarelli Mattioli F, Sciortino F, Russo J. A neural network potential with self-trained atomic fingerprints: A test with the mW water potential. J Chem Phys 2023; 158:104501. [PMID: 36922151 DOI: 10.1063/5.0139245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023] Open
Abstract
We present a neural network (NN) potential based on a new set of atomic fingerprints built upon two- and three-body contributions that probe distances and local orientational order, respectively. Compared with the existing NN potentials, the atomic fingerprints depend on a small set of tunable parameters that are trained together with the NN weights. In addition to simplifying the selection of the atomic fingerprints, this strategy can also considerably increase the overall accuracy of the network representation. To tackle the simultaneous training of the atomic fingerprint parameters and NN weights, we adopt an annealing protocol that progressively cycles the learning rate, significantly improving the accuracy of the NN potential. We test the performance of the network potential against the mW model of water, which is a classical three-body potential that captures the anomalies of the liquid phase well. Trained on just three state points, the NN potential is able to reproduce the mW model over a very wide range of densities and temperatures, from negative pressures to several GPa, capturing the transition from an open random tetrahedral network to a dense interpenetrated network. The NN potential also accurately reproduces properties for which it was not explicitly trained, such as dynamical properties and the structure of the stable crystalline phases of mW.
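One common way to cycle a learning rate is a cosine schedule with restarts whose peak shrinks from cycle to cycle. The sketch below is only an illustration of that general idea: the abstract does not specify the authors' annealing protocol, and the function name `cyclic_lr` and every parameter value are hypothetical.

```python
import math

def cyclic_lr(step, base_lr=1e-3, min_lr=1e-5, cycle_len=100, decay=0.5):
    """Cosine learning-rate schedule with restarts and a decaying peak.

    Assumption: each cycle starts at a peak that is `decay` times the
    previous peak and falls toward `min_lr` along a half cosine.
    """
    cycle, pos = divmod(step, cycle_len)
    peak = max(min_lr, base_lr * decay ** cycle)
    return min_lr + 0.5 * (peak - min_lr) * (1.0 + math.cos(math.pi * pos / cycle_len))

# Learning rate at the start of the first three cycles and just before a restart.
for s in (0, 99, 100, 200):
    print(s, f"{cyclic_lr(s):.2e}")
```

The restart at each cycle boundary briefly raises the learning rate again, which is the usual motivation for such schedules when weights and descriptor parameters are optimized together.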
Affiliation(s)
| | | | - John Russo
- Sapienza University of Rome, Piazzale Aldo Moro 2, 00185 Rome, Italy
82
Schmitz G, Schnieder B. Adaptive regularized Gaussian process regression for application in the context of hydrogen adsorption on graphene sheets. J Comput Chem 2023; 44:732-744. [PMID: 36382688 DOI: 10.1002/jcc.27035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2022] [Revised: 09/21/2022] [Accepted: 09/28/2022] [Indexed: 11/17/2022]
Abstract
We present a Gaussian process regression (GPR) scheme with adaptive regularization, applied to the QM7 and QM9 test sets, several protonated water clusters and specifically to the problem of atomic hydrogen adsorption on graphene sheets. For the last system our goal is to achieve good predictive accuracy with only a few training points. Therefore, we assess for these systems a self-correcting multilayer GPR model, in which the prediction is corrected by a chain of additional GPR models. In our adaptive regularization scheme, we impose no noise on the training data, but use an approach based on the data itself to account for its impurity. The strength of this strategy is that the data points are treated differently based on their importance and that the regularization can still be controlled by a single parameter. We assess how the accuracy of the prediction depends on this parameter. We show that the new regularization scheme as well as the multilayer approach result in more robust predictors. Furthermore, we demonstrate that the predictor can be in good agreement with the density-functional theory results.
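The idea of treating data points differently while keeping a single control parameter can be sketched as GPR with a per-point regularization vector on the kernel diagonal. This is a minimal illustration under stated assumptions: the abstract does not say how the authors derive their point-wise weights, so the `weights` array below is a hypothetical placeholder, as are the function names, the RBF kernel choice, and the toy data.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length_scale**2)

def gpr_predict(x_train, y_train, x_test, point_reg, length_scale=1.0):
    """GPR mean prediction with one regularization value per training point."""
    k_tt = rbf_kernel(x_train, x_train, length_scale) + np.diag(point_reg)
    k_st = rbf_kernel(x_test, x_train, length_scale)
    alpha = np.linalg.solve(k_tt, y_train)
    return k_st @ alpha

# Toy data: samples of sin(x).
x = np.linspace(0.0, 2.0 * np.pi, 12)
y = np.sin(x)

# Single global strength `lam`, scaled by hypothetical per-point weights.
lam = 1e-6
weights = np.ones_like(x)
weights[[0, -1]] = 10.0  # placeholder: regularize the end points more strongly
point_reg = lam * weights

x_new = np.array([1.0, 2.5])
y_pred = gpr_predict(x, y, x_new, point_reg)
print(y_pred, np.sin(x_new))
```

Changing `lam` rescales the whole regularization vector at once, which mirrors the single-parameter control described in the abstract without claiming to reproduce the authors' weighting rule.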
Affiliation(s)
- Gunnar Schmitz
- Theoretische Chemie, Ruhr-Universität Bochum, Bochum, Germany
83
Kotobi A, Schwob L, Vonbun-Feldbauer GB, Rossi M, Gasparotto P, Feiler C, Berden G, Oomens J, Oostenrijk B, Scuderi D, Bari S, Meißner RH. Reconstructing the infrared spectrum of a peptide from representative conformers of the full canonical ensemble. Commun Chem 2023; 6:46. [PMID: 36869192 PMCID: PMC9984374 DOI: 10.1038/s42004-023-00835-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Accepted: 02/08/2023] [Indexed: 03/05/2023] Open
Abstract
Leucine enkephalin (LeuEnk), a biologically active endogenous opioid pentapeptide, has been under intense investigation because it is small enough to allow efficient use of sophisticated computational methods and large enough to provide insights into low-lying minima of its conformational space. Here, we reproduce and interpret experimental infrared (IR) spectra of this model peptide in the gas phase using a combination of replica-exchange molecular dynamics simulations, machine learning, and ab initio calculations. In particular, we evaluate the possibility of averaging representative structural contributions to obtain an accurate computed spectrum that accounts for the corresponding canonical ensemble of the real experimental situation. Representative conformers are identified by partitioning the conformational phase space into subensembles of similar conformers. The IR contribution of each representative conformer is calculated ab initio and weighted according to the population of each cluster. Convergence of the averaged IR signal is rationalized by merging contributions in a hierarchical clustering and by comparison to IR multiple photon dissociation experiments. The improvements achieved by decomposing clusters containing similar conformations into even smaller subensembles are strong evidence that a thorough assessment of the conformational landscape and the associated hydrogen bonding is a prerequisite for deciphering important fingerprints in experimental spectroscopic data.
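The population-weighted averaging step described above can be sketched as follows. All numbers are made up for illustration: the peak positions, intensities, cluster populations, the Lorentzian broadening, and the function names are assumptions, not values or choices from the paper.

```python
import numpy as np

def lorentzian(freq, center, width=10.0):
    """Normalized Lorentzian line shape with half width `width` (cm^-1)."""
    return (width / np.pi) / ((freq - center) ** 2 + width**2)

def ensemble_spectrum(freq, cluster_peaks, populations):
    """Average the spectra of cluster representatives, weighted by population."""
    weights = np.asarray(populations, dtype=float)
    weights = weights / weights.sum()
    spectra = np.array([
        sum(intensity * lorentzian(freq, center) for center, intensity in peaks)
        for peaks in cluster_peaks
    ])
    return weights @ spectra

freq = np.linspace(1000.0, 2000.0, 501)  # frequency grid in cm^-1
# Hypothetical (frequency, intensity) pairs for three representative conformers.
cluster_peaks = [
    [(1650.0, 1.0), (1520.0, 0.6)],
    [(1700.0, 0.8)],
    [(1600.0, 0.5), (1450.0, 0.3)],
]
populations = [120, 60, 20]  # hypothetical number of frames in each cluster

spectrum = ensemble_spectrum(freq, cluster_peaks, populations)
print(freq[np.argmax(spectrum)])
```

Splitting a cluster into smaller subensembles corresponds to replacing one row of `cluster_peaks` with several rows whose populations sum to the original one.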
Affiliation(s)
- Amir Kotobi
- Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
| | - Lucas Schwob
- Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany.
| | | | - Mariana Rossi
- Max Planck Institute for the Structure and Dynamics of Matter, Hamburg, Germany
| | - Piero Gasparotto
- Scientific Computing Division, Paul Scherrer Institute, Villigen, Switzerland
| | - Christian Feiler
- Helmholtz-Zentrum Hereon, Institute of Surface Science, Geesthacht, Germany
| | - Giel Berden
- Radboud University, Institute for Molecules and Materials, FELIX Laboratory, Nijmegen, The Netherlands
| | - Jos Oomens
- Radboud University, Institute for Molecules and Materials, FELIX Laboratory, Nijmegen, The Netherlands
| | - Bart Oostenrijk
- Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- The Hamburg Centre for Ultrafast Imaging, Hamburg, Germany
| | - Debora Scuderi
- Institut de Chimie Physique, CNRS UMR8000, Université Paris-Saclay, Orsay, France
| | - Sadia Bari
- Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany.
- The Hamburg Centre for Ultrafast Imaging, Hamburg, Germany.
- Zernike Institute for Advanced Materials, University of Groningen, Groningen, The Netherlands.
| | - Robert H Meißner
- Helmholtz-Zentrum Hereon, Institute of Surface Science, Geesthacht, Germany.
- Hamburg University of Technology, Institute of Polymers and Composites, Hamburg, Germany.
84
Abstract
This work presents a variant of an electrostatic embedding scheme that allows the embedding of arbitrary machine learned potentials trained on molecular systems in vacuo. The scheme is based on physically motivated models of electronic density and polarizability, resulting in a generic model without relying on an exhaustive training set. The scheme only requires in vacuo single point QM calculations to provide training densities and molecular dipolar polarizabilities. As an example, the scheme is applied to create an embedding model for the QM7 data set using Gaussian Process Regression with only 445 reference atomic environments. The model was tested on the SARS-CoV-2 protease complex with PF-00835231, resulting in a predicted embedding energy RMSE of 2 kcal/mol, compared to explicit DFT/MM calculations.
Affiliation(s)
- Kirill Zinovjev
- Departament de Química Física, Universitat de València, 46100 Burjassot, Spain
85
Deffner M, Weise MP, Zhang H, Mücke M, Proppe J, Franco I, Herrmann C. Learning Conductance: Gaussian Process Regression for Molecular Electronics. J Chem Theory Comput 2023; 19:992-1002. [PMID: 36692968 DOI: 10.1021/acs.jctc.2c00648] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/25/2023]
Abstract
Experimental studies of charge transport through single molecules often rely on break junction setups, where molecular junctions are repeatedly formed and broken while measuring the conductance, leading to a statistical distribution of conductance values. Modeling this experimental situation and the resulting conductance histograms is challenging for theoretical methods, as computations need to capture structural changes in experiments, including the statistics of junction formation and rupture. This type of extensive structural sampling implies that even when evaluating conductance from computationally efficient electronic structure methods, which typically are of reduced accuracy, the evaluation of conductance histograms is too expensive to be a routine task. Highly accurate quantum transport computations are only computationally feasible for a few selected conformations and thus necessarily ignore the rich conformational space probed in experiments. To overcome these limitations, we investigate the potential of machine learning for modeling conductance histograms, in particular by Gaussian process regression. We show that by selecting specific structural parameters as features, Gaussian process regression can be used to efficiently predict the zero-bias conductance from molecular structures, reducing the computational cost of simulating conductance histograms by an order of magnitude. This enables the efficient calculation of conductance histograms even on the basis of computationally expensive first-principles approaches by effectively reducing the number of necessary charge transport calculations, paving the way toward their routine evaluation.
Affiliation(s)
- Michael Deffner
- Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany; The Hamburg Centre for Ultrafast Imaging, Hamburg 22761, Germany
| | - Marc Philipp Weise
- Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
| | - Haitao Zhang
- Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany
| | - Maike Mücke
- Institute of Physical Chemistry, Georg-August University, Göttingen 37077, Germany
| | - Jonny Proppe
- Institute of Physical and Theoretical Chemistry, TU Braunschweig, Braunschweig 38106, Germany
| | - Ignacio Franco
- Departments of Chemistry and Physics, University of Rochester, Rochester, New York 14627-0216, United States
| | - Carmen Herrmann
- Institute of Inorganic and Applied Chemistry, University of Hamburg, Hamburg 22761, Germany; The Hamburg Centre for Ultrafast Imaging, Hamburg 22761, Germany
86
Käser S, Vazquez-Salazar LI, Meuwly M, Töpfer K. Neural network potentials for chemistry: concepts, applications and prospects. DIGITAL DISCOVERY 2023; 2:28-58. [PMID: 36798879 PMCID: PMC9923808 DOI: 10.1039/d2dd00102k] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/23/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry, such as the representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed, and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding the accuracy of PES representations and system size limitations in dynamics simulations, as well as NN applications enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview of the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing the reliability and applicability of NN methods on a larger scale.
Affiliation(s)
- Silvan Käser
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | | | - Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Kai Töpfer
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
87
Burn M, Popelier PLA. Gaussian Process Regression Models for Predicting Atomic Energies and Multipole Moments. J Chem Theory Comput 2023; 19:1370-1380. [PMID: 36757024 PMCID: PMC9979601 DOI: 10.1021/acs.jctc.2c00731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
Developing a force field is a difficult task because its design is typically pulled in opposite directions by speed and accuracy. FFLUX breaks this trend by utilizing Gaussian process regression (GPR) to predict, at ab initio accuracy, atomic energies and multipole moments as obtained from the quantum theory of atoms in molecules (QTAIM). This work demonstrates that the in-house FFLUX training pipeline can generate successful GPR models for six representative molecules: peptide-capped glycine and alanine, glucose, paracetamol, aspirin, and ibuprofen. The molecules were sufficiently distorted to represent configurations from an AMBER-GAFF2 molecular dynamics run. All internal degrees of freedom were covered, corresponding to 93 dimensions in the case of the largest molecule, ibuprofen (33 atoms). Benefiting from active learning, the GPR models contain only about 2000 training points and return largely sub-kcal mol⁻¹ prediction errors for the validation sets. A proof of concept has been reached for transferring the model produced through active learning on one atomic property to that of the remaining atomic properties. The prediction of electrostatic interaction can be assessed at the intermolecular level, and the vast majority of interactions have a root-mean-square error of less than 0.1 kJ mol⁻¹ with a maximum value of ∼1 kJ mol⁻¹ for a glycine and paracetamol dimer.
88
Abstract
Reactivity scales are useful research tools for chemists, both experimental and computational. However, to determine the reactivity of a single molecule, multiple measurements need to be carried out, which is a time-consuming and resource-intensive task. In this Tutorial Review, we present alternative approaches for the efficient generation of quantitative structure-reactivity relationships that are based on quantum chemistry, supervised learning, and uncertainty quantification. First published in 2002, we observe a tendency for these relationships to become not only more predictive but also more interpretable over time.
Affiliation(s)
- Maike Vahl
- Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
| | - Jonny Proppe
- Institute of Physical and Theoretical Chemistry, Technische Universität Braunschweig, Gaußstraße 17, 38106 Braunschweig, Germany.
89
Xia S, Zhang D, Zhang Y. Multitask Deep Ensemble Prediction of Molecular Energetics in Solution: From Quantum Mechanics to Experimental Properties. J Chem Theory Comput 2023; 19:10.1021/acs.jctc.2c01024. [PMID: 36607141 PMCID: PMC10323048 DOI: 10.1021/acs.jctc.2c01024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
The past few years have witnessed significant advances in developing machine learning methods for molecular energetics predictions, including calculated electronic energies with high-level quantum mechanical methods and experimental properties, such as solvation free energy and logP. Typically, task-specific machine learning models are developed for distinct prediction tasks. In this work, we present a multitask deep ensemble model, sPhysNet-MT-ens5, which can simultaneously and accurately predict electronic energies of molecules in gas, water, and octanol phases, as well as transfer free energies at both calculated and experimental levels. On the calculated data set Frag20-solv-678k, which is developed in this work and contains 678,916 molecular conformations, up to 20 heavy atoms, and their properties calculated at the B3LYP/6-31G* level of theory with continuum solvent models, sPhysNet-MT-ens5 predicts density functional theory (DFT)-level electronic energies directly from force field-optimized geometries within chemical accuracy. On the experimental data sets, sPhysNet-MT-ens5 achieves state-of-the-art performance, predicting experimental hydration free energy with an RMSE of 0.620 kcal/mol on the FreeSolv data set and experimental logP with an RMSE of 0.393 on the PHYSPROP data set. Furthermore, sPhysNet-MT-ens5 also provides a reasonable estimate of model uncertainty that correlates with prediction error. Finally, by analyzing the atomic contributions of its predictions, we find that the developed deep learning model is aware of the chemical environment of each atom, assigning reasonable atomic contributions consistent with our chemical knowledge.
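A common way for a deep ensemble to report uncertainty is to use the spread of its members' predictions around their mean. The sketch below assumes that convention and assumes five members (suggested by the "ens5" suffix); the abstract does not state how sPhysNet-MT-ens5 computes its uncertainty, and the prediction values and the function name are made up for illustration.

```python
import numpy as np

def ensemble_predict(member_predictions):
    """Return the ensemble mean and the sample standard deviation per molecule."""
    preds = np.asarray(member_predictions, dtype=float)
    return preds.mean(axis=0), preds.std(axis=0, ddof=1)

# Hypothetical hydration free energies (kcal/mol): five members, three molecules.
member_predictions = [
    [-3.1, -7.9, -0.4],
    [-3.0, -8.4, -0.5],
    [-3.2, -7.1, -0.4],
    [-3.1, -8.8, -0.6],
    [-3.0, -7.6, -0.5],
]
mean, spread = ensemble_predict(member_predictions)
for m, s in zip(mean, spread):
    print(f"{m:7.2f} +/- {s:.2f} kcal/mol")
```

In this toy example the second molecule has the largest spread, so it would be flagged as the least certain prediction; whether such a spread tracks the actual error has to be checked against reference data, as the abstract reports doing.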
Affiliation(s)
- Song Xia
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Dongdong Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
90
|
Gabellini C, Şologan M, Pellizzoni E, Marson D, Daka M, Franchi P, Bignardi L, Franchi S, Posel Z, Baraldi A, Pengo P, Lucarini M, Pasquato L, Posocco P. Spotting Local Environments in Self-Assembled Monolayer-Protected Gold Nanoparticles. ACS NANO 2022; 16:20902-20914. [PMID: 36459668 PMCID: PMC9798909 DOI: 10.1021/acsnano.2c08467] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/24/2022] [Accepted: 11/29/2022] [Indexed: 06/17/2023]
Abstract
Organic-inorganic (O-I) nanomaterials are versatile platforms for an incredibly high number of applications, ranging from heterogeneous catalysis to molecular sensing, cell targeting, imaging, and cancer diagnosis and therapy, just to name a few. Much of their potential stems from the unique control of organic environments around inorganic sites within a single O-I nanomaterial, which allows for new properties that were inaccessible using purely organic or inorganic materials. Structural and mechanistic characterization plays a key role in understanding and rationally designing such hybrid nanoconstructs. Here, we introduce a general methodology to identify and classify local (supra)molecular environments in an archetypal class of O-I nanomaterials, i.e., self-assembled monolayer-protected gold nanoparticles (SAM-AuNPs). By using an atomistic machine-learning guided workflow based on the Smooth Overlap of Atomic Positions (SOAP) descriptor, we analyze a collection of chemically different SAM-AuNPs and detect and compare local environments in a way that is agnostic and automated, i.e., with no need of a priori information and minimal user intervention. In addition, the computational results coupled with experimental electron spin resonance measurements prove that it is possible to have more than one local environment inside SAMs, with the thickness of the organic shell and solvation being primary factors in determining the number and nature of multiple coexisting environments. These indications are extended to complex mixed hydrophilic-hydrophobic SAMs. This work demonstrates that it is possible to spot and compare local molecular environments in SAM-AuNPs exploiting atomistic machine-learning approaches, establishes ground rules to control them, and holds the potential for the rational design of O-I nanomaterials instructed from data.
Affiliation(s)
- Cristian Gabellini
- Department of Engineering and Architecture, University of Trieste, 34127 Trieste, Italy
| | - Maria Şologan
- Department of Chemical and Pharmaceutical Sciences and INSTM Trieste Research Unit, University of Trieste, 34127 Trieste, Italy
| | - Elena Pellizzoni
- Department of Chemical and Pharmaceutical Sciences and INSTM Trieste Research Unit, University of Trieste, 34127 Trieste, Italy
| | - Domenico Marson
- Department of Engineering and Architecture, University of Trieste, 34127 Trieste, Italy
| | - Mario Daka
- Department of Chemical and Pharmaceutical Sciences and INSTM Trieste Research Unit, University of Trieste, 34127 Trieste, Italy
| | - Paola Franchi
- Department of Chemistry “G. Ciamician”, University of Bologna, I-40126 Bologna, Italy
| | - Luca Bignardi
- Department of Physics, University of Trieste, 34127 Trieste, Italy
| | - Stefano Franchi
- Elettra Sincrotrone Trieste, 34149 Basovizza, Trieste, Italy
| | - Zbyšek Posel
- Department of Informatics, Jan Evangelista Purkyně University, 400 96 Ústí nad Labem, Czech Republic
| | | | - Paolo Pengo
- Department of Chemical and Pharmaceutical Sciences and INSTM Trieste Research Unit, University of Trieste, 34127 Trieste, Italy
| | - Marco Lucarini
- Department of Chemistry “G. Ciamician”, University of Bologna, I-40126 Bologna, Italy
| | - Lucia Pasquato
- Department of Chemical and Pharmaceutical Sciences and INSTM Trieste Research Unit, University of Trieste, 34127 Trieste, Italy
| | - Paola Posocco
- Department of Engineering and Architecture, University of Trieste, 34127 Trieste, Italy
| |
|
91
|
Bigi F, Huguenin-Dumittan KK, Ceriotti M, Manolopoulos DE. A smooth basis for atomistic machine learning. J Chem Phys 2022; 157:234101. [PMID: 36550032 DOI: 10.1063/5.0124363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022] Open
Abstract
Machine learning frameworks based on correlations of interatomic positions begin with a discretized description of the density of other atoms in the neighborhood of each atom in the system. Symmetry considerations support the use of spherical harmonics to expand the angular dependence of this density, but there is, as of yet, no clear rationale to choose one radial basis over another. Here, we investigate the basis that results from the solution of the Laplacian eigenvalue problem within a sphere around the atom of interest. We show that this generates a basis of controllable smoothness within the sphere (in the same sense as plane waves provide a basis with controllable smoothness for a problem with periodic boundaries) and that a tensor product of Laplacian eigenstates also provides a smooth basis for expanding any higher-order correlation of the atomic density within the appropriate hypersphere. We consider several unsupervised metrics of the quality of a basis for a given dataset and show that the Laplacian eigenstate basis has a performance that is much better than some widely used basis sets and competitive with data-driven bases that numerically optimize each metric. Finally, we investigate the role of the basis in building models of the potential energy. In these tests, we find that a combination of the Laplacian eigenstate basis and target-oriented heuristics leads to equal or improved regression performance when compared to both heuristic and data-driven bases in the literature. We conclude that the smoothness of the basis functions is a key aspect of successful atomic density representations.
Affiliation(s)
- Filippo Bigi
- Physical and Theoretical Chemistry Laboratory, South Parks Road, Oxford OX1 3QZ, United Kingdom
| | - Kevin K Huguenin-Dumittan
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - David E Manolopoulos
- Physical and Theoretical Chemistry Laboratory, South Parks Road, Oxford OX1 3QZ, United Kingdom
| |
|
92
|
Sabirov DS, Tukhbatullina AA. Distributed Polarizability Model for Covalently Bonded Fullerene Nanoaggregates: Origins of Polarizability Exaltation. NANOMATERIALS (BASEL, SWITZERLAND) 2022; 12:4404. [PMID: 36558256 PMCID: PMC9781774 DOI: 10.3390/nano12244404] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/24/2022] [Revised: 12/06/2022] [Accepted: 12/07/2022] [Indexed: 06/17/2023]
Abstract
Polarizability exaltation is typical for (C60)n nanostructures. It relates to the ratio between the mean polarizabilities of (C60)n and C60: the first one is higher than the n-fold mean polarizability of the original fullerene. This phenomenon is used in the design of novel fullerene compounds and the understanding of its properties but still has no chemical rationalization. In the present work, we studied the distributed polarizability of (C60)2 and isomeric (C60)3 nanoaggregates with the density functional theory method. We found that polarizability exaltation increases with the size of the nanostructure and originates from the response of the sp2-hybridized carbon atoms to the external electric field. The highest contributions to the dipole polarizability of (C60)2 and (C60)3 come from the most remote atoms of the marginal fullerene cores. The sp3-hybridized carbon atoms of cyclobutane bridges negligibly contribute to the molecular property. A similar major contribution to the molecular polarizability from the marginal atoms is observed for related carbon nanostructures isomeric to (C60)2 (tubular fullerene and nanopeanut). Additionally, we discuss the analogy between the polarizability exaltation of covalently bonded (C60)n and the increase in the polarizability found in experiments on fullerene nanoclusters/films as compared with the isolated molecules.
|
93
|
Helfrecht BA, Pireddu G, Semino R, Auerbach SM, Ceriotti M. Ranking the synthesizability of hypothetical zeolites with the sorting hat. DIGITAL DISCOVERY 2022; 1:779-789. [PMID: 36561986 PMCID: PMC9721151 DOI: 10.1039/d2dd00056c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 10/10/2022] [Indexed: 12/12/2022]
Abstract
Zeolites are nanoporous alumino-silicate frameworks widely used as catalysts and adsorbents. Even though millions of siliceous networks can be generated by computer-aided searches, no new hypothetical framework has yet been synthesized. The needle-in-a-haystack problem of finding promising candidates among large databases of predicted structures has intrigued materials scientists for decades; yet, most work to date on the zeolite problem has been limited to intuitive structural descriptors. Here, we tackle this problem through a rigorous data science scheme, the "Zeolite Sorting Hat", that exploits interatomic correlations to discriminate between real and hypothetical zeolites and to partition real zeolites into compositional classes that guide synthetic strategies for a given hypothetical framework. We find that, regardless of the structural descriptor used by the Zeolite Sorting Hat, there remain hypothetical frameworks that are incorrectly classified as real ones, suggesting that they might be good candidates for synthesis. We seek to minimize the number of such misclassified frameworks by using as complete a structural descriptor as possible, thus focusing on truly viable synthetic targets, while discovering structural features that distinguish real and hypothetical frameworks as an output of the Zeolite Sorting Hat. Further ranking of the candidates can be achieved based on thermodynamic stability and/or their suitability for the desired applications. Based on this workflow, we propose three hypothetical frameworks differing in their molar volume range as the top targets for synthesis, each with a composition suggested by the Zeolite Sorting Hat. Finally, we analyze the behavior of the Zeolite Sorting Hat with a hierarchy of structural descriptors including intuitive descriptors reported in previous studies, finding that intuitive descriptors produce significantly more misclassified hypothetical frameworks, and that more rigorous interatomic correlations point to second-neighbor Si-O distances around 3.2-3.4 Å as the key discriminatory factor.
Affiliation(s)
- Benjamin A. Helfrecht
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Giovanni Pireddu
- PASTEUR, Département de Chimie, École Normale Supérieure, PSL University, Sorbonne Université, CNRS, 24 rue Lhomond, 75005 Paris, France
- Sorbonne Université, CNRS, Physico-chimie des Electrolytes et Nanosystèmes Interfaciaux, PHENIX, F-75005 Paris, France
| | - Rocio Semino
- Sorbonne Université, CNRS, Physico-chimie des Electrolytes et Nanosystèmes Interfaciaux, PHENIX, F-75005 Paris, France
- ICGM, Univ. Montpellier, CNRS, ENSCM, Montpellier, France
| | - Scott M. Auerbach
- Department of Chemistry and Department of Chemical Engineering, University of Massachusetts Amherst, Amherst, MA 01003, USA
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institut des Matériaux, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| |
|
94
|
Burn MJ, Popelier PLA. Producing chemically accurate atomic Gaussian process regression models by active learning for molecular simulation. J Comput Chem 2022; 43:2084-2098. [PMID: 36165338 PMCID: PMC9828508 DOI: 10.1002/jcc.27006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Revised: 08/20/2022] [Accepted: 08/24/2022] [Indexed: 01/12/2023]
Abstract
Machine learning is becoming increasingly important in the field of force field development. Never has it been more vital to have chemically accurate machine learning potentials, as force fields become more sophisticated and their applications expand. In this study a method for developing chemically accurate Gaussian process regression models is demonstrated for an increasingly complex set of molecules. This work is an extension to previous work showing the progression of the active learning technique in producing more accurate models in much less CPU time than ever before. The per-atom active learning approach has unlocked the potential to generate chemically accurate models for molecules such as peptide-capped glycine.
Affiliation(s)
- Matthew J. Burn
- Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
- Department of Chemistry, The University of Manchester, Manchester, UK
| | - Paul L. A. Popelier
- Manchester Institute of Biotechnology, The University of Manchester, Manchester, UK
- Department of Chemistry, The University of Manchester, Manchester, UK
| |
|
95
|
Zhang Y, Lin Q, Jiang B. Atomistic neural network representations for chemical dynamics simulations of molecular, condensed phase, and interfacial systems: Efficiency, representability, and generalization. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2022. [DOI: 10.1002/wcms.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Yaolong Zhang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Qidong Lin
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Bin Jiang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| |
|
96
|
Actinides in complex reactive media: A combined ab initio molecular dynamics and machine learning analytics study of transuranic ions in molten salts. J Mol Liq 2022. [DOI: 10.1016/j.molliq.2022.120115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
97
|
Gentili D, Ori G. Reversible assembly of nanoparticles: theory, strategies and computational simulations. NANOSCALE 2022; 14:14385-14432. [PMID: 36169572 DOI: 10.1039/d2nr02640f] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 06/16/2023]
Abstract
The significant advances in synthesis and functionalization have enabled the preparation of high-quality nanoparticles that have found a plethora of successful applications. The unique physicochemical properties of nanoparticles can be manipulated through the control of size, shape, composition, and surface chemistry, but their technological application possibilities can be further expanded by exploiting the properties that emerge from their assembly. The ability to control the assembly of nanoparticles not only is required for many real technological applications, but allows the combination of the intrinsic properties of nanoparticles and opens the way to the exploitation of their complex interplay, giving access to collective properties. Significant advances and knowledge gained over the past few decades on nanoparticle assembly have made it possible to implement a growing number of strategies for reversible assembly of nanoparticles. In addition to being of interest for basic studies, such advances further broaden the range of applications and the possibility of developing innovative devices using nanoparticles. This review focuses on the reversible assembly of nanoparticles and includes the theoretical aspects related to the concept of reversibility, an up-to-date assessment of the experimental approaches applied to this field and the advanced computational schemes that offer key insights into the assembly mechanisms. We aim to provide readers with a comprehensive guide to address the challenges in assembling reversible nanoparticles and promote their applications.
Affiliation(s)
- Denis Gentili
- Consiglio Nazionale delle Ricerche, Istituto per lo Studio dei Materiali Nanostrutturati (CNR-ISMN), Via P. Gobetti 101, 40129 Bologna, Italy.
| | - Guido Ori
- Université de Strasbourg, CNRS, Institut de Physique et Chimie des Matériaux de Strasbourg, UMR 7504, Rue du Loess 23, F-67034 Strasbourg, France.
| |
|
98
|
Gugler S, Reiher M. Quantum Chemical Roots of Machine-Learning Molecular Similarity Descriptors. J Chem Theory Comput 2022; 18:6670-6689. [PMID: 36218328 DOI: 10.1021/acs.jctc.2c00718] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Abstract
In this work, we explore the quantum chemical foundations of descriptors for molecular similarity. Such descriptors are key for traversing chemical compound space with machine learning. Our focus is on the Coulomb matrix and on the smooth overlap of atomic positions (SOAP). We adopt a basic framework that allows us to connect both descriptors to electronic structure theory. This framework enables us to then define two new descriptors that are more closely related to electronic structure theory, which we call Coulomb lists and smooth overlap of electron densities (SOED). By investigating their usefulness as molecular similarity descriptors, we gain new insights into how and why Coulomb matrix and SOAP work. Moreover, Coulomb lists avoid the somewhat mysterious diagonalization step of the Coulomb matrix and might provide a direct means to extract subsystem information that can be compared across Born-Oppenheimer surfaces of varying dimension. For the electron density, we derive the necessary formalism to create the SOED measure in close analogy to SOAP. Because this formalism is more involved than that of SOAP, we review the essential theory as well as introduce a set of approximations that eventually allow us to work with SOED in terms of the same implementation available for the evaluation of SOAP. We focus our analysis on elementary reaction steps, where transition state structures are more similar to either reactant or product structures than the latter two are with respect to one another. The prediction of electronic energies of transition state structures can, however, be more difficult than that of stable intermediates due to multi-configurational effects. The question arises to what extent molecular similarity descriptors rooted in electronic structure theory can resolve these intricate effects.
Affiliation(s)
- Stefan Gugler
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| | - Markus Reiher
- Laboratorium für Physikalische Chemie, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
| |
|
99
|
Galuzzi B, Mirarchi A, Viganò EL, De Gioia L, Damiani C, Arrigoni F. Machine Learning for Efficient Prediction of Protein Redox Potential: The Flavoproteins Case. J Chem Inf Model 2022; 62:4748-4759. [PMID: 36126254 PMCID: PMC9554915 DOI: 10.1021/acs.jcim.2c00858] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/08/2022] [Indexed: 11/29/2022]
Abstract
Determining the redox potentials of protein cofactors and how they are influenced by their molecular neighborhoods is essential for basic research and many biotechnological applications, from biosensors and biocatalysis to bioremediation and bioelectronics. The laborious determination of redox potential with current experimental technologies pushes forward the need for computational approaches that can reliably predict it. Although current computational approaches based on quantum and molecular mechanics are accurate, their large computational costs hinder their usage. In this work, we explored the possibility of using more efficient QSPR models based on machine learning (ML) for the prediction of protein redox potential, as an alternative to classical approaches. As a proof of concept, we focused on flavoproteins, one of the most important families of enzymes directly involved in redox processes. To train and test different ML models, we retrieved a dataset of flavoproteins with a known midpoint redox potential (Em) and 3D structure. The features of interest, accounting for both short- and long-range effects of the protein matrix on the flavin cofactor, have been automatically extracted from each protein PDB file. Our best ML model (XGB) has a performance error below 1 kcal/mol (∼36 mV), comparing favorably to more sophisticated computational approaches. We also provided indications on the features that mostly affect the Em value, and when possible, we rationalized them on the basis of previous studies.
Affiliation(s)
- Bruno Giovanni Galuzzi
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
- SYSBIO Centre of Systems Biology/ISBE.IT, Piazza della Scienza 2, 20126, Milan, Italy
| | - Antonio Mirarchi
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
| | - Edoardo Luca Viganò
- Istituto di Ricerche Farmacologiche Mario Negri, Via Mario Negri 2, 20156 Milan, Italy
| | - Luca De Gioia
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
| | - Chiara Damiani
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
- SYSBIO Centre of Systems Biology/ISBE.IT, Piazza della Scienza 2, 20126, Milan, Italy
| | - Federica Arrigoni
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
| |
|
100
|
Morrow JD, Deringer VL. Indirect learning and physically guided validation of interatomic potential models. J Chem Phys 2022; 157:104105. [DOI: 10.1063/5.0099929] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
Machine learning (ML) based interatomic potentials are emerging tools for material simulations, but require a trade-off between accuracy and speed. Here, we show how one can use one ML potential model to train another: we use an accurate, but more computationally expensive model to generate reference data (locations and labels) for a series of much faster potentials. Without the need for quantum-mechanical reference computations at the secondary stage, extensive reference datasets can be easily generated, and we find that this improves the quality of fast potentials with less flexible functional forms. We apply the technique to disordered silicon, including a simulation of vitrification and polycrystalline grain formation under pressure with a system size of a million atoms. Our work provides conceptual insight into the ML of interatomic potential models and suggests a route toward accelerated simulations of condensed-phase systems.
Affiliation(s)
- Joe D. Morrow
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| | - Volker L. Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| |
|