1
Gallegos M, Vassilev-Galindo V, Poltavsky I, Martín Pendás Á, Tkatchenko A. Explainable chemical artificial intelligence from accurate machine learning of real-space chemical descriptors. Nat Commun 2024; 15:4345. [PMID: 38773090 DOI: 10.1038/s41467-024-48567-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/11/2023] [Accepted: 04/24/2024] [Indexed: 05/23/2024]
Abstract
Machine-learned computational chemistry has led to a paradoxical situation in which molecular properties can be accurately predicted, but they are difficult to interpret. Explainable AI (XAI) tools can be used to analyze complex models, but they are highly dependent on the AI technique and the origin of the reference data. Alternatively, interpretable real-space tools can be employed directly, but they are often expensive to compute. To address this dilemma between explainability and accuracy, we developed SchNet4AIM, a SchNet-based architecture capable of dealing with local one-body (atomic) and two-body (interatomic) descriptors. The performance of SchNet4AIM is tested by predicting a wide collection of real-space quantities ranging from atomic charges and delocalization indices to pairwise interaction energies. The accuracy and speed of SchNet4AIM break the bottleneck that has prevented the use of real-space chemical descriptors in complex systems. We show that the group delocalization indices, arising from our physically rigorous atomistic predictions, provide reliable indicators of supramolecular binding events, thus contributing to the development of Explainable Chemical Artificial Intelligence (XCAI) models.
Affiliation(s)
- Miguel Gallegos
- Department of Analytical and Physical Chemistry, University of Oviedo, E-33006, Oviedo, Spain
- Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
- Ángel Martín Pendás
- Department of Analytical and Physical Chemistry, University of Oviedo, E-33006, Oviedo, Spain
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
2
Davie SJ, Di Pasquale N, Popelier PLA. Incorporation of local structure into kriging models for the prediction of atomistic properties in the water decamer. J Comput Chem 2016; 37:2409-22. [PMID: 27535711 PMCID: PMC5031213 DOI: 10.1002/jcc.24465] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 04/30/2016] [Revised: 06/06/2016] [Accepted: 07/07/2016] [Indexed: 01/17/2023]
Abstract
Machine learning algorithms have been demonstrated to predict atomistic properties approaching the accuracy of quantum chemical calculations at significantly less computational cost. Difficulties arise, however, when attempting to apply these techniques to large systems, or systems possessing excessive conformational freedom. In this article, the machine learning method kriging is applied to predict both the intra-atomic and interatomic energies, as well as the electrostatic multipole moments, of the atoms of a water molecule at the center of a 10 water molecule (decamer) cluster. Unlike previous work, where the properties of small water clusters were predicted using a molecular local frame, and where training set inputs (features) were based on atomic index, a variety of feature definitions and coordinate frames are considered here to increase prediction accuracy. It is shown that, for a water molecule at the center of a decamer, no single method of defining features or coordinate schemes is optimal for every property. However, explicitly accounting for the structure of the first solvation shell in the definition of the features of the kriging training set, and centring the coordinate frame on the atom-of-interest will, in general, return better predictions than models that apply the standard methods of feature definition, or a molecular coordinate frame. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Affiliation(s)
- Stuart J Davie
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain and School of Chemistry, University of Manchester, Oxford Road, Manchester, M13 9PL, Great Britain
- Nicodemo Di Pasquale
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain and School of Chemistry, University of Manchester, Oxford Road, Manchester, M13 9PL, Great Britain
- Paul L A Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain and School of Chemistry, University of Manchester, Oxford Road, Manchester, M13 9PL, Great Britain
3
Cisneros G, Wikfeldt KT, Ojamäe L, Lu J, Xu Y, Torabifard H, Bartók AP, Csányi G, Molinero V, Paesani F. Modeling Molecular Interactions in Water: From Pairwise to Many-Body Potential Energy Functions. Chem Rev 2016; 116:7501-28. [PMID: 27186804 PMCID: PMC5450669 DOI: 10.1021/acs.chemrev.5b00644] [Citation(s) in RCA: 272] [Impact Index Per Article: 34.0] [Received: 10/31/2015] [Indexed: 12/17/2022]
Abstract
Almost 50 years have passed since the first computer simulations of water, and a large number of molecular models have been proposed since then to elucidate the unique behavior of water across different phases. In this article, we review the recent progress in the development of analytical potential energy functions that aim at correctly representing many-body effects. Starting from the many-body expansion of the interaction energy, specific focus is on different classes of potential energy functions built upon a hierarchy of approximations and on their ability to accurately reproduce reference data obtained from state-of-the-art electronic structure calculations and experimental measurements. We show that most recent potential energy functions, which include explicit short-range representations of two-body and three-body effects along with a physically correct description of many-body effects at all distances, predict the properties of water from the gas to the condensed phase with unprecedented accuracy, thus opening the door to the long-sought "universal model" capable of describing the behavior of water under different conditions and in different environments.
Affiliation(s)
- Kjartan Thor Wikfeldt
- Science Institute, University of Iceland, VR-III, 107, Reykjavik, Iceland
- Department of Physics, Albanova, Stockholm University, S-106 91 Stockholm, Sweden
- Lars Ojamäe
- Department of Chemistry, Linköping University, SE-581 83 Linköping, Sweden
- Jibao Lu
- Department of Chemistry, The University of Utah, Salt Lake City, Utah 84112-0850, United States
- Yao Xu
- Lehrstuhl Physikalische Chemie II, Ruhr-Universität Bochum, 44801 Bochum, Germany
- Hedieh Torabifard
- Department of Chemistry, Wayne State University, Detroit, Michigan 48202, United States
- Albert P. Bartók
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
- Gábor Csányi
- Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
- Valeria Molinero
- Department of Chemistry, The University of Utah, Salt Lake City, Utah 84112-0850, United States
- Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, United States
4
Hughes TJ, Kandathil SM, Popelier PLA. Accurate prediction of polarised high order electrostatic interactions for hydrogen bonded complexes using the machine learning method kriging. Spectrochim Acta A Mol Biomol Spectrosc 2015; 136 Pt A:32-41. [PMID: 24274986 DOI: 10.1016/j.saa.2013.10.059] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/22/2013] [Revised: 09/02/2013] [Accepted: 10/15/2013] [Indexed: 06/02/2023]
Abstract
As intermolecular interactions such as the hydrogen bond are electrostatic in origin, rigorous treatment of this term within force field methodologies should be mandatory. We present a method capable of accurately reproducing such interactions for seven van der Waals complexes. It uses atomic multipole moments up to the hexadecapole moment, mapped to the positions of the nuclear coordinates by the machine learning method kriging. Models were built at three levels of theory: HF/6-31G**, B3LYP/aug-cc-pVDZ and M06-2X/aug-cc-pVDZ. The quality of the kriging models was measured by their ability to predict the electrostatic interaction energy between atoms in external test examples for which the true energies are known. At all levels of theory, >90% of test cases for small van der Waals complexes were predicted within 1 kJ mol⁻¹, decreasing to 60-70% of test cases for larger base pair complexes. Models built on moments obtained at B3LYP and M06-2X level generally outperformed those at HF level. For all systems the individual interactions were predicted with a mean unsigned error of less than 1 kJ mol⁻¹.
Affiliation(s)
- Timothy J Hughes
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom; School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
- Shaun M Kandathil
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom; School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
- Paul L A Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom; School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
5
Kandathil SM, Fletcher TL, Yuan Y, Knowles J, Popelier PLA. Accuracy and tractability of a kriging model of intramolecular polarizable multipolar electrostatics and its application to histidine. J Comput Chem 2013; 34:1850-61. [PMID: 23720381 DOI: 10.1002/jcc.23333] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 03/18/2013] [Revised: 04/23/2013] [Accepted: 04/24/2013] [Indexed: 11/05/2022]
Abstract
We propose a generic method to model polarization in the context of high-rank multipolar electrostatics. This method involves the machine learning technique kriging, here used to capture the response of an atomic multipole moment of a given atom to a change in the positions of the atoms surrounding this atom. The atoms are malleable boxes with sharp boundaries; they do not overlap and they exhaust space. The method is applied to histidine, where it is able to predict atomic multipole moments (up to hexadecapole) for unseen configurations, after training on 600 geometries distorted using normal modes of each of its 24 local energy minima at B3LYP/apc-1 level. The quality of the predictions is assessed by calculating the Coulomb energy between an atom for which the moments have been predicted and the surrounding atoms (having exact moments). Only interactions between atoms separated by three or more bonds ("1, 4 and higher" interactions) are included in this energy error. This energy is compared with that of a central atom with exact multipole moments interacting with the same environment. The resulting energy discrepancies are summed for 328 atom-atom interactions, for each of the 29 atoms of histidine being a central atom in turn. For 80% of the 539 test configurations (outside the training set), this summed energy deviates by less than 1 kcal mol⁻¹.
Affiliation(s)
- Shaun M Kandathil
- Manchester Institute of Biotechnology, 131 Princess Street, Manchester, M1 7DN, United Kingdom