1
|
Burn M, Popelier PLA. Gaussian Process Regression Models for Predicting Atomic Energies and Multipole Moments. J Chem Theory Comput 2023; 19:1370-1380. [PMID: 36757024 PMCID: PMC9979601 DOI: 10.1021/acs.jctc.2c00731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
Developing a force field is a difficult task because its design is typically pulled in opposite directions by speed and accuracy. FFLUX breaks this trend by utilizing Gaussian process regression (GPR) to predict, at ab initio accuracy, atomic energies and multipole moments as obtained from the quantum theory of atoms in molecules (QTAIM). This work demonstrates that the in-house FFLUX training pipeline can generate successful GPR models for six representative molecules: peptide-capped glycine and alanine, glucose, paracetamol, aspirin, and ibuprofen. The molecules were sufficiently distorted to represent configurations from an AMBER-GAFF2 molecular dynamics run. All internal degrees of freedom were covered, corresponding to 93 dimensions in the case of the largest molecule, ibuprofen (33 atoms). Benefiting from active learning, the GPR models contain only about 2000 training points and return largely sub-kJ mol−1 prediction errors for the validation sets. A proof of concept has been reached for transferring the model produced through active learning on one atomic property to that of the remaining atomic properties. The prediction of electrostatic interaction can be assessed at the intermolecular level, and the vast majority of interactions have a root-mean-square error of less than 0.1 kJ mol−1 with a maximum value of ∼1 kJ mol−1 for a glycine and paracetamol dimer.
|
2
|
Oliveira MP, Gonçalves YMH, Ol Gheta SK, Rieder SR, Horta BAC, Hünenberger PH. Comparison of the United- and All-Atom Representations of (Halo)alkanes Based on Two Condensed-Phase Force Fields Optimized against the Same Experimental Data Set. J Chem Theory Comput 2022; 18:6757-6778. [PMID: 36190354 DOI: 10.1021/acs.jctc.2c00524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
The level of accuracy that can be achieved by a force field is influenced by choices made in the interaction-function representation and in the relevant simulation parameters. These choices, referred to here as functional-form variants (FFVs), include for example the model resolution, the charge-derivation procedure, the van der Waals combination rules, the cutoff distance, and the treatment of the long-range interactions. Ideally, assessing the effect of a given FFV on the intrinsic accuracy of the force-field representation requires that only the specific FFV is changed and that this change is performed at an optimal level of parametrization, a requirement that may prove extremely challenging to achieve in practice. Here, we present a first attempt at such a comparison for one specific FFV, namely the choice of a united-atom (UA) versus an all-atom (AA) resolution in a force field for saturated acyclic (halo)alkanes. Two force-field versions (UA vs AA) are optimized in an automated way using the CombiFF approach against 961 experimental values for the pure-liquid densities ρliq and vaporization enthalpies ΔHvap of 591 compounds. For the AA force field, the torsional and third-neighbor Lennard-Jones parameters are also refined based on quantum-mechanical rotational-energy profiles. The comparison between the UA and AA resolutions is also extended to properties that have not been included as parameterization targets, namely the surface-tension coefficient γ, the isothermal compressibility κT, the isobaric thermal-expansion coefficient αP, the isobaric heat capacity cP, the static relative dielectric permittivity ϵ, the self-diffusion coefficient D, the shear viscosity η, the hydration free energy ΔGwat, and the free energy of solvation ΔGche in cyclohexane. For the target properties ρliq and ΔHvap, the UA and AA resolutions reach very similar levels of accuracy after optimization. 
For the nine other properties, the AA representation leads to more accurate results in terms of η; comparably accurate results in terms of γ, κT, αP, ϵ, D, and ΔGche; and less accurate results in terms of cP and ΔGwat. This work also represents a first step toward the calibration of a GROMOS-compatible force field at the AA resolution.
Affiliation(s)
- Marina P Oliveira
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| | - Yan M H Gonçalves
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| | - S Kashef Ol Gheta
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| | - Salomé R Rieder
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| | - Bruno A C Horta
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| | - Philippe H Hünenberger
- Laboratorium für Physikalische Chemie, ETH Zürich, ETH-Hönggerberg, HCI, CH-8093 Zürich, Switzerland
| |
|
3
|
Alkorta I, Elguero J. Theoretical studies of conformational analysis and intramolecular dynamic phenomena. Struct Chem 2019. [DOI: 10.1007/s11224-019-01370-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/13/2022]
|
4
|
McDonagh JL, Silva AF, Vincent MA, Popelier PLA. Machine Learning of Dynamic Electron Correlation Energies from Topological Atoms. J Chem Theory Comput 2017; 14:216-224. [PMID: 29211469 DOI: 10.1021/acs.jctc.7b01157] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/26/2023]
Abstract
We present an innovative method for predicting the dynamic electron correlation energy of an atom or a bond in a molecule utilizing topological atoms. Our approach uses the machine learning method Kriging (Gaussian Process Regression with a non-zero mean function) to predict these dynamic electron correlation energy contributions. The true energy values are calculated by partitioning the MP2 two-particle density-matrix via the Interacting Quantum Atoms (IQA) procedure. To our knowledge, this is the first time such energies have been predicted by a machine learning technique. We present here three important proof-of-concept cases: the water monomer, the water dimer, and the van der Waals complex H2···He. These cases represent the final step toward the design of a full IQA potential for molecular simulation. This final piece will enable us to consider situations in which dispersion is the dominant intermolecular interaction. The results from these examples suggest a new method by which dispersion potentials for molecular simulation can be generated.
Affiliation(s)
- James L McDonagh
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
| | - Arnaldo F Silva
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
| | - Mark A Vincent
- School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| | - Paul L A Popelier
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
- School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| |
|
5
|
Geometry Optimization with Machine Trained Topological Atoms. Sci Rep 2017; 7:12817. [PMID: 28993674 PMCID: PMC5634454 DOI: 10.1038/s41598-017-12600-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 05/09/2017] [Accepted: 09/06/2017] [Indexed: 11/19/2022] Open
Abstract
The geometry optimization of a water molecule with a novel type of energy function called FFLUX is presented, which bypasses the traditional bonded potentials. Instead, topologically-partitioned atomic energies are trained by the machine learning method kriging to predict their IQA atomic energies for a previously unseen molecular geometry. Proof-of-concept that FFLUX’s architecture is suitable for geometry optimization is rigorously demonstrated. It is found that accurate kriging models can optimize 2000 distorted geometries to within 0.28 kJ mol−1 of the corresponding ab initio energy, and 50% of those to within 0.05 kJ mol−1. Kriging models are robust enough to optimize the molecular geometry to sub-noise accuracy, when two thirds of the geometric inputs are outside the training range of that model. Finally, the individual components of the potential energy are analyzed, and chemical intuition is reflected in the independent behavior of the three energy terms $E_{\mathrm{intra}}^{\mathrm{A}}$ (intra-atomic), $V_{\mathrm{cl}}^{\mathrm{AA'}}$ (electrostatic) and $V_{\mathrm{x}}^{\mathrm{AA'}}$ (exchange), in contrast to standard force fields.
|
6
|
Fletcher TL, Popelier PLA. FFLUX: Transferability of polarizable machine-learned electrostatics in peptide chains. J Comput Chem 2017; 38:1005-1014. [DOI: 10.1002/jcc.24775] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/05/2017] [Revised: 02/06/2017] [Accepted: 02/08/2017] [Indexed: 11/12/2022]
Affiliation(s)
- Timothy L. Fletcher
- Manchester Institute of Biotechnology (MIB); 131 Princess Street Manchester M1 7DN Great Britain
- School of Chemistry; University of Manchester; Oxford Road Manchester M13 9PL Great Britain
| | - Paul L. A. Popelier
- Manchester Institute of Biotechnology (MIB); 131 Princess Street Manchester M1 7DN Great Britain
- School of Chemistry; University of Manchester; Oxford Road Manchester M13 9PL Great Britain
| |
|
7
|
Maxwell PI, Popelier PLA. Accurate prediction of the energetics of weakly bound complexes using the machine learning method kriging. Struct Chem 2017. [DOI: 10.1007/s11224-017-0928-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
|
8
|
Fletcher TL, Popelier PLA. Toward amino acid typing for proteins in FFLUX. J Comput Chem 2016; 38:336-345. [PMID: 27991680 PMCID: PMC6681421 DOI: 10.1002/jcc.24686] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/21/2016] [Revised: 11/14/2016] [Accepted: 11/14/2016] [Indexed: 01/18/2023]
Abstract
Continuing the development of FFLUX, a multipolar polarizable force field driven by machine learning, we present a modern approach to atom-typing and building transferable models for predicting atomic properties in proteins. Amino acid atomic charges in a peptide chain respond to the substitution of a neighboring residue and this response can be categorized in a manner similar to atom-typing. Using a machine learning method called kriging, we are able to build predictive models for an atom that is defined, not only by its local environment, but also by its neighboring residues, for a minimal additional computational cost. We found that prediction errors were up to 11 times lower when using a model specific to the correct group of neighboring residues, with a mean prediction error of ∼0.0015 au. This finding suggests that atoms in a force field should be defined by more than just their immediate atomic neighbors. When comparing an atom in a single alanine to an analogous atom in a deca-alanine helix, the mean difference in charge is 0.026 au. Meanwhile, the same difference between a trialanine and a deca-alanine helix is only 0.012 au. When compared to deca-alanine models, the transferable models are up to 20 times faster to train, and require significantly less ab initio calculation, providing a practical route to modeling large biological systems. © 2016 The Authors. Journal of Computational Chemistry Published by Wiley Periodicals, Inc.
Affiliation(s)
- Timothy L Fletcher
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester, M1 7DN, United Kingdom
| | - Paul L A Popelier
- School of Chemistry, University of Manchester, Oxford Road, Manchester, M13 9PL, United Kingdom
| |
|
9
|
Davie SJ, Di Pasquale N, Popelier PLA. Kriging atomic properties with a variable number of inputs. J Chem Phys 2016; 145:104104. [DOI: 10.1063/1.4962197] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Affiliation(s)
- Stuart J. Davie
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
| | - Nicodemo Di Pasquale
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
| | - Paul L. A. Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, United Kingdom and School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, United Kingdom
| |
|
10
|
|
11
|
Horta BAC, Merz PT, Fuchs PFJ, Dolenc J, Riniker S, Hünenberger PH. A GROMOS-Compatible Force Field for Small Organic Molecules in the Condensed Phase: The 2016H66 Parameter Set. J Chem Theory Comput 2016; 12:3825-50. [DOI: 10.1021/acs.jctc.6b00187] [Citation(s) in RCA: 88] [Impact Index Per Article: 11.0] [Indexed: 11/28/2022]
Affiliation(s)
- Bruno A. C. Horta
- Laboratory of Physical Chemistry, ETH Zürich, CH-8093 Zürich, Switzerland
- Instituto de Química, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-909, Brazil
| | - Pascal T. Merz
- Laboratory of Physical Chemistry, ETH Zürich, CH-8093 Zürich, Switzerland
| | - Patrick F. J. Fuchs
- Institut Jacques Monod, UMR 7592 CNRS, Université Paris-Diderot, Sorbonne Paris Cité, F-75205 Paris, France
| | - Jozica Dolenc
- Laboratory of Physical Chemistry, ETH Zürich, CH-8093 Zürich, Switzerland
- Chemistry, Biology and Pharmacy Information Center, ETH Zürich, CH-8093 Zürich, Switzerland
| | - Sereina Riniker
- Laboratory of Physical Chemistry, ETH Zürich, CH-8093 Zürich, Switzerland
| | | |
|
12
|
Fletcher TL, Popelier PLA. Multipolar Electrostatic Energy Prediction for all 20 Natural Amino Acids Using Kriging Machine Learning. J Chem Theory Comput 2016; 12:2742-51. [DOI: 10.1021/acs.jctc.6b00457] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Affiliation(s)
- Timothy L. Fletcher
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain
- School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| | - Paul L. A. Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain
- School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| |
|
13
|
Cardamone S, Popelier PLA. Prediction of conformationally dependent atomic multipole moments in carbohydrates. J Comput Chem 2015; 36:2361-73. [PMID: 26547500 PMCID: PMC5031233 DOI: 10.1002/jcc.24215] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 06/19/2015] [Revised: 08/18/2015] [Accepted: 09/10/2015] [Indexed: 01/18/2023]
Abstract
The conformational flexibility of carbohydrates is challenging within the field of computational chemistry. This flexibility causes the electron density to change, which leads to fluctuating atomic multipole moments. Quantum Chemical Topology (QCT) allows for the partitioning of an "atom in a molecule," thus localizing electron density to finite atomic domains, which permits the unambiguous evaluation of atomic multipole moments. By selecting an ensemble of physically realistic conformers of a chemical system, one evaluates the various multipole moments at defined points in configuration space. The subsequent implementation of the machine learning method kriging delivers the evaluation of an analytical function, which smoothly interpolates between these points. This allows for the prediction of atomic multipole moments at new points in conformational space, not trained for but within prediction range. In this work, we demonstrate that the carbohydrates erythrose and threose are amenable to the above methodology. We investigate how kriging models respond when the training ensemble incorporates multiple energy minima and their environment in conformational space. Additionally, we evaluate the gains in predictive capacity of our models as the size of the training ensemble increases. We believe this approach to be entirely novel within the field of carbohydrates. For a modest training set size of 600, more than 90% of the external test configurations have an error in the total (predicted) electrostatic energy (relative to ab initio) of at most 1 kJ mol−1 for open chains, and just over 90% have an error of at most 4 kJ mol−1 for rings.
Affiliation(s)
- Salvatore Cardamone
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain
- School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| | - Paul L. A. Popelier
- Manchester Institute of Biotechnology (MIB), 131 Princess Street, Manchester M1 7DN, Great Britain
- School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
| |
|