1
Bursik B, Eller J, Gross J. Predicting Solvation Free Energies from the Minnesota Solvation Database Using Classical Density Functional Theory Based on the PC-SAFT Equation of State. J Phys Chem B 2024; 128:3677-3688. [PMID: 38579126 DOI: 10.1021/acs.jpcb.3c07447] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/07/2024]
Abstract
We critically assess the capabilities of classical density functional theory (DFT) based on the perturbed-chain statistical associating fluid theory (PC-SAFT) equation of state to predict the solvation free energies of small molecules in various hydrocarbon solvents. We compare DFT results with experimental data from the Minnesota solvation database and utilize statistical methods to analyze the accuracy of our approach, as well as its weaknesses. The mean absolute error of the solvation free energies is 3.7 kJ mol⁻¹ for n-alkane solvents, ranging from pentane to hexadecane, with 473 solute-solvent systems. For solvents consisting of cyclic hydrocarbons (cyclohexane, benzene, toluene, and ethylbenzene) with 245 solute-solvent systems, we report a slightly larger mean absolute error of 4.2 kJ mol⁻¹. We identify three possible sources of errors: (i) the neglect of solute-solvent and solvent-solvent Coulomb interactions, which limits the applicability of PC-SAFT DFT to nonpolar and weakly polar molecules; (ii) the solute's Lennard-Jones parameters supplied by the general AMBER force field, which are not parametrized toward solvation free energies; and (iii) the application of the Lorentz-Berthelot combining rules to the dispersive interactions between a segment of the PC-SAFT solvent and a Lennard-Jones interaction site of the solute. The approach is more accurate than standard implementations of phenomenological models in common chemistry software packages, which exhibit mean absolute errors larger than 9.12 kJ mol⁻¹, even though newer phenomenological models achieve a mean absolute error of about 2 kJ mol⁻¹. PC-SAFT DFT is more computationally efficient than state-of-the-art explicit molecular simulations in combination with free energy perturbation methods.
It is predictive with respect to solvation free energies, i.e., the input for the model is the (element-specific) molecular force field, the solute configuration from molecular dynamics simulations, and the (substance-specific) PC-SAFT parameters. The PC-SAFT parametrization uses pure-component data and does not require experimental solvation free energies. The PC-SAFT equation of state, without applying a DFT formalism, can also be used to calculate solvation free energies, provided that the PC-SAFT parameters for the solute are available. A large number of substances were recently parametrized by members of our group (Esper, T.; Bauer, G.; Rehner, P.; Gross, J. Ind. Eng. Chem. Res. 2023, 62), which enables a comparison to the DFT approach for 103 substances. Accurate results are obtained from the PC-SAFT equation of state with an MAE below 2.51 kJ mol⁻¹. The DFT approach does not require PC-SAFT parameters for the solute and can be applied to all solutes that can be represented by the molecular force field.
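Error source (iii) in the abstract above refers to the Lorentz-Berthelot combining rules. As a minimal sketch of what those rules compute for an unlike pair (arithmetic mean of the size parameters, geometric mean of the energy parameters), the Python snippet below combines the parameters of one solvent segment and one solute site. The function name and the numerical values are hypothetical placeholders for illustration; they are not parameters from the paper.

```python
import math


def lorentz_berthelot(sigma_i, eps_i, sigma_j, eps_j):
    """Combine Lennard-Jones parameters of two unlike sites.

    Lorentz rule:   sigma_ij = (sigma_i + sigma_j) / 2
    Berthelot rule: eps_ij   = sqrt(eps_i * eps_j)
    """
    sigma_ij = 0.5 * (sigma_i + sigma_j)
    eps_ij = math.sqrt(eps_i * eps_j)
    return sigma_ij, eps_ij


# Hypothetical values: size parameters in angstrom, energy parameters in kelvin.
sigma_ij, eps_ij = lorentz_berthelot(3.0, 100.0, 4.0, 400.0)
print(sigma_ij, eps_ij)
```

Because the rules contain no adjustable pair parameter, any systematic mismatch between solvent segments and solute sites carries over directly into the cross interaction, which is consistent with the abstract listing them as a possible error source.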
Affiliation(s)
- Benjamin Bursik
  - Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Stuttgart 70569, Germany
- Johannes Eller
  - Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Stuttgart 70569, Germany
- Joachim Gross
  - Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, Stuttgart 70569, Germany
2
Gilson MK, Kurtzman T. Free Energy Density of a Fluid and Its Role in Solvation and Binding. J Chem Theory Comput 2024; 20:2871-2887. [PMID: 38536144 PMCID: PMC11197885 DOI: 10.1021/acs.jctc.3c01173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/10/2024]
Abstract
The concept that a fluid has a position-dependent free energy density appears in the literature but has not been fully developed or accepted. We set this concept on an unambiguous theoretical footing via the following strategy. First, we set forth four desiderata that should be satisfied by any definition of the position-dependent free energy density, f(R), in a system comprising only a fluid and a rigid solute: its volume integral, plus the fixed internal energy of the solute, should be the system free energy; it deviates from its bulk value, f_bulk, near a solute but should asymptotically approach f_bulk with increasing distance from the solute; it should go to zero where the solvent density goes to zero; and it should be well-defined in the most general case of a fluid made up of flexible molecules with an arbitrary interaction potential. Second, we use statistical thermodynamics to formulate a definition of the free energy density that satisfies these desiderata. Third, we show how any free energy density satisfying the desiderata may be used to analyze molecular processes in solution. In particular, because the spatial integral of f(R) equals the free energy of the system, it can be used to compute free energy changes that result from the rearrangement of solutes as well as the forces exerted on the solutes by the solvent. This enables the use of a thermodynamic analysis of water in protein binding sites to inform ligand design. Finally, we discuss related literature and address published concerns regarding the thermodynamic plausibility of a position-dependent free energy density. The theory presented here has applications in theoretical and computational chemistry and may be further generalizable beyond fluids, such as to solids and macromolecules.
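The first three desiderata stated in the abstract above can be written compactly. The notation here is an illustrative sketch and may differ from the paper's own: $F$ denotes the system free energy, $E_{\text{solute}}$ the fixed internal energy of the rigid solute, $\rho(\mathbf{R})$ the solvent density, and $d(\mathbf{R})$ the distance of the point $\mathbf{R}$ from the solute.

```latex
\begin{align}
F &= E_{\text{solute}} + \int f(\mathbf{R})\,\mathrm{d}\mathbf{R}, \\
f(\mathbf{R}) &\to f_{\text{bulk}} \quad \text{as } d(\mathbf{R}) \to \infty, \\
f(\mathbf{R}) &\to 0 \quad \text{where } \rho(\mathbf{R}) \to 0 .
\end{align}
```

Under the additional assumption that the internal energies of the rigid solutes do not change when they are rearranged from configuration A to configuration B, the first relation suggests the free energy change $\Delta F_{A \to B} = \int \left[ f_B(\mathbf{R}) - f_A(\mathbf{R}) \right] \mathrm{d}\mathbf{R}$, where $f_A$ and $f_B$ are the free energy densities in the two configurations.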
Affiliation(s)
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, and Department of Chemistry and Biochemistry, UC San Diego, La Jolla, CA, 92093, USA
- Tom Kurtzman
  - PhD Programs in Chemistry, Biochemistry, and Biology, The Graduate Center of the City University of New York, New York, 10016, USA
  - Department of Chemistry, Lehman College, The City University of New York, Bronx, New York, 10468, USA
3
Karwounopoulos J, Kaupang Å, Wieder M, Boresch S. Calculations of Absolute Solvation Free Energies with Transformato: Application to the FreeSolv Database Using the CGenFF Force Field. J Chem Theory Comput 2023; 19:5988-5998. [PMID: 37616333 PMCID: PMC10500982 DOI: 10.1021/acs.jctc.3c00691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Indexed: 08/26/2023]
Abstract
We recently introduced transformato, an open-source Python package for the automated setup of large-scale calculations of relative solvation and binding free energy differences. Here, we extend the capabilities of transformato to the calculation of absolute solvation free energy differences. After careful validation against the literature results and reference calculations with the PERT module of CHARMM, we used transformato to compute absolute solvation free energies for most molecules in the FreeSolv database (621 out of 642). The force field parameters were obtained with the program cgenff (v2.5.1), which derives missing parameters from the CHARMM general force field (CGenFF v4.6). A long-range correction for the Lennard-Jones interactions was added to all computed solvation free energies. The mean absolute error compared to the experimental data is 1.12 kcal/mol. Our results allow a detailed comparison between the AMBER and CHARMM general force fields and provide a more in-depth understanding of the capabilities and limitations of the CGenFF small molecule parameters.
Affiliation(s)
- Johannes Karwounopoulos
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstr. 17, 1090 Vienna, Austria
  - Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstr. 42, 1090 Vienna, Austria
- Åsmund Kaupang
  - Department of Pharmacy, Section for Pharmaceutical Chemistry, University of Oslo, 0316 Oslo, Norway
- Marcus Wieder
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Althanstrasse 14, 1090 Vienna, Austria
- Stefan Boresch
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstr. 17, 1090 Vienna, Austria
4
Zhang ZY, Peng D, Liu L, Shen L, Fang WH. Machine Learning Prediction of Hydration Free Energy with Physically Inspired Descriptors. J Phys Chem Lett 2023; 14:1877-1884. [PMID: 36779933 DOI: 10.1021/acs.jpclett.2c03858] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
We present machine learning models for predicting experimental hydration free energies of molecules without any atom-, bond-, or geometry-specific input feature. Four types of physically inspired descriptors are adopted for predictions. The first type is composed of the total dipole moment, anisotropic polarizability, and vibrational analysis results of the solute molecule. The second and third types are derived from the electrostatic potential distribution of the solute. The last type includes the solvent accessible surface area and shape similarities. Several machine learning regression models are built on the basis of the FreeSolv database with ∼600 samples, showing a better performance in comparison with that of most traditional approaches and other prediction methods based on molecular fingerprints. In particular, the present descriptors are capable of predicting hydration free energies of new compounds with elements or fragments that are never seen in the training set. The importance of these descriptors, the impact of dissociation energies of specific covalent bonds, and the outliers with relatively large prediction errors are also discussed.
Affiliation(s)
- Zhan-Yun Zhang
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Ding Peng
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Lihong Liu
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
- Lin Shen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Yantai-Jingshi Institute of Material Genome Engineering, Yantai 265505, Shandong, P. R. China
- Wei-Hai Fang
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, P. R. China
  - Shandong Laboratory of Yantai Advanced Materials and Green Manufacturing, Yantai 264006, Shandong, P. R. China
5
Casillas L, Grigorian VM, Luchko T. Identifying Systematic Force Field Errors Using a 3D-RISM Element Counting Correction. Molecules 2023; 28:925. [PMID: 36770599 PMCID: PMC9921782 DOI: 10.3390/molecules28030925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2022] [Revised: 01/09/2023] [Accepted: 01/11/2023] [Indexed: 01/19/2023]
Abstract
Hydration free energies of small molecules are commonly used as benchmarks for solvation models. However, errors in predicting hydration free energies are partially due to the force fields used and not just the solvation model. To address this, we have used the 3D reference interaction site model (3D-RISM) of molecular solvation and existing benchmark explicit solvent calculations with a simple element count correction (ECC) to identify problems with the non-bond parameters in the general AMBER force field (GAFF). 3D-RISM was used to calculate hydration free energies of all 642 molecules in the FreeSolv database, and a partial molar volume correction (PMVC), ECC, and their combination (PMVECC) were applied to the results. The PMVECC produced a mean unsigned error of 1.01 ± 0.04 kcal/mol and root mean squared error of 1.44 ± 0.07 kcal/mol, better than the benchmark explicit solvent calculations from FreeSolv, and required less than 15 s of computing time per molecule on a single CPU core. Importantly, parameters for PMVECC showed systematic errors for molecules containing Cl, Br, I, and P. Applying ECC to the explicit solvent hydration free energies found the same systematic errors. The results strongly suggest that some small adjustments to the Lennard-Jones parameters for GAFF will lead to improved hydration free energy calculations for all solvent models.
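The entries in this list report accuracy as mean unsigned (absolute) error or root mean squared error, some in kcal/mol and some in kJ/mol. As a minimal sketch of how these two metrics are computed and converted between units, the Python snippet below evaluates them for a short list of hydration free energies. The function names and the data values are hypothetical and chosen only for illustration; they are not results from any of the cited papers.

```python
import math

KJ_PER_KCAL = 4.184  # thermochemical calorie: 1 kcal = 4.184 kJ


def mean_unsigned_error(predicted, reference):
    """Average of |predicted - reference| over all molecules."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)


def root_mean_squared_error(predicted, reference):
    """Square root of the average squared deviation."""
    mean_sq = sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)
    return math.sqrt(mean_sq)


# Hypothetical hydration free energies in kcal/mol (illustrative only).
predicted = [-1.0, -3.5, 0.5, -6.0]
reference = [-2.0, -3.0, 1.5, -4.0]

mue = mean_unsigned_error(predicted, reference)
rmse = root_mean_squared_error(predicted, reference)
mue_kj = mue * KJ_PER_KCAL  # same error expressed in kJ/mol

print(f"MUE  = {mue:.3f} kcal/mol ({mue_kj:.3f} kJ/mol)")
print(f"RMSE = {rmse:.3f} kcal/mol")
```

The RMSE weights large deviations more heavily than the MUE, so it is never smaller than the MUE for the same data set; the gap between the two gives a rough indication of how much outliers contribute.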
6
Gao P, Yang X, Tang YH, Zheng M, Andersen A, Murugesan V, Hollas A, Wang W. Graphical Gaussian process regression model for aqueous solvation free energy prediction of organic molecules in redox flow batteries. Phys Chem Chem Phys 2021; 23:24892-24904. [PMID: 34724700 DOI: 10.1039/d1cp04475c] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
The solvation free energy of organic molecules is a critical parameter in determining emergent properties such as solubility, liquid-phase equilibrium constants, pKa and redox potentials in an organic redox flow battery. In this work, we present a machine learning (ML) model that can learn and predict the aqueous solvation free energy of an organic molecule using the Gaussian process regression method based on a new molecular graph kernel. To investigate how well the ML model captures the electrostatic interaction, the nonpolar contribution of the solvent, and the conformational entropy of the solute in the solvation free energy, we test three data sets that differ in the water model (implicit or explicit) and in the contribution of the solute's conformational entropy. We demonstrate that our ML model can predict the solvation free energy of molecules at chemical accuracy with a mean absolute error of less than 1 kcal mol⁻¹ for subsets of the QM9 dataset and the FreeSolv database. To solve the general data scarcity problem for a graph-based ML model, we propose a dimension reduction algorithm based on the distance between molecular graphs, which can be used to examine the diversity of the molecular data set. It provides a promising way to build a minimum training set to improve prediction for certain test sets where the space of molecular structures is predetermined.
Affiliation(s)
- Peiyuan Gao
  - Pacific Northwest National Laboratory, Richland 99352, USA
- Xiu Yang
  - Department of Industrial and Systems Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Yu-Hang Tang
  - Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Muqing Zheng
  - Department of Industrial and Systems Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Amity Andersen
  - Pacific Northwest National Laboratory, Richland 99352, USA
- Aaron Hollas
  - Pacific Northwest National Laboratory, Richland 99352, USA
- Wei Wang
  - Pacific Northwest National Laboratory, Richland 99352, USA
7
Fortuna A, Costa PJ. Optimized Halogen Atomic Radii for PBSA Calculations Using Off-Center Point Charges. J Chem Inf Model 2021; 61:3361-3375. [PMID: 34185532 DOI: 10.1021/acs.jcim.1c00177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/13/2023]
Abstract
In force-field methods, the usage of off-center point charges, also called extra points (EPs), is a common strategy to tackle the anisotropy of the electrostatic potential of covalently bonded halogens (X), thus allowing the description of halogen bonds (XBs) at the molecular mechanics/molecular dynamics (MM/MD) level. Diverse EP implementations exist in the literature differing on the charge sets and/or the X-EP distances. Poisson-Boltzmann and surface area (PBSA) calculations can be used to obtain solvation free energies (ΔG_solv) of small molecules, often to compute binding free energies (ΔG_bind) at the MM-PBSA level. This method depends, among other parameters, on the empirical assignment of atomic radii (PB radii). Given the multiplicity of off-center point-charge models and the lack of specific PB radii for halogens compatible with such implementations, in this work, we assessed the performance of PBSA calculations for the estimation of ΔG_solv values in water (ΔG_hyd), also conducting an optimization of the halogen PB radii (Cl, Br, and I) for each EP model. We not only expand the usage of EP models in the scope of the general AMBER force field (GAFF) but also provide the first optimized halogen PB radii in the context of the CHARMM general force field (CGenFF), thus contributing to improving the description of halogenated compounds in PBSA calculations.
Affiliation(s)
- Andreia Fortuna
  - BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, 1749-016 Lisboa, Portugal
  - Research Institute for Medicines (iMed.ULisboa), Faculty of Pharmacy, University of Lisbon, Av. Professor Gama Pinto, 1649-003 Lisbon, Portugal
- Paulo J Costa
  - BioISI-Biosystems & Integrative Sciences Institute, Faculty of Sciences, University of Lisboa, 1749-016 Lisboa, Portugal