1. Wu J, Kang Y, Pan P, Hou T. Machine learning methods for pKa prediction of small molecules: Advances and challenges. Drug Discov Today 2022; 27:103372. [PMID: 36167281 DOI: 10.1016/j.drudis.2022.103372] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/20/2022] [Revised: 08/15/2022] [Accepted: 09/21/2022] [Indexed: 11/27/2022]
Abstract
The acid-base dissociation constant (pKa) is a fundamental property influencing many ADMET properties of small molecules. However, rapid and accurate pKa prediction remains a great challenge. In this review, we outline the current advances in machine-learning-based QSAR models for pKa prediction, including descriptor-based and graph-based approaches, and summarize their pros and cons. Moreover, we highlight the current challenges and future directions regarding experimental data, crucial factors influencing pKa, and in silico prediction tools. We hope that this review can provide practical guidance for follow-up studies.
Affiliation(s)
- Jialu Wu
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Peichen Pan
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
2. Mayr F, Wieder M, Wieder O, Langer T. Improving Small Molecule pKa Prediction Using Transfer Learning With Graph Neural Networks. Front Chem 2022; 10:866585. [PMID: 35721000 PMCID: PMC9204323 DOI: 10.3389/fchem.2022.866585] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/31/2022] [Accepted: 04/04/2022] [Indexed: 11/13/2022] Open
Abstract
Enumerating protonation states and calculating microstate pKa values of small molecules is an important yet challenging task for lead optimization and molecular modeling. Commercial and non-commercial solutions have notable limitations such as restrictive and expensive licenses, high CPU/GPU hour requirements, or the need for expert knowledge to set up and use. We present a graph neural network model that is trained on 714,906 calculated microstate pKa predictions from molecules obtained from the ChEMBL database. The model is fine-tuned on a set of 5,994 experimental pKa values significantly improving its performance on two challenging test sets. Combining the graph neural network model with Dimorphite-DL, an open-source program for enumerating ionization states, we have developed the open-source Python package pkasolver, which is able to generate and enumerate protonation states and calculate pKa values with high accuracy.
3. Ganyecz Á, Kállay M. Implementation and Optimization of the Embedded Cluster Reference Interaction Site Model with Atomic Charges. J Phys Chem A 2022; 126:2417-2429. [PMID: 35394778 PMCID: PMC9036516 DOI: 10.1021/acs.jpca.1c07904] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/30/2022]
Abstract
In this work, we implemented the embedded cluster reference interaction site model (EC-RISM) originally developed by Kloss, Heil, and Kast (J. Phys. Chem. B 2008, 112, 4337–4343). This method combines quantum mechanical calculations with the 3D reference interaction site model (3D-RISM). Numerous options, such as buffer, grid space, basis set, charge model, water model, closure relation, and so forth, were investigated to find the best settings. Additionally, the small point charges, which are derived from the solvent distribution from the 3D-RISM solution to represent the solvent in the QM calculation, were neglected to reduce the overhead without the loss of accuracy. On the MNSOL[a], MNSOL, and FreeSolv databases, our implemented and optimized method provides solvation free energies in water with 5.70, 6.32, and 6.44 kJ/mol root-mean-square deviations, respectively, but with different settings, 5.22, 6.08, and 6.63 kJ/mol can also be achieved. Only solvent models containing fitting parameters, like COSMO-RS and EC-RISM with universal correction and directly used electrostatic potential, perform better than our EC-RISM implementation with atomic charges.
Affiliation(s)
- Ádám Ganyecz
  - Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, Budapest P.O. Box 91, H-1521 Hungary
- Mihály Kállay
  - Department of Physical Chemistry and Materials Science, Budapest University of Technology and Economics, Budapest P.O. Box 91, H-1521 Hungary
4. Recent Developments of Computational Methods for pKa Prediction Based on Electronic Structure Theory with Solvation Models. J 2021. [DOI: 10.3390/j4040058] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/22/2022] Open
Abstract
The protonation/deprotonation reaction is one of the most fundamental processes in solutions and biological systems. Compounds with dissociative functional groups change their charge states by protonation/deprotonation. This change not only significantly alters the physical properties of a compound itself, but also has a profound effect on the surrounding molecules. In this paper, we review our recent developments of the methods for predicting the Ka, the equilibrium constant for protonation reactions or acid dissociation reactions. The pKa, which is a logarithm of Ka, is proportional to the reaction Gibbs energy of the protonation reaction, and the reaction free energy can be determined by electronic structure calculations with solvation models. The charge of the compound changes before and after protonation; therefore, the solvent effect plays an important role in determining the reaction Gibbs energy. Here, we review two solvation models: the continuum model, and the integral equation theory of molecular liquids. Furthermore, the reaction Gibbs energy calculations for the protonation reactions require special attention to the handling of dissociated protons. An efficient method for handling the free energy of dissociated protons will also be reviewed.
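The relation between pKa and the reaction Gibbs energy mentioned in this abstract can be illustrated with a short sketch. It assumes the standard thermodynamic identity pKa = ΔG / (RT ln 10), where pKa is the negative base-10 logarithm of Ka and ΔG is the standard Gibbs energy of the acid dissociation reaction in solution. The function names and the default temperature are illustrative choices, not taken from the paper, and the sketch says nothing about how ΔG itself is computed with a solvation model.

```python
import math

# Molar gas constant in kJ/(mol K).
GAS_CONSTANT_KJ = 8.314462618e-3


def pka_from_delta_g(delta_g_kj_mol, temperature_k=298.15):
    """Convert a dissociation Gibbs energy (kJ/mol) into a pKa value.

    Assumes pKa = delta_G / (R * T * ln 10), i.e. delta_G is the standard
    Gibbs energy of HA -> A- + H+ in solution (hypothetical input).
    """
    return delta_g_kj_mol / (GAS_CONSTANT_KJ * temperature_k * math.log(10.0))


def ka_from_pka(pka):
    """Recover the equilibrium constant Ka from pKa = -log10(Ka)."""
    return 10.0 ** (-pka)


if __name__ == "__main__":
    # Illustrative input only: about 5.7 kJ/mol corresponds to one pKa unit.
    print(round(pka_from_delta_g(5.708), 3))
```

At 298.15 K, one pKa unit corresponds to roughly 5.7 kJ/mol, which is why small errors in the computed solvation free energies translate into noticeable pKa errors.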
5. Gabriel TS, Hansen UP, Urban M, Drexler N, Winterstein T, Rauh O, Thiel G, Kast SM, Schroeder I. Asymmetric Interplay Between K+ and Blocker and Atomistic Parameters From Physiological Experiments Quantify K+ Channel Blocker Release. Front Physiol 2021; 12:737834. [PMID: 34777005 PMCID: PMC8586521 DOI: 10.3389/fphys.2021.737834] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/07/2021] [Accepted: 10/04/2021] [Indexed: 11/23/2022] Open
Abstract
Modulating the activity of ion channels by blockers yields information on both the mode of drug action and on the biophysics of ion transport. Here we investigate the interplay between ions in the selectivity filter (SF) of K+ channels and the release kinetics of the blocker tetrapropylammonium in the model channel KcvNTS. A quantitative expression calculates blocker release rate constants directly from voltage-dependent ion occupation probabilities in the SF. The latter are obtained by a kinetic model of single-channel currents recorded in the absence of the blocker. The resulting model contains only two adjustable parameters of ion-blocker interaction and holds for both symmetric and asymmetric ionic conditions. This data-derived model is corroborated by 3D reference interaction site model (3D RISM) calculations on several model systems, which show that the K+ occupation probability is unaffected by the blocker, a direct consequence of the strength of the ion-carbonyl attraction in the SF, independent of the specific protein background. Hence, KcvNTS channel blocker release kinetics can be reduced to a small number of system-specific parameters. The pore-independent asymmetric interplay between K+ and blocker ions potentially allows for generalizing these results to similar potassium channels.
Affiliation(s)
- Tobias S Gabriel
  - Plant Membrane Biophysics, Technische Universität Darmstadt, Darmstadt, Germany
- Ulf-Peter Hansen
  - Department of Structural Biology, Christian-Albrechts-Universität zu Kiel, Kiel, Germany
- Martin Urban
  - Physikalische Chemie III, Technische Universität Dortmund, Dortmund, Germany
- Nils Drexler
  - Institute of Physiology II, University Hospital Jena, Friedrich Schiller University Jena, Jena, Germany
- Tobias Winterstein
  - Plant Membrane Biophysics, Technische Universität Darmstadt, Darmstadt, Germany
- Oliver Rauh
  - Plant Membrane Biophysics, Technische Universität Darmstadt, Darmstadt, Germany
- Gerhard Thiel
  - Plant Membrane Biophysics, Technische Universität Darmstadt, Darmstadt, Germany
- Stefan M Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Dortmund, Germany
- Indra Schroeder
  - Plant Membrane Biophysics, Technische Universität Darmstadt, Darmstadt, Germany
  - Institute of Physiology II, University Hospital Jena, Friedrich Schiller University Jena, Jena, Germany
6. Sharma B, Tran VA, Pongratz T, Galazzo L, Zhurko I, Bordignon E, Kast SM, Neese F, Marx D. A Joint Venture of Ab Initio Molecular Dynamics, Coupled Cluster Electronic Structure Methods, and Liquid-State Theory to Compute Accurate Isotropic Hyperfine Constants of Nitroxide Probes in Water. J Chem Theory Comput 2021; 17:6366-6386. [PMID: 34516119 PMCID: PMC8515807 DOI: 10.1021/acs.jctc.1c00582] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 06/11/2021] [Indexed: 01/11/2023]
Abstract
The isotropic hyperfine coupling constant (HFCC, Aiso) of a pH-sensitive spin probe in a solution, HMI (2,2,3,4,5,5-hexamethylimidazolidin-1-oxyl, C9H19N2O) in water, is computed using an ensemble of state-of-the-art computational techniques and is gauged against X-band continuous wave electron paramagnetic resonance (EPR) measurement spectra at room temperature. Fundamentally, the investigation aims to delineate the cutting edge of current first-principles-based calculations of EPR parameters in aqueous solutions based on using rigorous statistical mechanics combined with correlated electronic structure techniques. In particular, the impact of solvation is described by exploiting fully atomistic, RISM integral equation, and implicit solvation approaches as offered by ab initio molecular dynamics (AIMD) of the periodic bulk solution (using the spin-polarized revPBE0-D3 hybrid functional), embedded cluster reference interaction site model integral equation theory (EC-RISM), and polarizable continuum embedding (using CPCM) of microsolvated complexes, respectively. HFCCs are obtained from efficient coupled cluster calculations (using open-shell DLPNO-CCSD theory) as well as from hybrid density functional theory (using revPBE0-D3). Re-solvation of "vertically desolvated" spin probe configuration snapshots by EC-RISM embedding is shown to provide significantly improved results compared to CPCM since only the former captures the inherent structural heterogeneity of the solvent close to the spin probe. The average values of the Aiso parameter obtained based on configurational statistics using explicit water within AIMD and from EC-RISM solvation are found to be satisfactorily close. Using either such explicit or RISM solvation in conjunction with DLPNO-CCSD calculations of the HFCCs provides an average Aiso parameter for HMI in aqueous solution at 300 K and 1 bar that is in good agreement with the experimentally determined one. 
The developed computational strategy is general in the sense that it can be readily applied to other spin probes of similar molecular complexity, to aqueous solutions beyond ambient conditions, as well as to other solvents in the longer run.
Affiliation(s)
- Bikramjit Sharma
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Van Anh Tran
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Tim Pongratz
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227 Dortmund, Germany
- Laura Galazzo
  - Faculty of Chemistry and Biochemistry, Ruhr University Bochum, 44780 Bochum, Germany
- Irina Zhurko
  - Laboratory of Nitrogen Compounds, N.N. Vorozhtsov Novosibirsk Institute of Organic Chemistry, NIOCH SB RAS, 9 Lavrentiev Avenue, 630090, Novosibirsk, Russia
- Enrica Bordignon
  - Faculty of Chemistry and Biochemistry, Ruhr University Bochum, 44780 Bochum, Germany
- Stefan M. Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227 Dortmund, Germany
- Frank Neese
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
- Dominik Marx
  - Lehrstuhl für Theoretische Chemie, Ruhr-Universität Bochum, 44780 Bochum, Germany
8. Tielker N, Güssregen S, Kast SM. SAMPL7 physical property prediction from EC-RISM theory. J Comput Aided Mol Des 2021; 35:933-941. [PMID: 34278539 PMCID: PMC8367877 DOI: 10.1007/s10822-021-00410-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/31/2021] [Accepted: 07/05/2021] [Indexed: 01/08/2023]
Abstract
Inspired by the successful application of the embedded cluster reference interaction site model (EC-RISM), a combination of quantum–mechanical calculations with three-dimensional RISM theory to predict Gibbs energies of species in solution within the SAMPL6.1 (acidity constants, pKa) and SAMPL6.2 (octanol–water partition coefficients, log P) the methodology was applied to the recent SAMPL7 physical property challenge on aqueous pKa and octanol–water log P values. Not part of the challenge but provided by the organizers, we also computed distribution coefficients log D7.4 from predicted pKa and log P data. While macroscopic pKa predictions compared very favorably with experimental data (root mean square error, RMSE 0.72 pK units), the performance of the log P model (RMSE 1.84) fell behind expectations from the SAMPL6.2 challenge, leading to reasonable log D7.4 predictions (RMSE 1.69) from combining the independent calculations. In the post-submission phase, conformations generated by different methodology yielded results that did not significantly improve the original predictions. While overall satisfactory compared to previous log D challenges, the predicted data suggest that further effort is needed for optimizing the robustness of the partition coefficient model within EC-RISM calculations and for shaping the agreement between experimental conditions and the corresponding model description.
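How a distribution coefficient can follow from a pKa and a log P value is sketched below. This is the textbook single-site approximation, assuming a monoprotic compound whose neutral form alone partitions into octanol; it is not the EC-RISM workflow of the paper, which may treat multiple protonation states and tautomers. The function name and parameters are hypothetical.

```python
import math


def log_d_monoprotic(log_p, pka, ph=7.4, is_acid=True):
    """Estimate log D at a given pH for a compound with one ionizable group.

    Assumption: only the neutral species enters the octanol phase, so
    log D = log P - log10(1 + 10**(pH - pKa)) for an acid and
    log D = log P - log10(1 + 10**(pKa - pH)) for a base.
    """
    exponent = (ph - pka) if is_acid else (pka - ph)
    return log_p - math.log10(1.0 + 10.0 ** exponent)


if __name__ == "__main__":
    # Illustrative values only: an acid with pKa equal to the pH is half
    # ionized, so log D drops below log P by log10(2).
    print(round(log_d_monoprotic(2.0, 7.4), 3))
```

Under this approximation, errors in the two inputs can add up in the predicted log D, which is consistent with the abstract's remark that the log D predictions came from combining independent calculations.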
Affiliation(s)
- Nicolas Tielker
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- Stefan Güssregen
  - Sanofi-Aventis Deutschland GmbH, R&D Integrated Drug Discovery, 65926, Frankfurt am Main, Germany
- Stefan M Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
9. Bergazin TD, Tielker N, Zhang Y, Mao J, Gunner MR, Francisco K, Ballatore C, Kast SM, Mobley DL. Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge. J Comput Aided Mol Des 2021; 35:771-802. [PMID: 34169394 PMCID: PMC8224998 DOI: 10.1007/s10822-021-00397-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 04/21/2021] [Accepted: 06/05/2021] [Indexed: 12/16/2022]
Abstract
The Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges focus the computational modeling community on areas in need of improvement for rational drug design. The SAMPL7 physical property challenge dealt with prediction of octanol-water partition coefficients and pKa for 22 compounds. The dataset was composed of a series of N-acylsulfonamides and related bioisosteres. 17 research groups participated in the log P challenge, submitting 33 blind submissions total. For the pKa challenge, 7 different groups participated, submitting 9 blind submissions in total. Overall, the accuracy of octanol-water log P predictions in the SAMPL7 challenge was lower than octanol-water log P predictions in SAMPL6, likely due to a more diverse dataset. Compared to the SAMPL6 pKa challenge, accuracy remains unchanged in SAMPL7. Interestingly, here, though macroscopic pKa values were often predicted with reasonable accuracy, there was dramatically more disagreement among participants as to which microscopic transitions produced these values (with methods often disagreeing even as to the sign of the free energy change associated with certain transitions), indicating far more work needs to be done on pKa prediction methods.
Affiliation(s)
- Nicolas Tielker
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- Yingying Zhang
  - Department of Physics, The Graduate Center, City University of New York, New York, 10016, USA
- Junjun Mao
  - Department of Physics, City College of New York, New York, 10031, USA
- M R Gunner
  - Department of Physics, The Graduate Center, City University of New York, New York, 10016, USA
  - Department of Physics, City College of New York, New York, 10031, USA
- Karol Francisco
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093-0756, USA
- Carlo Ballatore
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093-0756, USA
- Stefan M Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
10. Tielker N, Eberlein L, Hessler G, Schmidt KF, Güssregen S, Kast SM. Quantum-mechanical property prediction of solvated drug molecules: what have we learned from a decade of SAMPL blind prediction challenges? J Comput Aided Mol Des 2021; 35:453-472. [PMID: 33079358 PMCID: PMC8018924 DOI: 10.1007/s10822-020-00347-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 07/13/2020] [Accepted: 09/26/2020] [Indexed: 01/26/2023]
Abstract
Joint academic-industrial projects supporting drug discovery are frequently pursued to deploy and benchmark cutting-edge methodical developments from academia in a real-world industrial environment at different scales. The dimensionality of tasks ranges from small molecule physicochemical property assessment over protein-ligand interaction up to statistical analyses of biological data. This way, method development and usability both benefit from insights gained at both ends, when predictiveness and readiness of novel approaches are confirmed, but the pharmaceutical drug makers get early access to novel tools for the quality of drug products and benefit of patients. Quantum-mechanical and simulation methods particularly fall into this group of methods, as they require skills and expense in their development but also significant resources in their application, thus are comparatively slowly dripping into the realm of industrial use. Nevertheless, these physics-based methods are becoming more and more useful. Starting with a general overview of these and in particular quantum-mechanical methods for drug discovery we review a decade-long and ongoing collaboration between Sanofi and the Kast group focused on the application of the embedded cluster reference interaction site model (EC-RISM), a solvation model for quantum chemistry, to study small molecule chemistry in the context of joint participation in several SAMPL (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenges. Starting with early application to tautomer equilibria in water (SAMPL2) the methodology was further developed to allow for challenge contributions related to predictions of distribution coefficients (SAMPL5) and acidity constants (SAMPL6) over the years. Particular emphasis is put on a frequently overlooked aspect of measuring the quality of models, namely the retrospective analysis of earlier datasets and predictions in light of more recent and advanced developments. 
We therefore demonstrate the performance of the current methodical state of the art as developed and optimized for the SAMPL6 pKa and octanol-water log P challenges when re-applied to the earlier SAMPL5 cyclohexane-water log D and SAMPL2 tautomer equilibria datasets. Systematic improvement is not consistently found throughout despite the similarity of the problem class, i.e. protonation reactions and phase distribution. Hence, it is possible to learn about hidden bias in model assessment, as results derived from more elaborate methods do not necessarily improve quantitative agreement. This indicates the role of chance or coincidence for model development on the one hand which allows for the identification of systematic error and opportunities toward improvement and reveals possible sources of experimental uncertainty on the other. These insights are particularly useful for further academia-industry collaborations, as both partners are then enabled to optimize both the computational and experimental settings for data generation.
Affiliation(s)
- Nicolas Tielker
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- Lukas Eberlein
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- Gerhard Hessler
  - R&D Integrated Drug Discovery, Sanofi-Aventis Deutschland GmbH, 65926, Frankfurt am Main, Germany
- K Friedemann Schmidt
  - R&D Preclinical Safety, Sanofi-Aventis Deutschland GmbH, 65926, Frankfurt am Main, Germany
- Stefan Güssregen
  - R&D Integrated Drug Discovery, Sanofi-Aventis Deutschland GmbH, 65926, Frankfurt am Main, Germany
- Stefan M Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
11. Işık M, Rustenburg AS, Rizzi A, Gunner MR, Mobley DL, Chodera JD. Overview of the SAMPL6 pKa challenge: evaluating small molecule microscopic and macroscopic pKa predictions. J Comput Aided Mol Des 2021; 35:131-166. [PMID: 33394238 PMCID: PMC7904668 DOI: 10.1007/s10822-020-00362-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/16/2020] [Accepted: 11/17/2020] [Indexed: 01/01/2023]
Abstract
The prediction of acid dissociation constants (pKa) is a prerequisite for predicting many other properties of a small molecule, such as its protein-ligand binding affinity, distribution coefficient (log D), membrane permeability, and solubility. The prediction of each of these properties requires knowledge of the relevant protonation states and solution free energy penalties of each state. The SAMPL6 pKa Challenge was the first time that a separate challenge was conducted for evaluating pKa predictions as part of the Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) exercises. This challenge was motivated by significant inaccuracies observed in prior physical property prediction challenges, such as the SAMPL5 log D Challenge, caused by protonation state and pKa prediction issues. The goal of the pKa challenge was to assess the performance of contemporary pKa prediction methods for drug-like molecules. The challenge set was composed of 24 small molecules that resembled fragments of kinase inhibitors, a number of which were multiprotic. Eleven research groups contributed blind predictions for a total of 37 pKa distinct prediction methods. In addition to blinded submissions, four widely used pKa prediction methods were included in the analysis as reference methods. Collecting both microscopic and macroscopic pKa predictions allowed in-depth evaluation of pKa prediction performance. This article highlights deficiencies of typical pKa prediction evaluation approaches when the distinction between microscopic and macroscopic pKas is ignored; in particular, we suggest more stringent evaluation criteria for microscopic and macroscopic pKa predictions guided by the available experimental data. 
Top-performing submissions for macroscopic pKa predictions achieved RMSE of 0.7-1.0 pKa units and included both quantum chemical and empirical approaches, where the total number of extra or missing macroscopic pKas predicted by these submissions were fewer than 8 for 24 molecules. A large number of submissions had RMSE spanning 1-3 pKa units. Molecules with sulfur-containing heterocycles or iodo and bromo groups were less accurately predicted on average considering all methods evaluated. For a subset of molecules, we utilized experimentally-determined microstates based on NMR to evaluate the dominant tautomer predictions for each macroscopic state. Prediction of dominant tautomers was a major source of error for microscopic pKa predictions, especially errors in charged tautomers. The degree of inaccuracy in pKa predictions observed in this challenge is detrimental to the protein-ligand binding affinity predictions due to errors in dominant protonation state predictions and the calculation of free energy corrections for multiple protonation states. Underestimation of ligand pKa by 1 unit can lead to errors in binding free energy errors up to 1.2 kcal/mol. The SAMPL6 pKa Challenge demonstrated the need for improving pKa prediction methods for drug-like molecules, especially for challenging moieties and multiprotic molecules.
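Why a pKa error propagates into protonation-state free energy corrections can be illustrated with a minimal Henderson-Hasselbalch sketch. It assumes a single titratable site and an ideal solution at a fixed pH; the function names are hypothetical, and the sketch is not the analysis used in the paper, so it is not expected to reproduce the 1.2 kcal/mol figure quoted above.

```python
import math

# Molar gas constant in kcal/(mol K).
GAS_CONSTANT_KCAL = 1.987204e-3


def fraction_protonated(pka, ph):
    """Fraction of molecules in the protonated state (Henderson-Hasselbalch)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def state_penalty_kcal(fraction, temperature_k=298.15):
    """Free energy cost (kcal/mol) of selecting a state with this population.

    Assumption: penalty = -R * T * ln(fraction); a state populated at 100%
    carries no penalty.
    """
    return -GAS_CONSTANT_KCAL * temperature_k * math.log(fraction)


if __name__ == "__main__":
    # Illustrative comparison: a predicted pKa of 7.4 versus 8.4 at pH 7.4.
    for pka in (7.4, 8.4):
        frac = fraction_protonated(pka, ph=7.4)
        print(pka, round(frac, 3), round(state_penalty_kcal(frac), 3))
```

In this simple picture, shifting the pKa from 7.4 to 8.4 at pH 7.4 moves the protonated fraction from 0.5 to about 0.91, so a one-unit pKa error changes both the predicted dominant state and the size of the free energy correction.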
Affiliation(s)
- Mehtap Işık
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Tri-Institutional PhD Program in Chemical Biology, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, 10065, USA
- Ariën S Rustenburg
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Graduate Program in Physiology, Biophysics, and Systems Biology, Weill Cornell Medical College, New York, NY, 10065, USA
- Andrea Rizzi
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, 10065, USA
- M R Gunner
  - Department of Physics, City College of New York, New York, NY, 10031, USA
- David L Mobley
  - Department of Pharmaceutical Sciences and Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
12. Malloum A, Fifen JJ, Conradie J. Determination of the absolute solvation free energy and enthalpy of the proton in solutions. J Mol Liq 2021. [DOI: 10.1016/j.molliq.2020.114919] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 01/16/2023]
13. Yang Q, Li Y, Yang J, Liu Y, Zhang L, Luo S, Cheng J. Holistic Prediction of the pKa in Diverse Solvents Based on a Machine‐Learning Approach. Angew Chem Int Ed Engl 2020; 59:19282-19291. [DOI: 10.1002/anie.202008528] [Citation(s) in RCA: 64] [Impact Index Per Article: 16.0] [Received: 06/16/2020] [Revised: 07/13/2020] [Indexed: 12/12/2022]
Affiliation(s)
- Qi Yang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Yao Li
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Jin‐Dong Yang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Yidi Liu
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Long Zhang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Sanzhong Luo
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Jin‐Pei Cheng
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
14. Yang Q, Li Y, Yang J, Liu Y, Zhang L, Luo S, Cheng J. Holistic Prediction of the pKa in Diverse Solvents Based on a Machine‐Learning Approach. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.202008528] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/12/2022]
Affiliation(s)
- Qi Yang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Yao Li
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Jin‐Dong Yang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Yidi Liu
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Long Zhang
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Sanzhong Luo
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
- Jin‐Pei Cheng
  - Center of Basic Molecular Science, Department of Chemistry, Tsinghua University, 100084 Beijing, China
15. Hunt P, Hosseini-Gerami L, Chrien T, Plante J, Ponting DJ, Segall M. Predicting pKa Using a Combination of Semi-Empirical Quantum Mechanics and Radial Basis Function Methods. J Chem Inf Model 2020; 60:2989-2997. [PMID: 32357002 DOI: 10.1021/acs.jcim.0c00105] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 11/30/2022]
Abstract
The acid dissociation constant (pKa) has an important influence on molecular properties crucial to compound development in synthesis, formulation, and optimization of absorption, distribution, metabolism, and excretion properties. We will present a method that combines quantum mechanical calculations, at a semi-empirical level of theory, with machine learning to accurately predict pKa for a diverse range of mono- and polyprotic compounds. The resulting model has been tested on two external data sets, one specifically used to test pKa prediction methods (SAMPL6) and the second covering known drugs containing basic functionalities. Both sets were predicted with excellent accuracy (root-mean-square errors of 0.7-1.0 log units), comparable to other methodologies using a much higher level of theory and computational cost.
Affiliation(s)
- Peter Hunt
- Optibrium Ltd., F5-6 Blenheim House, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9PB, U.K
| | - Layla Hosseini-Gerami
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
| | - Tomas Chrien
- Optibrium Ltd., F5-6 Blenheim House, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9PB, U.K
| | - Jeffrey Plante
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, U.K
| | - David J Ponting
- Lhasa Limited, Granary Wharf House, 2 Canal Wharf, Leeds LS11 5PS, U.K
| | - Matthew Segall
- Optibrium Ltd., F5-6 Blenheim House, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9PB, U.K
| |
|
16
|
Patel P, Kuntz DM, Jones MR, Brooks BR, Wilson AK. SAMPL6 logP challenge: machine learning and quantum mechanical approaches. J Comput Aided Mol Des 2020; 34:495-510. [PMID: 32002780 PMCID: PMC10817701 DOI: 10.1007/s10822-020-00287-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/16/2019] [Accepted: 01/08/2020] [Indexed: 10/25/2022]
Abstract
Two different types of approaches: (a) approaches that combine quantitative structure activity relationships, quantum mechanical electronic structure methods, and machine-learning and, (b) electronic structure vertical solvation approaches, were used to predict the logP coefficients of 11 molecules as part of the SAMPL6 logP blind prediction challenge. Using electronic structures optimized with density functional theory (DFT), several molecular descriptors were calculated for each molecule, including van der Waals areas and volumes, HOMO/LUMO energies, dipole moments, polarizabilities, and electrophilic and nucleophilic superdelocalizabilities. A multilinear regression model and a partial least squares model were used to train a set of 97 molecules. As well, descriptors were generated using the molecular operating environment and used to create additional machine learning models. Electronic structure vertical solvation approaches considered include DFT and the domain-based local pair natural orbital methods combined with the solvated variant of the correlation consistent composite approach.
Affiliation(s)
- Prajay Patel
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824-1322, USA
| | - David M Kuntz
- Department of Chemistry and Center for Advanced Scientific Computing and Modeling (CASCaM), University of North Texas, Denton, TX, 76203-5070, USA
| | - Michael R Jones
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20852-5690, USA
| | - Bernard R Brooks
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20852-5690, USA
| | - Angela K Wilson
- Department of Chemistry, Michigan State University, East Lansing, MI, 48824-1322, USA.
- Department of Chemistry and Center for Advanced Scientific Computing and Modeling (CASCaM), University of North Texas, Denton, TX, 76203-5070, USA.
| |
|
17
|
Işık M, Bergazin TD, Fox T, Rizzi A, Chodera JD, Mobley DL. Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge. J Comput Aided Mol Des 2020; 34:335-370. [PMID: 32107702 PMCID: PMC7138020 DOI: 10.1007/s10822-020-00295-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/07/2019] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The SAMPL Challenges aim to focus the biomolecular and physical modeling community on issues that limit the accuracy of predictive modeling of protein-ligand binding for rational drug design. In the SAMPL5 log D Challenge, designed to benchmark the accuracy of methods for predicting drug-like small molecule transfer free energies from aqueous to nonpolar phases, participants found it difficult to make accurate predictions due to the complexity of protonation state issues. In the SAMPL6 log P Challenge, we asked participants to make blind predictions of the octanol-water partition coefficients of neutral species of 11 compounds and assessed how well these methods performed absent the complication of protonation state effects. This challenge builds on the SAMPL6 pKa Challenge, which asked participants to predict pKa values of a superset of the compounds considered in this log P challenge. Blind prediction sets of 91 prediction methods were collected from 27 research groups, spanning a variety of quantum mechanics (QM) or molecular mechanics (MM)-based physical methods, knowledge-based empirical methods, and mixed approaches. There was a 50% increase in the number of participating groups and a 20% increase in the number of submissions compared to the SAMPL5 log D Challenge. Overall, the accuracy of octanol-water log P predictions in SAMPL6 Challenge was higher than cyclohexane-water log D predictions in SAMPL5, likely because modeling only the neutral species was necessary for log P and several categories of method benefited from the vast amounts of experimental octanol-water log P data. There were many highly accurate methods: 10 diverse methods achieved RMSE less than 0.5 log P units. These included QM-based methods, empirical methods, and mixed methods with physical modeling supported with empirical corrections. A comparison of physical modeling methods showed that QM-based methods outperformed MM-based methods. The average RMSE of the most accurate five MM-based, QM-based, empirical, and mixed approach methods based on RMSE were 0.92 ± 0.13, 0.48 ± 0.06, 0.47 ± 0.05, and 0.50 ± 0.06, respectively.
Affiliation(s)
- Mehtap Işık
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA.
- Tri-Institutional PhD Program in Chemical Biology, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, 10065, USA.
| | | | - Thomas Fox
- Computational Chemistry, Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co KG, 88397, Biberach, Germany
| | - Andrea Rizzi
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, 10065, USA
| | - John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
| | - David L Mobley
- Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
- Department of Chemistry, University of California, Irvine, CA, 92697, USA
| |
|
18
|
Eberlein L, Beierlein FR, van Eikema Hommes NJR, Radadiya A, Heil J, Benner SA, Clark T, Kast SM, Richards NGJ. Tautomeric Equilibria of Nucleobases in the Hachimoji Expanded Genetic Alphabet. J Chem Theory Comput 2020; 16:2766-2777. [PMID: 32125859 DOI: 10.1021/acs.jctc.9b01079] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Indexed: 02/08/2023]
Abstract
Evolution has yielded biopolymers that are constructed from exactly four building blocks and are able to support Darwinian evolution. Synthetic biology aims to extend this alphabet, and we recently showed that 8-letter (hachimoji) DNA can support rule-based information encoding. One source of replicative error in non-natural DNA-like systems, however, is the occurrence of alternative tautomeric forms, which pair differently. Unfortunately, little is known about how structural modifications impact free-energy differences between tautomers of the non-natural nucleobases used in the hachimoji expanded genetic alphabet. Determining experimental tautomer ratios is technically difficult, and so, strategies for improving hachimoji DNA replication efficiency will benefit from accurate computational predictions of equilibrium tautomeric ratios. We now report that high-level quantum-chemical calculations in aqueous solution by the embedded cluster reference interaction site model, benchmarked against free-energy molecular simulations for solvation thermodynamics, provide useful quantitative information on the tautomer ratios of both Watson-Crick and hachimoji nucleobases. In agreement with previous computational studies, all four Watson-Crick nucleobases adopt essentially only one tautomer in water. This is not the case, however, for non-natural nucleobases and their analogues. For example, although the enols of isoguanine and a series of related purines are not populated in water, these heterocycles possess N1-H and N3-H keto tautomers that are similar in energy, thereby adversely impacting accurate nucleobase pairing. These robust computational strategies offer a firm basis for improving experimental measurements of tautomeric ratios, which are currently limited to studying molecules that exist only as two tautomers in solution.
Affiliation(s)
- Lukas Eberlein
- Physikalische Chemie III, Technische Universität Dortmund, Dortmund 44227, Germany
| | - Frank R Beierlein
- Computer-Chemistry-Centre and Interdisciplinary Centre for Molecular Materials, Department of Chemistry & Pharmacy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91054, Germany
| | - Nico J R van Eikema Hommes
- Computer-Chemistry-Centre and Interdisciplinary Centre for Molecular Materials, Department of Chemistry & Pharmacy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91054, Germany
| | - Ashish Radadiya
- School of Chemistry, Cardiff University, Cardiff CF10 3AT, U.K
| | - Jochen Heil
- Physikalische Chemie III, Technische Universität Dortmund, Dortmund 44227, Germany
| | - Steven A Benner
- Foundation for Applied Molecular Evolution, Alachua, Florida 32615, United States
| | - Timothy Clark
- Computer-Chemistry-Centre and Interdisciplinary Centre for Molecular Materials, Department of Chemistry & Pharmacy, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen 91054, Germany
| | - Stefan M Kast
- Physikalische Chemie III, Technische Universität Dortmund, Dortmund 44227, Germany
| | - Nigel G J Richards
- School of Chemistry, Cardiff University, Cardiff CF10 3AT, U.K.,Foundation for Applied Molecular Evolution, Alachua, Florida 32615, United States
| |
|
19
|
Standard state free energies, not pKas, are ideal for describing small molecule protonation and tautomeric states. J Comput Aided Mol Des 2020; 34:561-573. [PMID: 32052350 DOI: 10.1007/s10822-020-00280-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 10/21/2019] [Accepted: 01/08/2020] [Indexed: 12/14/2022]
Abstract
The pKa is the standard measure used to describe the aqueous proton affinity of a compound, indicating the proton concentration (pH) at which two protonation states (e.g. A⁻ and AH) have equal free energy. However, compounds can have additional protonation states (e.g. AH₂⁺), and may assume multiple tautomeric forms, with the protons in different positions (microstates). Macroscopic pKas give the pH where the molecule changes its total number of protons, while microscopic pKas identify the tautomeric states involved. As tautomers have the same number of protons, the free energy difference between them and their relative probability is pH independent so there is no pKa connecting them. The question arises: What is the best way to describe protonation equilibria of a complex molecule in any pH range? Knowing the number of protons and the relative free energy of all microstates at a single pH, ∆G°, provides all the information needed to determine the free energy, and thus the probability of each microstate at each pH. Microstate probabilities as a function of pH generate titration curves that highlight the low energy, observable microstates, which can then be compared with experiment. A network description connecting microstates as nodes makes it straightforward to test thermodynamic consistency of microstate free energies. The utility of this analysis is illustrated by a description of one molecule from the SAMPL6 Blind pKa Prediction Challenge. Analysis of microstate ∆G°s also makes a more compact way to archive and compare the pH dependent behavior of compounds with multiple protonatable sites.
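The statement in the abstract above, that the proton count and relative free energy of each microstate at one pH determine its probability at every pH, can be illustrated with a minimal sketch. It assumes the standard relation G_i(pH) = ∆G°_i + n_i · RT ln(10) · (pH − pH_ref) followed by Boltzmann weighting. The function names, the kcal/mol unit choice, and the example acid are hypothetical and are not taken from the cited paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def rt_ln10(temperature=298.15):
    """Free-energy change (kcal/mol) per bound proton for one pH unit."""
    return R_KCAL * temperature * math.log(10.0)


def microstate_probabilities(states, ph, ref_ph=0.0, temperature=298.15):
    """Boltzmann populations of microstates at a given pH.

    states: list of (n_protons, dg_ref) pairs, where dg_ref is the
    relative free energy in kcal/mol at the reference pH (an assumed
    input format, chosen for illustration).
    """
    rt = R_KCAL * temperature
    shift = rt_ln10(temperature) * (ph - ref_ph)
    # Each bound proton becomes less favourable as the pH rises.
    energies = [dg_ref + n * shift for n, dg_ref in states]
    g_min = min(energies)  # subtract the minimum for numerical stability
    weights = [math.exp(-(g - g_min) / rt) for g in energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical monoprotic acid with pKa 4.76: AH (1 proton) and A- (0 protons).
pka = 4.76
states = [(1, 0.0), (0, pka * rt_ln10())]
for ph in (2.0, 4.76, 7.0):
    p_ah, p_a = microstate_probabilities(states, ph)
    print(f"pH {ph:5.2f}: P(AH) = {p_ah:.3f}, P(A-) = {p_a:.3f}")
```

Tautomers enter the same list with equal proton counts, so their relative population comes out pH independent, as the abstract notes.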
|
20
|
Pongratz T, Kibies P, Eberlein L, Tielker N, Hölzl C, Imoto S, Beck Erlach M, Kurrmann S, Schummel PH, Hofmann M, Reiser O, Winter R, Kremer W, Kalbitzer HR, Marx D, Horinek D, Kast SM. Pressure-dependent electronic structure calculations using integral equation-based solvation models. Biophys Chem 2020; 257:106258. [DOI: 10.1016/j.bpc.2019.106258] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 08/10/2019] [Accepted: 08/25/2019] [Indexed: 12/18/2022]
|
21
|
Tielker N, Tomazic D, Eberlein L, Güssregen S, Kast SM. The SAMPL6 challenge on predicting octanol-water partition coefficients from EC-RISM theory. J Comput Aided Mol Des 2020; 34:453-461. [PMID: 31981015 PMCID: PMC7125249 DOI: 10.1007/s10822-020-00283-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/15/2019] [Accepted: 01/08/2020] [Indexed: 12/14/2022]
Abstract
Results are reported for octanol–water partition coefficients (log P) of the neutral states of drug-like molecules provided during the SAMPL6 (Statistical Assessment of Modeling of Proteins and Ligands) blind prediction challenge from applying the “embedded cluster reference interaction site model” (EC-RISM) as a solvation model for quantum-chemical calculations. Following the strategy outlined during earlier SAMPL challenges we first train 1- and 2-parameter water-free (“dry”) and water-saturated (“wet”) models for n-octanol solvation Gibbs energies with respect to experimental values from the “Minnesota Solvation Database” (MNSOL), yielding a root mean square error (RMSE) of 1.5 kcal mol⁻¹ for the best-performing 2-parameter wet model, while the optimal water model developed for the pKa part of the SAMPL6 challenge is kept unchanged (RMSE 1.6 kcal mol⁻¹ for neutral compounds from a model trained on both neutral and ionic species). Applying these models to the blind prediction set yields a log P RMSE of less than 0.5 for our best model (2-parameters, wet). Further analysis of our results reveals that a single compound is responsible for most of the error, SM15, without which the RMSE drops to 0.2. Since this is the only compound in the challenge dataset with a hydroxyl group we investigate other alcohols for which Gibbs energy of solvation data for both water and n-octanol are available in the MNSOL database to demonstrate a systematic cause of error and to discuss strategies for improvement.
Affiliation(s)
- Nicolas Tielker
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Daniel Tomazic
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Lukas Eberlein
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Stefan Güssregen
- Sanofi-Aventis Deutschland GmbH, R&D Integrated Drug Discovery, 65926, Frankfurt am Main, Germany
| | - Stefan M Kast
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany.
| |
|
22
|
Munte CE, Karl M, Kauter W, Eberlein L, Pham TV, Erlach MB, Kast SM, Kremer W, Kalbitzer HR. High pressure response of ¹H NMR chemical shifts of purine nucleotides. Biophys Chem 2019; 254:106261. [PMID: 31522070 DOI: 10.1016/j.bpc.2019.106261] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/03/2019] [Revised: 09/01/2019] [Accepted: 09/01/2019] [Indexed: 11/25/2022]
Abstract
The study of the pressure response by NMR spectroscopy provides information on the thermodynamics of conformational equilibria in proteins and nucleic acids. For obtaining a database for expected pressure effects on free nucleotides and nucleotides bound in macromolecular complexes, the pressure response of ¹H chemical shifts and J-coupling constants of the purine 5'-ribonucleotides AMP, ADP, ATP, GMP, GDP, and GTP were studied in the absence and presence of Mg²⁺ ions. Experiments are supported by quantum-chemical calculations of populations and chemical shift differences in order to corroborate structural interpretations and to estimate missing data for AMP. The preference of the ribose S puckering obtained from the analysis of the experimental J-couplings is also confirmed by the calculations. In addition, the pressure response of the non-hydrolysable GTP analogues GppNHp, GppCH2p, and GTPγS was examined within a pressure range up to 200 MPa. As observed earlier for ³¹P NMR chemical shifts of these nucleotides the pressure dependence of chemical shifts is clearly non-linear in most cases. In di- and tri-phospho nucleosides, the resonances of the two protons bound to the ribose 5' carbon are non-equivalent and can be observed separately. The gg-rotamer at C4'-C5' bond is strongly preferred and the downfield shifted resonance can be assigned to the H5″ proton in the nucleotides. In contrast, in adenosine itself the frequencies of the two resonances are interchanged.
Affiliation(s)
- Claudia E Munte
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Matthias Karl
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Waldemar Kauter
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Lukas Eberlein
- TU Dortmund University, Physical Chemistry III, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany
| | - Thuy-Vy Pham
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Markus Beck Erlach
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Stefan M Kast
- TU Dortmund University, Physical Chemistry III, Otto-Hahn-Straße 4a, 44227 Dortmund, Germany
| | - Werner Kremer
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany
| | - Hans Robert Kalbitzer
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, Center of Magnetic Resonance in Chemistry and Biomedicine, Universitätsstraße 31, 93053 Regensburg, Germany.
| |
|
23
|
Tielker N, Eberlein L, Chodun C, Güssregen S, Kast SM. pKa calculations for tautomerizable and conformationally flexible molecules: partition function vs. state transition approach. J Mol Model 2019; 25:139. [PMID: 31041535 DOI: 10.1007/s00894-019-4033-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/17/2019] [Accepted: 04/07/2019] [Indexed: 11/26/2022]
Abstract
Calculations of acidities of molecules with multiple tautomeric and/or conformational states require adequate treatment of the relative energetics of accessible states accompanied by a statistical-mechanical formulation of their contribution to the macroscopic pKa value. Here, we demonstrate rigorously the formal equivalence of two such approaches: a partition function treatment and statistics over transitions between molecular tautomeric and conformational states in the limit of a theory that does not require adjustment by empirical parameters correcting energetic values. However, for a frequently employed correction scheme, linear scaling of (free) energies and regression with respect to reference data taking an additive constant into account, this equivalence breaks down if more than one acid or base state is involved. The consequences of the resulting inconsistency are discussed on our datasets developed for aqueous pKa predictions during the recent SAMPL6 challenge, where molecular state energetics were computed based on the "embedded cluster reference interaction site model" (EC-RISM). This method couples integral equation theory as a solvation model to quantum-chemical calculations and yielded a test set root mean square error of 1.1 pK units from a partition function ansatz. For all practical purposes, the present results indicate that a state transition approach yields comparable accuracy despite the formal theoretical inconsistency, and that an additive regression intercept, which is strictly constant in the limit of large compound mass only, is a valid approximation.
Graphical abstract: Embedded cluster reference interaction site model-derived vs. experimental pKa for the test set calculated with either the partition function (blue) or the state transition approach (red), using m as a free parameter.
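As a rough illustration of the partition function treatment mentioned in the abstract above, the sketch below collapses the tautomeric and conformational states of the acid and of the base into ensemble free energies, G = −RT ln Σ exp(−G_i/RT), and converts their difference into a macroscopic pKa. This is an unscaled textbook form under stated assumptions: it omits the empirical slope and intercept that the abstract discusses, and `proton_term` (a constant standing in for the free energy of the solvated proton), the function names, and the kcal/mol units are hypothetical rather than the authors' implementation.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ensemble_free_energy(energies, temperature=298.15):
    """Free energy of a set of states: -RT ln(sum_i exp(-G_i / RT))."""
    rt = R_KCAL * temperature
    g_min = min(energies)  # factor out the minimum for numerical stability
    z = sum(math.exp(-(g - g_min) / rt) for g in energies)
    return g_min - rt * math.log(z)


def macroscopic_pka(acid_energies, base_energies,
                    proton_term=0.0, temperature=298.15):
    """Macroscopic pKa from state free energies (kcal/mol).

    proton_term is a hypothetical additive constant for the proton.
    """
    rt_ln10 = R_KCAL * temperature * math.log(10.0)
    dg = (ensemble_free_energy(base_energies, temperature)
          + proton_term
          - ensemble_free_energy(acid_energies, temperature))
    return dg / rt_ln10


# Hypothetical example: one acid state, two degenerate base tautomers.
rt_ln10 = R_KCAL * 298.15 * math.log(10.0)
single = macroscopic_pka([0.0], [5.0 * rt_ln10])
double = macroscopic_pka([0.0], [5.0 * rt_ln10, 5.0 * rt_ln10])
print(f"one base state:  pKa = {single:.3f}")
print(f"two base states: pKa = {double:.3f}")
```

With two degenerate base states the result is lower than the single-state value by log10(2), because the extra state adds statistical weight to the deprotonated side.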
Affiliation(s)
- Nicolas Tielker
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Lukas Eberlein
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Christian Chodun
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
| | - Stefan Güssregen
- R&D Integrated Drug Discovery, Sanofi-Aventis Deutschland GmbH, 65926, Frankfurt am Main, Germany
| | - Stefan M Kast
- Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany.
| |
|