1
Zamora WJ, Viayna A, Pinheiro S, Curutchet C, Bisbal L, Ruiz R, Ràfols C, Luque FJ. Prediction of toluene/water partition coefficients in the SAMPL9 blind challenge: assessment of machine learning and IEF-PCM/MST continuum solvation models. Phys Chem Chem Phys 2023; 25:17952-17965. [PMID: 37376995 DOI: 10.1039/d3cp01428b] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 06/29/2023]
Abstract
In recent years the use of partition systems other than the widely used biphasic n-octanol/water has received increased attention to gain insight into the molecular features that dictate the lipophilicity of compounds. Thus, the difference between n-octanol/water and toluene/water partition coefficients has proven to be a valuable descriptor to study the propensity of molecules to form intramolecular hydrogen bonds and exhibit chameleon-like properties that modulate solubility and permeability. In this context, this study reports the experimental toluene/water partition coefficients (log Ptol/w) for a series of 16 drugs that were selected as an external test set in the framework of the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) blind challenge. This external set has been used by the computational community to calibrate their methods in the current edition (SAMPL9) of this contest. Furthermore, the study also investigates the performance of two computational strategies for the prediction of log Ptol/w. The first relies on the development of two machine learning (ML) models, which are built up by combining the selection of 11 molecular descriptors in conjunction with either the multiple linear regression (MLR) or the random forest regression (RFR) model to target a dataset of 252 experimental log Ptol/w values. The second consists of the parametrization of the IEF-PCM/MST continuum solvation model from B3LYP/6-31G(d) calculations to predict the solvation free energies of 163 compounds in toluene and benzene. The performance of the ML and IEF-PCM/MST models has been calibrated against external test sets, including the compounds that define the SAMPL9 log Ptol/w challenge. The results are used to discuss the merits and weaknesses of the two computational approaches.
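Note: a continuum solvation model such as the one described above yields solvation free energies, which map onto a partition coefficient through the standard thermodynamic relation log P = (ΔG_solv,water − ΔG_solv,organic) / (RT ln 10). The sketch below only illustrates that conversion; the function name, the choice of kcal/mol units, and the input values are assumptions for illustration and are not taken from the paper.

```python
import math

# Molar gas constant in kcal/(mol*K)
R_KCAL = 1.98720425864083e-3


def log_p_from_solvation(dg_water, dg_organic, temperature=298.15):
    """Return log P (organic/water) from solvation free energies in kcal/mol.

    A more negative solvation free energy in the organic phase than in
    water gives a positive log P.
    """
    return (dg_water - dg_organic) / (R_KCAL * temperature * math.log(10))


# Hypothetical free energies, for illustration only (not from the paper)
example = log_p_from_solvation(dg_water=-5.0, dg_organic=-7.0)
print(f"log P = {example:.2f}")
```

At 298.15 K the denominator is roughly 1.36 kcal/mol, so each 1.36 kcal/mol of transfer free energy corresponds to about one log P unit.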
Affiliation(s)
- William J Zamora
  - CBio3 Laboratory, School of Chemistry, University of Costa Rica, San Pedro, San José, Costa Rica
  - Laboratory of Computational Toxicology and Artificial Intelligence (LaToxCIA), Biological Testing Laboratory (LEBi), University of Costa Rica, San Pedro, San José, Costa Rica
  - Advanced Computing Lab (CNCA), National High Technology Center (CeNAT), Pavas, San José, Costa Rica
- Antonio Viayna
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Facultat de Farmàcia i Ciències de l'Alimentació, Universitat de Barcelona (UB), Av. Prat de la Riba 171, 08921 Santa Coloma de Gramenet, Spain
  - Institut de Biomedicina (IBUB), Universitat de Barcelona (UB), Barcelona, Spain
  - Institut de Química Teòrica i Computacional (IQTC-UB), Universitat de Barcelona (UB), Barcelona, Spain
- Silvana Pinheiro
  - CBio3 Laboratory, School of Chemistry, University of Costa Rica, San Pedro, San José, Costa Rica
  - Laboratory of Computational Toxicology and Artificial Intelligence (LaToxCIA), Biological Testing Laboratory (LEBi), University of Costa Rica, San Pedro, San José, Costa Rica
- Carles Curutchet
  - Institut de Química Teòrica i Computacional (IQTC-UB), Universitat de Barcelona (UB), Barcelona, Spain
  - Departament de Farmàcia i Tecnologia Farmacèutica, i Fisicoquímica, Facultat de Farmàcia i Ciències de l'Alimentació, Universitat de Barcelona (UB), Av. Joan XXIII 27-31, 08028 Barcelona, Spain
- Laia Bisbal
  - Institut de Biomedicina (IBUB), Universitat de Barcelona (UB), Barcelona, Spain
  - Departament d'Enginyeria Química i Química Analítica, Universitat de Barcelona (UB), Martí i Franquès 1-11, 08028 Barcelona, Spain
- Rebeca Ruiz
  - Pion Inc., Forest Row Business Park, Forest Row RH18 5DW, UK
- Clara Ràfols
  - Institut de Biomedicina (IBUB), Universitat de Barcelona (UB), Barcelona, Spain
  - Departament d'Enginyeria Química i Química Analítica, Universitat de Barcelona (UB), Martí i Franquès 1-11, 08028 Barcelona, Spain
- F Javier Luque
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Facultat de Farmàcia i Ciències de l'Alimentació, Universitat de Barcelona (UB), Av. Prat de la Riba 171, 08921 Santa Coloma de Gramenet, Spain
  - Institut de Biomedicina (IBUB), Universitat de Barcelona (UB), Barcelona, Spain
  - Institut de Química Teòrica i Computacional (IQTC-UB), Universitat de Barcelona (UB), Barcelona, Spain
2
Vázquez J, Ginex T, Herrero A, Morisseau C, Hammock BD, Luque FJ. Screening and Biological Evaluation of Soluble Epoxide Hydrolase Inhibitors: Assessing the Role of Hydrophobicity in the Pharmacophore-Guided Search of Novel Hits. J Chem Inf Model 2023; 63:3209-3225. [PMID: 37141492 PMCID: PMC10207366 DOI: 10.1021/acs.jcim.3c00301] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Indexed: 05/06/2023]
Abstract
The human soluble epoxide hydrolase (sEH) is a bifunctional enzyme that modulates the levels of regulatory epoxy lipids. The hydrolase activity is carried out by a catalytic triad located at the center of a wide L-shaped binding site, which contains two hydrophobic subpockets at both sides. On the basis of these structural features, it can be assumed that desolvation is a major factor in determining the maximal achievable affinity that can be attained for this pocket. Accordingly, hydrophobic descriptors may be better suited to the search of novel hits targeting this enzyme. This study examines the suitability of quantum mechanically derived hydrophobic descriptors in the discovery of novel sEH inhibitors. To this end, three-dimensional quantitative structure-activity relationship (3D-QSAR) pharmacophores were generated by combining electrostatic and steric or alternatively hydrophobic and hydrogen-bond parameters in conjunction with a tailored list of 76 known sEH inhibitors. The pharmacophore models were then validated by using two external sets chosen (i) to rank the potency of four distinct series of compounds and (ii) to discriminate actives from decoys, using in both cases datasets taken from the literature. Finally, a prospective study was performed including a virtual screening of two chemical libraries to identify new potential hits, which were subsequently experimentally tested for their inhibitory activity on human, rat, and mouse sEH. The use of hydrophobic-based descriptors led to the identification of six compounds as inhibitors of the human enzyme with IC50 < 20 nM, including two with IC50 values of 0.4 and 0.7 nM. The results support the use of hydrophobic descriptors as a valuable tool in the search of novel scaffolds that encode a proper hydrophilic/hydrophobic distribution complementary to the target's binding site.
Affiliation(s)
- Javier Vázquez
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Facultat de Farmàcia i Ciències de l'Alimentació, Institut de Biomedicina (IBUB), Prat de la Riba 171, 08921 Santa Coloma de Gramenet, Spain
  - Pharmacelera, Parc Científic de Barcelona (PCB), Baldiri Reixac 4-8, 08028 Barcelona, Spain
- Tiziana Ginex
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Facultat de Farmàcia i Ciències de l'Alimentació, Institut de Biomedicina (IBUB), Prat de la Riba 171, 08921 Santa Coloma de Gramenet, Spain
- Albert Herrero
  - Pharmacelera, Parc Científic de Barcelona (PCB), Baldiri Reixac 4-8, 08028 Barcelona, Spain
- Christophe Morisseau
  - Department of Entomology and Nematology, and Comprehensive Cancer Center, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- Bruce D. Hammock
  - Department of Entomology and Nematology, and Comprehensive Cancer Center, University of California, Davis, One Shields Avenue, Davis, California 95616, United States
- F. Javier Luque
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Facultat de Farmàcia i Ciències de l'Alimentació, Institut de Biomedicina (IBUB) and Institut de Química Teòrica i Computacional (IQTC-UB), Prat de la Riba 171, 08921 Santa Coloma de Gramenet, Spain
3
Ruiz R, Zamora WJ, Ràfols C, Bosch E. Molecular characteristics of several drugs evaluated from solvent/water partition measurements: Solvation parameters and intramolecular hydrogen bond indicator. Eur J Pharm Sci 2022; 168:106066. [PMID: 34767947 DOI: 10.1016/j.ejps.2021.106066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/29/2021] [Revised: 11/03/2021] [Accepted: 11/05/2021] [Indexed: 11/03/2022]
Abstract
A wide set of well-known drugs, most of them included in Abraham's reference database and covering a wide variety of chemical structures and therapeutic functionalities, was chosen in order to determine some molecular properties from solvent/water partition measurements. Partition data between aqueous solutions and four different solvents (n-dodecane, toluene, chloroform and n-octanol) were measured and reported. From them, Abraham's molecular descriptors of the selected compounds (A, B and S, accounting for hydrogen bond donor capacity, hydrogen bond acceptor capacity and dipolarity/polarizability, respectively) were estimated. The A and B values derived from the experimental measurements agree closely with the tabulated ones, showing the suitability of the procedure for obtaining reliable values for new molecules. However, the S values obtained differ from those previously reported for several compounds. Moreover, values for a new indicator of the propensity to form intramolecular hydrogen bonds (Δlog Poct-tol) were estimated from the experimental data and also calculated according to both Abraham's model and the molecular structures (SMD). The quality of both series of calculated descriptors was evaluated by comparison with the experimental values, and satisfactory results were obtained in both instances. Thus, Abraham's approach is useful when molecular descriptors are available, but very good estimations can be achieved by SMD, which only requires the drug's molecular structure.
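Note: the Δlog Poct-tol indicator mentioned in this abstract is, going by its name and by the description in entry 1, the difference between the n-octanol/water and toluene/water partition coefficients. A minimal sketch of that arithmetic follows; the sign convention (octanol minus toluene) is assumed from the subscript, and the compound names and values are placeholders rather than data from the paper.

```python
def delta_log_p_oct_tol(log_p_oct: float, log_p_tol: float) -> float:
    """Difference between n-octanol/water and toluene/water log P values."""
    return log_p_oct - log_p_tol


# Placeholder compounds and values, for illustration only (not from the paper)
measurements = {
    "compound_A": {"log_p_oct": 2.5, "log_p_tol": 1.0},
    "compound_B": {"log_p_oct": 3.0, "log_p_tol": 2.5},
}

indicator = {
    name: delta_log_p_oct_tol(values["log_p_oct"], values["log_p_tol"])
    for name, values in measurements.items()
}

for name, value in indicator.items():
    print(f"{name}: delta log P(oct-tol) = {value:.2f}")
```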
Affiliation(s)
- Rebeca Ruiz
  - Pion Inc., Forest Row Business Park, Forest Row RH18 5DW, UK
- William J Zamora
  - School of Chemistry and Faculty of Pharmacy, University of Costa Rica, San Pedro, San José, Costa Rica
  - Advanced Computing Lab (CNCA), National High Technology Center (CeNAT), Pavas, San José, Costa Rica
- Clara Ràfols
  - Departament d'Enginyeria Química i Química Analítica and Institut de Biomedicina (IBUB), Universitat de Barcelona, Martí i Franquès 1-11, 08028 Barcelona, Spain
- Elisabeth Bosch
  - Departament d'Enginyeria Química i Química Analítica and Institut de Biomedicina (IBUB), Universitat de Barcelona, Martí i Franquès 1-11, 08028 Barcelona, Spain
4
Viayna A, Pinheiro S, Curutchet C, Luque FJ, Zamora WJ. Prediction of n-octanol/water partition coefficients and acidity constants (pKa) in the SAMPL7 blind challenge with the IEFPCM-MST model. J Comput Aided Mol Des 2021; 35:803-811. [PMID: 34244905 PMCID: PMC8295120 DOI: 10.1007/s10822-021-00394-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/23/2021] [Accepted: 06/04/2021] [Indexed: 12/17/2022]
Abstract
Within the scope of the SAMPL7 challenge for predicting physical properties, the Integral Equation Formalism of the Miertus-Scrocco-Tomasi (IEFPCM/MST) continuum solvation model has been used for the blind prediction of n-octanol/water partition coefficients and acidity constants of a set of 22 and 20 sulfonamide-containing compounds, respectively. The log P and pKa were computed using the B3LYP/6-31G(d) parametrized version of the IEFPCM/MST model. The performance of our method for partition coefficients yielded a root-mean-square error of 1.03 (log P units), placing this method among the most accurate theoretical approaches both globally (rank 8th) and among physical methods (rank 2nd). On the other hand, the deviation between predicted and experimental pKa values was 1.32 log units, making it the second best-ranked submission. Though this highlights the reliability of the IEFPCM/MST model for predicting the partitioning and the acid dissociation constant of drug-like compounds, the results are discussed to identify potential weaknesses and improve the performance of the method.
Affiliation(s)
- Antonio Viayna
  - Department of Nutrition, Food Sciences and Gastronomy, Faculty of Pharmacy and Food Sciences, Institute of Biomedicine (IBUB), and Institute of Theoretical and Computational Chemistry (IQTC-UB), University of Barcelona (UB), Avda. Prat de La Riba, 171, 08921, Santa Coloma de Gramenet, Spain
- Silvana Pinheiro
  - Institute of Exact and Natural Sciences, Federal University of Pará, Belém, Pará, 66075-110, Brazil
- Carles Curutchet
  - Department of Pharmacy and Pharmaceutical Technology and Physical Chemistry, Faculty of Pharmacy and Food Sciences, and Institute of Theoretical and Computational Chemistry (IQTC-UB), University of Barcelona, Av. de Joan XXIII, 27-31, 08028, Barcelona, Spain
- F Javier Luque
  - Department of Nutrition, Food Sciences and Gastronomy, Faculty of Pharmacy and Food Sciences, Institute of Biomedicine (IBUB), and Institute of Theoretical and Computational Chemistry (IQTC-UB), University of Barcelona (UB), Avda. Prat de La Riba, 171, 08921, Santa Coloma de Gramenet, Spain
- William J Zamora
  - School of Chemistry and Faculty of Pharmacy, University of Costa Rica, San Pedro, San José, Costa Rica
  - Advanced Computing Lab (CNCA), National High Technology Center (CeNAT), Pavas, San José, Costa Rica
5
Donyapour N, Dickson A. Predicting partition coefficients for the SAMPL7 physical property challenge using the ClassicalGSG method. J Comput Aided Mol Des 2021; 35:819-830. [PMID: 34181200 PMCID: PMC8295205 DOI: 10.1007/s10822-021-00400-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/25/2021] [Accepted: 06/17/2021] [Indexed: 02/02/2023]
Abstract
The prediction of log P values is one part of the Statistical Assessment of the Modeling of Proteins and Ligands (SAMPL) blind challenges. Here, we use a molecular graph representation method called Geometric Scattering for Graphs (GSG) to transform atomic attributes to molecular features. The atomic attributes used here are parameters from classical molecular force fields including partial charges and Lennard-Jones interaction parameters. The molecular features from GSG are used as inputs to neural networks that are trained using a "master" dataset comprised of over 41,000 unique log P values. The specific molecular targets in the SAMPL7 log P prediction challenge were unique in that they all contained a sulfonyl moiety. This motivated a set of ClassicalGSG submissions where predictors were trained on different subsets of the master dataset that are filtered according to chemical types and/or the presence of the sulfonyl moiety. We find that our ranked prediction obtained 5th place with an RMSE of 0.77 log P units and an MAE of 0.62, while one of our non-ranked predictions achieved first place among all submissions with an RMSE of 0.55 and an MAE of 0.44. After the conclusion of the challenge we also examined the performance of open-source force field parameters that allow for an end-to-end log P predictor model: General AMBER Force Field (GAFF), Universal Force Field (UFF), Merck Molecular Force Field 94 (MMFF94) and Ghemical. We find that ClassicalGSG models trained with atomic attributes from MMFF94 can yield more accurate predictions compared to those trained with CGenFF atomic attributes.
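Note: the RMSE and MAE figures quoted in this and neighbouring abstracts are the usual root-mean-square and mean absolute errors between predicted and experimental values. The sketch below shows how such scores are computed; the function names and the sample numbers are assumptions for illustration and do not reproduce any SAMPL submission.

```python
import math


def _errors(predicted, observed):
    """Pairwise prediction errors; the two sequences must have equal length."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return [p - o for p, o in zip(predicted, observed)]


def rmse(predicted, observed):
    """Root-mean-square error, in the units of the inputs (e.g. log P units)."""
    errors = _errors(predicted, observed)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(predicted, observed):
    """Mean absolute error, in the units of the inputs."""
    errors = _errors(predicted, observed)
    return sum(abs(e) for e in errors) / len(errors)


# Hypothetical predicted and experimental log P values, for illustration only
predicted_log_p = [1.0, 2.0, 3.0]
experimental_log_p = [1.0, 2.0, 5.0]
print(f"RMSE = {rmse(predicted_log_p, experimental_log_p):.2f}")
print(f"MAE  = {mae(predicted_log_p, experimental_log_p):.2f}")
```

Because RMSE squares each error before averaging, it is never smaller than MAE and is more sensitive to a few large outliers, which is why challenge reports often quote both.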
Affiliation(s)
- Nazanin Donyapour
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA
- Alex Dickson
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI, USA
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
6
Bergazin TD, Tielker N, Zhang Y, Mao J, Gunner MR, Francisco K, Ballatore C, Kast SM, Mobley DL. Evaluation of log P, pKa, and log D predictions from the SAMPL7 blind challenge. J Comput Aided Mol Des 2021; 35:771-802. [PMID: 34169394 PMCID: PMC8224998 DOI: 10.1007/s10822-021-00397-3] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 04/21/2021] [Accepted: 06/05/2021] [Indexed: 12/16/2022]
Abstract
The Statistical Assessment of Modeling of Proteins and Ligands (SAMPL) challenges focus the computational modeling community on areas in need of improvement for rational drug design. The SAMPL7 physical property challenge dealt with prediction of octanol-water partition coefficients and pKa for 22 compounds. The dataset was composed of a series of N-acylsulfonamides and related bioisosteres. 17 research groups participated in the log P challenge, submitting 33 blind submissions in total. For the pKa challenge, 7 different groups participated, submitting 9 blind submissions in total. Overall, the accuracy of octanol-water log P predictions in the SAMPL7 challenge was lower than that of octanol-water log P predictions in SAMPL6, likely due to a more diverse dataset. Compared to the SAMPL6 pKa challenge, accuracy remains unchanged in SAMPL7. Interestingly, though macroscopic pKa values were often predicted with reasonable accuracy, there was dramatically more disagreement among participants as to which microscopic transitions produced these values (with methods often disagreeing even as to the sign of the free energy change associated with certain transitions), indicating that far more work needs to be done on pKa prediction methods.
Affiliation(s)
- Nicolas Tielker
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- Yingying Zhang
  - Department of Physics, The Graduate Center, City University of New York, New York, 10016, USA
- Junjun Mao
  - Department of Physics, City College of New York, New York, 10031, USA
- M R Gunner
  - Department of Physics, The Graduate Center, City University of New York, New York, 10016, USA
  - Department of Physics, City College of New York, New York, 10031, USA
- Karol Francisco
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093-0756, USA
- Carlo Ballatore
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, CA, 92093-0756, USA
- Stefan M Kast
  - Physikalische Chemie III, Technische Universität Dortmund, Otto-Hahn-Str. 4a, 44227, Dortmund, Germany
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, Irvine, CA, 92697, USA
7
Işık M, Bergazin TD, Fox T, Rizzi A, Chodera JD, Mobley DL. Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge. J Comput Aided Mol Des 2020; 34:335-370. [PMID: 32107702 PMCID: PMC7138020 DOI: 10.1007/s10822-020-00295-0] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 11/07/2019] [Accepted: 01/24/2020] [Indexed: 12/12/2022]
Abstract
The SAMPL Challenges aim to focus the biomolecular and physical modeling community on issues that limit the accuracy of predictive modeling of protein-ligand binding for rational drug design. In the SAMPL5 log D Challenge, designed to benchmark the accuracy of methods for predicting drug-like small molecule transfer free energies from aqueous to nonpolar phases, participants found it difficult to make accurate predictions due to the complexity of protonation state issues. In the SAMPL6 log P Challenge, we asked participants to make blind predictions of the octanol-water partition coefficients of neutral species of 11 compounds and assessed how well these methods performed absent the complication of protonation state effects. This challenge builds on the SAMPL6 pKa Challenge, which asked participants to predict pKa values of a superset of the compounds considered in this log P challenge. Blind prediction sets of 91 prediction methods were collected from 27 research groups, spanning a variety of quantum mechanics (QM) or molecular mechanics (MM)-based physical methods, knowledge-based empirical methods, and mixed approaches. There was a 50% increase in the number of participating groups and a 20% increase in the number of submissions compared to the SAMPL5 log D Challenge. Overall, the accuracy of octanol-water log P predictions in the SAMPL6 Challenge was higher than that of cyclohexane-water log D predictions in SAMPL5, likely because modeling only the neutral species was necessary for log P and several categories of method benefited from the vast amounts of experimental octanol-water log P data. There were many highly accurate methods: 10 diverse methods achieved RMSE less than 0.5 log P units. These included QM-based methods, empirical methods, and mixed methods with physical modeling supported with empirical corrections. A comparison of physical modeling methods showed that QM-based methods outperformed MM-based methods. The average RMSE of the most accurate five MM-based, QM-based, empirical, and mixed approach methods based on RMSE were 0.92 ± 0.13, 0.48 ± 0.06, 0.47 ± 0.05, and 0.50 ± 0.06, respectively.
Affiliation(s)
- Mehtap Işık
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Tri-Institutional PhD Program in Chemical Biology, Weill Cornell Graduate School of Medical Sciences, Cornell University, New York, NY, 10065, USA
- Thomas Fox
  - Computational Chemistry, Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co KG, 88397, Biberach, Germany
- Andrea Rizzi
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
  - Tri-Institutional Training Program in Computational Biology and Medicine, New York, NY, 10065, USA
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY, 10065, USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California, Irvine, CA, 92697, USA
  - Department of Chemistry, University of California, Irvine, CA, 92697, USA