1
Llompart P, Minoletti C, Baybekov S, Horvath D, Marcou G, Varnek A. Will we ever be able to accurately predict solubility? Sci Data 2024; 11:303. [PMID: 38499581 PMCID: PMC10948805 DOI: 10.1038/s41597-024-03105-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 02/29/2024] [Indexed: 03/20/2024] Open
Abstract
Accurate prediction of thermodynamic solubility by machine learning remains a challenge. Recent models often display good performances, but their reliability may be deceiving when used prospectively. This study investigates the origins of these discrepancies, following three directions: a historical perspective, an analysis of the aqueous solubility dataverse and data quality. We investigated over 20 years of published solubility datasets and models, highlighting overlooked datasets and the overlaps between popular sets. We benchmarked recently published models on a novel curated solubility dataset and report poor performances. We also propose a workflow to curate aqueous solubility data aiming at producing useful models for bench chemists. Our results demonstrate that some state-of-the-art models are not ready for public usage because they lack a well-defined applicability domain and overlook historical data sources. We report the impact of factors influencing the utility of the models: interlaboratory standard deviation, ionic state of the solute and data sources. The herein obtained models, and quality-assessed datasets are publicly available.
Affiliation(s)
- P Llompart
- Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- IDD/CADD, Sanofi, Vitry-Sur-Seine, France
- S Baybekov
- Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- D Horvath
- Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- G Marcou
- Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- A Varnek
- Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
2
Marchese Robinson RL, Palczewska A, Palczewski J, Kidley N. Comparison of the Predictive Performance and Interpretability of Random Forest and Linear Models on Benchmark Data Sets. J Chem Inf Model 2017; 57:1773-1792. [PMID: 28715209 DOI: 10.1021/acs.jcim.6b00753] [Citation(s) in RCA: 59] [Impact Index Per Article: 8.4] [Indexed: 11/29/2022]
Abstract
The ability to interpret the predictions made by quantitative structure-activity relationships (QSARs) offers a number of advantages. While QSARs built using nonlinear modeling approaches, such as the popular Random Forest algorithm, might sometimes be more predictive than those built using linear modeling approaches, their predictions have been perceived as difficult to interpret. However, a growing number of approaches have been proposed for interpreting nonlinear QSAR models in general and Random Forest in particular. In the current work, we compare the performance of Random Forest to those of two widely used linear modeling approaches: linear Support Vector Machines (SVMs) (or Support Vector Regression (SVR)) and partial least-squares (PLS). We compare their performance in terms of their predictivity as well as the chemical interpretability of the predictions using novel scoring schemes for assessing heat map images of substructural contributions. We critically assess different approaches for interpreting Random Forest models as well as for obtaining predictions from the forest. We assess the models on a large number of widely employed public-domain benchmark data sets corresponding to regression and binary classification problems of relevance to hit identification and toxicology. We conclude that Random Forest typically yields comparable or possibly better predictive performance than the linear modeling approaches and that its predictions may also be interpreted in a chemically and biologically meaningful way. In contrast to earlier work looking at interpretation of nonlinear QSAR models, we directly compare two methodologically distinct approaches for interpreting Random Forest models. The approaches for interpreting Random Forest assessed in our article were implemented using open-source programs that we have made available to the community. 
These programs are the rfFC package ( https://r-forge.r-project.org/R/?group_id=1725 ) for the R statistical programming language and the Python program HeatMapWrapper [ https://doi.org/10.5281/zenodo.495163 ] for heat map generation.
Affiliation(s)
- Richard L Marchese Robinson
- Syngenta Ltd., Jealott's Hill International Research Centre, Bracknell, Berkshire RG42 6EY, United Kingdom; School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, James Parsons Building, Byrom Street, Liverpool L3 3AF, United Kingdom
- Anna Palczewska
- Department of Computing, University of Bradford, Bradford BD7 1DP, United Kingdom
- Jan Palczewski
- School of Mathematics, University of Leeds, Leeds LS2 9JT, United Kingdom
- Nathan Kidley
- Syngenta Ltd., Jealott's Hill International Research Centre, Bracknell, Berkshire RG42 6EY, United Kingdom
3
Equilibrium solubility measurement of compounds with low dissolution rate by Higuchi's Facilitated Dissolution Method. A validation study. Eur J Pharm Sci 2017; 106:133-141. [PMID: 28577995 DOI: 10.1016/j.ejps.2017.05.064] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 04/18/2017] [Revised: 05/25/2017] [Accepted: 05/30/2017] [Indexed: 11/23/2022]
Abstract
Incubation time plays a critical role in the accurate measurement of equilibrium solubility of compounds. Substances which dissolve very slowly generally need long incubation times (days or weeks) to reach equilibrium. However, long times may pose several problems, such as decomposition of solute, molding of buffer, and drifting of pH. Higuchi in 1979 proposed the Facilitated Dissolution Method (FDM) to dramatically reduce incubation time. It employs a small volume of water-immiscible organic solvent to partly solubilize the sample and thereby increase the surface area available for dissolution. The method has been used only rarely. In this study we performed a systematic validation of FDM using progesterone as model compound. The reference solubility value, 7.95±0.21μg/mL (p<0.05, n=5), was determined in Britton-Robinson buffer solution (pH7.4) at 25.0°C by the standardized protocol of Saturation Shake-Flask (SSF) method. Also, the solubility was measured by the FDM approach under varied experimental conditions (e.g., type and volume of organic solvent, time of agitation, and amount of solid excess), and compared to the reference value. It was demonstrated that the small amount of organic solvent used in the FDM does not impact the measured solubility, compared to the reference value. Additionally, four compounds of low dissolution rate (dexamethasone, digoxin, haloperidol and cosalane) were used to demonstrate that FDM can reduce the long equilibration time to the standardized 24h (6h stirring and 18h sedimentation). The time dependence of solubility equilibrium was measured by SSF, and the results were compared with those obtained by FDM. Our study, based on >200 solubility experiments, supports the validity of Higuchi's method. In this study we propose a standardized protocol for the FDM, where 1% v/v of organic solvent is used. Octane (or isooctane) was found to be suitable for highly hydrophobic compounds. 
Alternatively, octanol or 1,2-dichloroethane can be used for less lipophilic compounds.
4
Mohammed M, Ettinoffe YSB, Ogundolie TO, Kioko BM, Mauge-Lewis K, Aslan K. High-Throughput Crystallization of l-Alanine Using iCrystal Plates and Metal-Assisted and Microwave-Accelerated Evaporative Crystallization. Ind Eng Chem Res 2016. [DOI: 10.1021/acs.iecr.5b04427] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 01/11/2023]
Affiliation(s)
- Muzaffer Mohammed
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
- Yehnara S. B. Ettinoffe
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
- Taiwo O. Ogundolie
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
- Bridgit M. Kioko
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
- Kevin Mauge-Lewis
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
- Kadir Aslan
- Department of Chemistry, Morgan State University, 1700 East Cold Spring Lane, Baltimore, Maryland 21251, United States
5
Fu T, Wei T, Liu Y, Jing J, Xu Y, Ou C, Chen Y, Li J, Li B, Zhu H. Inhibition of growth of l-cystine crystals by N-acetyl-l-cysteine. CrystEngComm 2016. [DOI: 10.1039/c6ce01749e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
6
Karamertzanis PG, Raiteri P, Galindo A. The Use of Anisotropic Potentials in Modeling Water and Free Energies of Hydration. J Chem Theory Comput 2015; 6:1590-607. [PMID: 26615693 DOI: 10.1021/ct900693q] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 01/06/2023]
Abstract
We propose a novel, anisotropic rigid-body intermolecular potential model to predict the properties of water and the hydration free energies of neutral organic solutes. The electrostatic interactions of water and the solutes are modeled using atomic multipole moments up to hexadecapole; these are obtained from distributed multipole analysis of the quantum mechanically computed charge densities and include average polarization effects in solution. The repulsion-dispersion water-water interactions are modeled with a three-site, exp-6 model fitted to the experimental liquid water density and oxygen-oxygen radial distribution function at ambient conditions. The proposed water model reproduces well several water properties not used in its parametrization, including vapor-liquid coexistence densities, the maximum in liquid water density at atmospheric pressure, the structure of ordered ice polymorphs, and the liquid water heat capacity. The model is used to compute the hydration free energy of 10 neutral organic solutes using explicit-solvent free energy perturbation. The solute-solute repulsion-dispersion intermolecular potential is obtained from previous parametrizations on organic crystal structures. In order to calculate the free energies of hydration, water-solute repulsion-dispersion interactions are modeled using Lorenz-Berthelot combining rules. The root-mean-square error of the predicted hydration free energies is 1.5 kcal mol(-1), which is comparable to the error found using a continuum mean-field quantum mechanical approach parametrized using experimental free energy of hydration data. The results are also contrasted with explicit-solvent hydration free energies obtained with an atomic charge representation of the solute's charge density computed at the same level of theory used to compute the distributed multipoles. 
Replacing the multipole description of the solute's charge density with an atomic charge model changes the free energy of hydration by as much as 3 kcal mol(-1) and provides an estimate for the effect of the modeling quality of the intermolecular electrostatic forces in free energy of solvation calculations.
Affiliation(s)
- Panagiotis G Karamertzanis
- Centre for Process Systems Engineering, Department of Chemical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom, and Department of Chemistry and Nanochemistry Research Institute, GPO Box U1987, 6845 Perth, Western Australia
- Paolo Raiteri
- Centre for Process Systems Engineering, Department of Chemical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom, and Department of Chemistry and Nanochemistry Research Institute, GPO Box U1987, 6845 Perth, Western Australia
- Amparo Galindo
- Centre for Process Systems Engineering, Department of Chemical Engineering, Imperial College London, London, SW7 2AZ, United Kingdom, and Department of Chemistry and Nanochemistry Research Institute, GPO Box U1987, 6845 Perth, Western Australia
7
Guo H, Kim JC. Upper critical solution temperature behavior of cinnamic acid and polyethyleneimine mixture and its effect on temperature-dependent release of liposome. Int J Pharm 2015; 494:172-9. [PMID: 26283281 DOI: 10.1016/j.ijpharm.2015.08.034] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 06/12/2015] [Revised: 07/31/2015] [Accepted: 08/12/2015] [Indexed: 12/28/2022]
Abstract
The mixture of polyethyleneimine (PEI) and cinnamic acid (CA) in HEPES buffer (pH 7.0) exhibited an upper critical solution temperature in the temperature range of 20-50 °C. CA would be electrostatically conjugated with PEI and the PEI-CA conjugate is thought to act as a thermo-sensitive polymer. On the optical microscope image of PEI/CA mixture, microparticles were found at 25 °C, disappeared when heated to 50 °C, and formed again upon cooling to 25 °C. PEI-CA conjugate was immobilized on the surface of egg phosphatidylcholine (EPC) liposome by adding PEI to the suspension of liposome incorporating CA. The size and the zeta potential of the liposome markedly increased by cooling the liposomal suspension from 50 °C to 20 °C. This could be ascribed to the cooling-induced self-assembling property of PEI-CA conjugate. The release profile of Rhodamine B base from liposome incorporating CA with PEI was investigated while the liposome suspension of 50 °C was exposed to the release medium of 20 °C, 30 °C, 40 °C and 50 °C. The release degree was higher at a lower temperature. When exposed to a lower temperature (20 °C, 30 °C, 40 °C), PEI-CA could be self-assembled and change its configuration on the surface of liposome, promoting the release from the liposome.
Affiliation(s)
- Huangying Guo
- College of Biomedical Science and Institute of Bioscience and Biotechnology, Kangwon National University, 192-1, Hyoja 2 dong, Chuncheon, Kangwon-do 200-701, Republic of Korea
- Jin-Chul Kim
- College of Biomedical Science and Institute of Bioscience and Biotechnology, Kangwon National University, 192-1, Hyoja 2 dong, Chuncheon, Kangwon-do 200-701, Republic of Korea
8
Abramov YA. Major Source of Error in QSPR Prediction of Intrinsic Thermodynamic Solubility of Drugs: Solid vs Nonsolid State Contributions? Mol Pharm 2015; 12:2126-41. [DOI: 10.1021/acs.molpharmaceut.5b00119] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022]
Affiliation(s)
- Yuriy A. Abramov
- Pfizer Global Research and Development, Groton, Connecticut 06340, United States
9
Moghadam BT, Alvarsson J, Holm M, Eklund M, Carlsson L, Spjuth O. Scaling predictive modeling in drug development with cloud computing. J Chem Inf Model 2015; 55:19-25. [PMID: 25493610 DOI: 10.1021/ci500580y] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022]
Abstract
Growing data sets with increased time for analysis are hampering predictive modeling in drug discovery. Model building can be carried out on high-performance computer clusters, but these can be expensive to purchase and maintain. We have evaluated ligand-based modeling on cloud computing resources where computations are parallelized and run on the Amazon Elastic Cloud. We trained models on open data sets of varying sizes for the end points logP and Ames mutagenicity and compare with model building parallelized on a traditional high-performance computing cluster. We show that while high-performance computing results in faster model building, the use of cloud computing resources is feasible for large data sets and scales well within cloud instances. An additional advantage of cloud computing is that the costs of predictive models can be easily quantified, and a choice can be made between speed and economy. The easy access to computational resources with no up-front investments makes cloud computing an attractive alternative for scientists, especially for those without access to a supercomputer, and our study shows that it enables cost-efficient modeling of large data sets on demand within reasonable time.
Affiliation(s)
- Behrooz Torabi Moghadam
- Department of Pharmaceutical Biosciences, ‡Department of Information Technology, and §Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, SE-751 24 Uppsala, Sweden
10
Hansen N, van Gunsteren WF. Practical Aspects of Free-Energy Calculations: A Review. J Chem Theory Comput 2014; 10:2632-47. [PMID: 26586503 DOI: 10.1021/ct500161f] [Citation(s) in RCA: 289] [Impact Index Per Article: 28.9] [Indexed: 12/26/2022]
Abstract
Free-energy calculations in the framework of classical molecular dynamics simulations are nowadays used in a wide range of research areas including solvation thermodynamics, molecular recognition, and protein folding. The basic components of a free-energy calculation, that is, a suitable model Hamiltonian, a sampling protocol, and an estimator for the free energy, are independent of the specific application. However, the attention that one has to pay to these components depends considerably on the specific application. Here, we review six different areas of application and discuss the relative importance of the three main components to provide the reader with an organigram and to make nonexperts aware of the many pitfalls present in free energy calculations.
Affiliation(s)
- Niels Hansen
- Institute of Thermodynamics and Thermal Process Engineering, University of Stuttgart, D-70569 Stuttgart, Germany; Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, CH-8093 Zürich, Switzerland
- Wilfred F van Gunsteren
- Laboratory of Physical Chemistry, Swiss Federal Institute of Technology, ETH, CH-8093 Zürich, Switzerland
11
Salahinejad M, Le TC, Winkler DA. Aqueous Solubility Prediction: Do Crystal Lattice Interactions Help? Mol Pharm 2013; 10:2757-66. [DOI: 10.1021/mp4001958] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 12/23/2022]
Affiliation(s)
- Maryam Salahinejad
- Faculty of Chemistry, Tarbiat Moallem University, Tehran 15719-14911, Iran
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia
- Tu C. Le
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- David A. Winkler
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia
12
Ju Cha H, Dai J, Kim JC. Microgels composed of poly(ethylene imine) and carboxymethoxycoumarin: pH-dependent and photodependent integrity. J Appl Polym Sci 2012. [DOI: 10.1002/app.38531] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/06/2022]
13
Elder D, Holm R. Aqueous solubility: simple predictive methods (in silico, in vitro and bio-relevant approaches). Int J Pharm 2012; 453:3-11. [PMID: 23124107 DOI: 10.1016/j.ijpharm.2012.10.041] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 06/27/2012] [Revised: 10/18/2012] [Accepted: 10/24/2012] [Indexed: 11/28/2022]
Abstract
Aqueous solubility is a key physicochemical attribute required for the characterisation of an active pharmaceutical ingredient (API) during drug discovery and beyond. Furthermore, aqueous solubility is highly important for formulation selection and subsequent development processes. This review provides a summary of simple predictive methods used to assess aqueous solubility as well as an assessment of the more complex in silico methodologies and a review of the recent solubility challenge. In addition, a summary of experimental methods to determine solubility is included, with a discussion of some potential pitfalls.
Affiliation(s)
- David Elder
- GSK Pharmaceuticals, Park Road, Ware, Hertfordshire, SG12 0DP, United Kingdom
14
Dandapani S, Rosse G, Southall N, Salvino JM, Thomas CJ. Selecting, Acquiring, and Using Small Molecule Libraries for High-Throughput Screening. Curr Protoc Chem Biol 2012; 4:177-191. [PMID: 26705509 DOI: 10.1002/9780470559277.ch110252] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/24/2022]
Abstract
The selection, acquisition and use of high quality small molecule libraries for screening is an essential aspect of drug discovery and chemical biology programs. Screening libraries continue to evolve as researchers gain a greater appreciation of the suitability of small molecules for specific biological targets, processes and environments. The decisions surrounding the make-up of any given small molecule library are informed by a multitude of variables, and opinions vary on best practices. The fitness of any collection relies upon upfront filtering to avoid problematic compounds, assess appropriate physicochemical properties, install the ideal level of structural uniqueness and determine the desired extent of molecular complexity. These criteria are under constant evaluation and revision as academic and industrial organizations seek out collections that yield ever improving results from their screening portfolios. Practical questions including cost, compound management, screening sophistication and assay objective also play a significant role in the choice of library composition. This overview attempts to offer advice to all organizations engaged in small molecule screening based upon current best practices and theoretical considerations in library selection and acquisition.
Affiliation(s)
- Sivaraman Dandapani
- Chemical Biology Platform, The Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, Massachusetts 02142 USA
- Gerard Rosse
- Dart NeuroScience LLC, 7473 Lusk Boulevard, San Diego, CA 92121 USA
- Noel Southall
- NIH Chemical Genomics Center, National Human Genome Research Institute, 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
- Joseph M Salvino
- Alliance Discovery, Inc, Biotechnology Center 3805 Old Easton Road, Doylestown, PA 18902 USA
- Craig J Thomas
- NIH Chemical Genomics Center, National Human Genome Research Institute, 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
15
Palmer DS, McDonagh JL, Mitchell JBO, van Mourik T, Fedorov MV. First-Principles Calculation of the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules. J Chem Theory Comput 2012; 8:3322-37. [PMID: 26605739 DOI: 10.1021/ct300345m] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/28/2022]
Abstract
We demonstrate that the intrinsic aqueous solubility of crystalline druglike molecules can be estimated with reasonable accuracy from sublimation free energies calculated using crystal lattice simulations and hydration free energies calculated using the 3D Reference Interaction Site Model (3D-RISM) of the Integral Equation Theory of Molecular Liquids (IET). The solubilities of 25 crystalline druglike molecules taken from different chemical classes are predicted by the model with a correlation coefficient of R = 0.85 and a root mean square error (RMSE) equal to 1.45 log10S units, which is significantly more accurate than results obtained using implicit continuum solvent models. The method is not directly parametrized against experimental solubility data, and it offers a full computational characterization of the thermodynamics of transfer of the drug molecule from crystal phase to gas phase to dilute aqueous solution.
Affiliation(s)
- David S Palmer
- Department of Physics, University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, United Kingdom; Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
- James L McDonagh
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- John B O Mitchell
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- Tanja van Mourik
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- Maxim V Fedorov
- Department of Physics, University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, United Kingdom; Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
16
Bridging solubility between drug discovery and development. Drug Discov Today 2012; 17:486-95. [DOI: 10.1016/j.drudis.2011.11.007] [Citation(s) in RCA: 295] [Impact Index Per Article: 24.6] [Received: 09/09/2011] [Revised: 10/25/2011] [Accepted: 11/18/2011] [Indexed: 11/22/2022]
17
Yang Y, Engkvist O, Llinàs A, Chen H. Beyond Size, Ionization State, and Lipophilicity: Influence of Molecular Topology on Absorption, Distribution, Metabolism, Excretion, and Toxicity for Druglike Compounds. J Med Chem 2012; 55:3667-77. [DOI: 10.1021/jm201548z] [Citation(s) in RCA: 95] [Impact Index Per Article: 7.9] [Indexed: 11/30/2022]
Affiliation(s)
- Yidong Yang
- Discovery Sciences, Computational Sciences, Computational Chemistry, and ‡R&I iMED, In Vitro & In Vivo ADME, AstraZeneca R&D Mölndal, SE-431 83 Mölndal, Sweden
- Ola Engkvist
- Discovery Sciences, Computational Sciences, Computational Chemistry, and ‡R&I iMED, In Vitro & In Vivo ADME, AstraZeneca R&D Mölndal, SE-431 83 Mölndal, Sweden
- Antonio Llinàs
- Discovery Sciences, Computational Sciences, Computational Chemistry, and ‡R&I iMED, In Vitro & In Vivo ADME, AstraZeneca R&D Mölndal, SE-431 83 Mölndal, Sweden
- Hongming Chen
- Discovery Sciences, Computational Sciences, Computational Chemistry, and ‡R&I iMED, In Vitro & In Vivo ADME, AstraZeneca R&D Mölndal, SE-431 83 Mölndal, Sweden
18
Guha R, Dexheimer TS, Kestranek AN, Jadhav A, Chervenak AM, Ford MG, Simeonov A, Roth GP, Thomas CJ. Exploratory analysis of kinetic solubility measurements of a small molecule library. Bioorg Med Chem 2011; 19:4127-34. [PMID: 21640593 PMCID: PMC3236531 DOI: 10.1016/j.bmc.2011.05.005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 03/08/2011] [Revised: 04/29/2011] [Accepted: 05/04/2011] [Indexed: 11/20/2022]
Abstract
Kinetic solubility measurements using prototypical assay buffer conditions are presented for a ∼58,000 member library of small molecules. Analyses of the data based upon physical and calculated properties of each individual molecule were performed and resulting trends were considered in the context of commonly held opinions of how physicochemical properties influence aqueous solubility. We further analyze the data using a decision tree model for solubility prediction and via a multi-dimensional assessment of physicochemical relationships to solubility in the context of specific 'rule-breakers' relative to common dogma. The role of solubility as a determinant of assay outcome is also considered based upon each compound's cross-assay activity score for a collection of publicly available screening results. Further, the role of solubility as a governing factor for colloidal aggregation formation within a specified assay setting is examined and considered as a possible cause of a high cross-assay activity score. The results of this solubility profile should aid chemists during library design and optimization efforts and represent a useful training set for computational solubility prediction.
Affiliation(s)
- Rajarshi Guha
- NIH Chemical Genomics Center, National Human Genome Research Institute, NIH 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
- Thomas S. Dexheimer
- NIH Chemical Genomics Center, National Human Genome Research Institute, NIH 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
- Aimee N. Kestranek
- Analiza, Inc., 3615 Superior Avenue, Suite 4407B, Cleveland, OH 44114 USA
- Ajit Jadhav
- NIH Chemical Genomics Center, National Human Genome Research Institute, NIH 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
- Michael G. Ford
- Analiza, Inc., 3615 Superior Avenue, Suite 4407B, Cleveland, OH 44114 USA
- Anton Simeonov
- NIH Chemical Genomics Center, National Human Genome Research Institute, NIH 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
- Gregory P. Roth
- Sanford–Burnham Medical Research Institute at Lake Nona, Conrad Prebys Center for Chemical Genomics, 6400 Sanger Road, Orlando, Florida 32827
- Craig J. Thomas
- NIH Chemical Genomics Center, National Human Genome Research Institute, NIH 9800 Medical Center Drive, MSC 3370 Bethesda, MD 20892-3370 USA
19
Abstract
This article reviews the use of informatics and computational chemistry methods in medicinal chemistry, with special consideration of how computational techniques can be adapted and extended to obtain more and higher-quality information. Special consideration is given to the computation of protein–ligand binding affinities, to the prediction of off-target bioactivities, bioactivity spectra and computational toxicology, and also to calculating absorption-, distribution-, metabolism- and excretion-relevant properties, such as solubility.
|
20
|
Horobin RW, Stockert JC. Uptake and localization mechanisms of fluorescent and colored lipid probes. 1. Physicochemistry of probe uptake and localization, and the use of QSAR models for selectivity prediction. Biotech Histochem 2010; 86:379-93. [DOI: 10.3109/10520295.2010.515489] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Affiliation(s)
- RW Horobin
- School of Life Sciences, The University of Glasgow, University Avenue, Glasgow G12 8QQ, Scotland, UK
- JC Stockert
- Department of Biology, Faculty of Sciences, Autonomous University of Madrid, Cantoblanco, Madrid 28049
- Center for Biological Research, High Council of Scientific Research, Madrid 28040, Spain
|
21
|
Spjuth O, Willighagen EL, Guha R, Eklund M, Wikberg JE. Towards interoperable and reproducible QSAR analyses: Exchange of datasets. J Cheminform 2010; 2:5. [PMID: 20591161 PMCID: PMC2909924 DOI: 10.1186/1758-2946-2-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 03/19/2010] [Accepted: 06/30/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND: QSAR is a widely used method to relate chemical structures to responses or properties based on experimental observations. Much effort has been made to evaluate and validate the statistical modeling in QSAR, but these analyses treat the dataset as fixed. An overlooked but highly important issue is the validation of the setup of the dataset, which comprises addition of chemical structures as well as selection of descriptors and software implementations prior to calculations. This process is hampered by the lack of standards and exchange formats in the field, making it virtually impossible to reproduce and validate analyses and drastically constraining collaborations and re-use of data. RESULTS: We present a step towards standardizing QSAR analyses by defining interoperable and reproducible QSAR datasets, consisting of an open XML format (QSAR-ML) which builds on an open and extensible descriptor ontology. The ontology provides an extensible way of uniquely defining descriptors for use in QSAR experiments, and the exchange format supports multiple versioned implementations of these descriptors. Hence, a dataset described by QSAR-ML makes its setup completely reproducible. We also provide a reference implementation as a set of plugins for Bioclipse which simplifies setup of QSAR datasets, and allows for exporting in QSAR-ML as well as old-fashioned CSV formats. The implementation facilitates addition of new descriptor implementations from locally installed software and remote Web services; the latter is demonstrated with REST and XMPP Web services. CONCLUSIONS: Standardized QSAR datasets open up new ways to store, query, and exchange data for subsequent analyses. QSAR-ML supports completely reproducible creation of datasets, solving the problems of defining which software components were used and their versions, and the descriptor ontology eliminates confusion regarding descriptors by defining them crisply. This makes it easy to join, extend, and combine datasets and hence work collectively, but also allows for analyzing the effect descriptors have on the statistical model's performance. The presented Bioclipse plugins equip scientists with graphical tools that make QSAR-ML easily accessible for the community.
Affiliation(s)
- Ola Spjuth
- Department of Pharmaceutical Biosciences, Uppsala University, Uppsala, Sweden.
|
22
|
van de Waterbeemd H. Improving Compound Quality through in vitro and in silico Physicochemical Profiling. Chem Biodivers 2009; 6:1760-6. [DOI: 10.1002/cbdv.200900056] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 11/09/2022]
|
23
|
Llinàs A, Glen RC, Goodman JM. Solubility Challenge: Can You Predict Solubilities of 32 Molecules Using a Database of 100 Reliable Measurements? J Chem Inf Model 2008; 48:1289-303. [DOI: 10.1021/ci800058v] [Citation(s) in RCA: 134] [Impact Index Per Article: 8.4] [Indexed: 11/27/2022]
Affiliation(s)
- Antonio Llinàs
- Pfizer Institute for Pharmaceutical Materials Science & Unilever Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Robert C. Glen
- Pfizer Institute for Pharmaceutical Materials Science & Unilever Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Jonathan M. Goodman
- Pfizer Institute for Pharmaceutical Materials Science & Unilever Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
|
24
|
Palmer DS, Llinàs A, Morao I, Day GM, Goodman JM, Glen RC, Mitchell JBO. Predicting intrinsic aqueous solubility by a thermodynamic cycle. Mol Pharm 2008; 5:266-79. [PMID: 18290628 DOI: 10.1021/mp7000878] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.6] [Indexed: 11/29/2022]
Abstract
We report methods to predict the intrinsic aqueous solubility of crystalline organic molecules from two different thermodynamic cycles. We find that direct computation of solubility, via ab initio calculation of thermodynamic quantities at an affordable level of theory, cannot deliver the required accuracy. Therefore, we have turned to a mixture of direct computation and informatics, using the calculated thermodynamic properties, along with a few other key descriptors, in regression models. The prediction of log intrinsic solubility (referred to mol/L) by a three-variable linear regression equation gave r² = 0.77 and RMSE = 0.71 for an external test set comprising drug molecules. The model includes a calculated crystal lattice energy which provides a computational method to account for the interactions in the solid state. We suggest that it is not necessary to know the polymorphic form prior to prediction. Furthermore, the method developed here may be applicable to other solid-state systems such as salts or cocrystals.
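For readers unfamiliar with the two statistics quoted in this abstract, the following is a minimal sketch of how RMSE and a squared correlation coefficient can be computed for predicted versus observed log solubility values. The numbers are purely hypothetical and are not taken from the paper, and the sketch assumes that the reported value is the squared Pearson correlation; the abstract does not say whether that or the coefficient of determination was used.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: this is one common meaning of r^2; the coefficient of
    determination (1 - SS_res / SS_tot) can give a different number.
    """
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical log S values (mol/L), for illustration only.
observed = [-1.0, -2.0, -3.0, -4.0]
predicted = [-1.5, -2.0, -2.5, -4.0]

print(f"RMSE = {rmse(observed, predicted):.3f} log units")
print(f"r^2  = {r_squared(observed, predicted):.3f}")
```

Because solubility is modelled on a log10 scale, an RMSE of 0.71 corresponds to a typical error of roughly a factor of five (10^0.71) in molar solubility.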
Affiliation(s)
- David S Palmer
- The Pfizer Institute for Pharmaceutical Materials Science and Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, United Kingdom
|