51
Fatemi MH, Shakoori Z, Samghani K. Comparative molecular field analysis of chromatographic hydrophobicity indices for some coumarin analogs. Biomed Chromatogr 2016; 31. [DOI: 10.1002/bmc.3876] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/14/2016] [Revised: 09/03/2016] [Accepted: 10/24/2016] [Indexed: 11/08/2022]
Affiliation(s)
- Zahra Shakoori
  - Laboratory of Chemometrics, Faculty of Chemistry, University of Mazandaran, Babolsar, Iran
- Kobra Samghani
  - Laboratory of Chemometrics, Faculty of Chemistry, University of Mazandaran, Babolsar, Iran
52
McDonagh JL, Palmer DS, van Mourik T, Mitchell JBO. Are the Sublimation Thermodynamics of Organic Molecules Predictable? J Chem Inf Model 2016; 56:2162-2179. [PMID: 27749062 DOI: 10.1021/acs.jcim.6b00033] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/08/2023]
Abstract
We compare a range of computational methods for the prediction of sublimation thermodynamics (enthalpy, entropy, and free energy of sublimation). These include a model from theoretical chemistry that utilizes crystal lattice energy minimization (with the DMACRYS program) and quantitative structure property relationship (QSPR) models generated by both machine learning (random forest and support vector machines) and regression (partial least squares) methods. Using these methods we investigate the predictability of the enthalpy, entropy and free energy of sublimation, with consideration of whether such a method may be able to improve solubility prediction schemes. Previous work has suggested that the major source of error in solubility prediction schemes involving a thermodynamic cycle via the solid state is in the modeling of the free energy change away from the solid state. Yet contrary to this conclusion other work has found that the inclusion of terms such as the enthalpy of sublimation in QSPR methods does not improve the predictions of solubility. We suggest the use of theoretical chemistry terms, detailed explicitly in the Methods section, as descriptors for the prediction of the enthalpy and free energy of sublimation. A data set of 158 molecules with experimental sublimation thermodynamics values and some CSD refcodes has been collected from the literature and is provided with their original source references.
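The three sublimation quantities named in this abstract are linked by the standard Gibbs relation, which is textbook thermodynamics rather than something stated in the abstract itself. Here $T$ is the absolute temperature at which the values are reported:

```latex
\Delta G_{\mathrm{sub}} = \Delta H_{\mathrm{sub}} - T\,\Delta S_{\mathrm{sub}}
```

A model that predicts any two of the three quantities at a given temperature therefore fixes the third.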
Affiliation(s)
- James L McDonagh
  - Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, U.K.
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
- David S Palmer
  - Department of Pure and Applied Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland, United Kingdom, G1 1XL
- Tanja van Mourik
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
- John B O Mitchell
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
53
Dave RA, Morris ME. Novel high/low solubility classification methods for new molecular entities. Int J Pharm 2016; 511:111-126. [PMID: 27349790 DOI: 10.1016/j.ijpharm.2016.06.060] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 01/04/2016] [Revised: 06/21/2016] [Accepted: 06/24/2016] [Indexed: 11/25/2022]
Abstract
This research describes a rapid solubility classification approach that could be used in the discovery and development of new molecular entities. Compounds (N=635) were divided into two groups based on information available in the literature: high solubility (BDDCS/BCS 1/3) and low solubility (BDDCS/BCS 2/4). We established decision rules for determining solubility classes using measured log solubility in molar units (MLogSM) or measured solubility (MSol) in mg/ml units. ROC curve analysis was applied to determine statistically significant threshold values of MSol and MLogSM. Results indicated that NMEs with MLogSM>-3.05 or MSol>0.30mg/mL will have ≥85% probability of being highly soluble and new molecular entities with MLogSM≤-3.05 or MSol≤0.30mg/mL will have ≥85% probability of being poorly soluble. When comparing solubility classification using the threshold values of MLogSM or MSol with BDDCS, we were able to correctly classify 85% of compounds. We also evaluated solubility classification of an independent set of 108 orally administered drugs using MSol (0.3mg/mL) and our method correctly classified 81% and 95% of compounds into high and low solubility classes, respectively. The high/low solubility classification using MLogSM or MSol is novel and independent of traditionally used dose number criteria.
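The decision rule reported in this abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `classify_solubility` and its parameter names are hypothetical, and only the two thresholds (MLogSM of -3.05 and MSol of 0.30 mg/mL) come from the abstract.

```python
from typing import Optional

# Thresholds quoted in the abstract (from ROC curve analysis).
MLOGSM_THRESHOLD = -3.05   # measured log10 molar solubility
MSOL_THRESHOLD = 0.30      # measured solubility in mg/mL


def classify_solubility(mlogsm: Optional[float] = None,
                        msol_mg_per_ml: Optional[float] = None) -> str:
    """Return "high" or "low" using the thresholds from the abstract.

    Hypothetical helper: MLogSM is used when given, otherwise MSol.
    Values strictly above a threshold are "high"; values at or below
    it are "low", matching the > and <= signs in the abstract.
    """
    if mlogsm is not None:
        return "high" if mlogsm > MLOGSM_THRESHOLD else "low"
    if msol_mg_per_ml is not None:
        return "high" if msol_mg_per_ml > MSOL_THRESHOLD else "low"
    raise ValueError("provide mlogsm or msol_mg_per_ml")
```

For example, `classify_solubility(mlogsm=-2.0)` falls in the high-solubility class, while a compound with a measured solubility of exactly 0.30 mg/mL is classed as low. The abstract attaches a probability of at least 85% to each class, so the label is a likelihood statement, not a guarantee.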
Affiliation(s)
- Rutwij A Dave
  - Department of Pharmaceutical Sciences, School of Pharmacy and Pharmaceutical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
- Marilyn E Morris
  - Department of Pharmaceutical Sciences, School of Pharmacy and Pharmaceutical Sciences, University at Buffalo, State University of New York, Buffalo, NY, USA
54
Liu S, Cao S, Hoang K, Young KL, Paluch AS, Mobley DL. Using MD Simulations To Calculate How Solvents Modulate Solubility. J Chem Theory Comput 2016; 12:1930-41. [PMID: 26878198 DOI: 10.1021/acs.jctc.5b00934] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 11/28/2022]
Abstract
Here, our interest is in predicting solubility in general, and we focus particularly on predicting how the solubility of particular solutes is modulated by the solvent environment. Solubility in general is extremely important, both for theoretical reasons - it provides an important probe of the balance between solute-solute and solute-solvent interactions - and for more practical reasons, such as how to control the solubility of a given solute via modulation of its environment, as in process chemistry and separations. Here, we study how the change of solvent affects the solubility of a given compound. That is, we calculate relative solubilities. We use MD simulations to calculate relative solubility and compare our calculated values with experiment as well as with results from several other methods, SMD and UNIFAC, the latter of which is commonly used in chemical engineering design. We find that straightforward solubility calculations based on molecular simulations using a general small-molecule force field outperform SMD and UNIFAC both in terms of accuracy and coverage of the relevant chemical space.
Affiliation(s)
- Kayla L Young
  - Department of Chemical, Paper and Biomedical Engineering, Miami University, Oxford, Ohio 45056, United States
- Andrew S Paluch
  - Department of Chemical, Paper and Biomedical Engineering, Miami University, Oxford, Ohio 45056, United States
55
Tetko IV, Lowe DM, Williams AJ. The development of models to predict melting and pyrolysis point data associated with several hundred thousand compounds mined from PATENTS. J Cheminform 2016; 8:2. [PMID: 26807157 PMCID: PMC4724158 DOI: 10.1186/s13321-016-0113-y] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 08/31/2015] [Accepted: 01/08/2016] [Indexed: 11/18/2022]
Abstract
BACKGROUND: Melting point (MP) is an important property with regard to the solubility of chemical compounds. Its prediction from chemical structure remains a highly challenging task for quantitative structure-activity relationship studies. Success in this area of research critically depends on the availability of high quality MP data as well as accurate chemical structure representations in order to develop models. Currently, available datasets for MP predictions have been limited to around 50k molecules while lots more data are routinely generated following the synthesis of novel materials. Significant amounts of MP data are freely available within the patent literature and, if it were available in the appropriate form, could potentially be used to develop predictive models. RESULTS: We have developed a pipeline for the automated extraction and annotation of chemical data from published PATENTS. Almost 300,000 data points have been collected and used to develop models to predict melting and pyrolysis (decomposition) points using tools available on the OCHEM modeling platform (http://ochem.eu). A number of technical challenges were simultaneously solved to develop models based on these data. These included the handling of sparse data matrices with >200,000,000,000 entries and parallel calculations using 32 × 6 cores per task using 13 descriptor sets totaling more than 700,000 descriptors. We showed that models developed using data collected from PATENTS had similar or better prediction accuracy compared to the highly curated data used in previous publications. The separation of data for chemicals that decomposed rather than melting, from compounds that did undergo a normal melting transition, was performed and models for both pyrolysis and MPs were developed. The accuracy of the consensus MP models for molecules from the drug-like region of chemical space was similar to their estimated experimental accuracy, 32 °C. Last but not least, important structural features related to the pyrolysis of chemicals were identified, and a model to predict whether a compound will decompose instead of melting was developed. CONCLUSIONS: We have shown that automated tools for the analysis of chemical information have reached a mature stage allowing for the extraction and collection of high quality data to enable the development of structure-activity relationship models. The developed models and data are publicly available at http://ochem.eu/article/99826.
Affiliation(s)
- Igor V. Tetko
  - Institute of Structural Biology, Helmholtz Zentrum München für Gesundheit und Umwelt (HMGU), Ingolstädter Landstraße 1, b. 60w, 85764 Neuherberg, Germany
  - BigChem GmbH, 85764 Neuherberg, Germany
- Daniel M. Lowe
  - NextMove Software Limited, Innovation Centre (Unit 23), Cambridge Science Park, Cambridge, CB4 0EY UK
56
Skyner RE, McDonagh JL, Groom CR, van Mourik T, Mitchell JBO. A review of methods for the calculation of solution free energies and the modelling of systems in solution. Phys Chem Chem Phys 2015; 17:6174-91. [PMID: 25660403 DOI: 10.1039/c5cp00288e] [Citation(s) in RCA: 280] [Impact Index Per Article: 35.0] [Indexed: 12/22/2022]
Abstract
Over the past decade, pharmaceutical companies have seen a decline in the number of drug candidates successfully passing through clinical trials, though billions are still spent on drug development. Poor aqueous solubility leads to low bio-availability, reducing pharmaceutical effectiveness. The human cost of inefficient drug candidate testing is of great medical concern, with fewer drugs making it to the production line, slowing the development of new treatments. In biochemistry and biophysics, water mediated reactions and interactions within active sites and protein pockets are an active area of research, in which methods for modelling solvated systems are continually pushed to their limits. Here, we discuss a multitude of methods aimed towards solvent modelling and solubility prediction, aiming to inform the reader of the options available, and outlining the various advantages and disadvantages of each approach.
Affiliation(s)
- R E Skyner
  - School of Chemistry, University of St Andrews, Purdie Building, North Haugh, St Andrews, Fife KY16 9ST, UK
57
Emami S, Jouyban A, Valizadeh H, Shayanfar A. Are Crystallinity Parameters Critical for Drug Solubility Prediction? J Solution Chem 2015. [DOI: 10.1007/s10953-015-0410-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
58
Haghbakhsh R, Raeissi S. Two simple correlations to predict viscosities of pure and aqueous solutions of ionic liquids. J Mol Liq 2015. [DOI: 10.1016/j.molliq.2015.08.036] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 11/15/2022]
59
Morrill JA, Byrd EFC. Development of quantitative structure property relationships for predicting the melting point of energetic materials. J Mol Graph Model 2015; 62:190-201. [PMID: 26473455 DOI: 10.1016/j.jmgm.2015.09.017] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 05/20/2015] [Revised: 08/19/2015] [Accepted: 09/25/2015] [Indexed: 11/15/2022]
Abstract
The accurate prediction of the melting temperature of organic compounds is a significant problem that has eluded researchers for many years. The most common approach used to develop predictive models entails the derivation of quantitative structure-property relationships (QSPRs), which are multivariate linear relationships between calculated quantities that are descriptors of molecular or electronic features and a property of interest. In this report the derivation of QSPRs to predict melting temperatures of energetic materials, based on descriptors calculated using the AM1 semiempirical quantum mechanical method, is described. In total, the melting points and experimental crystal structures of 148 energetic materials were analyzed. Principal components analysis was performed in order to assess the relative importance and roles of the descriptors in our QSPR models. Also described are the results of k-means cluster analysis, performed in order to identify natural groupings within our study set of structures. The QSPR models resulting from these analyses gave training set R² values of 0.6085 (RMSE = ± 15.7 °C) and 0.7468 (RMSE = ± 13.2 °C). The test sets for these clusters had R² values of 0.9428 (RMSE = ± 7.0 °C) and 0.8974 (RMSE = ± 8.8 °C), respectively. These models are among the best melting point QSPRs yet published for energetic materials.
Affiliation(s)
- Jason A Morrill
  - Department of Chemistry, William Jewell College, 500 College Hill, Liberty, MO 64068, USA
- Edward F C Byrd
  - United States Army Research Laboratories, AMSRD-ARL-WM-BD, Aberdeen Proving Ground, MD 21005-5069, USA
60
Palmer DS, Mišin M, Fedorov MV, Llinas A. Fast and General Method To Predict the Physicochemical Properties of Druglike Molecules Using the Integral Equation Theory of Molecular Liquids. Mol Pharm 2015. [PMID: 26212723 DOI: 10.1021/acs.molpharmaceut.5b00441] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Abstract
We report a method to predict physicochemical properties of druglike molecules using a classical statistical mechanics based solvent model combined with machine learning. The RISM-MOL-INF method introduced here provides an accurate technique to characterize solvation and desolvation processes based on solute-solvent correlation functions computed by the 1D reference interaction site model of the integral equation theory of molecular liquids. These functions can be obtained in a matter of minutes for most small organic and druglike molecules using existing software (RISM-MOL) (Sergiievskyi, V. P.; Hackbusch, W.; Fedorov, M. V. J. Comput. Chem. 2011, 32, 1982-1992). Predictions of Caco-2 cell permeability and hydration free energy obtained using the RISM-MOL-INF method are shown to be more accurate than the state-of-the-art tools for benchmark data sets. Due to the importance of solvation and desolvation effects in biological systems, it is anticipated that the RISM-MOL-INF approach will find many applications in biophysical and biomedical property prediction.
Affiliation(s)
- David S Palmer
  - Department of Pure and Applied Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland G1 1XL, U.K.
- Maksim Mišin
  - Department of Physics, Scottish Universities Physics Alliance (SUPA), University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, U.K.
- Maxim V Fedorov
  - Department of Physics, Scottish Universities Physics Alliance (SUPA), University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, U.K.
- Antonio Llinas
  - Respiratory, Inflammation and Autoimmune iMed, AstraZeneca R&D, Pepparedsleden 1, SE-431 83, Mölndal, Sweden
61
Đanić M, Pavlović N, Stanimirov B, Vukmirović S, Nikolić K, Agbaba D, Mikov M. The influence of bile salts on the distribution of simvastatin in the octanol/buffer system. Drug Dev Ind Pharm 2015. [DOI: 10.3109/03639045.2015.1067626] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022]
Affiliation(s)
- Maja Đanić
  - Department of Pharmacology, Toxicology and Clinical Pharmacology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
- Nebojša Pavlović
  - Department of Pharmacology, Toxicology and Clinical Pharmacology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
- Bojan Stanimirov
  - Department of Pharmacology, Toxicology and Clinical Pharmacology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
- Saša Vukmirović
  - Department of Pharmacology, Toxicology and Clinical Pharmacology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
- Katarina Nikolić
  - Institute of Pharmaceutical Chemistry and Drug Analysis, Faculty of Pharmacy, University of Belgrade, Belgrade, Serbia
- Danica Agbaba
  - Institute of Pharmaceutical Chemistry and Drug Analysis, Faculty of Pharmacy, University of Belgrade, Belgrade, Serbia
- Momir Mikov
  - Department of Pharmacology, Toxicology and Clinical Pharmacology, Faculty of Medicine, University of Novi Sad, Novi Sad, Serbia
  - Curtin Health Innovation Research Institute, School of Pharmacy, Curtin University, Perth, WA, Australia
62
McDonagh JL, van Mourik T, Mitchell JBO. Predicting Melting Points of Organic Molecules: Applications to Aqueous Solubility Prediction Using the General Solubility Equation. Mol Inform 2015; 34:715-24. [PMID: 27491032 DOI: 10.1002/minf.201500052] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/06/2015] [Accepted: 06/05/2015] [Indexed: 01/16/2023]
Abstract
In this work we make predictions of several important molecular properties of academic and industrial importance to seek answers to two questions: 1) Can we apply efficient machine learning techniques, using inexpensive descriptors, to predict melting points to a reasonable level of accuracy? 2) Can values of this level of accuracy be usefully applied to predicting aqueous solubility? We present predictions of melting points made by several novel machine learning models, previously applied to solubility prediction. Additionally, we make predictions of solubility via the General Solubility Equation (GSE) and monitor the impact of varying the logP prediction model (AlogP and XlogP) on the GSE. We note that the machine learning models presented, using a modest number of 2D descriptors, can make melting point predictions in line with the current state of the art prediction methods (RMSE≥40 °C). We also find that predicted melting points, with an RMSE of tens of degrees Celsius, can be usefully applied to the GSE to yield accurate solubility predictions (log10 S RMSE<1) over a small dataset of drug-like molecules.
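The abstract does not write out the General Solubility Equation. In its commonly cited form (Jain and Yalkowsky) it is log10 S = 0.5 - 0.01 (MP - 25) - logP, with MP in °C and S in mol/L. The sketch below assumes that form, and the function name `gse_log_s` is hypothetical. It shows why a melting point error of tens of degrees can still give usable solubility estimates: each 10 °C of error shifts log10 S by only 0.1.

```python
def gse_log_s(melting_point_c: float, log_p: float) -> float:
    """Estimate log10 of aqueous solubility (mol/L) with the GSE.

    Assumes the commonly cited form
        log10 S = 0.5 - 0.01 * (MP - 25) - logP
    where MP is the melting point in degrees Celsius. The melting
    point term is conventionally set to zero for compounds that are
    liquid at 25 degrees Celsius.
    """
    mp_term = max(melting_point_c - 25.0, 0.0)
    return 0.5 - 0.01 * mp_term - log_p


# Effect of a 40 degree Celsius melting point error at fixed logP.
reference = gse_log_s(125.0, 2.0)
perturbed = gse_log_s(165.0, 2.0)
shift = perturbed - reference  # change in log10 S caused by the MP error
```

Under these assumptions a 40 °C melting point error moves the prediction by about 0.4 log units, which is smaller than the log10 S RMSE of just under 1 quoted in the abstract. The logP term enters with a coefficient of 1, so errors in the logP model (AlogP or XlogP) carry through to log10 S unscaled.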
Affiliation(s)
- J L McDonagh
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
  - Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester, M1 7DN, UK
- T van Mourik
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
- J B O Mitchell
  - School of Chemistry, University of St Andrews, North Haugh, St Andrews, Fife, Scotland, United Kingdom, KY16 9ST
63
Ratkova EL, Palmer DS, Fedorov MV. Solvation thermodynamics of organic molecules by the molecular integral equation theory: approaching chemical accuracy. Chem Rev 2015; 115:6312-56. [PMID: 26073187 DOI: 10.1021/cr5000283] [Citation(s) in RCA: 135] [Impact Index Per Article: 15.0] [Indexed: 01/23/2023]
Affiliation(s)
- Ekaterina L Ratkova
  - G. A. Krestov Institute of Solution Chemistry of the Russian Academy of Sciences, Akademicheskaya Street 1, Ivanovo 153045, Russia
  - The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig 04103, Germany
- David S Palmer
  - The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig 04103, Germany
  - Department of Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland G1 1XL, United Kingdom
- Maxim V Fedorov
  - The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig 04103, Germany
  - Department of Physics, Scottish Universities Physics Alliance (SUPA), University of Strathclyde, John Anderson Building, 107 Rottenrow East, Glasgow G4 0NG, United Kingdom
64
Welling SH, Clemmensen LKH, Buckley ST, Hovgaard L, Brockhoff PB, Refsgaard HHF. In silico modelling of permeation enhancement potency in Caco-2 monolayers based on molecular descriptors and random forest. Eur J Pharm Biopharm 2015; 94:152-9. [PMID: 26004819 DOI: 10.1016/j.ejpb.2015.05.012] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 01/30/2015] [Revised: 05/14/2015] [Accepted: 05/17/2015] [Indexed: 10/23/2022]
Abstract
Structural traits of permeation enhancers are important determinants of their capacity to promote enhanced drug absorption. Therefore, in order to obtain a better understanding of structure-activity relationships for permeation enhancers, a Quantitative Structure-Activity Relationship (QSAR) model has been developed. The random forest-QSAR model was based upon Caco-2 data for 41 surfactant-like permeation enhancers from Whitehead et al. (2008) and molecular descriptors calculated from their structure. The QSAR model was validated by two test sets: (i) an eleven-compound experimental set with Caco-2 data and (ii) nine compounds with Caco-2 data from the literature. Feature contributions, a recently developed diagnostic tool, were applied to elucidate the contribution of individual molecular descriptors to the predicted potency. Feature contributions provided easily interpretable suggestions of important structural properties for potent permeation enhancers, such as segregation of hydrophilic and lipophilic domains. Focusing on surfactant-like properties, it is possible to model the potency of the complex pharmaceutical excipients, permeation enhancers. For the first time, a QSAR model has been developed for permeation enhancement. The model is a valuable in silico approach for both screening of new permeation enhancers and physicochemical optimisation of surfactant enhancer systems.
Affiliation(s)
- Søren H Welling
  - Global Research, Novo Nordisk A/S, Novo Nordisk Park, 2760 Måløv, Denmark; Technical University of Denmark, DTU Compute, 2800 Kgs. Lyngby, Denmark
- Stephen T Buckley
  - Global Research, Novo Nordisk A/S, Novo Nordisk Park, 2760 Måløv, Denmark
- Lars Hovgaard
  - Global Research, Novo Nordisk A/S, Novo Nordisk Park, 2760 Måløv, Denmark
- Per B Brockhoff
  - Technical University of Denmark, DTU Compute, 2800 Kgs. Lyngby, Denmark
65
Abramov YA. Major Source of Error in QSPR Prediction of Intrinsic Thermodynamic Solubility of Drugs: Solid vs Nonsolid State Contributions? Mol Pharm 2015; 12:2126-41. [DOI: 10.1021/acs.molpharmaceut.5b00119] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022]
Affiliation(s)
- Yuriy A. Abramov
  - Pfizer Global Research and Development, Groton, Connecticut 06340, United States
66
Economical synthesis of 13C-labeled opiates, cocaine derivatives and selected urinary metabolites by derivatization of the natural products. Molecules 2015; 20:5329-45. [PMID: 25816077 PMCID: PMC6272324 DOI: 10.3390/molecules20045329] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/10/2014] [Revised: 02/10/2015] [Accepted: 03/19/2015] [Indexed: 11/17/2022]
Abstract
The illegal use of opiates and cocaine is a challenge worldwide, but some derivatives are also valuable pharmaceuticals. Reference samples of the active ingredients and their metabolites are needed both for controlling administration in the clinic and to detect drugs of abuse. In particular, 13C-labeled compounds are useful for identification and quantification by mass spectrometric techniques, potentially increasing accuracy by minimizing ion alteration/suppression effects. Thus, the synthesis of [acetyl-13C4]heroin, [acetyl-13C4-methyl-13C]heroin, [acetyl-13C2-methyl-13C]6-acetylmorphine, [N-methyl-13C-O-methyl-13C]codeine and phenyl-13C6-labeled derivatives of cocaine, benzoylecgonine, norcocaine and cocaethylene was undertaken to provide such reference materials. The synthetic work has focused on identifying 13C atom-efficient routes towards these derivatives. Therefore, the 13C-labeled opiates and cocaine derivatives were made from the corresponding natural products.
67
Kew W, Mitchell JBO. Greedy and Linear Ensembles of Machine Learning Methods Outperform Single Approaches for QSPR Regression Problems. Mol Inform 2015; 34:634-47. [DOI: 10.1002/minf.201400122] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 09/05/2014] [Accepted: 01/20/2015] [Indexed: 12/20/2022]
68
Tetko IV, Sushko Y, Novotarskyi S, Patiny L, Kondratov I, Petrenko AE, Charochkina L, Asiri AM. How accurately can we predict the melting points of drug-like compounds? J Chem Inf Model 2014; 54:3320-9. [PMID: 25489863 PMCID: PMC4702524 DOI: 10.1021/ci5005288] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Indexed: 01/31/2023]
Abstract
This article contributes a highly accurate model for predicting the melting points (MPs) of medicinal chemistry compounds. The model was developed using the largest published data set, comprising more than 47k compounds. The distributions of MPs in drug-like and drug lead sets showed that >90% of molecules melt within [50,250]°C. The final model achieved an RMSE of less than 33 °C for molecules from this temperature interval, which is the most important for medicinal chemistry users. This performance was achieved using a consensus model that performed calculations to a significantly higher accuracy than the individual models. We found that compounds with reactive and unstable groups were overrepresented among outlying compounds. These compounds could decompose during storage or measurement, thus introducing experimental errors. While filtering the data by removing outliers generally increased the accuracy of individual models, it did not significantly affect the results of the consensus models. The three distance-to-model measures analyzed did not allow us to flag molecules whose MP values fell outside the applicability domain of the model. We believe that this negative result and the public availability of data from this article will encourage future studies to develop better approaches to define the applicability domain of models. The final model, MP data, and identified reactive groups are available online at http://ochem.eu/article/55638.
Affiliation(s)
- Igor V Tetko
  - Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Institute of Structural Biology, Munich 85764, Germany
69
Lavecchia A. Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 2014; 20:318-31. [PMID: 25448759 DOI: 10.1016/j.drudis.2014.10.012] [Citation(s) in RCA: 359] [Impact Index Per Article: 35.9] [Received: 07/18/2014] [Revised: 09/27/2014] [Accepted: 10/24/2014] [Indexed: 12/19/2022]
Abstract
During the past decade, virtual screening (VS) has evolved from traditional similarity searching, which utilizes single reference compounds, into an advanced application domain for data mining and machine-learning approaches, which require large and representative training-set compounds to learn robust decision rules. The explosive growth in the amount of public domain-available chemical and biological data has generated huge effort to design, analyze, and apply novel learning methodologies. Here, I focus on machine-learning techniques within the context of ligand-based VS (LBVS). In addition, I analyze several relevant VS studies from recent publications, providing a detailed view of the current state-of-the-art in this field and highlighting not only the problematic issues, but also the successes and opportunities for further advances.
Affiliation(s)
- Antonio Lavecchia
  - Department of Pharmacy, Drug Discovery Laboratory, University of Napoli 'Federico II', via D. Montesano 49, I-80131 Napoli, Italy
70
Palmer DS, Mitchell JBO. Is Experimental Data Quality the Limiting Factor in Predicting the Aqueous Solubility of Druglike Molecules? Mol Pharm 2014; 11:2962-72. [DOI: 10.1021/mp500103r] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Affiliation(s)
- David S. Palmer
  - Department of Chemistry, University of Strathclyde, Thomas Graham Building, 295 Cathedral Street, Glasgow, Scotland G1 1XL, U.K.
- John B. O. Mitchell
  - Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, U.K.
71
McDonagh JL, Nath N, De Ferrari L, van Mourik T, Mitchell JBO. Uniting cheminformatics and chemical theory to predict the intrinsic aqueous solubility of crystalline druglike molecules. J Chem Inf Model 2014; 54:844-56. [PMID: 24564264 PMCID: PMC3965570 DOI: 10.1021/ci4005805] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.1] [Indexed: 11/28/2022]
Abstract
We present four models of solution free-energy prediction for druglike molecules utilizing cheminformatics descriptors and theoretically calculated thermodynamic values. We make predictions of solution free energy using physics-based theory alone and using machine learning/quantitative structure–property relationship (QSPR) models. We also develop machine learning models where the theoretical energies and cheminformatics descriptors are used as combined input. These models are used to predict solvation free energy. While direct theoretical calculation does not give accurate results in this approach, machine learning is able to give predictions with a root mean squared error (RMSE) of ∼1.1 log S units in a 10-fold cross-validation for our Drug-Like-Solubility-100 (DLS-100) dataset of 100 druglike molecules. We find that a model built using energy terms from our theoretical methodology as descriptors is marginally less predictive than one built on Chemistry Development Kit (CDK) descriptors. Combining both sets of descriptors allows a further but very modest improvement in the predictions. However, in some cases, this is a statistically significant enhancement. These results suggest that there is little complementarity between the chemical information provided by these two sets of descriptors, despite their different sources and methods of calculation. Our machine learning models are also able to predict the well-known Solubility Challenge dataset with an RMSE value of 0.9–1.0 log S units.
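The 10-fold cross-validated RMSE quoted in this abstract can be illustrated with a minimal sketch. This is not the authors' workflow: a one-descriptor least-squares line stands in for the machine learning models, the 100 data points are synthetic, and the names `fit_line` and `kfold_rmse` are hypothetical.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b with a single descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def kfold_rmse(xs, ys, k=10, seed=0):
    """RMSE pooled over k held-out folds (each point is predicted exactly once)."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    squared_error = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_error += sum((ys[i] - (a * xs[i] + b)) ** 2 for i in held_out)
    return math.sqrt(squared_error / len(xs))


# Synthetic stand-in for a 100-molecule set: one descriptor, noisy log S values.
rng = random.Random(1)
descriptor = [rng.uniform(-2.0, 6.0) for _ in range(100)]
log_s = [-0.8 * x - 0.5 + rng.gauss(0.0, 1.0) for x in descriptor]

rmse = kfold_rmse(descriptor, log_s)
print(f"10-fold CV RMSE: {rmse:.2f} log S units")
```

Because the noise in the toy data has a standard deviation of 1 log unit, the cross-validated RMSE comes out close to 1, which is the floor any model could reach on such data.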
Affiliation(s)
- James L McDonagh
- Biomedical Sciences Research Complex and EaStCHEM, School of Chemistry, Purdie Building, University of St. Andrews, North Haugh, St. Andrews, Scotland, KY16 9ST, United Kingdom
|
72
|
Mitchell JBO. Machine learning methods in chemoinformatics. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2014; 4:468-481. [PMID: 25285160 PMCID: PMC4180928 DOI: 10.1002/wcms.1183] [Citation(s) in RCA: 248] [Impact Index Per Article: 24.8] [Indexed: 12/18/2022]
Abstract
Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure-activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers.
|
73
|
|
74
|
Batisai E, Ayamine A, Kilinkissa OEY, Báthori NB. Melting point–solubility–structure correlations in multicomponent crystals containing fumaric or adipic acid. CrystEngComm 2014. [DOI: 10.1039/c4ce01298d] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 11/21/2022]
Abstract
The relationships between melting point, solubility and structure were investigated in a series of multicomponent crystals of fumaric and adipic acid.
Affiliation(s)
- Eustina Batisai
- Crystal Engineering Research Unit, Department of Chemistry, Cape Peninsula University of Technology, Cape Town, South Africa
- Alban Ayamine
- Crystal Engineering Research Unit, Department of Chemistry, Cape Peninsula University of Technology, Cape Town, South Africa
- Ornella E. Y. Kilinkissa
- Crystal Engineering Research Unit, Department of Chemistry, Cape Peninsula University of Technology, Cape Town, South Africa
- Nikoletta B. Báthori
- Crystal Engineering Research Unit, Department of Chemistry, Cape Peninsula University of Technology, Cape Town, South Africa
|
75
|
Salahinejad M, Le TC, Winkler DA. Aqueous Solubility Prediction: Do Crystal Lattice Interactions Help? Mol Pharm 2013; 10:2757-66. [DOI: 10.1021/mp4001958] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 12/23/2022]
Affiliation(s)
- Maryam Salahinejad
- Faculty of Chemistry, Tarbiat Moallem University, Tehran 15719-14911, Iran
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia
- Tu C. Le
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- David A. Winkler
- CSIRO Materials Science & Engineering, Clayton 3168, Australia
- Monash Institute of Pharmaceutical Sciences, Parkville 3052, Australia
|
76
|
Brauner N, Shacham M. Prediction of normal melting point of pure substances by a reference series method. AIChE J 2013. [DOI: 10.1002/aic.14128] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/07/2022]
Affiliation(s)
- Neima Brauner
- School of Mechanical Engineering; Tel-Aviv University; Tel-Aviv 69978 Israel
- Mordechai Shacham
- Dept. of Chemical Engineering; Ben-Gurion University of the Negev; Beer-Sheva 84105 Israel
|
77
|
Saldana DA, Starck L, Mougin P, Rousseau B, Creton B. On the rational formulation of alternative fuels: melting point and net heat of combustion predictions for fuel compounds using machine learning methods. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:259-277. [PMID: 23574496 DOI: 10.1080/1062936x.2013.766634] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 06/02/2023]
Abstract
We report the development of predictive models for two fuel specifications: melting points (T(m)) and net heat of combustion (Δ(c)H). Compounds inside the scope of these models are those likely to be found in alternative fuels, i.e. hydrocarbons, alcohols and esters. Experimental T(m) and Δ(c)H values for these types of molecules have been gathered to generate a unique database. Various quantitative structure-property relationship (QSPR) approaches have been used to build models, ranging from methods leading to multi-linear models such as genetic function approximation (GFA), or partial least squares (PLS) to those leading to non-linear models such as feed-forward artificial neural networks (FFANN), general regression neural networks (GRNN), support vector machines (SVM), or graph machines. Except for the case of the graph machines method for which the only inputs are SMILES formulae, previously listed approaches working on molecular descriptors and functional group count descriptors were used to develop specific models for T(m) and Δ(c)H. For each property, the predictive models return slightly different responses for each molecular structure. Therefore, models labelled as 'consensus models' were built by averaging values computed with selected individual models. Predicted results were then compared with experimental data and with predictions of models in the literature.
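The "consensus model" idea in this abstract, averaging the values computed by selected individual models, can be sketched as follows. The function name `consensus_prediction`, the choice of three models and all numbers are hypothetical and only illustrate the averaging step; they are not taken from the paper.

```python
from statistics import mean


def consensus_prediction(predictions):
    """Average, compound by compound, the values returned by several models.

    predictions maps a model name to its list of predicted values; every
    list must cover the same compounds in the same order.
    """
    per_model = list(predictions.values())
    if not per_model:
        raise ValueError("at least one model is required")
    n_compounds = len(per_model[0])
    if any(len(values) != n_compounds for values in per_model):
        raise ValueError("all models must predict the same compounds")
    return [mean(values) for values in zip(*per_model)]


# Hypothetical melting point predictions (K) for three compounds.
tm_predictions = {
    "PLS": [250.0, 310.0, 180.0],
    "SVM": [262.0, 298.0, 171.0],
    "FFANN": [256.0, 304.0, 177.0],
}

consensus = consensus_prediction(tm_predictions)
print(consensus)  # [256.0, 304.0, 176.0]
```

A plain unweighted mean is assumed here; the abstract does not say whether the selected models were weighted.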
Affiliation(s)
- D A Saldana
- IFP Energies Nouvelles, Rueil-Malmaison, France
|
78
|
Salahinejad M, Le TC, Winkler DA. Capturing the crystal: prediction of enthalpy of sublimation, crystal lattice energy, and melting points of organic compounds. J Chem Inf Model 2013; 53:223-9. [PMID: 23215043 DOI: 10.1021/ci3005012] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 01/29/2023]
Abstract
Accurate computational prediction of melting points and aqueous solubilities of organic compounds would be very useful but is notoriously difficult. Predicting the lattice energies of compounds is key to understanding and predicting their melting behavior and ultimately their solubility behavior. We report robust, predictive, quantitative structure-property relationship (QSPR) models for enthalpies of sublimation, crystal lattice energies, and melting points for a very large and structurally diverse set of small organic compounds. Sparse Bayesian feature selection and machine learning methods were employed to select the most relevant molecular descriptors for the model and to generate parsimonious quantitative models. The final enthalpy of sublimation model is a four-parameter multilinear equation that has an r² value of 0.96 and an average absolute error of 7.9 ± 0.3 kJ mol⁻¹. The melting point model can predict this property with a standard error of 45 ± 1 K and an r² value of 0.79. Given the size and diversity of the training data, these conceptually transparent and accurate models can be used to predict sublimation enthalpy, lattice energy, and melting points of organic compounds in general.
Affiliation(s)
- Maryam Salahinejad
- Faculty of Chemistry, Tarbiat Moallem University, Tehran 15719-14911, Iran
|
79
|
Bhatnagar N, Kamath G, Potoff JJ. Prediction of 1-octanol–water and air–water partition coefficients for nitro-aromatic compounds from molecular dynamics simulations. Phys Chem Chem Phys 2013; 15:6467-74. [DOI: 10.1039/c3cp44284e] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 12/30/2022]
|
80
|
Affiliation(s)
- Bo Lian
- College of Pharmacy, University of Arizona, Tucson, Arizona 85721, United States
- Samuel H. Yalkowsky
- College of Pharmacy, University of Arizona, Tucson, Arizona 85721, United States
81
|
Bhatnagar N, Kamath G, Chelst I, Potoff JJ. Direct calculation of 1-octanol-water partition coefficients from adaptive biasing force molecular dynamics simulations. J Chem Phys 2012; 137:014502. [PMID: 22779660 DOI: 10.1063/1.4730040] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022]
Abstract
The 1-octanol-water partition coefficient log Kow of a solute is a key parameter used in the prediction of a wide variety of complex phenomena such as drug availability and bioaccumulation potential of trace contaminants. In this work, adaptive biasing force molecular dynamics simulations are used to determine absolute free energies of hydration, solvation, and 1-octanol-water partition coefficients for n-alkanes from methane to octane. Two approaches are evaluated: the direct transfer of the solute from 1-octanol to water phase, and separate transfers of the solute from the water or 1-octanol phase to vacuum, with both methods yielding statistically indistinguishable results. Calculations performed with the TIP4P and SPC/E water models and the TraPPE united-atom force field for n-alkanes show that the choice of water model has a negligible effect on predicted free energies of transfer and partition coefficients for n-alkanes. A comparison of calculations using wet and dry octanol phases shows that the predictions for log Kow using wet octanol are 0.2-0.4 log units lower than for dry octanol, although this is within the statistical uncertainty of the calculation.
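The link between the two routes in this abstract and log Kow can be written down in a few lines. The sketch below assumes that both solvation free energies describe transfer from vacuum into the solvent, share the same concentration-based standard state, and are given in kcal/mol; the function names and the numerical inputs are hypothetical and are not results from the paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def log_kow_from_solvation(dg_water, dg_octanol, temperature=298.15):
    """Separate-transfer route: log Kow from vacuum-to-solvent free energies."""
    return (dg_water - dg_octanol) / (R_KCAL * temperature * math.log(10))


def log_kow_from_transfer(dg_water_to_octanol, temperature=298.15):
    """Direct route: log Kow from the water-to-octanol transfer free energy."""
    return -dg_water_to_octanol / (R_KCAL * temperature * math.log(10))


# Hypothetical solute: unfavourable hydration, mildly favourable in octanol.
dg_hyd = 2.0    # kcal/mol, vacuum -> water
dg_oct = -0.5   # kcal/mol, vacuum -> 1-octanol

separate = log_kow_from_solvation(dg_hyd, dg_oct)
direct = log_kow_from_transfer(dg_oct - dg_hyd)
print(f"log Kow = {separate:.2f} (separate), {direct:.2f} (direct)")
```

At 298.15 K one log unit corresponds to about 1.36 kcal/mol, so the 0.2-0.4 log unit difference between wet and dry octanol amounts to roughly 0.3-0.5 kcal/mol in the transfer free energy.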
Affiliation(s)
- Navendu Bhatnagar
- Department of Chemical Engineering and Materials Science, Wayne State University, Detroit, Michigan 48202, USA
|
82
|
Erić S, Kalinić M, Popović A, Zloh M, Kuzmanovski I. Prediction of aqueous solubility of drug-like molecules using a novel algorithm for automatic adjustment of relative importance of descriptors implemented in counter-propagation artificial neural networks. Int J Pharm 2012; 437:232-41. [DOI: 10.1016/j.ijpharm.2012.08.022] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/26/2012] [Revised: 08/12/2012] [Accepted: 08/16/2012] [Indexed: 10/28/2022]
|
83
|
Heikkinen AT, Baneyx G, Caruso A, Parrott N. Application of PBPK modeling to predict human intestinal metabolism of CYP3A substrates – An evaluation and case study using GastroPlus™. Eur J Pharm Sci 2012; 47:375-86. [DOI: 10.1016/j.ejps.2012.06.013] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.5] [Received: 02/08/2012] [Revised: 05/11/2012] [Accepted: 06/23/2012] [Indexed: 01/10/2023]
|
84
|
Hechinger M, Leonhard K, Marquardt W. What is Wrong with Quantitative Structure–Property Relations Models Based on Three-Dimensional Descriptors? J Chem Inf Model 2012; 52:1984-93. [DOI: 10.1021/ci300246m] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 12/12/2022]
Affiliation(s)
- M. Hechinger
- AVT-Process Systems Engineering and Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany
- K. Leonhard
- AVT-Process Systems Engineering and Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany
- W. Marquardt
- AVT-Process Systems Engineering and Chair of Technical Thermodynamics, RWTH Aachen University, 52064 Aachen, Germany
|
85
|
Palmer DS, McDonagh JL, Mitchell JBO, van Mourik T, Fedorov MV. First-Principles Calculation of the Intrinsic Aqueous Solubility of Crystalline Druglike Molecules. J Chem Theory Comput 2012; 8:3322-37. [PMID: 26605739 DOI: 10.1021/ct300345m] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/28/2022]
Abstract
We demonstrate that the intrinsic aqueous solubility of crystalline druglike molecules can be estimated with reasonable accuracy from sublimation free energies calculated using crystal lattice simulations and hydration free energies calculated using the 3D Reference Interaction Site Model (3D-RISM) of the Integral Equation Theory of Molecular Liquids (IET). The solubilities of 25 crystalline druglike molecules taken from different chemical classes are predicted by the model with a correlation coefficient of R = 0.85 and a root mean square error (RMSE) equal to 1.45 log10S units, which is significantly more accurate than results obtained using implicit continuum solvent models. The method is not directly parametrized against experimental solubility data, and it offers a full computational characterization of the thermodynamics of transfer of the drug molecule from crystal phase to gas phase to dilute aqueous solution.
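The crystal to gas to solution cycle described in this abstract can be sketched numerically. The code assumes, as a simplification, that the sublimation and hydration free energies are both expressed in a 1 mol/L standard state and in kcal/mol, so that their sum gives a solution free energy from which log10 S (S in mol/L) follows directly; the paper's own standard-state conventions may differ, and the function name and input values are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def log10_solubility(dg_sub, dg_hyd, temperature=298.15):
    """log10 of intrinsic solubility (mol/L) from a thermodynamic cycle.

    dg_sub: free energy of sublimation, crystal -> gas (kcal/mol)
    dg_hyd: free energy of hydration, gas -> dilute aqueous solution (kcal/mol)
    Both are assumed to use the same 1 mol/L standard state.
    """
    dg_sol = dg_sub + dg_hyd  # crystal -> solution
    return -dg_sol / (R_KCAL * temperature * math.log(10))


# Hypothetical druglike molecule.
log_s = log10_solubility(dg_sub=12.0, dg_hyd=-8.0)
print(f"log10 S = {log_s:.2f}")
```

Since one log unit is about 1.36 kcal/mol at 298.15 K, the reported RMSE of 1.45 log units corresponds to an error of roughly 2 kcal/mol in the summed free energy, which shows how sensitive the predicted solubility is to each leg of the cycle.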
Affiliation(s)
- David S Palmer
- Department of Physics, University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, United Kingdom
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
- James L McDonagh
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- John B O Mitchell
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- Tanja van Mourik
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, University of St. Andrews, Purdie Building, North Haugh, St. Andrews, Scotland KY16 9ST, United Kingdom
- Maxim V Fedorov
- Department of Physics, University of Strathclyde, John Anderson Building, 107 Rottenrow, Glasgow, Scotland G4 0NG, United Kingdom
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
|
86
|
Duffy BC, Zhu L, Decornez H, Kitchen DB. Early phase drug discovery: cheminformatics and computational techniques in identifying lead series. Bioorg Med Chem 2012; 20:5324-42. [PMID: 22938785 DOI: 10.1016/j.bmc.2012.04.062] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Received: 01/31/2012] [Revised: 04/24/2012] [Accepted: 04/27/2012] [Indexed: 01/31/2023]
Abstract
Early drug discovery processes rely on hit finding procedures followed by extensive experimental confirmation in order to select high priority hit series which then undergo further scrutiny in hit-to-lead studies. The experimental cost and the risk associated with poor selection of lead series can be greatly reduced by the use of many different computational and cheminformatic techniques to sort and prioritize compounds. We describe the steps in typical hit identification and hit-to-lead programs and then describe how cheminformatic analysis assists this process. In particular, scaffold analysis, clustering and property calculations assist in the design of high-throughput screening libraries, the early analysis of hits and then organizing compounds into series for their progression from hits to leads. Additionally, these computational tools can be used in virtual screening to design hit-finding libraries and as procedures to help with early SAR exploration.
Affiliation(s)
- Bryan C Duffy
- AMRI, 26 Corporate Circle, PO Box 15098, Albany, NY 12212-5098, USA
|
87
|
Nath N, Mitchell JBO. Is EC class predictable from reaction mechanism? BMC Bioinformatics 2012; 13:60. [PMID: 22530800 PMCID: PMC3368749 DOI: 10.1186/1471-2105-13-60] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 02/22/2012] [Accepted: 04/24/2012] [Indexed: 11/10/2022]
Abstract
BACKGROUND: We investigate the relationships between the EC (Enzyme Commission) class, the associated chemical reaction, and the reaction mechanism by building predictive models using Support Vector Machine (SVM), Random Forest (RF) and k-Nearest Neighbours (kNN). We consider two ways of encoding the reaction mechanism in descriptors, and also three approaches that encode only the overall chemical reaction. Both cross-validation and also an external test set are used. RESULTS: The three descriptor sets encoding overall chemical transformation perform better than the two descriptions of mechanism. SVM and RF models perform comparably well; kNN is less successful. Oxidoreductases and hydrolases are relatively well predicted by all types of descriptor; isomerases are well predicted by overall reaction descriptors but not by mechanistic ones. CONCLUSIONS: Our results suggest that pairs of similar enzyme reactions tend to proceed by different mechanisms. Oxidoreductases, hydrolases, and to some extent isomerases and ligases, have clear chemical signatures, making them easier to predict than transferases and lyases. We find evidence that isomerases as a class are notably mechanistically diverse and that their one shared property, of substrate and product being isomers, can arise in various unrelated ways. The performance of the different machine learning algorithms is in line with many cheminformatics applications, with SVM and RF being roughly equally effective. kNN is less successful, given the role that non-local information plays in successful classification. We note also that, despite a lack of clarity in the literature, EC number prediction is not a single problem; the challenge of predicting protein function from available sequence data is quite different from assigning an EC classification from a cheminformatics representation of a reaction.
Affiliation(s)
- Neetika Nath
- Biomedical Sciences Research Complex and EaStCHEM School of Chemistry, Purdie Building, University of St Andrews, North Haugh, St Andrews, Scotland KY16 9ST, UK
|
88
|
Ognichenko LN, Kuz'min VE, Gorb L, Hill FC, Artemenko AG, Polischuk PG, Leszczynski J. QSPR Prediction of Lipophilicity for Organic Compounds Using Random Forest Technique on the Basis of Simplex Representation of Molecular Structure. Mol Inform 2012; 31:273-80. [PMID: 27477097 DOI: 10.1002/minf.201100102] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 06/23/2011] [Accepted: 02/05/2012] [Indexed: 11/08/2022]
Abstract
The relationship between the octanol-water partition coefficient for more than twelve thousand organic compounds and their structures was investigated using a QSPR approach based on Simplex Representation of Molecular Structure (SiRMS). The dataset used in our study included 10973 compounds with experimental values of lipophilicity (LogKow) for different chemical compounds. The Random Forest (RF) method was used for statistical modeling at the 2D level of representation of molecular structure. The developed models are adequate and were successfully validated with external test sets. The proposed models have a clear interpretation due to the use of simplex representation of molecular structure and predict LogKow values with the accuracy of the best modern models. Thus the QSPR models proposed in this study represent a powerful and easy-to-use virtual screening tool that can be recommended for prediction of the octanol-water partition coefficient.
Affiliation(s)
- Liudmyla N Ognichenko
- Laboratory of Theoretical Chemistry, Department of Molecular Structure, A.V. Bogatsky Physical-Chemical Institute, National Academy of Science of Ukraine, Ukraine, Odessa, 65080, Lustdorfskaya Doroga 86
- Victor E Kuz'min
- Laboratory of Theoretical Chemistry, Department of Molecular Structure, A.V. Bogatsky Physical-Chemical Institute, National Academy of Science of Ukraine, Ukraine, Odessa, 65080, Lustdorfskaya Doroga 86
- Leonid Gorb
- Badger Technical Services, LLC, Vicksburg, Mississippi, USA
- Frances C Hill
- US Army ERDC, 3532 Manor Dr, Vicksburg, Mississippi, 39180, USA
- Anatoly G Artemenko
- Laboratory of Theoretical Chemistry, Department of Molecular Structure, A.V. Bogatsky Physical-Chemical Institute, National Academy of Science of Ukraine, Ukraine, Odessa, 65080, Lustdorfskaya Doroga 86
- Pavel G Polischuk
- Laboratory of Theoretical Chemistry, Department of Molecular Structure, A.V. Bogatsky Physical-Chemical Institute, National Academy of Science of Ukraine, Ukraine, Odessa, 65080, Lustdorfskaya Doroga 86
- Jerzy Leszczynski
- US Army ERDC, 3532 Manor Dr, Vicksburg, Mississippi, 39180, USA
- Interdisciplinary Center for Nanotoxicity, Department of Chemistry and Biochemistry, Jackson State University, Jackson, Mississippi, 39217, USA
|
89
|
Alechaga É, Moyano E, Galceran MT. Ultra-high performance liquid chromatography-tandem mass spectrometry for the analysis of phenicol drugs and florfenicol-amine in foods. Analyst 2012; 137:2486-94. [DOI: 10.1039/c2an16052h] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 11/21/2022]
|
90
|
Abstract
Physicochemical properties are key factors in controlling the interactions of xenobiotics with living organisms. Computational approaches to toxicity prediction therefore generally rely to a very large extent on the physicochemical properties of the query compounds. Consequently it is important that reliable in silico methods are available for the rapid calculation of physicochemical properties. The key properties are partition coefficient, aqueous solubility, and pKa and, to a lesser extent, melting point, boiling point, vapor pressure, and Henry's law constant (air-water partition coefficient). The calculation of each of these properties from quantitative structure-property relationships (QSPRs) and from available software is discussed in detail, and recommendations made. Finally, detailed consideration is given of guidelines for the development of QSPRs and QSARs.
Affiliation(s)
- John C Dearden
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Liverpool, UK.
|
91
|
Schultes S, de Graaf C, Berger H, Mayer M, Steffen A, Haaksma EEJ, de Esch IJP, Leurs R, Krämer O. A medicinal chemistry perspective on melting point: matched molecular pair analysis of the effects of simple descriptors on the melting point of drug-like compounds. MEDCHEMCOMM 2012. [DOI: 10.1039/c2md00313a] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
|
92
|
Estimating the octanol/water partition coefficient for aliphatic organic compounds using semi-empirical electrotopological index. Int J Mol Sci 2011; 12:7250-64. [PMID: 22072945 PMCID: PMC3211036 DOI: 10.3390/ijms12107250] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 09/08/2011] [Revised: 10/08/2011] [Accepted: 10/14/2011] [Indexed: 11/16/2022]
Abstract
A new possibility for estimating the octanol/water coefficient (log P) was investigated using only one descriptor, the semi-empirical electrotopological index (I(SET)). The predictability of four octanol/water partition coefficient (log P) calculation models was compared using a set of 131 aliphatic organic compounds from five different classes. Log P values were calculated employing atomic-contribution methods, as in the Ghose/Crippen approach and its later refinement, AlogP; using fragmental methods through the ClogP method; and employing an approach considering the whole molecule using topological indices with the MlogP method. The efficiency and the applicability of the I(SET) in terms of calculating log P were demonstrated through good statistical quality (r > 0.99; s < 0.18), high internal stability and good predictive ability for an external group of compounds in the same order as the widely used models based on the fragmental method, ClogP, and the atomic contribution method, AlogP, which are among the most used methods of predicting log P.
|
93
|
Palmer DS, Frolov AI, Ratkova EL, Fedorov MV. Toward a Universal Model To Calculate the Solvation Thermodynamics of Druglike Molecules: The Importance of New Experimental Databases. Mol Pharm 2011; 8:1423-9. [DOI: 10.1021/mp200119r] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.8] [Indexed: 11/29/2022]
Affiliation(s)
- David S. Palmer
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
- Andrey I. Frolov
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
- Ekaterina L. Ratkova
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
- Maxim V. Fedorov
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
|
94
|
Sergiievskyi VP. Model for calculating the free energy of hydration of bioactive compounds based on integral equation theory of liquids. RUSSIAN JOURNAL OF PHYSICAL CHEMISTRY B 2011. [DOI: 10.1134/s1990793111020382] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
95
|
Liu F, Cao C, Cheng B. A Quantitative Structure-Property Relationship (QSPR) Study of aliphatic alcohols by the method of dividing the molecular structure into substructure. Int J Mol Sci 2011; 12:2448-62. [PMID: 21731451 PMCID: PMC3127127 DOI: 10.3390/ijms12042448] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 02/05/2011] [Revised: 03/24/2011] [Accepted: 04/06/2011] [Indexed: 11/16/2022]
Abstract
A quantitative structure–property relationship (QSPR) analysis of aliphatic alcohols is presented. Four physicochemical properties were studied: boiling point (BP), n-octanol–water partition coefficient (lg POW), water solubility (lg W) and the chromatographic retention indices (RI) on different polar stationary phases. In order to investigate the quantitative structure–property relationship of aliphatic alcohols, the molecular structure ROH is divided into two parts, R and OH to generate structural parameter. It was proposed that the property is affected by three main factors for aliphatic alcohols, alkyl group R, substituted group OH, and interaction between R and OH. On the basis of the polarizability effect index (PEI), previously developed by Cao, the novel molecular polarizability effect index (MPEI) combined with odd-even index (OEI), the sum eigenvalues of bond-connecting matrix (SX1CH) previously developed in our team, were used to predict the property of aliphatic alcohols. The sets of molecular descriptors were derived directly from the structure of the compounds based on graph theory. QSPR models were generated using only calculated descriptors and multiple linear regression techniques. These QSPR models showed high values of multiple correlation coefficient (R > 0.99) and Fisher-ratio statistics. The leave-one-out cross-validation demonstrated the final models to be statistically significant and reliable.
Affiliation(s)
- Fengping Liu
- School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
- Chenzhong Cao
- School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
- Key Laboratory of Theoretical Chemistry and Molecular Simulation of Ministry of Education, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
- Author to whom correspondence should be addressed
- Bin Cheng
- School of Chemistry and Chemical Engineering, Hunan University of Science and Technology, Xiangtan, Hunan 411201, China
|
96
|
Abstract
This article reviews the use of informatics and computational chemistry methods in medicinal chemistry, with special consideration of how computational techniques can be adapted and extended to obtain more and higher-quality information. Special consideration is given to the computation of protein–ligand binding affinities, to the prediction of off-target bioactivities, bioactivity spectra and computational toxicology, and also to calculating absorption-, distribution-, metabolism- and excretion-relevant properties, such as solubility.
|
97
|
Deeb O, Goodarzi M, Alfalah S. Prediction of melting point for drug-like compounds via QSPR methods. Mol Phys 2011. [DOI: 10.1080/00268976.2010.532164] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/18/2022]
|
98
|
Palmer DS, Frolov AI, Ratkova EL, Fedorov MV. Towards a universal method for calculating hydration free energies: a 3D reference interaction site model with partial molar volume correction. J Phys Condens Matter 2010; 22:492101. [PMID: 21406779 DOI: 10.1088/0953-8984/22/49/492101] [Citation(s) in RCA: 89] [Impact Index Per Article: 6.4] [Indexed: 05/30/2023]
Abstract
We report a simple universal method to systematically improve the accuracy of hydration free energies calculated using an integral equation theory of molecular liquids, the 3D reference interaction site model. A strong linear correlation is observed between the difference of the experimental and (uncorrected) calculated hydration free energies and the calculated partial molar volume for a data set of 185 neutral organic molecules from different chemical classes. By using the partial molar volume as a linear empirical correction to the calculated hydration free energy, we obtain predictions of hydration free energies in excellent agreement with experiment (R = 0.94, σ = 0.99 kcal mol⁻¹ for a test set of 120 organic molecules).
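The correction described in this abstract amounts to fitting a straight line to the error of the calculated hydration free energy as a function of the calculated partial molar volume, then adding that line back to new predictions. The sketch below illustrates the idea only: all numbers are synthetic, the slope, intercept and noise level are arbitrary assumptions, and the function names `fit_pmv_correction` and `apply_pmv_correction` are hypothetical rather than taken from the paper.

```python
import numpy as np


def fit_pmv_correction(dg_calc, dg_exp, pmv):
    """Fit (dg_exp - dg_calc) ~ a * pmv + b by least squares; returns (a, b)."""
    a, b = np.polyfit(pmv, dg_exp - dg_calc, deg=1)
    return a, b


def apply_pmv_correction(dg_calc, pmv, a, b):
    """Add the linear partial-molar-volume correction to calculated values."""
    return dg_calc + a * pmv + b


def rmse(x, y):
    """Root-mean-square error between two arrays."""
    return float(np.sqrt(np.mean((x - y) ** 2)))


# Synthetic data (assumption): 100 "molecules" with an error that is linear in
# the partial molar volume plus random noise. Units are nominal (kcal/mol, A^3).
rng = np.random.default_rng(1)
n = 100
a_true, b_true = -0.15, 2.0                 # arbitrary illustrative values
pmv = rng.uniform(50.0, 300.0, size=n)
dg_exp = rng.normal(loc=-5.0, scale=3.0, size=n)
dg_calc = dg_exp - (a_true * pmv + b_true) + rng.normal(scale=0.5, size=n)

# Fit on a training subset, evaluate on the held-out remainder.
train, test = slice(0, 40), slice(40, n)
a, b = fit_pmv_correction(dg_calc[train], dg_exp[train], pmv[train])
dg_corr = apply_pmv_correction(dg_calc[test], pmv[test], a, b)

rmse_before = rmse(dg_calc[test], dg_exp[test])
rmse_after = rmse(dg_corr, dg_exp[test])
print("a =", a, "b =", b)
print("RMSE before:", rmse_before, "after:", rmse_after)
```

Fitting the two parameters on one subset and reporting the error on another mirrors the abstract's use of a separate test set, so the reported accuracy is not inflated by the fit itself.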
Affiliation(s)
- David S Palmer
- Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, DE-04103 Leipzig, Germany
|
99
|
Ratkova EL, Chuev GN, Sergiievskyi VP, Fedorov MV. An Accurate Prediction of Hydration Free Energies by Combination of Molecular Integral Equations Theory with Structural Descriptors. J Phys Chem B 2010; 114:12068-79. [DOI: 10.1021/jp103955r] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Indexed: 02/06/2023]
Affiliation(s)
- Ekaterina L. Ratkova
- The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, 04103, Germany, and Institute of Theoretical and Experimental Biophysics, Russian Academy of Science, Pushchino, Moscow Region, 142290, Russia
- Gennady N. Chuev
- The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, 04103, Germany, and Institute of Theoretical and Experimental Biophysics, Russian Academy of Science, Pushchino, Moscow Region, 142290, Russia
- Volodymyr P. Sergiievskyi
- The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, 04103, Germany, and Institute of Theoretical and Experimental Biophysics, Russian Academy of Science, Pushchino, Moscow Region, 142290, Russia
- Maxim V. Fedorov
- The Max Planck Institute for Mathematics in the Sciences, Inselstrasse 22, Leipzig, 04103, Germany, and Institute of Theoretical and Experimental Biophysics, Russian Academy of Science, Pushchino, Moscow Region, 142290, Russia
|
100
|
Palmer DS, Sergiievskyi VP, Jensen F, Fedorov MV. Accurate calculations of the hydration free energies of druglike molecules using the reference interaction site model. J Chem Phys 2010; 133:044104. [DOI: 10.1063/1.3458798] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Indexed: 11/14/2022]
|