1
Moriarty A, Kobayashi T, Salvalaglio M, Angeli P, Striolo A, McRobbie I. Analyzing the Accuracy of Critical Micelle Concentration Predictions Using Deep Learning. J Chem Theory Comput 2023; 19:7371-7386. [PMID: 37815387 DOI: 10.1021/acs.jctc.3c00868] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023]
Abstract
This paper presents a novel approach to predicting critical micelle concentrations (CMCs) by using graph neural networks (GNNs) augmented with Gaussian processes (GPs). The proposed model uses learned latent space representations of molecules to predict CMCs and estimate uncertainties. The performance of the model on a data set containing nonionic, cationic, anionic, and zwitterionic molecules is compared against a linear model that works with extended connectivity fingerprints (ECFPs). The GNN-based model performs slightly better than the linear ECFP model when there is enough well-balanced training data and achieves predictive accuracy that is comparable to published models that were evaluated on a smaller range of surfactant chemistries. We illustrate the applicability domain of our model using a molecular cartogram to visualize the latent space, which helps to identify molecules for which predictions are likely to be erroneous. In addition to accurately predicting CMCs for some surfactant classes, the proposed approach can provide valuable insights into the molecular properties that influence CMCs.
Affiliation(s)
- Alexander Moriarty
  - Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
- Takeshi Kobayashi
  - Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
- Matteo Salvalaglio
  - Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
- Panagiota Angeli
  - Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
- Alberto Striolo
  - Department of Chemical Engineering, University College London, London WC1E 7JE, U.K.
  - School of Sustainable Chemical, Biological and Materials Engineering, University of Oklahoma, Norman, Oklahoma 73019-0390, United States
2
Machine Learning Prediction of Mycobacterial Cell Wall Permeability of Drugs and Drug-like Compounds. Molecules 2023; 28:633. [PMID: 36677691 PMCID: PMC9863426 DOI: 10.3390/molecules28020633] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/18/2022] [Revised: 12/30/2022] [Accepted: 12/30/2022] [Indexed: 01/11/2023]
Abstract
The cell wall of Mycobacterium tuberculosis and related organisms has a very complex and unusual organization that makes it much less permeable to nutrients and antibiotics, leading to the low activity of many potential antimycobacterial drugs against whole-cell mycobacteria compared to their isolated molecular biotargets. The ability to predict and optimize the cell wall permeability could greatly enhance the development of novel antitubercular agents. Using an extensive structure-permeability dataset for organic compounds derived from published experimental big data (5371 compounds including 2671 penetrating and 2700 non-penetrating compounds), we have created a predictive classification model based on fragmental descriptors and an artificial neural network of a novel architecture that provides better accuracy (cross-validated balanced accuracy 0.768, sensitivity 0.768, specificity 0.769, area under ROC curve 0.911) and applicability domain compared with the previously published results.
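For orientation, balanced accuracy is the mean of sensitivity and specificity, so it weights the penetrating and non-penetrating classes equally. The sketch below is a minimal illustration of that definition only; the function name and the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
def balanced_accuracy(tp: int, fn: int, tn: int, fp: int) -> float:
    """Mean of sensitivity (true-positive rate) and specificity (true-negative rate)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical counts for illustration only (not the paper's confusion matrix).
example = balanced_accuracy(tp=80, fn=20, tn=90, fp=10)
print(round(example, 3))
```

Applied to the rounded figures in the abstract, (0.768 + 0.769) / 2 = 0.7685, which agrees with the reported balanced accuracy of 0.768 to within the rounding of the inputs.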
3
Applications of artificial intelligence to drug design and discovery in the big data era: a comprehensive review. Mol Divers 2021; 25:1643-1664. [PMID: 34110579 DOI: 10.1007/s11030-021-10237-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/18/2021] [Accepted: 05/26/2021] [Indexed: 10/21/2022]
Abstract
Artificial intelligence (AI) renders cutting-edge applications in diverse sectors of society. Due to substantial progress in high-performance computing, the development of superior algorithms, and the accumulation of huge biological and chemical data, computer-assisted drug design technology is playing a key role in drug discovery with its advantages of high efficiency, fast speed, and low cost. Over recent years, due to continuous progress in machine learning (ML) algorithms, AI has been extensively employed in various drug discovery stages. Very recently, drug design and discovery have entered the big data era. ML algorithms have progressively developed into a deep learning technique with potent generalization capability and more effectual big data handling, which further promotes the integration of AI technology and computer-assisted drug discovery technology, hence accelerating the design and discovery of the newest drugs. This review mainly summarizes the application progression of AI technology in the drug discovery process, and explores and compares its advantages over conventional methods. The challenges and limitations of AI in drug design and discovery have also been discussed.
4
Radchenko EV, Dyabina AS, Palyulin VA. Towards Deep Neural Network Models for the Prediction of the Blood-Brain Barrier Permeability for Diverse Organic Compounds. Molecules 2020; 25:5901. [PMID: 33322142 PMCID: PMC7763607 DOI: 10.3390/molecules25245901] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/23/2020] [Revised: 12/06/2020] [Accepted: 12/10/2020] [Indexed: 11/24/2022]
Abstract
Permeation through the blood–brain barrier (BBB) is among the most important processes controlling the pharmacokinetic properties of drugs and other bioactive compounds. Using the fragmental (substructural) descriptors representing the occurrence number of various substructures, as well as the artificial neural network approach and the double cross-validation procedure, we have developed a predictive in silico LogBB model based on an extensive and verified dataset (529 compounds), which is applicable to diverse drugs and drug-like compounds. The model has good predictivity parameters (Q² = 0.815, RMSEcv = 0.318) that are similar to or better than those of the most reliable models available in the literature. Larger datasets, and perhaps more sophisticated network architectures, are required to realize the full potential of deep neural networks. The analysis of fragment contributions reveals patterns of influence consistent with the known concepts of structural characteristics that affect the BBB permeability of organic compounds. The external validation of the model confirms good agreement between the predicted and experimental LogBB values for most of the compounds. The model enables the evaluation and optimization of the BBB permeability of potential neuroactive agents and other drug compounds.
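The two predictivity parameters quoted in this abstract can be illustrated with a short sketch. It assumes the common definitions (the cross-validated coefficient of determination as 1 minus PRESS over the total sum of squares, and the cross-validated RMSE as the root mean squared prediction error); the paper's double cross-validation procedure may define them slightly differently. The function name and the sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def q2_and_rmse(observed: Sequence[float], cv_predicted: Sequence[float]) -> Tuple[float, float]:
    """Return (Q^2, RMSE_cv) from observed values and cross-validated predictions."""
    if len(observed) != len(cv_predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # PRESS: sum of squared errors of the held-out predictions.
    press = sum((y - p) ** 2 for y, p in zip(observed, cv_predicted))
    # Total sum of squares around the mean (assumed non-zero here).
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - press / ss_tot, math.sqrt(press / len(observed))


# Hypothetical LogBB-like values, for illustration only.
q2, rmse = q2_and_rmse([0.1, -0.5, 0.8, -1.2], [0.2, -0.4, 0.6, -1.0])
print(f"Q2 = {q2:.3f}, RMSEcv = {rmse:.3f}")
```

Under these definitions a model that always predicts the mean of the observed values scores zero, and a perfect model scores one with zero error.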
5
Abstract
Due to the massive data sets available for drug candidates, modern drug discovery has advanced to the big data era. Central to this shift is the development of artificial intelligence approaches to implementing innovative modeling based on the dynamic, heterogeneous, and large nature of drug data sets. As a result, recently developed artificial intelligence approaches such as deep learning and relevant modeling studies provide new solutions to efficacy and safety evaluations of drug candidates based on big data modeling and analysis. The resulting models provided deep insights into the continuum from chemical structure to in vitro, in vivo, and clinical outcomes. The relevant novel data mining, curation, and management techniques provided critical support to recent modeling studies. In summary, the new advancement of artificial intelligence in the big data era has paved the road to future rational drug development and optimization, which will have a significant impact on drug discovery procedures and, eventually, public health.
Affiliation(s)
- Hao Zhu
  - Department of Chemistry and Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey 08102, USA
6
Shults OV. Estimating the Thermodynamic Properties of Chemical Compounds, Based on Quantitative Structural Property Relationships. Russian Journal of Physical Chemistry A 2019. [DOI: 10.1134/s0036024419070264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
7
Zhokhov AK, Loskutov AY, Rybal’chenko IV. Methodological Approaches to the Calculation and Prediction of Retention Indices in Capillary Gas Chromatography. Journal of Analytical Chemistry 2018. [DOI: 10.1134/s1061934818030127] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/23/2022]
8
Radchenko EV, Rulev YA, Safanyaev AY, Palyulin VA, Zefirov NS. Computer-aided estimation of the hERG-mediated cardiotoxicity risk of potential drug components. Dokl Biochem Biophys 2017; 473:128-131. [DOI: 10.1134/s1607672917020107] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 07/12/2016] [Indexed: 11/22/2022]
9
10
Dyabina AS, Radchenko EV, Palyulin VA, Zefirov NS. Prediction of blood-brain barrier permeability of organic compounds. Dokl Biochem Biophys 2016; 470:371-374. [DOI: 10.1134/s1607672916050173] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/29/2016] [Indexed: 11/22/2022]
11
Sosnin SB, Radchenko EV, Palyulin VA, Zefirov NS. Generalized fragmental approach in QSAR/QSPR studies. Doklady Chemistry 2015. [DOI: 10.1134/s0012500815070071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
12
Kurilo MN, Ryzhkov FV, Karpov PV, Radchenko EV, Palyulin VA, Zefirov NS. Molecular design of selective ligands of chemokine receptors. Dokl Biochem Biophys 2015; 461:131-4. [DOI: 10.1134/s1607672915020167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/20/2014] [Indexed: 11/23/2022]
13
Vassiliev PM, Spasov AA, Kosolapov VA, Kucheryavenko AF, Gurova NA, Anisimova VA. Consensus Drug Design Using IT Microcosm. Challenges and Advances in Computational Chemistry and Physics 2014. [DOI: 10.1007/978-94-017-9257-8_12] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]
14
Insight into substituent effects in Cal-B catalyzed transesterification by combining experimental and theoretical approaches. J Mol Model 2012; 19:349-58. [DOI: 10.1007/s00894-012-1552-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/17/2012] [Accepted: 07/25/2012] [Indexed: 10/28/2022]
15
Kravtsov AA, Karpov PV, Baskin II, Palyulin VA, Zefirov NS. Prediction of the preferable mechanism of nucleophilic substitution at saturated carbon atom and prognosis of SN1 rate constants by means of QSPR. Doklady Chemistry 2011. [DOI: 10.1134/s0012500811110048] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/23/2022]
16
Kravtsov AA, Karpov PV, Baskin II, Palyulin VA, Zefirov NS. Prediction of rate constants of SN2 reactions by the multicomponent QSPR method. Doklady Chemistry 2011. [DOI: 10.1134/s0012500811100107] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/23/2022]
17
Rykounov AA, Stash AI, Zhurov VV, Zhurova EA, Pinkerton AA, Tsirelson VG. On the transferability of QTAIMC descriptors derived from X-ray diffraction data and DFT calculations: substituted hydropyrimidine derivatives. Acta Crystallographica Section B: Structural Science 2011; 67:425-36. [DOI: 10.1107/s0108768111033015] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 06/09/2011] [Accepted: 08/15/2011] [Indexed: 11/11/2022]
Abstract
A combined study of electron-density features in three substituted hydropyrimidines of the Biginelli compound family has been carried out. Results of low-temperature X-ray diffraction measurements and density functional theory (DFT) B3LYP/6-311++G** calculations for these compounds are described. The experimentally derived atomic and bonding characteristics determined within the quantum-topological theory of atoms in molecules and crystals (QTAIMC) were demonstrated to be fully transferable within chemically similar structures such as the Biginelli compounds. However, for certain covalent bonds they differ significantly from the theoretical results because of insufficient flexibility of the atom-centered multipole electron density model. It was concluded that analysis of the theoretical electron density currently provides a more reliable basis for the determination of the transferability of QTAIMC descriptors for molecular structures. Empirical corrections making the experimentally derived QTAIMC bond descriptors more transferable are proposed.
18
Abstract
This chapter reviews the application of fragment descriptors at different stages of virtual screening: filtering, similarity search, and direct activity assessment using QSAR/QSPR models. Several case studies are considered. It is demonstrated that the power of fragment descriptors stems from their universality, very high computational efficiency, simplicity of interpretation, and versatility.
Affiliation(s)
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR7177 CNRS, University of Strasbourg, Strasbourg, France
19
Kondratovich EP, Zhokhova NI, Baskin II, Palyulin VA, Zefirov NS. Fragmental descriptors in (Q)SAR: prediction of the assignment of organic compounds to pharmacological groups using the support vector machine approach. Russ Chem Bull 2010. [DOI: 10.1007/s11172-009-0076-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/19/2022]
20
Zhokhova NI, Baskin II, Zefirov AN, Palyulin VA, Zefirov NS. Pseudofragmental descriptors based on combinations of atomic properties for prediction of physical properties of polymers in quantitative structure-property relationship studies. Doklady Chemistry 2010. [DOI: 10.1134/s0012500810020023] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
21
Sirimulla S, Lerma M, Herndon WC. Prediction of Partial Molar Volumes of Amino Acids and Small Peptides: Counting Atoms versus Topological Indices. J Chem Inf Model 2010; 50:194-204. [PMID: 20058884 DOI: 10.1021/ci900318c] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/30/2022]
Affiliation(s)
- Suman Sirimulla
  - Department of Chemistry, University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968
- Maricarmen Lerma
  - Department of Chemistry, University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968
- William C. Herndon
  - Department of Chemistry, University of Texas at El Paso, 500 West University Avenue, El Paso, Texas 79968
22
Baskin II, Zhokhova NI, Palyulin VA, Zefirov AN, Zefirov NS. Multilevel approach to the prediction of properties of organic compounds in the framework of the QSAR/QSPR methodology. Doklady Chemistry 2009. [DOI: 10.1134/s0012500809070076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
23
Katritzky AR, Pacureanu LM, Slavov SH, Dobchev DA, Karelson M. QSPR Study of Critical Micelle Concentrations of Nonionic Surfactants. Ind Eng Chem Res 2008. [DOI: 10.1021/ie800954k] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Affiliation(s)
- Alan R. Katritzky
  - Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia
- Liliana M. Pacureanu
  - Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia
- Svetoslav H. Slavov
  - Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia
- Dimitar A. Dobchev
  - Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia
- Mati Karelson
  - Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, Institute of Chemistry of Romanian Academy, M. Viteazul 24, Timisoara 300223, Romania, Institute of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn 19086, Estonia, and MolCode Ltd., Soola 8, Tartu 51013, Estonia
24
Zhokhova NI, Baskin II, Palyulin VA, Zefirov AN, Zefirov NS. Fragmental descriptors with labeled atoms and their application in QSAR/QSPR studies. Doklady Chemistry 2007. [DOI: 10.1134/s0012500807120026] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Indexed: 11/23/2022]
25
Kravtsov AA, Karpov PV, Baskin II, Palyulin VA, Zefirov NS. “Bimolecular” QSPR: Estimation of the solvation free energy of organic molecules in different solvents. Doklady Chemistry 2007. [DOI: 10.1134/s0012500807050072] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
26
Ivanova AA, Baskin II, Palyulin VA, Zefirov NS. Estimation of ionization constants for different classes of organic compounds with the use of the fragmental approach to the search of structure-property relationships. Doklady Chemistry 2007. [DOI: 10.1134/s0012500807040040] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
27
Héberger K. Quantitative structure-(chromatographic) retention relationships. J Chromatogr A 2007; 1158:273-305. [PMID: 17499256 DOI: 10.1016/j.chroma.2007.03.108] [Citation(s) in RCA: 268] [Impact Index Per Article: 15.8] [Received: 01/25/2007] [Revised: 03/13/2007] [Accepted: 03/19/2007] [Indexed: 01/30/2023]
Abstract
Since the pioneering works of Kaliszan (R. Kaliszan, Quantitative Structure-Chromatographic Retention Relationships, Wiley, New York, 1987; and R. Kaliszan, Structure and Retention in Chromatography. A Chemometric Approach, Harwood Academic, Amsterdam, 1997) no comprehensive summary is available in the field. Present review covers the period of 1996-August 2006. The sources are grouped according to the special properties of kinds of chromatography: Quantitative structure-retention relationship in gas chromatography, in planar chromatography, in column liquid chromatography, in micellar liquid chromatography, affinity chromatography and quantitative structure enantioselective retention relationships. General tendencies, misleading practice and conclusions, validation of the models, suggestions for future works are summarized for each sub-field. Some straightforward applications are emphasized but standard ones. The sources and the model compounds, descriptors, predicted retention data, modeling methods and indicators of their performance, validation of models, and stationary phases are collected in the tables. Some important conclusions are: Not all physicochemical descriptors correlate with the retention data strongly; the heat of formation is not related to the chromatographic retention. It is not appropriate to give the errors of Kovats indices in percentages. The apparently low values (1-3%) can disorient the reviewers and readers. Contemporary mean interlaboratory reproducibility of Kovats indices are about 5-10 i.u. for standard non polar phases and 10-25 i.u. for standard polar phases. The predictive performance of QSRR models deteriorates as the polarity of GC stationary phase increases. The correlation coefficient alone is not a particularly good indicator for the model performance. Residuals are more useful than plots of measured and calculated values. 
There is no need to give the retention data in the form of an equation if the number of compounds is small. The domain of applicability of the models should be given in all cases.
Affiliation(s)
- Károly Héberger
  - Chemical Research Center, Hungarian Academy of Sciences, P.O. Box 17, H-1525 Budapest, Hungary.
28
Zhokhova NI, Palyulin VA, Baskin II, Zefirov AN, Zefirov NS. Fragment descriptors in the QSPR method: Their use for calculating the enthalpies of vaporization of organic substances. Russian Journal of Physical Chemistry A 2007. [DOI: 10.1134/s0036024407010037] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
29
Estrada E, Uriarte E, Molina E, Simón-Manso Y, Milne GWA. An Integrated in Silico Analysis of Drug-Binding to Human Serum Albumin. J Chem Inf Model 2006; 46:2709-24. [PMID: 17125211 DOI: 10.1021/ci600274f] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Indexed: 12/22/2022]
Abstract
Approaches such as quantitative structure-activity relationships (QSAR) and molecular modeling are integrated with the study of complex networks to understand drug binding to human serum albumin (HSA). A robust QSAR model using the topological substructural molecular descriptors/design (TOPS-MODE) approach has been derived and shows good predictability and interpretability in terms of structural contribution to drug binding to HSA. A perfect agreement exists between the group/fragment contributions found by TOPS-MODE and the specific interactions of drugs with HSA. These results indicate a preponderant contribution of hydrophobic regions of drugs to the specific binding to drug-binding sites 1 and 2 in HSA and specific roles of polar groups which anchor drugs to HSA binding sites. The occurrence of fragments contributing to drug binding to HSA can be represented by complex networks. The fragment-to-fragment complex network displays "small-world" and "scale-free" characteristics and in this way is similar to other complex networks including biological, social, and technological networks. A small number of fragments appear very frequently in most drugs. These molecular "empathic" fragments are good candidates for guiding future drug discovery research.
Affiliation(s)
- Ernesto Estrada
  - Complex Systems Research Group, X-rays Unit, Edificio CACTUS, Santiago de Compostela 15982, Spain.
30
Liao Q, Yao J, Yuan S. SVM approach for predicting LogP. Mol Divers 2006; 10:301-9. [PMID: 17031534 DOI: 10.1007/s11030-006-9036-2] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 10/08/2005] [Accepted: 10/27/2005] [Indexed: 01/04/2023]
Abstract
The logarithm of the partition coefficient between n-octanol and water (logP) is an important parameter for drug discovery. Based on a comparison of several logP prediction models, i.e. Support Vector Machines (SVM), Partial Least Squares (PLS) and Multiple Linear Regression (MLR), the authors report that the SVM model performs best.
Affiliation(s)
- Quan Liao
  - Department of Computer Chemistry and Chemoinformatics, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai, China
31
Rybolt TR, Ziegler KA, Thomas HE, Boyd JL, Ridgeway ME. Adsorption energies for a nanoporous carbon from gas–solid chromatography and molecular mechanics. J Colloid Interface Sci 2006; 296:41-50. [PMID: 16168430 DOI: 10.1016/j.jcis.2005.08.057] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 06/21/2005] [Revised: 08/09/2005] [Accepted: 08/25/2005] [Indexed: 11/25/2022]
Abstract
Gas-solid chromatography was used to obtain second gas-solid virial coefficients, B₂ₛ, in the temperature range 342-613 K for methane, ethane, propane, butane, 2-methylpropane, chloromethane, chlorodifluoromethane, dichloromethane, and dichlorodifluoromethane. The adsorbent used was Carbosieve S-III (Supelco), a carbon powder with fairly uniform, predominantly 0.55 nm slit width pores and a N₂ BET surface area of 995 m²/g. The temperature dependence of B₂ₛ was used to determine experimental values of the gas-solid interaction energy, E*, for each of these molecular adsorbates. MM2 and MM3 molecular mechanics calculations were used to determine the gas-solid interaction energy, E*(cal), for each of the molecules on various flat and nanoporous model surfaces. The flat model consisted of three parallel graphene layers with each graphene layer containing 127 interconnected benzene rings. The nanoporous model consisted of two sets of three parallel graphene layers adjacent to one another but separated to represent the pore diameter. A variety of calculated adsorption energies, E*(cal), were compared and correlated to the experimental E* values. It was determined that simple molecular mechanics could be used to calculate an attraction energy parameter between an adsorbed molecule and the carbon surface. The best correlation between the E*(cal) and E* values was provided by a 0.50 nm nanoporous model using MM2 parameters.
Affiliation(s)
- Thomas R Rybolt
  - Department of Chemistry, University of Tennessee at Chattanooga, Chattanooga, TN 37403, USA.
32
Solov’ev VP, Kireeva NV, Tsivadze AY, Varnek AA. Structure-property modelling of complex formation of strontium with organic ligands in water. J Struct Chem 2006. [DOI: 10.1007/s10947-006-0300-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
33
Varnek A, Fourches D, Solov'ev VP, Baulin VE, Turanov AN, Karandashev VK, Fara D, Katritzky AR. "In silico" design of new uranyl extractants based on phosphoryl-containing podands: QSPR studies, generation and screening of virtual combinatorial library, and experimental tests. J Chem Inf Comput Sci 2005; 44:1365-82. [PMID: 15272845 DOI: 10.1021/ci049976b] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
This paper is devoted to computer-aided design of new extractants of the uranyl cation involving three main steps: (i) a QSPR study, (ii) generation and screening of a virtual combinatorial library, and (iii) synthesis of several predicted compounds and their experimental extraction studies. First, we performed a QSPR modeling of the distribution coefficient (logD) of uranyl extracted by phosphoryl-containing podands from water to 1,2-dichloroethane. Two different approaches were used: one based on classical structural and physicochemical descriptors (implemented in the CODESSA PRO program) and another one based on fragment descriptors (implemented in the TRAIL program). Three statistically significant models obtained with TRAIL involve as descriptors either sequences of atoms and bonds or atoms with their close environment (augmented atoms). The best models of CODESSA PRO include its own molecular descriptors as well as fragment descriptors obtained with TRAIL. At the second step, a virtual combinatorial library of 2024 podands has been generated with the CombiLib program, followed by the assessment of logD values using developed QSPR models. At the third step, eight of these hypothetical compounds were synthesized and tested experimentally. Comparison with experiment shows that developed QSPR models successfully predict logD values for 7 of 8 compounds from that "blind test" set.
Affiliation(s)
- A Varnek
  - Laboratoire d'Infochimie, UMR 7551 CNRS, Université Louis Pasteur, 4, rue B. Pascal, Strasbourg 67000, France.
34
Katritzky AR, Kuanar M, Fara DC, Karelson M, Acree WE, Solov'ev VP, Varnek A. QSAR modeling of blood:air and tissue:air partition coefficients using theoretical descriptors. Bioorg Med Chem 2005; 13:6450-63. [PMID: 16202613 DOI: 10.1016/j.bmc.2005.06.066] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Received: 04/27/2005] [Revised: 06/29/2005] [Accepted: 06/30/2005] [Indexed: 11/21/2022]
Abstract
Human blood:air, human and rat tissue (fat, brain, liver, muscle, and kidney):air partition coefficients of a diverse set of organic compounds were correlated and predicted using structural descriptors by employing CODESSA-PRO and ISIDA programs. Four- and five-descriptor regression models developed using CODESSA-PRO were validated on three different test sets. Overall, these models have reasonable values of correlation coefficients (R²) and leave-one-out correlation coefficients (R²cv): R² = 0.881-0.983; R²cv = 0.826-0.962. Calculations with ISIDA resulted in models based on atom/bond sequences involving two to three atoms with statistical parameters that were similar to those of models obtained with CODESSA-PRO (R² = 0.911-0.974; R²cv = 0.831-0.936). A mixed pool of molecular and fragment descriptors did not lead to significant improvement of the models.
Affiliation(s)
- Alan R Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, 32611, USA.
|
35
|
Varnek A, Fourches D, Hoonakker F, Solov'ev VP. Substructural fragments: an universal language to encode reactions, molecular and supramolecular structures. J Comput Aided Mol Des 2005; 19:693-703. [PMID: 16292611 DOI: 10.1007/s10822-005-9008-0] [Citation(s) in RCA: 132] [Impact Index Per Article: 6.9] [Received: 05/17/2005] [Accepted: 07/28/2005] [Indexed: 10/25/2022]
Abstract
Substructural fragments are proposed as a simple and safe way to encode molecular structures in a matrix containing the occurrence of fragments of a given type. The knowledge retrieved from QSPR modelling can also be stored in that matrix in addition to the information about fragments. Complex supramolecular systems (using special bond types) and chemical reactions (represented as Condensed Graphs of Reactions, CGR) can be treated similarly. The efficiency of fragments as descriptors has been demonstrated in QSPR studies of aqueous solubility for a diverse set of organic compounds as well as in the analysis of thermodynamic parameters for hydrogen-bonding in some supramolecular complexes. It has also been shown that CGR may be an interesting opportunity to perform similarity searches for chemical reactions. The relationship between the density of information in descriptors/knowledge matrices and the robustness of QSPR models is discussed.
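The idea of a matrix of fragment occurrences can be sketched as follows. This is a minimal illustration, assuming that fragments are atom/bond sequences of two to three atoms and that molecules are given as hand-written heavy-atom graphs; the function `path_fragments`, the canonical string form, and the two example molecules are assumptions, not the authors' software.

```python
from collections import Counter


def path_fragments(atoms, bonds, min_len=2, max_len=3):
    """Count atom/bond sequences (simple paths) of min_len..max_len atoms."""
    adj = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        adj[i].append((j, order))
        adj[j].append((i, order))
    counts = Counter()

    def extend(path, labels):
        if len(path) >= min_len and path[0] < path[-1]:
            # Each undirected path is reached from both ends; the index
            # test keeps one copy, and min() picks a canonical direction.
            fwd = "".join(labels)
            rev = "".join(reversed(labels))
            counts[min(fwd, rev)] += 1
        if len(path) == max_len:
            return
        for nxt, order in adj[path[-1]]:
            if nxt not in path:
                extend(path + [nxt], labels + [order, atoms[nxt]])

    for start in range(len(atoms)):
        extend([start], [atoms[start]])
    return counts


# Hypothetical heavy-atom graphs: (atom symbols, [(i, j, bond symbol)]).
molecules = {
    "ethanol": (["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")]),
    "acetic acid": (["C", "C", "O", "O"],
                    [(0, 1, "-"), (1, 2, "="), (1, 3, "-")]),
}

counts = {name: path_fragments(a, b) for name, (a, b) in molecules.items()}
columns = sorted(set().union(*counts.values()))
matrix = {name: [c[f] for f in columns] for name, c in counts.items()}

print(columns)
for name, row in matrix.items():
    print(name, row)
```

Each row is one molecule and each column one fragment type, so the rows can be used directly as descriptor vectors in a regression.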
Affiliation(s)
- A Varnek
- Laboratoire d'Infochimie, UMR 7551 CNRS, Université Louis Pasteur, 4, rue B. Pascal, Strasbourg 67000, France.
|
36
|
Faulon JL, Brown WM, Martin S. Reverse engineering chemical structures from molecular descriptors: how many solutions? J Comput Aided Mol Des 2005; 19:637-50. [PMID: 16267694 DOI: 10.1007/s10822-005-9007-1] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 05/12/2005] [Accepted: 07/28/2005] [Indexed: 01/04/2023]
Abstract
Physical, chemical and biological properties are the ultimate information of interest for chemical compounds. Molecular descriptors that map structural information to activities and properties are obvious candidates for information sharing. In this paper, we consider the feasibility of using molecular descriptors to safely exchange chemical information in such a way that the original chemical structures cannot be reverse engineered. To investigate the safety of sharing such descriptors, we compute the degeneracy (the number of structures matching a descriptor value) of several 2D descriptors, and use various methods to search for and reverse engineer structures. We examine degeneracy in the entire chemical space, taking descriptor values from the alkane isomer series and the PubChem database. We further use a stochastic search to retrieve structures matching specific topological index values. Finally, we investigate the safety of exchanging fragmental descriptors using deterministic enumeration.
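Degeneracy as defined in this abstract, the number of structures sharing one descriptor value, can be illustrated with a small sketch. It assumes the five hexane carbon skeletons as hand-written edge lists and two simple descriptors, the Wiener index and a count of terminal atoms; these choices and the function names are illustrative assumptions, not the descriptors or data sets used in the paper.

```python
from collections import Counter, deque


def wiener_index(n_atoms, edges):
    """Sum of shortest-path distances over all unordered atom pairs."""
    adj = [[] for _ in range(n_atoms)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    total = 0
    for start in range(n_atoms):
        dist = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
    return total // 2  # every pair was counted from both ends


def terminal_atoms(n_atoms, edges):
    """Number of atoms with exactly one neighbour."""
    degree = Counter()
    for i, j in edges:
        degree[i] += 1
        degree[j] += 1
    return sum(1 for a in range(n_atoms) if degree[a] == 1)


# Carbon skeletons of the five hexane isomers (hydrogens omitted).
HEXANES = {
    "n-hexane": [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)],
    "2-methylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (1, 5)],
    "3-methylpentane": [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)],
    "2,3-dimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (2, 5)],
    "2,2-dimethylbutane": [(0, 1), (1, 2), (2, 3), (1, 4), (1, 5)],
}


def degeneracy(descriptor):
    """Map each descriptor value to the number of isomers that share it."""
    return Counter(descriptor(6, edges) for edges in HEXANES.values())


print("Wiener index:", dict(degeneracy(wiener_index)))
print("Terminal atoms:", dict(degeneracy(terminal_atoms)))
```

On this toy set the Wiener index separates all five isomers, whereas the terminal-atom count maps several isomers to the same value, so a shared value of the latter reveals less about the underlying structure.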
Affiliation(s)
- Jean-Loup Faulon
- Computational Bioscience, Sandia National Laboratories, 969, Livermore, CA 94551-9292, USA.
|
37
|
A Study of the Affinity of Dyes for Cellulose Fiber within the Framework of a Fragment Approach in QSPR. RUSS J APPL CHEM+ 2005. [DOI: 10.1007/s11167-005-0439-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 10/25/2022]
|
38
|
Relationship Between Structure and Hepatoprotector Activity of Adamantane Derivatives. Part 2. Application of Autocorrelative, Substructural, and 3D Molecular Descriptors. Pharm Chem J 2005. [DOI: 10.1007/s11094-005-0102-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
|
39
|
Rücker C, Meringer M, Kerber A. QSPR Using MOLGEN-QSPR: The Challenge of Fluoroalkane Boiling Points. J Chem Inf Model 2004; 45:74-80. [PMID: 15667131 DOI: 10.1021/ci0497298] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
By means of the new software MOLGEN-QSPR, a multilinear regression model for the boiling points of lower fluoroalkanes is established. The model is based exclusively on simple descriptors derived directly from molecular structure and nevertheless describes a broader set of data more precisely than previous attempts that used either more demanding (quantum chemical) descriptors or more demanding (nonlinear) statistical methods such as neural networks. The model's internal consistency was confirmed by leave-one-out cross-validation. The model was used to predict all unknown boiling points of fluorobutanes, and the quality of predictions was estimated by means of comparison with boiling point predictions for fluoropentanes.
Affiliation(s)
- Christoph Rücker
- Department of Mathematics, Universität Bayreuth, D-95440 Bayreuth, Germany.
|
40
|
Solov'ev VP, Varnek A. Anti-HIV activity of HEPT, TIBO, and cyclic urea derivatives: structure-property studies, focused combinatorial library generation, and hits selection using substructural molecular fragments method. J Chem Inf Comput Sci 2004; 43:1703-19. [PMID: 14502505 DOI: 10.1021/ci020388c] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Abstract
The substructural molecular fragments (SMF) method [Solov'ev, V. P.; Varnek, A.; Wipff, G. J. Chem. Inf. Comput. Sci. 2000, 40, 847-858] was applied to assess anti-HIV activity for large data sets for three families of compounds: 1-[(2-hydroxyethoxy)methyl]-6-(phenylthio)thymine (HEPT) derivatives, tetrahydroimidazobenzodiazepinone (TIBO) derivatives, and cyclic urea (CU) derivatives. The SMF method uses 49 types of topological descriptors (atom/bond sequences and "augmented atoms") which, coupled with 3 linear and nonlinear fitting equations, allow the user to generate up to 147 structure-property models. For each family of compounds, the modeling was performed on several training sets, followed by validation calculations in which the three best-fit models were applied. The calculated activities reproduce the available experimental data well. On the basis of the "optimal" molecular fragments, a focused combinatorial library containing 252 virtual HEPT derivatives was generated. Its filtering led to several hits potentially possessing anti-HIV activity.
Affiliation(s)
- V P Solov'ev
- Institute of Physiologically Active Compounds, Russian Academy of Sciences, 142432, Chernogolovka, Moscow Region, Russia
|
41
|
Zhokhova NI, Baskin II, Palyulin VA, Zefirov AN, Zefirov NS. Fragment descriptors in QSPR: Application to magnetic susceptibility calculations. J STRUCT CHEM+ 2004. [DOI: 10.1007/s10947-005-0037-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]
|
42
|
Rybolt TR, Janeksela VE, Hooper DN, Thomas HE, Carrington NA, Williamson EJ. Predicting second gas–solid virial coefficients using calculated molecular properties on various carbon surfaces. J Colloid Interface Sci 2004; 272:35-45. [PMID: 14985020 DOI: 10.1016/j.jcis.2003.09.026] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 06/09/2003] [Accepted: 09/23/2003] [Indexed: 11/17/2022]
Abstract
Gas-solid chromatography was used to obtain values of the second gas-solid virial coefficient, B2s, in the temperature range from 343 to 493 K for seven adsorbate gases: methane, ethane, propane, chloromethane, chlorodifluoromethane, dimethyl ether, and sulfur hexafluoride. Carboxen-1000, a 1200 m²/g carbon molecular sieve (Supelco Inc.), was used as the adsorbent. These data were combined with earlier work to make a data set of 36 different adsorbate gases, each interacting with one to four different carbon surfaces. All B2s values were extrapolated to 403 K to create a set of 65 different gas-solid B2s values at a fixed temperature. The B2s value for a given gas-solid system can be converted to a chromatographic retention time at any desired flow rate and to the amount of gas adsorbed at any pressure in the low-coverage, Henry's law region. Beginning with a theoretical equation for the second gas-solid virial coefficient, various quantitative structure-retention relations (QSRR) were developed and used to correlate the B2s values for different gas adsorbates with different carbon surfaces. Two calculated adsorbate molecular parameters (molar refractivity and connectivity index), when combined with two adsorbent parameters (surface area and a surface energy contribution to the gas-solid interaction), provided an effective correlation (r² = 0.952) of the 65 different B2s values. The two surface parameters provided a simple yet useful representation of the structure and energy of the carbon surfaces, and thus our correlations considered variation in both the adsorbate gas and the adsorbent solid.
Affiliation(s)
- Thomas R Rybolt
- Department of Chemistry, University of Tennessee at Chattanooga, Chattanooga, TN 37403, USA.
|
43
|
Pompe M, Davis JM, Samuel CD. Prediction of Thermodynamic Parameters in Gas Chromatography from Molecular Structure: Hydrocarbons. J Chem Inf Comput Sci 2004; 44:399-409. [PMID: 15032518 DOI: 10.1021/ci0304268] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
Theoretical prediction of gas-chromatographic retention times could be used as an additional method for more accurate identification of organic compounds during GC/MS analysis. Two separate quantitative structure-property relationship models were introduced for the calculation of thermodynamic values (ΔH°, ΔS°) for aliphatic and aromatic hydrocarbons. These values are required for the calculation of retention times in temperature-programmed gas chromatography. Seven-descriptor and five-descriptor MLR models were selected for the calculation of ΔH° and ΔS° values, respectively, based on the best cross-validation abilities. The final prediction capabilities of the models were evaluated by a test set procedure. RMS errors calculated from the test set were 207 cal mol⁻¹ and 0.58 cal mol⁻¹ K⁻¹ for the ΔH° and ΔS° prediction models, respectively. To evaluate the error of the models on the time scale, several chromatograms were simulated using experimental Pro ezGC and theoretically calculated thermodynamic data. Afterward, the standard deviation of the retention time residuals was calculated. It was found that, although the standard deviation varies from one chromatographic condition to another, the ratio between the standard deviation and the maximum available separation space for the particular set of organic compounds remains constant, at around 5% of the maximum separation space available at the selected chromatographic conditions. Our prediction model was able to accurately differentiate between the retention times of consecutive compounds in the n-alkane, 1-alkene, and 2-alkene homologous series.
Affiliation(s)
- Matevz Pompe
- Faculty of Chemistry and Chemical Technology, University of Ljubljana, Askerceva 5, 1000 Ljubljana, Slovenia.
|
44
|
Katritzky AR, Fara DC, Yang H, Karelson M, Suzuki T, Solov'ev VP, Varnek A. Quantitative Structure−Property Relationship Modeling of β-Cyclodextrin Complexation Free Energies. J Chem Inf Comput Sci 2004; 44:529-41. [PMID: 15032533 DOI: 10.1021/ci034190j] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.9] [Indexed: 11/30/2022]
Abstract
CODESSA-PRO was used to model binding energies for 1:1 complexation systems between 218 organic guest molecules and β-cyclodextrin, using a seven-parameter equation with R² = 0.796 and R²cv = 0.779. Fragment-based TRAIL calculations gave a better fit with R² = 0.943 and R²cv = 0.848 for 195 data points in the database. The advantages and disadvantages of each approach are discussed, and it is concluded that a combination of the two approaches has much promise from a practical viewpoint.
Affiliation(s)
- Alan R Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611, USA.
|