1
Zhang R, Wang B, Li L, Li S, Guo H, Zhang P, Hua Y, Cui X, Li Y, Mu Y, Huang X, Li X. Modeling and insights into the structural characteristics of endocrine-disrupting chemicals. ECOTOXICOLOGY AND ENVIRONMENTAL SAFETY 2023; 263:115251. [PMID: 37451095 DOI: 10.1016/j.ecoenv.2023.115251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2023] [Revised: 07/03/2023] [Accepted: 07/09/2023] [Indexed: 07/18/2023]
Abstract
Endocrine-disrupting chemicals (EDCs) can cause serious harm to human health and the environment; therefore, it is important to rapidly and correctly identify EDCs. Different computational models have been proposed for the prediction of EDCs over the past few decades, but the reported models are not always easily available, and few studies have investigated the structural characteristics of EDCs. In the present study, we have developed a series of artificial intelligence models targeting EDC receptors: the androgen receptor (AR); estrogen receptor (ER); and pregnane X receptor (PXR). The consensus models achieved good predictive results for validation sets with balanced accuracy values of 87.37%, 90.13%, and 79.21% for AR, ER, and PXR binding assays, respectively. Analysis of the physical-chemical properties suggested that several chemical properties were significantly (p < 0.05) different between EDCs and non-EDCs. We also identified structural alerts that can indicate an EDC, which were integrated into the web server SApredictor. These models and structural characteristics can provide useful tools and information in the discrimination and mechanistic understanding of EDCs in drug discovery and environmental risk assessment.
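Balanced accuracy, the metric quoted in this abstract for the AR, ER, and PXR consensus models, is the mean of sensitivity and specificity, so it stays informative when actives and inactives are unequally represented. A minimal sketch of that calculation follows; the function name and the toy labels are illustrative assumptions, not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = active, 0 = inactive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # fraction of actives recovered
    specificity = tn / (tn + fp)  # fraction of inactives recovered
    return (sensitivity + specificity) / 2


# Hypothetical labels: 4 binders and 6 non-binders.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
print(f"{balanced_accuracy(y_true, y_pred):.4f}")  # prints 0.7917
```

In this toy case plain accuracy would be 0.80, while balanced accuracy weights the smaller active class equally with the larger inactive class.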
Affiliation(s)
- Ruiqiu Zhang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Bailun Wang
- Department of Anesthesiology and perioperative medicine, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Institute of Anesthesia and Respiratory Intensive Care Medicine, Jinan 250014, China
- Ling Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Shengjie Li
- Department of Neurosurgery, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan 250014, China
- Huizhu Guo
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Pei Zhang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yuqing Hua
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xueyan Cui
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yan Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Yan Mu
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xin Huang
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China
- Xiao Li
- Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Shandong Engineering and Technology Research Center for Pediatric Drug Development, Shandong Medicine and Health Key Laboratory of Clinical Pharmacy, Jinan 250014, China.
2
Machine learning modelling of chemical reaction characteristics: yesterday, today, tomorrow. MENDELEEV COMMUNICATIONS 2021. [DOI: 10.1016/j.mencom.2021.11.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/23/2022]
3
Varnek A, Baskin II. Modern Trends in Chemical Reactions Modeling. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11543-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2022]
4
Glavatskikh M, Madzhidov T, Baskin II, Horvath D, Nugmanov R, Gimadiev T, Marcou G, Varnek A. Visualization and Analysis of Complex Reaction Data: The Case of Tautomeric Equilibria. Mol Inform 2018; 37:e1800056. [DOI: 10.1002/minf.201800056] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/24/2018] [Accepted: 06/29/2018] [Indexed: 11/07/2022]
Affiliation(s)
- Marta Glavatskikh
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institut of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institut of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Igor I. Baskin
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institut of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Faculty of Physics; Lomonosov Moscow State University; Leninskie Gory 1/2 119991 Moscow Russia
- Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institut of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Timur Gimadiev
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institut of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
5
Zhokhov AK, Loskutov AY, Rybal’chenko IV. Methodological Approaches to the Calculation and Prediction of Retention Indices in Capillary Gas Chromatography. JOURNAL OF ANALYTICAL CHEMISTRY 2018. [DOI: 10.1134/s1061934818030127] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 11/23/2022]
6
Abstract
Various methods of machine learning, supervised and unsupervised, linear and nonlinear, classification and regression, in combination with various types of molecular descriptors, both "handcrafted" and "data-driven," are considered in the context of their use in computational toxicology. The use of multiple linear regression, variants of naïve Bayes classifier, k-nearest neighbors, support vector machine, decision trees, ensemble learning, random forest, several types of neural networks, and deep learning is the focus of attention of this review. The role of fragment descriptors, graph mining, and graph kernels is highlighted. The application of unsupervised methods, such as Kohonen's self-organizing maps and related approaches, which allow for combining predictions with data analysis and visualization, is also considered. The necessity of applying a wide range of machine learning methods in computational toxicology is underlined.
Affiliation(s)
- Igor I Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russian Federation.
- Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russian Federation.
7
Li X, Zhang Y, Li H, Zhao Y. Modeling of the hERG K+ Channel Blockage Using Online Chemical Database and Modeling Environment (OCHEM). Mol Inform 2017; 36. [PMID: 28857516 DOI: 10.1002/minf.201700074] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 05/25/2017] [Accepted: 08/08/2017] [Indexed: 11/06/2022]
Abstract
Human ether-a-go-go related gene (hERG) K+ channel plays an important role in cardiac action potential. Blockage of hERG channel may result in long QT syndrome (LQTS), even cause sudden cardiac death. Many drugs have been withdrawn from the market because of the serious hERG-related cardiotoxicity. Therefore, it is quite essential to estimate the chemical blockage of hERG in the early stage of drug discovery. In this study, a diverse set of 3721 compounds with hERG inhibition data was assembled from literature. Then, we make full use of the Online Chemical Modeling Environment (OCHEM), which supplies rich machine learning methods and descriptor sets, to build a series of classification models for hERG blockage. We also generated two consensus models based on the top-performing individual models. The consensus models performed much better than the individual models both on 5-fold cross validation and external validation. Especially, consensus model II yielded the prediction accuracy of 89.5 % and MCC of 0.670 on external validation. This result indicated that the predictive power of consensus model II should be stronger than most of the previously reported models. The 17 top-performing individual models and the consensus models and the data sets used for model development are available at https://ochem.eu/article/103592.
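The abstract reports accuracy and the Matthews correlation coefficient (MCC) for consensus models but does not say how the consensus is formed. The sketch below assumes the simplest option, averaging the predicted probabilities of the individual models and thresholding at 0.5; the function names, the threshold, and the toy probabilities are all hypothetical.

```python
import math


def consensus_predict(model_probs, threshold=0.5):
    """Average per-compound probabilities across models, then apply a threshold."""
    n_models = len(model_probs)
    averaged = [sum(column) / n_models for column in zip(*model_probs)]
    return [1 if p >= threshold else 0 for p in averaged]


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = blocker, 0 = non-blocker)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when a row or column of the confusion matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical blocker probabilities from three individual models for five compounds.
model_probs = [
    [0.9, 0.2, 0.6, 0.4, 0.7],
    [0.8, 0.4, 0.3, 0.1, 0.6],
    [0.7, 0.3, 0.7, 0.3, 0.4],
]
y_true = [1, 0, 1, 0, 0]
y_pred = consensus_predict(model_probs)
print(y_pred, round(mcc(y_true, y_pred), 3))  # prints [1, 0, 1, 0, 1] 0.667
```

Averaging lets a compound that one model misclassifies (the third compound for the second model here) still be labelled correctly when the other models agree.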
Affiliation(s)
- Xiao Li
- Beijing Computing Center, Beijing Academy of Science and Technology, 7 Fengxian road, Beijing, 100094, China; Beijing Beike Deyuan Bio-Pharm Technology Co.Ltd, 7 Fengxian road, Beijing, 100094, China
- Yuan Zhang
- Beijing Beike Deyuan Bio-Pharm Technology Co.Ltd, 7 Fengxian road, Beijing, 100094, China
- Huanhuan Li
- Beijing Beike Deyuan Bio-Pharm Technology Co.Ltd, 7 Fengxian road, Beijing, 100094, China
- Yong Zhao
- Beijing Computing Center, Beijing Academy of Science and Technology, 7 Fengxian road, Beijing, 100094, China; Beijing Beike Deyuan Bio-Pharm Technology Co.Ltd, 7 Fengxian road, Beijing, 100094, China
8
3D molecular fragment descriptors for structure–property modeling: predicting the free energies for the complexation between antipodal guests and β-cyclodextrins. J INCL PHENOM MACRO 2017. [DOI: 10.1007/s10847-017-0739-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/19/2022]
9
Radchenko EV, Rulev YA, Safanyaev AY, Palyulin VA, Zefirov NS. Computer-aided estimation of the hERG-mediated cardiotoxicity risk of potential drug components. DOKL BIOCHEM BIOPHYS 2017; 473:128-131. [DOI: 10.1134/s1607672917020107] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 07/12/2016] [Indexed: 11/22/2022]
10
Dyabina AS, Radchenko EV, Palyulin VA, Zefirov NS. Prediction of blood-brain barrier permeability of organic compounds. DOKL BIOCHEM BIOPHYS 2016; 470:371-374. [DOI: 10.1134/s1607672916050173] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 04/29/2016] [Indexed: 11/22/2022]
11
Tetko IV, Maran U, Tropsha A. Public (Q)SAR Services, Integrated Modeling Environments, and Model Repositories on the Web: State of the Art and Perspectives for Future Development. Mol Inform 2016; 36. [PMID: 27778468 DOI: 10.1002/minf.201600082] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 06/08/2016] [Accepted: 10/03/2016] [Indexed: 01/08/2023]
Abstract
Thousands of (Quantitative) Structure-Activity Relationships (Q)SAR models have been described in peer-reviewed publications; however, this way of sharing seldom makes models available for the use by the research community outside of the developer's laboratory. Conversely, on-line models allow broad dissemination and application representing the most effective way of sharing the scientific knowledge. Approaches for sharing and providing on-line access to models range from web services created by individual users and laboratories to integrated modeling environments and model repositories. This emerging transition from the descriptive and informative, but "static", and for the most part, non-executable print format to interactive, transparent and functional delivery of "living" models is expected to have a transformative effect on modern experimental research in areas of scientific and regulatory use of (Q)SAR models.
Affiliation(s)
- Igor V Tetko
- Institute of Structural Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany; BigChem GmbH, Ingolstädter Landstraße 1, b. 60w, D-85764 Neuherberg, Germany
- Uko Maran
- Institute of Chemistry, University of Tartu, Ravila 14A, Tartu, 50411, Estonia
- Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, 27599, USA; Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya St. 18, 420008, Kazan, Russia
12
Tetko IV, Varbanov HP, Galanski MS, Talmaciu M, Platts JA, Ravera M, Gabano E. Prediction of logP for Pt(II) and Pt(IV) complexes: Comparison of statistical and quantum-chemistry based approaches. J Inorg Biochem 2016; 156:1-13. [PMID: 26717258 DOI: 10.1016/j.jinorgbio.2015.12.006] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 09/23/2015] [Revised: 11/19/2015] [Accepted: 12/09/2015] [Indexed: 01/31/2023]
Abstract
The octanol/water partition coefficient, logP, is one of the most important physico-chemical parameters for the development of new metal-based anticancer drugs with improved pharmacokinetic properties. This study addresses an issue with the absence of publicly available models to predict logP of Pt(IV) complexes. Following data collection and subsequent development of models based on 187 complexes from literature, we validate new and previously published models on a new set of 11 Pt(II) and 35 Pt(IV) complexes, which were kept blind during the model development step. The error of the consensus model, 0.65 for Pt(IV) and 0.37 for Pt(II) complexes, indicates its good accuracy of predictions. The lower accuracy for Pt(IV) complexes was attributed to experimental difficulties with logP measurements for some poorly-soluble compounds. This model was developed using general-purpose descriptors such as extended functional groups, molecular fragments and E-state indices. Surprisingly, models based on quantum-chemistry calculations provided lower prediction accuracy. We also found that all the developed models strongly overestimate logP values for the three complexes measured in the presence of DMSO. Considering that DMSO is frequently used as a solvent to store chemicals, its effect should not be overlooked when logP measurements by means of the shake flask method are performed. The final models are freely available at http://ochem.eu/article/76903.
Affiliation(s)
- Igor V Tetko
- Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstaedter Landstrasse 1, b. 60w, D-85764 Neuherberg, Germany; BigChem GmbH, Ingolstaedter Landstrasse 1, b. 60w, D-85764 Neuherberg, Germany.
- Hristo P Varbanov
- Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), CH-1015 Lausanne, Switzerland; Institute of Inorganic Chemistry, University of Vienna, Waehringer Strasse 42, A-1090 Vienna, Austria
- Mathea S Galanski
- Institute of Inorganic Chemistry, University of Vienna, Waehringer Strasse 42, A-1090 Vienna, Austria
- Mona Talmaciu
- School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, UK; «Iuliu Haţieganu» University of Medicine and Pharmacy, Faculty of Pharmacy, Analytical Chemistry Department, Cluj-Napoca, Romania
- James A Platts
- School of Chemistry, Cardiff University, Park Place, Cardiff CF10 3AT, UK
- Mauro Ravera
- Dipartimento di Scienze e Innovazione Tecnologica, Università del Piemonte Orientale, Viale Teresa Michel 11, 15121 Alessandria, Italy
- Elisabetta Gabano
- Dipartimento di Scienze e Innovazione Tecnologica, Università del Piemonte Orientale, Viale Teresa Michel 11, 15121 Alessandria, Italy
13
Sosnin SB, Radchenko EV, Palyulin VA, Zefirov NS. Generalized fragmental approach in QSAR/QSPR studies. DOKLADY CHEMISTRY 2015. [DOI: 10.1134/s0012500815070071] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
14
Kurilo MN, Ryzhkov FV, Karpov PV, Radchenko EV, Palyulin VA, Zefirov NS. Molecular design of selective ligands of chemokine receptors. DOKL BIOCHEM BIOPHYS 2015; 461:131-4. [DOI: 10.1134/s1607672915020167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/20/2014] [Indexed: 11/23/2022]
15
Nugmanov RI, Madzhidov TI, Khaliullina GR, Baskin II, Antipin IS, Varnek AA. Development of “structure-property” models in nucleophilic substitution reactions involving azides. J STRUCT CHEM+ 2015. [DOI: 10.1134/s0022476614060043] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 11/23/2022]
16
Sitnikov GV, Zhokhova NI, Ustynyuk YA, Varnek A, Baskin II. Continuous indicator fields: a novel universal type of molecular fields. J Comput Aided Mol Des 2014; 29:233-47. [DOI: 10.1007/s10822-014-9818-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/10/2014] [Accepted: 11/24/2014] [Indexed: 11/25/2022]
17
Madzhidov TI, Polishchuk PG, Nugmanov RI, Bodrov AV, Lin AI, Baskin II, Varnek AA, Antipin IS. Structure-reactivity relationships in terms of the condensed graphs of reactions. RUSSIAN JOURNAL OF ORGANIC CHEMISTRY 2014. [DOI: 10.1134/s1070428014040010] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 11/23/2022]
18
Vorberg S, Tetko IV. Modeling the Biodegradability of Chemical Compounds Using the Online CHEmical Modeling Environment (OCHEM). Mol Inform 2013; 33:73-85. [PMID: 27485201 PMCID: PMC5175213 DOI: 10.1002/minf.201300030] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 02/21/2013] [Accepted: 10/11/2013] [Indexed: 11/10/2022]
Abstract
Biodegradability describes the capacity of substances to be mineralized by free‐living bacteria. It is a crucial property in estimating a compound’s long‐term impact on the environment. The ability to reliably predict biodegradability would reduce the need for laborious experimental testing. However, this endpoint is difficult to model due to unavailability or inconsistency of experimental data. Our approach makes use of the Online Chemical Modeling Environment (OCHEM) and its rich supply of machine learning methods and descriptor sets to build classification models for ready biodegradability. These models were analyzed to determine the relationship between characteristic structural properties and biodegradation activity. The distinguishing feature of the developed models is their ability to estimate the accuracy of prediction for each individual compound. The models developed using seven individual descriptor sets were combined in a consensus model, which provided the highest accuracy. The identified overrepresented structural fragments can be used by chemists to improve the biodegradability of new chemical compounds. The consensus model, the datasets used, and the calculated structural fragments are publicly available at http://ochem.eu/article/31660.
Affiliation(s)
- Susann Vorberg
- Institute of Structural Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany tel: +49-89-3187-3575; fax: +49-89-3187-3585
- Igor V Tetko
- Institute of Structural Biology, Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany tel: +49-89-3187-3575; fax: +49-89-3187-3585; Chemistry Department, Faculty of Science, King Abdulaziz University, P. O. Box 80203, Jeddah 21589, Saudi Arabia; eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Germany.
19
Development of conformation independent computational models for the early recognition of breast cancer resistance protein substrates. BIOMED RESEARCH INTERNATIONAL 2013; 2013:863592. [PMID: 23984415 PMCID: PMC3747366 DOI: 10.1155/2013/863592] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 04/30/2013] [Accepted: 06/25/2013] [Indexed: 01/08/2023]
Abstract
ABC efflux transporters are polyspecific members of the ABC superfamily that, acting as drug and metabolite carriers, provide a biochemical barrier against drug penetration and contribute to detoxification. Their overexpression is linked to multidrug resistance issues in a diversity of diseases. Breast cancer resistance protein (BCRP) is the most expressed ABC efflux transporter throughout the intestine and the blood-brain barrier, limiting oral absorption and brain bioavailability of its substrates. Early recognition of BCRP substrates is thus essential to optimize oral drug absorption, design of novel therapeutics for central nervous system conditions, and overcome BCRP-mediated cross-resistance issues. We present the development of an ensemble of ligand-based machine learning algorithms for the early recognition of BCRP substrates, from a database of 262 substrates and nonsubstrates compiled from the literature. Such dataset was rationally partitioned into training and test sets by application of a 2-step clustering procedure. The models were developed through application of linear discriminant analysis to random subsamples of Dragon molecular descriptors. Simple data fusion and statistical comparison of partial areas under the curve of ROC curves were applied to obtain the best 2-model combination, which presented 82% and 74.5% of overall accuracy in the training and test set, respectively.
20
Tetko IV, Novotarskyi S, Sushko I, Ivanov V, Petrenko AE, Dieden R, Lebon F, Mathieu B. Development of dimethyl sulfoxide solubility models using 163,000 molecules: using a domain applicability metric to select more reliable predictions. J Chem Inf Model 2013; 53:1990-2000. [PMID: 23855787 PMCID: PMC3760295 DOI: 10.1021/ci400213d] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.3] [Indexed: 12/02/2022]
Abstract
The dimethyl sulfoxide (DMSO) solubility data from Enamine and two UCB pharma compound collections were analyzed using 8 different machine learning methods and 12 descriptor sets. The analyzed data sets were highly imbalanced with 1.7–5.8% nonsoluble compounds. The libraries’ enrichment by soluble molecules from the set of 10% of the most reliable predictions was used to compare prediction performances of the methods. The highest accuracies were calculated using a C4.5 decision classification tree, random forest, and associative neural networks. The performances of the methods developed were estimated on individual data sets and their combinations. The developed models provided on average a 2-fold decrease of the number of nonsoluble compounds amid all compounds predicted as soluble in DMSO. However, a 4–9-fold enrichment was observed if only 10% of the most reliable predictions were considered. The structural features influencing compounds to be soluble or nonsoluble in DMSO were also determined. The best models developed with the publicly available Enamine data set are freely available online at http://ochem.eu/article/33409.
Affiliation(s)
- Igor V Tetko
- Helmholtz Zentrum München-German Research Center for Environmental Health-GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany.
21
The continuous molecular fields approach to building 3D-QSAR models. J Comput Aided Mol Des 2013; 27:427-42. [PMID: 23719959 DOI: 10.1007/s10822-013-9656-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 01/31/2013] [Accepted: 05/22/2013] [Indexed: 10/26/2022]
Abstract
The continuous molecular fields (CMF) approach is based on the application of continuous functions for the description of molecular fields instead of finite sets of molecular descriptors (such as interaction energies computed at grid nodes) commonly used for this purpose. These functions can be encapsulated into kernels and combined with kernel-based machine learning algorithms to provide a variety of novel methods for building classification and regression structure-activity models, visualizing chemical datasets and conducting virtual screening. In this article, the CMF approach is applied to building 3D-QSAR models for 8 datasets through the use of five types of molecular fields (the electrostatic, steric, hydrophobic, hydrogen-bond acceptor and donor ones), the linear convolution molecular kernel with the contribution of each atom approximated with a single isotropic Gaussian function, and the kernel ridge regression data analysis technique. It is shown that the CMF approach even in this simplest form provides either comparable or enhanced predictive performance in comparison with state-of-the-art 3D-QSAR methods.
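The kernel described in this abstract can be illustrated with a short sketch. It assumes each atom i contributes w_i * exp(-|r - r_i|^2 / (2 * sigma^2)) to a molecular field, in which case the integral over space of the product of two such fields reduces to a double sum over atom pairs. The function name, the weights, and the value of sigma are hypothetical, and the exact functional form and normalization used by the authors are not given in the abstract.

```python
import math


def cmf_kernel(mol_a, mol_b, sigma=1.0):
    """Overlap integral of two fields built from isotropic atomic Gaussians.

    Each molecule is a list of (weight, (x, y, z)) tuples. The weight stands in
    for an atomic property such as a partial charge (an assumption made here).
    """
    # Integrating the product of two Gaussians of width sigma over 3D space gives
    # (pi * sigma^2)^(3/2) * exp(-d^2 / (4 * sigma^2)), where d is the atom-atom distance.
    prefactor = (math.pi * sigma ** 2) ** 1.5
    total = 0.0
    for w_a, r_a in mol_a:
        for w_b, r_b in mol_b:
            d2 = sum((p - q) ** 2 for p, q in zip(r_a, r_b))
            total += w_a * w_b * math.exp(-d2 / (4 * sigma ** 2))
    return prefactor * total


# Hypothetical two-atom "molecules" in a shared, pre-aligned coordinate frame.
mol_1 = [(0.4, (0.0, 0.0, 0.0)), (-0.4, (1.2, 0.0, 0.0))]
mol_2 = [(0.3, (0.1, 0.0, 0.0)), (-0.3, (1.3, 0.0, 0.0))]
print(cmf_kernel(mol_1, mol_2))
```

A matrix of such kernel values between all training molecules is the kind of input a kernel-based learner such as kernel ridge regression takes. Because the kernel depends on absolute coordinates, the molecules are assumed to have been aligned beforehand.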
22
Makhaeva GF, Radchenko EV, Baskin II, Palyulin VA, Richardson RJ, Zefirov NS. Combined QSAR studies of inhibitor properties of O-phosphorylated oximes toward serine esterases involved in neurotoxicity, drug metabolism and Alzheimer's disease. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2012; 23:627-647. [PMID: 22587543 DOI: 10.1080/1062936x.2012.679690] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 05/31/2023]
Abstract
Oxime reactivation of serine esterases (EOHs) inhibited by organophosphorus (OP) compounds can produce O-phosphorylated oximes (POXs). Such oxime derivatives are of interest, because some of them can have greater anti-EOH potencies than the OP inhibitors from which they were derived. Accordingly, inhibitor properties of 58 POXs against four EOHs, along with pair-wise selectivities between them, have been analysed using different QSAR approaches. EOHs (with their abbreviations and consequences of inhibition in parentheses) comprised acetylcholinesterase (AChE: acute neurotoxicity; cognition enhancement), butyrylcholinesterase (BChE: inhibition of drug metabolism or stoichiometric scavenging of EOH inhibitors; cognition enhancement), carboxylesterase (CaE: inhibition of drug metabolism or stoichiometric scavenging of EOH inhibitors), and neuropathy target esterase (NTE: delayed neurotoxicity). QSAR techniques encompassed linear regression and backpropagation neural networks in conjunction with fragmental descriptors containing labelled atoms, Molecular Field Topology Analysis (MFTA), Comparative Molecular Similarity Index Analysis (CoMSIA), and molecular modelling. All methods provided mostly consistent and complementary information, and they revealed structural features controlling the 'esterase profiles', i.e. patterns of anti-EOH activities and selectivities of the compounds of interest. In addition, MFTA models were used to design a library of compounds having a cognition-enhancement esterase profile suitable for potential application to the treatment of Alzheimer's disease.
Affiliation(s)
- G F Makhaeva
- Institute of Physiologically Active Compounds, Chernogolovka, Moscow Region, Russia
23
Solov’ev VP, Kireeva N, Tsivadze AY, Varnek A. QSPR ensemble modelling of alkaline-earth metal complexation. J INCL PHENOM MACRO 2012. [DOI: 10.1007/s10847-012-0185-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 10/28/2022]
24
Varnek A, Baskin I. Machine learning methods for property prediction in chemoinformatics: Quo Vadis? J Chem Inf Model 2012; 52:1413-37. [PMID: 22582859 DOI: 10.1021/ci200409x] [Citation(s) in RCA: 148] [Impact Index Per Article: 12.3] [Indexed: 01/19/2023]
Abstract
This paper is focused on modern approaches to machine learning, most of which are as yet used infrequently or not at all in chemoinformatics. Machine learning methods are characterized in terms of the "modes of statistical inference" and "modeling levels" nomenclature and by considering different facets of the modeling with respect to input/output matching, data types, models duality, and models inference. Particular attention is paid to new approaches and concepts that may provide efficient solutions of common problems in chemoinformatics: improvement of predictive performance of structure-property (activity) models, generation of structures possessing desirable properties, model applicability domain, modeling of properties with functional endpoints (e.g., phase diagrams and dose-response curves), and accounting for multiple molecular species (e.g., conformers or tautomers).
Affiliation(s)
- Alexandre Varnek
- Laboratoire d'Infochimie, UMR 7177 CNRS, Université de Strasbourg, 4, rue B. Pascal, Strasbourg 67000, France.
|
25
|
Sushko I, Novotarskyi S, Körner R, Pandey AK, Rupp M, Teetz W, Brandmaier S, Abdelaziz A, Prokopenko VV, Tanchuk VY, Todeschini R, Varnek A, Marcou G, Ertl P, Potemkin V, Grishina M, Gasteiger J, Schwab C, Baskin II, Palyulin VA, Radchenko EV, Welsh WJ, Kholodovych V, Chekmarev D, Cherkasov A, Aires-de-Sousa J, Zhang QY, Bender A, Nigsch F, Patiny L, Williams A, Tkachenko V, Tetko IV. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information. J Comput Aided Mol Des 2011; 25:533-54. [PMID: 21660515 PMCID: PMC3131510 DOI: 10.1007/s10822-011-9440-2] [Citation(s) in RCA: 363] [Impact Index Per Article: 27.9] [Received: 02/22/2011] [Accepted: 05/24/2011] [Indexed: 11/25/2022]
Abstract
The Online Chemical Modeling Environment is a web-based platform that aims to automate and simplify the typical steps required for QSAR modeling. The platform consists of two major subsystems: the database of experimental measurements and the modeling framework. A user-contributed database contains a set of tools for easy input, search and modification of thousands of records. The OCHEM database is based on the wiki principle and focuses primarily on the quality and verifiability of the data. The database is tightly integrated with the modeling framework, which supports all the steps required to create a predictive model: data search, calculation and selection of a vast variety of molecular descriptors, application of machine learning methods, validation, analysis of the model and assessment of the applicability domain. As compared to other similar systems, OCHEM is not intended to re-implement the existing tools or models but rather to invite the original authors to contribute their results, make them publicly available, share them with other users and to become members of the growing research community. Our intention is to make OCHEM a widely used platform to perform the QSPR/QSAR studies online and share it with other users on the Web. The ultimate goal of OCHEM is collecting all possible chemoinformatics tools within one simple, reliable and user-friendly resource. The OCHEM is free for web users and it is available online at http://www.ochem.eu.
Affiliation(s)
- Iurii Sushko
- eADMET GmbH, Ingolstädter Landstraße 1, 85764, Neuherberg, Germany
|
26
|
Myint KZ, Xie XQ. Recent advances in fragment-based QSAR and multi-dimensional QSAR methods. Int J Mol Sci 2010; 11:3846-66. [PMID: 21152304 PMCID: PMC2996787 DOI: 10.3390/ijms11103846] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 08/27/2010] [Revised: 09/17/2010] [Accepted: 09/23/2010] [Indexed: 12/13/2022]
Abstract
This paper provides an overview of recently developed two dimensional (2D) fragment-based QSAR methods as well as other multi-dimensional approaches. In particular, we present recent fragment-based QSAR methods such as fragment-similarity-based QSAR (FS-QSAR), fragment-based QSAR (FB-QSAR), Hologram QSAR (HQSAR), and top priority fragment QSAR in addition to 3D- and nD-QSAR methods such as comparative molecular field analysis (CoMFA), comparative molecular similarity analysis (CoMSIA), Topomer CoMFA, self-organizing molecular field analysis (SOMFA), comparative molecular moment analysis (COMMA), autocorrelation of molecular surfaces properties (AMSP), weighted holistic invariant molecular (WHIM) descriptor-based QSAR (WHIM), grid-independent descriptors (GRIND)-based QSAR, 4D-QSAR, 5D-QSAR and 6D-QSAR methods.
Affiliation(s)
- Kyaw Zeyar Myint
- Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Xiang-Qun Xie
- Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Department of Pharmaceutical Sciences, School of Pharmacy, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Pittsburgh Chemical Methodologies & Library Development (PCMLD) and Pittsburgh Drug Discovery Institute, University of Pittsburgh, Pittsburgh, PA 15260, USA
- * Author to whom correspondence should be addressed; Tel.: +1-412-383-5276; Fax: +1-412-383-7436
|
27
|
Kurilo MN, Karpov PV, Baskin II, Palyulin VA, Zefirov NS. Neural network modeling of substituent constants on the basis of fragmental descriptors. DOKLADY CHEMISTRY 2010. [DOI: 10.1134/s0012500810030067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
28
|
Baskin II, Zhokhova NI, Palyulin VA, Zefirov AN, Zefirov NS. Multilevel approach to the prediction of properties of organic compounds in the framework of the QSAR/QSPR methodology. DOKLADY CHEMISTRY 2009. [DOI: 10.1134/s0012500809070076] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
|