101
|
The prediction of topologically partitioned intra-atomic and inter-atomic energies by the machine learning method kriging. Theor Chem Acc 2016. [DOI: 10.1007/s00214-016-1951-4] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/18/2023]
|
102
|
Kaur M, Silakari O. Identification of new dual spleen tyrosine kinase (Syk) and phosphoionositide-3-kinase δ (PI3Kδ) inhibitors using ligand and structure-based integrated ideal pharmacophore models. SAR and QSAR in Environmental Research 2016; 27:469-499. [PMID: 27431536 DOI: 10.1080/1062936x.2016.1209555] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 03/31/2016] [Accepted: 07/01/2016] [Indexed: 06/06/2023]
Abstract
Owing to the complex pathophysiology of autoimmune disorders, it is very challenging to develop successful treatment strategies. Single-target agents are not desired therapeutics for such multi-factorial disorders. Considering the current need for the treatment of complex autoimmune disorders, dual inhibitors of Syk and PI3Kδ have been designed using ligand and structure-based molecular modelling strategies. In the present work, structure and ligand-based pharmacophore modelling was implemented for a varied set of Syk and PI3Kδ inhibitors. Ligand-based pharmacophore models (LBPMs) were developed for the two kinases: ADPR.14 (r²train = 0.809) for Syk, comprising one hydrogen bond acceptor, one hydrogen bond donor, one positive ionisable and one ring aromatic feature, and AAARR.45 (r²train = 0.942) for PI3Kδ, consisting of three hydrogen bond acceptor and two ring aromatic features. The generated e-pharmacophore models revealed an additional ring aromatic and hydrophobic feature important for Syk and PI3Kδ inhibition, respectively. Subsequently, the LBPMs were modified, resulting in the APDRR.14 hypothesis for Syk inhibitors and the AAAHRR.45 hypothesis for PI3Kδ inhibitors, which were employed for virtual screening. Thus, the combination of ligand and structure-based pharmacophore modelling helped in developing ideal pharmacophore models that may be an efficient tool for the design of novel dual inhibitors of Syk and PI3Kδ.
Affiliation(s)
- M Kaur
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| | - O Silakari
- Molecular Modeling Lab (MML), Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala, India
| |
|
103
|
Norinder U, Boyer S. Conformal Prediction Classification of a Large Data Set of Environmental Chemicals from ToxCast and Tox21 Estrogen Receptor Assays. Chem Res Toxicol 2016; 29:1003-10. [DOI: 10.1021/acs.chemrestox.6b00037] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Indexed: 01/07/2023]
Affiliation(s)
- Ulf Norinder
- Swedish Toxicology Sciences Research Center, SE-151 36 Södertälje, Sweden
| | - Scott Boyer
- Swedish Toxicology Sciences Research Center, SE-151 36 Södertälje, Sweden
| |
|
104
|
Mangiatordi GF, Alberga D, Altomare CD, Carotti A, Catto M, Cellamare S, Gadaleta D, Lattanzi G, Leonetti F, Pisani L, Stefanachi A, Trisciuzzi D, Nicolotti O. Mind the Gap! A Journey towards Computational Toxicology. Mol Inform 2016; 35:294-308. [PMID: 27546034 DOI: 10.1002/minf.201501017] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 11/06/2015] [Accepted: 03/23/2016] [Indexed: 11/11/2022]
Abstract
Computational methods have advanced toxicology towards the development of target-specific models based on a clear cause-effect rationale. However, the predictive potential of these models presents strengths and weaknesses. On the good side, in silico models are valuable cheap alternatives to in vitro and in vivo experiments. On the other, the unconscious use of in silico methods can mislead end-users with elusive results. The focus of this review is on the basic scientific and regulatory recommendations in the derivation and application of computational models. Attention is paid to examine the interplay between computational toxicology and drug discovery and development. Avoiding the easy temptation of an overoptimistic future, we report our view on what can, or cannot, realistically be done. Indeed, studies of safety/toxicity represent a key element of chemical prioritization programs carried out by chemical industries, and primarily by pharmaceutical companies.
Affiliation(s)
- Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Domenico Alberga
- Dipartimento Interateneo di Fisica 'M. Merlin', Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Cosimo Damiano Altomare
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Angelo Carotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Marco Catto
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Saverio Cellamare
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Domenico Gadaleta
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Gianluca Lattanzi
- Dipartimento Interateneo di Fisica 'M. Merlin', Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Francesco Leonetti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Leonardo Pisani
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Angela Stefanachi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| |
|
105
|
Wang NN, Dong J, Deng YH, Zhu MF, Wen M, Yao ZJ, Lu AP, Wang JB, Cao DS. ADME Properties Evaluation in Drug Discovery: Prediction of Caco-2 Cell Permeability Using a Combination of NSGA-II and Boosting. J Chem Inf Model 2016; 56:763-73. [DOI: 10.1021/acs.jcim.5b00642] [Citation(s) in RCA: 98] [Impact Index Per Article: 12.3] [Indexed: 11/28/2022]
Affiliation(s)
- Ning-Ning Wang
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P. R. China
| | - Jie Dong
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P. R. China
| | - Yin-Hua Deng
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P. R. China
| | - Min-Feng Zhu
- School of Mathematics and Statistics, Central South University, Changsha 410083, P. R. China
| | - Ming Wen
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, P. R. China
| | - Zhi-Jiang Yao
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P. R. China
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, P. R. China
| | - Ai-Ping Lu
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, P. R. China
| | - Jian-Bing Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha 410083, P. R. China
| | - Dong-Sheng Cao
- School of Pharmaceutical Sciences, Central South University, Changsha 410013, P. R. China
- Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, P. R. China
| |
|
106
|
Raies AB, Bajic VB. In silico toxicology: computational methods for the prediction of chemical toxicity. Wiley Interdisciplinary Reviews. Computational Molecular Science 2016; 6:147-172. [PMID: 27066112 PMCID: PMC4785608 DOI: 10.1002/wcms.1240] [Citation(s) in RCA: 339] [Impact Index Per Article: 42.4] [Received: 08/10/2015] [Revised: 10/27/2015] [Accepted: 11/10/2015] [Indexed: 01/08/2023]
Abstract
Determining the toxicity of chemicals is necessary to identify their harmful effects on humans, animals, plants, or the environment. It is also one of the main steps in drug design. Animal models have been used for a long time for toxicity testing. However, in vivo animal tests are constrained by time, ethical considerations, and financial burden. Therefore, computational methods for estimating the toxicity of chemicals are considered useful. In silico toxicology is one type of toxicity assessment that uses computational methods to analyze, simulate, visualize, or predict the toxicity of chemicals. In silico toxicology aims to complement existing toxicity tests to predict toxicity, prioritize chemicals, guide toxicity tests, and minimize late-stage failures in drug design. There are various methods for generating models to predict toxicity endpoints. We provide a comprehensive overview of the existing modeling methods and algorithms for toxicity prediction, and explain and compare their strengths and weaknesses, with a particular (but not exclusive) emphasis on computational tools that can implement these methods, and we refer to expert systems that deploy the prediction models. Finally, we briefly review a number of new research directions in in silico toxicology and provide recommendations for designing in silico models.
Affiliation(s)
- Arwa B Raies
- King Abdullah University of Science and Technology (KAUST) Computational Bioscience Research Centre (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE) Thuwal Saudi Arabia
| | - Vladimir B Bajic
- King Abdullah University of Science and Technology (KAUST) Computational Bioscience Research Centre (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE) Thuwal Saudi Arabia
| |
|
107
|
Lei T, Li Y, Song Y, Li D, Sun H, Hou T. ADMET evaluation in drug discovery: 15. Accurate prediction of rat oral acute toxicity using relevance vector machine and consensus modeling. J Cheminform 2016; 8:6. [PMID: 26839598 PMCID: PMC4736633 DOI: 10.1186/s13321-016-0117-7] [Citation(s) in RCA: 85] [Impact Index Per Article: 10.6] [Received: 11/05/2015] [Accepted: 01/20/2016] [Indexed: 01/31/2023] Open
Abstract
Background
Determination of acute toxicity, expressed as median lethal dose (LD50), is one of the most important steps in the drug discovery pipeline. Because in vivo assays for oral acute toxicity in mammals are time-consuming and costly, there is an urgent need to develop in silico prediction models of oral acute toxicity.
Results
In this study, based on a comprehensive data set containing 7314 diverse chemicals with rat oral LD50 values, the relevance vector machine (RVM) technique was employed to build regression models for the prediction of oral acute toxicity in rats, which were compared with those built using six other machine learning approaches: k-nearest-neighbor regression, random forest (RF), support vector machine, local approximate Gaussian process, multilayer perceptron ensemble, and eXtreme gradient boosting. A subset of the original molecular descriptors and structural fingerprints (PubChem or SubFP) was chosen using Chi-squared statistics. The prediction capabilities of the individual QSAR models, measured by q²ext for the test set containing 2376 molecules, ranged from 0.572 to 0.659.
Conclusion
Considering the overall prediction accuracy for the test set, RVM with a Laplacian kernel and RF are recommended for building in silico models with better predictivity for rat oral acute toxicity. By combining the predictions from individual models, four consensus models were developed, yielding better prediction capabilities for the test set (q²ext = 0.669–0.689). Finally, some essential descriptors and substructures relevant to oral acute toxicity were identified and analyzed, and they may serve as property or substructure alerts to avoid toxicity. We believe that the best consensus model, with its high prediction accuracy, can be used as a reliable virtual screening tool to filter out compounds with high rat oral acute toxicity.
Workflow of combinatorial QSAR modelling to predict rat oral acute toxicity
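The consensus step and the external q² statistic mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not define either quantity, so this assumes a common form of external q² (one minus the ratio of the squared prediction errors to the squared deviations from the training-set mean) and an unweighted mean for the consensus. The function names `q2_ext` and `consensus` and all numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def q2_ext(y_true, y_pred, y_train_mean):
    """External q^2, assumed here to be 1 - PRESS / SS about the training mean."""
    press = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss = sum((t - y_train_mean) ** 2 for t in y_true)
    return 1.0 - press / ss


def consensus(per_model_predictions):
    """Unweighted mean of the predictions each model makes for the same compound."""
    return [fmean(values) for values in zip(*per_model_predictions)]


# Toy values, purely illustrative; they are not taken from the paper.
y_test = [2.0, 3.0, 4.0, 5.0]    # hypothetical measured values for test compounds
model_a = [2.5, 2.5, 4.5, 4.5]   # hypothetical predictions of one model
model_b = [1.5, 3.5, 3.5, 5.5]   # hypothetical predictions of another model
y_train_mean = 3.5               # hypothetical training-set mean

combined = consensus([model_a, model_b])
print(q2_ext(y_test, model_a, y_train_mean))
print(q2_ext(y_test, model_b, y_train_mean))
print(q2_ext(y_test, combined, y_train_mean))
```

In this toy case the two models err in opposite directions, so the averaged prediction scores higher than either model alone. Real consensus models may weight or select their members differently.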
Affiliation(s)
- Tailong Lei
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang People's Republic of China
| | - Youyong Li
- Institute of Functional Nano and Soft Materials (FUNSOM), Soochow University, Suzhou, 215123 Jiangsu People's Republic of China
| | - Yunlong Song
- Department of Medicinal Chemistry, School of Pharmacy, Second Military Medical University, Shanghai, 200433 People's Republic of China
| | - Dan Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang People's Republic of China
| | - Huiyong Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang People's Republic of China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang People's Republic of China ; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058 Zhejiang People's Republic of China
| |
|
108
|
Fang J, Pang X, Yan R, Lian W, Li C, Wang Q, Liu AL, Du GH. Discovery of neuroprotective compounds by machine learning approaches. RSC Adv 2016. [DOI: 10.1039/c5ra23035g] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 01/27/2023] Open
Abstract
The classification models were constructed to discover neuroprotective compounds against glutamate or H2O2-induced neurotoxicity through machine learning approaches.
Affiliation(s)
- Jiansong Fang
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
- Institute of Clinical Pharmacology
| | - Xiaocong Pang
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
| | - Rong Yan
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
| | - Wenwen Lian
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
| | - Chao Li
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
| | - Qi Wang
- Institute of Clinical Pharmacology, Guangzhou University of Traditional Chinese Medicine, Guangzhou 510006, China
| | - Ai-Lin Liu
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
- Beijing Key Laboratory of Drug Target and Screening Research
| | - Guan-Hua Du
- Institute of Materia Medica, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, PR China
- Beijing Key Laboratory of Drug Target and Screening Research
| |
|
109
|
In silico prediction of the β-cyclodextrin complexation based on Monte Carlo method. Int J Pharm 2015; 495:404-409. [DOI: 10.1016/j.ijpharm.2015.08.078] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 04/20/2015] [Accepted: 08/24/2015] [Indexed: 01/24/2023]
|
110
|
Mervin LH, Afzal AM, Drakakis G, Lewis R, Engkvist O, Bender A. Target prediction utilising negative bioactivity data covering large chemical space. J Cheminform 2015; 7:51. [PMID: 26500705 PMCID: PMC4619454 DOI: 10.1186/s13321-015-0098-y] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 04/30/2015] [Accepted: 09/29/2015] [Indexed: 02/25/2023] Open
Abstract
Background
In silico analyses are increasingly being used to support mode-of-action investigations; however, many such approaches do not utilise the large amounts of inactive data held in chemogenomic repositories. The objective of this work is to integrate such bioactivity data into the target prediction of orphan compounds to produce the probability of activity and inactivity for a range of targets. To this end, a novel human bioactivity data set was constructed through the assimilation of over 195 million bioactivity data points deposited in the ChEMBL and PubChem repositories, and the subsequent application of a sphere-exclusion selection algorithm to oversample presumed inactive compounds.
Results
A Bernoulli Naïve Bayes algorithm was trained using the data and evaluated using fivefold cross-validation, achieving a mean recall and precision of 67.7 and 63.8% for active compounds and 99.6 and 99.7% for inactive compounds, respectively. We show that the performance of the models is considerably influenced by the underlying intraclass training similarity, the size of a given class of compounds, and the degree of additional oversampling. The method was also validated using compounds extracted from WOMBAT, producing average precision-recall AUC and BEDROC scores of 0.56 and 0.85, respectively. Inactive data points used for this test are based on presumed inactivity, producing an approximated indication of the true extrapolative ability of the models. A distance-based applicability domain analysis was also conducted, indicating that an average Tanimoto Coefficient distance of 0.3 or greater between a test and training set can be used to give a global measure of confidence in model predictions. In a final comparison, a method trained solely on active data from ChEMBL achieved precision-recall AUC and BEDROC scores of 0.45 and 0.76.
Conclusions
The inclusion of inactive data for model training produces models with superior AUC and improved early recognition capabilities, although the results from internal and external validation of the models show differing performance between the breadth of models. The realised target prediction protocol is available at https://github.com/lhm30/PIDGIN.
Graphical abstract
The inclusion of large scale negative training data for in silico target prediction improves the precision and recall AUC and BEDROC scores for target models.
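The distance-based applicability domain check described in the abstract above can be illustrated with a minimal sketch. It assumes that fingerprints are represented as sets of on-bit indices, that the Tanimoto distance is one minus the Tanimoto coefficient, and that a query is compared with its nearest training compound. None of these details are stated in the abstract, and the names `tanimoto` and `nearest_distance` are hypothetical and are not part of the PIDGIN code.

```python
def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (a convention chosen here).
    return len(a & b) / union if union else 1.0


def nearest_distance(query, training):
    """Smallest Tanimoto distance (1 - coefficient) from the query to any training fingerprint."""
    return min(1.0 - tanimoto(query, fp) for fp in training)


# Toy fingerprints, purely illustrative.
training = [{1, 2, 3, 4}, {2, 3, 5}]
query = {1, 2, 3}

print(tanimoto(query, training[0]))
print(nearest_distance(query, training))
```

A distance such as this could then be compared with a cut-off like the 0.3 value quoted in the abstract, although how the authors aggregate distances over a whole test set is not specified there.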
Affiliation(s)
- Lewis H. Mervin
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
| | - Avid M. Afzal
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
| | - Georgios Drakakis
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
| | - Richard Lewis
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
| | - Ola Engkvist
- Discovery Sciences, Chemistry Innovation Centre, AstraZeneca R&D, 43183 Mölndal, Sweden
| | - Andreas Bender
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
| |
|
111
|
Dearden JC, Hewitt M, Roberts DW, Enoch SJ, Rowe PH, Przybylak KR, Vaughan-Williams GD, Smith ML, Pillai GG, Katritzky AR. Mechanism-Based QSAR Modeling of Skin Sensitization. Chem Res Toxicol 2015; 28:1975-86. [PMID: 26382665 DOI: 10.1021/acs.chemrestox.5b00197] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 11/29/2022]
Abstract
Many chemicals can induce skin sensitization, and there is a pressing need for non-animal methods to give a quantitative indication of potency. Using two large published data sets of skin sensitizers, we have allocated each sensitizing chemical to one of 10 mechanistic categories and then developed good QSAR models for the seven categories that have a sufficient number of chemicals to allow modeling. Both internal and external validation checks showed that each model had good predictivity.
Affiliation(s)
- J C Dearden
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - M Hewitt
- School of Pharmacy, University of Wolverhampton, Wulfruna Street, Wolverhampton WV1 1LY, United Kingdom
| | - D W Roberts
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - S J Enoch
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - P H Rowe
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - K R Przybylak
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - G D Vaughan-Williams
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - M L Smith
- School of Pharmacy & Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool L3 3AF, United Kingdom
| | - G G Pillai
- Department of Chemistry, University of Florida, Gainesville, Florida 32611-7200, United States; Institute of Chemistry, University of Tartu, 50411 Tartu, Estonia
| | - A R Katritzky
- Department of Chemistry, University of Florida, Gainesville, Florida 32611-7200, United States
| |
|
112
|
Žuvela P, Liu JJ, Macur K, Bączek T. Molecular Descriptor Subset Selection in Theoretical Peptide Quantitative Structure–Retention Relationship Model Development Using Nature-Inspired Optimization Algorithms. Anal Chem 2015; 87:9876-83. [DOI: 10.1021/acs.analchem.5b02349] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 12/25/2022]
Affiliation(s)
- Petar Žuvela
- Department of Chemical Engineering, Pukyong National University, 365 Sinseon-ro, 608-739 Busan, Korea
| | - J. Jay Liu
- Department of Chemical Engineering, Pukyong National University, 365 Sinseon-ro, 608-739 Busan, Korea
| | - Katarzyna Macur
- Laboratory of Mass Spectrometry, Intercollegiate Faculty of Biotechnology, University of Gdańsk and Medical University of Gdańsk, Kładki 24, 80-822 Gdańsk, Poland
| | - Tomasz Bączek
- Department of Pharmaceutical Chemistry, Medical University of Gdańsk, Hallera 107, 80-416 Gdańsk, Poland
| |
|
113
|
Vuorinen A, Odermatt A, Schuster D. Reprint of "In silico methods in the discovery of endocrine disrupting chemicals". J Steroid Biochem Mol Biol 2015; 153:93-101. [PMID: 26291836 DOI: 10.1016/j.jsbmb.2015.08.015] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/12/2012] [Revised: 04/03/2013] [Accepted: 04/07/2013] [Indexed: 12/18/2022]
Abstract
The prevalence of sex hormone-dependent cancers, reproductive problems, obesity, and cardiovascular complications has risen, especially in the Western world. It has been suggested that exposure to various endocrine disrupting chemicals (EDCs) contributes to the development and progression of these diseases. EDCs can interfere with various proteins: nuclear steroid hormone receptors, such as estrogen, androgen, glucocorticoid and mineralocorticoid receptors (ER, AR, GR, MR), and enzymes that are involved in steroid hormone synthesis and metabolism, for example hydroxysteroid dehydrogenases (HSDs). Numerous chemicals are known as endocrine disruptors. However, the mechanism of action for most of these EDCs is still unknown. It is exhausting and time-consuming to test in vitro all chemicals (potential EDCs) used in industry, agriculture or as food preservatives for their effects on the endocrine system. Computational methods, such as virtual screening, quantitative structure-activity relationships and docking, are already well recognized and used in drug development. The same methods could also aid the research on EDCs. So far, the computational methods in the search for EDCs have been mostly retrospective. There are, however, some prospective studies reporting the use of in silico methods: five studies reporting the identification of previously unknown 17β-HSD3 inhibitors, MR agonists, and ER antagonists/agonists. This review provides an overview of case studies and in silico methods that are used in the search for EDCs. This article is part of a Special Issue entitled 'CSR 2013'.
Affiliation(s)
- Anna Vuorinen
- Institute of Pharmacy/Pharmaceutical Chemistry and Center for Molecular Biosciences Innsbruck - CMBI, University of Innsbruck, Innrain 80-82, 6020 Innsbruck, Austria
| | - Alex Odermatt
- Swiss Center for Applied Human Toxicology and Division of Molecular and Systems Toxicology, Department of Pharmaceutical Sciences, University of Basel, Klingelbergstrasse 50, 4056 Basel, Switzerland
| | - Daniela Schuster
- Institute of Pharmacy/Pharmaceutical Chemistry and Center for Molecular Biosciences Innsbruck - CMBI, University of Innsbruck, Innrain 80-82, 6020 Innsbruck, Austria.
| |
|
114
|
Wang J, Hou T. Advances in computationally modeling human oral bioavailability. Adv Drug Deliv Rev 2015; 86:11-6. [PMID: 25582307 PMCID: PMC4490973 DOI: 10.1016/j.addr.2015.01.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 09/17/2014] [Revised: 11/03/2014] [Accepted: 01/05/2015] [Indexed: 12/15/2022]
Abstract
Although significant progress has been made in experimental high throughput screening (HTS) of ADME (absorption, distribution, metabolism, excretion) and pharmacokinetic properties, in silico modeling of ADME and Toxicity (ADME-Tox) is still indispensable in drug discovery, as it can guide the selection of drug candidates prior to expensive ADME screenings and clinical trials. Compared to other ADME-Tox properties, human oral bioavailability (HOBA) is particularly important but extremely difficult to predict. In this paper, the advances in human oral bioavailability modeling are reviewed. Moreover, our insight into how to construct more accurate and reliable HOBA QSAR and classification models is also discussed.
Affiliation(s)
- Junmei Wang
- Green Center for Systems Biology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd. Dallas, TX 75390, USA.
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
| |
|
115
|
Sheridan RP. The Relative Importance of Domain Applicability Metrics for Estimating Prediction Errors in QSAR Varies with Training Set Diversity. J Chem Inf Model 2015; 55:1098-107. [DOI: 10.1021/acs.jcim.5b00110] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Indexed: 11/29/2022]
Affiliation(s)
- Robert P. Sheridan
- Cheminformatics Department, RY800B-305, Merck Research Laboratories, Rahway, New Jersey 07065, United States
| |
|
116
|
Singh S, Supuran CT. In silico modeling of β-carbonic anhydrase inhibitors from the fungus Malassezia globosa as antidandruff agents. J Enzyme Inhib Med Chem 2015; 31:417-24. [DOI: 10.3109/14756366.2015.1031127] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 01/28/2023] Open
|
117
|
Norinder U, Carlsson L, Boyer S, Eklund M. Introducing conformal prediction in predictive modeling for regulatory purposes. A transparent and flexible alternative to applicability domain determination. Regul Toxicol Pharmacol 2015; 71:279-84. [DOI: 10.1016/j.yrtph.2014.12.021] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 10/29/2014] [Revised: 12/23/2014] [Accepted: 12/24/2014] [Indexed: 10/24/2022]
|
118
|
Aliagas I, Gobbi A, Heffron T, Lee ML, Ortwine DF, Zak M, Khojasteh SC. A probabilistic method to report predictions from a human liver microsomes stability QSAR model: a practical tool for drug discovery. J Comput Aided Mol Des 2015; 29:327-38. [DOI: 10.1007/s10822-015-9838-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/17/2014] [Accepted: 02/14/2015] [Indexed: 02/04/2023]
|
119
|
Nantasenamat C, Prachayasittikul V. Maximizing computational tools for successful drug discovery. Expert Opin Drug Discov 2015; 10:321-9. [PMID: 25693813 DOI: 10.1517/17460441.2015.1016497] [Citation(s) in RCA: 42] [Impact Index Per Article: 4.7] [Indexed: 01/20/2023]
Abstract
Drug discovery is an iterative cycle of identifying promising hits followed by lead optimization via bioisosteric replacements. In the search for compounds affording good bioactivity, equal importance should also be placed on achieving those with favorable pharmacokinetic properties. Thus, balancing and realizing both key properties is an intricate problem that requires great caution. In this editorial, the authors explore the available computational tools in the context of the extent of big data that has been borne out by the advent of the Omics revolution. As such, the selection of appropriate computational tools for analyzing the vast number of chemical libraries, target proteins and interactomes is the first step toward maximizing the chance of success. However, in order to realize this, it is also necessary to have a solid foundation in the big concepts of drug discovery, as well as to know which tools are available, in order to give drug discovery scientists the best opportunity.
Affiliation(s)
- Chanin Nantasenamat
- Mahidol University, Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, 10700 Bangkok, Thailand
| | | |
|
120
|
Gissi A, Lombardo A, Roncaglioni A, Gadaleta D, Mangiatordi GF, Nicolotti O, Benfenati E. Evaluation and comparison of benchmark QSAR models to predict a relevant REACH endpoint: The bioconcentration factor (BCF). Environmental Research 2015; 137:398-409. [PMID: 25616163 DOI: 10.1016/j.envres.2014.12.019] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 08/08/2014] [Revised: 12/22/2014] [Accepted: 12/23/2014] [Indexed: 05/27/2023]
Abstract
The bioconcentration factor (BCF) is an important bioaccumulation hazard assessment metric in many regulatory contexts. Its assessment is required by the REACH regulation (Registration, Evaluation, Authorization and Restriction of Chemicals) and by CLP (Classification, Labeling and Packaging). We challenged nine well-known and widely used BCF QSAR models against 851 compounds stored in an ad-hoc created database. The goodness of the regression analysis was assessed by considering the determination coefficient (R²) and the Root Mean Square Error (RMSE); Cooper's statistics and the Matthews Correlation Coefficient (MCC) were calculated for all the thresholds relevant for regulatory purposes (i.e. 100 L/kg for Chemical Safety Assessment; 500 L/kg for Classification and Labeling; 2000 and 5000 L/kg for Persistent, Bioaccumulative and Toxic (PBT) and very Persistent, very Bioaccumulative (vPvB) assessment) to assess the classification, with particular attention to the models' ability to control the occurrence of false negatives. As a first step, statistical analysis was performed for the predictions of the entire dataset; R² > 0.70 was obtained using the CORAL, T.E.S.T. and EPISuite Arnot-Gobas models. As classifiers, ACD and logP-based equations were the best in terms of sensitivity, ranging from 0.75 to 0.94. External compound predictions were carried out for the models that had their own training sets. The CORAL model returned the best performance (R²ext = 0.59), followed by the EPISuite Meylan model (R²ext = 0.58). The latter also gave the highest sensitivity on external compounds, with values from 0.55 to 0.85 depending on the threshold. Statistics were also compiled for compounds falling into the models' Applicability Domain (AD), giving better performance. In this respect, VEGA CAESAR was the best model in terms of regression (R² = 0.94) and classification (average sensitivity > 0.80). This model also showed the best regression (R² = 0.85) and sensitivity (average > 0.70) for new compounds in the AD but not present in the training set. However, no single optimal model exists and, thus, a case-by-case assessment would be wise. Yet, integrating the wealth of information from multiple models remains the winning approach.
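The threshold-based classification assessment described in this abstract can be illustrated with a minimal Python sketch. It computes sensitivity, specificity and MCC at each regulatory threshold named above. The function names and the BCF values are hypothetical and are not taken from the paper; treating a value at or above the threshold as the positive class is also an assumption.

```python
import math

def confusion_counts(observed, predicted, threshold):
    """Count TP, TN, FP, FN, treating a value >= threshold as positive."""
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        obs_pos, pred_pos = obs >= threshold, pred >= threshold
        if obs_pos and pred_pos:
            tp += 1
        elif not obs_pos and not pred_pos:
            tn += 1
        elif pred_pos:
            fp += 1  # predicted bioaccumulative, observed not
        else:
            fn += 1  # false negative: the case regulators worry about most
    return tp, tn, fp, fn

def classification_stats(observed, predicted, threshold):
    """Return sensitivity, specificity and MCC at one threshold."""
    tp, tn, fp, fn = confusion_counts(observed, predicted, threshold)
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    specificity = tn / (tn + fp) if tn + fp else float("nan")
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}

# Hypothetical BCF values in L/kg (illustrative only, not data from the study).
observed = [50, 120, 800, 2500, 6000, 90]
predicted = [70, 80, 900, 1800, 7000, 150]
for threshold in (100, 500, 2000, 5000):
    print(threshold, classification_stats(observed, predicted, threshold))
```

A low sensitivity at a given threshold signals many false negatives, which is the failure mode the abstract singles out.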
Affiliation(s)
- Andrea Gissi
- Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy; Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona 4, 70125 Bari, Italy
| | - Anna Lombardo
- Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy
| | - Alessandra Roncaglioni
- Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy
| | - Domenico Gadaleta
- Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy; Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona 4, 70125 Bari, Italy
| | - Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona 4, 70125 Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia - Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona 4, 70125 Bari, Italy
| | - Emilio Benfenati
- Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy.
|
121
|
Singh S. Computational design and chemometric QSAR modeling of Plasmodium falciparum carbonic anhydrase inhibitors. Bioorg Med Chem Lett 2015; 25:133-41. [DOI: 10.1016/j.bmcl.2014.10.089] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2014] [Revised: 10/25/2014] [Accepted: 10/28/2014] [Indexed: 12/12/2022]
|
122
|
Abstract
Quantitative Structure-Activity Relationship (QSAR) models have manifold applications in drug discovery, environmental fate modeling, risk assessment, and property prediction of chemicals and pharmaceuticals. One of the principles recommended by the Organisation for Economic Co-operation and Development (OECD) for model validation requires defining the Applicability Domain (AD) for QSAR models, which allows one to estimate the uncertainty in the prediction of a compound based on how similar it is to the training compounds used in model development. The AD is an important tool for building a reliable QSAR model, whose use is generally limited to query chemicals structurally similar to the training compounds. Thus, characterization of the interpolation space is important in defining the AD. An attempt is made in this chapter to address the important concepts and methodology of the AD as well as criteria for estimating the AD through training set interpolation in the descriptor space.
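As an illustration of training set interpolation in descriptor space, the sketch below implements the simplest range-based ("bounding box") check: a query compound is considered inside the domain only if every descriptor lies within the range spanned by the training set. This is one of several possible criteria and is not claimed to be the chapter's own method; the function names and descriptor values are hypothetical.

```python
def fit_ranges(training):
    """Return the (min, max) of each descriptor column over the training set."""
    columns = list(zip(*training))
    return [(min(col), max(col)) for col in columns]

def in_domain(query, ranges):
    """True if every descriptor of the query lies inside the training range."""
    return all(lo <= x <= hi for x, (lo, hi) in zip(query, ranges))

# Hypothetical two-descriptor training set (e.g. logP and molecular weight).
training = [[1.2, 150.0], [2.5, 210.0], [0.8, 180.0]]
ranges = fit_ranges(training)

print(in_domain([1.0, 200.0], ranges))  # both descriptors interpolated
print(in_domain([3.1, 200.0], ranges))  # first descriptor extrapolated
```

A bounding box ignores correlations between descriptors and empty regions inside the box, which is why distance-based and leverage-based criteria are often used alongside it.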
|
123
|
Sushko Y, Novotarskyi S, Körner R, Vogt J, Abdelaziz A, Tetko IV. Prediction-driven matched molecular pairs to interpret QSARs and aid the molecular optimization process. J Cheminform 2014; 6:48. [PMID: 25544551 PMCID: PMC4272757 DOI: 10.1186/s13321-014-0048-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2014] [Accepted: 11/07/2014] [Indexed: 11/24/2022] Open
Abstract
Background: QSAR is an established and powerful method for cheap in silico assessment of physicochemical properties and biological activities of chemical compounds. However, QSAR models are rather complex mathematical constructs that cannot easily be interpreted. Medicinal chemists would benefit from practical guidance regarding which molecules to synthesize. Another possible approach is analysis of pairs of very similar molecules, so-called matched molecular pairs (MMPs). Such an approach allows identification of molecular transformations that affect particular activities (e.g. toxicity). In contrast to QSAR, chemical interpretation of these transformations is straightforward. Furthermore, such transformations can give medicinal chemists useful hints for the hit-to-lead optimization process. Results: The current study suggests a combination of QSAR and MMP approaches by finding MMP transformations based on QSAR predictions for large chemical datasets. The study shows that such an approach, referred to as prediction-driven MMP analysis, is a useful tool for medicinal chemists, allowing identification of large numbers of “interesting” transformations that can be used to drive the molecular optimization process. All the methodological developments have been implemented as software products available online as part of OCHEM (http://ochem.eu/). Conclusions: The prediction-driven MMP methodology was exemplified by two use cases: modelling of aquatic toxicity and CYP3A4 inhibition. This approach helped us to interpret QSAR models and allowed identification of a number of “significant” molecular transformations that affect the desired properties. This can facilitate drug design as part of the molecular optimization process. Matched molecular pairs and transformation graphs facilitate an interpretable molecular optimisation process.
Affiliation(s)
- Yurii Sushko
- eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Munich Germany
| | | | - Robert Körner
- eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Munich Germany
| | - Joachim Vogt
- eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Munich Germany
| | - Ahmed Abdelaziz
- eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Munich Germany
| | - Igor V Tetko
- eADMET GmbH, Lichtenbergstraße 8, D-85748 Garching, Munich Germany ; Helmholtz-Zentrum München - German Research Centre for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany ; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya St. 18, 420008 Kazan, Russia
|
124
|
Veselinović JB, Toropov AA, Toropova AP, Nikolić GM, Veselinović AM. Monte Carlo Method-Based QSAR Modeling of Penicillins Binding to Human Serum Proteins. Arch Pharm (Weinheim) 2014; 348:62-7. [DOI: 10.1002/ardp.201400259] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/01/2014] [Revised: 09/12/2014] [Accepted: 10/01/2014] [Indexed: 11/12/2022]
Affiliation(s)
| | - Andrey A. Toropov
- IRCCS - Istituto di Ricerche Farmacologiche Mario Negri; Milano Italy
| | - Alla P. Toropova
- IRCCS - Istituto di Ricerche Farmacologiche Mario Negri; Milano Italy
| | - Goran M. Nikolić
- Faculty of Medicine; Department of Chemistry; University of Niš; Niš Serbia
|
125
|
Carpenter TS, Kirshner DA, Lau EY, Wong SE, Nilmeier JP, Lightstone FC. A method to predict blood-brain barrier permeability of drug-like compounds using molecular dynamics simulations. Biophys J 2014; 107:630-641. [PMID: 25099802 PMCID: PMC4129472 DOI: 10.1016/j.bpj.2014.06.024] [Citation(s) in RCA: 176] [Impact Index Per Article: 17.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/31/2013] [Revised: 06/10/2014] [Accepted: 06/16/2014] [Indexed: 02/06/2023] Open
Abstract
The blood-brain barrier (BBB) is formed by specialized tight junctions between endothelial cells that line brain capillaries to create a highly selective barrier between the brain and the rest of the body. A major problem to overcome in drug design is the ability of the compound in question to cross the BBB. Neuroactive drugs are required to cross the BBB to function. Conversely, drugs that target other parts of the body ideally should not cross the BBB to avoid possible psychotropic side effects. Thus, the task of predicting the BBB permeability of new compounds is of great importance. Two gold-standard experimental measures of BBB permeability are logBB (the concentration of drug in the brain divided by concentration in the blood) and logPS (permeability surface-area product). Both methods are time-consuming and expensive, and although logPS is considered the more informative measure, it is lower throughput and more resource intensive. With continual increases in computer power and improvements in molecular simulations, in silico methods may provide viable alternatives. Computational predictions of these two parameters for a sample of 12 small molecule compounds were performed. The potential of mean force for each compound through a 1,2-dioleoyl-sn-glycero-3-phosphocholine bilayer is determined by molecular dynamics simulations. This system setup is often used as a simple BBB mimetic. Additionally, one-dimensional position-dependent diffusion coefficients are calculated from the molecular dynamics trajectories. The diffusion coefficient is combined with the free energy landscape to calculate the effective permeability (Peff) for each sample compound. The relative values of these permeabilities are compared to experimentally determined logBB and logPS values. Our computational predictions correlate remarkably well with both logBB (R(2) = 0.94) and logPS (R(2) = 0.90). 
Thus, we have demonstrated that this approach may have the potential to provide reliable, quantitatively predictive BBB permeability, using a relatively quick, inexpensive method.
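The abstract does not give the expression used to combine the free energy profile with the position-dependent diffusion coefficient. One widely used form is the inhomogeneous solubility-diffusion model, 1/P = ∫ exp(ΔG(z)/RT) / D(z) dz, and the sketch below assumes that form. The function name, units, temperature and profile values are all hypothetical.

```python
import math

GAS_CONSTANT = 0.0019872041  # kcal/(mol K)

def effective_permeability(z, pmf, diff, temperature=310.0):
    """Permeability from a PMF and a diffusion profile (assumed ISD model).

    z    : positions along the bilayer normal, in cm (ascending)
    pmf  : free energy at each position, in kcal/mol
    diff : diffusion coefficient at each position, in cm^2/s
    Returns the permeability in cm/s.
    """
    beta = 1.0 / (GAS_CONSTANT * temperature)
    # Local resistivity exp(beta * dG(z)) / D(z) at every grid point.
    resistivity = [math.exp(beta * g) / d for g, d in zip(pmf, diff)]
    # Trapezoidal integration of the resistivity across the membrane.
    resistance = 0.0
    for i in range(len(z) - 1):
        resistance += 0.5 * (resistivity[i] + resistivity[i + 1]) * (z[i + 1] - z[i])
    return 1.0 / resistance

# Hypothetical 40 angstrom slab with a flat PMF and constant diffusion.
z = [i * 1e-7 for i in range(5)]  # 0 to 4e-7 cm
flat = effective_permeability(z, [0.0] * 5, [1e-5] * 5)
# Same slab with a 2 kcal/mol barrier at the centre: permeability drops.
barrier = effective_permeability(z, [0.0, 1.0, 2.0, 1.0, 0.0], [1e-5] * 5)
print(flat, barrier)
```

Because the barrier enters through an exponential, the highest region of the PMF dominates the integral, so small free energy errors there translate into large permeability errors.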
Affiliation(s)
- Timothy S Carpenter
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California
| | - Daniel A Kirshner
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California
| | - Edmond Y Lau
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California
| | - Sergio E Wong
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California
| | - Jerome P Nilmeier
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California
| | - Felice C Lightstone
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California.
|
126
|
Prana V, Rotureau P, Fayet G, André D, Hub S, Vicot P, Rao L, Adamo C. Prediction of the thermal decomposition of organic peroxides by validated QSPR models. JOURNAL OF HAZARDOUS MATERIALS 2014; 276:216-224. [PMID: 24887124 DOI: 10.1016/j.jhazmat.2014.05.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 02/25/2014] [Revised: 04/15/2014] [Accepted: 05/05/2014] [Indexed: 06/03/2023]
Abstract
Organic peroxides are unstable chemicals which can easily decompose and may lead to explosion. Such a process can be characterized by physico-chemical parameters such as heat and temperature of decomposition, whose determination is crucial to manage related hazards. These thermal stability properties are also required within many regulatory frameworks related to chemicals in order to assess their hazardous properties. In this work, new quantitative structure-property relationship (QSPR) models were developed to accurately predict the thermal stability of organic peroxides from their molecular structure, respecting the OECD guidelines for regulatory acceptability of QSPRs. Based on 38 reference experimental data points acquired with a differential scanning calorimetry (DSC) apparatus under homogeneous experimental conditions, multi-linear models were derived for the prediction of the decomposition heat and the onset temperature using different types of molecular descriptors. Models were tested by internal and external validation tests and their applicability domains were defined and analyzed. Being rigorously validated, they presented the best performance in terms of fitting, robustness and predictive power, and the descriptors used in these models were linked to the peroxide bond, whose breaking represents the main decomposition mechanism of organic peroxides.
Affiliation(s)
- Vinca Prana
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France; Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
| | - Patricia Rotureau
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France.
| | - Guillaume Fayet
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
| | - David André
- ARKEMA, rue Henri Moissan, BP63, Pierre Benite 69493, France
| | - Serge Hub
- ARKEMA, rue Henri Moissan, BP63, Pierre Benite 69493, France
| | - Patricia Vicot
- Institut National de l'Environnement Industriel et des Risques (INERIS), Parc Technologique Alata, BP2, Verneuil-en-Halatte 60550, France
| | - Li Rao
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France
| | - Carlo Adamo
- Institut de Recherche de Chimie Paris, Chimie ParisTech CNRS, 11 rue P. et M. Curie, Paris 75005, France; Institut Universitaire de France, 103 Boulevard Saint Michel, Paris F-75005, France
|
127
|
Clark RD, Liang W, Lee AC, Lawless MS, Fraczkiewicz R, Waldman M. Using beta binomials to estimate classification uncertainty for ensemble models. J Cheminform 2014; 6:34. [PMID: 24987464 PMCID: PMC4076254 DOI: 10.1186/1758-2946-6-34] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2014] [Accepted: 06/16/2014] [Indexed: 12/14/2022] Open
Abstract
Background: Quantitative structure-activity relationship (QSAR) models have enormous potential for reducing drug discovery and development costs as well as the need for animal testing. Great strides have been made in estimating their overall reliability, but to fully realize that potential, researchers and regulators need to know how confident they can be in individual predictions. Results: Submodels in an ensemble model which have been trained on different subsets of a shared training pool represent multiple samples of the model space, and the degree of agreement among them contains information on the reliability of ensemble predictions. For artificial neural network ensembles (ANNEs) using two different methods for determining ensemble classification (one using vote tallies and the other averaging individual network outputs), we have found that the distribution of predictions across positive vote tallies can be reasonably well modeled as a beta binomial distribution, as can the distribution of errors. Together, these two distributions can be used to estimate the probability that a given predictive classification will be in error. Large data sets comprising logP, Ames mutagenicity, and CYP2D6 inhibition data are used to illustrate and validate the method. The distributions of predictions and errors for the training pool accurately predicted the distribution of predictions and errors for large external validation sets, even when the numbers of positive and negative examples in the training pool were not balanced. Moreover, the likelihood of a given compound being prospectively misclassified as a function of the degree of consensus between networks in the ensemble could in most cases be estimated accurately from the fitted beta binomial distributions for the training pool. Conclusions: Confidence in an individual predictive classification by an ensemble model can be accurately assessed by examining the distributions of predictions and errors as a function of the degree of agreement among the constituent submodels. Further, ensemble uncertainty estimation can often be improved by adjusting the voting or classification threshold based on the parameters of the error distribution. Finally, the profiles for models whose predictive uncertainty estimates are not reliable provide clues to that effect without the need for comparison to an external test set.
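A minimal sketch of how two fitted beta binomial distributions might be combined into a per-prediction error estimate is given below. Combining them through Bayes' rule is an assumption made here for illustration, since the abstract does not spell out the formula. The ensemble size, shape parameters and overall error rate are invented, and fitting the distributions to real vote tallies is not shown.

```python
import math

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)

def beta_binomial_pmf(k, n, a, b):
    """P(K = k) for a beta binomial with n trials and shape parameters a, b."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))

def error_probability(k, n, pred_shape, err_shape, overall_error_rate):
    """Assumed Bayes' rule: P(error | k) = P(k | error) * P(error) / P(k)."""
    p_k = beta_binomial_pmf(k, n, *pred_shape)
    p_k_given_error = beta_binomial_pmf(k, n, *err_shape)
    return min(1.0, p_k_given_error * overall_error_rate / p_k)

# Hypothetical ensemble of 33 networks. Predictions pile up at unanimous
# tallies (U-shaped), errors concentrate near split votes (bell-shaped).
n_networks = 33
for votes in (0, 16, 33):
    p = error_probability(votes, n_networks, (0.4, 0.4), (2.0, 2.0), 0.10)
    print(votes, round(p, 3))
```

With these made-up parameters the estimated error probability is highest for split votes and lowest for unanimous ones, which matches the qualitative behaviour the abstract describes.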
Affiliation(s)
- Robert D Clark
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
| | - Wenkel Liang
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
| | - Adam C Lee
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
| | - Michael S Lawless
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
| | - Robert Fraczkiewicz
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
| | - Marvin Waldman
- Department of Life Sciences, Simulations Plus, Inc., 45205 10th Street West, Lancaster, CA 93534, USA
|
128
|
Norinder U, Carlsson L, Boyer S, Eklund M. Introducing conformal prediction in predictive modeling. A transparent and flexible alternative to applicability domain determination. J Chem Inf Model 2014; 54:1596-603. [PMID: 24797111 DOI: 10.1021/ci5001168] [Citation(s) in RCA: 116] [Impact Index Per Article: 11.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Conformal prediction is introduced as an alternative approach to domain applicability estimation. The advantages of using conformal prediction are as follows: First, the approach is based on a consistent and well-defined mathematical framework. Second, the understanding of the confidence level concept in conformal predictions is straightforward, e.g. a confidence level of 0.8 means that the conformal predictor will commit, at most, 20% errors (i.e., true values outside the assigned prediction range). Third, the confidence level can be varied depending on the situation where the model is to be applied and the consequences of such changes are readily understandable, i.e. prediction ranges are increased or decreased, and the changes can immediately be inspected. We demonstrate the usefulness of conformal prediction by applying it to 10 publicly available data sets.
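The link between confidence level and prediction range can be sketched with a basic split (inductive) conformal regressor. The version below uses absolute residuals on a held-out calibration set as the nonconformity score, which is the simplest choice and an assumption here, not necessarily the authors' setup. The residual values are hypothetical.

```python
import math

def conformal_half_width(calibration_residuals, confidence):
    """Half-width of the prediction range at the requested confidence level.

    Uses the ceil((n + 1) * confidence)-th smallest absolute residual from a
    calibration set that was not used to train the underlying model.
    """
    scores = sorted(abs(r) for r in calibration_residuals)
    n = len(scores)
    rank = math.ceil((n + 1) * confidence)
    if rank > n:
        # Too few calibration points to support this confidence level.
        return float("inf")
    return scores[rank - 1]

def prediction_range(point_prediction, half_width):
    """Symmetric range around the model's point prediction."""
    return point_prediction - half_width, point_prediction + half_width

# Hypothetical calibration residuals (observed minus predicted).
residuals = [0.1, -0.3, 0.2, 0.5, -0.4, 0.05, 0.6, -0.15, 0.25]
for confidence in (0.8, 0.9):
    width = conformal_half_width(residuals, confidence)
    print(confidence, prediction_range(2.0, width))
```

Raising the confidence level widens the range and lowering it narrows the range, which is the readily inspectable trade-off the abstract refers to.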
Affiliation(s)
- Ulf Norinder
- H. Lundbeck A/S, Ottiliavej 9, 2500 Valby, Denmark
|
129
|
Yan J, Zhu WW, Kong B, Lu HB, Yun YH, Huang JH, Liang YZ. A Combinational Strategy of Model Disturbance and Outlier Comparison to Define Applicability Domain in Quantitative Structural Activity Relationship. Mol Inform 2014; 33:503-13. [PMID: 27486037 DOI: 10.1002/minf.201300161] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/31/2013] [Accepted: 04/16/2014] [Indexed: 01/21/2023]
Abstract
In order to define an applicability domain for quantitative structure-activity relationship modeling, a combinational strategy of model disturbance and outlier comparison is developed. An indicator named the model disturbance index was defined to estimate the prediction error. Moreover, information on the outliers in the training set was used to filter unreliable samples in the test set based on "structural similarity". Chromatography retention index data were used to investigate this approach. A relationship between the model disturbance index and the prediction error was found. Also, the comparison between the outlier set and the test set could provide additional information about which unknown samples deserve more attention. A novel technique based on model population analysis was used to evaluate the validity of the applicability domain. Finally, three commonly used methods, i.e. the leverage, descriptor range-based and model perturbation methods, were compared with the proposed approach.
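Of the three reference methods named in this abstract, the leverage approach is the easiest to sketch. The leverage of a query x is h = xᵀ(XᵀX)⁻¹x, where X is the training descriptor matrix, and a query is commonly flagged when h exceeds a warning value of 3p/n (p columns including the intercept, n training compounds). The code below is a generic illustration with hypothetical data, not the authors' implementation, and the 3p/n cut-off is a convention rather than something stated in the abstract.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def leverage(query, training):
    """Leverage h = x^T (X^T X)^-1 x of a query against the training matrix."""
    p = len(query)
    xtx = [[sum(row[i] * row[j] for row in training) for j in range(p)]
           for i in range(p)]
    weights = solve(xtx, query)  # (X^T X)^-1 x without forming the inverse
    return sum(q * w for q, w in zip(query, weights))

# Hypothetical training set: an intercept column plus one descriptor.
training = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
warning = 3 * len(training[0]) / len(training)  # 3p/n convention

for query in ([1.0, 1.5], [1.0, 6.0]):
    h = leverage(query, training)
    print(query, round(h, 3), "outside" if h > warning else "inside")
```

The second query lies far beyond the training descriptor range, so its leverage exceeds the warning value and its prediction would be treated as an extrapolation.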
Affiliation(s)
- Jun Yan
- Research Center of Modernization of Traditional Chinese Medicine, Central South University, Changsha 410083, P. R. China tel: +86 731 8830831; fax: +86 731 8830831
| | - Wei-Wei Zhu
- Department of Chemical and Bioscience, HeChi University, YiZhou 546300, P. R. China
| | - Bo Kong
- Technology Center of China Tobacco Hunan Industrial Co., LTD, Changsha 410014, P. R. China
| | - Hong-Bing Lu
- Technology Center of China Tobacco Hunan Industrial Co., LTD, Changsha 410014, P. R. China
| | - Yong-Huan Yun
- Research Center of Modernization of Traditional Chinese Medicine, Central South University, Changsha 410083, P. R. China tel: +86 731 8830831; fax: +86 731 8830831
| | - Jian-Hua Huang
- Research Center of Modernization of Traditional Chinese Medicine, Central South University, Changsha 410083, P. R. China tel: +86 731 8830831; fax: +86 731 8830831
| | - Yi-Zeng Liang
- Research Center of Modernization of Traditional Chinese Medicine, Central South University, Changsha 410083, P. R. China tel: +86 731 8830831; fax: +86 731 8830831.
|
130
|
Carrió P, Pinto M, Ecker G, Sanz F, Pastor M. Applicability Domain ANalysis (ADAN): a robust method for assessing the reliability of drug property predictions. J Chem Inf Model 2014; 54:1500-11. [PMID: 24821140 DOI: 10.1021/ci500172z] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]
Abstract
We report a novel method called ADAN (Applicability Domain ANalysis) for assessing the reliability of drug property predictions obtained by in silico methods. The assessment provided by ADAN is based on the comparison of the query compound with the training set, using six diverse similarity criteria. For every criterion, the query compound is considered out of range when the similarity value obtained is larger than the 95th percentile of the values obtained for the training set. The final outcome is a number in the range of 0-6 that expresses the number of unmet similarity criteria and allows classifying the query compound within seven reliability categories. Such categories can be further exploited to assign simpler reliability classes using a traffic light schema, to assign approximate confidence intervals or to mark the predictions as unreliable. The entire methodology has been validated by simulating realistic conditions, where query compounds are structurally diverse from those in the training set. The validation exercise involved the construction of more than 1000 models. These models were built using a combination of training sets, molecular descriptors, and modeling methods representative of the real predictive tasks performed in the eTOX project (a project whose objective is to predict in vivo toxicological end points in drug development). Validation results confirm the robustness of the proposed assessment methodology, which compares favorably with other classical methods based solely on the structural similarity of the compounds. ADAN's characteristics make the method well suited for estimating the quality of drug predictions obtained in extremely unfavorable conditions, like the prediction of drug toxicity end points.
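The scoring rule described in this abstract (count the criteria for which the query exceeds the training set's 95th percentile) can be sketched as follows. The six criteria themselves are not listed in the abstract, so the values below are placeholders. The function names and the linear-interpolation percentile are assumptions for illustration.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def unmet_criteria(query_values, training_values_per_criterion, q=95.0):
    """Number of criteria for which the query exceeds the training percentile."""
    return sum(
        1
        for value, train in zip(query_values, training_values_per_criterion)
        if value > percentile(train, q)
    )

# Placeholder data: six criteria, each with training values 0..100.
training_values = [list(range(101)) for _ in range(6)]
query_values = [10, 96, 50, 99, 95, 0]

score = unmet_criteria(query_values, training_values)
print(score)  # a value in 0..6, one of seven reliability categories
```

A score of 0 means the query looks like the training set on every criterion, while a score of 6 means it is out of range on all of them.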
Affiliation(s)
- Pau Carrió
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute) , Dr. Aiguader, 88, E-08003 Barcelona, Spain
|
131
|
Ovchinnikova SI, Bykov AA, Tsivadze AY, Dyachkov EP, Kireeva NV. Supervised extensions of chemography approaches: case studies of chemical liabilities assessment. J Cheminform 2014; 6:20. [PMID: 24868246 PMCID: PMC4018504 DOI: 10.1186/1758-2946-6-20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2013] [Accepted: 04/28/2014] [Indexed: 12/04/2022] Open
Abstract
Chemical liabilities, such as adverse effects and toxicity, play a significant role in the modern drug discovery process. In silico assessment of chemical liabilities is an important step aimed at reducing costs and animal testing by complementing or replacing in vitro and in vivo experiments. Herein, we propose an approach combining several classification and chemography methods to predict chemical liabilities and to interpret the obtained results in the context of the impact of structural changes of compounds on their pharmacological profile. To our knowledge for the first time, a supervised extension of Generative Topographic Mapping is proposed as an effective new chemography method. A new approach for mapping new data using supervised Isomap without rebuilding models from scratch is also proposed. Two approaches for estimating a model's applicability domain are used in our study, to our knowledge for the first time in chemoinformatics. The structural alerts responsible for the negative characteristics of the pharmacological profile of chemical compounds have been found as a result of model interpretation.
Affiliation(s)
- Svetlana I Ovchinnikova
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
| | - Arseniy A Bykov
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
| | - Aslan Yu Tsivadze
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
| | - Evgeny P Dyachkov
- Kurnakov Institute of General and Inorganic Chemistry RAS, Leninsky pr-t 31, 119071 Moscow, Russia
| | - Natalia V Kireeva
- Frumkin Institute of Physical Chemistry and Electrochemistry RAS, Leninsky pr-t 31-4, 119071 Moscow, Russia
- Moscow Institute of Physics and Technology, Institutsky per., 9, 141700 Dolgoprudny, Russia
|
132
|
Lewis RA, Wood D. Modern 2D QSAR for drug discovery. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2014. [DOI: 10.1002/wcms.1187] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/26/2023]
Affiliation(s)
- Richard A. Lewis
- Novartis Institutes for BioMedical Research; Novartis Pharma AG; Basel Switzerland
| | - David Wood
- Novartis Institutes for BioMedical Research; Novartis Horsham Research Centre; Horsham UK
|
133
|
Lexa KW, Dolghih E, Jacobson MP. A structure-based model for predicting serum albumin binding. PLoS One 2014; 9:e93323. [PMID: 24691448 PMCID: PMC3972100 DOI: 10.1371/journal.pone.0093323] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/20/2014] [Accepted: 03/04/2014] [Indexed: 11/21/2022] Open
Abstract
One of the many factors involved in determining the distribution and metabolism of a compound is the strength of its binding to human serum albumin. While experimental and QSAR approaches for determining binding to albumin exist, various factors limit their ability to provide accurate binding affinity for novel compounds. Thus, to complement the existing tools, we have developed a structure-based model of serum albumin binding. Our approach for predicting binding incorporated the inherent flexibility and promiscuity known to exist for albumin. We found that a weighted combination of the predicted logP and docking score most accurately distinguished between binders and nonbinders. This model was successfully used to predict serum albumin binding in a large test set of therapeutics that had experimental binding data.
Affiliation(s)
- Katrina W. Lexa
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
| | - Elena Dolghih
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
| | - Matthew P. Jacobson
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California, United States of America
|
134
|
Ruggiu F, Gizzi P, Galzi JL, Hibert M, Haiech J, Baskin I, Horvath D, Marcou G, Varnek A. Quantitative structure-property relationship modeling: a valuable support in high-throughput screening quality control. Anal Chem 2014; 86:2510-20. [PMID: 24479843 DOI: 10.1021/ac403544k] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Evaluation of important pharmacokinetic properties such as hydrophobicity by high-throughput screening (HTS) methods is a major issue in drug discovery. In this paper, we present measurements of the chromatographic hydrophobicity index (CHI) on a subset of the French chemical library Chimiothèque Nationale (CN). The data were used in quantitative structure-property relationship (QSPR) modeling in order to annotate the CN. An algorithm is proposed to detect problematic molecules with large prediction errors, called outliers. In order to find an explanation for these large discrepancies between predicted and experimental values, these compounds were reanalyzed experimentally. As the first selected outliers indeed had experimental problems, including hydrolysis or sheer absence of expected structure, we herewith propose the use of QSPR as a support tool for quality control of screening data and encourage cooperation between experimental and theoretical teams to improve results. The corrected data were used to produce a model, which is freely available on our web server at http://infochim.u-strasbg.fr/webserv/VSEngine.html .
Affiliation(s)
- Fiorella Ruggiu
- Laboratoire de Chémoinformatique, UMR 7140 CNRS, Université de Strasbourg, 1 rue Blaise Pascal, 67000 Strasbourg, France
|
135
|
Toplak M, Močnik R, Polajnar M, Bosnić Z, Carlsson L, Hasselgren C, Demšar J, Boyer S, Zupan B, Stålring J. Assessment of machine learning reliability methods for quantifying the applicability domain of QSAR regression models. J Chem Inf Model 2014; 54:431-41. [PMID: 24490838 DOI: 10.1021/ci4006595] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
The vastness of chemical space and the relatively small coverage by experimental data recording molecular properties require us to identify subspaces, or domains, for which we can confidently apply QSAR models. The prediction of QSAR models in these domains is reliable, and potential subsequent investigations of such compounds would find that the predictions closely match the experimental values. Standard approaches in QSAR assume that predictions are more reliable for compounds that are "similar" to those in subspaces with denser experimental data. Here, we report on a study of an alternative set of techniques recently proposed in the machine learning community. These methods quantify prediction confidence through estimation of the prediction error at the point of interest. Our study includes 20 public QSAR data sets with continuous response and assesses the quality of 10 reliability scoring methods by observing their correlation with prediction error. We show that these new alternative approaches can outperform standard reliability scores that rely only on similarity to compounds in the training set. The results also indicate that the quality of reliability scoring methods is sensitive to data set characteristics and to the regression method used in QSAR. We demonstrate that at the cost of increased computational complexity these dependencies can be leveraged by integration of scores from various reliability estimation approaches. The reliability estimation techniques described in this paper have been implemented in an open source add-on package ( https://bitbucket.org/biolab/orange-reliability ) to the Orange data mining suite.
Affiliation(s)
- Marko Toplak
- Faculty of Computer and Information Science, University of Ljubljana, Tržaška 25, 1000 Ljubljana, Slovenia
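The abstract above judges a reliability score by how well it correlates with the actual prediction error. A minimal Python sketch of that kind of check follows. It assumes Pearson correlation against the absolute error, which the abstract does not specify; the function names `pearson` and `score_quality` are hypothetical and are not taken from the orange-reliability package.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def score_quality(reliability_scores, observed, predicted):
    """Correlate a reliability score with the absolute prediction error.

    Assumption: a score that estimates error well should rise with
    |observed - predicted|, so a value near +1 indicates a useful score.
    """
    abs_errors = [abs(o - p) for o, p in zip(observed, predicted)]
    return pearson(reliability_scores, abs_errors)
```

A rank-based coefficient could be substituted for `pearson` if the score is only expected to order compounds by error rather than track it linearly.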
|
136
|
Singh S, Supuran CT. Chemometric modeling of breast cancer associated carbonic anhydrase IX inhibitors belonging to the ureido-substituted benzene sulfonamide class. J Enzyme Inhib Med Chem 2014; 29:877-83. [DOI: 10.3109/14756366.2013.864652] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]
|
137
|
Promkatkaew M, Gleeson D, Hannongbua S, Gleeson MP. Skin Sensitization Prediction Using Quantum Chemical Calculations: A Theoretical Model for the SNAr Domain. Chem Res Toxicol 2014; 27:51-60. [DOI: 10.1021/tx400323e] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/11/2022]
Affiliation(s)
- Malinee Promkatkaew
- Department of Chemistry, Faculty of Science, Kasetsart University, 50 Phaholyothin Road, Chatuchak, Bangkok 10900, Thailand
- Duangkamol Gleeson
- Department of Chemistry, Faculty of Science, King Mongkut’s Institute of Technology Ladkrabang, Bangkok 10520, Thailand
- Supa Hannongbua
- Department of Chemistry, Faculty of Science, Kasetsart University, 50 Phaholyothin Road, Chatuchak, Bangkok 10900, Thailand
- M. Paul Gleeson
- Department of Chemistry, Faculty of Science, Kasetsart University, 50 Phaholyothin Road, Chatuchak, Bangkok 10900, Thailand
|
138
|
Klepsch F, Vasanthanathan P, Ecker GF. Ligand and structure-based classification models for prediction of P-glycoprotein inhibitors. J Chem Inf Model 2014; 54:218-29. [PMID: 24050383 PMCID: PMC3904775 DOI: 10.1021/ci400289j] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Indexed: 01/21/2023]
Abstract
The ABC transporter P-glycoprotein (P-gp) actively transports a wide range of drugs and toxins out of cells, and is therefore related to multidrug resistance and the ADME profile of therapeutics. Thus, development of predictive in silico models for the identification of P-gp inhibitors is of great interest in the field of drug discovery and development. So far, in silico P-gp inhibitor prediction was dominated by ligand-based approaches because of the lack of high-quality structural information about P-gp. The present study aims at comparing the P-gp inhibitor/noninhibitor classification performance obtained by docking into a homology model of P-gp, to supervised machine learning methods, such as Kappa nearest neighbor, support vector machine (SVM), random forest, and binary QSAR, by using a large, structurally diverse data set. In addition, the applicability domain of the models was assessed using an algorithm based on Euclidean distance. Results show that random forest and SVM performed best for classification of P-gp inhibitors and noninhibitors, correctly predicting 73/75% of the external test set compounds. Classification based on the docking experiments using the scoring function ChemScore resulted in the correct prediction of 61% of the external test set. This demonstrates that ligand-based models currently remain the methods of choice for accurately predicting P-gp inhibitors. However, structure-based classification offers information about possible drug/protein interactions, which helps in understanding the molecular basis of ligand-transporter interaction and could therefore also support lead optimization.
Affiliation(s)
- Freya Klepsch
- University of Vienna, Department of Medicinal Chemistry, Althanstraße 14, 1090 Vienna, Austria
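The abstract above mentions assessing the applicability domain with an algorithm based on Euclidean distance, without giving details. One common way to do this is sketched below in Python, purely as an illustration: a compound counts as inside the domain when its mean distance to its k nearest training compounds does not exceed a threshold derived from the training set itself. The functions `mean_knn_distance`, `fit_domain_threshold` and `in_domain`, and the parameters `k` and `z`, are assumptions for this sketch and are not the authors' procedure.

```python
import math
import statistics


def mean_knn_distance(x, reference, k):
    """Mean Euclidean distance from descriptor vector x to its k nearest reference vectors."""
    distances = sorted(math.dist(x, r) for r in reference)
    return sum(distances[:k]) / k


def fit_domain_threshold(training, k=3, z=2.0):
    """Threshold = mean + z * SD of each training compound's k-NN distance.

    Each training compound is compared against all other training compounds
    (itself excluded). Requires len(training) > k.
    """
    scores = []
    for i, x in enumerate(training):
        others = training[:i] + training[i + 1:]
        scores.append(mean_knn_distance(x, others, k))
    return statistics.mean(scores) + z * statistics.pstdev(scores)


def in_domain(x, training, threshold, k=3):
    """True if x lies within the distance-based applicability domain."""
    return mean_knn_distance(x, training, k) <= threshold
```

Descriptors would normally be scaled before computing distances, since Euclidean distance is dominated by whichever descriptor has the largest numeric range.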
|
139
|
Sheridan RP. Using random forest to model the domain applicability of another random forest model. J Chem Inf Model 2013; 53:2837-50. [PMID: 24152204 DOI: 10.1021/ci400482e] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Indexed: 11/28/2022]
Abstract
In QSAR, a statistical model is generated from a training set of molecules (represented by chemical descriptors) and their biological activities. We will call this traditional type of QSAR model an "activity model". The activity model can be used to predict the activities of molecules not in the training set. A relatively new subfield for QSAR is domain applicability. The aim is to estimate the reliability of prediction of a specific molecule on a specific activity model. A number of different metrics have been proposed in the literature for this purpose. It is desirable to build a quantitative model of reliability against one or more of these metrics. We can call this an "error model". A previous publication from our laboratory (Sheridan J. Chem. Inf. Model., 2012, 52, 814-823.) suggested the simultaneous use of three metrics would be more discriminating than any one metric. An error model could be built in the form of a three-dimensional set of bins. When the number of metrics exceeds three, however, the bin paradigm is not practical. An obvious solution for constructing an error model using multiple metrics is to use a QSAR method, in our case random forest. In this paper we demonstrate the usefulness of this paradigm, specifically for determining whether a useful error model can be built and which metrics are most useful for a given problem. For the ten data sets and for the seven metrics we examine here, it appears that it is possible to construct a useful error model using only two metrics (TREE_SD and PREDICTED). These do not require calculating similarities/distances between the molecules being predicted and the molecules used to build the activity model, which can be rate-limiting.
Affiliation(s)
- Robert P Sheridan
- Cheminformatics Department, Merck Research Laboratories, RY800-D133, Rahway, New Jersey 07065, United States
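The two metrics named in the abstract above, TREE_SD and PREDICTED, can both be read off the individual tree outputs of a regression forest, with no similarity calculation against the training set. The Python sketch below shows one plausible reading: PREDICTED as the mean of the per-tree predictions and TREE_SD as their standard deviation. The function name `error_model_inputs` is hypothetical, and the use of the population standard deviation rather than the sample one is an assumption.

```python
import statistics


def error_model_inputs(per_tree_predictions):
    """Return (PREDICTED, TREE_SD) for one molecule.

    per_tree_predictions: the activity predicted for this molecule by each
    tree of a regression random forest.
    PREDICTED is taken as the ensemble mean; TREE_SD as the population
    standard deviation across trees (an assumption of this sketch).
    """
    predicted = statistics.mean(per_tree_predictions)
    tree_sd = statistics.pstdev(per_tree_predictions)
    return predicted, tree_sd
```

These two values per molecule would then serve as descriptors for a second model trained to predict the error of the first.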
|
140
|
Vuorinen A, Odermatt A, Schuster D. In silico methods in the discovery of endocrine disrupting chemicals. J Steroid Biochem Mol Biol 2013; 137:18-26. [PMID: 23688835 DOI: 10.1016/j.jsbmb.2013.04.009] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Received: 11/12/2012] [Revised: 04/03/2013] [Accepted: 04/07/2013] [Indexed: 11/27/2022]
Abstract
The prevalence of sex hormone-dependent cancers, reproductive problems, obesity, and cardiovascular complications has risen, especially in the Western world. It has been suggested that exposure to various endocrine disrupting chemicals (EDCs) contributes to the development and progression of these diseases. EDCs can interfere with various proteins: nuclear steroid hormone receptors, such as estrogen-, androgen-, glucocorticoid- and mineralocorticoid receptors (ER, AR, GR, MR), and enzymes that are involved in steroid hormone synthesis and metabolism, for example hydroxysteroid dehydrogenases (HSDs). Numerous chemicals are known as endocrine disruptors. However, the mechanism of action for most of these EDCs is still unknown. It is exhausting and time-consuming to test in vitro all chemicals - potential EDCs - used in industry, agriculture or as food preservatives for their effects on the endocrine system. Computational methods, such as virtual screening, quantitative structure activity relationships and docking, are already well recognized and used in drug development. The same methods could also aid the research on EDCs. So far, the computational methods in the search of EDCs have been retrospective. There are, however, some prospective studies reporting the use of in silico methods: five studies reporting the identification of previously unknown 17β-HSD3 inhibitors, MR agonists, and ER antagonists/agonists. This review provides an overview of case studies and in silico methods that are used in the search of EDCs. This article is part of a Special Issue entitled 'CSR 2013'.
Affiliation(s)
- Anna Vuorinen
- Institute of Pharmacy/Pharmaceutical Chemistry and Center for Molecular Biosciences Innsbruck - CMBI, University of Innsbruck, Innrain 80-82, 6020 Innsbruck, Austria
|
141
|
|
142
|
Shiraishi A, Niijima S, Brown JB, Nakatsui M, Okuno Y. Chemical genomics approach for GPCR-ligand interaction prediction and extraction of ligand binding determinants. J Chem Inf Model 2013; 53:1253-62. [PMID: 23721295 DOI: 10.1021/ci300515z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023]
Abstract
Chemical genomics research has revealed that G-protein coupled receptors (GPCRs) interact with a variety of ligands and that a large number of ligands are known to bind GPCRs even with low transmembrane (TM) sequence similarity. It is crucial to extract informative binding region propensities from large quantities of bioactivity data. To address this issue, we propose a machine learning approach that enables identification of both chemical substructures and amino acid properties that are associated with ligand binding, which can be applied to virtual ligand screening on a GPCR-wide scale. We also address the question of how to select plausible negative noninteraction pairs based on a statistical approach in order to develop reliable prediction models for GPCR-ligand interactions. The key interaction sites estimated by our approach can be of great use not only for screening of active compounds but also for modification of active compounds with the aim of improving activity or selectivity.
Affiliation(s)
- Akira Shiraishi
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto
|
143
|
Sheridan RP. Time-split cross-validation as a method for estimating the goodness of prospective prediction. J Chem Inf Model 2013; 53:783-90. [PMID: 23521722 DOI: 10.1021/ci400084k] [Citation(s) in RCA: 161] [Impact Index Per Article: 14.6] [Indexed: 11/28/2022]
Abstract
Cross-validation is a common method to validate a QSAR model. In cross-validation, some compounds are held out as a test set, while the remaining compounds form a training set. A model is built from the training set, and the test set compounds are predicted on that model. The agreement of the predicted and observed activity values of the test set (measured by, say, R²) is an estimate of the self-consistency of the model and is sometimes taken as an indication of the predictivity of the model. This estimate of predictivity can be optimistic or pessimistic compared to true prospective prediction, depending on how compounds in the test set are selected. Here, we show that time-split selection gives an R² that is more like that of true prospective prediction than the R² from random selection (too optimistic) or from our analog of leave-class-out selection (too pessimistic). Time-split selection should be used in addition to random selection as a standard for cross-validation in QSAR model building.
Affiliation(s)
- Robert P Sheridan
- Cheminformatics Department, Merck Research Laboratories, Rahway, New Jersey 07065, USA.
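The time-split selection described in the abstract above can be illustrated with a short Python sketch: compounds are ordered by the date they were assayed, the earliest form the training set and the most recent form the test set. The record layout (a `"date"` key), the `test_fraction` parameter and the function names are assumptions made for this illustration, and the R² shown is the coefficient of determination, which may differ from the definition used in the paper.

```python
from datetime import date


def time_split(records, test_fraction=0.25):
    """Split records chronologically: oldest go to training, newest to test.

    records: list of dicts, each with a "date" key (hypothetical layout).
    """
    if len(records) < 2:
        raise ValueError("need at least two records to split")
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    ordered = sorted(records, key=lambda r: r["date"])
    # Keep at least one record on each side of the split.
    n_test = min(len(ordered) - 1, max(1, round(len(ordered) * test_fraction)))
    return ordered[:-n_test], ordered[-n_test:]


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

A model would be fitted on the first returned list and scored with `r_squared` on the second; repeating the exercise with a random split on the same data gives the comparison the abstract describes.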
|
144
|
Wood DJ, Carlsson L, Eklund M, Norinder U, Stålring J. QSAR with experimental and predictive distributions: an information theoretic approach for assessing model quality. J Comput Aided Mol Des 2013; 27:203-19. [PMID: 23504478 PMCID: PMC3639359 DOI: 10.1007/s10822-013-9639-5] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 12/23/2012] [Accepted: 03/05/2013] [Indexed: 11/29/2022]
Abstract
We propose that quantitative structure–activity relationship (QSAR) predictions should be explicitly represented as predictive (probability) distributions. If both predictions and experimental measurements are treated as probability distributions, the quality of a set of predictive distributions output by a model can be assessed with Kullback–Leibler (KL) divergence: a widely used information theoretic measure of the distance between two probability distributions. We have assessed a range of different machine learning algorithms and error estimation methods for producing predictive distributions with an analysis against three of AstraZeneca’s global DMPK datasets. Using the KL-divergence framework, we have identified a few combinations of algorithms that produce accurate and valid compound-specific predictive distributions. These methods use reliability indices to assign predictive distributions to the predictions output by QSAR models so that reliable predictions have tight distributions and vice versa. Finally we show how valid predictive distributions can be used to estimate the probability that a test compound has properties that hit single- or multi- objective target profiles.
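To make the KL-divergence idea in the abstract above concrete, the following Python sketch computes the divergence between two univariate Gaussians in closed form. Treating both the experimental measurement and the prediction as Gaussians, and taking the experimental distribution as the reference P, are assumptions of this sketch; the function name `kl_gaussian` is hypothetical.

```python
import math


def kl_gaussian(mu_p, sigma_p, mu_q, sigma_q):
    """KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2).

    Closed form:
        ln(sigma_q / sigma_p)
        + (sigma_p^2 + (mu_p - mu_q)^2) / (2 * sigma_q^2)
        - 1/2
    """
    if sigma_p <= 0 or sigma_q <= 0:
        raise ValueError("standard deviations must be positive")
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )
```

The divergence is zero only when the two distributions coincide and is not symmetric, so a predictive distribution that is too narrow is penalised differently from one that is too wide.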
|
145
|
Singh S, Supuran CT. Chemometric QSAR modeling and in silico design of carbonic anhydrase inhibition of a coral secretory isoform by sulfonamide. Bioorg Med Chem 2013; 21:1495-502. [DOI: 10.1016/j.bmc.2012.09.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/16/2012] [Revised: 08/28/2012] [Accepted: 09/01/2012] [Indexed: 11/26/2022]
|
146
|
Keefer CE, Kauffman GW, Gupta RR. Interpretable, Probability-Based Confidence Metric for Continuous Quantitative Structure–Activity Relationship Models. J Chem Inf Model 2013; 53:368-83. [DOI: 10.1021/ci300554t] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 11/28/2022]
Affiliation(s)
- Gregory W. Kauffman
- Worldwide Medicinal Chemistry, Neuroscience Research Unit, Pfizer Inc., Cambridge, Massachusetts 02139, United States
|
147
|
Asadollahi-Baboli M. Straightforward MIA-QSTR evaluation of environmental toxicities of aromatic aldehydes to Tetrahymena pyriformis. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2013; 24:1041-1050. [PMID: 24313440 DOI: 10.1080/1062936x.2013.840678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Aldehydes are toxic environmental contaminants which cause severe health hazards. There is a growing need by industries and regulatory agencies for the development of tools able to assess the potential hazardous effects of chemicals on living organisms. In this background, multivariate image analysis combined with quantitative structure-toxicity relationships (MIA-QSTR) was used to evaluate the toxicity of aromatic aldehydes to Tetrahymena pyriformis. The techniques of genetic algorithm-partial least squares (GA-PLS) were applied effectively as MIA descriptor selection and mapping tools. In MIA-QSTR evaluation, pixels of 2D images of chemical structures could be used to recognize physicochemical information and predict changes in the toxicities. The resulting MIA-QSTR explains 90.3% leave-one-out predicted variance and 93.1% external predicted variance. The MIA-QSTR/GA-PLS performances were validated using various evaluation techniques such as cross-validation, applicability domain and Y-scrambling procedures, suggesting that the present methodology together with mechanistic interpretation may be useful to evaluate toxicity, safety and risk assessment of toxic environmental contaminants.
Affiliation(s)
- M Asadollahi-Baboli
- Department of Science, Babol University of Technology, Babol, Mazandaran, Iran
|
148
|
Abstract
Understanding structure-activity relationships (SARs) for a given set of molecules allows one to rationally explore chemical space and develop a chemical series optimizing multiple physicochemical and biological properties simultaneously, for instance, improving potency, reducing toxicity, and ensuring sufficient bioavailability. In silico methods allow rapid and efficient characterization of SARs and facilitate building a variety of models to capture and encode one or more SARs, which can then be used to predict activities for new molecules. By coupling these methods with in silico modifications of structures, one can easily prioritize large screening decks or even generate new compounds de novo and ascertain whether they belong to the SAR being studied. Computational methods can provide a guide for the experienced user by integrating and summarizing large amounts of preexisting data to suggest useful structural modifications. This chapter highlights the different types of SAR modeling methods and how they support the task of exploring chemical space to elucidate and optimize SARs in a drug discovery setting. In addition to considering modeling algorithms, I briefly discuss how to use databases as a source of SAR data to inform and enhance the exploration of SAR trends. I also review common modeling techniques that are used to encode SARs, recent work in the area of structure-activity landscapes, the role of SAR databases, and alternative approaches to exploring SAR data that do not involve explicit model development.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Advancing Translational Science, Rockville, MD, USA
|
149
|
QSRR Study on Flavor Compounds of Diverse Structures on Different Columns with the Help of New Chemometric Methods. Chromatographia 2012. [DOI: 10.1007/s10337-012-2349-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/26/2022]
|
150
|
Sun H, Veith H, Xia M, Austin CP, Tice RR, Huang R. Prediction of Cytochrome P450 Profiles of Environmental Chemicals with QSAR Models Built from Drug-like Molecules. Mol Inform 2012; 31:783-792. [PMID: 23459712 PMCID: PMC3583379 DOI: 10.1002/minf.201200065] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
The human cytochrome P450 (CYP) enzyme family is involved in the biotransformation of many xenobiotics. As part of the U.S. Tox21 Phase I effort, we profiled the CYP activity of approximately three thousand compounds, primarily those of environmental concern, against human CYP1A2, CYP2C19, CYP2C9, CYP2D6, and CYP3A4 isoforms in a quantitative high throughput screening (qHTS) format. In order to evaluate the extent to which computational models built from a drug-like library screened in these five CYP assays under the same conditions can accurately predict the outcome of an environmental compound library, five support vector machines (SVM) models built from over 17,000 drug-like compounds were challenged to predict the CYP activities of the Tox21 compound collection. Although a large fraction of the test compounds fall outside of the applicability domain (AD) of the models, as measured by k-nearest neighbor (k-NN) similarities, the predictions were largely accurate for CYP1A2, CYP2C9, and CYP3A4 isozymes with area under the receiver operator characteristic curves (AUC-ROC) ranging between 0.82 and 0.84. The lower predictive power of the CYP2C19 model (AUC-ROC = 0.76) is caused by experimental errors and that of the CYP2D6 model (AUC-ROC = 0.76) can be rescued by rebalancing the training data. Our results demonstrate that decomposing molecules into atom types enhanced the coverage of the AD and that computational models built from drug-like molecules can be used to predict the ability of non-drug-like compounds to interact with these CYPs.
Affiliation(s)
- Hongmao Sun
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
- Henrike Veith
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
- Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
- Christopher P. Austin
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
- Raymond R. Tice
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA
- Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
|