1
Yan Y, Wang W, Sun Z, Zhang JZH, Ji C. Protein-Ligand Empirical Interaction Components for Virtual Screening. J Chem Inf Model 2017; 57:1793-1806. [PMID: 28678484 DOI: 10.1021/acs.jcim.7b00017] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Indexed: 01/04/2023]
Abstract
A major shortcoming of empirical scoring functions is that they often fail to predict binding affinity properly. Removing false positives from docking results is one of the most challenging tasks in structure-based virtual screening. Postdocking filters, which make use of all kinds of experimental structure and activity information, may help solve the issue. We describe a new method based on detailed protein-ligand interaction decomposition and machine learning. Protein-ligand empirical interaction components (PLEIC) are used as descriptors for support vector machine learning to develop a classification model (PLEIC-SVM) that discriminates false positives from true positives. Experimentally derived activity information is used for model training. An extensive benchmark study on 36 diverse data sets from the DUD-E database has been performed to evaluate the performance of the new method. The results show that the new method performs much better than standard empirical scoring functions in structure-based virtual screening. The trained PLEIC-SVM model is able to capture important interaction patterns between ligand and protein residues for one specific target, which is helpful in discarding false positives in postdocking filtering.
Affiliation(s)
- Yuna Yan
- Shanghai Engineering Research Center for Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Weijun Wang
- Shanghai Engineering Research Center for Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Zhaoxi Sun
- Shanghai Engineering Research Center for Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- John Z H Zhang
- Shanghai Engineering Research Center for Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Changge Ji
- Shanghai Engineering Research Center for Molecular Therapeutics and New Drug Development, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China; State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China; NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
2
Chen D, Xing G, Yao J, Zhou H. Construction of highly functionalized naphthalenes using an in situ ene–allene strategy. RSC Adv 2016. [DOI: 10.1039/c6ra21889j] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
Construction of highly functionalized naphthalene derivatives remains a challenging task for organic chemists because of the effect of the substituent.
Affiliation(s)
- Dianpeng Chen
- College of Biological, Chemical Sciences and Engineering, Jiaxing University, Jiaxing 314001, People's Republic of China
- Gangdong Xing
- Department of Chemistry, Zhejiang University (Campus Xixi), Hangzhou 310028, People's Republic of China
- Jinzhong Yao
- College of Biological, Chemical Sciences and Engineering, Jiaxing University, Jiaxing 314001, People's Republic of China
- Hongwei Zhou
- College of Biological, Chemical Sciences and Engineering, Jiaxing University, Jiaxing 314001, People's Republic of China
3
Korkmaz S, Zararsiz G, Goksuluk D. Drug/nondrug classification using Support Vector Machines with various feature selection strategies. Computer Methods and Programs in Biomedicine 2014; 117:51-60. [PMID: 25224081 DOI: 10.1016/j.cmpb.2014.08.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 05/02/2014] [Revised: 08/15/2014] [Accepted: 08/27/2014] [Indexed: 06/03/2023]
Abstract
With advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Since there are thousands of compounds in the early phase of drug discovery, a fast classification method that can distinguish between active and inactive molecules can be used for screening large compound collections. In this study, we used Support Vector Machines (SVM) for this type of classification task. SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data sets consist of 631 compounds for the training set and 216 compounds for a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After application of the correlation filter, a single SVM was applied to this reduced data set. Moreover, we investigated the performance of SVM with different feature selection strategies, including SVM-Recursive Feature Elimination, Wrapper Method and Subset Selection. All feature selection methods generally perform better than a single SVM, while Subset Selection outperforms the other feature selection methods. We tested SVM as a classification tool in a real-life drug discovery problem, and our results revealed that it could be a useful method for classification tasks in the early phase of drug discovery.
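The correlation filter mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not give the threshold, the descriptors, or the order in which features were compared, so the greedy strategy, the 0.9 cutoff, and the descriptor names below are all assumptions chosen for illustration, not the authors' procedure.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / (sx * sy)

def correlation_filter(columns, threshold=0.9):
    """Greedily keep columns whose |r| with every kept column is below threshold.

    `columns` maps a descriptor name to its values (one value per compound).
    The threshold is an assumed value, not taken from the paper.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical descriptors for five compounds (invented numbers).
descriptors = {
    "mol_weight": [180.0, 250.0, 310.0, 420.0, 150.0],
    "heavy_atoms": [13.0, 18.0, 22.0, 30.0, 11.0],  # nearly proportional to mol_weight
    "h_bond_donors": [1.0, 3.0, 0.0, 2.0, 4.0],
}
selected = correlation_filter(descriptors)
print(selected)
```

In this toy input the heavy-atom count is almost a rescaled copy of molecular weight, so it is dropped as redundant, while the hydrogen-bond donor count is retained. The reduced descriptor set would then be passed to whatever classifier is being trained.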
Affiliation(s)
- Selcuk Korkmaz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
- Gokmen Zararsiz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
- Dincer Goksuluk
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
4
5
Snyder RD, Holt PA, Maguire JM, Trent JO. Prediction of noncovalent Drug/DNA interaction using computational docking models: studies with over 1350 launched drugs. Environmental and Molecular Mutagenesis 2013; 54:668-681. [PMID: 23893771 DOI: 10.1002/em.21796] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 04/01/2013] [Revised: 05/11/2013] [Accepted: 05/29/2013] [Indexed: 06/02/2023]
Abstract
Noncovalent chemical/DNA interactions, for example, intercalation and groove-binding, may be more important to genomic integrity than previously appreciated, and there may very well be genotoxic consequences of that binding. It is of importance, then, to develop methods allowing a determination or prediction of such interactions. This would have particular utility in the pharmaceutical industry where genotoxicity is, for the most part, disallowed in new drug entities. We have previously used DNA docking simulations to assess if molecules had structure and charge characteristics which could accommodate noncovalent binding via, for example, electrostatic/hydrogen bonding. We here extend those earlier studies by examining a series of over 1,350 "launched" drugs for ability to noncovalently bind 10 different DNA sequences using two computational programs: Autodock and Surflex. These drugs were also evaluated for binding to the crystallographic ATP-binding site of human topoisomerase II. The results obtained clearly demonstrate multiple series of noncovalent DNA binding structure activity relationships which would not have been predicted based on cursory structural examination. Many drugs within these series are genotoxic although not via any commonly recognized structural covalent alerts. The present studies confirm previously implicated features such as N-dialkyl groups and specific N-aryl ketones as potential genotoxic chemical moieties acting through noncovalent mechanisms. These initial studies provide considerable evidence that DNA intercalation may be an important, largely overlooked, source of drug-induced genotoxicity and further suggest involvement of topoisomerase in that genotoxicity.
6
Abstract
Frequent failure of drug candidates during development stages remains the major deterrent for an early introduction of new drug molecules. The drug toxicity is the major cause of expensive late-stage development failures. An early identification/optimization of the most favorable molecule will naturally save considerable cost, time, human efforts and minimize animal sacrifice. (Quantitative) Structure Activity Relationships [(Q)SARs] represent statistically derived predictive models correlating biological activity (including desirable therapeutic effect and undesirable side effects) of chemicals (drugs/toxicants/environmental pollutants) with molecular descriptors and/or properties. (Q)SAR models which categorize the available data into two or more groups/classes are known as classification models. Numerous techniques of diverse nature are being presently employed for development of classification models. Though there is an increasing use of classification models for prediction of either biological activity or toxicity, the future trend will naturally be towards the development of classification models capable of simultaneous prediction of biological activity, toxicity, and pharmacokinetic parameters so as to accelerate development of bioavailable safe drug molecules.
7
Sato T, Yuki H, Takaya D, Sasaki S, Tanaka A, Honma T. Application of Support Vector Machine to Three-Dimensional Shape-Based Virtual Screening Using Comprehensive Three-Dimensional Molecular Shape Overlay with Known Inhibitors. J Chem Inf Model 2012; 52:1015-26. [DOI: 10.1021/ci200562p] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Affiliation(s)
- Tomohiro Sato
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Hitomi Yuki
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Daisuke Takaya
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Shunta Sasaki
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Akiko Tanaka
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Teruki Honma
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
8
Abstract
BACKGROUND: Drug repositioning is a current strategy to find new uses for existing drugs, patented or not, and for late-stage candidates that failed for lack of efficacy. RESULTS: In silico profiling of several marketed drugs (methadone, rapamycin, saquinavir and telmisartan) was performed, exploiting a vast amount of published information. Similar compounds were assessed in terms of target-activity profiles for major drug-target families. In silico profiles were visualized within an interactive heat map and detailed analysis was performed associated with the accessible current knowledge. CONCLUSION: Based on a basic principle assuming that similar molecules share similar target activity, new potential targets and, therefore, opportunities of potential new indications have been identified and discussed.
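The similarity principle stated in the conclusion can be sketched as a nearest-neighbour lookup over fingerprints. The abstract does not say which similarity measure or cutoff was used, so the Tanimoto coefficient, the 0.5 threshold, and every compound, bit index, and target label below are assumptions made purely for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def infer_targets(query_fp, reference, min_similarity=0.5):
    """Collect targets annotated on reference compounds similar to the query.

    Returns {target: highest similarity among the supporting compounds}.
    """
    scores = {}
    for fp, targets in reference.values():
        sim = tanimoto(query_fp, fp)
        if sim >= min_similarity:
            for target in targets:
                scores[target] = max(scores.get(target, 0.0), sim)
    return scores

# Toy reference set: fingerprints and target labels are invented.
reference = {
    "cmpd_A": ({1, 2, 3, 4}, {"target_X"}),
    "cmpd_B": ({1, 2, 3, 9}, {"target_X", "target_Y"}),
    "cmpd_C": ({7, 8}, {"target_Z"}),
}
query = {1, 2, 3, 4, 5}
predicted = infer_targets(query, reference)
print(predicted)
```

Here the query shares four of five bits with cmpd_A (similarity 0.8) and three of six with cmpd_B (similarity 0.5), so target_X and target_Y are proposed, while target_Z is not because cmpd_C shares no bits with the query.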
9
Pouliot Y, Chiang AP, Butte AJ. Predicting adverse drug reactions using publicly available PubChem BioAssay data. Clin Pharmacol Ther 2011; 90:90-9. [PMID: 21613989 DOI: 10.1038/clpt.2011.81] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Indexed: 12/24/2022]
Abstract
Adverse drug reactions (ADRs) can have severe consequences, and therefore the ability to predict ADRs prior to market introduction of a drug is desirable. Computational approaches applied to preclinical data could be one way to inform drug labeling and marketing with respect to potential ADRs. Based on the premise that some of the molecular actors of ADRs involve interactions that are detectable in large, and increasingly public, compound screening campaigns, we generated logistic regression models that correlate postmarketing ADRs with screening data from the PubChem BioAssay database. These models analyze ADRs at the level of organ systems, using the system organ classes (SOCs). Of the 19 SOCs under consideration, nine were found to be significantly correlated with preclinical screening data. With regard to six of the eight established drugs for which we could retropredict SOC-specific ADRs, prior knowledge was found that supports these predictions. We conclude this paper by predicting that SOC-specific ADRs will be associated with three unapproved or recently introduced drugs.
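As a rough illustration of the modelling approach named in this abstract, the sketch below fits a logistic regression that relates binary assay outcomes to the presence of an ADR in one system organ class. The data, the two assay columns, the learning rate, and the use of plain batch gradient descent are all assumptions; the abstract does not describe how the authors fitted their models.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict_proba(w, b, x):
    """Predicted probability that the ADR class is present."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

# Invented data. Rows are drugs; columns are active (1) or inactive (0)
# in two hypothetical screening assays.
X = [[1, 0], [1, 1], [0, 0], [0, 1], [1, 0], [0, 0]]
y = [1, 1, 0, 0, 1, 0]  # 1 = ADR reported in the organ class of interest
w, b = fit_logistic(X, y)
print(predict_proba(w, b, [1, 0]), predict_proba(w, b, [0, 1]))
```

In this toy set only the first assay tracks the label, so the fitted weight on that column is positive and a drug active in it receives a predicted probability above 0.5.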
Affiliation(s)
- Y Pouliot
- Division of Systems Medicine, Department of Pediatrics, Stanford University School of Medicine, Stanford, California, USA
10
Koutsoukas A, Simms B, Kirchmair J, Bond PJ, Whitmore AV, Zimmer S, Young MP, Jenkins JL, Glick M, Glen RC, Bender A. From in silico target prediction to multi-target drug design: current databases, methods and applications. J Proteomics 2011; 74:2554-74. [PMID: 21621023 DOI: 10.1016/j.jprot.2011.05.011] [Citation(s) in RCA: 186] [Impact Index Per Article: 14.3] [Received: 01/21/2011] [Revised: 04/10/2011] [Accepted: 05/06/2011] [Indexed: 01/31/2023]
Abstract
Given the tremendous growth of bioactivity databases, the use of computational tools to predict protein targets of small molecules has been gaining importance in recent years. Applications span a wide range, from the 'designed polypharmacology' of compounds to mode-of-action analysis. In this review, we firstly survey databases that can be used for ligand-based target prediction and which have grown tremendously in size in the past. We furthermore outline methods for target prediction that exist, both based on the knowledge of bioactivities from the ligand side and methods that can be applied in situations when a protein structure is known. Applications of successful in silico target identification attempts are discussed in detail, which were based partly or in whole on computational target predictions in the first instance. This includes the authors' own experience using target prediction tools, in this case considering phenotypic antibacterial screens and the analysis of high-throughput screening data. Finally, we will conclude with the prospective application of databases to not only predict, retrospectively, the protein targets of a small molecule, but also how to design ligands with desired polypharmacology in a prospective manner.
Affiliation(s)
- Alexios Koutsoukas
- Unilever Centre for Molecular Sciences Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
11
Snyder RD. Possible structural and functional determinants contributing to the clastogenicity of pharmaceuticals. Environmental and Molecular Mutagenesis 2010; 51:800-814. [PMID: 20872827 DOI: 10.1002/em.20626] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 05/29/2023]
Abstract
The battery of regulatory genotoxicity studies required in support of new drug registration has remained largely static since its inception some thirty years ago. The Ames mutagenicity assay still forms the foundation for this testing, being highly reproducible and, for the most part, transparent. The same is not necessarily true of in vitro mammalian chromosome aberration assays, since there is a fairly large and growing number of molecules without clear structural genotoxicity alerts (DEREK, MCASE) which are negative in Ames testing but positive in aberration studies, often only at high concentrations and/or cytotoxicity. Interpretation and risk assessment of these positive results can be problematic since there is no clear understanding of the process that generates them. The present paper builds on our previous observations suggesting that noncovalent drug/DNA interactions, which are not adequately modeled in computational programs, may help explain some of these unexpected positive results. In particular, it is suggested that N-dimethyl groups and certain pyridine/piperidine aryl ketones may play a contributory role in genotoxicity, perhaps via DNA intercalation and topoisomerase inhibition. Clastogenicity arising from topoisomerase inhibition would be expected to be a threshold phenomenon and, as such, may carry a distinctly reduced risk relative to clastogenicity associated with covalent drug/DNA interactions.
12
Sakiyama Y. The use of machine learning and nonlinear statistical tools for ADME prediction. Expert Opin Drug Metab Toxicol 2010; 5:149-69. [PMID: 19239395 DOI: 10.1517/17425250902753261] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]
Abstract
Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning and nonlinear statistical tools have been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it is a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.
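To give a feel for why a nonlinear learner helps on a small synthetic 2D data set, the sketch below applies k nearest neighbour, one of the six methods listed, to an XOR-style toy problem that no single straight line can separate. The data points and the choice of k = 3 are invented for illustration and are not the data set devised in the paper.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# XOR-style toy data: label 1 when exactly one coordinate is positive.
train = [
    ((-1.0, -1.0), 0), ((-0.8, -1.2), 0), ((-1.2, -0.9), 0),
    ((1.0, 1.0), 0), ((0.9, 1.2), 0), ((1.1, 0.8), 0),
    ((-1.0, 1.0), 1), ((-0.9, 1.1), 1), ((-1.2, 0.8), 1),
    ((1.0, -1.0), 1), ((1.1, -0.9), 1), ((0.8, -1.2), 1),
]

for point in [(0.9, 0.9), (-1.0, 0.9), (1.0, -1.1)]:
    print(point, knn_predict(train, point))
```

Because the vote is purely local, each query inherits the label of the cluster it falls in, which is how the method handles the alternating quadrant pattern without an explicit decision boundary.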
Affiliation(s)
- Yojiro Sakiyama
- Pharmacokinetics Dynamics Metabolism, Pfizer Global Research and Development, Sandwich Laboratories, Kent, UK.
13
Sato T, Honma T, Yokoyama S. Combining Machine Learning and Pharmacophore-Based Interaction Fingerprint for in Silico Screening. J Chem Inf Model 2009; 50:170-85. [PMID: 20038188 DOI: 10.1021/ci900382e] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Indexed: 01/30/2023]
Affiliation(s)
- Tomohiro Sato
- Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan, and RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Teruki Honma
- Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan, and RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Shigeyuki Yokoyama
- Department of Biophysics and Biochemistry, Graduate School of Science, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-0033, Japan, and RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
14
Kawai K, Takahashi Y. Identification of the Dual Action Antihypertensive Drugs Using TFS-Based Support Vector Machines. Chem-Bio Informatics Journal 2009. [DOI: 10.1273/cbij.9.41] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Affiliation(s)
- Kentaro Kawai
- Department of Knowledge-Based Information Engineering, Toyohashi University of Technology
- Kaken Pharmaceutical
- Yoshimasa Takahashi
- Department of Knowledge-Based Information Engineering, Toyohashi University of Technology