1
Kashyap K, Mahapatra PP, Ahmed S, Buyukbingol E, Siddiqi MI. Identification of Potential Aldose Reductase Inhibitors Using Convolutional Neural Network-Based in Silico Screening. J Chem Inf Model 2023; 63:6261-6282. [PMID: 37788831 DOI: 10.1021/acs.jcim.3c00547] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/05/2023]
Abstract
Aldose reductase (ALR2) is a notable enzyme of the polyol pathway responsible for aggravating diabetic neuropathy complications. In the first step of the pathway, it catalyzes the reduction of glucose to sorbitol with NADPH as a coenzyme. Elevated concentrations of sorbitol damage the tissues, leading to complications like neuropathy. Though considerable effort has been directed toward the discovery of potent inhibitors, success remains elusive. To this end, we present a 3D convolutional neural network (3D-CNN) based ALR2 inhibitor classification technique that takes as input snapshots of images captured from 3D chemical structures under multiple rotations. The CNN-based architecture was trained on the 360 sets of image data along each axis, and each of the resulting models was then used for prediction on the Maybridge library. Subjecting the retrieved hits to molecular docking led to the identification of the top 10 molecules with high binding affinity. The hits displayed a better blood-brain barrier (BBB) penetration score (90% with a score of more than four) than standard inhibitors (38%), reflecting the superior BBB-penetrating efficiency of the hits. Following molecular docking, biological evaluation highlighted five compounds as promising ALR2 inhibitors that can be considered likely prospects for further structural optimization through medicinal chemistry efforts to improve their inhibition efficacy and consolidate them as new ALR2 antagonists. In addition, the study demonstrated the usefulness of scaffold analysis of the molecules as a method for investigating the significance of structurally diverse compounds in data-driven studies. For reproducibility and accessibility, all of the source code used in our study is publicly available.
Affiliation(s)
- Kushagra Kashyap
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow 226031, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
- Pinaki Prasad Mahapatra
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow 226031, India
- Shakil Ahmed
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow 226031, India
- Erdem Buyukbingol
- Department of Pharmaceutical Chemistry, Faculty of Pharmacy, Ankara University, 06100 Ankara, Turkey
- Mohammad Imran Siddiqi
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute, Sector 10, Jankipuram Extension, Sitapur Road, Lucknow 226031, India
- Academy of Scientific and Innovative Research (AcSIR), Ghaziabad 201002, India
2
Kashyap K, Panigrahi L, Ahmed S, Siddiqi MI. Artificial neural network models driven novel virtual screening workflow for the identification and biological evaluation of BACE1 inhibitors. Mol Inform 2023; 42:e2200113. [PMID: 36460626 DOI: 10.1002/minf.202200113] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Revised: 11/11/2022] [Accepted: 12/02/2022] [Indexed: 12/04/2022]
Abstract
Beta-site amyloid-β precursor protein-cleaving enzyme 1 (BACE1) is a transmembrane aspartic protease and has shown potential as a possible therapeutic target for Alzheimer's disease. This aggravating disease involves the aberrant production of β amyloid plaques by BACE1 which catalyzes the rate-limiting step by cleaving the amyloid precursor protein (APP), generating the neurotoxic amyloid β protein that aggregates to form plaques leading to neurodegeneration. Therefore, it is indispensable to inhibit BACE1, thus modulating the APP processing. In this study, we present a workflow that utilizes a multi-stage virtual screening protocol for identifying potential BACE1 inhibitors by employing multiple artificial neural network-based models. Collectively, all the hyperparameter tuned models were assigned a task to virtually screen Maybridge library, thus yielding a consensus of 41 hits. The majority of these hits exhibited optimal pharmacokinetic properties confirmed by high central nervous system multiparameter optimization (CNS-MPO) scores. Further shortlisting of 8 compounds by molecular docking into the active site of BACE1 and their subsequent in-vitro evaluation identified 4 compounds as potent BACE1 inhibitors with IC50 values falling in the range 0.028-0.052 μM and can be further optimized with medicinal chemistry efforts to improve their activity.
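The consensus step described in this abstract can be sketched in a few lines of Python. This is a minimal illustration, assuming each tuned model yields a set of compound identifiers it predicts as active. The function name `consensus_hits`, the model names and the compound IDs are hypothetical and not taken from the paper, and the authors' actual consensus rule may differ (for example, majority voting rather than strict agreement).

```python
from collections import Counter


def consensus_hits(predictions, min_votes=None):
    """Return compound IDs predicted active by at least `min_votes` models.

    predictions: dict mapping a (hypothetical) model name to the set of
    compound IDs that model flags as active.
    min_votes: number of agreeing models required; defaults to all models.
    """
    if not predictions:
        return set()
    if min_votes is None:
        min_votes = len(predictions)
    # Count, for every compound, how many models flagged it as active.
    votes = Counter(cid for hits in predictions.values() for cid in set(hits))
    return {cid for cid, n in votes.items() if n >= min_votes}


# Hypothetical predictions from three models on a screening library.
predictions = {
    "model_a": {"CPD-001", "CPD-002", "CPD-003"},
    "model_b": {"CPD-002", "CPD-003", "CPD-004"},
    "model_c": {"CPD-003", "CPD-002", "CPD-005"},
}
print(sorted(consensus_hits(predictions)))
```

Requiring agreement from every model trades recall for precision; lowering `min_votes` relaxes the rule toward majority voting.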
Affiliation(s)
- Kushagra Kashyap
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute (CSIR-CDRI), Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
- Lalita Panigrahi
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute (CSIR-CDRI), Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, India
- Shakil Ahmed
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute (CSIR-CDRI), Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, India
- Mohammad Imran Siddiqi
- Biochemistry and Structural Biology Division, CSIR-Central Drug Research Institute (CSIR-CDRI), Sector 10, Jankipuram Extension, Sitapur Road, Lucknow, 226031, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, 201002, India
3
Fayyazi N, Mostashari-Rad T, Ghasemi JB, Ardakani MM, Kobarfard F. Molecular dynamics simulation, 3D-pharmacophore and scaffold hopping analysis in the design of multi-target drugs to inhibit potential targets of COVID-19. J Biomol Struct Dyn 2022; 40:11787-11808. [PMID: 34405765 DOI: 10.1080/07391102.2021.1965914] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/24/2022]
Abstract
SARS-CoV-2 has posed a serious threat to health and has inflicted huge costs worldwide. Discovering potent compounds is a critical step toward inhibiting the coronavirus. 3CLpro and RdRp are the most conserved targets associated with COVID-19. In this study, three-dimensional pharmacophore modeling, scaffold hopping, molecular docking, structure-based virtual screening, QSAR-based ADMET predictions and molecular dynamics analysis were used to identify inhibitors for these targets. Binding free energies estimated by molecular docking for each ligand in different binding sites of RdRp were used to predict the active site. Previously reported active 3CLpro and RdRp inhibitors were used to build a pharmacophore model to develop different scaffolds. Structure-based simulations and pharmacophore modeling based on the HipHop algorithm converged on a state suggesting that hydrogen bond acceptor and donor features have a critical role in the two binding sites. Further validations indicated that the best pharmacophore model has fairly good correlation values compared with approved inhibitors. Structure-based simulation results confirmed that Glu166 and Gln189 in 3CLpro, and Lys551 and Glu811 in RdRp, are critical residues for dual activity. Ten compounds were extracted from pharmacophore-based virtual screening of six databases. The results, obtained by a repurposing approach, suggest the effectiveness of these ten compounds with different scaffolds as possible inhibitors of the two targets. Some quinoline-based hybrid derivatives were also designed. The QSAR descriptors plot predicted that the scaffolds have acceptable pharmacokinetic profiles. Multiple 100 ns molecular dynamics simulations and MM/PBSA studies of some reference inhibitors and the novel compounds in complex with both targets demonstrated stable complexes and confirmed the interaction modes. Based on different computational methods, COVID-19 multi-target inhibitors are proposed. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Neda Fayyazi
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan, Iran; Phytochemistry Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Tahereh Mostashari-Rad
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan, Iran; Phytochemistry Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Jahan B Ghasemi
- College of Sciences, Faculty of Chemistry, University of Tehran, Tehran, Iran
- Mehran Mirabzadeh Ardakani
- Department of Traditional Pharmacy, Faculty of Pharmacy, Tehran University of Medical Science, Tehran, Iran
- Farzad Kobarfard
- Phytochemistry Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran; Department of Medicinal Chemistry, School of Pharmacy, Shahid Beheshti University of Medical Sciences, Tehran, Iran
4
Rodríguez-Pérez R, Bajorath J. Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery. J Comput Aided Mol Des 2022; 36:355-362. [PMID: 35304657 PMCID: PMC9325859 DOI: 10.1007/s10822-022-00442-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 02/08/2022] [Accepted: 02/15/2022] [Indexed: 11/05/2022]
Abstract
The support vector machine (SVM) algorithm is one of the most widely used machine learning (ML) methods for predicting active compounds and molecular properties. In chemoinformatics and drug discovery, SVM has been a state-of-the-art ML approach for more than a decade. A unique attribute of SVM is that it operates in feature spaces of increasing dimensionality. Hence, SVM conceptually departs from the paradigm of low dimensionality that applies to many other methods for chemical space navigation. The SVM approach is applicable to compound classification, and ranking, multi-class predictions, and –in algorithmically modified form– regression modeling. In the emerging era of deep learning (DL), SVM retains its relevance as one of the premier ML methods in chemoinformatics, for reasons discussed herein. We describe the SVM methodology including strengths and weaknesses and discuss selected applications that have contributed to the evolution of SVM as a premier approach for compound classification, property predictions, and virtual compound screening.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
5
Brown BP, Vu O, Geanes AR, Kothiwale S, Butkiewicz M, Lowe EW, Mueller R, Pape R, Mendenhall J, Meiler J. Introduction to the BioChemical Library (BCL): An Application-Based Open-Source Toolkit for Integrated Cheminformatics and Machine Learning in Computer-Aided Drug Discovery. Front Pharmacol 2022; 13:833099. [PMID: 35264967 PMCID: PMC8899505 DOI: 10.3389/fphar.2022.833099] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/10/2021] [Accepted: 01/24/2022] [Indexed: 01/31/2023]
Abstract
The BioChemical Library (BCL) cheminformatics toolkit is an application-based academic open-source software package designed to integrate traditional small molecule cheminformatics tools with machine learning-based quantitative structure-activity/property relationship (QSAR/QSPR) modeling. In this pedagogical article we provide a detailed introduction to core BCL cheminformatics functionality, showing how traditional tasks (e.g., computing chemical properties, estimating druglikeness) can be readily combined with machine learning. In addition, we have included multiple examples covering areas of advanced use, such as reaction-based library design. We anticipate that this manuscript will be a valuable resource for researchers in computer-aided drug discovery looking to integrate modular cheminformatics and machine learning tools into their pipelines.
Affiliation(s)
- Benjamin P. Brown
- Chemical and Physical Biology Program, Medical Scientist Training Program, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- *Correspondence: Jens Meiler; Jeffrey Mendenhall; Benjamin P. Brown
- Oanh Vu
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Alexander R. Geanes
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Sandeepkumar Kothiwale
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Mariusz Butkiewicz
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Edward W. Lowe
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Ralf Mueller
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Richard Pape
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Jeffrey Mendenhall
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- *Correspondence: Jens Meiler; Jeffrey Mendenhall; Benjamin P. Brown
- Jens Meiler
- Department of Chemistry, Departments of Pharmacology and Biomedical Informatics, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Germany
- *Correspondence: Jens Meiler; Jeffrey Mendenhall; Benjamin P. Brown
6
Podlewska S, Kurczab R. Mutual Support of Ligand- and Structure-Based Approaches-To What Extent We Can Optimize the Power of Predictive Model? Case Study of Opioid Receptors. Molecules 2021; 26:1607. [PMID: 33799356 PMCID: PMC7998793 DOI: 10.3390/molecules26061607] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Revised: 03/10/2021] [Accepted: 03/11/2021] [Indexed: 11/16/2022]
Abstract
The process of modern drug design would not exist in the current form without computational methods. They are part of every stage of the drug design pipeline, supporting the search and optimization of new bioactive substances. Nevertheless, despite the great help that is offered by in silico strategies, the power of computational methods strongly depends on the input data supplied at the stage of the predictive model construction. The studies on the efficiency of the computational protocols most often focus on global efficiency. They use general parameters that refer to the whole dataset, such as accuracy, precision, mean squared error, etc. In the study, we examined machine learning predictions obtained for opioid receptors (mu, kappa, delta) and focused on cases for which the predictions were the most accurate and the least accurate. Moreover, by using docking, we tried to explain prediction errors. We attempted to develop a rule of thumb, which can help in the prediction of compound activity towards opioid receptors via docking, especially those that have been incorrectly predicted by machine learning. We found out that although the combination of ligand- and structure-based path can be beneficial for the prediction accuracy, there still remain cases that cannot be reliably predicted by any available modeling method. In addition to challenging ligand- and structure-based predictions, we also examined the role of the application of machine-learning methods in comparison to simple statistical methods for both standard ligand-based representations (molecular fingerprints) and interaction fingerprints. All approaches were confronted in both classification (where compounds were assigned to the group of active and inactive group constructed on the basis of Ki values) and regression (where exact Ki value was predicted) experiments.
Affiliation(s)
- Sabina Podlewska
- Department of Technology and Biotechnology of Drugs, Jagiellonian University, Medical College, 9 Medyczna Street, 30-688 Cracow, Poland
- Maj Institute of Pharmacology, Polish Academy of Sciences, 12 Smętna Street, 31-343 Cracow, Poland
- Rafał Kurczab
- Maj Institute of Pharmacology, Polish Academy of Sciences, 12 Smętna Street, 31-343 Cracow, Poland
- Correspondence: Tel.: +48-1266-23-301
7
Shi C, Dong F, Zhao G, Zhu N, Lao X, Zheng H. Applications of machine-learning methods for the discovery of NDM-1 inhibitors. Chem Biol Drug Des 2020; 96:1232-1243. [PMID: 32418370 DOI: 10.1111/cbdd.13708] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/22/2020] [Revised: 04/25/2020] [Accepted: 05/06/2020] [Indexed: 12/11/2022]
Abstract
The emergence of New Delhi metallo-β-lactamase (NDM-1)-producing bacteria and their worldwide spread pose great challenges for the treatment of drug-resistant bacterial infections. These bacteria can hydrolyze most β-lactam antibacterials. Unfortunately, there are no clinically useful NDM-1 inhibitors. In the current work, we manually collected NDM-1 inhibitors reported in the past decade and established the first NDM-1 inhibitor database. Four machine-learning models were constructed using the structural and property characteristics of the collected compounds as the input training set to discover potential NDM-1 inhibitors. To distinguish between highly active inhibitors and putative positive drugs, a three-class classification strategy was introduced in our study: the commonly used positive and negative divisions were converted into strongly active, weakly active, and inactive. The accuracy of the best prediction model designed based on this strategy reached 90.5%, compared with 69.14% achieved by the traditional docking-based virtual screening method. Consequently, the best model was used to virtually screen a natural product library. The safety of the selected compounds was analyzed by an ADMET prediction model based on machine learning. Seven novel NDM-1 inhibitors were identified, which will provide valuable clues for the discovery of NDM-1 inhibitors.
Affiliation(s)
- Cheng Shi
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Fanyi Dong
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Guiling Zhao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Ning Zhu
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Xingzhen Lao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
- Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China
8
Fayyazi N, Esmaeili S, Taheri S, Ribeiro FF, Scotti MT, Scotti L, Ghasemi JB, Saghaei L, Fassihi A. Pharmacophore Modeling, Synthesis, Scaffold Hopping and Biological β-Hematin Inhibition Interaction Studies for Anti-malaria Compounds. Curr Top Med Chem 2020; 19:2743-2765. [DOI: 10.2174/1568026619666191116160326] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/18/2019] [Revised: 08/02/2019] [Accepted: 10/01/2019] [Indexed: 01/23/2023]
Abstract
Background: Exploring potent compounds is critical to generating multi-target drug discovery. Hematin crystallization is an important mechanism of malaria. Methods: A series of chloroquine analogues were designed using a repositioning approach to develop new anticancer compounds. Protein-ligand interaction fingerprints and ADMET descriptors were used to assess docking performance in virtual screenings to design chloroquine hybrid β-hematin inhibitors. A PLS algorithm was applied to correlate the molecular descriptors to IC50 values. The modeling presented excellent predictive power with correlation coefficients for calibration and cross-validation of r2 = 0.93 and q2 = 0.72. Using the model, a series of 4-aminoquinoline hybrids were synthesized and evaluated for their biological activity as an external test series. These compounds were evaluated for cytotoxicity in cell lines and for β-hematin inhibition. Results: The target compounds exhibited high β-hematin inhibition activity and were 3-9 times more active than the positive control. Furthermore, all the compounds exhibited moderate to high cytotoxic activity. The most potent compound in the dataset was docked with hemoglobin and its pharmacophore features were generated. These features were used as input to the Pharmit server for screening of six databases. Conclusion: The compound with the best score from ChEMBL was 2016904, previously reported as a VEGFR-2 inhibitor. The 11 compounds selected presented the best Gold scores with drug-like properties and can be used for drug development.
Affiliation(s)
- Neda Fayyazi
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan, Iran
- Somayeh Esmaeili
- Traditional Medicine and Medical Material Research Center (TMRC), Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Salman Taheri
- Chemistry and Chemical Engineering Research Center of Iran, Tehran, Iran
- Frederico F. Ribeiro
- Synthesis and Drug Delivery Laboratory, Biological Sciences Department, Paraíba State University, João Pessoa, Brazil
- Jahan B. Ghasemi
- College of Sciences, Faculty of Chemistry, University of Tehran, Tehran, Iran
- Lotfollah Saghaei
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan, Iran
- Afshin Fassihi
- Department of Medicinal Chemistry, School of Pharmacy and Pharmaceutical Sciences, Isfahan, Iran
9
Pérez-Sianes J, Pérez-Sánchez H, Díaz F. Virtual Screening Meets Deep Learning. Curr Comput Aided Drug Des 2019; 15:6-28. [PMID: 30338743 DOI: 10.2174/1573409914666181018141602] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 10/11/2017] [Revised: 10/08/2018] [Accepted: 10/11/2018] [Indexed: 12/27/2022]
Abstract
BACKGROUND Automated compound testing is currently the de facto standard method for drug screening, but it has not brought the great increase in the number of new drugs that was expected. Computer-aided compound search, known as Virtual Screening, has shown its benefits to this field as a complement or even an alternative to robotic drug discovery. There are different methods and approaches to address this problem, and most of them fall under one of the main screening strategies. Machine learning, however, has established itself as a virtual screening methodology in its own right, and it may grow in popularity with the new trends in artificial intelligence. OBJECTIVE This paper attempts to provide a comprehensive and structured review that collects the most important proposals made so far in this area of research. Particular attention is given to some recent developments in the machine learning field: the deep learning approach, which is pointed out as a future key player in the virtual screening landscape.
Affiliation(s)
- Horacio Pérez-Sánchez
- Bioinformatics and High Performance Computing Research Group (BIO-HPC), Computer Engineering Department, Universidad Católica San Antonio de Murcia (UCAM), Murcia, Spain
- Fernando Díaz
- Departamento de Informática, Escuela de Ingeniería Informática, University of Valladolid, Segovia, Spain
10
Krishna S, Kumar S, Singh DK, Lakra AD, Banerjee D, Siddiqi MI. Multiple Machine Learning Based-Chemoinformatics Models for Identification of Histone Acetyl Transferase Inhibitors. Mol Inform 2018; 37:e1700150. [DOI: 10.1002/minf.201700150] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/08/2017] [Accepted: 04/06/2018] [Indexed: 01/25/2023]
Affiliation(s)
- Shagun Krishna
- Molecular & Structural Biology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
- Sushil Kumar
- Molecular & Structural Biology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
- Deependra Kumar Singh
- Molecular & Structural Biology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
- Amar Deep Lakra
- Endocrinology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
- Dibyendu Banerjee
- Molecular & Structural Biology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
- Mohammad Imran Siddiqi
- Molecular & Structural Biology Division; CSIR-Central Drug Research Institute; Lucknow India 260031
11
Kumar A, Sharma A. Computational Modeling of Multi-target-Directed Inhibitors Against Alzheimer’s Disease. Neuromethods 2018. [DOI: 10.1007/978-1-4939-7404-7_19] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/09/2022]
12
Kurczab R, Bojarski AJ. The influence of the negative-positive ratio and screening database size on the performance of machine learning-based virtual screening. PLoS One 2017; 12:e0175410. [PMID: 28384344 PMCID: PMC5383296 DOI: 10.1371/journal.pone.0175410] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 12/19/2016] [Accepted: 03/24/2017] [Indexed: 11/22/2022]
Abstract
The machine learning-based virtual screening of molecular databases is a commonly used approach to identify hits. However, many aspects associated with training predictive models can influence the final performance and, consequently, the number of hits found. Thus, we performed a systematic study of the simultaneous influence of the proportion of negatives to positives in the testing set, the size of screening databases and the type of molecular representations on the effectiveness of classification. The results obtained for eight protein targets, five machine learning algorithms (SMO, Naïve Bayes, Ibk, J48 and Random Forest), two types of molecular fingerprints (MACCS and CDK FP) and eight screening databases with different numbers of molecules confirmed our previous findings that increases in the ratio of negative to positive training instances greatly influenced most of the investigated parameters of the ML methods in simulated virtual screening experiments. However, the performance of screening was shown to also be highly dependent on the molecular library dimension. Generally, with the increasing size of the screened database, the optimal training ratio also increased, and this ratio can be rationalized using the proposed cost-effectiveness threshold approach. To increase the performance of machine learning-based virtual screening, the training set should be constructed in a way that considers the size of the screening database.
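The idea of controlling the negative-to-positive ratio of the training set can be illustrated with a short Python sketch. It assumes that known actives (positives) and a larger pool of presumed inactives (negatives) are available as lists of compound identifiers. The function name `build_training_set`, its parameters and the identifiers are hypothetical, and the abstract does not state how the study sampled its negatives, so random sampling here is an assumption.

```python
import random


def build_training_set(actives, inactives, neg_pos_ratio, seed=0):
    """Pair all actives with a random sample of inactives at a fixed ratio.

    Returns a list of (compound_id, label) tuples, label 1 for actives and
    0 for inactives. Random sampling is an illustrative assumption.
    """
    n_neg = int(round(neg_pos_ratio * len(actives)))
    if n_neg > len(inactives):
        raise ValueError("not enough inactives for the requested ratio")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    sampled = rng.sample(list(inactives), n_neg)
    return [(c, 1) for c in actives] + [(c, 0) for c in sampled]


# Hypothetical data: 10 actives and a pool of 200 presumed inactives.
actives = [f"ACT-{i:03d}" for i in range(10)]
inactives = [f"DEC-{i:03d}" for i in range(200)]

for ratio in (1, 5, 10):
    train = build_training_set(actives, inactives, ratio)
    n_pos = sum(label for _, label in train)
    print(ratio, n_pos, len(train) - n_pos)
```

Sweeping `neg_pos_ratio` and retraining at each value is one way to look for the ratio that suits a screening database of a given size.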
Affiliation(s)
- Rafał Kurczab
- Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
- Andrzej J. Bojarski
- Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
13
Discovery of novel dual VEGFR2 and Src inhibitors using a multistep virtual screening approach. Future Med Chem 2016; 9:7-24. [PMID: 27995811 DOI: 10.4155/fmc-2016-0162] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Indexed: 11/17/2022]
Abstract
AIM Simultaneous inhibition of VEGFR2 and Src may enhance the efficacy of VEGFR2-targeted cancer therapeutics. Hence, development of dual inhibitors on VEGFR2 and Src can be a useful strategy for such treatments. MATERIALS & METHODS A multistep virtual screening protocol, comprising ligand-based support vector machines method, drug-likeness rules filter and structure-based molecular docking, was developed and employed to identify dual inhibitors of VEGFR2 and Src from a large commercial chemical library. Kinase inhibitory assays and cell viability assays were then used for experimental validation. RESULTS A set of compounds belonging to six different molecular scaffolds was identified and sent for biological evaluation. Compound 3c belonging to the 2-amino-3-cyanopyridine scaffold exhibited good antiproliferative effect and dual-target activities against VEGFR2 and Src. CONCLUSION This study demonstrated the ability of the multistep virtual screening approach to identify novel multitarget agents.
14
Chandra S, Pandey J, Tamrakar AK, Siddiqi MI. Multiple machine learning based descriptive and predictive workflow for the identification of potential PTP1B inhibitors. J Mol Graph Model 2016; 71:242-256. [PMID: 28006676 DOI: 10.1016/j.jmgm.2016.10.020] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/03/2016] [Revised: 09/27/2016] [Accepted: 10/25/2016] [Indexed: 12/21/2022]
Abstract
In the insulin and leptin signaling pathways, Protein-Tyrosine Phosphatase 1B (PTP1B) plays a crucial controlling role as a negative regulator, which makes it an attractive therapeutic target for both Type-2 Diabetes (T2D) and obesity. In this work, we have generated classification models from the inhibition data set of known PTP1B inhibitors to identify new inhibitors of PTP1B, utilizing multiple machine learning techniques such as naïve Bayesian, random forest, support vector machine and k-nearest neighbors, along with structural fingerprints and selected molecular descriptors. Several models from each algorithm were constructed and optimized with different combinations of molecular descriptors and structural fingerprints. For the training and test sets, most of the predictive models showed overall prediction accuracies of more than 90%. The best model was obtained with the support vector machine approach and has a Matthews Correlation Coefficient of 0.82 for the external test set; it was further employed for the virtual screening of the Maybridge small compound database. Five compounds were subsequently selected for experimental assay. Of these, two compounds were found to inhibit PTP1B with significant inhibitory activity in an in vitro inhibition assay. The structural fragments that are important for PTP1B inhibition were identified by the naïve Bayesian method and can be further exploited to design new molecules around the identified scaffolds. The descriptive and predictive modeling strategy applied in this study is capable of identifying PTP1B inhibitors from large compound libraries.
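For readers unfamiliar with the Matthews Correlation Coefficient (MCC) quoted in this abstract, the following minimal Python sketch computes it from a binary confusion matrix using the standard definition, MCC = (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The function name and the example counts are hypothetical, chosen only so that the result lands near 0.82; they are not the confusion matrix reported in the paper.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts.

    Returns 0.0 when any marginal total is zero (a common convention),
    since the coefficient is undefined in that case.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator else 0.0


# Hypothetical external test set: 50 inhibitors and 50 non-inhibitors.
mcc = matthews_corrcoef(tp=45, tn=46, fp=4, fn=5)
print(f"MCC = {mcc:.2f}")
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it more informative when actives and inactives are imbalanced.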
Affiliation(s)
- Sharat Chandra
- Academy of Scientific and Innovative Research (AcSIR), CSIR-Central Drug Research Institute, Campus, Lucknow 226031, India; Molecular and Structural Biology Division, CSIR-Central Drug Research Institute, Lucknow 226031, India
- Jyotsana Pandey
- Biochemistry Division, CSIR-Central Drug Research Institute, Lucknow 226031, India
- Mohammad Imran Siddiqi
- Academy of Scientific and Innovative Research (AcSIR), CSIR-Central Drug Research Institute, Campus, Lucknow 226031, India; Molecular and Structural Biology Division, CSIR-Central Drug Research Institute, Lucknow 226031, India.
|
15
|
Discovery of Influenza A virus neuraminidase inhibitors using support vector machine and Naïve Bayesian models. Mol Divers 2015; 20:439-51. [PMID: 26689205 DOI: 10.1007/s11030-015-9641-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 06/15/2015] [Accepted: 10/12/2015] [Indexed: 10/22/2022]
Abstract
Neuraminidase (NA) is a critical enzyme in the life cycle of influenza virus, which is known as a successful paradigm in the design of anti-influenza agents. However, to date there are no classification models for the virtual screening of NA inhibitors. In this work, we built support vector machine and Naïve Bayesian models of NA inhibitors and non-inhibitors, with different ratios of active-to-inactive compounds in the training set and different molecular descriptors. Four models with sensitivity or Matthews correlation coefficients greater than 0.9 were chosen to predict the NA inhibitory activities of 15,600 compounds in our in-house database. We combined the results of four optimal models and selected 60 representative compounds to assess their NA inhibitory profiles in vitro. Nine NA inhibitors were identified, five of which were oseltamivir derivatives with large C-5 substituents exhibiting potent inhibition against H1N1 NA with IC50 values in the range of 12.9-185.0 nM, and against H3N2 NA with IC50 values between 18.9 and 366.1 nM. The other four active compounds belonged to novel scaffolds, with IC50 values ranging 39.5-63.8 μM against H1N1 NA and 44.5-114.1 μM against H3N2 NA. This is the first time that classification models of NA inhibitors and non-inhibitors are built and their prediction results validated experimentally using in vitro assays.
|
16
|
Kurczab R, Smusz S, Bojarski AJ. The influence of negative training set size on machine learning-based virtual screening. J Cheminform 2014; 6:32. [PMID: 24976867 PMCID: PMC4061540 DOI: 10.1186/1758-2946-6-32] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.0] [Received: 10/12/2013] [Accepted: 06/02/2014] [Indexed: 02/07/2023] Open
Abstract
Background: The paper presents a thorough analysis of the influence of the number of negative training examples on the performance of machine learning methods. Results: The impact of this rather neglected aspect of machine learning methods application was examined for sets containing a fixed number of positive and a varying number of negative examples randomly selected from the ZINC database. An increase in the ratio of positive to negative training instances was found to greatly influence most of the investigated evaluating parameters of ML methods in simulated virtual screening experiments. In a majority of cases, substantial increases in precision and MCC were observed in conjunction with some decreases in hit recall. The analysis of the dynamics of those variations let us recommend an optimal composition of training data. The study was performed on several protein targets, 5 machine learning algorithms (SMO, Naïve Bayes, IBk, J48 and Random Forest) and 2 types of molecular fingerprints (MACCS and CDK FP). The most effective classification was provided by the combination of CDK FP with SMO or Random Forest algorithms. The Naïve Bayes models appeared to be hardly sensitive to changes in the number of negative instances in the training set. Conclusions: The ratio of positive to negative training instances should be taken into account during the preparation of machine learning experiments, as it might significantly influence the performance of a particular classifier. What is more, the optimization of negative training set size can be applied as a boosting-like approach in machine learning-based virtual screening.
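The experimental design described here, a fixed set of positives combined with a varying number of randomly drawn negatives, can be sketched roughly as below. The function name, the `negatives_per_positive` parameter and the toy identifiers are assumptions made for illustration; the paper's actual sampling code is not shown in the abstract.

```python
import random

def build_training_set(positives, negative_pool, negatives_per_positive, seed=0):
    """Pair all positives with a random sample of negatives (illustrative sketch)."""
    rng = random.Random(seed)
    # Cap the request at the pool size so sampling never fails.
    n_negatives = min(len(negative_pool), int(len(positives) * negatives_per_positive))
    negatives = rng.sample(negative_pool, n_negatives)
    # Label 1 marks an active compound, label 0 a presumed inactive one.
    return [(mol, 1) for mol in positives] + [(mol, 0) for mol in negatives]

# Toy identifiers standing in for real compounds.
actives = ["active_1", "active_2"]
pool = [f"decoy_{i}" for i in range(100)]
for ratio in (1, 5, 10):
    print(ratio, len(build_training_set(actives, pool, ratio)))
```

Training one model per ratio and comparing precision, recall and MCC across them mirrors the kind of comparison the abstract reports.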
Affiliation(s)
- Rafał Kurczab
- Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, Smętna 12, 31-343 Kraków, Poland
- Sabina Smusz
- Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, Smętna 12, 31-343 Kraków, Poland; Faculty of Chemistry, Jagiellonian University, R. Ingardena 3, 30-060 Kraków, Poland
- Andrzej J Bojarski
- Department of Medicinal Chemistry, Institute of Pharmacology, Polish Academy of Sciences, Smętna 12, 31-343 Kraków, Poland
|
17
|
Chen J, Liu Y, Cheng T, Lao X, Gao X, Zheng H, Yao W. A common binding mode that may facilitate the design of novel broad-spectrum inhibitors against metallo-β-lactamases. Med Chem Res 2013. [DOI: 10.1007/s00044-013-0646-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
|
18
|
Kurczab R, Smusz S, Bojarski AJ. The influence of training actives/inactives ratio on machine learning performance. J Cheminform 2013. [PMCID: PMC3606185 DOI: 10.1186/1758-2946-5-s1-p30] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022] Open
|
19
|
Han B, Ma X, Zhao R, Zhang J, Wei X, Liu X, Liu X, Zhang C, Tan C, Jiang Y, Chen Y. Development and experimental test of support vector machines virtual screening method for searching Src inhibitors from large compound libraries. Chem Cent J 2012; 6:139. [PMID: 23173901 PMCID: PMC3538513 DOI: 10.1186/1752-153x-6-139] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 07/13/2012] [Accepted: 11/07/2012] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Src plays various roles in tumour progression, invasion, metastasis, angiogenesis and survival. It is one of the multiple targets of multi-target kinase inhibitors in clinical uses and trials for the treatment of leukemia and other cancers. These successes and appearances of drug resistance in some patients have raised significant interest and efforts in discovering new Src inhibitors. Various in-silico methods have been used in some of these efforts. It is desirable to explore additional in-silico methods, particularly those capable of searching large compound libraries at high yields and reduced false-hit rates. RESULTS: We evaluated support vector machines (SVM) as virtual screening tools for searching Src inhibitors from large compound libraries. SVM trained and tested by 1,703 inhibitors and 63,318 putative non-inhibitors correctly identified 93.53%-95.01% of inhibitors and 99.81%-99.90% of non-inhibitors in 5-fold cross validation studies. SVM trained by 1,703 inhibitors reported before 2011 and 63,318 putative non-inhibitors correctly identified 70.45% of the 44 inhibitors reported since 2011, and predicted as inhibitors 44,843 (0.33%) of 13.56M PubChem, 1,496 (0.89%) of 168K MDDR, and 719 (7.73%) of 9,305 MDDR compounds similar to the known inhibitors. CONCLUSIONS: SVM showed comparable yield and reduced false hit rates in searching large compound libraries compared to the similarity-based and other machine-learning VS methods developed from the same set of training compounds and molecular descriptors. We tested three virtual hits of the same novel scaffold from in-house chemical libraries not reported as Src inhibitors, one of which showed moderate activity. SVM may be potentially explored for searching Src inhibitors from large compound libraries at low false-hit rates.
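The virtual-hit percentages quoted above are simply the number of compounds predicted as inhibitors divided by the library size. A short check of that arithmetic, using only the counts given in the abstract (the helper name is an illustrative assumption):

```python
def hit_rate_percent(predicted_hits, library_size):
    """Share of a screened library predicted as inhibitors, in percent."""
    return 100.0 * predicted_hits / library_size

# Counts as reported in the abstract: (virtual hits, compounds screened).
screens = {
    "PubChem": (44_843, 13_560_000),
    "MDDR": (1_496, 168_000),
    "MDDR similar to known inhibitors": (719, 9_305),
}
for name, (hits, size) in screens.items():
    print(f"{name}: {hit_rate_percent(hits, size):.2f}%")
```

The much higher rate in the similarity-filtered MDDR subset is expected, since those compounds were selected for resembling known inhibitors.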
Affiliation(s)
- Bucong Han
- The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xiaohua Ma
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Ruiying Zhao
- Central Research Institute of China Chemical Science and Technology, 20 Xueyuan Road, Haidian District, Beijing, 100083, People’s Republic of China
- Jingxian Zhang
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xiaona Wei
- Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xianghui Liu
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Xin Liu
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
- Cunlong Zhang
- The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Chunyan Tan
- The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Yuyang Jiang
- The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Yuzong Chen
- The Key Laboratory of Chemical Biology, Guangdong Province, The Graduate School at Shenzhen, Tsinghua University, Shenzhen, Guangdong, 518055, People’s Republic of China
- Computation and Systems Biology, Singapore-MIT Alliance, National University of Singapore, E4-04-10, 4 Engineering Drive 3, Singapore, 117576, Singapore
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
|
20
|
Yu P, Wild DJ. Fast rule-based bioactivity prediction using associative classification mining. J Cheminform 2012; 4:29. [PMID: 23176548 PMCID: PMC3515428 DOI: 10.1186/1758-2946-4-29] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 08/15/2012] [Accepted: 10/23/2012] [Indexed: 12/22/2022] Open
Abstract
Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.
Affiliation(s)
- Pulan Yu
- Indiana University School of Informatics and Computing, Bloomington, IN, 47408, USA.
|
21
|
Combinatorial support vector machines approach for virtual screening of selective multi-target serotonin reuptake inhibitors from large compound libraries. J Mol Graph Model 2012; 32:49-66. [DOI: 10.1016/j.jmgm.2011.09.002] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Received: 05/11/2011] [Revised: 08/30/2011] [Accepted: 09/01/2011] [Indexed: 12/13/2022]
|
22
|
Abstract
IMPORTANCE OF THE FIELD: PubChem is a public molecular information repository, a scientific showcase of the NIH Roadmap Initiative. The PubChem database holds over 27 million records of unique chemical structures of compounds (CID) derived from nearly 70 million substance depositions (SID), and contains more than 449,000 bioassay records from thousands of in vitro biochemical and cell-based screening bioassays, targeting more than 7000 proteins and genes and linking to over 1.8 million substances. AREAS COVERED IN THIS REVIEW: This review builds on recent PubChem-related computational chemistry research reported by other authors while providing readers with an overview of the PubChem database, focusing on its increasing role in cheminformatics, virtual screening and toxicity prediction modeling. WHAT THE READER WILL GAIN: These publicly available datasets in PubChem provide great opportunities for scientists to perform cheminformatics and virtual screening research for computer-aided drug design. However, the high volume and complexity of the datasets, in particular the bioassay-associated false positives/negatives and highly imbalanced datasets in PubChem, also create major challenges. Several approaches regarding the modeling of PubChem datasets and development of virtual screening models for bioactivity and toxicity predictions are also reviewed. TAKE HOME MESSAGE: Novel data-mining cheminformatics tools and virtual screening algorithms are being developed and used to retrieve, annotate and analyze the large-scale and highly complex PubChem biological screening data for drug design.
Affiliation(s)
- Xiang-Qun Xie
- Department of Pharmaceutical Sciences, School of Pharmacy; Drug Discovery Institute/Pittsburgh Molecular Library Screening Center (PMLSC); Pittsburgh Chemical Methodologies & Library Development (PCMLD) Center; Departments of Computational Biology and Structural Biology; University of Pittsburgh, Pittsburgh, PA 15260, USA
|
23
|
Ma XH, Wang R, Tan CY, Jiang YY, Lu T, Rao HB, Li XY, Go ML, Low BC, Chen YZ. Virtual screening of selective multitarget kinase inhibitors by combinatorial support vector machines. Mol Pharm 2010; 7:1545-60. [PMID: 20712327 DOI: 10.1021/mp100179t] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Indexed: 01/21/2023]
Abstract
Multitarget agents have been increasingly explored for enhancing efficacy and reducing countertarget activities and toxicities. Efficient virtual screening (VS) tools for searching selective multitarget agents are desired. Combinatorial support vector machines (C-SVM) were tested as VS tools for searching dual-inhibitors of 11 combinations of 9 anticancer kinase targets (EGFR, VEGFR, PDGFR, Src, FGFR, Lck, CDK1, CDK2, GSK3). C-SVM trained on 233-1,316 non-dual-inhibitors correctly identified 26.8%-57.3% (majority >36%) of the 56-230 intra-kinase-group dual-inhibitors (equivalent to the 50-70% yields of two independent individual target VS tools), and 12.2% of the 41 inter-kinase-group dual-inhibitors. C-SVM were fairly selective in misidentifying as dual-inhibitors 3.7%-48.1% (majority <20%) of the 233-1,316 non-dual-inhibitors of the same kinase pairs and 0.98%-4.77% of the 3,971-5,180 inhibitors of other kinases. C-SVM produced low false-hit rates in misidentifying as dual-inhibitors 1,746-4,817 (0.013%-0.036%) of the 13.56 M PubChem compounds, 12-175 (0.007%-0.104%) of the 168 K MDDR compounds, and 0-84 (0.0%-2.9%) of the 19,495-38,483 MDDR compounds similar to the known dual-inhibitors. C-SVM was compared to other VS methods Surflex-Dock, DOCK Blaster, kNN and PNN against the same sets of kinase inhibitors and the full set or subset of the 1.02 M Zinc clean-leads data set. C-SVM produced comparable dual-inhibitor yields, slightly better false-hit rates for kinase inhibitors, and significantly lower false-hit rates for the Zinc clean-leads data set. Combinatorial SVM showed promising potential for searching selective multitarget agents against intra-kinase-group kinases without explicit knowledge of multitarget agents.
Affiliation(s)
- X H Ma
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543
|
24
|
Geppert H, Vogt M, Bajorath J. Current trends in ligand-based virtual screening: molecular representations, data mining methods, new application areas, and performance evaluation. J Chem Inf Model 2010; 50:205-16. [PMID: 20088575 DOI: 10.1021/ci900419k] [Citation(s) in RCA: 231] [Impact Index Per Article: 16.5] [Indexed: 12/19/2022]
Affiliation(s)
- Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
25
|
Liu XH, Song HY, Zhang JX, Han BC, Wei XN, Ma XH, Cui WK, Chen YZ. Identifying Novel Type ZBGs and Nonhydroxamate HDAC Inhibitors Through a SVM Based Virtual Screening Approach. Mol Inform 2010; 29:407-20. [PMID: 27463196 DOI: 10.1002/minf.200900014] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 08/23/2009] [Accepted: 03/11/2010] [Indexed: 01/30/2023]
Abstract
Histone deacetylase inhibitors (HDACi) have been successfully used for the treatment of cancers and other diseases. Search for novel type ZBGs and development of non-hydroxamate HDACi has become a focus in current research. To complement this, it is desirable to explore a virtual screening (VS) tool capable of identifying different types of potential inhibitors from large compound libraries with high yields and low false-hit rates similar to HTS. This work explored the use of support vector machines (SVM) combined with our newly developed putative non-inhibitor generation method as such a tool. SVM trained by 702 pre-2008 hydroxamate HDACi and 64334 putative non-HDACi showed good yields and low false-hit rates in cross-validation test and independent test using 220 diverse types of HDACi reported since 2008. The SVM hit rates in scanning 13.56 M PubChem and 168K MDDR compounds are comparable to HTS rates. Further structural analysis of SVM virtual hits suggests its potential for identification of non-hydroxamate HDACi. From this analysis, a series of novel ZBG and cap groups were proposed for HDACi design.
Affiliation(s)
- X H Liu
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756
- H Y Song
- Institute of Materials Research and Engineering, A*STAR, 3 Research Link, Singapore 117602
- J X Zhang
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756
- B C Han
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756
- X N Wei
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756
- X H Ma
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756
- W K Cui
- Department of Pharmacy, National University of Singapore, 18 Science Drive 4, Singapore 117543
- Y Z Chen
- Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543; phone: 65-6874-6877, fax: 65-6774-6756.
|
26
|
Rao H, Li Z, Li X, Ma X, Ung C, Li H, Liu X, Chen Y. Identification of small molecule aggregators from large compound libraries by support vector machines. J Comput Chem 2010; 31:752-63. [PMID: 19569201 DOI: 10.1002/jcc.21347] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/04/2023]
Abstract
Small molecule aggregators non-specifically inhibit multiple unrelated proteins, rendering them therapeutically useless. They frequently appear as false hits and thus need to be eliminated in high-throughput screening campaigns. Computational methods have been explored for identifying aggregators, which have not been tested in screening large compound libraries. We used 1319 aggregators and 128,325 non-aggregators to develop a support vector machines (SVM) aggregator identification model, which was tested by four methods. The first is five fold cross-validation, which showed comparable aggregator and significantly improved non-aggregator identification rates against earlier studies. The second is the independent test of 17 aggregators discovered independently from the training aggregators, 71% of which were correctly identified. The third is retrospective screening of 13M PUBCHEM and 168K MDDR compounds, which predicted 97.9% and 98.7% of the PUBCHEM and MDDR compounds as non-aggregators. The fourth is retrospective screening of 5527 MDDR compounds similar to the known aggregators, 1.14% of which were predicted as aggregators. SVM showed slightly better overall performance against two other machine learning methods based on five fold cross-validation studies of the same settings. Molecular features of aggregation, extracted by a feature selection method, are consistent with published profiles. SVM showed substantial capability in identifying aggregators from large libraries at low false-hit rates.
Affiliation(s)
- Hanbing Rao
- College of Chemistry, Sichuan University, Chengdu 610064, People's Republic of China
|
27
|
Ma XH, Shi Z, Tan C, Jiang Y, Go ML, Low BC, Chen YZ. In-silico approaches to multi-target drug discovery : computer aided multi-target drug design, multi-target virtual screening. Pharm Res 2010; 27:739-49. [PMID: 20221898 DOI: 10.1007/s11095-010-0065-2] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.9] [Received: 11/12/2009] [Accepted: 01/08/2010] [Indexed: 01/25/2023]
Abstract
Multi-target drugs against selective multiple targets improve therapeutic efficacy, safety and resistance profiles by collective regulations of a primary therapeutic target together with compensatory elements and resistance activities. Efforts have been made to employ in-silico methods for facilitating the search and design of selective multi-target agents. These methods have shown promising potential in facilitating drug discovery directed at selective multiple targets.
Affiliation(s)
- Xiao Hua Ma
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore, 117543, Singapore
|
28
|
Liew CY, Ma XH, Yap CW. Consensus model for identification of novel PI3K inhibitors in large chemical library. J Comput Aided Mol Des 2010; 24:131-41. [PMID: 20148286 DOI: 10.1007/s10822-010-9321-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/25/2009] [Accepted: 02/02/2010] [Indexed: 01/28/2023]
Abstract
Phosphoinositide 3-kinase (PI3K) inhibitors have treatment potential for cancer, diabetes, cardiovascular disease, chronic inflammation and asthma. A consensus model consisting of three base classifiers (AODE, kNN, and SVM) trained with 1,283 positive compounds (PI3K inhibitors), 16 negative compounds (PI3K non-inhibitors) and 64,078 generated putative negatives was developed for predicting compounds with PI3K inhibitory activity of IC50 ≤ 10 μM. The consensus model has an estimated false positive rate of 0.75%. Nine novel potential inhibitors were identified using the consensus model and several of these contain structural features that are consistent with those found to be important for PI3K inhibitory activities. An advantage of the current model is that it does not require knowledge of 3D structural information of the various PI3K isoforms, which is not readily available for all isoforms.
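The abstract does not say how the outputs of the three base classifiers are combined. As one plausible reading, the sketch below assumes a simple majority vote; the function name and the toy predictions are hypothetical and do not reproduce the authors' method.

```python
def consensus_predict(base_predictions):
    """Majority vote over per-classifier label lists (1 = predicted inhibitor).

    Assumption: majority voting; the paper may combine the classifiers differently.
    """
    n_models = len(base_predictions)
    # zip(*...) regroups the labels so each tuple holds all votes for one compound.
    return [1 if 2 * sum(votes) > n_models else 0 for votes in zip(*base_predictions)]

# Toy outputs for three compounds from three base classifiers.
aode = [1, 0, 1]
knn = [1, 0, 0]
svm = [0, 1, 1]
print(consensus_predict([aode, knn, svm]))
```

A stricter rule, such as requiring all three classifiers to agree, would trade recall for a lower false positive rate.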
Affiliation(s)
- Chin Yee Liew
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, Singapore, Singapore
|
29
|
Liu XH, Ma XH, Tan CY, Jiang YY, Go ML, Low BC, Chen YZ. Virtual screening of Abl inhibitors from large compound libraries by support vector machines. J Chem Inf Model 2009; 49:2101-10. [PMID: 19689138 DOI: 10.1021/ci900135u] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.2] [Indexed: 12/26/2022]
Abstract
Abl promotes cancers by regulating cell morphogenesis, motility, growth, and survival. Successes of several marketed and clinical trial Abl inhibitors against leukemia and other cancers and appearances of reduced efficacies and drug resistances have led to significant interest in and efforts for developing new Abl inhibitors. In silico methods of pharmacophore, fragment, and molecular docking have been used in some of these efforts. It is desirable to explore other in silico methods capable of searching large compound libraries at high yields and reduced false-hit rates. We evaluated support vector machines (SVM) as a virtual screening tool for searching Abl inhibitors from large compound libraries. SVM trained and tested by 708 inhibitors and 65,494 putative noninhibitors correctly identified 84.4 to 92.3% of inhibitors and 99.96 to 99.99% of noninhibitors in 5-fold cross validation studies. SVM trained by 708 pre-2008 inhibitors and 65,494 putative noninhibitors correctly identified 50.5% of the 91 inhibitors reported since 2008 and predicted as inhibitors 29,072 (0.21%) of 13.56M PubChem, 659 (0.39%) of 168K MDDR, and 330 (5.0%) of 6638 MDDR compounds similar to the known inhibitors. SVM showed comparable yields and substantially reduced false-hit rates against two similarity-based VS methods and another machine learning VS method based on the same training and testing data sets and molecular descriptors. These results suggest that SVM is capable of searching Abl inhibitors from large compound libraries at low false-hit rates.
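The 5-fold cross validation mentioned above partitions the data into five parts, with each part serving once as the held-out test set. A minimal standard-library sketch follows; the function name and the unstratified random split are assumptions, since the abstract does not describe how folds were formed.

```python
import random

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation (illustrative)."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

for train, test in kfold_indices(10, k=5):
    print(len(train), len(test))
```

With a few hundred inhibitors against tens of thousands of putative noninhibitors, a stratified split that preserves the class ratio in every fold would usually be preferable to the plain shuffle used here.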
Affiliation(s)
- X H Liu
- Bioinformatics and Drug Design Group, Department of Pharmacy, Centre for Computational Science and Engineering, National University of Singapore, Blk S16, Level 8, 3 Science Drive 2, Singapore 117543
|
30
|
Zhu F, Han B, Kumar P, Liu X, Ma X, Wei X, Huang L, Guo Y, Han L, Zheng C, Chen Y. Update of TTD: Therapeutic Target Database. Nucleic Acids Res 2009; 38:D787-91. [PMID: 19933260 PMCID: PMC2808971 DOI: 10.1093/nar/gkp1014] [Citation(s) in RCA: 203] [Impact Index Per Article: 13.5] [Indexed: 01/03/2023] Open
Abstract
Increasing numbers of proteins, nucleic acids and other molecular entities have been explored as therapeutic targets, hundreds of which are targets of approved and clinical trial drugs. Knowledge of these targets and corresponding drugs, particularly those in clinical uses and trials, is highly useful for facilitating drug discovery. Therapeutic Target Database (TTD) has been developed to provide information about therapeutic targets and corresponding drugs. In order to accommodate increasing demand for comprehensive knowledge about the primary targets of the approved, clinical trial and experimental drugs, numerous improvements and updates have been made to TTD. These updates include information about 348 successful, 292 clinical trial and 1254 research targets, 1514 approved, 1212 clinical trial and 2302 experimental drugs linked to their primary targets (3382 small molecule and 649 antisense drugs with available structure and sequence), new ways to access data by drug mode of action, recursive search of related targets or drugs, similarity target and drug searching, customized and whole data download, standardized target ID, and significant increase of data (1894 targets, 560 diseases and 5028 drugs compared with the 433 targets, 125 diseases and 809 drugs in the original release described in previous paper). This database can be accessed at http://bidd.nus.edu.sg/group/cjttd/TTD.asp.
Affiliation(s)
- Feng Zhu
- Department of Pharmacy and Computation and Systems Biology, Center for Computational Science and Engineering, Singapore-MIT Alliance, National University of Singapore, Singapore
|
31
|
Liew CY, Ma XH, Liu X, Yap CW. SVM Model for Virtual Screening of Lck Inhibitors. J Chem Inf Model 2009; 49:877-85. [DOI: 10.1021/ci800387z] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.5] [Indexed: 12/16/2022]
Affiliation(s)
- Chin Y. Liew
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Xiao H. Ma
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Xianghui Liu
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
- Chun W. Yap
- Pharmaceutical Data Exploration Laboratory, Department of Pharmacy, National University of Singapore, and Bioinformatics and Drug Design Group, Department of Pharmacy, National University of Singapore
|