1
|
Hentabli H, Bengherbia B, Saeed F, Salim N, Nafea I, Toubal A, Nasser M. Convolutional Neural Network Model Based on 2D Fingerprint for Bioactivity Prediction. Int J Mol Sci 2022; 23:13230. [PMID: 36362018 PMCID: PMC9657591 DOI: 10.3390/ijms232113230] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/14/2022] [Revised: 10/22/2022] [Accepted: 10/27/2022] [Indexed: 10/15/2023]
Abstract
Determining and modeling the possible behaviour and actions of molecules requires investigating the basic structural features and physicochemical properties that determine their behaviour during chemical, physical, biological, and environmental processes. Computational approaches such as machine learning methods are alternatives to predicting the physiochemical properties of molecules based on their structures. However, the limited accuracy and high error rates of such predictions restrict their use. In this paper, a novel technique based on a deep learning convolutional neural network (CNN) for the prediction of chemical compounds' bioactivity is proposed and developed. The molecules are represented in the new matrix format Mol2mat, a molecular matrix representation adapted from the well-known 2D-fingerprint descriptors. To evaluate the performance of the proposed methods, a series of experiments were conducted using two standard datasets, namely the MDL Drug Data Report (MDDR) and Sutherland, datasets comprising 10 homogeneous and 14 heterogeneous activity classes. After analysing the eight fingerprints, all the probable combinations were investigated using the five best descriptors. The results showed that a combination of three fingerprints, ECFP4, EPFP4, and ECFC4, along with a CNN activity prediction process, achieved the highest performance of 98% AUC when compared to the state-of-the-art ML algorithms NaiveB, LSVM, and RBFN.
Affiliation(s)
- Hamza Hentabli
- Laboratory of Advanced Electronics Systems (LSEA), University of Medea, Medea 26000, Algeria
- UTM Big Data Centre, Ibnu Sina Institute for Scientific and Industrial Research, Universiti Teknologi Malaysia, Johor Bahru 81310, Johor, Malaysia
| | - Billel Bengherbia
- Laboratory of Advanced Electronics Systems (LSEA), University of Medea, Medea 26000, Algeria
| | - Faisal Saeed
- UTM Big Data Centre, Ibnu Sina Institute for Scientific and Industrial Research, Universiti Teknologi Malaysia, Johor Bahru 81310, Johor, Malaysia
- DAAI Research Group, Department of Computing and Data Science, School of Computing and Digital Technology, Birmingham City University, Birmingham B4 7XG, UK
| | - Naomie Salim
- UTM Big Data Centre, Ibnu Sina Institute for Scientific and Industrial Research, Universiti Teknologi Malaysia, Johor Bahru 81310, Johor, Malaysia
| | - Ibtehal Nafea
- College of Computer Science and Engineering, Taibah University, Medina 41477, Saudi Arabia
| | - Abdelmoughni Toubal
- Laboratory of Advanced Electronics Systems (LSEA), University of Medea, Medea 26000, Algeria
| | - Maged Nasser
- School of Computer Sciences, Universiti Sains Malaysia, Gelugor 11800, Penang, Malaysia
| |
|
2
|
Rodríguez-Pérez R, Bajorath J. Evolution of Support Vector Machine and Regression Modeling in Chemoinformatics and Drug Discovery. J Comput Aided Mol Des 2022; 36:355-362. [PMID: 35304657 PMCID: PMC9325859 DOI: 10.1007/s10822-022-00442-9] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 02/08/2022] [Accepted: 02/15/2022] [Indexed: 11/05/2022]
Abstract
The support vector machine (SVM) algorithm is one of the most widely used machine learning (ML) methods for predicting active compounds and molecular properties. In chemoinformatics and drug discovery, SVM has been a state-of-the-art ML approach for more than a decade. A unique attribute of SVM is that it operates in feature spaces of increasing dimensionality. Hence, SVM conceptually departs from the paradigm of low dimensionality that applies to many other methods for chemical space navigation. The SVM approach is applicable to compound classification, and ranking, multi-class predictions, and –in algorithmically modified form– regression modeling. In the emerging era of deep learning (DL), SVM retains its relevance as one of the premier ML methods in chemoinformatics, for reasons discussed herein. We describe the SVM methodology including strengths and weaknesses and discuss selected applications that have contributed to the evolution of SVM as a premier approach for compound classification, property predictions, and virtual compound screening.
Affiliation(s)
- Raquel Rodríguez-Pérez
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 6, D-53115, Bonn, Germany; Novartis Institutes for Biomedical Research, Novartis Campus, CH-4002, Basel, Switzerland
| |
|
3
|
Choudhury C, Arul Murugan N, Deva Priyakumar U. Structure-based drug repurposing: traditional and advanced AI/ML-aided methods. Drug Discov Today 2022; 27:1847-1861. [PMID: 35301148 PMCID: PMC8920090 DOI: 10.1016/j.drudis.2022.03.006] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 10/04/2021] [Revised: 02/16/2022] [Accepted: 03/10/2022] [Indexed: 02/08/2023]
Abstract
The current global health emergency in the form of the Coronavirus 2019 (COVID-19) pandemic has highlighted the need for fast, accurate, and efficient drug discovery pipelines. Traditional drug discovery projects relying on in vitro high-throughput screening (HTS) involve large investments and sophisticated experimental set-ups, affordable only to big biopharmaceutical companies. In this scenario, application of efficient state-of-the-art computational methods and modern artificial intelligence (AI)-based algorithms for rapid screening of repurposable chemical space [approved drugs and natural products (NPs) with proven pharmacokinetic profiles] to identify the initial leads is a powerful option to save resources and time. Structure-based drug repurposing is a popular in silico repurposing approach. In this review, we discuss traditional and modern AI-based computational methods and tools applied at various stages for structure-based drug discovery (SBDD) pipelines. Additionally, we highlight the role of generative models in generating molecules with scaffolds from repurposable chemical space. Teaser: This review highlights the importance of repurposable chemical space, and the contributions of conventional in silico approaches and modern machine-learning algorithms for rapid structure-based drug repurposing.
Affiliation(s)
- Chinmayee Choudhury
- Department of Experimental Medicine and Biotechnology, Postgraduate Institute of Medical Education and Research, Sector-12, Chandigarh 160012, India
| | - N Arul Murugan
- Department of Computer Science, School of Electrical Engineering and Computer Sciences, KTH Royal Institute of Technology, S-100 44, Stockholm, Sweden; Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi 110020, India.
| | - U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad 500 032, India
| |
|
4
|
Brown BP, Vu O, Geanes AR, Kothiwale S, Butkiewicz M, Lowe EW, Mueller R, Pape R, Mendenhall J, Meiler J. Introduction to the BioChemical Library (BCL): An Application-Based Open-Source Toolkit for Integrated Cheminformatics and Machine Learning in Computer-Aided Drug Discovery. Front Pharmacol 2022; 13:833099. [PMID: 35264967 PMCID: PMC8899505 DOI: 10.3389/fphar.2022.833099] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/10/2021] [Accepted: 01/24/2022] [Indexed: 01/31/2023]
Abstract
The BioChemical Library (BCL) cheminformatics toolkit is an application-based academic open-source software package designed to integrate traditional small molecule cheminformatics tools with machine learning-based quantitative structure-activity/property relationship (QSAR/QSPR) modeling. In this pedagogical article we provide a detailed introduction to core BCL cheminformatics functionality, showing how traditional tasks (e.g., computing chemical properties, estimating druglikeness) can be readily combined with machine learning. In addition, we have included multiple examples covering areas of advanced use, such as reaction-based library design. We anticipate that this manuscript will be a valuable resource for researchers in computer-aided drug discovery looking to integrate modular cheminformatics and machine learning tools into their pipelines.
Affiliation(s)
- Benjamin P. Brown
- Chemical and Physical Biology Program, Medical Scientist Training Program, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Oanh Vu
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Alexander R. Geanes
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Sandeepkumar Kothiwale
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Mariusz Butkiewicz
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Edward W. Lowe
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Ralf Mueller
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Richard Pape
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Jeffrey Mendenhall
- Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
| | - Jens Meiler
- Department of Chemistry, Departments of Pharmacology and Biomedical Informatics, Center for Structural Biology, Vanderbilt University, Nashville, TN, United States
- Institute for Drug Discovery, Leipzig University Medical School, Leipzig, Germany
| |
|
5
|
Kawai K, Tomonou M, Machida Y, Karuo Y, Tarui A, Sato K, Ikeda Y, Kinashi T, Omote M. Effect of Learning Dataset for Identification of Active Molecules: A Case Study of Integrin αIIbβ3 Inhibitors. Mol Inform 2021; 40:e2060040. [PMID: 33738924 DOI: 10.1002/minf.202060040] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/29/2021] [Accepted: 01/30/2021] [Indexed: 01/13/2023]
Abstract
Efficient in silico approaches are needed to identify strong integrin αIIbβ3 inhibitors through a small number of measurements. To address the challenge, we investigated the effect of learning dataset on the classification performance of machine learning models focusing on weak and inactive compounds. The structure and activity information of the compounds were obtained from ChEMBL, and pCHEMBL values were used to classify them as active, inactive, or weak. Datasets with various imbalance levels from active:inactive=1 : 1 to 1 : 1000 were used for the machine learning. The prediction scores of the weak samples were found to lie between the predictive values of active and inactive compounds. In addition, another dataset that consists of 149 actives and 6.9 million inactives was screened; the results indicated that the number of positive predictions decreased for models trained with a higher number of inactives. Although there is a trade-off between false positives and false negatives, for determination of compounds with strong activity using a reduced number of measurements, it is better to use a large number of inactives for learning and identifying compounds that score higher than the weak samples.
Affiliation(s)
- Kentaro Kawai
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Mami Tomonou
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Yume Machida
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Yukiko Karuo
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Atsushi Tarui
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Kazuyuki Sato
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| | - Yoshiki Ikeda
- Department of Molecular Genetics, Institute of Biomedical Science, Kansai Medical University, 2-5-1 Shin-machi, Hirakata, Osaka, 573-1010, Japan
| | - Tatsuo Kinashi
- Department of Molecular Genetics, Institute of Biomedical Science, Kansai Medical University, 2-5-1 Shin-machi, Hirakata, Osaka, 573-1010, Japan
| | - Masaaki Omote
- Faculty of Pharmaceutical Sciences, Setsunan University, 45-1, Nagaotoge-cho, Hirakata, Osaka, 573-0101, Japan
| |
|
6
|
Patel L, Shukla T, Huang X, Ussery DW, Wang S. Machine Learning Methods in Drug Discovery. Molecules 2020; 25:E5277. [PMID: 33198233 PMCID: PMC7696134 DOI: 10.3390/molecules25225277] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 10/15/2020] [Revised: 11/04/2020] [Accepted: 11/09/2020] [Indexed: 12/30/2022]
Abstract
The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high through-put computational analysis of databases used for both lead and target discovery, has increased the reliability of the machine learning and deep learning incorporated techniques. The use of these virtual screening and encompassing online information has also been highlighted in developing lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques will be discussed. The applications that produce promising results and methods will be reviewed.
Affiliation(s)
- Lauv Patel
- Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
| | - Tripti Shukla
- Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
| | - Xiuzhen Huang
- Department of Computer Science, Arkansas State University, Jonesboro, AR 72467, USA
| | - David W. Ussery
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA
| | - Shanzhi Wang
- Chemistry Department, University of Arkansas at Little Rock, Little Rock, AR 72204, USA
| |
|
7
|
Yang S, Ye Q, Ding J, Yin, Lu A, Chen X, Hou T, Cao D. Current advances in ligand‐based target prediction. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1504] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/25/2022]
Affiliation(s)
- Su‐Qing Yang
- Xiangya School of Pharmaceutical Sciences Central South University Changsha Hunan China
| | - Qing Ye
- College of Pharmaceutical Sciences Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University Hangzhou, Zhejiang China
| | - Jun‐Jie Ding
- Beijing Institute of Pharmaceutical Chemistry Beijing China
| | - Yin
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital Central South University Changsha Hunan China
| | - Ai‐Ping Lu
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong China
| | - Xiang Chen
- Department of Dermatology, Hunan Engineering Research Center of Skin Health and Disease, Hunan Key Laboratory of Skin Cancer and Psoriasis, Xiangya Hospital Central South University Changsha Hunan China
| | - Ting‐Jun Hou
- College of Pharmaceutical Sciences Innovation Institute for Artificial Intelligence in Medicine, Zhejiang University Hangzhou, Zhejiang China
| | - Dong‐Sheng Cao
- Xiangya School of Pharmaceutical Sciences Central South University Changsha Hunan China
- Institute for Advancing Translational Medicine in Bone and Joint Diseases, School of Chinese Medicine Hong Kong Baptist University Hong Kong China
| |
|
8
|
Ghiandoni GM, Bodkin MJ, Chen B, Hristozov D, Wallace JEA, Webster J, Gillet VJ. Enhancing reaction-based de novo design using a multi-label reaction class recommender. J Comput Aided Mol Des 2020; 34:783-803. [PMID: 32112286 PMCID: PMC7293200 DOI: 10.1007/s10822-020-00300-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 07/24/2019] [Accepted: 02/13/2020] [Indexed: 12/31/2022]
Abstract
Reaction-based de novo design refers to the in-silico generation of novel chemical structures by combining reagents using structural transformations derived from known reactions. The driver for using reaction-based transformations is to increase the likelihood of the designed molecules being synthetically accessible. We have previously described a reaction-based de novo design method based on reaction vectors which are transformation rules that are encoded automatically from reaction databases. A limitation of reaction vectors is that they account for structural changes that occur at the core of a reaction only, and they do not consider the presence of competing functionalities that can compromise the reaction outcome. Here, we present the development of a Reaction Class Recommender to enhance the reaction vector framework. The recommender is intended to be used as a filter on the reaction vectors that are applied during de novo design to reduce the combinatorial explosion of in-silico molecules produced while limiting the generated structures to those which are most likely to be synthesisable. The recommender has been validated using an external data set extracted from the recent medicinal chemistry literature and in two simulated de novo design experiments. Results suggest that the use of the recommender drastically reduces the number of solutions explored by the algorithm while preserving the chance of finding relevant solutions and increasing the global synthetic accessibility of the designed molecules.
Affiliation(s)
- Gian Marco Ghiandoni
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
| | - Michael J Bodkin
- Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
| | - Beining Chen
- Chemistry Department, University of Sheffield, Dainton Building, Brook Hill, Sheffield, S3 7HF, UK
| | - Dimitar Hristozov
- Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
| | - James E A Wallace
- Evotec (U.K.) Ltd, 114 Innovation Drive, Milton Park, Abingdon, OX14 4RZ, UK
| | - James Webster
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK
| | - Valerie J Gillet
- Information School, University of Sheffield, Regent Court, 211 Portobello, Sheffield, S1 4DP, UK.
| |
|
9
|
Bioactivity Prediction Using Convolutional Neural Network. ADVANCES IN INTELLIGENT SYSTEMS AND COMPUTING 2020. [DOI: 10.1007/978-3-030-33582-3_33] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 02/01/2023]
|
10
|
Mangiatordi GF, Alberga D, Altomare CD, Carotti A, Catto M, Cellamare S, Gadaleta D, Lattanzi G, Leonetti F, Pisani L, Stefanachi A, Trisciuzzi D, Nicolotti O. Mind the Gap! A Journey towards Computational Toxicology. Mol Inform 2016; 35:294-308. [PMID: 27546034 DOI: 10.1002/minf.201501017] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 11/06/2015] [Accepted: 03/23/2016] [Indexed: 11/11/2022]
Abstract
Computational methods have advanced toxicology towards the development of target-specific models based on a clear cause-effect rationale. However, the predictive potential of these models presents strengths and weaknesses. On the good side, in silico models are valuable cheap alternatives to in vitro and in vivo experiments. On the other, the unconscious use of in silico methods can mislead end-users with elusive results. The focus of this review is on the basic scientific and regulatory recommendations in the derivation and application of computational models. Attention is paid to examine the interplay between computational toxicology and drug discovery and development. Avoiding the easy temptation of an overoptimistic future, we report our view on what can, or cannot, realistically be done. Indeed, studies of safety/toxicity represent a key element of chemical prioritization programs carried out by chemical industries, and primarily by pharmaceutical companies.
Affiliation(s)
- Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Domenico Alberga
- Dipartimento Interateneo di Fisica 'M. Merlin', Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Cosimo Damiano Altomare
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Angelo Carotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Marco Catto
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Saverio Cellamare
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Domenico Gadaleta
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Gianluca Lattanzi
- Dipartimento Interateneo di Fisica 'M. Merlin', Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Francesco Leonetti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Leonardo Pisani
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Angela Stefanachi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università di Bari 'Aldo Moro', Via Orabona, 4, 70126, Bari, Italy.
| |
|
11
|
Wei Y, Li J, Chen Z, Wang F, Huang W, Hong Z, Lin J. Multistage virtual screening and identification of novel HIV-1 protease inhibitors by integrating SVM, shape, pharmacophore and docking methods. Eur J Med Chem 2015; 101:409-18. [PMID: 26185005 DOI: 10.1016/j.ejmech.2015.06.054] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 05/16/2015] [Revised: 06/28/2015] [Accepted: 06/29/2015] [Indexed: 11/30/2022]
Abstract
The HIV-1 protease has proven to be a crucial component of the HIV replication machinery and a reliable target for anti-HIV drug discovery. In this study, we applied an optimized hierarchical multistage virtual screening method targeting HIV-1 protease. The method sequentially applied SVM (Support Vector Machine), shape similarity, pharmacophore modeling and molecular docking. Using a validation set (270 positives, 155,996 negatives), the multistage virtual screening method showed a high hit rate and high enrichment factor of 80.47% and 465.75, respectively. Furthermore, this approach was applied to screen the National Cancer Institute database (NCI), which contains 260,000 molecules. From the final hit list, 6 molecules were selected for further testing in an in vitro HIV-1 protease inhibitory assay, and 2 molecules (NSC111887 and NSC121217) showed inhibitory potency against HIV-1 protease, with IC50 values of 62 μM and 162 μM, respectively. With further chemical development, these 2 molecules could potentially serve as HIV-1 protease inhibitors.
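As a rough consistency check on the figures quoted in this abstract, the enrichment factor can be recomputed from the hit rate and the composition of the validation set. The sketch below assumes the conventional definition (hit rate divided by the fraction of actives in the screened set), which the abstract does not state explicitly; the function name `enrichment_factor` is illustrative and not taken from the paper.

```python
def enrichment_factor(hit_rate: float, n_actives: int, n_total: int) -> float:
    """Enrichment factor = hit rate among selected compounds / base rate of actives.

    Assumption: this is the conventional definition; the abstract does not
    spell out the formula the authors used.
    """
    if n_actives <= 0 or n_total <= 0:
        raise ValueError("n_actives and n_total must be positive")
    base_rate = n_actives / n_total
    return hit_rate / base_rate


# Validation set from the abstract: 270 positives, 155,996 negatives.
n_actives = 270
n_total = 270 + 155_996
ef = enrichment_factor(0.8047, n_actives, n_total)
print(f"base rate = {n_actives / n_total:.5f}, EF = {ef:.2f}")
```

With these inputs the result lands close to the reported 465.75; the small residual difference is consistent with the hit rate having been rounded to two decimals.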
Affiliation(s)
- Yu Wei
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, PR China; College of Pharmacy, Nankai University, Tianjin 300071, PR China
| | - Jinlong Li
- College of Pharmacy, Nankai University, Tianjin 300071, PR China
| | - Zeming Chen
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, PR China; College of Life Sciences, Nankai University, Tianjin 300071, PR China
| | - Fengwei Wang
- Department of Oncology, Tianjin Union Medical Center, Tianjin 300180, PR China
| | | | - Zhangyong Hong
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, PR China; College of Life Sciences, Nankai University, Tianjin 300071, PR China.
| | - Jianping Lin
- State Key Laboratory of Medicinal Chemical Biology, Nankai University, Tianjin 300071, PR China; College of Pharmacy, Nankai University, Tianjin 300071, PR China.
| |
|
12
|
Balfer J, Bajorath J. Visualization and Interpretation of Support Vector Machine Activity Predictions. J Chem Inf Model 2015; 55:1136-47. [DOI: 10.1021/acs.jcim.5b00175] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 01/28/2023]
Affiliation(s)
- Jenny Balfer
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| |
|
13
|
Afzal AM, Mussa HY, Turner RE, Bender A, Glen RC. A multi-label approach to target prediction taking ligand promiscuity into account. J Cheminform 2015; 7:24. [PMID: 26064191 PMCID: PMC4461803 DOI: 10.1186/s13321-015-0071-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 02/04/2015] [Accepted: 04/27/2015] [Indexed: 11/10/2022]
Abstract
BACKGROUND: According to Cobanoglu et al., it is now widely acknowledged that the single target paradigm (one protein/target, one disease, one drug) that has been the dominant premise in drug development in the recent past is untenable. More often than not, a drug-like compound (ligand) can be promiscuous - it can interact with more than one target protein. In recent years, in in silico target prediction methods the promiscuity issue has generally been approached computationally in three main ways: ligand-based methods; target-protein-based methods; and integrative schemes. In this study we confine attention to ligand-based target prediction machine learning approaches, commonly referred to as target-fishing. The target-fishing approaches that are currently ubiquitous in cheminformatics literature can be essentially viewed as single-label multi-classification schemes; these approaches inherently bank on the single target paradigm assumption that a ligand can zero in on one single target. In order to address the ligand promiscuity issue, one might be able to cast target-fishing as a multi-label multi-class classification problem. For illustrative and comparison purposes, single-label and multi-label Naïve Bayes classification models (denoted here by SMM and MMM, respectively) for target-fishing were implemented. The models were constructed and tested on 65,587 compounds/ligands and 308 targets retrieved from the ChEMBL17 database. RESULTS: On classifying 3,332 test multi-label (promiscuous) compounds, SMM and MMM performed differently. At the 0.05 significance level, a Wilcoxon signed rank test performed on the paired target predictions yielded by SMM and MMM for the test ligands gave a p-value < 5.1 × 10⁻⁹⁴ and test statistics value of 6.8 × 10⁵, in favour of MMM.
The two models performed differently when tested on four datasets comprising single-label (non-promiscuous) compounds; McNemar's test yielded χ² values of 15.657, 16.500 and 16.405 (with corresponding p-values of 7.594 × 10⁻⁵, 4.865 × 10⁻⁵ and 5.115 × 10⁻⁵), respectively, for three test sets, in favour of MMM. The models performed similarly on the fourth set. CONCLUSIONS: The target prediction results obtained in this study indicate that multi-label multi-class approaches are more apt than the ubiquitous single-label multi-class schemes when it comes to the application of ligand-based classifiers to target-fishing.
Affiliation(s)
- Avid M Afzal
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
- Hamse Y Mussa
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
- Richard E Turner
- Department of Engineering, Computational and Biological Learning Lab, University of Cambridge, Trumpington Street, Cambridge, CB2 1PZ UK
- Andreas Bender
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
- Robert C Glen
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW UK
|
14
|
Korkmaz S, Zararsiz G, Goksuluk D. Drug/nondrug classification using Support Vector Machines with various feature selection strategies. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2014; 117:51-60. [PMID: 25224081 DOI: 10.1016/j.cmpb.2014.08.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 05/02/2014] [Revised: 08/15/2014] [Accepted: 08/27/2014] [Indexed: 06/03/2023]
Abstract
In conjunction with advances in computer technology, virtual screening of small molecules has come into use in drug discovery. Since there are thousands of compounds in the early phase of drug discovery, a fast classification method that can distinguish between active and inactive molecules can be used for screening large compound collections. In this study, we used Support Vector Machines (SVM) for this type of classification task. SVM is a powerful classification tool that is becoming increasingly popular in various machine-learning applications. The data sets consist of 631 compounds for the training set and 216 compounds for a separate test set. In the data pre-processing step, Pearson's correlation coefficient was used as a filter to eliminate redundant features. After application of the correlation filter, a single SVM was applied to this reduced data set. Moreover, we investigated the performance of SVM with different feature selection strategies, including SVM-Recursive Feature Elimination, Wrapper Method and Subset Selection. All feature selection methods generally performed better than a single SVM, while Subset Selection outperformed the other feature selection methods. We tested SVM as a classification tool in a real-life drug discovery problem, and our results revealed that it could be a useful method for classification tasks in the early phase of drug discovery.
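The correlation-filter step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract gives neither the cut-off nor the rule for deciding which member of a correlated pair is dropped, so the `threshold` of 0.9, the greedy keep-first order and the function names are all assumptions.

```python
import math


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson's correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant column has no defined correlation; treat it as uncorrelated.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlation_filter(columns: dict[str, list[float]], threshold: float = 0.9) -> list[str]:
    """Greedily keep features whose |r| with every already-kept feature is below threshold.

    The threshold value and the keep-first ordering are assumptions for illustration.
    """
    kept: list[str] = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Toy descriptor table: "b" is an exact multiple of "a" and is therefore redundant.
    descriptors = {
        "a": [1.0, 2.0, 3.0, 4.0],
        "b": [2.0, 4.0, 6.0, 8.0],
        "c": [1.0, 0.0, 1.0, 0.0],
    }
    print(correlation_filter(descriptors))
```

The reduced feature set would then be passed to the classifier; because the filter never looks at the class labels, it can be fitted on the training set alone and reused unchanged on the test set.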
Affiliation(s)
- Selcuk Korkmaz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
- Gokmen Zararsiz
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
- Dincer Goksuluk
- Hacettepe University, Faculty of Medicine, Department of Biostatistics, 06100 Sihhiye, Ankara, Turkey
|
15
|
Comparison of two methods forecasting binding rate of plasma protein. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2014; 2014:957154. [PMID: 25161695 PMCID: PMC4137739 DOI: 10.1155/2014/957154] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/25/2014] [Revised: 07/18/2014] [Accepted: 07/18/2014] [Indexed: 11/17/2022]
Abstract
By introducing the descriptors calculated from the molecular structure, the binding rates of plasma protein (BRPP) with seventy diverse drugs are modeled by a quantitative structure-activity relationship (QSAR) technique. Two algorithms, heuristic algorithm (HA) and support vector machine (SVM), are used to establish linear and nonlinear models to forecast BRPP. Empirical analysis shows that there are good performances for HA and SVM with cross-validation correlation coefficients R²cv of 0.80 and 0.83. Comparing HA with SVM, it was found that SVM has more stability and more robustness to forecast BRPP.
|
16
|
Balfer J, Hu Y, Bajorath J. Compound Structure-Independent Activity Prediction in High-Dimensional Target Space. Mol Inform 2014; 33:544-58. [PMID: 27486040 DOI: 10.1002/minf.201400051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2014] [Accepted: 05/20/2014] [Indexed: 11/10/2022]
Abstract
Profiling of compound libraries against arrays of targets has become an important approach in pharmaceutical research. The prediction of multi-target compound activities also represents an attractive task for machine learning with potential for drug discovery applications. Herein, we have explored activity prediction in high-dimensional target space. Different types of models were derived to predict multi-target activities. The models included naïve Bayesian (NB) and support vector machine (SVM) classifiers based upon compound structure information and NB models derived on the basis of activity profiles, without considering compound structure. Because the latter approach can be applied to incomplete training data and principally depends on the feature independence assumption, SVM modeling was not applicable in this case. Furthermore, iterative hybrid NB models making use of both activity profiles and compound structure information were built. In high-dimensional target space, NB models utilizing activity profile data were found to yield more accurate activity predictions than structure-based NB and SVM models or hybrid models. An in-depth analysis of activity profile-based models revealed the presence of correlation effects across different targets and rationalized prediction accuracy. Taken together, the results indicate that activity profile information can be effectively used to predict the activity of test compounds against novel targets.
Affiliation(s)
- Jenny Balfer
- Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany; tel: +49-228-2699-306; fax: +49-228-2699-341
- Ye Hu
- Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany; tel: +49-228-2699-306; fax: +49-228-2699-341
- Jürgen Bajorath
- Department of Life Science Informatics, Bonn-Aachen International Center for Information Technology, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany; tel: +49-228-2699-306; fax: +49-228-2699-341
|
17
|
Balfer J, Heikamp K, Laufer S, Bajorath J. Modeling of Compound Profiling Experiments Using Support Vector Machines. Chem Biol Drug Des 2014; 84:75-85. [DOI: 10.1111/cbdd.12294] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/02/2013] [Revised: 01/06/2014] [Accepted: 01/19/2014] [Indexed: 11/28/2022]
Affiliation(s)
- Jenny Balfer
- Department of Life Science Informatics; B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry; Rheinische Friedrich-Wilhelms-Universität; Dahlmannstr. 2 D-53113 Bonn Germany
- Kathrin Heikamp
- Department of Life Science Informatics; B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry; Rheinische Friedrich-Wilhelms-Universität; Dahlmannstr. 2 D-53113 Bonn Germany
- Stefan Laufer
- Department of Pharmacy and Biochemistry, Pharmaceutical/Medicinal Chemistry; Eberhard-Karls-Universität Tübingen; Auf der Morgenstelle 8 D-72076 Tübingen Germany
- Jürgen Bajorath
- Department of Life Science Informatics; B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry; Rheinische Friedrich-Wilhelms-Universität; Dahlmannstr. 2 D-53113 Bonn Germany
|
18
|
Abstract
Drug action can be rationalized as interaction of a molecule with proteins in a regulatory network of targets from a specific biological system. Both drug and side effects are often governed by interaction of the drug molecule with many, often unrelated, targets. Accordingly, arrays of protein–ligand interaction data from numerous in vitro profiling assays today provide growing evidence of polypharmacological drug interactions, even for marketed drugs. In vitro off-target profiling has therefore become an important tool in early drug discovery to learn about potential off-target liabilities, which are sometimes beneficial, but more often safety relevant. The rapidly developing field of in silico profiling approaches is complementing in vitro profiling. These approaches capitalize from large amounts of biochemical data from multiple sources to be exploited for optimizing undesirable side effects in pharmaceutical research. Therefore, current in silico profiling models are nowadays perceived as valuable tools in drug discovery, and promise a platform to support optimally informed decisions.
|
19
|
Abdo A, Leclère V, Jacques P, Salim N, Pupin M. Prediction of new bioactive molecules using a Bayesian belief network. J Chem Inf Model 2014; 54:30-6. [PMID: 24392938 DOI: 10.1021/ci4004909] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Abstract
Natural products and synthetic compounds are a valuable source of new small molecules leading to novel drugs to cure diseases. However identifying new biologically active small molecules is still a challenge. In this paper, we introduce a new activity prediction approach using Bayesian belief network for classification (BBNC). The roots of the network are the fragments composing a compound. The leaves are, on one side, the activities to predict and, on another side, the unknown compound. The activities are represented by sets of known compounds, and sets of inactive compounds are also used. We calculated a similarity between an unknown compound and each activity class. The more similar activity is assigned to the unknown compound. We applied this new approach on eight well-known data sets extracted from the literature and compared its performance to three classical machine learning algorithms. Experiments showed that BBNC provides interesting prediction rates (from 79% accuracy for high diverse data sets to 99% for low diverse ones) with a short time calculation. Experiments also showed that BBNC is particularly effective for homogeneous data sets but has been found to perform less well with structurally heterogeneous sets. However, it is important to stress that we believe that using several approaches whenever possible for activity prediction can often give a broader understanding of the data than using only one approach alone. Thus, BBNC is a useful addition to the computational chemist's toolbox.
Affiliation(s)
- Ammar Abdo
- LIFL UMR CNRS 8022 Université Lille1 and INRIA Lille Nord Europe, 59655 Villeneuve d'Ascq cedex, France
|
20
|
Kawai K, Nagata N, Takahashi Y. De novo design of drug-like molecules by a fragment-based molecular evolutionary approach. J Chem Inf Model 2014; 54:49-56. [PMID: 24372539 DOI: 10.1021/ci400418c] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Indexed: 01/10/2023]
Abstract
This paper describes a similarity-driven simple evolutionary approach to producing candidate molecules of new drugs. The aim of the method is to explore the candidates that are structurally similar to the reference molecule and yet somewhat different in not only peripheral chains but also their scaffolds. The method employs a known active molecule of our interest as a reference molecule which is used to navigate a huge chemical space. The reference molecule is also used to obtain seed fragments. An initial set of individual structures is prepared with the seed fragments and additional fragments using several connection rules. The fragment library is preferably prepared from a collection of known molecules related to the target of the reference molecule. Every fragment of the library can be used for fragment-based mutation. All the fragments are categorized into three classes; rings, linkers, and side chains. New individuals are produced by the crossover and the fragment-based mutation with the fragment library. Computer experiments with our own fragment library prepared from GPCR SARfari verified the feasibility of our approach to drug discovery.
Affiliation(s)
- Kentaro Kawai
- Central Research Laboratories, Kaken Pharmaceutical Co. Ltd., 14, Shinomiya Minamikawara-cho, Yamashina, Kyoto 607-8042, Japan
|
21
|
|
22
|
Schuster D. 3D pharmacophores as tools for activity profiling. DRUG DISCOVERY TODAY. TECHNOLOGIES 2013; 7:e203-70. [PMID: 24103796 DOI: 10.1016/j.ddtec.2010.11.006] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Indexed: 12/17/2022]
|
23
|
Jalali-Heravi M, Mani-Varnosfaderani A, Valadkhani A. Integrated One-Against-One Classifiers as Tools for Virtual Screening of Compound Databases: A Case Study with CNS Inhibitors. Mol Inform 2013; 32:742-53. [DOI: 10.1002/minf.201200126] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/21/2012] [Accepted: 05/16/2013] [Indexed: 11/07/2022]
|
24
|
Heikamp K, Bajorath J. Prediction of Compounds with Closely Related Activity Profiles Using Weighted Support Vector Machine Linear Combinations. J Chem Inf Model 2013; 53:791-801. [DOI: 10.1021/ci400090t] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 02/08/2023]
Affiliation(s)
- Kathrin Heikamp
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
|
25
|
Sato T, Yuki H, Takaya D, Sasaki S, Tanaka A, Honma T. Application of Support Vector Machine to Three-Dimensional Shape-Based Virtual Screening Using Comprehensive Three-Dimensional Molecular Shape Overlay with Known Inhibitors. J Chem Inf Model 2012; 52:1015-26. [DOI: 10.1021/ci200562p] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 02/06/2023]
Affiliation(s)
- Tomohiro Sato
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Hitomi Yuki
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Daisuke Takaya
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Shunta Sasaki
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Akiko Tanaka
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
- Teruki Honma
- RIKEN Systems and Structural Biology Center, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan
|
26
|
Goyal RK, Dureja H, Singh G, Madan AK. Models for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. Sci Pharm 2010; 78:791-820. [PMID: 21179317 PMCID: PMC3007618 DOI: 10.3797/scipharm.1006-03] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/13/2010] [Accepted: 08/12/2010] [Indexed: 11/26/2022] Open
Abstract
The relationship between topological indices and antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines has been investigated. A data set consisting of 31 analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines was selected for the present study. The values of numerous topostructural and topochemical indices for each of 31 differently substituted analogues of the data set were computed using an in-house computer program. Resulting data was analyzed and suitable models were developed through decision tree, random forest and moving average analysis (MAA). The goodness of the models was assessed by calculating overall accuracy of prediction, sensitivity, specificity and Matthews correlation coefficient. Pendentic eccentricity index, a novel highly discriminating, non-correlating pendenticity based topochemical descriptor, was also conceptualized and successfully utilized for the development of a model for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. The proposed index exhibited not only high sensitivity towards both the presence as well as relative position(s) of pendent/heteroatom(s) but also led to significant reduction in degeneracy. Random forest correctly classified the analogues into active and inactive with an accuracy of 67.74%. A decision tree was also employed for determining the importance of molecular descriptors. The decision tree learned the information from the input data with an accuracy of 100% and correctly predicted the cross-validated (10 fold) data with accuracy up to 77.4%. Statistical significance of proposed models was also investigated using intercorrelation analysis. Accuracy of prediction of proposed MAA models ranged from 90.4 to 91.6%.
Affiliation(s)
- Rakesh K Goyal
- Faculty of Pharmaceutical Sciences, Pt. B.D. Sharma University of Health Sciences, Rohtak, 124 001, India
|
27
|
Michielan L, Moro S. Pharmaceutical Perspectives of Nonlinear QSAR Strategies. J Chem Inf Model 2010; 50:961-78. [DOI: 10.1021/ci100072z] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 12/31/2022]
Affiliation(s)
- Lisa Michielan
- Molecular Modeling Section (MMS), Dipartimento di Scienze Farmaceutiche, Università di Padova, via Marzolo 5, I-35131 Padova, Italy
- Stefano Moro
- Molecular Modeling Section (MMS), Dipartimento di Scienze Farmaceutiche, Università di Padova, via Marzolo 5, I-35131 Padova, Italy
|
28
|
Geppert H, Vogt M, Bajorath J. Current trends in ligand-based virtual screening: molecular representations, data mining methods, new application areas, and performance evaluation. J Chem Inf Model 2010; 50:205-16. [PMID: 20088575 DOI: 10.1021/ci900419k] [Citation(s) in RCA: 231] [Impact Index Per Article: 15.4] [Indexed: 12/19/2022]
Affiliation(s)
- Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universitat, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
29
|
Kondratovich EP, Zhokhova NI, Baskin II, Palyulin VA, Zefirov NS. Fragmental descriptors in (Q)SAR: prediction of the assignment of organic compounds to pharmacological groups using the support vector machine approach. Russ Chem Bull 2010. [DOI: 10.1007/s11172-009-0076-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/19/2022]
|
30
|
Kawai K, Takahashi Y. Virtual Screening of Antihypertensive Drugs Using Support Vector Machines. JOURNAL OF COMPUTER CHEMISTRY-JAPAN 2010. [DOI: 10.2477/jccj.h2137] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/05/2022]
|
31
|
|
32
|
Wale N, Karypis G. Target fishing for chemical compounds using target-ligand activity data and ranking based methods. J Chem Inf Model 2009; 49:2190-201. [PMID: 19764745 DOI: 10.1021/ci9000376] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.4] [Indexed: 11/30/2022]
Abstract
In recent years, the development of computational techniques that identify all the likely targets for a given chemical compound, also termed the problem of Target Fishing, has been an active area of research. Identification of likely targets of a chemical compound in the early stages of drug discovery helps to understand issues such as selectivity, off-target pharmacology, and toxicity. In this paper, we present a set of techniques whose goal is to rank or prioritize targets in the context of a given chemical compound so that most targets against which this compound may show activity appear higher in the ranked list. These methods are based on our extensions to the SVM and ranking perceptron algorithms for this problem. Our extensive experimental study shows that the methods developed in this work outperform previous approaches by 2% to 60% under different evaluation criteria.
Affiliation(s)
- Nikil Wale
- Department of Computer Science, University of Minnesota, Twin Cities, Minnesota 55455, USA.
|
33
|
Liao Q, Wang J, Webster Y, Watson IA. GPU Accelerated Support Vector Machines for Mining High-Throughput Screening Data. J Chem Inf Model 2009; 49:2718-25. [DOI: 10.1021/ci900337f] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Indexed: 12/27/2022]
Affiliation(s)
- Quan Liao
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
- Jibo Wang
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
- Yue Webster
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
- Ian A. Watson
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
|
34
|
Kim BM, Kang BY, Kim HG, Baek SH. Prognosis prediction for Class III malocclusion treatment by feature wrapping method. Angle Orthod 2009; 79:683-91. [PMID: 19537866 DOI: 10.2319/071508-371.1] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 07/01/2008] [Accepted: 08/01/2008] [Indexed: 11/23/2022] Open
Abstract
OBJECTIVE To use the feature wrapping (FW) method to identify which cephalometric markers show the highest classification accuracy in prognosis prediction for Class III malocclusion and to compare the prediction accuracy between the FW method and conventional statistical methods such as discriminant analysis (DA). MATERIALS AND METHODS The sample set consisted of 38 patients (15 boys and 23 girls, mean age 8.53 ± 1.36 years) who were diagnosed with Class III malocclusion and received both first-phase (orthopedic) and second-phase (fixed orthodontic) treatments. Lateral cephalograms were taken before (T0) and after first-phase treatment (T1) and after second-phase treatment and retention (T2). Based on the measurements taken at the T2 stage, the patients were allocated into good (n = 20) or poor (n = 18) prognosis groups. Forty-six cephalometric variables on T0 lateral cephalograms were analyzed by the FW method to identify key determinants for discriminating between the two groups. Sequential forward search (SFS) algorithm and support vector machine (SVM) were used in conjunction with the FW method to improve classification accuracy. To compare the prediction accuracy of the FW method with conventional statistical methods, DA was performed for the same data set. RESULTS AB to mandibular plane angle (°) and A to N-perpendicular (mm) were selected as the most accurate cephalometric predictors by both the FW and DA methods. However, classification accuracy was higher with the FW method (97.2%) compared with DA (92.1%), because the FW method with SFS and SVM has a more precise classification algorithm. CONCLUSIONS The FW method, which uses a learning algorithm, might be an effective alternative to DA for prognosis prediction.
Affiliation(s)
- Bo-Mi Kim
- Department of Orthodontics, School of Dentistry, Dental Research Institute, Seoul National University, Seoul, South Korea
|
35
|
Kawai K, Takahashi Y. Identification of the Dual Action Antihypertensive Drugs Using TFS-Based Support Vector Machines. CHEM-BIO INFORMATICS JOURNAL 2009. [DOI: 10.1273/cbij.9.41] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Affiliation(s)
- Kentaro Kawai
- Department of Knowledge-Based Information Engineering, Toyohashi University of Technology
- Kaken Pharmaceutical
- Yoshimasa Takahashi
- Department of Knowledge-Based Information Engineering, Toyohashi University of Technology
|