1
Curcio A, Rocca R, Alcaro S, Artese A. The Histone Deacetylase Family: Structural Features and Application of Combined Computational Methods. Pharmaceuticals (Basel) 2024; 17:620. [PMID: 38794190 PMCID: PMC11124352 DOI: 10.3390/ph17050620] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 05/03/2024] [Accepted: 05/08/2024] [Indexed: 05/26/2024] Open
Abstract
Histone deacetylases (HDACs) are crucial in gene transcription, removing acetyl groups from histones. They also influence the deacetylation of non-histone proteins, contributing to the regulation of various biological processes. Thus, HDACs play pivotal roles in various diseases, including cancer, neurodegenerative disorders, and inflammatory conditions, highlighting their potential as therapeutic targets. This paper reviews the structure and function of the four classes of human HDACs. While four HDAC inhibitors are currently available for treating hematological malignancies, numerous others are undergoing clinical trials. However, their non-selective toxicity necessitates ongoing research into safer and more efficient class-selective or isoform-selective inhibitors. Computational methods have aided the discovery of HDAC inhibitors with the desired potency and/or selectivity. These methods include ligand-based approaches, such as scaffold hopping, pharmacophore modeling, three-dimensional quantitative structure-activity relationships, and structure-based virtual screening (molecular docking). Moreover, recent developments in the field of molecular dynamics simulations, combined with Poisson-Boltzmann/molecular mechanics generalized Born surface area techniques, have improved the prediction of ligand binding affinity. In this review, we delve into the ways in which these methods have contributed to designing and identifying HDAC inhibitors.
Affiliation(s)
- Antonio Curcio
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy; (A.C.); (S.A.); (A.A.)
- Roberta Rocca
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy; (A.C.); (S.A.); (A.A.)
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy
- Stefano Alcaro
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy; (A.C.); (S.A.); (A.A.)
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy
- Anna Artese
  - Dipartimento di Scienze della Salute, Campus “S. Venuta”, Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy; (A.C.); (S.A.); (A.A.)
  - Net4Science S.r.l., Università degli Studi “Magna Græcia” di Catanzaro, Viale Europa, 88100 Catanzaro, Italy
2
Gajjala RR, Chinta RR, Gopireddy VSR, Poola S, Balam SK, Chintha V, Pasupuleti VR, Avula VKR, Vallela S, Vasilievich Zyryanov G, Cirandur SR. Ethyl-4-(aryl)-6-methyl-2-(oxo/thio)-3,4-dihydro-1H-pyrimidine-5-carboxylates: Silica supported bismuth(III)triflate catalyzed synthesis and antioxidant activity. Bioorg Chem 2022; 129:106205. [DOI: 10.1016/j.bioorg.2022.106205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 10/06/2022] [Accepted: 10/08/2022] [Indexed: 11/02/2022]
3
Sreelakshmi P, Krishna BS, Santhisudha S, Murali S, Reddy GR, Venkataramaiah C, Rao PV, Reddy AVK, Swetha V, Zyryanov GV, Reddy CD, Reddy CS. Synthesis and biological evaluation of novel dialkyl (4-amino-5H-chromeno[2,3-d]pyrimidin-5-yl)phosphonates. Bioorg Chem 2022; 129:106121. [PMID: 36075177 DOI: 10.1016/j.bioorg.2022.106121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2022] [Revised: 08/28/2022] [Accepted: 08/29/2022] [Indexed: 11/02/2022]
Abstract
This study reports the design and synthesis of novel dialkyl (4-amino-5H-chromeno[2,3-d]pyrimidin-5-yl)phosphonates as potential antitumor agents against A549 (lung cancer), DU-145 (prostate cancer), PC-3 (prostate cancer), HeLa (cervical cancer) and MCF-7 (breast cancer), cell lines evidenced from the in vitro antitumor studies performed by MTT assay (across 10-30 μM concentrations). The structural eminence of these synthesized molecules has emanated by designing the structural core by uniting the chromene, pyrimidine and phosphonate moieties into one, which has augmented their novelty and made them unreported. Further the deep structural activity relationship study investigations articulated that the title compounds are promising drug-like compounds and potential inhibitor of histidine amino acid residue present on the respective enzymatic proteins [3QJZ (A549), 3VHE (DU-145), 3V49 (PC-3), 3F81 (HeLa), & 3R7Q (MCF-7)] of the cell lines screened and are identified as responsible for the multi-faceted antitumor activities predicted in vitro. The obtained results were further supported by molecular docking studies, QSAR, ADMET, and bioactivity studies which have supported them as potential BBB penetrable molecules and proficient CNS active neuro-protective agents during drug delivery.
Affiliation(s)
- Poola Sreelakshmi
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517 502, India
- Sarva Santhisudha
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517 502, India
- Sudileti Murali
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517 502, India
- Chintha Venkataramaiah
  - Department of Zoology, Sri Venkateswara University, Tirupati 517 502, India; Department of Medical Environmental Biology and Tropical Medicine, School of Medicine, Kangwon National University, Chuncheon, Gangwon-Do 24341, Republic of Korea
- Pasupuleti Visweswara Rao
  - Centre for International Collaboration and Research, Reva University, Rukmini Knowledge Park, Bangalore 560 064, India; Department of Biochemistry, Faculty of Medicine and Health Sciences, Abdurrab University, Jl Riau Ujung No. 73, Pekanbaru 28292, Riau, Indonesia.
- Avula Vijaya Kumar Reddy
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation
- Vallela Swetha
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation
- Grigory Vasilievich Zyryanov
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation; Ural Division of the Russian Academy of Sciences, I. Ya. Postovskiy Institute of Organic Synthesis, 22 S., Kovalevskoy Street, Yekaterinburg 620219, Russian Federation
4
Turbo prediction: a new approach for bioactivity prediction. J Comput Aided Mol Des 2022; 36:77-85. [DOI: 10.1007/s10822-021-00440-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/17/2021] [Accepted: 12/17/2021] [Indexed: 12/29/2022]
5
Selvaraj C, Chandra I, Singh SK. Artificial intelligence and machine learning approaches for drug design: challenges and opportunities for the pharmaceutical industries. Mol Divers 2021; 26:1893-1913. [PMID: 34686947 PMCID: PMC8536481 DOI: 10.1007/s11030-021-10326-z] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 04/05/2021] [Accepted: 09/24/2021] [Indexed: 12/27/2022]
Abstract
The global spread of COVID-19 has raised the importance of pharmaceutical drug development as intractable and hot research. Developing new drug molecules to overcome any disease is a costly and lengthy process, but the process continues uninterrupted. The critical point to consider the drug design is to use the available data resources and to find new and novel leads. Once the drug target is identified, several interdisciplinary areas work together with artificial intelligence (AI) and machine learning (ML) methods to get enriched drugs. These AI and ML methods are applied in every step of the computer-aided drug design, and integrating these AI and ML methods results in a high success rate of hit compounds. In addition, this AI and ML integration with high-dimension data and its powerful capacity have taken a step forward. Clinical trials output prediction through the AI/ML integrated models could further decrease the clinical trials cost by also improving the success rate. Through this review, we discuss the backend of AI and ML methods in supporting the computer-aided drug design, along with its challenge and opportunity for the pharmaceutical industry. From the available information or data, the AI and ML based prediction for the high throughput virtual screening. After this integration of AI and ML, the success rate of hit identification has gained a momentum with huge success by providing novel drugs.
Affiliation(s)
- Chandrabose Selvaraj
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India.
- Ishwar Chandra
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India
- Sanjeev Kumar Singh
  - CADD and Molecular Modelling Lab, Department of Bioinformatics, Alagappa University, Science Block, Karaikudi, Tamil Nadu, 630004, India.
6
Nadiveedhi M, Nuthalapati P, Gundluru M, Yanamula MR, Kallimakula SV, Pasupuleti VR, Avula VKR, Vallela S, Zyryanov GV, Balam SK, Cirandur SR. Green Synthesis, Antioxidant, and Plant Growth Regulatory Activities of Novel α-Furfuryl-2-alkylaminophosphonates. ACS OMEGA 2021; 6:2934-2948. [PMID: 33553912 PMCID: PMC7860093 DOI: 10.1021/acsomega.0c05302] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/30/2020] [Accepted: 12/18/2020] [Indexed: 05/05/2023]
Abstract
A series of novel α-furfuryl-2-alkylaminophosphonates have been efficiently synthesized from the one-pot three-component classical Kabachnik-Fields reaction in a green chemical approach by addition of an in situ generated dialkylphosphite to Schiff's base of aldehydes and amines by using environmental and eco-friendly silica gel supported iodine as a catalyst by microwave irradiation. The advantage of this protocol is simplicity in experimental procedures and products were resulted in high isolated yields. The synthesized α-furfuryl-2-alkylaminophosphonates were screened to in vitro antioxidant and plant growth regulatory activities and some are found to be potent with antioxidant and plant growth regulatory activities. These in vitro studies have been further supported by ADMET (absorption, distribution, metabolism, excretion, and toxicity), quantitative structure-activity relationship, molecular docking, and bioactivity studies and identified that they were potentially bound to the GLN340 amino acid residue in chain C of 1DNU protein and TYR597 amino acid residue in chain A of 4M7E protein, causing potential exhibition of antioxidant and plant growth regulatory activities. Eventually, title compounds are identified as good blood-brain barrier (BBB)-penetrable compounds and are considered as proficient central nervous system active and neuroprotective antioxidant agents as the neuroprotective property is determined with BBB penetration thresholds.
Affiliation(s)
- Poojith Nuthalapati
  - Sri Ramachandra Institute of Higher Education and Research, Chennai 600116, Tamil Nadu, India
- Mohan Gundluru
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517502, Andhra Pradesh, India
  - DST-PURSE Centre, Sri Venkateswara University, Tirupati 517502, Andhra Pradesh, India
- Mohan Reddy Yanamula
  - Department of Biotechnology, Sri Venkateswara University, Tirupati 517502, Andhra Pradesh, India
- Visweswara Rao Pasupuleti
  - Department of Biomedical Sciences and Therapeutics, Faculty of Medicine and Health Sciences, Universiti Malaysia Sabah, Kota Kinabalu 88400, Sabah, Malaysia
- Vijaya Kumar Reddy Avula
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation
- Swetha Vallela
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation
- Grigory Vasilievich Zyryanov
  - Chemical Engineering Institute, Ural Federal University, Yekaterinburg 620002, Russian Federation
  - Ural Division of the Russian Academy of Sciences, I. Ya. Postovskiy Institute of Organic Synthesis, 22 S. Kovalevskoy Street, Yekaterinburg 620219, Russian Federation
- Satheesh Krishna Balam
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517502, Andhra Pradesh, India
- Suresh Reddy Cirandur
  - Department of Chemistry, Sri Venkateswara University, Tirupati 517502, Andhra Pradesh, India
7
Chiu YC, Chen HIH, Gorthi A, Mostavi M, Zheng S, Huang Y, Chen Y. Deep learning of pharmacogenomics resources: moving towards precision oncology. Brief Bioinform 2020; 21:2066-2083. [PMID: 31813953 PMCID: PMC7711267 DOI: 10.1093/bib/bbz144] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 07/05/2019] [Revised: 08/22/2019] [Accepted: 10/18/2019] [Indexed: 12/13/2022] Open
Abstract
The recent accumulation of cancer genomic data provides an opportunity to understand how a tumor's genomic characteristics can affect its responses to drugs. This field, called pharmacogenomics, is a key area in the development of precision oncology. Deep learning (DL) methodology has emerged as a powerful technique to characterize and learn from rapidly accumulating pharmacogenomics data. We introduce the fundamentals and typical model architectures of DL. We review the use of DL in classification of cancers and cancer subtypes (diagnosis and treatment stratification of patients), prediction of drug response and drug synergy for individual tumors (treatment prioritization for a patient), drug repositioning and discovery and the study of mechanism/mode of action of treatments. For each topic, we summarize current genomics and pharmacogenomics data resources such as pan-cancer genomics data for cancer cell lines (CCLs) and tumors, and systematic pharmacologic screens of CCLs. By revisiting the published literature, including our in-house analyses, we demonstrate the unprecedented capability of DL enabled by rapid accumulation of data resources to decipher complex drug response patterns, thus potentially improving cancer medicine. Overall, this review provides an in-depth summary of state-of-the-art DL methods and up-to-date pharmacogenomics resources and future opportunities and challenges to realize the goal of precision oncology.
Affiliation(s)
- Yu-Chiao Chiu
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Hung-I Harry Chen
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
  - Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
- Aparna Gorthi
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Milad Mostavi
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
  - Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
- Siyuan Zheng
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
  - Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Yufei Huang
  - Department of Electrical and Computer Engineering, the University of Texas at San Antonio, San Antonio, TX 78249, USA
  - Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX 78229, USA
- Yidong Chen
  - Greehey Children’s Cancer Research Institute, University of Texas Health San Antonio, San Antonio, TX 78229, USA
  - Department of Population Health Sciences, University of Texas Health San Antonio, San Antonio, TX 78229, USA
8
Hussain W, Rasool N, Khan YD. Insights into Machine Learning-based Approaches for Virtual Screening in Drug Discovery: Existing Strategies and Streamlining Through FP-CADD. Curr Drug Discov Technol 2020; 18:463-472. [PMID: 32767944 DOI: 10.2174/1570163817666200806165934] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/03/2020] [Revised: 07/01/2020] [Accepted: 07/03/2020] [Indexed: 11/22/2022]
Abstract
BACKGROUND Machine learning is an active area of research in computer science by the availability of big data collection of all sorts prompting interest in the development of novel tools for data mining. Machine learning methods have wide applications in computer-aided drug discovery methods. Most incredible approaches to machine learning are used in drug designing, which further aid the process of biological modelling in drug discovery. Mainly, two main categories are present which are Ligand-Based Virtual Screening (LBVS) and Structure-Based Virtual Screening (SBVS), however, the machine learning approaches fall mostly in the category of LBVS. OBJECTIVES This study exposits the major machine learning approaches being used in LBVS. Moreover, we have introduced a protocol named FP-CADD which depicts a 4-steps rule of thumb for drug discovery, the four protocols of computer-aided drug discovery (FP-CADD). Various important aspects along with SWOT analysis of FP-CADD are also discussed in this article. CONCLUSION By this thorough study, we have observed that in LBVS algorithms, Support Vector Machines (SVM) and Random Forest (RF) are those which are widely used due to high accuracy and efficiency. These virtual screening approaches have the potential to revolutionize the drug designing field. Also, we believe that the process flow presented in this study, named FP-CADD, can streamline the whole process of computer-aided drug discovery. By adopting this rule, the studies related to drug discovery can be made homogeneous and this protocol can also be considered as an evaluation criterion in the peer-review process of research articles.
Affiliation(s)
- Yaser Daanial Khan
  - Department of Computer Science, University of Management and Technology, Lahore, Pakistan
9
Muratov EN, Bajorath J, Sheridan RP, Tetko IV, Filimonov D, Poroikov V, Oprea TI, Baskin II, Varnek A, Roitberg A, Isayev O, Curtarolo S, Fourches D, Cohen Y, Aspuru-Guzik A, Winkler DA, Agrafiotis D, Cherkasov A, Tropsha A. QSAR without borders. Chem Soc Rev 2020; 49:3525-3564. [PMID: 32356548 PMCID: PMC8008490 DOI: 10.1039/d0cs00098a] [Citation(s) in RCA: 319] [Impact Index Per Article: 79.8] [Indexed: 12/15/2022]
Abstract
Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure-activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.
Affiliation(s)
- Eugene N Muratov
  - UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.
10
Chemogenomic Analysis of the Druggable Kinome and Its Application to Repositioning and Lead Identification Studies. Cell Chem Biol 2019; 26:1608-1622.e6. [DOI: 10.1016/j.chembiol.2019.08.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/13/2019] [Revised: 07/18/2019] [Accepted: 08/21/2019] [Indexed: 02/06/2023]
11
Miyao T, Funatsu K, Bajorath J. Exploring Alternative Strategies for the Identification of Potent Compounds Using Support Vector Machine and Regression Modeling. J Chem Inf Model 2018; 59:983-992. [DOI: 10.1021/acs.jcim.8b00584] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Affiliation(s)
- Tomoyuki Miyao
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Kimito Funatsu
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
  - Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
12
13
Riniker S, Landrum GA, Montanari F, Villalba SD, Maier J, Jansen JM, Walters WP, Shelat AA. Virtual-screening workflow tutorials and prospective results from the Teach-Discover-Treat competition 2014 against malaria. F1000Res 2017; 6:1136. [PMID: 28928948 DOI: 10.12688/f1000research.11905.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Accepted: 07/11/2017] [Indexed: 12/21/2022] Open
Abstract
The first challenge in the 2014 competition launched by the Teach-Discover-Treat (TDT) initiative asked for the development of a tutorial for ligand-based virtual screening, based on data from a primary phenotypic high-throughput screen (HTS) against malaria. The resulting Workflows were applied to select compounds from a commercial database, and a subset of those were purchased and tested experimentally for anti-malaria activity. Here, we present the two most successful Workflows, both using machine-learning approaches, and report the results for the 114 compounds tested in the follow-up screen. Excluding the two known anti-malarials quinidine and amodiaquine and 31 compounds already present in the primary HTS, a high hit rate of 57% was found.
Affiliation(s)
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Zürich, Switzerland
- Floriane Montanari
  - Pharmacoinformatics Research Group, Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
- Santiago D Villalba
  - IMP - Research Institute of Molecular Pathology, Vienna Biocenter, Vienna, Austria
- Julie Maier
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Johanna M Jansen
  - Department of Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, CA, USA
- Anang A Shelat
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA
14
Riniker S, Landrum GA, Montanari F, Villalba SD, Maier J, Jansen JM, Walters WP, Shelat AA. Virtual-screening workflow tutorials and prospective results from the Teach-Discover-Treat competition 2014 against malaria. F1000Res 2017. [PMID: 28928948 PMCID: PMC5580409 DOI: 10.12688/f1000research.11905.2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Indexed: 01/14/2023] Open
Abstract
The first challenge in the 2014 competition launched by the Teach-Discover-Treat (TDT) initiative asked for the development of a tutorial for ligand-based virtual screening, based on data from a primary phenotypic high-throughput screen (HTS) against malaria. The resulting Workflows were applied to select compounds from a commercial database, and a subset of those were purchased and tested experimentally for anti-malaria activity. Here, we present the two most successful Workflows, both using machine-learning approaches, and report the results for the 114 compounds tested in the follow-up screen. Excluding the two known anti-malarials quinidine and amodiaquine and 31 compounds already present in the primary HTS, a high hit rate of 57% was found.
Affiliation(s)
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Zürich, Switzerland
- Floriane Montanari
  - Pharmacoinformatics Research Group, Department of Pharmaceutical Chemistry, University of Vienna, Vienna, Austria
- Santiago D Villalba
  - IMP - Research Institute of Molecular Pathology, Vienna Biocenter, Vienna, Austria
- Julie Maier
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Johanna M Jansen
  - Department of Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, CA, USA
- Anang A Shelat
  - Department of Chemical Biology & Therapeutics, St. Jude Children's Research Hospital, Memphis, TN, USA
15
Rodríguez-Pérez R, Vogt M, Bajorath J. Influence of Varying Training Set Composition and Size on Support Vector Machine-Based Prediction of Active Compounds. J Chem Inf Model 2017; 57:710-716. [PMID: 28376613 PMCID: PMC5417594 DOI: 10.1021/acs.jcim.7b00088] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 11/30/2022]
Abstract
Support vector machine (SVM) modeling is one of the most popular machine learning approaches in chemoinformatics and drug design. The influence of training set composition and size on predictions currently is an underinvestigated issue in SVM modeling. In this study, we have derived SVM classification and ranking models for a variety of compound activity classes under systematic variation of the number of positive and negative training examples. With increasing numbers of negative training compounds, SVM classification calculations became increasingly accurate and stable. However, this was only the case if a required threshold of positive training examples was also reached. In addition, consideration of class weights and optimization of cost factors substantially aided in balancing the calculations for increasing numbers of negative training examples. Taken together, the results of our analysis have practical implications for SVM learning and the prediction of active compounds. For all compound classes under study, top recall performance and independence of compound recall of training set composition was achieved when 250–500 active and 500–1000 randomly selected inactive training instances were used. However, as long as ∼50 known active compounds were available for training, increasing numbers of 500–1000 randomly selected negative training examples significantly improved model performance and gave very similar results for different training sets.
Affiliation(s)
- Raquel Rodríguez-Pérez
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
16
Liu J, Ning X. Multi-Assay-Based Compound Prioritization via Assistance Utilization: A Machine Learning Framework. J Chem Inf Model 2017; 57:484-498. [PMID: 28234477 DOI: 10.1021/acs.jcim.6b00737] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Affiliation(s)
- Junfeng Liu
  - Indiana University-Purdue University, Indianapolis, 723 West Michigan St., SL 280, Indianapolis, Indiana 46202, United States
- Xia Ning
  - Indiana University-Purdue University, Indianapolis, 723 West Michigan St., SL 280, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th St., HITS 5000, Indianapolis, Indiana 46202, United States
17
Descriptors and their selection methods in QSAR analysis: paradigm for drug design. Drug Discov Today 2016; 21:1291-302. [DOI: 10.1016/j.drudis.2016.06.013] [Citation(s) in RCA: 162] [Impact Index Per Article: 20.3] [Received: 03/01/2016] [Revised: 04/25/2016] [Accepted: 06/13/2016] [Indexed: 12/22/2022]
18
Du YM, Hu Y, Xia Y, Ouyang Z. Power Normalization for Mass Spectrometry Data Analysis and Analytical Method Assessment. Anal Chem 2016; 88:3156-63. [PMID: 26882462 PMCID: PMC8135100 DOI: 10.1021/acs.analchem.5b04418] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 01/22/2023]
Abstract
Biomarker profiling using mass spectrometry plays an essential role in biological studies and is highly dependent on the data analysis for sample classification. In this study, we introduced power normalization of the mass spectra as a method for systematically altering the weights of peaks at different intensity levels. In combination with the use of support vector machine method (SVM), the impact on the sample classification has been characterized using data in four studies previously reported, including the distinctions of anomeric configurations of sugars, types of bacteria, stages of melanoma, and the types of breast cancer. Comprehensive analysis of the data with normalization at different power normalization index (PNI) was developed and analysis tools, including error-PNI plots, reference profiles, and error source profiles, were used to assess the potential of the analytical methods as well as to find the proper approaches to classify the samples.
Affiliation(s)
- Y. Melodie Du
- Weldon School of Biomedical Engineering, Purdue University, 206 South Martin Jischke Drive, West Lafayette, Indiana 47907, United States
| | - Ye Hu
- Department of Nanomedicine, Houston Methodist Research Institute, 6565 Fannin Street, Houston, Texas 77030, United States
| | - Yu Xia
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
| | - Zheng Ouyang
- Weldon School of Biomedical Engineering, Purdue University, 206 South Martin Jischke Drive, West Lafayette, Indiana 47907, United States
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907, United States
| |
|
19
|
Accurate and efficient target prediction using a potency-sensitive influence-relevance voter. J Cheminform 2015; 7:63. [PMID: 26719774 PMCID: PMC4696267 DOI: 10.1186/s13321-015-0110-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/11/2015] [Accepted: 12/02/2015] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A number of algorithms have been proposed to predict the biological targets of diverse molecules. Some are structure-based, but the most common are ligand-based and use chemical fingerprints and the notion of chemical similarity. These methods tend to be computationally faster than others, making them particularly attractive tools as the amount of available data grows. RESULTS Using a ChEMBL-derived database covering 490,760 molecule-protein interactions and 3236 protein targets, we conduct a large-scale assessment of the performance of several target-prediction algorithms at predicting drug-target activity. We assess algorithm performance using three validation procedures: standard tenfold cross-validation, tenfold cross-validation in a simulated screen that includes random inactive molecules, and validation on an external test set composed of molecules not present in our database. CONCLUSIONS We present two improvements over current practice. First, using a modified version of the influence-relevance voter (IRV), we show that using molecule potency data can improve target prediction. Second, we demonstrate that random inactive molecules added during training can boost the accuracy of several algorithms in realistic target-prediction experiments. Our potency-sensitive version of the IRV (PS-IRV) obtains the best results on large test sets in most of the experiments. Models and software are publicly accessible through the chemoinformatics portal at http://chemdb.ics.uci.edu/.
|
20
|
Schultes S, Kooistra AJ, Vischer HF, Nijmeijer S, Haaksma EEJ, Leurs R, de Esch IJP, de Graaf C. Combinatorial Consensus Scoring for Ligand-Based Virtual Fragment Screening: A Comparative Case Study for Serotonin 5-HT(3)A, Histamine H(1), and Histamine H(4) Receptors. J Chem Inf Model 2015; 55:1030-44. [PMID: 25815783 DOI: 10.1021/ci500694c] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/18/2023]
Abstract
In the current study we have evaluated the applicability of ligand-based virtual screening (LBVS) methods for the identification of small fragment-like biologically active molecules using different similarity descriptors and different consensus scoring approaches. For this purpose, we have evaluated the performance of 14 chemical similarity descriptors in retrospective virtual screening studies to discriminate fragment-like ligands of three membrane-bound receptors from fragments that are experimentally determined to have no affinity for these proteins (true inactives). We used a complete fragment affinity data set of experimentally determined ligands and inactives for two G protein-coupled receptors (GPCRs), the histamine H1 receptor (H1R) and the histamine H4 receptor (H4R), and one ligand-gated ion channel (LGIC), the serotonin receptor (5-HT3AR), to validate our retrospective virtual screening studies. We have exhaustively tested consensus scoring strategies that combine the results of multiple actives (group fusion) or combine different similarity descriptors (similarity fusion), and for the first time systematically evaluated different combinations of group fusion and similarity fusion approaches. Our studies show that for these three case study protein targets both consensus scoring approaches can increase virtual screening enrichments compared to single chemical similarity search methods. Our cheminformatics analyses recommend using a combination of both group fusion and similarity fusion for prospective ligand-based virtual fragment screening.
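The two consensus strategies named in this abstract can be sketched as follows. This is a hedged illustration rather than the authors' protocol: the MAX rule for group fusion, the plain mean for similarity fusion (which assumes all descriptors score on a comparable 0 to 1 scale), the function names, and the toy score array are all assumptions.

```python
import numpy as np

def group_fusion(scores):
    """Fuse over reference actives with the MAX rule.

    scores: array of shape (n_actives, n_candidates).
    """
    return np.max(scores, axis=0)

def similarity_fusion(scores):
    """Fuse over descriptors by averaging their scores.

    scores: array of shape (n_descriptors, n_candidates).
    """
    return np.mean(scores, axis=0)

def combined_fusion(scores):
    """Group fusion within each descriptor, then similarity fusion.

    scores: array of shape (n_descriptors, n_actives, n_candidates).
    """
    per_descriptor = np.max(scores, axis=1)
    return np.mean(per_descriptor, axis=0)

# Hypothetical similarity scores: 2 descriptors x 2 actives x 3 candidates.
scores = np.array([
    [[0.9, 0.2, 0.4], [0.3, 0.1, 0.8]],  # descriptor A
    [[0.5, 0.3, 0.6], [0.7, 0.2, 0.4]],  # descriptor B
])
fused = combined_fusion(scores)
ranking = np.argsort(-fused)  # candidate indices, best first
print(fused, ranking)
```

When descriptors produce scores on different scales, fusing ranks instead of raw scores is a common alternative to the plain mean used here.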
Affiliation(s)
- Sabine Schultes
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Albert J Kooistra
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Henry F Vischer
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Saskia Nijmeijer
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Eric E J Haaksma
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Rob Leurs
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Iwan J P de Esch
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| | - Chris de Graaf
- †Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
| |
|
21
|
Lavecchia A. Machine-learning approaches in drug discovery: methods and applications. Drug Discov Today 2014; 20:318-31. [PMID: 25448759 DOI: 10.1016/j.drudis.2014.10.012] [Citation(s) in RCA: 353] [Impact Index Per Article: 35.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2014] [Revised: 09/27/2014] [Accepted: 10/24/2014] [Indexed: 12/19/2022]
Abstract
During the past decade, virtual screening (VS) has evolved from traditional similarity searching, which utilizes single reference compounds, into an advanced application domain for data mining and machine-learning approaches, which require large and representative training-set compounds to learn robust decision rules. The explosive growth in the amount of public domain-available chemical and biological data has generated huge effort to design, analyze, and apply novel learning methodologies. Here, I focus on machine-learning techniques within the context of ligand-based VS (LBVS). In addition, I analyze several relevant VS studies from recent publications, providing a detailed view of the current state-of-the-art in this field and highlighting not only the problematic issues, but also the successes and opportunities for further advances.
Affiliation(s)
- Antonio Lavecchia
- Department of Pharmacy, Drug Discovery Laboratory, University of Napoli 'Federico II', via D. Montesano 49, I-80131 Napoli, Italy.
| |
|
22
|
|
23
|
Riniker S, Fechner N, Landrum GA. Heterogeneous Classifier Fusion for Ligand-Based Virtual Screening: Or, How Decision Making by Committee Can Be a Good Thing. J Chem Inf Model 2013; 53:2829-36. [DOI: 10.1021/ci400466r] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/16/2023]
Affiliation(s)
- Sereina Riniker
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, CH-4056 Basel, Switzerland
| | - Nikolas Fechner
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, CH-4056 Basel, Switzerland
| | - Gregory A. Landrum
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, CH-4056 Basel, Switzerland
| |
|
24
|
Heikamp K, Bajorath J. Prediction of Compounds with Closely Related Activity Profiles Using Weighted Support Vector Machine Linear Combinations. J Chem Inf Model 2013; 53:791-801. [DOI: 10.1021/ci400090t] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/08/2023]
Affiliation(s)
- Kathrin Heikamp
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| |
|
25
|
Ding B, Wang J, Li N, Wang W. Characterization of small molecule binding. I. Accurate identification of strong inhibitors in virtual screening. J Chem Inf Model 2013; 53:114-22. [PMID: 23259763 DOI: 10.1021/ci300508m] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]
Abstract
Accurately ranking docking poses remains a great challenge in computer-aided drug design. In this study, we present an integrated approach called MIEC-SVM that combines structure modeling and statistical learning to characterize protein-ligand binding based on the complex structure generated from docking. Using the HIV-1 protease as a model system, we showed that MIEC-SVM can successfully rank the docking poses and consistently outperformed the state-of-the-art scoring functions when the true positives account for only 1% or 0.5% of all the compounds under consideration. More excitingly, we found that MIEC-SVM can achieve a significant enrichment in virtual screening even when trained on a set of known inhibitors as small as 50, especially when enhanced by a model average approach. Given these features of MIEC-SVM, we believe it provides a powerful tool for searching for and designing new drugs.
Affiliation(s)
- Bo Ding
- Department of Chemistry and Biochemistry, UCSD, La Jolla, California 92093-0359, USA
| | | | | | | |
|
26
|
Vogt M, Bajorath J. Chemoinformatics: A view of the field and current trends in method development. Bioorg Med Chem 2012; 20:5317-23. [DOI: 10.1016/j.bmc.2012.03.030] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/23/2012] [Revised: 03/09/2012] [Accepted: 03/12/2012] [Indexed: 12/18/2022]
|
27
|
Stumpfe D, Bajorath J. Applied Virtual Screening: Strategies, Recommendations, and Caveats. METHODS AND PRINCIPLES IN MEDICINAL CHEMISTRY 2011. [DOI: 10.1002/9783527633326.ch11] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/07/2023]
|
28
|
Wassermann AM, Geppert H, Bajorath J. Application of support vector machine-based ranking strategies to search for target-selective compounds. Methods Mol Biol 2011; 672:517-530. [PMID: 20838983 DOI: 10.1007/978-1-60761-839-3_21] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
Support vector machine (SVM)-based selectivity searching has recently been introduced to identify compounds in virtual screening libraries that are not only active for a target protein, but also selective for this target over a closely related member of the same protein family. In simulated virtual screening calculations, SVM-based strategies termed preference ranking and one-versus-all ranking were successfully applied to rank a database and enrich high-ranking positions with selective compounds while removing nonselective molecules from high ranks. In contrast to the original SVM approach developed for binary classification, these strategies enable learning from more than two classes, considering that distinguishing between selective, promiscuously active, and inactive compounds gives rise to a three-class prediction problem. In this chapter, we describe the extension of the one-versus-all strategy to four training classes. Furthermore, we present an adaptation of the preference ranking strategy that leads to higher recall of selective compounds than previously investigated approaches and is applicable in situations where the removal of nonselective compounds from high-ranking positions is not required.
Affiliation(s)
- Anne Mai Wassermann
- Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
| | | | | |
|
29
|
Affiliation(s)
- Nikil Wale
- Computational Sciences Center of Emphasis, Pfizer Inc., Groton, Connecticut
| |
|
30
|
Lodhi H, Muggleton S, Sternberg MJE. Multi-class Mode of Action Classification of Toxic Compounds Using Logic Based Kernel Methods. Mol Inform 2010; 29:655-64. [DOI: 10.1002/minf.201000083] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2010] [Accepted: 09/04/2010] [Indexed: 11/08/2022]
|
31
|
Agarwal S, Dugar D, Sengupta S. Ranking chemical structures for drug discovery: a new machine learning approach. J Chem Inf Model 2010; 50:716-31. [PMID: 20387860 DOI: 10.1021/ci9003865] [Citation(s) in RCA: 87] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]
Abstract
With chemical libraries increasingly containing millions of compounds or more, there is a fast-growing need for computational methods that can rank or prioritize compounds for screening. Machine learning methods have shown considerable promise for this task; indeed, classification methods such as support vector machines (SVMs), together with their variants, have been used in virtual screening to distinguish active compounds from inactive ones, while regression methods such as partial least-squares (PLS) and support vector regression (SVR) have been used in quantitative structure-activity relationship (QSAR) analysis for predicting biological activities of compounds. Recently, a new class of machine learning methods - namely, ranking methods, which are designed to directly optimize ranking performance - have been developed for ranking tasks such as web search that arise in information retrieval (IR) and other applications. Here we report the application of these new ranking methods in machine learning to the task of ranking chemical structures. Our experiments show that the new ranking methods give better ranking performance than both classification based methods in virtual screening and regression methods in QSAR analysis. We also make some interesting connections between ranking performance measures used in cheminformatics and those used in IR studies.
Affiliation(s)
- Shivani Agarwal
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA.
| | | | | |
|
32
|
Podolyan Y, Walters MA, Karypis G. Assessing Synthetic Accessibility of Chemical Compounds Using Machine Learning Methods. J Chem Inf Model 2010; 50:979-91. [DOI: 10.1021/ci900301v] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Affiliation(s)
- Yevgeniy Podolyan
- Department of Computer Science and Computer Engineering, University of Minnesota, Minneapolis, Minnesota 55455, and Institute for Therapeutics Discovery and Development, Department of Medicinal Chemistry, University of Minnesota, Minneapolis, Minnesota 55455
| | - Michael A. Walters
- Department of Computer Science and Computer Engineering, University of Minnesota, Minneapolis, Minnesota 55455, and Institute for Therapeutics Discovery and Development, Department of Medicinal Chemistry, University of Minnesota, Minneapolis, Minnesota 55455
| | - George Karypis
- Department of Computer Science and Computer Engineering, University of Minnesota, Minneapolis, Minnesota 55455, and Institute for Therapeutics Discovery and Development, Department of Medicinal Chemistry, University of Minnesota, Minneapolis, Minnesota 55455
| |
|
33
|
Geppert H, Vogt M, Bajorath J. Current trends in ligand-based virtual screening: molecular representations, data mining methods, new application areas, and performance evaluation. J Chem Inf Model 2010; 50:205-16. [PMID: 20088575 DOI: 10.1021/ci900419k] [Citation(s) in RCA: 231] [Impact Index Per Article: 16.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Affiliation(s)
- Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
| | | | | |
|
34
|
|
35
|
|
36
|
Abstract
This chapter reviews the use of molecular fingerprints for chemical similarity searching. The fingerprints encode the presence of 2D substructural fragments in a molecule, and the similarity between a pair of molecules is a function of the number of fragments that they have in common. Although this provides a very simple way of estimating the degree of structural similarity between two molecules, it has been found to provide an effective and an efficient tool for searching large chemical databases. The review describes the historical development of similarity searching since it was first described in the mid-1980s, reviews the many different coefficients, representations, and weightings that can be combined to form a similarity measure, describes quantitative measures of the effectiveness of similarity searching, and concludes by looking at current developments based on the use of data fusion and machine learning techniques.
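As a concrete illustration of fingerprint-based similarity searching, the sketch below uses the Tanimoto coefficient, one widely used member of the family of coefficients the chapter surveys. Fingerprints are represented as sets of "on" bit positions; the function names, molecule names, and bit sets are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / union

def similarity_search(query, database, top_k=3):
    """Rank database molecules by decreasing similarity to the query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]

# Hypothetical fingerprints given as sets of substructure-bit indices.
query = {1, 4, 7, 9}
database = {
    "mol_a": {1, 4, 7, 9, 12},
    "mol_b": {2, 3, 5},
    "mol_c": {1, 4, 20},
}
hits = similarity_search(query, database)
print(hits)
```

Swapping `tanimoto` for another coefficient, or weighting individual bits, changes the similarity measure without touching the search loop.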
Affiliation(s)
- Peter Willett
- Department of Information Studies, The University of Sheffield, Sheffield, UK
| |
|
37
|
Wale N, Karypis G. Target fishing for chemical compounds using target-ligand activity data and ranking based methods. J Chem Inf Model 2009; 49:2190-201. [PMID: 19764745 DOI: 10.1021/ci9000376] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
In recent years, the development of computational techniques that identify all the likely targets for a given chemical compound, a problem also termed target fishing, has been an active area of research. Identification of likely targets of a chemical compound in the early stages of drug discovery helps to understand issues such as selectivity, off-target pharmacology, and toxicity. In this paper, we present a set of techniques whose goal is to rank or prioritize targets in the context of a given chemical compound so that most targets against which this compound may show activity appear higher in the ranked list. These methods are based on our extensions to the SVM and ranking perceptron algorithms for this problem. Our extensive experimental study shows that the methods developed in this work outperform previous approaches by 2% to 60% under different evaluation criteria.
Affiliation(s)
- Nikil Wale
- Department of Computer Science, University of Minnesota, Twin Cities, Minnesota 55455, USA.
| | | |
|
38
|
Kinnings SL, Jackson RM. LigMatch: a multiple structure-based ligand matching method for 3D virtual screening. J Chem Inf Model 2009; 49:2056-66. [PMID: 19685924 DOI: 10.1021/ci900204y] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
We have developed a new virtual screening (VS) method called LigMatch and evaluated its performance on 13 protein targets using a filtered and clustered version of the directory of useful decoys (DUD). The method uses 3D structural comparison to a crystallographically determined ligand in a bioactive 'template' conformation, using a geometric hashing method, in order to prioritize each database compound. We show that LigMatch outperforms several other widely used VS methods on the 13 DUD targets. We go on to demonstrate that improved VS performance can be gained from using multiple, structurally diverse templates rather than a single template ligand for a particular protein target. In this case, a 2D fingerprint-based method is used to select a ligand template from a set of known bioactive conformations. Furthermore, we show that LigMatch performs well even in the absence of 2D similarity to the template ligands, thereby demonstrating its robustness with respect to purely 2D methods and its potential for scaffold hopping.
Affiliation(s)
- Sarah L Kinnings
- Institute of Molecular and Cellular Biology and Astbury Centre for Structural Molecular Biology, Faculty of Biological Sciences, University of Leeds, Leeds LS2 9JT, United Kingdom
| | | |
|
39
|
Liao Q, Wang J, Webster Y, Watson IA. GPU Accelerated Support Vector Machines for Mining High-Throughput Screening Data. J Chem Inf Model 2009; 49:2718-25. [DOI: 10.1021/ci900337f] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
Affiliation(s)
- Quan Liao
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
| | - Jibo Wang
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
| | - Yue Webster
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
| | - Ian A. Watson
- ChemExplorer Co. Ltd., 965 Halei Road, Shanghai 201203, People’s Republic of China, and Lilly Research Laboratories, Eli Lilly and Company, Lilly Corporate Center, Indianapolis, Indiana 46285
| |
|
40
|
Wassermann AM, Geppert H, Bajorath J. Ligand Prediction for Orphan Targets Using Support Vector Machines and Various Target-Ligand Kernels Is Dominated by Nearest Neighbor Effects. J Chem Inf Model 2009; 49:2155-67. [DOI: 10.1021/ci9002624] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Anne Mai Wassermann
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstrasse 2, D-53113 Bonn, Germany
| | - Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstrasse 2, D-53113 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstrasse 2, D-53113 Bonn, Germany
| |
|
41
|
Geppert H, Humrich J, Stumpfe D, Gärtner T, Bajorath J. Ligand prediction from protein sequence and small molecule information using support vector machines and fingerprint descriptors. J Chem Inf Model 2009; 49:767-79. [PMID: 19309114 DOI: 10.1021/ci900004a] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]
Abstract
Support vector machine (SVM) database search strategies are presented that aim at the identification of small molecule ligands for targets for which no ligand information is currently available. In pharmaceutical research and chemical biology, this situation is faced, for example, when studying orphan targets or newly identified members of protein families. To investigate methods for de novo ligand identification in the absence of known three-dimensional target structures or active molecules, we have focused on combining sequence and ligand information for closely and distantly related proteins. To provide a basis for these investigations, a set of 11 protease targets from different families was assembled together with more than 2000 inhibitors directed against individual proteases. We have compared SVM approaches that combine protein sequence and ligand information in different ways and utilize 2D fingerprints as ligand descriptors. These methodologies were applied to search for inhibitors of individual proteases not taken into account during learning. A target sequence-ligand kernel and, in particular, a linear combination of multiple target-directed SVMs consistently identified inhibitors with high accuracy including test cases where homology-based similarity searching using data fusion and conventional SVM ranking nearly or completely failed. The SVM linear combination and target-ligand kernel methods described herein are intuitive and straightforward to adopt for ligand prediction against other targets.
Affiliation(s)
- Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany
| | | | | | | | | |
|
42
|
Wassermann AM, Geppert H, Bajorath J. Searching for target-selective compounds using different combinations of multiclass support vector machine ranking methods, kernel functions, and fingerprint descriptors. J Chem Inf Model 2009; 49:582-92. [PMID: 19249858 DOI: 10.1021/ci800441c] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The identification of small chemical compounds that are selective for a target protein over one or more closely related members of the same family is of high relevance for applications in chemical biology. Conventional 2D similarity searching using known selective molecules as templates has recently been found to preferentially detect selective over non-selective and inactive database compounds. To improve the initially observed search performance, we have attempted to use 2D fingerprints as descriptors for support vector machine (SVM)-based selectivity searching. Different from typically applied binary SVM compound classification, SVM analysis has been adapted here for multiclass predictions and compound ranking to distinguish between selective, active but non-selective, and inactive compounds. In systematic database search calculations, we tested combinations of four alternative SVM ranking schemes, four different kernel functions, and four fingerprints and were able to further improve selectivity search performance by effectively removing non-selective molecules from high ranking positions while retaining high recall of selective compounds.
Affiliation(s)
- Anne Mai Wassermann
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| | | | | |
|
43
|
Abdo A, Salim N. Similarity-Based Virtual Screening Using Bayesian Inference Network: Enhanced Search Using 2D Fingerprints and Multiple Reference Structures. QSAR Comb Sci 2009. [DOI: 10.1002/qsar.200860155] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
|
44
|
Swamidass SJ, Azencott CA, Lin TW, Gramajo H, Tsai S, Baldi P. Influence relevance voting: an accurate and interpretable virtual high throughput screening method. J Chem Inf Model 2009; 49:756-66. [PMID: 19391629 PMCID: PMC2750043 DOI: 10.1021/ci8004379] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
Given activity training data from high-throughput screening (HTS) experiments, virtual high-throughput screening (vHTS) methods aim to predict in silico the activity of untested chemicals. We present a novel method, the Influence Relevance Voter (IRV), specifically tailored for the vHTS task. The IRV is a low-parameter neural network which refines a k-nearest neighbor classifier by nonlinearly combining the influences of a chemical's neighbors in the training set. Influences are decomposed, also nonlinearly, into a relevance component and a vote component. The IRV is benchmarked using the data and rules of two large, open competitions, and its performance is compared to that of other participating methods, as well as of an in-house support vector machine (SVM) method. On these benchmark data sets, IRV achieves state-of-the-art results, comparable to the SVM in one case, and significantly better than the SVM in the other, retrieving three times as many actives in the top 1% of its prediction-sorted list. The IRV presents several other important advantages over SVMs and other methods: (1) the output predictions have a probabilistic semantic; (2) the underlying inferences are interpretable; (3) the training time is very short, on the order of minutes even for very large data sets; (4) the risk of overfitting is minimal, due to the small number of free parameters; and (5) additional information can easily be incorporated into the IRV architecture. Combined with its performance, these qualities make the IRV particularly well suited for vHTS.
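The decomposition of each neighbor's influence into a relevance term and a vote term can be sketched loosely as below. This is not the published IRV: the functional forms (a tanh relevance on similarity and rank, a class-dependent constant vote, a logistic output), the function name `irv_like_score`, and every weight value are assumptions chosen for illustration, and the weights are fixed by hand rather than learned.

```python
import math

def irv_like_score(neighbors, w_sim=4.0, w_rank=-0.2, w_bias=0.0,
                   vote_active=1.0, vote_inactive=-1.0, out_bias=0.0):
    """Score a query from its nearest training neighbors.

    neighbors: list of (similarity, is_active) pairs sorted by
    decreasing similarity. All weights are hypothetical; a real model
    would fit them to training data.
    """
    total = out_bias
    for rank, (sim, is_active) in enumerate(neighbors, start=1):
        # Relevance: how much this neighbor should count, based on its
        # similarity to the query and its position in the neighbor list.
        relevance = math.tanh(w_sim * sim + w_rank * rank + w_bias)
        # Vote: the direction of the neighbor's influence, set by its class.
        vote = vote_active if is_active else vote_inactive
        total += relevance * vote
    # Logistic output, so the score can be read as a probability-like value.
    return 1.0 / (1.0 + math.exp(-total))

# Hypothetical neighbor lists for two query molecules.
mostly_active = [(0.9, True), (0.8, True), (0.5, False)]
mostly_inactive = [(0.9, False), (0.7, False), (0.6, True)]
print(irv_like_score(mostly_active), irv_like_score(mostly_inactive))
```

Because each neighbor's contribution is an explicit product of relevance and vote, the individual terms can be inspected to see which training molecules drove a prediction, which is the kind of interpretability the abstract highlights.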
|
45
|
Nisius B, Göller AH, Bajorath J. Combining Cluster Analysis, Feature Selection and Multiple Support Vector Machine Models for the Identification of Human Ether-a-go-go Related Gene Channel Blocking Compounds. Chem Biol Drug Des 2009; 73:17-25. [DOI: 10.1111/j.1747-0285.2008.00747.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
46
|
Lounkine E, Bajorath J. Core Trees and Consensus Fragment Sequences for Molecular Representation and Similarity Analysis. J Chem Inf Model 2008; 48:1161-6. [DOI: 10.1021/ci800020s] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Affiliation(s)
- Eugen Lounkine
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
| |
|