1
Jobe A, Vijayan R. Orphan G protein-coupled receptors: the ongoing search for a home. Front Pharmacol 2024; 15:1349097. [PMID: 38495099 PMCID: PMC10941346 DOI: 10.3389/fphar.2024.1349097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Accepted: 02/15/2024] [Indexed: 03/19/2024] Open
Abstract
G protein-coupled receptors (GPCRs) make up the largest receptor superfamily, accounting for 4% of protein-coding genes. Despite the prevalence of such transmembrane receptors, a significant number remain orphans, lacking identified endogenous ligands. Since their conception, the reverse pharmacology approach has been used to characterize such receptors. However, the multifaceted and nuanced nature of GPCR signaling poses a great challenge to their pharmacological elucidation. Considering their therapeutic relevance, the search for native orphan GPCR ligands continues. Despite limited structural input in terms of 3D crystallized structures, with advances in machine-learning approaches, there has been great progress with respect to accurate ligand prediction. Though such an approach proves valuable given that ligand scarcity is the greatest hurdle to orphan GPCR deorphanization, the future pairings of the remaining orphan GPCRs may not necessarily take a one-size-fits-all approach but should be more comprehensive in accounting for numerous nuanced possibilities to cover the full spectrum of GPCR signaling.
Affiliation(s)
- Amie Jobe
  - Department of Biology, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates
- Ranjit Vijayan
  - Department of Biology, College of Science, United Arab Emirates University, Al Ain, United Arab Emirates
  - The Big Data Analytics Center, United Arab Emirates University, Al Ain, United Arab Emirates
  - Zayed Bin Sultan Center for Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
2
Satake H. Kobayashi Award 2021: Neuropeptides, receptors, and follicle development in the ascidian, Ciona intestinalis Type A: New clues to the evolution of chordate neuropeptidergic systems from biological niches. Gen Comp Endocrinol 2023; 337:114262. [PMID: 36925021 DOI: 10.1016/j.ygcen.2023.114262] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/26/2022] [Revised: 03/09/2023] [Accepted: 03/11/2023] [Indexed: 03/17/2023]
Abstract
Ciona intestinalis Type A (Ciona robusta) is a cosmopolitan species belonging to the phylum Urochordata, invertebrate chordates that are phylogenetically the most closely related to the vertebrates. Therefore, this species is of interest for investigation of the evolution and comparative physiology of endocrine, neuroendocrine, and nervous systems in chordates. Our group has identified >30 Ciona neuropeptides (80% of all identified ascidian neuropeptides) primarily using peptidomic approaches combined with reference to genome sequences. These neuropeptides are classified into two groups: homologs or prototypes of vertebrate neuropeptides and novel (Ciona-specific) neuropeptides. We have also identified the cognate receptors for these peptides. In particular, we elucidated multiple receptors for Ciona-specific neuropeptides by a combination of a novel machine learning system and experimental validation of the specific interaction of the predicted neuropeptide-receptor pairs, and verified unprecedented phylogenies of receptors for neuropeptides. Moreover, several neuropeptides were found to play major roles in the regulation of ovarian follicle development. Ciona tachykinin facilitates the growth of vitellogenic follicles via up-regulation of the enzymatic activities of proteases. Ciona vasopressin stimulates oocyte maturation and ovulation via up-regulation of maturation-promoting factor- and matrix metalloproteinase-directed collagen degradation, respectively. Ciona cholecystokinin also triggers ovulation via up-regulation of receptor tyrosine kinase signaling and the subsequent activation of matrix metalloproteinase. These studies revealed that the neuropeptidergic system plays major roles in ovarian follicle growth, maturation, and ovulation in Ciona, thus paving the way for investigation of the biological roles for neuropeptides in the endocrine, neuroendocrine, nervous systems of Ciona, and studies of the evolutionary processes of various neuropeptidergic systems in chordates.
Affiliation(s)
- Honoo Satake
  - Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan.
3
Satake H, Osugi T, Shiraishi A. Impact of Machine Learning-Associated Research Strategies on the Identification of Peptide-Receptor Interactions in the Post-Omics Era. Neuroendocrinology 2023; 113:251-261. [PMID: 34348315 DOI: 10.1159/000518572] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/23/2021] [Accepted: 07/19/2021] [Indexed: 11/19/2022]
Abstract
BACKGROUND: Elucidation of peptide-receptor pairs is a prerequisite for many studies in the neuroendocrine, endocrine, and neuroscience fields. Recent omics analyses have provided vast amounts of peptide and G protein-coupled receptor (GPCR) sequence data. GPCRs for homologous peptides are easily characterized based on homology searching, and the relevant peptide-GPCR interactions are also detected by typical signaling assays. In contrast, conventional evaluation or prediction methods, including high-throughput reverse-pharmacological assays and tertiary structure-based computational analyses, are not useful for identifying interactions between novel and omics-derived peptides and GPCRs. SUMMARY: Recently, an approach combining machine learning-based prediction of novel peptide-GPCR pairs and experimental validation of the predicted pairs has been shown to break through this bottleneck. A machine learning method, logistic regression, applied to human class A GPCRs, together with multiple subsequent signaling assays, led to the deorphanization of human class A orphan GPCRs, namely, the identification of 18 peptide-GPCR pairs. Furthermore, using another machine learning algorithm, the support vector machine (SVM), the peptide descriptor-incorporated SVM was originally developed and employed to predict GPCRs for novel peptides characterized from the closest relative of vertebrates, Ciona intestinalis Type A (Ciona robusta). Experimental validation of the predicted pairs eventually led to the identification of 11 novel peptide-GPCR pairs. Of particular interest is that these newly identified GPCRs displayed neither significant sequence similarity nor molecular phylogenetic relatedness to known GPCRs for peptides. KEY MESSAGES: These recent studies highlight the usefulness and versatility of machine learning for enabling the efficient, reliable, and systematic identification of novel peptide-GPCR interactions.
Affiliation(s)
- Honoo Satake
  - Division of Integrative Biomolecular Function, Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Tomohiro Osugi
  - Division of Integrative Biomolecular Function, Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
- Akira Shiraishi
  - Division of Integrative Biomolecular Function, Bioorganic Research Institute, Suntory Foundation for Life Sciences, Kyoto, Japan
4
Ye Q, Hsieh CY, Yang Z, Kang Y, Chen J, Cao D, He S, Hou T. A unified drug-target interaction prediction framework based on knowledge graph and recommendation system. Nat Commun 2021; 12:6775. [PMID: 34811351 PMCID: PMC8635420 DOI: 10.1038/s41467-021-27137-3] [Citation(s) in RCA: 67] [Impact Index Per Article: 22.3] [Received: 06/29/2021] [Accepted: 11/05/2021] [Indexed: 02/06/2023] Open
Abstract
Prediction of drug-target interactions (DTI) plays a vital role in drug development in various areas, such as virtual screening, drug repurposing and identification of potential drug side effects. Although extensive efforts have been invested in perfecting DTI prediction, existing methods still suffer from the high sparsity of DTI datasets and the cold start problem. Here, we develop KGE_NFM, a unified framework for DTI prediction by combining knowledge graph (KG) and recommendation system. This framework firstly learns a low-dimensional representation for various entities in the KG, and then integrates the multimodal information via neural factorization machine (NFM). KGE_NFM is evaluated under three realistic scenarios, and achieves accurate and robust predictions on four benchmark datasets, especially in the scenario of the cold start for proteins. Our results indicate that KGE_NFM provides valuable insight to integrate KG and recommendation system-based techniques into a unified framework for novel DTI discovery.
Affiliation(s)
- Qing Ye
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang China
  - College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027 Zhejiang China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058 China
- Chang-Yu Hsieh
  - Tencent Quantum Laboratory, Shenzhen, 518057 Guangdong China
- Ziyi Yang
  - Tencent Quantum Laboratory, Shenzhen, 518057 Guangdong China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang China
- Jiming Chen
  - College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027 Zhejiang China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China.
- Shibo He
  - College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China.
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China.
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, China.
5
Abstract
Neuropeptides play pivotal roles in various biological events in the nervous, neuroendocrine, and endocrine systems, and are correlated with both physiological functions and unique behavioral traits of animals. Elucidation of functional interaction between neuropeptides and receptors is a crucial step for the verification of their biological roles and evolutionary processes. However, most receptors for novel peptides remain to be identified. Here, we show the identification of multiple G protein-coupled receptors (GPCRs) for species-specific neuropeptides of the vertebrate sister group, Ciona intestinalis Type A, by combining machine learning and experimental validation. We developed an original peptide descriptor-incorporated support vector machine and used it to predict 22 neuropeptide-GPCR pairs. Of note, signaling assays of the predicted pairs identified 1 homologous and 11 Ciona-specific neuropeptide-GPCR pairs for a 41% hit rate: the respective GPCRs for Ci-GALP, Ci-NTLP-2, Ci-LF-1, Ci-LF-2, Ci-LF-5, Ci-LF-6, Ci-LF-7, Ci-LF-8, Ci-YFV-1, and Ci-YFV-3. Interestingly, molecular phylogenetic tree analysis revealed that these receptors, excluding the Ci-GALP receptor, were evolutionarily unrelated to any other known peptide GPCRs, confirming that these GPCRs constitute unprecedented neuropeptide receptor clusters. Altogether, these results verified the neuropeptide-GPCR pairs in the protochordate and evolutionary lineages of neuropeptide GPCRs, and pave the way for investigating the endogenous roles of novel neuropeptides in the closest relatives of vertebrates and the evolutionary processes of neuropeptidergic systems throughout chordates. In addition, the present study also indicates the versatility of the machine-learning-assisted strategy for the identification of novel peptide-receptor pairs in various organisms.
6
Vass M, Kooistra AJ, Verhoeven S, Gloriam D, de Esch IJP, de Graaf C. A Structural Framework for GPCR Chemogenomics: What's In a Residue Number? Methods Mol Biol 2018; 1705:73-113. [PMID: 29188559 DOI: 10.1007/978-1-4939-7465-8_4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/02/2023]
Abstract
The recent surge of crystal structures of G protein-coupled receptors (GPCRs), as well as comprehensive collections of sequence, structural, ligand bioactivity, and mutation data, has enabled the development of integrated chemogenomics workflows for this important target family. This chapter will focus on cross-family and cross-class studies of GPCRs that have pinpointed the need for, and the implementation of, a generic numbering scheme for referring to specific structural elements of GPCRs. Sequence- and structure-based numbering schemes for different receptor classes will be introduced and the remaining caveats will be discussed. The use of these numbering schemes has facilitated many chemogenomics studies such as consensus binding site definition, binding site comparison, ligand repurposing (e.g. for orphan receptors), sequence-based pharmacophore generation for homology modeling or virtual screening, and class-wide chemogenomics studies of GPCRs.
Affiliation(s)
- Márton Vass
  - Department of Medicinal Chemistry, Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HV, Amsterdam, The Netherlands
- Albert J Kooistra
  - Department of Medicinal Chemistry, Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HV, Amsterdam, The Netherlands
  - Centre for Molecular and Biomolecular Informatics (CMBI), Radboud University Medical Center, 6525 GA, Nijmegen, The Netherlands
- Stefan Verhoeven
  - Netherlands eScience Center, 1098 XG, Amsterdam, The Netherlands
- David Gloriam
  - Department of Drug Design and Pharmacology, University of Copenhagen, 2100, Copenhagen, Denmark
- Iwan J P de Esch
  - Department of Medicinal Chemistry, Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HV, Amsterdam, The Netherlands
- Chris de Graaf
  - Department of Medicinal Chemistry, Amsterdam Institute for Molecules Medicines and Systems, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HV, Amsterdam, The Netherlands.
7
Abstract
High-throughput and high-content screening campaigns have resulted in the creation of large chemogenomic matrices. These matrices form the training data which is used to build ligand-target interaction models for pharmacological and chemical biology research. While academic, government, and industrial efforts continuously add to the ligand-target data pairs available for modeling, major research efforts are devoted to improving machine learning techniques to cope with the sparseness, heterogeneity, and size of available datasets as well as inherent noise and bias. This "race of arms" has led to the creation of algorithms to generate highly complex models with high prediction performance at the cost of training efficiency as well as interpretability. In contrast, recent studies have challenged the necessity for "big data" in chemogenomic modeling and found that models built on larger numbers of examples do not necessarily result in better predictive abilities. Automated adaptive selection of the training data (ligand-target instances) used for model creation can result in considerably smaller training sets that retain prediction performance on par with training using hundreds of thousands of data points. In this chapter, we describe the protocols used for one such iterative chemogenomic selection technique, including model construction and update as well as possible techniques for evaluations of constructed models and analysis of the iterative model construction.
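The idea of adaptively growing a small training set can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the protocol from the chapter: the data are invented, a nearest-centroid classifier stands in for whatever model the authors use, and "pick the instance with the smallest margin between class centroids" stands in for their selection criterion. The function names (`fit`, `margin`, `select_iteratively`) are hypothetical.

```python
def centroid(points):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(points) for col in zip(*points)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit(train):
    """Build a nearest-centroid model: one centroid per class label."""
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}

def predict(model, vec):
    """Return the label whose centroid is closest to vec."""
    return min(model, key=lambda label: sq_dist(model[label], vec))

def margin(model, vec):
    """Gap between the two closest centroids; a small gap means high uncertainty."""
    d = sorted(sq_dist(c, vec) for c in model.values())
    return d[1] - d[0]

def select_iteratively(pool, seed_indices, rounds):
    """Grow the training set by one most-uncertain pool instance per round."""
    chosen = list(seed_indices)  # the seed must contain at least two classes
    for _ in range(rounds):
        model = fit([pool[i] for i in chosen])
        remaining = [i for i in range(len(pool)) if i not in chosen]
        if not remaining:
            break
        chosen.append(min(remaining, key=lambda i: margin(model, pool[i][0])))
    return chosen, fit([pool[i] for i in chosen])

# Invented ligand-target instances: (descriptor vector, 1 = active / 0 = inactive).
pool = [
    ([0.0, 0.1], 0), ([0.1, 0.0], 0), ([0.2, 0.2], 0), ([0.4, 0.4], 0),
    ([0.6, 0.6], 1), ([0.8, 0.9], 1), ([0.9, 0.8], 1), ([1.0, 1.0], 1),
]

chosen, model = select_iteratively(pool, seed_indices=[0, 7], rounds=2)
accuracy = sum(predict(model, vec) == label for vec, label in pool) / len(pool)
print(chosen, accuracy)
```

In this toy pool the two added instances are the ones closest to the class boundary, and a model trained on four of the eight instances already classifies the whole pool correctly. This only mirrors the abstract's qualitative point and says nothing about real chemogenomic data.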
Affiliation(s)
- Daniel Reker
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA.
- J B Brown
  - Life Science Informatics Research Unit, Laboratory of Molecular Biosciences, Kyoto University Graduate School of Medicine, Kyoto, Japan
8
Hounsou C, Baehr C, Gasparik V, Alili D, Belhocine A, Rodriguez T, Dupuis E, Roux T, Mann A, Heissler D, Pin JP, Durroux T, Bonnet D, Hibert M. From the Promiscuous Asenapine to Potent Fluorescent Ligands Acting at a Series of Aminergic G-Protein-Coupled Receptors. J Med Chem 2017; 61:174-188. [PMID: 29219316 DOI: 10.1021/acs.jmedchem.7b01220] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
Monoamine neurotransmitters such as serotonin, dopamine, histamine, and noradrenaline have important and varied physiological functions and similar chemical structures. Representing important pharmaceutical drug targets, the corresponding G-protein-coupled receptors (termed aminergic GPCRs) belong to the class of cell membrane receptors and share many levels of similarity as well. Given their pharmacological and structural closeness, one could hypothesize the possibility to derivatize a ubiquitous ligand to afford rapidly fluorescent probes for a large set of GPCRs to be used for instance in FRET-based binding assays. Here we report fluorescent derivatives of the nonselective agent asenapine which were designed, synthesized, and evaluated as ligands of 34 serotonin, dopamine, histamine, melatonin, acetylcholine, and adrenergic receptors. It appears that this strategy led rapidly to the discovery and development of nanomolar affinity fluorescent probes for 14 aminergic GPCRs. Selected probes were tested in competition binding assays with unlabeled competitors in order to demonstrate their suitability for drug discovery purposes.
Affiliation(s)
- Candide Hounsou
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Corinne Baehr
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
- Vincent Gasparik
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
- Doria Alili
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Abderazak Belhocine
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Thiéric Rodriguez
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Elodie Dupuis
  - Cisbio Bioassays, Parc Marcel Boiteux, BP84175, 30200 Codolet, France
- Thomas Roux
  - Cisbio Bioassays, Parc Marcel Boiteux, BP84175, 30200 Codolet, France
- André Mann
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
- Denis Heissler
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
  - LabEx Medalis, Université de Strasbourg, 67000 Strasbourg, France
- Jean-Philippe Pin
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Thierry Durroux
  - Institut de Génomique Fonctionnelle, CNRS UMR5203, INSERM U661, Université de Montpellier (IFR3), 141 Rue de la Cardonille, F-34094 Montpellier Cedex 5, France
- Dominique Bonnet
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
  - LabEx Medalis, Université de Strasbourg, 67000 Strasbourg, France
- Marcel Hibert
  - Laboratoire d'Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS, Université de Strasbourg, 74 Route du Rhin, 67412 Illkirch, France
  - LabEx Medalis, Université de Strasbourg, 67000 Strasbourg, France
9
Liu J, Ning X. Differential Compound Prioritization via Bidirectional Selectivity Push with Power. J Chem Inf Model 2017; 57:2958-2975. [PMID: 29178784 DOI: 10.1021/acs.jcim.7b00552] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Affiliation(s)
- Junfeng Liu
  - Indiana University - Purdue University Indianapolis, 723 West Michigan Street, SL 280, Indianapolis, Indiana 46202, United States
- Xia Ning
  - Indiana University - Purdue University Indianapolis, 723 West Michigan Street, SL 280, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th Street, HITS 5000, Indianapolis, Indiana 46202, United States
10
Sorgenfrei FA, Fulle S, Merget B. Kinome-Wide Profiling Prediction of Small Molecules. ChemMedChem 2017; 13:495-499. [PMID: 28544552 DOI: 10.1002/cmdc.201700180] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/23/2017] [Revised: 05/20/2017] [Indexed: 12/21/2022]
Abstract
Extensive kinase profiling data, covering more than half of the human kinome, are available nowadays and allow the construction of activity prediction models of high practical utility. Proteochemometric (PCM) approaches use compound and protein descriptors, which enables the extrapolation of bioactivity values to thus far unexplored kinases. In this study, the potential of PCM to make large-scale predictions on the entire kinome is explored, considering the applicability on novel compounds and kinases, including clinically relevant mutants. A rigorous validation indicates high predictive power on left-out kinases and superiority over individual kinase QSAR models for new compounds. Furthermore, external validation on clinically relevant mutant kinases reveals an excellent predictive power for mutations spread across the ATP binding site.
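The core proteochemometric idea, describing each compound-protein pair by both compound and protein descriptors so that a model can extrapolate to a protein with no measurements, can be shown with a toy sketch. All descriptors and activity values below are invented, and the nearest-neighbour predictor is an assumption chosen for brevity; it is not the model used in the study. The names `pcm_features` and `predict_activity` are hypothetical.

```python
from math import dist

def pcm_features(compound_desc, protein_desc):
    """Concatenate compound and protein descriptors into one pair vector."""
    return list(compound_desc) + list(protein_desc)

def predict_activity(train_pairs, query, k=1):
    """Average the activities of the k training pairs closest to the query."""
    nearest = sorted(train_pairs, key=lambda item: dist(item[0], query))[:k]
    return sum(activity for _, activity in nearest) / len(nearest)

# Invented two-dimensional descriptors for compounds and kinases.
compounds = {"cpd_a": [1.0, 0.0], "cpd_b": [0.0, 1.0]}
kinases = {"kin_1": [0.9, 0.1], "kin_2": [0.1, 0.9], "kin_new": [0.8, 0.2]}

# Invented activity values; note that "kin_new" has no measurements at all.
measured = {
    ("cpd_a", "kin_1"): 8.0,
    ("cpd_b", "kin_1"): 5.0,
    ("cpd_a", "kin_2"): 5.5,
    ("cpd_b", "kin_2"): 7.5,
}

train = [
    (pcm_features(compounds[c], kinases[t]), y)
    for (c, t), y in measured.items()
]

# Query a known compound against the unmeasured kinase.
query = pcm_features(compounds["cpd_a"], kinases["kin_new"])
prediction = predict_activity(train, query, k=1)
print(prediction)
```

Because the descriptor of `kin_new` resembles that of `kin_1`, the prediction borrows the activity measured for `cpd_a` on `kin_1`. A per-kinase QSAR model, by contrast, would have no training data for `kin_new`.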
Affiliation(s)
- Frieda A Sorgenfrei
  - BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
- Simone Fulle
  - BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
- Benjamin Merget
  - BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120, Heidelberg, Germany
11
Liu J, Ning X. Multi-Assay-Based Compound Prioritization via Assistance Utilization: A Machine Learning Framework. J Chem Inf Model 2017; 57:484-498. [PMID: 28234477 DOI: 10.1021/acs.jcim.6b00737] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]
Affiliation(s)
- Junfeng Liu
  - Indiana University-Purdue University, Indianapolis, 723 West Michigan St., SL 280, Indianapolis, Indiana 46202, United States
- Xia Ning
  - Indiana University-Purdue University, Indianapolis, 723 West Michigan St., SL 280, Indianapolis, Indiana 46202, United States
  - Center for Computational Biology and Bioinformatics, Indiana University School of Medicine, 410 West 10th St., HITS 5000, Indianapolis, Indiana 46202, United States
12
Shaikh N, Sharma M, Garg P. An improved approach for predicting drug-target interaction: proteochemometrics to molecular docking. MOLECULAR BIOSYSTEMS 2016; 12:1006-14. [PMID: 26822863 DOI: 10.1039/c5mb00650c] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Indexed: 01/29/2023]
Abstract
Proteochemometric (PCM) methods, which use descriptors of both the interacting species, i.e. drug and the target, are being successfully employed for the prediction of drug-target interactions (DTI). However, unavailability of non-interacting dataset and determining the applicability domain (AD) of model are a main concern in PCM modeling. In the present study, traditional PCM modeling was improved by devising novel methodologies for reliable negative dataset generation and fingerprint based AD analysis. In addition, various types of descriptors and classifiers were evaluated for their performance. The Random Forest and Support Vector Machine models outperformed the other classifiers (accuracies >98% and >89% for 10-fold cross validation and external validation, respectively). The type of protein descriptors had negligible effect on the developed models, encouraging the use of sequence-based descriptors over the structure-based descriptors. To establish the practical utility of built models, targets were predicted for approved anticancer drugs of natural origin. The molecular recognition interactions between the predicted drug-target pair were quantified with the help of a reverse molecular docking approach. The majority of predicted targets are known for anticancer therapy. These results thus correlate well with anticancer potential of the selected drugs. Interestingly, out of all predicted DTIs, thirty were found to be reported in the ChEMBL database, further validating the adopted methodology. The outcome of this study suggests that the proposed approach, involving use of the improved PCM methodology and molecular docking, can be successfully employed to elucidate the intricate mode of action for drug molecules as well as repositioning them for new therapeutic applications.
Affiliation(s)
- Naeem Shaikh
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), S. A. S. Nagar, Punjab 160062, India.
- Mahesh Sharma
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), S. A. S. Nagar, Punjab 160062, India.
- Prabha Garg
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), S. A. S. Nagar, Punjab 160062, India.
13
Abstract
INTRODUCTION: Over the past three decades, the predominant paradigm in drug discovery was designing selective ligands for a specific target to avoid unwanted side effects. However, in the last 5 years, the aim has shifted to take into account the biological network in which they interact. Quantitative and Systems Pharmacology (QSP) is a new paradigm that aims to understand how drugs modulate cellular networks in space and time, in order to predict drug targets and their role in human pathophysiology. AREAS COVERED: This review discusses existing computational and experimental QSP approaches such as polypharmacology techniques combined with systems biology information and considers the use of new tools and ideas in a wider 'systems-level' context in order to design new drugs with improved efficacy and fewer unwanted off-target effects. EXPERT OPINION: The use of network biology produces valuable information such as new indications for approved drugs, drug-drug interactions, proteins-drug side effects and pathways-gene associations. However, we are still far from the aim of QSP, both because of the huge effort needed to model precisely biological network models and the limited accuracy that we are able to reach with those. Hence, moving from 'one molecule for one target to give one therapeutic effect' to the 'big systems-based picture' seems obvious moving forward although whether our current tools are sufficient for such a step is still under debate.
Affiliation(s)
- Violeta I Pérez-Nueno
  - Harmonic Pharma, Espace Transfert, 615 rue du Jardin Botanique, 54600 Villers lès Nancy, France; +33 354 958 604; +33 383 593 046
14
Chan WKB, Zhang H, Yang J, Brender JR, Hur J, Özgür A, Zhang Y. GLASS: a comprehensive database for experimentally validated GPCR-ligand associations. Bioinformatics 2015; 31:3035-42. [PMID: 25971743 DOI: 10.1093/bioinformatics/btv302] [Citation(s) in RCA: 67] [Impact Index Per Article: 7.4] [Received: 03/21/2015] [Accepted: 05/07/2015] [Indexed: 12/17/2022] Open
Abstract
MOTIVATION: G protein-coupled receptors (GPCRs) are probably the most attractive drug target membrane proteins, which constitute nearly half of drug targets in the contemporary drug discovery industry. While the majority of drug discovery studies employ existing GPCR and ligand interactions to identify new compounds, there remains a shortage of specific databases with precisely annotated GPCR-ligand associations. RESULTS: We have developed a new database, GLASS, which aims to provide a comprehensive, manually curated resource for experimentally validated GPCR-ligand associations. A new text-mining algorithm was proposed to collect GPCR-ligand interactions from the biomedical literature, which is then crosschecked with five primary pharmacological datasets, to enhance the coverage and accuracy of GPCR-ligand association data identifications. A special architecture has been designed to allow users for making homologous ligand search with flexible bioactivity parameters. The current database contains ∼500 000 unique entries, of which the vast majority stems from ligand associations with rhodopsin- and secretin-like receptors. The GLASS database should find its most useful application in various in silico GPCR screening and functional annotation studies. AVAILABILITY AND IMPLEMENTATION: The website of GLASS database is freely available at http://zhanglab.ccmb.med.umich.edu/GLASS/. CONTACT: zhng@umich.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wallace K B Chan
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Hongjiu Zhang
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Jianyi Yang
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Jeffrey R Brender
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Junguk Hur
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Arzucan Özgür
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Yang Zhang
- Department of Biological Chemistry, Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA, Department of Basic Sciences, University of North Dakota, School of Medicine and Health Sciences, Grand Forks, ND 58203, USA and Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| |
|
15
|
Liu N, Van Voorst JR, Johnston JB, Kuhn LA. CholMine: Determinants and Prediction of Cholesterol and Cholate Binding Across Nonhomologous Protein Structures. J Chem Inf Model 2015; 55:747-59. [PMID: 25760928 DOI: 10.1021/ci5006542] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Identifying physiological ligands is necessary for annotating new protein structures, yet this presents a significant challenge to biologists and pharmaceutical chemists. Here we develop a predictor of cholesterol and cholate binding that works across diverse protein families, extending beyond sequence motif-based prediction. This approach combines SimSite3D site comparison with the detection of conserved interactions in cholesterol/cholate bound crystal structures to define three-dimensional interaction motifs. The resulting predictor identifies cholesterol sites with an ∼82% unbiased true positive rate in both membrane and soluble proteins, with a very low false positive rate relative to other predictors. The CholMine Web server can analyze users' structures, detect those likely to bind cholesterol/cholate, and predict the binding mode and key interactions. By deciphering the determinants of binding for these important steroids, CholMine may also aid in the design of selective inhibitors and detergents for targets such as G protein coupled receptors and bile acid receptors.
Affiliation(s)
- Nan Liu
- †Department of Chemistry, ‡Department of Computer Science and Engineering, and §Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, United States
| | - Jeffrey R Van Voorst
- †Department of Chemistry, ‡Department of Computer Science and Engineering, and §Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, United States
| | - John B Johnston
- †Department of Chemistry, ‡Department of Computer Science and Engineering, and §Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, United States
| | - Leslie A Kuhn
- †Department of Chemistry, ‡Department of Computer Science and Engineering, and §Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, United States
| |
|
16
|
Manoharan P, Chennoju K, Ghoshal N. Target specific proteochemometric model development for BACE1 – protein flexibility and structural water are critical in virtual screening. MOLECULAR BIOSYSTEMS 2015; 11:1955-72. [DOI: 10.1039/c5mb00088b] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Indexed: 12/26/2022]
Abstract
Structural water and protein plasticity are important factors for BACE1 targeted ligand virtual screening.
Affiliation(s)
- Prabu Manoharan
- Structural Biology and Bioinformatics Division
- CSIR-Indian Institute of Chemical Biology
- Kolkata 700032
- India
| | - Kiranmai Chennoju
- National Institute of Pharmaceutical Education and Research
- Kolkata 700032
- India
| | - Nanda Ghoshal
- Structural Biology and Bioinformatics Division
- CSIR-Indian Institute of Chemical Biology
- Kolkata 700032
- India
| |
|
17
|
Cortés-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Méndez-Lucio O, IJzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, Bender A. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. MEDCHEMCOMM 2015. [DOI: 10.1039/c4md00216d] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Indexed: 11/21/2022]
Abstract
Proteochemometric (PCM) modelling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously.
Affiliation(s)
- Isidro Cortés-Ciriano
- Unité de Bioinformatique Structurale
- Institut Pasteur and CNRS UMR 3825
- Structural Biology and Chemistry Department
- 75 724 Paris
- France
| | - Qurrat Ul Ain
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
| | | | - Eelke B. Lenselink
- Division of Medicinal Chemistry
- Leiden Academic Centre for Drug Research
- Leiden
- The Netherlands
| | - Oscar Méndez-Lucio
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
| | - Adriaan P. IJzerman
- Division of Medicinal Chemistry
- Leiden Academic Centre for Drug Research
- Leiden
- The Netherlands
| | - Gerd Wohlfahrt
- Computer-Aided Drug Design
- Orion Pharma
- FIN-02101 Espoo
- Finland
| | - Peteris Prusis
- Computer-Aided Drug Design
- Orion Pharma
- FIN-02101 Espoo
- Finland
| | - Thérèse E. Malliavin
- Unité de Bioinformatique Structurale
- Institut Pasteur and CNRS UMR 3825
- Structural Biology and Chemistry Department
- 75 724 Paris
- France
| | - Gerard J. P. van Westen
- European Molecular Biology Laboratory
- European Bioinformatics Institute
- Wellcome Trust Genome Campus
- Hinxton
- UK
| | - Andreas Bender
- Unilever Centre for Molecular Informatics
- Department of Chemistry
- CB2 1EW Cambridge
- UK
| |
|
18
|
Sugaya N. Ligand efficiency-based support vector regression models for predicting bioactivities of ligands to drug target proteins. J Chem Inf Model 2014; 54:2751-63. [PMID: 25220713 DOI: 10.1021/ci5003262] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 02/03/2023]
Abstract
The concept of ligand efficiency (LE) indices is widely accepted throughout the drug design community and is frequently used in a retrospective manner in the process of drug development. For example, LE indices are used to investigate LE optimization processes of already-approved drugs and to re-evaluate hit compounds obtained from structure-based virtual screening methods and/or high-throughput experimental assays. However, LE indices could also be applied in a prospective manner to explore drug candidates. Here, we describe the construction of machine learning-based regression models in which LE indices are adopted as an end point and show that LE-based regression models can outperform regression models based on pIC50 values. In addition to pIC50 values traditionally used in machine learning studies based on chemogenomics data, three representative LE indices (ligand lipophilicity efficiency (LLE), binding efficiency index (BEI), and surface efficiency index (SEI)) were adopted, then used to create four types of training data. We constructed regression models by applying a support vector regression (SVR) method to the training data. In cross-validation tests of the SVR models, the LE-based SVR models showed higher correlations between the observed and predicted values than the pIC50-based models. Application tests to new data displayed that, generally, the predictive performance of SVR models follows the order SEI > BEI > LLE > pIC50. Close examination of the distributions of the activity values (pIC50, LLE, BEI, and SEI) in the training and validation data implied that the performance order of the SVR models may be ascribed to the much higher diversity of the LE-based training and validation data. In the application tests, the LE-based SVR models can offer better predictive performance of compound-protein pairs with a wider range of ligand potencies than the pIC50-based models. 
This finding strongly suggests that LE-based SVR models are better than pIC50-based models at predicting bioactivities of compounds that could exhibit a much higher (or lower) potency.
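As a rough illustration of the end points compared in this abstract, the sketch below computes the three ligand efficiency indices from a pIC50 value. The abstract does not state the formulas; the definitions used here are the commonly cited ones (LLE = pIC50 minus logP, BEI = pIC50 per kDa of molecular weight, SEI = pIC50 per 100 square angstroms of polar surface area) and are an assumption, as are the `Compound` record and its field names.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Compound:
    """Hypothetical record; field names are illustrative, not from the paper."""
    pic50: float       # -log10(IC50 in mol/L)
    mol_weight: float  # molecular weight in Da
    logp: float        # (calculated) octanol-water partition coefficient
    psa: float         # polar surface area in square angstroms


def lle(c: Compound) -> float:
    # Ligand lipophilicity efficiency: potency corrected for lipophilicity.
    return c.pic50 - c.logp


def bei(c: Compound) -> float:
    # Binding efficiency index: potency per kDa of molecular weight.
    return c.pic50 / (c.mol_weight / 1000.0)


def sei(c: Compound) -> float:
    # Surface efficiency index: potency per 100 square angstroms of PSA.
    return c.pic50 / (c.psa / 100.0)


def endpoints(c: Compound) -> Dict[str, float]:
    # The four candidate regression end points named in the abstract.
    return {"pIC50": c.pic50, "LLE": lle(c), "BEI": bei(c), "SEI": sei(c)}


if __name__ == "__main__":
    print(endpoints(Compound(pic50=8.0, mol_weight=400.0, logp=3.0, psa=80.0)))
```

Each index rescales the same potency by a different molecular property, which is one way to picture why the LE-based training data can be more widely distributed than raw pIC50 values.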
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc., Hatchobori 2-19-8, Chuo-ku, Tokyo 104-0032, Japan
| |
|
19
|
Kooistra AJ, Kuhne S, de Esch IJP, Leurs R, de Graaf C. A structural chemogenomics analysis of aminergic GPCRs: lessons for histamine receptor ligand design. Br J Pharmacol 2014; 170:101-26. [PMID: 23713847 DOI: 10.1111/bph.12248] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 12/20/2012] [Revised: 04/26/2013] [Accepted: 05/03/2013] [Indexed: 01/13/2023] Open
Abstract
BACKGROUND AND PURPOSE Chemogenomics focuses on the discovery of new connections between chemical and biological space, leading to the discovery of new protein targets and biologically active molecules. G-protein coupled receptors (GPCRs) are a particularly interesting protein family for chemogenomics studies because there is an overwhelming amount of ligand binding affinity data available. The increasing number of aminergic GPCR crystal structures now, for the first time, allows the integration of chemogenomics studies with high-resolution structural analyses of GPCR-ligand complexes. EXPERIMENTAL APPROACH In this study, we have combined ligand affinity data, receptor mutagenesis studies, and amino acid sequence analyses with high-resolution structural analyses of (hist)aminergic GPCR-ligand interactions. This integrated structural chemogenomics analysis is used to describe more accurately the molecular and structural determinants of ligand affinity and selectivity in different key binding regions of the crystallized aminergic GPCRs, and of histamine receptors in particular. KEY RESULTS Our investigations highlight interesting correlations and differences between ligand similarity and ligand binding site similarity of different aminergic receptors. Apparent discrepancies can be explained by combining detailed analysis of crystallized or predicted protein-ligand binding modes, receptor mutation studies, and ligand structure-selectivity relationships that identify local differences in essential pharmacophore features in the ligand binding sites of different receptors. CONCLUSIONS AND IMPLICATIONS We have performed structural chemogenomics studies that identify links between (hist)aminergic receptor ligands and their binding sites and binding modes. This knowledge can be used to identify structure-selectivity relationships that increase our understanding of ligand binding to (hist)aminergic receptors and hence can be used in future GPCR ligand discovery and design.
Affiliation(s)
- A J Kooistra
- Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems, VU University Amsterdam, The Netherlands
| | | | | | | | | |
|
20
|
Wassermann AM, Camargo LM, Auld DS. Composition and applications of focus libraries to phenotypic assays. Front Pharmacol 2014; 5:164. [PMID: 25104937 PMCID: PMC4109565 DOI: 10.3389/fphar.2014.00164] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.1] [Received: 03/31/2014] [Accepted: 06/21/2014] [Indexed: 11/16/2022] Open
Abstract
The wealth of bioactivity information now available on low-molecular weight compounds has enabled a paradigm shift in chemical biology and early phase drug discovery efforts. Traditionally chemical libraries have been most commonly employed in screening approaches where a bioassay is used to characterize a chemical library in a random search for active samples. However, robust curating of bioassay data, establishment of ontologies enabling mining of large chemical biology datasets, and a wealth of public chemical biology information has made possible the establishment of highly annotated compound collections. Such annotated chemical libraries can now be used to build a pathway/target hypothesis and have led to a new view where chemical libraries are used to characterize a bioassay. In this article we discuss the types of compounds in these annotated libraries composed of tools, probes, and drugs. As well, we provide rationale and a few examples for how such libraries can enable phenotypic/forward chemical genomic approaches. As with any approach, there are several pitfalls that need to be considered and we also outline some strategies to avoid these.
Affiliation(s)
- Anne Mai Wassermann
- Center for Proteomic Chemistry, Novartis Institutes for Biomedical Research Cambridge, MA, USA
| | - Luiz M Camargo
- Center for Proteomic Chemistry, Novartis Institutes for Biomedical Research Cambridge, MA, USA
| | - Douglas S Auld
- Center for Proteomic Chemistry, Novartis Institutes for Biomedical Research Cambridge, MA, USA
| |
|
21
|
Exploring the ligand-protein networks in traditional chinese medicine: current databases, methods and applications. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2014; 827:227-57. [PMID: 25387968 PMCID: PMC7120483 DOI: 10.1007/978-94-017-9245-5_14] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 01/05/2023]
Abstract
While the concept of "single component-single target" in drug discovery seems to have come to an end, "multi-component-multi-target" is considered a promising alternative in this field. Traditional Chinese Medicine (TCM), which has thousands of years of clinical application in China and other Asian countries, is the pioneer of the "multi-component-multi-target" concept and of network pharmacology. Hundreds of different components in a TCM prescription can cure diseases or relieve patients by modulating the network of potential therapeutic targets. Although there is no doubt about its efficacy, it is difficult to elucidate a convincing underlying mechanism for TCM because of its complex composition and unclear pharmacology. Without thorough investigation of its potential targets and side effects, TCM cannot generate large-scale medicinal benefits, especially at a time when scientific reductionism and quantification are dominant. The use of ligand-protein networks has gained significant value in drug discovery, while its application to TCM is still at an early stage. This article first surveys TCM databases for virtual screening, which have been greatly expanded in size and data diversity in recent years. On that basis, different screening methods and strategies for identifying active ingredients and targets of TCM are outlined according to the amount of network information available, on the side of both ligand bioactivity and protein structures. Furthermore, successful in silico target identification attempts are discussed in detail, along with experiments exploring the ligand-protein networks of TCM. Finally, it is concluded that ligand-protein networks can be used not only to predict the protein targets of a small molecule but also to explore the mode of action of TCM.
|
22
|
Computational chemogenomics: is it more than inductive transfer? J Comput Aided Mol Des 2014; 28:597-618. [PMID: 24771144 DOI: 10.1007/s10822-014-9743-1] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 01/20/2014] [Accepted: 04/11/2014] [Indexed: 10/25/2022]
Abstract
High-throughput assays challenge us to extract knowledge from multi-ligand, multi-target activity data. In QSAR, weights are statically fitted to each ligand descriptor with respect to a single endpoint or target. However, computational chemogenomics (CG) has demonstrated benefits of learning from entire grids of data at once, rather than building target-specific QSARs. A possible reason for this is the emergence of inductive knowledge transfer (IT) between targets, providing statistical robustness to the model, with no assumption about the structure of the targets. Relevant protein descriptors in CG should allow one to learn how to dynamically adjust ligand attribute weights with respect to protein structure. Hence, models built through explicit learning (EL) by including protein information, while benefitting from IT enhancement, should provide additional predictive capability, notably for protein deorphanization. This interplay between IT and EL in CG modeling is not sufficiently studied. While IT is likely to occur irrespective of the injected target information, it is not clear whether and when boosting due to EL may occur. EL is only possible if protein description is appropriate to the target set under investigation. The key issue here is the search for evidence of genuine EL exceeding expectations based on pure IT. We explore the problem in the context of Support Vector Regression, using more than 9,400 pKi values of 31 GPCRs, where compound-protein interactions are represented by the concatenation of vectorial descriptions of compounds and proteins. This provides a unified framework to generate both IT-enhanced and potentially EL-enabled models, where the difference is toggled by supplied protein information. For EL-enabled models, protein information includes genuine protein descriptors such as typical sequence-based terms, but also the experimentally determined affinity cross-correlation fingerprints. 
These latter benchmark the expected behavior of a quasi-ideal descriptor capturing the actual functional protein-protein relatedness, and therefore thought to be the most likely to enable EL. EL- and IT-based methods were benchmarked alongside classical QSAR, with respect to cross-validation and deorphanization challenges. A rational method for projecting benchmarked methodologies into a strategy space is given, in the aims that the projection will provide directions for the types of molecule designs possible using a given methodology. While EL-enabled strategies outperform classical QSARs and favorably compare to similar published results, they are, in all respects evaluated herein, not strongly distinguished from IT-enhanced models. Moreover, EL-enabled strategies failed to prove superior in deorphanization challenges. Therefore, this paper raises caution that, contrary to common belief and intuitive expectation, the benefits of chemogenomics models over classical QSAR are quite possibly due less to the injection of protein-related information, and rather impacted more by the effect of inductive transfer, due to simultaneous learning from all of the modeled endpoints. These results show that the field of protein descriptor research needs further improvements to truly realize the expected benefit of EL.
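The representation described in this abstract, where each compound-protein pair is the concatenation of a compound vector and a protein vector, can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the descriptor values and pKi numbers are invented toy data, `build_rows` is a hypothetical helper, and using a one-hot target identity vector for the IT-only case is an assumption about what "no protein information" looks like in practice.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def one_hot(index: int, size: int) -> List[float]:
    # Identity encoding: says which target it is, nothing about its structure.
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_rows(
    activities: Dict[Tuple[str, str], float],
    ligand_desc: Dict[str, Vector],
    protein_desc: Dict[str, Vector],
    use_protein_descriptors: bool,
) -> Tuple[List[List[float]], List[float]]:
    """Turn a (ligand, protein) -> pKi grid into concatenated feature rows."""
    target_ids = sorted(protein_desc)
    rows: List[List[float]] = []
    labels: List[float] = []
    for (lig_id, prot_id), pki in sorted(activities.items()):
        if use_protein_descriptors:
            # EL-enabled variant: genuine protein descriptors are supplied.
            prot_vec = list(protein_desc[prot_id])
        else:
            # IT-only variant (assumed): target identity without description.
            prot_vec = one_hot(target_ids.index(prot_id), len(target_ids))
        rows.append(list(ligand_desc[lig_id]) + prot_vec)
        labels.append(pki)
    return rows, labels


# Toy data, invented for illustration only.
ligands = {"lig1": [0.2, 1.0], "lig2": [0.7, 0.0]}
proteins = {"gpcrA": [0.1, 0.4, 0.9], "gpcrB": [0.3, 0.3, 0.5]}
pki = {("lig1", "gpcrA"): 7.2, ("lig1", "gpcrB"): 5.1, ("lig2", "gpcrA"): 6.4}

el_rows, y = build_rows(pki, ligands, proteins, use_protein_descriptors=True)
it_rows, _ = build_rows(pki, ligands, proteins, use_protein_descriptors=False)
```

Either set of rows, together with `y`, could then be passed to any regression learner (the abstract uses Support Vector Regression), so the two strategies differ only in the protein part of each row.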
|
23
|
Pérez-Nueno VI, Karaboga AS, Souchet M, Ritchie DW. GES Polypharmacology Fingerprints: A Novel Approach for Drug Repositioning. J Chem Inf Model 2014; 54:720-34. [DOI: 10.1021/ci4006723] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/28/2022]
Affiliation(s)
- Violeta I. Pérez-Nueno
- Harmonic Pharma, Espace Transfert, 615 rue du Jardin Botanique, 54600 Villers lès Nancy, France
| | - Arnaud S. Karaboga
- Harmonic Pharma, Espace Transfert, 615 rue du Jardin Botanique, 54600 Villers lès Nancy, France
| | - Michel Souchet
- Harmonic Pharma, Espace Transfert, 615 rue du Jardin Botanique, 54600 Villers lès Nancy, France
| | - David W. Ritchie
- INRIA Nancy − Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre lès Nancy, France
| |
|
24
|
Levit A, Beuming T, Krilov G, Sherman W, Niv MY. Predicting GPCR Promiscuity Using Binding Site Features. J Chem Inf Model 2013; 54:184-94. [DOI: 10.1021/ci400552z] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 12/26/2022]
Affiliation(s)
- Anat Levit
- Institute of Biochemistry, Food Science and Nutrition, Robert H. Smith Faculty of Agriculture Food and Environment, The Hebrew University, Rehovot 76100, Israel
- Fritz Haber Center for Molecular Dynamics, The Hebrew University, Jerusalem 91904, Israel
| | - Thijs Beuming
- Schrodinger Inc., 120 West Forty-Fifth Street, 17th Floor, New York, New York 10036, United States
| | - Goran Krilov
- Schrodinger Inc., 120 West Forty-Fifth Street, 17th Floor, New York, New York 10036, United States
| | - Woody Sherman
- Schrodinger Inc., 120 West Forty-Fifth Street, 17th Floor, New York, New York 10036, United States
| | - Masha Y. Niv
- Institute of Biochemistry, Food Science and Nutrition, Robert H. Smith Faculty of Agriculture Food and Environment, The Hebrew University, Rehovot 76100, Israel
- Fritz Haber Center for Molecular Dynamics, The Hebrew University, Jerusalem 91904, Israel
| |
|
25
|
Bajorath J. Molecular crime scene investigation - dusting for fingerprints. DRUG DISCOVERY TODAY. TECHNOLOGIES 2013; 10:e491-e498. [PMID: 24451639 DOI: 10.1016/j.ddtec.2012.06.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/03/2023]
Abstract
In chemoinformatics and drug design, fingerprints (FPs) are defined as string representations of molecular structure and properties and are popular descriptors for similarity searching. FPs are generally characterized by the simplicity of their design and ease of use. Despite a long history in chemoinformatics, the potential and limitations of FP searching are often not well understood. Standard FPs can also be subjected to engineering techniques to tune them for specific search applications.
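To make the idea of fingerprint similarity searching concrete, the sketch below ranks a small library against a query fingerprint. The abstract does not name a similarity measure; the Tanimoto coefficient is used here as a common choice, which is an assumption, and the fingerprints (stored as sets of "on" bit positions) and compound names are invented.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # positions of the bits set to 1


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    # Shared on-bits divided by on-bits present in either fingerprint.
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: Fingerprint, library: Dict[str, Fingerprint], top_n: int = 3
) -> List[Tuple[str, float]]:
    # Score every library compound, then sort by descending similarity
    # (ties broken alphabetically so the ranking is deterministic).
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top_n]


# Invented toy library for illustration.
library = {
    "cmpd_a": frozenset({1, 2, 3}),
    "cmpd_b": frozenset({2, 3, 4}),
    "cmpd_c": frozenset({7, 8}),
}
hits = similarity_search(frozenset({1, 2, 3}), library, top_n=2)
```

Tuning a fingerprint for a specific search application, as the abstract mentions, would change which bits are set or kept, while the ranking step stays the same.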
|
26
|
|
27
|
Benchmarking of protein descriptor sets in proteochemometric modeling (part 2): modeling performance of 13 amino acid descriptor sets. J Cheminform 2013; 5:42. [PMID: 24059743 PMCID: PMC4015169 DOI: 10.1186/1758-2946-5-42] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 amino acid descriptor sets have been benchmarked with respect to their ability of establishing bioactivity models. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI, BLOSUM, a novel protein descriptor set (termed ProtFP (4 variants)), and in addition we created and benchmarked three pairs of descriptor combinations. Prediction performance was evaluated in seven structure-activity benchmarks which comprise Angiotensin Converting Enzyme (ACE) dipeptidic inhibitor data, and three proteochemometric data sets, namely (1) GPCR ligands modeled against a GPCR panel, (2) enzyme inhibitors (NNRTIs) with associated bioactivities against a set of HIV enzyme mutants, and (3) enzyme inhibitors (PIs) with associated bioactivities on a large set of HIV enzyme mutants. Results The amino acid descriptor sets compared here show similar performance (<0.1 log units RMSE difference and <0.1 difference in MCC), while errors for individual proteins were in some cases found to be larger than those resulting from descriptor set differences ( > 0.3 log units RMSE difference and >0.7 difference in MCC). Combining different descriptor sets generally leads to better modeling performance than utilizing individual sets. The best performers were Z-scales (3) combined with ProtFP (Feature), or Z-Scales (3) combined with an average Z-Scale value for each target, while ProtFP (PCA8), ST-Scales, and ProtFP (Feature) rank last. Conclusions While amino acid descriptor sets capture different aspects of amino acids their ability to be used for bioactivity modeling is still – on average – surprisingly similar. 
Still, combining sets that describe complementary information leads to a small but consistent improvement in modeling performance (average MCC 0.01 better, average RMSE 0.01 log units lower). Finally, performance differences exist between the targets compared, underlining that choosing an appropriate descriptor set is fundamental for bioactivity modeling, from both the ligand and the protein side.
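The two performance measures quoted in this abstract, RMSE for regression and the Matthews correlation coefficient (MCC) for classification, follow standard definitions, sketched below. The function names are illustrative, and returning 0.0 when the MCC denominator vanishes is a convention assumed here rather than something the abstract specifies.

```python
import math
from typing import Sequence


def rmse(observed: Sequence[float], predicted: Sequence[float]) -> float:
    # Root-mean-square error, in the same units as the activities (log units).
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mcc(actual: Sequence[int], predicted: Sequence[int]) -> float:
    # Matthews correlation coefficient from the binary confusion matrix.
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: report 0.0 when a row or column of the matrix is empty.
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

On this scale, the reported descriptor-set differences of under 0.1 in either measure are small next to the per-protein differences of more than 0.3 log units RMSE and more than 0.7 MCC.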
|
28
|
Sugaya N. Training based on ligand efficiency improves prediction of bioactivities of ligands and drug target proteins in a machine learning approach. J Chem Inf Model 2013; 53:2525-37. [PMID: 24020509 DOI: 10.1021/ci400240u] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Abstract
Machine learning methods based on ligand-protein interaction data in bioactivity databases are one of the current strategies for efficiently finding novel lead compounds as the first step in the drug discovery process. Although previous machine learning studies have succeeded in predicting novel ligand-protein interactions with high performance, all of the previous studies to date have been heavily dependent on the simple use of raw bioactivity data of ligand potencies measured by IC50, EC50, K(i), and K(d) deposited in databases. ChEMBL provides us with a unique opportunity to investigate whether a machine-learning-based classifier created by reflecting ligand efficiency other than the IC50, EC50, K(i), and Kd values can also offer high predictive performance. Here we report that classifiers created from training data based on ligand efficiency show higher performance than those from data based on IC50 or K(i) values. Utilizing GPCRSARfari and KinaseSARfari databases in ChEMBL, we created IC50- or K(i)-based training data and binding efficiency index (BEI) based training data then constructed classifiers using support vector machines (SVMs). The SVM classifiers from the BEI-based training data showed slightly higher area under curve (AUC), accuracy, sensitivity, and specificity in the cross-validation tests. Application of the classifiers to the validation data demonstrated that the AUCs and specificities of the BEI-based classifiers dramatically increased in comparison with the IC50- or K(i)-based classifiers. The improvement of the predictive power by the BEI-based classifiers can be attributed to (i) the more separated distributions of positives and negatives, (ii) the higher diversity of negatives in the BEI-based training data in a feature space of SVMs, and (iii) a more balanced number of positives and negatives in the BEI-based training data. 
These results strongly suggest that training data based on ligand efficiency as well as data based on classical IC50, EC50, K(d), and K(i) values are important when creating a classifier using a machine learning approach based on bioactivity data.
Affiliation(s)
- Nobuyoshi Sugaya
- Drug Discovery Department, Research & Development Division, PharmaDesign, Inc., Hatchobori 2-19-8, Chuo-ku, Tokyo 104-0032, Japan
| |
|
29
|
van Westen GJ, Swier RF, Wegner JK, Ijzerman AP, van Vlijmen HW, Bender A. Benchmarking of protein descriptor sets in proteochemometric modeling (part 1): comparative study of 13 amino acid descriptor sets. J Cheminform 2013; 5:41. [PMID: 24059694 PMCID: PMC3848949 DOI: 10.1186/1758-2946-5-41] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 06/20/2013] [Accepted: 09/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background While a large body of work exists on comparing and benchmarking descriptors of molecular structures, a similar comparison of protein descriptor sets is lacking. Hence, in the current work a total of 13 different protein descriptor sets have been compared with respect to their behavior in perceiving similarities between amino acids. The descriptor sets included in the study are Z-scales (3 variants), VHSE, T-scales, ST-scales, MS-WHIM, FASGAI and BLOSUM, and a novel protein descriptor set termed ProtFP (4 variants). We investigate to what extent descriptor sets show collinear as well as orthogonal behavior via principal component analysis (PCA). Results In describing amino acid similarities, MS-WHIM, T-scales and ST-scales show related behavior, as do the VHSE, FASGAI, and ProtFP (PCA3) descriptor sets. Conversely, the ProtFP (PCA5), ProtFP (PCA8), Z-Scales (Binned), and BLOSUM descriptor sets show behavior that is distinct from one another as well as from both of the clusters above. Generally, the use of more principal components (>3 per amino acid, per descriptor) leads to significant differences in the way amino acids are described, even though the later principal components capture less variation per component of the original input data. Conclusion In this work a comparison is provided of how similarly (and differently) currently available amino acid descriptor sets behave when converting structure to property space. The results obtained enable molecular modelers to select suitable amino acid descriptor sets for structure-activity analyses, e.g. those showing complementary behavior.
Affiliation(s)
- Gerard Jp van Westen
- Division of Medicinal Chemistry, Leiden / Amsterdam Center for Drug Research, Einsteinweg 55, Leiden 2333, CC, The Netherlands.
| | | | | | | | | | | |
|
30
|
Rognan D. Towards the Next Generation of Computational Chemogenomics Tools. Mol Inform 2013; 32:1029-34. [PMID: 27481148 DOI: 10.1002/minf.201300054] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/28/2013] [Accepted: 06/11/2013] [Indexed: 01/07/2023]
Affiliation(s)
- D Rognan
- UMR 7200 CNRS-Université de Strasbourg, MEDALIS Drug Discovery Center, 74 route du Rhin, 67400, Illkirch, France.
31
Pérot S, Regad L, Reynès C, Spérandio O, Miteva MA, Villoutreix BO, Camproux AC. Insights into an original pocket-ligand pair classification: a promising tool for ligand profile prediction. PLoS One 2013; 8:e63730. [PMID: 23840299 PMCID: PMC3688729 DOI: 10.1371/journal.pone.0063730] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 10/05/2012] [Accepted: 04/05/2013] [Indexed: 11/18/2022] Open
Abstract
Pockets are today at the cornerstones of modern drug discovery projects and at the crossroad of several research fields, from structural biology to mathematical modeling. Being able to predict if a small molecule could bind to one or more protein targets or if a protein could bind to some given ligands is very useful for drug discovery endeavors, anticipation of binding to off- and anti-targets. To date, several studies explore such questions from chemogenomic approach to reverse docking methods. Most of these studies have been performed either from the viewpoint of ligands or targets. However it seems valuable to use information from both ligands and target binding pockets. Hence, we present a multivariate approach relating ligand properties with protein pocket properties from the analysis of known ligand-protein interactions. We explored and optimized the pocket-ligand pair space by combining pocket and ligand descriptors using Principal Component Analysis and developed a classification engine on this paired space, revealing five main clusters of pocket-ligand pairs sharing specific and similar structural or physico-chemical properties. These pocket-ligand pair clusters highlight correspondences between pocket and ligand topological and physico-chemical properties and capture relevant information with respect to protein-ligand interactions. Based on these pocket-ligand correspondences, a protocol of prediction of clusters sharing similarity in terms of recognition characteristics is developed for a given pocket-ligand complex and gives high performances. It is then extended to cluster prediction for a given pocket in order to acquire knowledge about its expected ligand profile or to cluster prediction for a given ligand in order to acquire knowledge about its expected pocket profile. 
This prediction approach shows promising results and could contribute to predict some ligand properties critical for binding to a given pocket, and conversely, some key pocket properties for ligand binding.
Affiliation(s)
- Stéphanie Pérot
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Leslie Regad
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Christelle Reynès
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Olivier Spérandio
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Maria A. Miteva
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Bruno O. Villoutreix
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
- Anne-Claude Camproux
- INSERM, UMRS 973, MTi, Paris, France
- Univ Paris Diderot, Sorbonne Paris Cité, UMRS 973, MTi, Paris, France
32
Shiraishi A, Niijima S, Brown JB, Nakatsui M, Okuno Y. Chemical genomics approach for GPCR-ligand interaction prediction and extraction of ligand binding determinants. J Chem Inf Model 2013; 53:1253-62. [PMID: 23721295 DOI: 10.1021/ci300515z] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/15/2023]
Abstract
Chemical genomics research has revealed that G-protein coupled receptors (GPCRs) interact with a variety of ligands and that a large number of ligands are known to bind GPCRs even with low transmembrane (TM) sequence similarity. It is crucial to extract informative binding region propensities from large quantities of bioactivity data. To address this issue, we propose a machine learning approach that enables identification of both chemical substructures and amino acid properties that are associated with ligand binding, which can be applied to virtual ligand screening on a GPCR-wide scale. We also address the question of how to select plausible negative noninteraction pairs based on a statistical approach in order to develop reliable prediction models for GPCR-ligand interactions. The key interaction sites estimated by our approach can be of great use not only for screening of active compounds but also for modification of active compounds with the aim of improving activity or selectivity.
Affiliation(s)
- Akira Shiraishi
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto
33
Koch U, Hamacher M, Nussbaumer P. Cheminformatics at the interface of medicinal chemistry and proteomics. Biochimica et Biophysica Acta - Proteins and Proteomics 2013; 1844:156-61. [PMID: 23707564 DOI: 10.1016/j.bbapap.2013.05.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 10/11/2012] [Revised: 04/26/2013] [Accepted: 05/13/2013] [Indexed: 10/26/2022]
Abstract
Multiple factors have to be optimized in the course of a drug discovery project. Traditionally this includes potency on a single target, eventually specificity as well as the pharmacokinetic, physicochemical and the safety profile. Recently an additional dimension has been added by realizing that the therapeutic outcome of a drug is often determined not only by its activity on a single target but also by its activity profile across a variety of biological targets. To address the polypharmacology of drug candidates many compounds are tested on a set of targets or in phenotypic screens generating a tremendous amount of data. To extract useful information computational methods at the interface of proteomics and cheminformatics are indispensable. This review will focus on some recent developments in this field. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.
Affiliation(s)
- Uwe Koch
- Lead Discovery Center GmbH, Otto-Hahn-Str. 15, D-44227 Dortmund, Germany.
34
Fingerprint design and engineering strategies: rationalizing and improving similarity search performance. Future Med Chem 2013; 4:1945-59. [PMID: 23088275 DOI: 10.4155/fmc.12.126] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 01/05/2023] Open
Abstract
Fingerprints (FPs) are bit or integer string representations of molecular structure and properties, and are popular descriptors for chemical similarity searching. A major goal of similarity searching is the identification of novel active compounds on the basis of known reference molecules. In this review recent FP design and engineering strategies are discussed. New types of FPs continue to be introduced, often applying different design principles. FP engineering techniques have recently been introduced to further improve search performance and computational efficiency and elucidate mechanisms by which FPs recognize active compounds. In addition, through feature selection and hybridization techniques, standard FPs have been transformed into compound class-specific versions with further increased search performance. Moreover, scaffold hopping mechanisms have been explored. FPs will continue to play an important role in the search for novel active compounds.
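As an illustration of fingerprint-based similarity searching, the following minimal Python sketch ranks a small library against a reference molecule. It assumes the Tanimoto coefficient as the similarity measure (a common choice, not one named in the abstract) and represents each fingerprint as the set of its on-bit positions. The compound names and bit positions are hypothetical toy values, not data from the review.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def rank_by_similarity(reference: set, library: dict) -> list:
    """Return (name, similarity) pairs sorted from most to least similar."""
    scored = [(name, tanimoto(reference, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints (on-bit positions), for illustration only.
reference = {1, 4, 7, 9}
library = {
    "cmpd_a": {1, 4, 7, 9},
    "cmpd_b": {1, 4, 8},
    "cmpd_c": {2, 3},
}
ranking = rank_by_similarity(reference, library)
```

Compounds at the top of `ranking` would be the first candidates to test as potential new actives.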
35
Desaphy J, Raimbaud E, Ducrot P, Rognan D. Encoding protein-ligand interaction patterns in fingerprints and graphs. J Chem Inf Model 2013; 53:623-37. [PMID: 23432543 DOI: 10.1021/ci300566n] [Citation(s) in RCA: 109] [Impact Index Per Article: 9.9] [Indexed: 01/12/2023]
Abstract
We herewith present a novel and universal method to convert protein-ligand coordinates into a simple fingerprint of 210 integers registering the corresponding molecular interaction pattern. Each interaction (hydrophobic, aromatic, hydrogen bond, ionic bond, metal complexation) is detected on the fly and physically described by a pseudoatom centered either on the interacting ligand atom, the interacting protein atom, or the geometric center of both interacting atoms. Counting all possible triplets of interaction pseudoatoms within six distance ranges, and pruning the full integer vector to keep the most frequent triplets enables the definition of a simple (210 integers) and coordinate frame-invariant interaction pattern descriptor (TIFP) that can be applied to compare any pair of protein-ligand complexes. TIFP fingerprints have been calculated for ca. 10,000 druggable protein-ligand complexes therefore enabling a wide comparison of relationships between interaction pattern similarity and ligand or binding site pairwise similarity. We notably show that interaction pattern similarity strongly depends on binding site similarity. In addition to the TIFP fingerprint which registers intermolecular interactions between a ligand and its target protein, we developed two tools (Ishape, Grim) to align protein-ligand complexes from their interaction patterns. Ishape is based on the overlap of interaction pseudoatoms using a smooth Gaussian function, whereas Grim utilizes a standard clique detection algorithm to match interaction pattern graphs. Both tools are complementary and enable protein-ligand complex alignments capitalizing on both global and local pattern similarities. 
The new fingerprint and companion alignment tools have been successfully used in three scenarios: (i) interaction-biased alignment of protein-ligand complexes, (ii) postprocessing docking poses according to known interaction patterns for a particular target, and (iii) virtual screening for bioisosteric scaffolds sharing similar interaction patterns.
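The triplet-counting idea behind such an interaction fingerprint can be sketched in Python as follows. This is a simplified illustration, not the published TIFP implementation: the distance bin edges, the interaction type labels, and the way a triplet is keyed (sorted types plus sorted distance bins) are all assumptions, and the pruning step that reduces the vector to 210 integers is omitted.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical bin edges (in angstroms) defining six distance ranges.
BIN_EDGES = [0.0, 4.0, 6.0, 8.0, 10.0, 12.0, 15.0]


def distance_bin(d):
    """Return the index of the distance range containing d, or None if out of range."""
    for i in range(len(BIN_EDGES) - 1):
        if BIN_EDGES[i] <= d < BIN_EDGES[i + 1]:
            return i
    return None


def triplet_counts(pseudoatoms):
    """Count pseudoatom triplets keyed by interaction types and binned distances.

    pseudoatoms: list of (interaction_type, (x, y, z)) tuples.
    Only inter-pseudoatom distances are used, so the result does not
    depend on the coordinate frame.
    """
    counts = Counter()
    for a, b, c in combinations(pseudoatoms, 3):
        bins = [distance_bin(math.dist(p[1], q[1])) for p, q in ((a, b), (b, c), (a, c))]
        if None in bins:
            continue  # skip triplets with a distance outside the binned ranges
        key = (tuple(sorted(x[0] for x in (a, b, c))), tuple(sorted(bins)))
        counts[key] += 1
    return counts


# Hypothetical toy interaction pattern, for illustration only.
pattern = [
    ("hbond", (0.0, 0.0, 0.0)),
    ("hydrophobic", (3.0, 0.0, 0.0)),
    ("aromatic", (0.0, 5.0, 0.0)),
]
counts = triplet_counts(pattern)
```

Because the keys depend only on distances and interaction types, translating or rotating the complex leaves `counts` unchanged.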
Affiliation(s)
- Jérémy Desaphy
- Laboratory for Therapeutical Innovation, UMR 7200 Université de Strasbourg/CNRS, MEDALIS Drug Discovery Center, F-67400 Illkirch, France
36
Gloriam DE. Chemogenomics of allosteric binding sites in GPCRs. Drug Discovery Today: Technologies 2013; 10:e307-e313. [PMID: 24050282 DOI: 10.1016/j.ddtec.2012.07.010] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/02/2023]
Abstract
Chemogenomic techniques connect the chemical and biological domains to establish ligand and target relationships not evident from the individual disciplines. Chemogenomics has been applied in lead generation, target classification, focused library design as well as selectivity and polypharmacology profiling. This review describes recent developments structured into ligand-, target- and combined chemogenomic techniques and applications to allosteric GPCR ligands. It also outlines relative strengths and limitations of these techniques and the impact of the increasing crystallographic data.
37
Yu P, Wild DJ. Discovering associations in biomedical datasets by link-based associative classifier (LAC). PLoS One 2012; 7:e51018. [PMID: 23227228 PMCID: PMC3515483 DOI: 10.1371/journal.pone.0051018] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/18/2012] [Accepted: 10/31/2012] [Indexed: 11/21/2022] Open
Abstract
Associative classification mining (ACM) can be used to provide predictive models with high accuracy as well as interpretability. However, traditional ACM ignores differences in significance among the features used for mining. Although weighted associative classification mining (WACM) addresses this issue by assigning different weights to features, most implementations can only be utilized when pre-assigned weights are available. In this paper, we propose a link-based approach to automatically derive weight information from a dataset using link-based models which treat the dataset as a bipartite model. By combining this link-based feature weighting method with a traditional ACM method, classification based on associations (CBA), a Link-based Associative Classifier (LAC) is developed. We then demonstrate the application of LAC to biomedical datasets for association discovery between chemical compounds and bioactivities or diseases. The results indicate that the novel link-based weighting method is comparable to the support vector machine (SVM) and RELIEF methods, and is capable of capturing significant features. Additionally, LAC is shown to produce models with high accuracies and discover interesting associations which may otherwise remain unrevealed by traditional ACM.
Affiliation(s)
- Pulan Yu
- School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America
- David J. Wild
- School of Informatics and Computing, Indiana University, Bloomington, Indiana, United States of America
38
Affiliation(s)
- Michael Bieler
- Boehringer Ingelheim Pharma GmbH & Co. KG; Lead Discovery and Optimization Support; 88397; Biberach/Riss; Germany
- Herbert Koeppen
- Boehringer Ingelheim Pharma GmbH & Co. KG; Lead Discovery and Optimization Support; 88397; Biberach/Riss; Germany
39
Vogt M, Bajorath J. Chemoinformatics: A view of the field and current trends in method development. Bioorg Med Chem 2012; 20:5317-23. [DOI: 10.1016/j.bmc.2012.03.030] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 01/23/2012] [Revised: 03/09/2012] [Accepted: 03/12/2012] [Indexed: 12/18/2022]
40
van Westen GJP, van den Hoven OO, van der Pijl R, Mulder-Krieger T, de Vries H, Wegner JK, Ijzerman AP, van Vlijmen HWT, Bender A. Identifying novel adenosine receptor ligands by simultaneous proteochemometric modeling of rat and human bioactivity data. J Med Chem 2012; 55:7010-20. [PMID: 22827545 DOI: 10.1021/jm3003069] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 02/07/2023]
Abstract
The four subtypes of adenosine receptors form relevant drug targets in the treatment of, e.g., diabetes and Parkinson's disease. In the present study, we aimed at finding novel small molecule ligands for these receptors using virtual screening approaches based on proteochemometric (PCM) modeling. We combined bioactivity data from all human and rat receptors in order to widen available chemical space. After training and validating a proteochemometric model on this combined data set (Q² of 0.73, RMSE of 0.61), we virtually screened a vendor database of 100910 compounds. Of 54 compounds purchased, six novel high affinity adenosine receptor ligands were confirmed experimentally, one of which displayed an affinity of 7 nM on the human adenosine A₁ receptor. We conclude that the combination of rat and human data performs better than human data only. Furthermore, we conclude that proteochemometric modeling is an efficient method to quickly screen for novel bioactive compounds.
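For readers unfamiliar with the two validation metrics quoted above, the Python sketch below computes them using their common definitions, which are assumed here rather than taken from the paper: Q² as one minus the ratio of the predictive residual sum of squares to the total sum of squares, and RMSE as the root of the mean squared error. The observed and predicted activity values are hypothetical toy numbers.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(observed))


def q_squared(observed, predicted):
    """Q^2 = 1 - PRESS / SS_total, with predictions from held-out data."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


# Hypothetical toy activities (e.g. pKi values), for illustration only.
observed = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 6.0, 6.5, 8.0]

q2 = q_squared(observed, predicted)
error = rmse(observed, predicted)
```

A Q² close to 1 together with a low RMSE indicates that held-out predictions track the measured activities closely.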
Affiliation(s)
- Gerard J P van Westen
- Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
41
Cheng F, Zhou Y, Li J, Li W, Liu G, Tang Y. Prediction of chemical-protein interactions: multitarget-QSAR versus computational chemogenomic methods. Mol Biosyst 2012; 8:2373-84. [PMID: 22751809 DOI: 10.1039/c2mb25110h] [Citation(s) in RCA: 76] [Impact Index Per Article: 6.3] [Indexed: 01/18/2023]
Abstract
Elucidation of chemical-protein interactions (CPI) is the basis of target identification and drug discovery. It is time-consuming and costly to determine CPI experimentally, and computational methods will facilitate the determination of CPI. In this study, two methods, multitarget quantitative structure-activity relationship (mt-QSAR) and computational chemogenomics, were developed for CPI prediction. Two comprehensive data sets were collected from the ChEMBL database for method assessment. One data set consisted of 81 689 CPI pairs among 50 924 compounds and 136 G-protein coupled receptors (GPCRs), while the other one contained 43 965 CPI pairs among 23 376 compounds and 176 kinases. The range of the area under the receiver operating characteristic curve (AUC) for the test sets was 0.95 to 1.0 and 0.82 to 1.0 for 100 GPCR mt-QSAR models and 100 kinase mt-QSAR models, respectively. The AUC values of 5-fold cross validation were about 0.92 for both 176 kinases and 136 GPCRs using the chemogenomic method. However, the performance of the chemogenomic method was worse than that of mt-QSAR for the external validation set. Further analysis revealed that there was a high false positive rate for the external validation set when using the chemogenomic method. In addition, we developed a web server named CPI-Predictor, which is available for free. The methods and tool have potential applications in network pharmacology and drug repositioning.
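The AUC figures reported above can be read as the probability that a randomly chosen interacting pair is scored higher than a randomly chosen non-interacting pair. The Python sketch below computes the AUC in that pairwise form. It is a generic illustration, not the evaluation code of the study, and the labels and scores are hypothetical toy values.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for an interacting pair, 0 for a non-interacting pair.
    scores: predicted interaction scores (higher means more likely to interact).
    Tied scores count as half a win.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy predictions for six compound-protein pairs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc = roc_auc(labels, scores)
```

An AUC of 0.5 corresponds to random ranking, and 1.0 to a perfect separation of interacting from non-interacting pairs.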
Affiliation(s)
- Feixiong Cheng
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
42
Abstract
Notwithstanding their key roles in therapy and as biological probes, 7% of approved drugs are purported to have no known primary target, and up to 18% lack a well-defined mechanism of action. Using a chemoinformatics approach, we sought to "de-orphanize" drugs that lack primary targets. Surprisingly, targets could be easily predicted for many: Whereas these targets were not known to us nor to the common databases, most could be confirmed by literature search, leaving only 13 Food and Drug Administration-approved drugs with unknown targets; the number of drugs without molecular targets likely is far fewer than reported. The number of worldwide drugs without reasonable molecular targets similarly dropped, from 352 (25%) to 44 (4%). Nevertheless, there remained at least seven drugs for which reasonable mechanism-of-action targets were unknown but could be predicted, including the antitussives clemastine, cloperastine, and nepinalone; the antiemetic benzquinamide; the muscle relaxant cyclobenzaprine; the analgesic nefopam; and the immunomodulator lobenzarit. For each, predicted targets were confirmed experimentally, with affinities within their physiological concentration ranges. Turning this question on its head, we next asked which drugs were specific enough to act as chemical probes. Over 100 drugs met the standard criteria for probes, and 40 did so by more stringent criteria. A chemical information approach to drug-target association can guide therapeutic development and reveal applications to probe biology, a focus of much current interest.
43
Schuffenhauer A. Computational methods for scaffold hopping. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1106] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]
44
Madala PK, Fairlie DP, Bodén M. Matching Cavities in G Protein-Coupled Receptors to Infer Ligand-Binding Sites. J Chem Inf Model 2012; 52:1401-10. [DOI: 10.1021/ci2005498] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Affiliation(s)
- Praveen K. Madala
- Institute for Molecular Bioscience, ‡School of Chemistry and Molecular Biosciences, and §School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, QLD 4072, Australia
- David P. Fairlie
- Institute for Molecular Bioscience, ‡School of Chemistry and Molecular Biosciences, and §School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, QLD 4072, Australia
- Mikael Bodén
- Institute for Molecular Bioscience, ‡School of Chemistry and Molecular Biosciences, and §School of Information Technology and Electrical Engineering, The University of Queensland, St. Lucia, QLD 4072, Australia
45
Niijima S, Shiraishi A, Okuno Y. Dissecting Kinase Profiling Data to Predict Activity and Understand Cross-Reactivity of Kinase Inhibitors. J Chem Inf Model 2012; 52:901-12. [DOI: 10.1021/ci200607f] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Indexed: 01/17/2023]
Affiliation(s)
- Satoshi Niijima
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Akira Shiraishi
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
- Yasushi Okuno
- Department of Systems Biosciences for Drug Discovery, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
46
Anderson PC, De Sapio V, Turner KB, Elmer SP, Roe DC, Schoeniger JS. Identification of binding specificity-determining features in protein families. J Med Chem 2012; 55:1926-39. [PMID: 22289061 DOI: 10.1021/jm200979x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/18/2022]
Abstract
We present a new approach for identifying features of ligand-protein binding interfaces that predict binding selectivity and demonstrate its effectiveness for predicting kinase inhibitor specificity. We analyzed a large set of human kinases and kinase inhibitors using clustering of experimentally determined inhibition constants (to define specificity classes of kinases and inhibitors) and virtual ligand docking (to extract structural and chemical features of the ligand-protein binding interfaces). We then used statistical methods to identify features characteristic of each class. Machine learning was employed to determine which combinations of characteristic features were predictive of class membership and to predict binding specificities and affinities of new compounds. Experiments showed predictions were 70% accurate. These results show that our method can automatically pinpoint on the three-dimensional binding interfaces pharmacophore-like features that act as "selectivity filters". The method is not restricted to kinases, requires no prior hypotheses about specific interactions, and can be applied to any protein families for which sets of structures and ligand binding data are available.
Affiliation(s)
- Peter C Anderson
- Sandia National Laboratories, Box 969, MS 9291, Livermore, California 94551, USA
47
Integrating structure-based and ligand-based approaches for computational drug design. Future Med Chem 2011; 3:735-50. [PMID: 21554079 DOI: 10.4155/fmc.11.18] [Citation(s) in RCA: 103] [Impact Index Per Article: 7.9] [Indexed: 12/18/2022] Open
Abstract
Methods utilized in computer-aided drug design can be classified into two major categories: structure based and ligand based, using information on the structure of the protein or on the biological and physicochemical properties of bound ligands, respectively. In recent years there has been a trend towards integrating these two methods in order to enhance the reliability and efficiency of computer-aided drug-design approaches by combining information from both the ligand and the protein. This trend resulted in a variety of methods that include: pseudoreceptor methods, pharmacophore methods, fingerprint methods and approaches integrating docking with similarity-based methods. In this article, we will describe the concepts behind each method and selected applications.
48
Weill N, Valencia C, Gioria S, Villa P, Hibert M, Rognan D. Identification of Nonpeptide Oxytocin Receptor Ligands by Receptor-Ligand Fingerprint Similarity Search. Mol Inform 2011; 30:521-6. [DOI: 10.1002/minf.201100026] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 02/25/2011] [Accepted: 04/20/2011] [Indexed: 11/06/2022]
49
Buchwald F, Richter L, Kramer S. Predicting a small molecule-kinase interaction map: A machine learning approach. J Cheminform 2011; 3:22. [PMID: 21708012 PMCID: PMC3151211 DOI: 10.1186/1758-2946-3-22] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/21/2010] [Accepted: 06/27/2011] [Indexed: 11/26/2022] Open
Abstract
Background: We present a machine learning approach to the problem of protein ligand interaction prediction. We focus on a set of binding data obtained from 113 different protein kinases and 20 inhibitors. It was attained through ATP site-dependent binding competition assays and constitutes the first available dataset of this kind. We extract information about the investigated molecules from various data sources to obtain an informative set of features. Results: A Support Vector Machine (SVM) as well as a decision tree algorithm (C5/See5) is used to learn models based on the available features which in turn can be used for the classification of new kinase-inhibitor pair test instances. We evaluate our approach using different feature sets and parameter settings for the employed classifiers. Moreover, the paper introduces a new way of evaluating predictions in such a setting, where different amounts of information about the binding partners can be assumed to be available for training. Results on an external test set are also provided. Conclusions: In most of the cases, the presented approach clearly outperforms the baseline methods used for comparison. Experimental results indicate that the applied machine learning methods are able to detect a signal in the data and predict binding affinity to some extent. For SVMs, the binding prediction can be improved significantly by using features that describe the active site of a kinase. For C5, besides diversity in the feature set, alignment scores of conserved regions turned out to be very useful.
Affiliation(s)
- Fabian Buchwald
- Institut für Informatik, Technische Universität München, Boltzmannstr. 3, 85748 Garching bei München, Germany.
50
Meslamani J, Rognan D. Enhancing the Accuracy of Chemogenomic Models with a Three-Dimensional Binding Site Kernel. J Chem Inf Model 2011; 51:1593-603. [DOI: 10.1021/ci200166t] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Affiliation(s)
- Jamel Meslamani
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France
- Didier Rognan
- Structural Chemogenomics, Laboratory of Therapeutical Innovation, UMR 7200 CNRS, University of Strasbourg, F-67400 Illkirch, France