26
Petrone PM, Simms B, Nigsch F, Lounkine E, Kutchukian P, Cornett A, Deng Z, Davies JW, Jenkins JL, Glick M. Rethinking molecular similarity: comparing compounds on the basis of biological activity. ACS Chem Biol 2012; 7:1399-409. [PMID: 22594495 DOI: 10.1021/cb3001028] [Citation(s) in RCA: 108] [Impact Index Per Article: 9.0] [Indexed: 01/20/2023]
Abstract
Since the advent of high-throughput screening (HTS), there has been an urgent need for methods that facilitate the interrogation of large-scale chemical biology data to build a mode of action (MoA) hypothesis. This can be done either prior to the HTS by subset design of compounds with known MoA or post HTS by data annotation and mining. To enable this process, we developed a tool that compares compounds solely on the basis of their bioactivity: the chemical biological descriptor "high-throughput screening fingerprint" (HTS-FP). In the current embodiment, data are aggregated from 195 biochemical and cell-based assays developed at Novartis and can be used to identify bioactivity relationships among the in-house collection comprising ~1.5 million compounds. We demonstrate the value of the HTS-FP for virtual screening and in particular scaffold hopping. HTS-FP outperforms state of the art methods in several aspects, retrieving bioactive compounds with remarkable chemical dissimilarity to a probe structure. We also apply HTS-FP for the design of screening subsets in HTS. Using retrospective data, we show that a biodiverse selection of plates performs significantly better than a chemically diverse selection of plates, both in terms of number of hits and diversity of chemotypes retrieved. This is also true in the case of hit expansion predictions using HTS-FP similarity. Sets of compounds clustered with HTS-FP are biologically meaningful, in the sense that these clusters enrich for genes and gene ontology (GO) terms, showing that compounds that are bioactively similar also tend to target proteins that operate together in the cell. HTS-FP are valuable not only because of their predictive power but mainly because they relate compounds solely on the basis of bioactivity, harnessing the accumulated knowledge of a high-throughput screening facility toward the understanding of how compounds interact with the proteome.
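As an illustration of comparing compounds purely by bioactivity, the following is a minimal sketch and not the HTS-FP implementation. The abstract does not state the similarity measure, so Pearson correlation over shared assays is an assumption here, as are the function name, the dictionary layout, and the assay IDs.

```python
from math import sqrt


def bioactivity_similarity(fp_a, fp_b, min_shared=3):
    """Pearson correlation of two activity profiles over shared assays.

    fp_a, fp_b: dicts mapping a (hypothetical) assay ID to an activity score.
    Returns None if fewer than min_shared assays overlap or a profile is flat.
    """
    shared = sorted(set(fp_a) & set(fp_b))
    if len(shared) < min_shared:
        return None
    xs = [fp_a[k] for k in shared]
    ys = [fp_b[k] for k in shared]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # correlation is undefined for a constant profile
    return cov / sqrt(var_x * var_y)
```

Restricting the comparison to shared assays reflects the fact that not every compound is tested in every screen; a real fingerprint would also need a policy for normalizing scores across assays, which this sketch leaves out.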
27
Jenkins JL. Large-Scale QSAR in Target Prediction and Phenotypic HTS Assessment. Mol Inform 2012; 31:508-14. [PMID: 27477469 DOI: 10.1002/minf.201200002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 01/06/2012] [Accepted: 06/25/2012] [Indexed: 01/31/2023]
Abstract
The advent of in silico compound target prediction offers a potential paradigm shift in how large compound collections are understood and used strategically in high-throughput screens (HTS). Specifically, phenotypic HTS hits may be annotated both with known targets and predicted targets using large-scale QSAR models, enabling a more sophisticated hit assessment. Efforts in massive bioactivity data integration and standardization are empowering such compound-target annotations. These approaches represent a fundamental shift from the traditional role of QSAR in lead optimization and binding affinity predictions to global, probabilistic target predictions for thousands of human proteins.
28
Nigsch F, Hutz J, Cornett B, Selinger DW, McAllister G, Bandyopadhyay S, Loureiro J, Jenkins JL. Determination of minimal transcriptional signatures of compounds for target prediction. EURASIP JOURNAL ON BIOINFORMATICS & SYSTEMS BIOLOGY 2012; 2012:2. [PMID: 22574917 PMCID: PMC3386022 DOI: 10.1186/1687-4153-2012-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 01/08/2012] [Accepted: 05/10/2012] [Indexed: 11/10/2022]
Abstract
The identification of molecular target and mechanism of action of compounds is a key hurdle in drug discovery. Multiplexed techniques for bead-based expression profiling allow the measurement of transcriptional signatures of compound-treated cells in high-throughput mode. Such profiles can be used to gain insight into compounds' mode of action and the protein targets they are modulating. Through the proxy of target prediction from such gene signatures we explored important aspects of the use of transcriptional profiles to capture biological variability of perturbed cellular assays. We found that signatures derived from expression data and signatures derived from biological interaction networks performed equally well, and we showed that gene signatures can be optimised using a genetic algorithm. Gene signatures of approximately 128 genes seemed to be most generic, capturing a maximum of the perturbation inflicted on cells through compound treatment. Moreover, we found evidence for oxidative phosphorylation to be one of the most general ways to capture compound perturbation.
29
Lounkine E, Nigsch F, Jenkins JL, Glick M. Activity-Aware Clustering of High Throughput Screening Data and Elucidation of Orthogonal Structure–Activity Relationships. J Chem Inf Model 2011; 51:3158-68. [DOI: 10.1021/ci2004994] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
30
Nigsch F, Lounkine E, McCarren P, Cornett B, Glick M, Azzaoui K, Urban L, Marc P, Müller A, Hahne F, Heard DJ, Jenkins JL. Computational methods for early predictive safety assessment from biological and chemical data. Expert Opin Drug Metab Toxicol 2011; 7:1497-511. [DOI: 10.1517/17425255.2011.632632] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 02/02/2023]
31
Koutsoukas A, Simms B, Kirchmair J, Bond PJ, Whitmore AV, Zimmer S, Young MP, Jenkins JL, Glick M, Glen RC, Bender A. From in silico target prediction to multi-target drug design: current databases, methods and applications. J Proteomics 2011; 74:2554-74. [PMID: 21621023 DOI: 10.1016/j.jprot.2011.05.011] [Citation(s) in RCA: 186] [Impact Index Per Article: 14.3] [Received: 01/21/2011] [Revised: 04/10/2011] [Accepted: 05/06/2011] [Indexed: 01/31/2023]
Abstract
Given the tremendous growth of bioactivity databases, the use of computational tools to predict protein targets of small molecules has been gaining importance in recent years. Applications span a wide range, from the 'designed polypharmacology' of compounds to mode-of-action analysis. In this review, we firstly survey databases that can be used for ligand-based target prediction and which have grown tremendously in size in the past. We furthermore outline methods for target prediction that exist, both based on the knowledge of bioactivities from the ligand side and methods that can be applied in situations when a protein structure is known. Applications of successful in silico target identification attempts are discussed in detail, which were based partly or in whole on computational target predictions in the first instance. This includes the authors' own experience using target prediction tools, in this case considering phenotypic antibacterial screens and the analysis of high-throughput screening data. Finally, we will conclude with the prospective application of databases to not only predict, retrospectively, the protein targets of a small molecule, but also how to design ligands with desired polypharmacology in a prospective manner.
32
Sukuru SCK, Nigsch F, Quancard J, Renatus M, Chopra R, Brooijmans N, Mikhailov D, Deng Z, Cornett A, Jenkins JL, Hommel U, Davies JW, Glick M. A lead discovery strategy driven by a comprehensive analysis of proteases in the peptide substrate space. Protein Sci 2011; 19:2096-109. [PMID: 20799349 DOI: 10.1002/pro.490] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
We present here a comprehensive analysis of proteases in the peptide substrate space and demonstrate its applicability for lead discovery. Aligned octapeptide substrates of 498 proteases taken from the MEROPS peptidase database were used for the in silico analysis. A multiple-category naïve Bayes model, trained on the two-dimensional chemical features of the substrates, was able to classify the substrates of 365 (73%) proteases and elucidate statistically significant chemical features for each of their specific substrate positions. The positional awareness of the method allows us to identify the most similar substrate positions between proteases. Our analysis reveals that proteases from different families, based on the traditional classification (aspartic, cysteine, serine, and metallo), could have substrates that differ at the cleavage site (P1-P1') but are similar away from it. Caspase-3 (cysteine protease) and granzyme B (serine protease) are previously known examples of cross-family neighbors identified by this method. To assess whether peptide substrate similarity between unrelated proteases could reliably translate into the discovery of low molecular weight synthetic inhibitors, a lead discovery strategy was tested on two other cross-family neighbors--namely cathepsin L2 and matrix metallo proteinase 9, and calpain 1 and pepsin A. For both these pairs, a naïve Bayes classifier model trained on inhibitors of one protease could successfully enrich those of its neighbor from a different family and vice versa, indicating that this approach could be prospectively applied to lead discovery for a novel protease target with no known synthetic inhibitors.
33
Doddareddy MR, van Westen GJP, van der Horst E, Peironcely JE, Corthals F, Ijzerman AP, Emmerich M, Jenkins JL, Bender A. Chemogenomics: Looking at biology through the lens of chemistry. Stat Anal Data Min 2009. [DOI: 10.1002/sam.10046] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]
34
Scheiber J, Jenkins JL, Bender A, Milik M, Mikhailov D, Sukuru SCK, Cornett B, Whitebread S, Urban L, Davies JW, Glick M. SPREAD-exploiting chemical features that cause differential activity behavior. Stat Anal Data Min 2009. [DOI: 10.1002/sam.10036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
35
Sukuru SCK, Jenkins JL, Beckwith RE, Scheiber J, Bender A, Mikhailov D, Davies JW, Glick M. Plate-Based Diversity Selection Based on Empirical HTS Data to Enhance the Number of Hits and Their Chemical Diversity. J Biomol Screen 2009; 14:690-9. [DOI: 10.1177/1087057109335678] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/17/2022]
Abstract
Typically, screening collections of pharmaceutical companies contain more than a million compounds today. However, for certain high-throughput screening (HTS) campaigns, constraints posed by the assay throughput and/or the reagent costs make it impractical to screen the entire deck. Therefore, it is desirable to effectively screen subsets of the collection based on a hypothesis or a diversity selection. How to select compound subsets is a subject of ongoing debate. The authors present an approach based on extended connectivity fingerprints to carry out diversity selection on a per plate basis (instead of a per compound basis). HTS data from 35 Novartis screens spanning 5 target classes were investigated to assess the performance of this approach. The analysis shows that selecting a fingerprint-diverse subset of 250K compounds, representing 20% of the screening deck, would have achieved significantly higher hit rates for 86% of the screens. This measure also outperforms the Murcko scaffold-based plate selection described previously, where only 49% of the screens showed similar improvements. Strikingly, the 2-fold improvement in average hit rates observed for 3 of 5 target classes in the data set indicates a target bias of the plate (and thus compound) selection method. Even though the diverse subset selection lacks any target hypothesis, its application shows significantly better results for some targets—namely, G-protein-coupled receptors, proteases, and protein-protein interactions—but not for kinase and pathway screens. The synthetic origin of the compounds in the diverse subset appears to influence the screening hit rates. Natural products were the most diverse compound class, with significantly higher hit rates compared to the compounds from the traditional synthetic and combinatorial libraries. These results offer empirical guidelines for plate-based diversity selection to enhance hit rates, based on target class and the library type being screened. 
(Journal of Biomolecular Screening 2009:690-699)
36
Scheiber J, Jenkins JL, Sukuru SCK, Bender A, Mikhailov D, Milik M, Azzaoui K, Whitebread S, Hamon J, Urban L, Glick M, Davies JW. Mapping Adverse Drug Reactions in Chemical Space. J Med Chem 2009; 52:3103-7. [DOI: 10.1021/jm801546k] [Citation(s) in RCA: 128] [Impact Index Per Article: 8.5] [Indexed: 01/15/2023]
37
Bender A, Mikhailov D, Glick M, Scheiber J, Davies JW, Cleaver S, Marshall S, Tallarico JA, Harrington E, Cornella-Taracido I, Jenkins JL. Use of Ligand Based Models for Protein Domains To Predict Novel Molecular Targets and Applications To Triage Affinity Chromatography Data. J Proteome Res 2009; 8:2575-85. [DOI: 10.1021/pr900107z] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 12/29/2022]
38
Bender A, Jenkins JL, Scheiber J, Sukuru SCK, Glick M, Davies JW. How similar are similarity searching methods? A principal component analysis of molecular descriptor space. J Chem Inf Model 2009; 49:108-19. [PMID: 19123924 DOI: 10.1021/ci800249s] [Citation(s) in RCA: 197] [Impact Index Per Article: 13.1] [Indexed: 10/21/2022]
Abstract
Different molecular descriptors capture different aspects of molecular structures, but this effect has not yet been quantified systematically on a large scale. In this work, we calculate the similarity of 37 descriptors by repeatedly selecting query compounds and ranking the rest of the database. Euclidean distances between the rank-ordering of different descriptors are calculated to determine descriptor (as opposed to compound) similarity, followed by PCA for visualization. Four broad descriptor classes are identified, which are circular fingerprints; circular fingerprints considering counts; path-based and keyed fingerprints; and pharmacophoric descriptors. Descriptor behavior is much more defined by those four classes than the particular parametrization. Using counts instead of the presence/absence of fingerprints significantly changes descriptor behavior, which is crucial for performance of topological autocorrelation vectors, but not circular fingerprints. Four-point pharmacophores (piDAPH4) surprisingly lead to much higher retrieval rates than three-point pharmacophores (28.21% vs 19.15%) but still similar rank-ordering of compounds (retrieval of similar actives). Looking into individual rankings, circular fingerprints seem more appropriate than path-based fingerprints if complex ring systems or branching patterns are present; count-based fingerprints could be more suitable in databases with a large number of repeated subunits (amide bonds, sugar rings, terpenes). Information-based selection of diverse fingerprints for consensus scoring (ECFP4/TGD fingerprints) led only to marginal improvement over single fingerprint results. While it seems to be nontrivial to exploit orthogonal descriptor behavior to improve retrieval rates in consensus virtual screening, those descriptors still each retrieve different actives which corroborates the strategy of employing diverse descriptors individually in prospective virtual screening settings.
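The descriptor-to-descriptor comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: each descriptor is represented by the similarity scores it assigns to database compounds for one query, and the function names, tie-breaking rule, and compound IDs are hypothetical.

```python
from math import sqrt


def rank_positions(scores):
    """Map each compound ID to its rank (0 = most similar to the query).

    Ties are broken by compound ID so the ranking is deterministic.
    """
    ordered = sorted(scores, key=lambda cid: (-scores[cid], cid))
    return {cid: rank for rank, cid in enumerate(ordered)}


def descriptor_distance(scores_a, scores_b):
    """Euclidean distance between the rank orderings of two descriptors."""
    ranks_a = rank_positions(scores_a)
    ranks_b = rank_positions(scores_b)
    if ranks_a.keys() != ranks_b.keys():
        raise ValueError("both descriptors must rank the same compounds")
    return sqrt(sum((ranks_a[c] - ranks_b[c]) ** 2 for c in ranks_a))
```

Computing this distance for every pair of descriptors yields a distance matrix that could then be passed to PCA for visualization; averaging over repeated query selections, as the abstract describes, is omitted here.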
39
Scheiber J, Chen B, Milik M, Sukuru SCK, Bender A, Mikhailov D, Whitebread S, Hamon J, Azzaoui K, Urban L, Glick M, Davies JW, Jenkins JL. Gaining Insight into Off-Target Mediated Effects of Drug Candidates with a Comprehensive Systems Chemical Biology Analysis. J Chem Inf Model 2009; 49:308-17. [DOI: 10.1021/ci800344p] [Citation(s) in RCA: 132] [Impact Index Per Article: 8.8] [Indexed: 11/30/2022]
40
Abstract
Understanding the safety of newly developed compounds is a key task in each early drug discovery project. In early stages, pharmaceutical companies address this task by using so-called preclinical safety profiling, in which compounds are screened in inexpensive large-scale assays to understand possible liabilities. This process generates a large amount of binding data on various compounds against a panel of targets - usually thousands or tens of thousands of compounds profiled against approximately 100 different targets. This data matrix is highly valuable and elicits further analysis. After briefly introducing the nature of safety profiling data, we describe several computational methods used internally at Novartis to analyze it. We showcase protocols that can be used to understand compound promiscuity on a chemical structure level and protocols to evaluate the promiscuity of targets used in safety profiling. We also describe a method to quickly determine the chemical similarity of compounds active against different targets. Next, it is shown what protocols can be used to evaluate global chemical similarity of targets. The above approaches can be used either to optimize the composition of a panel of targets or to better understand certain toxicities. Finally, we will explain a simple method to elucidate hidden patterns in safety profiling data.
41
Jacoby E, Boettcher A, Mayr LM, Brown N, Jenkins JL, Kallen J, Engeloch C, Schopfer U, Furet P, Masuya K, Lisztwan J. Knowledge-based virtual screening: application to the MDM4/p53 protein-protein interaction. Methods Mol Biol 2009; 575:173-94. [PMID: 19727615 DOI: 10.1007/978-1-60761-274-2_7] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
Abstract
Chemogenomics knowledge-based drug discovery approaches aim to extract the knowledge gained from one target and to apply it for the discovery of ligands and hopefully drugs of a new target which is related to the parent target by homology or conserved molecular recognition. Herein, we demonstrate the potential of knowledge-based virtual screening by applying it to the MDM4-p53 protein-protein interaction where the MDM2-p53 protein-protein interaction constitutes the parent reference system; both systems are potentially relevant to cancer therapy. We show that a combination of virtual screening methods, including homology based similarity searching, QSAR (Quantitative Structure-Activity Relationship) methods, HTD (High Throughput Docking), and UNITY pharmacophore searching provide a successful approach to the discovery of inhibitors. The virtual screening hit list is of the magnitude of 50,000 compounds picked from the corporate compound library of approximately 1.2 million compounds. Emphasis is placed on the facts that such campaigns are only feasible because of the now existing HTCP (High throughput Cherry-Picking) automation systems in combination with robust MTS (Medium Throughput Screening) fluorescence-based assays. Given that the MDM2-p53 system constitutes the reference system, it is not surprising that significantly more and stronger hits are found for this interaction compared to the MDM4-p53 system. Novel, selective and dual hits are discovered for both systems. A hit rate analysis will be provided compared to the full HTS (High-throughput Screening).
42
Nigsch F, Bender A, Jenkins JL, Mitchell JBO. Ligand-Target Prediction Using Winnow and Naive Bayesian Algorithms and the Implications of Overall Performance Statistics. J Chem Inf Model 2008; 48:2313-25. [DOI: 10.1021/ci800079x] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Indexed: 11/29/2022]
43
Bender A, Scheiber J, Glick M, Davies JW, Azzaoui K, Hamon J, Urban L, Whitebread S, Jenkins JL. Analysis of pharmacology data and the prediction of adverse drug reactions and off-target effects from chemical structure. ChemMedChem 2008; 2:861-73. [PMID: 17477341 DOI: 10.1002/cmdc.200700026] [Citation(s) in RCA: 225] [Impact Index Per Article: 14.1] [Indexed: 11/10/2022]
Abstract
Preclinical Safety Pharmacology (PSP) attempts to anticipate adverse drug reactions (ADRs) during early phases of drug discovery by testing compounds in simple, in vitro binding assays (that is, preclinical profiling). The selection of PSP targets is based largely on circumstantial evidence of their contribution to known clinical ADRs, inferred from findings in clinical trials, animal experiments, and molecular studies going back more than forty years. In this work we explore PSP chemical space and its relevance for the prediction of adverse drug reactions. Firstly, in silico (computational) Bayesian models for 70 PSP-related targets were built, which are able to detect 93% of the ligands binding at IC50 ≤ 10 µM at an overall correct classification rate of about 94%. Secondly, employing the World Drug Index (WDI), a model for adverse drug reactions was built directly based on normalized side-effect annotations in the WDI, which does not require any underlying functional knowledge. This is, to our knowledge, the first attempt to predict adverse drug reactions across hundreds of categories from chemical structure alone. On average 90% of the adverse drug reactions observed with known, clinically used compounds were detected, at an overall correct classification rate of 92%. Drugs withdrawn from the market (Rapacuronium, Suprofen) were tested in the model and their predicted ADRs align well with known ADRs. The analysis was repeated for acetylsalicylic acid and Benperidol, which are still on the market. Importantly, features of the models are interpretable and back-projectable to chemical structure, raising the possibility of rationally engineering out adverse effects. By combining PSP and ADR models, new hypotheses linking targets and adverse effects can be proposed, and examples for the opioid mu and the muscarinic M2 receptors, as well as for cyclooxygenase-1, are presented.
It is hoped that the generation of predictive models for adverse drug reactions is able to help support early SAR to accelerate drug discovery and decrease late stage attrition in drug discovery projects. In addition, models such as the ones presented here can be used for compound profiling in all development stages.
44
Bender A, Bojanic D, Davies JW, Crisman TJ, Mikhailov D, Scheiber J, Jenkins JL, Deng Z, Hill WAG, Popov M, Jacoby E, Glick M. Which aspects of HTS are empirically correlated with downstream success? CURRENT OPINION IN DRUG DISCOVERY & DEVELOPMENT 2008; 11:327-337. [PMID: 18428086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/26/2023]
Abstract
High-throughput screening (HTS) is a well-established hit-finding approach used in the pharmaceutical industry. In this article, recent experience at Novartis with respect to factors influencing the success of HTS campaigns is discussed. An inherent measure of HTS quality could be defined by the assay Z and Z' factors, the number of hits and their biological potencies; however, such measures of quality do not always correlate with the advancement of hits to the later stages of drug discovery. Also, for many target classes, such as kinases, it is easy to identify hits, but, as a result of selectivity, intellectual property and other issues, the projects do not result in lead declarations. In this article, HTS success is defined as the fraction of HTS campaigns that advance into the later stages of drug discovery, and the major influencing factors are examined. Interestingly, screening compounds in individual wells or in mixtures did not have a major impact on the HTS success and, equally interesting, there was no difference in the progression rates of biochemical and cell-based assays. Particular target types, assay technologies, structure-activity relationships and powder availability had a much greater impact on success as defined above. In addition, significant mutual dependencies can be observed - while one assay format works well with one target type, this situation might be completely reversed for a combination of the same readout technology with a different target type. The results and opinions presented here should be regarded as groundwork, and a plethora of factors that influence the fate of a project, such as biophysical measurements, chemical attractiveness of the hits, strategic reasons and safety pharmacology, are not covered here. Nonetheless, it is hoped that this information will be used industry-wide to improve success rates in terms of hits progressing into exploratory chemistry and beyond. 
The support that can be obtained from new in silico approaches to phase transitions is also described, along with the gaps they are designed to fill.
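For readers unfamiliar with the Z' factor mentioned in this abstract, the sketch below computes it from control-well readings using the standard definition, Z' = 1 - 3(σ_pos + σ_neg) / |μ_pos - μ_neg|. The formula is the commonly used one and is not taken from the abstract itself; the function and argument names are illustrative.

```python
from statistics import mean, stdev


def z_prime(positive_controls, negative_controls):
    """Z' factor of an assay from its positive and negative control readings.

    Uses sample standard deviations; each list needs at least two values.
    """
    mu_pos = mean(positive_controls)
    mu_neg = mean(negative_controls)
    if mu_pos == mu_neg:
        raise ValueError("control means are identical; Z' is undefined")
    spread = stdev(positive_controls) + stdev(negative_controls)
    return 1.0 - 3.0 * spread / abs(mu_pos - mu_neg)
```

Values close to 1 indicate well-separated controls. As the abstract notes, such quality measures do not always correlate with hits advancing to later stages of drug discovery.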
45
Bender A, Young DW, Jenkins JL, Serrano M, Mikhailov D, Clemons PA, Davies JW. Chemogenomic data analysis: prediction of small-molecule targets and the advent of biological fingerprint. Comb Chem High Throughput Screen 2008; 10:719-31. [PMID: 18045083 DOI: 10.2174/138620707782507313] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.2] [Indexed: 11/22/2022]
Abstract
Chemogenomics comprises a systematic relationship between targets and ligands that are used as target modulators in living systems such as cells or organisms. In recent years, data on small molecule-bioactivity relationships have become increasingly available, and consequently so have the number of approaches used to translate bioactivity data into knowledge. This review will focus on two aspects of chemogenomics. Firstly, in cases such as cell-based screens, the question of which target(s) a compound is modulating in order to cause the observed phenotype is crucial. In silico target prediction tools can suggest likely biological targets of small molecules via data mining in target-annotated chemical databases. This review presents some of the current tools available for this task and shows some sample applications relevant to a pharmaceutical industry setting. These applications are the prediction of false-positives in cell-based reporter gene assays, the prediction of targets by linking bioassay data with protein domain annotations, and the direct prediction of adverse reactions. Secondly, in recent years a shift from structure-derived chemical descriptors to biological descriptors has occurred. Here, the effect of a compound on a number of biological endpoints is used to make predictions about other properties, such as putative targets, associated adverse reactions, and pathways modulated by the compound. This review further summarizes these "performance" descriptors and their applications, focusing on gene expression profiles and high-content screening data. The advent of such biological fingerprints suggests that the field of drug discovery is currently at a crossroads, where single target bioassay results are supplanted by multidimensional biological fingerprints that reflect a new awareness of biological networks and polypharmacology.
46
Scheiber J, Jenkins JL, Bender A, Whitebread S, Hamon J, Urban L, Azzaoui K, Glick M, Davies JW. Side effect profile prediction - early addressing of big pharma's worst nightmare. Chem Cent J 2008. [PMCID: PMC4236057 DOI: 10.1186/1752-153x-2-s1-s4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
47
Crisman TJ, Bender A, Milik M, Jenkins JL, Scheiber J, Sukuru SCK, Fejzo J, Hommel U, Davies JW, Glick M. “Virtual Fragment Linking”: An Approach To Identify Potent Binders from Low Affinity Fragment Hits. J Med Chem 2008; 51:2481-91. [DOI: 10.1021/jm701314u] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
48
You TH, Lee MK, Jenkins JL, Alzate O, Dean DH. Blocking binding of Bacillus thuringiensis Cry1Aa to Bombyx mori cadherin receptor results in only a minor reduction of toxicity. BMC BIOCHEMISTRY 2008; 9:3. [PMID: 18218126 PMCID: PMC2245940 DOI: 10.1186/1471-2091-9-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 03/14/2007] [Accepted: 01/24/2008] [Indexed: 11/24/2022]
Abstract
Background: Bacillus thuringiensis Cry1Aa insecticidal protein is the most active known B. thuringiensis toxin against the forest insect pest Lymantria dispar (gypsy moth); unfortunately, it is also highly toxic against the non-target insect Bombyx mori (silkworm). Results: Surface-exposed hydrophobic residues over domains II and III were targeted for site-directed mutagenesis. Substitution of a phenylalanine residue (F328) by alanine reduced binding to the Bombyx mori cadherin by 23-fold and reduced biological activity against B. mori by 4-fold, while retaining activity against Lymantria dispar. Conclusion: The results identify a novel receptor-binding epitope and demonstrate that virtual elimination of binding to cadherin BR-175 does not completely remove toxicity in the case of B. mori.
49
Crisman TJ, Parker CN, Jenkins JL, Scheiber J, Thoma M, Kang ZB, Kim R, Bender A, Nettles JH, Davies JW, Glick M. Understanding false positives in reporter gene assays: in silico chemogenomics approaches to prioritize cell-based HTS data. J Chem Inf Model 2007; 47:1319-27. [PMID: 17608469 DOI: 10.1021/ci6005504] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
High throughput screening (HTS) data is often noisy, containing both false positives and negatives. Thus, careful triaging and prioritization of the primary hit list can save time and money by identifying potential false positives before incurring the expense of followup. Of particular concern are cell-based reporter gene assays (RGAs) where the number of hits may be prohibitively high to be scrutinized manually for weeding out erroneous data. Based on statistical models built from chemical structures of 650 000 compounds tested in RGAs, we created "frequent hitter" models that make it possible to prioritize potential false positives. Furthermore, we followed up the frequent hitter evaluation with chemical structure based in silico target predictions to hypothesize a mechanism for the observed "off target" response. It was observed that the predicted cellular targets for the frequent hitters were known to be associated with undesirable effects such as cytotoxicity. More specifically, the most frequently predicted targets relate to apoptosis and cell differentiation, including kinases, topoisomerases, and protein phosphatases. The mechanism-based frequent hitter hypothesis was tested using 160 additional druglike compounds predicted by the model to be nonspecific actives in RGAs. This validation was successful (showing a 50% hit rate compared to a normal hit rate as low as 2%), and it demonstrates the power of computational models toward understanding complex relations between chemical structure and biological function.
50
Azzaoui K, Hamon J, Faller B, Whitebread S, Jacoby E, Bender A, Jenkins JL, Urban L. Modeling Promiscuity Based on in vitro Safety Pharmacology Profiling Data. ChemMedChem 2007; 2:874-80. [PMID: 17492703 DOI: 10.1002/cmdc.200700036] [Citation(s) in RCA: 149] [Impact Index Per Article: 8.8] [Indexed: 11/06/2022]
Abstract
This study describes a method for mining and modeling binding data obtained from a large panel of targets (in vitro safety pharmacology) to distinguish differences between promiscuous and selective compounds. Two naïve Bayes models for promiscuity and selectivity were generated and validated on a test set as well as publicly available drug databases. The model shows a higher score (lower promiscuity) for marketed drugs than for compounds in early development or compounds that failed during clinical development. Such models can be used in triaging high-throughput screening data or for lead optimization.