1. Kumar A, Agarwal P, Shivangi, Meena LS. Structural and functional investigation of mycobacterial HflX protein and its mutational hotspots annotation by in silico approach. GENE REPORTS 2021. [DOI: 10.1016/j.genrep.2021.101418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
2. Zhang ZM, Guan ZX, Wang F, Zhang D, Ding H. Application of Machine Learning Methods in Predicting Nuclear Receptors and their Families. Med Chem 2021; 16:594-604. [PMID: 31584374 DOI: 10.2174/1573406415666191004125551] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/20/2019] [Revised: 06/18/2019] [Accepted: 08/23/2019] [Indexed: 11/22/2022]
Abstract
Nuclear receptors (NRs) are a superfamily of ligand-dependent transcription factors that are closely related to cell development, differentiation, reproduction, homeostasis, and metabolism. According to the alignments of the conserved domains, NRs are classified into seven or eight subfamilies: (1) NR1: thyroid hormone like (thyroid hormone, retinoic acid, RAR-related orphan receptor, peroxisome proliferator activated, vitamin D3-like), (2) NR2: HNF4-like (hepatocyte nuclear factor 4, retinoic acid X, tailless-like, COUP-TF-like, USP), (3) NR3: estrogen-like (estrogen, estrogen-related, glucocorticoid-like), (4) NR4: nerve growth factor IB-like (NGFI-B-like), (5) NR5: fushi tarazu-F1 like (fushi tarazu-F1 like), (6) NR6: germ cell nuclear factor like (germ cell nuclear factor), and (7) NR0: knirps like (knirps, knirps-related, embryonic gonad protein, ODR7, trithorax) and DAX like (DAX, SHP); in the eight-subfamily scheme, NR0 is divided into (7) NR7: knirps like and (8) NR8: DAX like. Different NR families have different structural features and functions. Since the function of an NR is closely correlated with the subfamily it belongs to, it is highly desirable to identify NRs and their subfamilies rapidly and effectively. The knowledge acquired is essential for a proper understanding of normal and abnormal cellular mechanisms. With the advent of the post-genomics era, the number of proteins with known sequences has increased explosively. Conventional methods for accurately classifying the family of NRs are experimental means with high cost and low efficiency. This has created a greater need for bioinformatics tools that effectively recognize NRs and their subfamilies for the purpose of understanding their biological function. In this review, we summarized the application of machine learning methods in the prediction of NRs from different aspects. We hope that this review will provide a reference for further research on the classification of NRs and their families.
Affiliation(s)
- Zi-Mei Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Zheng-Xing Guan
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Fang Wang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Dan Zhang
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Ding
- Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
3. Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods, their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
4. Jarada TN, Rokne JG, Alhajj R. A review of computational drug repositioning: strategies, approaches, opportunities, challenges, and directions. J Cheminform 2020; 12:46. [PMID: 33431024 PMCID: PMC7374666 DOI: 10.1186/s13321-020-00450-7] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Received: 02/19/2020] [Accepted: 07/13/2020] [Indexed: 01/13/2023] [Open Access]
Abstract
Drug repositioning is the process of identifying novel therapeutic potentials for existing drugs and discovering therapies for untreated diseases. Drug repositioning, therefore, plays an important role in optimizing the pre-clinical process of developing novel drugs by saving time and cost compared to the traditional de novo drug discovery processes. Since drug repositioning relies on data for existing drugs and diseases, the enormous growth of publicly available large-scale biological, biomedical, and electronic health-related data, along with high-performance computing capabilities, has accelerated the development of computational drug repositioning approaches. Multidisciplinary researchers and scientists have carried out numerous attempts, with different degrees of efficiency and success, to computationally study the potential of repositioning drugs to identify alternative drug indications. This study reviews recent advancements in the field of computational drug repositioning. First, we highlight different drug repositioning strategies and provide an overview of frequently used resources. Second, we summarize computational approaches that are extensively used in drug repositioning studies. Third, we present different computing and experimental models to validate computational methods. Fourth, we address prospective opportunities, including a few target areas. Finally, we discuss challenges and limitations encountered in computational drug repositioning and conclude with an outline of further research directions.
Affiliation(s)
- Tamer N Jarada
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Jon G Rokne
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada
- Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, Alberta, Canada.
- Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey.
5. Wang C, Kurgan L. Review and comparative assessment of similarity-based methods for prediction of drug–protein interactions in the druggable human proteome. Brief Bioinform 2018; 20:2066-2087. [DOI: 10.1093/bib/bby069] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 05/07/2018] [Revised: 06/26/2018] [Accepted: 07/10/2018] [Indexed: 12/18/2022] [Open Access]
Abstract
Drug–protein interactions (DPIs) underlie the desired therapeutic actions and the adverse side effects of a significant majority of drugs. Computational prediction of DPIs facilitates research in drug discovery, characterization and repurposing. Similarity-based methods that do not require knowledge of protein structures are particularly suitable for druggable genome-wide predictions of DPIs. We review 35 high-impact similarity-based predictors that were published in the past decade. We group them based on three types of similarities and their combinations that they use. We discuss and compare key aspects of these methods including source databases, internal databases and their predictive models. Using our novel benchmark database, we perform comparative empirical analysis of predictive performance of seven types of representative predictors that utilize each type of similarity individually and all possible combinations of similarities. We assess predictive quality at the database-wide DPI level and we are the first to also include evaluation over individual drugs. Our comprehensive analysis shows that predictors that use more similarity types outperform methods that employ fewer similarities, and that the model combining all three types of similarities secures area under the receiver operating characteristic curve of 0.93. We offer a comprehensive analysis of sensitivity of predictive performance to intrinsic and extrinsic characteristics of the considered predictors. We find that predictive performance is sensitive to low levels of similarities between sequences of the drug targets and several extrinsic properties of the input drug structures, drug profiles and drug targets. The benchmark database and a webserver for the seven predictors are freely available at http://biomine.cs.vcu.edu/servers/CONNECTOR/.
Affiliation(s)
- Chen Wang
- Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
- Lukasz Kurgan
- Computer Science Department, Virginia Commonwealth University, Richmond, VA 23284, USA
6.
Abstract
Following the elucidation of the human genome, chemogenomics emerged in the beginning of the twenty-first century as an interdisciplinary research field with the aim to accelerate target and drug discovery by making best usage of the genomic data and the data linkable to it. What started as a systematization approach within protein target families now encompasses all types of chemical compounds and gene products. A key objective of chemogenomics is the establishment, extension, analysis, and prediction of a comprehensive SAR matrix which by application will enable further systematization in drug discovery. Herein we outline future perspectives of chemogenomics including the extension to new molecular modalities, or the potential extension beyond the pharma to the agro and nutrition sectors, and the importance for environmental protection. The focus is on computational sciences with potential applications for compound library design, virtual screening, hit assessment, analysis of phenotypic screens, lead finding and optimization, and systems biology-based prediction of toxicology and translational research.
Affiliation(s)
- Edgar Jacoby
- Janssen Research & Development, Beerse, Belgium.
- J B Brown
- Life Science Informatics Research Unit, Laboratory of Molecular Biosciences, Kyoto University Graduate School of Medicine, Kyoto, Japan
7. Lorimer T, Held J, Stoop R. Clustering: how much bias do we need? PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2017; 375:rsta.2016.0293. [PMID: 28507238 PMCID: PMC5434083 DOI: 10.1098/rsta.2016.0293] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Accepted: 12/05/2016] [Indexed: 05/05/2023]
Abstract
Scientific investigations in medicine and beyond increasingly require observations to be described by more features than can be simultaneously visualized. Simply reducing the dimensionality by projections destroys essential relationships in the data. Similarly, traditional clustering algorithms introduce data bias that prevents detection of natural structures expected from generic nonlinear processes. We examine how these problems can best be addressed, where in particular we focus on two recent clustering approaches, Phenograph and Hebbian learning clustering, applied to synthetic and natural data examples. Our results reveal that already for very basic questions, minimizing clustering bias is essential, but that results can benefit further from biased post-processing. This article is part of the themed issue 'Mathematical methods in medicine: neuroscience, cardiology and pathology'.
Affiliation(s)
- Tom Lorimer
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Jenny Held
- Eawag, Überlandstrasse 133, 8600 Dübendorf, Switzerland
- Ruedi Stoop
- Institute of Neuroinformatics, University of Zurich and ETH Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
8. Bernasconi P, Min Chen, Galasinski S, Popa-Burke I, Bobasheva A, Coudurier L, Birkos S, Hallam R, Janzen WP. A Chemogenomic Analysis of the Human Proteome: Application to Enzyme Families. Journal of Biomolecular Screening 2016; 12:972-82. [PMID: 17942790 DOI: 10.1177/1087057107306759] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 11/16/2022]
Abstract
Sequence-based phylogenies (SBP) are well-established tools for describing relationships between proteins. They have been used extensively to predict the behavior and sensitivity toward inhibitors of enzymes within a family. The utility of this approach diminishes when comparing proteins with little sequence homology. Even within an enzyme family, SBPs must be complemented by an orthogonal method that is independent of sequence to better predict enzymatic behavior. A chemogenomic approach is demonstrated here that uses the inhibition profile of a diverse library of 130,000 molecules to uncover relationships within a set of enzymes. The profile is used to construct a semimetric additive distance matrix. This matrix, in turn, defines a sequence-independent phylogeny (SIP). The method was applied to 97 enzymes (kinases, proteases, and phosphatases). SIP does not use structural information from the molecules used for establishing the profile, thus providing a more heuristic method than the current approaches, which require knowledge of the specific inhibitor's structure. Within enzyme families, SIP shows a good overall correlation with SBP. More interestingly, SIP uncovers distances within families that are not recognizable by sequence-based methods. In addition, SIP allows the determination of distance between enzymes with no sequence homology, thus uncovering novel relationships not predicted by SBP. This chemogenomic approach, used in conjunction with SBP, should prove to be a powerful tool for choosing target combinations for drug discovery programs as well as for guiding the selection of profiling and liability targets. (Journal of Biomolecular Screening 2007:972-982)
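As a rough illustration of the idea in this abstract, the sketch below builds a pairwise distance matrix from inhibition profiles and picks the closest pair of enzymes. The enzyme names, the percent-inhibition values, and the use of Euclidean distance are all assumptions made for illustration; the abstract does not state which distance the authors used or how the phylogeny was derived from the matrix.

```python
from itertools import combinations
from math import sqrt

# Hypothetical percent-inhibition profiles, one value per library compound.
profiles = {
    "kinase_A": [90.0, 10.0, 80.0, 5.0],
    "kinase_B": [85.0, 15.0, 75.0, 10.0],
    "protease_C": [5.0, 95.0, 10.0, 90.0],
}


def profile_distance(p, q):
    """Euclidean distance between two profiles (an assumed metric)."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def distance_matrix(profiles):
    """Pairwise distances keyed by (name_1, name_2) with names sorted."""
    return {
        (a, b): profile_distance(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


matrix = distance_matrix(profiles)
# The pair with the smallest distance would be joined first in a tree.
closest_pair = min(matrix, key=matrix.get)
print(closest_pair, round(matrix[closest_pair], 2))
```

No sequence information enters the calculation, which is the point of a sequence-independent comparison: two enzymes are close when the same compounds inhibit them to a similar degree.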
Affiliation(s)
- Min Chen
- Amphora Discovery Corporation, Durham, North Carolina
- Steve Birkos
- Amphora Discovery Corporation, Durham, North Carolina
- Rhonda Hallam
- Amphora Discovery Corporation, Durham, North Carolina
9.
Abstract
Computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based methods. Structure-based methods are in principle analogous to high-throughput screening in that both target and ligand structure information is imperative. Structure-based approaches include ligand docking, pharmacophore, and ligand design methods. The article discusses theory behind the most important methods and recent successful applications. Ligand-based methods use only ligand information for predicting activity depending on its similarity/dissimilarity to previously known active ligands. We review widely used ligand-based methods such as ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships. In addition, important tools such as target/ligand data bases, homology modeling, ligand fingerprint methods, etc., necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign are discussed. Finally, computational methods for toxicity prediction and optimization for favorable physiologic properties are discussed with successful examples from literature.
Affiliation(s)
- Gregory Sliwoski
- Jr., Center for Structural Biology, 465 21st Ave South, BIOSCI/MRBIII, Room 5144A, Nashville, TN 37232-8725.
10. Rognan D. Towards the Next Generation of Computational Chemogenomics Tools. Mol Inform 2013; 32:1029-34. [PMID: 27481148 DOI: 10.1002/minf.201300054] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 03/28/2013] [Accepted: 06/11/2013] [Indexed: 01/07/2023]
Affiliation(s)
- D Rognan
- UMR 7200 CNRS-Université de Strasbourg, MEDALIS Drug Discovery Center, 74 route du Rhin, 67400, Illkirch, France.
11. Azzaoui K, Jacoby E, Senger S, Rodríguez EC, Loza M, Zdrazil B, Pinto M, Williams AJ, de la Torre V, Mestres J, Pastor M, Taboureau O, Rarey M, Chichester C, Pettifer S, Blomberg N, Harland L, Williams-Jones B, Ecker GF. Scientific competency questions as the basis for semantically enriched open pharmacological space development. Drug Discov Today 2013; 18:843-52. [DOI: 10.1016/j.drudis.2013.05.008] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 08/24/2012] [Revised: 04/17/2013] [Accepted: 05/14/2013] [Indexed: 10/26/2022]
12. Papp A, Szommer T, Barna L, Gyimesi G, Ferdinandy P, Spadoni C, Darvas F, Fujita T, Urge L, Dormán G. Enhanced hit-to-lead process using bioanalogous lead evolution and chemogenomics: application in designing selective matrix metalloprotease inhibitors. Expert Opin Drug Discov 2013; 2:707-23. [PMID: 23488960 DOI: 10.1517/17460441.2.5.707] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 01/20/2023]
Abstract
The authors describe an innovative approach for designing novel inhibitors. This approach effectively integrates the emerging chemogenomics concept of target-family-based drug discovery with bioanalogous design strategies, including privileged structures, molecular frameworks as well as bioisosteric and bioanalogous/isofunctional modifications. The authors applied this method in the design of selective inhibitors of matrix metalloproteases (MMPs), also referred to as matrixins, on the basis of a unique analysis of the ligand-target knowledge base, the 'matrixinome'. For this analysis, the authors created an annotated MMP database containing ∼ 300 inhibitors with their published activity profile. The ligand space was then arranged into a lead evolution tree, where the substructural transformations in each virtual step led to marked changes in the activity pattern. This allowed subtype-specific privileged fragments to be extracted as well as modifications, which improve activity and/or selectivity. Furthermore, the compounds with the preferred activity profile were correlated with sequence homology as well as binding site similarity within the target family, thereby leading to the identification of substructural modifications that turn non-selective, biohomologous structures into selective inhibitors. The matrixinomic application of the authors' approach, therefore, provides an example of how the combination of ligand space knowledge with sequence-related data can radically improve the outcome of the lead optimisation process to achieve higher selectivity within a given target family.
Affiliation(s)
- Akos Papp
- AMRI Hungary, Inc., Záhony utca 7, 1031 Budapest, Hungary; +361 6666 129; +361 6666 110
13. Ma DL, Chan DSH, Leung CH. Drug repositioning by structure-based virtual screening. Chem Soc Rev 2013; 42:2130-41. [PMID: 23288298 DOI: 10.1039/c2cs35357a] [Citation(s) in RCA: 162] [Impact Index Per Article: 14.7] [Indexed: 12/11/2022]
Abstract
Approved drugs have favourable or validated pharmacokinetic properties and toxicological profiles, and the repositioning of existing drugs for new indications can potentially avoid expensive costs associated with early-stage testing of the hit compounds. In recent years, technological advances in virtual screening methodologies have allowed medicinal chemists to rapidly screen drug libraries for therapeutic activity against new biomolecular targets in a cost-effective manner. This review article outlines the basic principles and recent advances in structure-based virtual screening and highlights the powerful synergy of in silico techniques in drug repositioning as demonstrated in several recent reports.
Affiliation(s)
- Dik-Lung Ma
- Department of Chemistry, Hong Kong Baptist University, Kowloon Tong, Hong Kong, China.
14. Eriksson M, Nilsson I, Kogej T, Southan C, Johansson M, Tyrchan C, Muresan S, Blomberg N, Bjäreland M. SARConnect: A Tool to Interrogate the Connectivity Between Proteins, Chemical Structures and Activity Data. Mol Inform 2012; 31:555-568. [PMID: 23308082 PMCID: PMC3535785 DOI: 10.1002/minf.201200030] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 03/29/2012] [Accepted: 04/14/2012] [Indexed: 11/21/2022]
Abstract
The access and use of large-scale structure-activity relationships (SAR) is increasing as the range of targets and availability of bioactive compound-to-protein mappings expands. However, effective exploitation requires merging and normalisation of activity data, mappings to target classifications as well as visual display of chemical structure relationships. This work describes the development of the application "SARConnect" to address these issues. We discuss options for delivery and analysis of large-scale SAR data together with a set of use-cases to illustrate the design choices and utility. The main activity sources of ChEMBL [1], GOSTAR [2] and AstraZeneca's internal system IBIS had already been integrated in Chemistry Connect [3]. For target relationships we selected human UniProtKB/Swiss-Prot [4] as our primary source of a heuristic target classification. Similarly, to explore chemical relationships we combined several methods for framework and scaffold analysis into a unified, hierarchical classification where ease of navigation was the primary goal. An application was built on TIBCO Spotfire to retrieve data for visual display. Consequently, users can explore relationships between target, activity and structure across internal, external and commercial sources that encompass approximately 3 million compounds, 2000 human proteins and 10 million activity values. Examples showing the utility of the application are given.
Affiliation(s)
- Mats Eriksson
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
- Thierry Kogej
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
- Sorel Muresan
- Discovery Sciences, Computational Sciences, AstraZeneca R&D Mölndal, S-431 83 Mölndal, Sweden
15. Pérez-Nueno VI, Venkatraman V, Mavridis L, Ritchie DW. Detecting Drug Promiscuity Using Gaussian Ensemble Screening. J Chem Inf Model 2012; 52:1948-61. [DOI: 10.1021/ci3000979] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/28/2022]
Affiliation(s)
- Violeta I. Pérez-Nueno
- INRIA Nancy − Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
- Vishwesh Venkatraman
- INRIA Nancy − Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
- Lazaros Mavridis
- INRIA Nancy − Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
- David W. Ritchie
- INRIA Nancy − Grand Est, 615 rue du Jardin Botanique, 54506 Vandoeuvre-lès-Nancy, France
16. Schuffenhauer A. Computational methods for scaffold hopping. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2012. [DOI: 10.1002/wcms.1106] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 12/11/2022]
17. Williams AJ, Ekins S, Tkachenko V. Towards a gold standard: regarding quality in public domain chemistry databases and approaches to improving the situation. Drug Discov Today 2012; 17:685-701. [PMID: 22426180 DOI: 10.1016/j.drudis.2012.02.013] [Citation(s) in RCA: 86] [Impact Index Per Article: 7.2] [Received: 11/18/2011] [Revised: 01/17/2012] [Accepted: 02/28/2012] [Indexed: 01/25/2023]
Abstract
In recent years there has been a dramatic increase in the number of freely accessible online databases serving the chemistry community. The internet provides chemistry data that can be used for data-mining, for computer models, and integration into systems to aid drug discovery. There is however a responsibility to ensure that the data are high quality to ensure that time is not wasted in erroneous searches, that models are underpinned by accurate data and that improved discoverability of online resources is not marred by incorrect data. In this article we provide an overview of some of the experiences of the authors using online chemical compound databases, critique the approaches taken to assemble data and we suggest approaches to deliver definitive reference data sources.
Affiliation(s)
- Antony J Williams
- Royal Society of Chemistry, US Office, 904 Tamaras Circle, Wake Forest, NC 27587, USA.
18. Sukumar N, Krein MP, Embrechts MJ. Predictive cheminformatics in drug discovery: statistical modeling for analysis of micro-array and gene expression data. Methods Mol Biol 2012; 910:165-94. [PMID: 22821597 DOI: 10.1007/978-1-61779-965-5_9] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 01/23/2023]
Abstract
The vast amounts of chemical and biological data available through robotic high-throughput assays and micro-array technologies require computational techniques for visualization, analysis, and predictive modeling. Predictive cheminformatics and bioinformatics employ statistical methods to mine this data for hidden correlations and to retrieve molecules or genes with desirable biological activity from large databases, for the purpose of drug development. While many statistical methods are commonly employed and widely accessible, their proper use involves due consideration to data representation and preprocessing, model validation and domain of applicability estimation, similarity assessment, the nature of the structure-activity landscape, and model interpretation. This chapter seeks to review these considerations in light of the current state of the art in statistical modeling and to summarize the best practices in predictive cheminformatics.
Affiliation(s)
- N Sukumar
- Rensselaer Exploratory Center for Cheminformatics Research and Department of Chemistry and Chemical Biology, Rensselaer Polytechnic Institute, Troy, NY, USA.
19. Orchard S, Al-Lazikani B, Bryant S, Clark D, Calder E, Dix I, Engkvist O, Forster M, Gaulton A, Gilson M, Glen R, Grigorov M, Hammond-Kosack K, Harland L, Hopkins A, Larminie C, Lynch N, Mann RK, Murray-Rust P, Lo Piparo E, Southan C, Steinbeck C, Wishart D, Hermjakob H, Overington J, Thornton J. Minimum information about a bioactive entity (MIABE). Nat Rev Drug Discov 2011; 10:661-9. [DOI: 10.1038/nrd3503] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.6] [Indexed: 11/09/2022]
20. Kinnings SL, Jackson RM. ReverseScreen3D: a structure-based ligand matching method to identify protein targets. J Chem Inf Model 2011; 51:624-34. [PMID: 21361385 DOI: 10.1021/ci1003174] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
Abstract
Ligand promiscuity, which is now recognized as an extremely common phenomenon, is a major underlying cause of drug toxicity. We have developed a new reverse virtual screening (VS) method called ReverseScreen3D, which can be used to predict the potential protein targets of a query compound of interest. The method uses a 2D fingerprint-based method to select a ligand template from each unique binding site of each protein within a target database. The target database contains only the structurally determined bioactive conformations of known ligands. The 2D comparison is followed by a 3D structural comparison to the selected query ligand using a geometric matching method, in order to prioritize each target binding site in the database. We have evaluated the performance of the ReverseScreen2D and 3D methods using a diverse set of small molecule protein inhibitors known to have multiple targets, and have shown that they are able to provide a highly significant enrichment of true targets in the database. Furthermore, we have shown that the 3D structural comparison improves early enrichment when compared with the 2D method alone, and that the 3D method performs well even in the absence of 2D similarity to the template ligands. By carrying out further experimental screening on the prioritized list of targets, it may be possible to determine the potential targets of a new compound or determine the off-targets of an existing drug. The ReverseScreen3D method has been incorporated into a Web server, which is freely available at http://www.modelling.leeds.ac.uk/ReverseScreen3D.
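The 2D template-selection step described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the ReverseScreen3D implementation: the Tanimoto coefficient is an assumed similarity measure (the abstract only says "2D fingerprint-based"), fingerprints are modelled as sets of on-bit indices, and all site and ligand names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient for fingerprints given as sets of on-bit indices."""
    union = fp_a | fp_b
    if not union:
        return 0.0
    return len(fp_a & fp_b) / len(union)


def select_templates(query_fp, sites):
    """For each binding site, keep the known ligand most similar to the query.

    `sites` maps a site name to {ligand name: fingerprint}.
    Returns {site name: (best ligand name, similarity score)}.
    """
    templates = {}
    for site, ligands in sites.items():
        best = max(ligands, key=lambda name: tanimoto(query_fp, ligands[name]))
        templates[site] = (best, tanimoto(query_fp, ligands[best]))
    return templates


# Hypothetical query compound and target database.
query = {1, 2, 3, 4}
sites = {
    "site_1": {"lig_a": {1, 2, 3, 9}, "lig_b": {7, 8}},
    "site_2": {"lig_c": {1, 2, 3, 4, 5}},
}

templates = select_templates(query, sites)
for site, (ligand, score) in sorted(templates.items()):
    print(site, ligand, round(score, 2))
```

In the workflow the abstract describes, each selected template would then go through a 3D geometric comparison with the query; that step is not sketched here.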
Affiliation(s)
- Sarah L Kinnings
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
|
21
|
Jacoby E. Computational chemogenomics. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011. [DOI: 10.1002/wcms.11] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 02/05/2023]
Affiliation(s)
- Edgar Jacoby
- Novartis Institutes for BioMedical Research, Center for Proteomic Chemistry, Forum 1, Novartis Campus, Basel, Switzerland
|
22
|
Schnur DM, Beno BR, Tebben AJ, Cavallaro C. Methods for combinatorial and parallel library design. Methods Mol Biol 2011; 672:387-434. [PMID: 20838978 DOI: 10.1007/978-1-60761-839-3_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
Diversity has historically played a critical role in design of combinatorial libraries, screening sets and corporate collections for lead discovery. Large library design dominated the field in the 1990s with methods ranging anywhere from purely arbitrary through property based reagent selection to product based approaches. In recent years, however, there has been a downward trend in library size. This was due to increased information about the desirable targets gleaned from the genomics revolution and to the ever growing availability of target protein structures from crystallography and homology modeling. Creation of libraries directed toward families of receptors such as GPCRs, kinases, nuclear hormone receptors, proteases, etc., replaced the generation of libraries based primarily on diversity while single target focused library design has remained an important objective. Concurrently, computing grids and cpu clusters have facilitated the development of structure based tools that screen hundreds of thousands of molecules. Smaller "smarter" combinatorial and focused parallel libraries replaced those early un-focused large libraries in the twenty-first century drug design paradigm. While diversity still plays a role in lead discovery, the focus of current library design methods has shifted to receptor based methods, scaffold hopping/bio-isostere searching, and a much needed emphasis on synthetic feasibility. Methods such as "privileged substructures based design" and pharmacophore based design still are important methods for parallel and small combinatorial library design. This chapter discusses some of the possible design methods and presents examples where they are available.
Affiliation(s)
- Dora M Schnur
- Computer Aided Drug Design, Pharmaceutical Research Institute, Bristol-Myers Squibb Company, Princeton, NJ, USA
|
23
|
Affiliation(s)
- David Gurwitz
- Department of Human Molecular Genetics and Biochemistry, Sackler Faculty of Medicine, Tel‐Aviv University, Tel‐Aviv, Israel
|
24
|
Abstract
Molecular biology now dominates pharmacology so thoroughly that it is difficult to recall that only a generation ago the field was very different. To understand drug action today, we characterize the targets through which they act and new drug leads are discovered on the basis of target structure and function. Until the mid-1980s the information often flowed in reverse: investigators began with organic molecules and sought targets, relating receptors not by sequence or structure but by their ligands. Recently, investigators have returned to this chemical view of biology, bringing to it systematic and quantitative methods of relating targets by their ligands. This has allowed the discovery of new targets for established drugs, suggested the bases for their side effects, and predicted the molecular targets underlying phenotypic screens. The bases for these new methods, some of their successes and liabilities, and new opportunities for their use are described.
Affiliation(s)
- Michael J Keiser
- Department of Pharmaceutical Chemistry, University of California-San Francisco, 1700 4th Street, San Francisco, CA 94158-2558, USA
|
25
|
Hailemariam L, Venkatasubramanian V. Purdue Ontology for Pharmaceutical Engineering: Part I. Conceptual Framework. J Pharm Innov 2010. [DOI: 10.1007/s12247-010-9081-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 10/19/2022]
|
26
|
Keiser MJ, Setola V, Irwin JJ, Laggner C, Abbas AI, Hufeisen SJ, Jensen NH, Kuijer MB, Matos RC, Tran TB, Whaley R, Glennon RA, Hert J, Thomas KLH, Edwards DD, Shoichet BK, Roth BL. Predicting new molecular targets for known drugs. Nature 2009; 462:175-81. [PMID: 19881490 PMCID: PMC2784146 DOI: 10.1038/nature08506] [Citation(s) in RCA: 1139] [Impact Index Per Article: 75.9] [Received: 04/07/2009] [Accepted: 09/14/2009] [Indexed: 12/18/2022]
Abstract
Whereas drugs are intended to be selective, at least some bind to several physiologic targets, explaining both side effects and efficacy. As many drug-target combinations exist, it would be useful to explore possible interactions computationally. Here, we compared 3,665 FDA-approved and investigational drugs against hundreds of targets, defining each target by its ligands. Chemical similarities between drugs and ligand sets predicted thousands of unanticipated associations. Thirty were tested experimentally, including the antagonism of the β1 receptor by the transporter inhibitor Prozac, the inhibition of the 5-HT transporter by the ion channel drug Vadilex, and antagonism of the histamine H4 receptor by the enzyme inhibitor Rescriptor. Overall, 23 new drug-target associations were confirmed, five of which were potent (< 100 nM). The physiological relevance of one such, the drug DMT on serotonergic receptors, was confirmed in a knock-out mouse. The chemical similarity approach is systematic and comprehensive, and may suggest side-effects and new indications for many drugs.
Affiliation(s)
- Michael J Keiser
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th Street, San Francisco, California 94143-2550, USA
|
27
|
Adams JC, Keiser MJ, Basuino L, Chambers HF, Lee DS, Wiest OG, Babbitt PC. A mapping of drug space from the viewpoint of small molecule metabolism. PLoS Comput Biol 2009; 5:e1000474. [PMID: 19701464 PMCID: PMC2727484 DOI: 10.1371/journal.pcbi.1000474] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Received: 04/09/2009] [Accepted: 07/16/2009] [Indexed: 12/25/2022] Open
Abstract
Small molecule drugs target many core metabolic enzymes in humans and pathogens, often mimicking endogenous ligands. The effects may be therapeutic or toxic, but are frequently unexpected. A large-scale mapping of the intersection between drugs and metabolism is needed to better guide drug discovery. To map the intersection between drugs and metabolism, we have grouped drugs and metabolites by their associated targets and enzymes using ligand-based set signatures created to quantify their degree of similarity in chemical space. The results reveal the chemical space that has been explored for metabolic targets, where successful drugs have been found, and what novel territory remains. To aid other researchers in their drug discovery efforts, we have created an online resource of interactive maps linking drugs to metabolism. These maps predict the "effect space" comprising likely target enzymes for each of the 246 MDDR drug classes in humans. The online resource also provides species-specific interactive drug-metabolism maps for each of the 385 model organisms and pathogens in the BioCyc database collection. Chemical similarity links between drugs and metabolites predict potential toxicity, suggest routes of metabolism, and reveal drug polypharmacology. The metabolic maps enable interactive navigation of the vast biological data on potential metabolic drug targets and the drug chemistry currently available to prosecute those targets. Thus, this work provides a large-scale approach to ligand-based prediction of drug action in small molecule metabolism.
Affiliation(s)
- James Corey Adams
- Graduate Program in Pharmaceutical Sciences and Pharmacogenomics, University of California, San Francisco, California, United States of America
- Michael J. Keiser
- Graduate Program in Bioinformatics, University of California, San Francisco, California, United States of America
- Li Basuino
- San Francisco General Hospital, University of California San Francisco, San Francisco, California, United States of America
- Henry F. Chambers
- San Francisco General Hospital, University of California San Francisco, San Francisco, California, United States of America
- Deok-Sun Lee
- Center for Complex Network Research and Departments of Physics, Biology, and Computer Science, Northeastern University, Boston, Massachusetts, United States of America
- Center for Cancer Systems Biology, Dana-Farber Cancer Institute, Boston, Massachusetts, United States of America
- Department of Natural Medical Sciences, Inha University, Incheon, Korea
- Olaf G. Wiest
- Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Patricia C. Babbitt
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California, United States of America
- Department of Pharmaceutical Chemistry, University of California, San Francisco, California, United States of America
- California Institute for Quantitative Biosciences, University of California, San Francisco, California, United States of America
|
28
|
Geppert H, Humrich J, Stumpfe D, Gärtner T, Bajorath J. Ligand prediction from protein sequence and small molecule information using support vector machines and fingerprint descriptors. J Chem Inf Model 2009; 49:767-79. [PMID: 19309114 DOI: 10.1021/ci900004a] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.9] [Indexed: 12/21/2022]
Abstract
Support vector machine (SVM) database search strategies are presented that aim at the identification of small molecule ligands for targets for which no ligand information is currently available. In pharmaceutical research and chemical biology, this situation is faced, for example, when studying orphan targets or newly identified members of protein families. To investigate methods for de novo ligand identification in the absence of known three-dimensional target structures or active molecules, we have focused on combining sequence and ligand information for closely and distantly related proteins. To provide a basis for these investigations, a set of 11 protease targets from different families was assembled together with more than 2000 inhibitors directed against individual proteases. We have compared SVM approaches that combine protein sequence and ligand information in different ways and utilize 2D fingerprints as ligand descriptors. These methodologies were applied to search for inhibitors of individual proteases not taken into account during learning. A target sequence-ligand kernel and, in particular, a linear combination of multiple target-directed SVMs consistently identified inhibitors with high accuracy including test cases where homology-based similarity searching using data fusion and conventional SVM ranking nearly or completely failed. The SVM linear combination and target-ligand kernel methods described herein are intuitive and straightforward to adopt for ligand prediction against other targets.
Affiliation(s)
- Hanna Geppert
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany
|
29
|
Weill N, Rognan D. Development and Validation of a Novel Protein−Ligand Fingerprint To Mine Chemogenomic Space: Application to G Protein-Coupled Receptors and Their Ligands. J Chem Inf Model 2009; 49:1049-62. [DOI: 10.1021/ci800447g] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 11/28/2022]
Affiliation(s)
- Nathanael Weill
- Structural Chemogenomics Group, Laboratory of Therapeutic Innovation, UMR 7200 CNRS-UdS (Université de Strasbourg), 74 route du Rhin, B.P.24, F-67400 Illkirch, France
- Didier Rognan
- Structural Chemogenomics Group, Laboratory of Therapeutic Innovation, UMR 7200 CNRS-UdS (Université de Strasbourg), 74 route du Rhin, B.P.24, F-67400 Illkirch, France
|
30
|
Jacob L, Hoffmann B, Stoven V, Vert JP. Virtual screening of GPCRs: an in silico chemogenomics approach. BMC Bioinformatics 2008; 9:363. [PMID: 18775075 PMCID: PMC2553090 DOI: 10.1186/1471-2105-9-363] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Received: 04/03/2008] [Accepted: 09/06/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The G-protein coupled receptor (GPCR) superfamily is currently the largest class of therapeutic targets. In silico prediction of interactions between GPCRs and small molecules in the transmembrane ligand-binding site is therefore a crucial step in the drug discovery process, which remains a daunting task due to the difficulty to characterize the 3D structure of most GPCRs, and to the limited amount of known ligands for some members of the superfamily. Chemogenomics, which attempts to characterize interactions between all members of a target class and all small molecules simultaneously, has recently been proposed as an interesting alternative to traditional docking or ligand-based virtual screening strategies.
RESULTS: We show that interaction prediction in the chemogenomics framework outperforms state-of-the-art individual ligand-based methods in accuracy both for receptor with known ligands and without known ligands. This is done with no knowledge of the receptor 3D structure. In particular we are able to predict ligands of orphan GPCRs with an estimated accuracy of 78.1%.
CONCLUSION: We propose new methods for in silico chemogenomics and validate them on the virtual screening of GPCRs. The methods represent an extension of a recently proposed machine learning strategy, based on support vector machines (SVM), which provides a flexible framework to incorporate various information sources on the biological space of targets and on the chemical space of small molecules. We investigate the use of 2D and 3D descriptors for small molecules, and test a variety of descriptors for GPCRs. We show that incorporating information about the known hierarchical classification of the target family and about key residues in their inferred binding pockets significantly improves the prediction accuracy of our model.
Affiliation(s)
- Laurent Jacob
- Mines ParisTech, Centre for Computational Biology, 35 rue Saint-Honoré, F-77305, Fontainebleau, France.
|
31
|
Hert J, Keiser MJ, Irwin JJ, Oprea TI, Shoichet BK. Quantifying the relationships among drug classes. J Chem Inf Model 2008; 48:755-65. [PMID: 18335977 PMCID: PMC2722950 DOI: 10.1021/ci8000259] [Citation(s) in RCA: 135] [Impact Index Per Article: 8.4] [Indexed: 01/19/2023]
Abstract
The similarity of drug targets is typically measured using sequence or structural information. Here, we consider chemo-centric approaches that measure target similarity on the basis of their ligands, asking how chemoinformatics similarities differ from those derived bioinformatically, how stable the ligand networks are to changes in chemoinformatics metrics, and which network is the most reliable for prediction of pharmacology. We calculated the similarities between hundreds of drug targets and their ligands and mapped the relationship between them in a formal network. Bioinformatics networks were based on the BLAST similarity between sequences, while chemoinformatics networks were based on the ligand-set similarities calculated with either the Similarity Ensemble Approach (SEA) or a method derived from Bayesian statistics. By multiple criteria, bioinformatics and chemoinformatics networks differed substantially, and only occasionally did a high sequence similarity correspond to a high ligand-set similarity. In contrast, the chemoinformatics networks were stable to the method used to calculate the ligand-set similarities and to the chemical representation of the ligands. Also, the chemoinformatics networks were more natural and more organized, by network theory, than their bioinformatics counterparts: ligand-based networks were found to be small-world and broad-scale.
Affiliation(s)
- Jérôme Hert
- Department of Pharmaceutical Chemistry, University of California—San Francisco, 1700 4th St., San Francisco, California 94143-2550
- Michael J. Keiser
- Department of Pharmaceutical Chemistry, University of California—San Francisco, 1700 4th St., San Francisco, California 94143-2550
- John J. Irwin
- Department of Pharmaceutical Chemistry, University of California—San Francisco, 1700 4th St., San Francisco, California 94143-2550
- Tudor I. Oprea
- Division of Biocomputing, MSC11 6145, University of New Mexico School of Medicine, 2703 Frontier NE, Albuquerque, New Mexico 87131
- Brian K. Shoichet
- Department of Pharmaceutical Chemistry, University of California—San Francisco, 1700 4th St., San Francisco, California 94143-2550
|
32
|
Brown RD, Rogers D. Is learning drugs the same as learning non-drugs? Chem Cent J 2008. [PMCID: PMC4236220 DOI: 10.1186/1752-153x-2-s1-s5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
|
33
|
Vogt I, Stumpfe D, Ahmed HEA, Bajorath J. Methods for computer-aided chemical biology. Part 2: Evaluation of compound selectivity using 2D molecular fingerprints. Chem Biol Drug Des 2007; 70:195-205. [PMID: 17718714 DOI: 10.1111/j.1747-0285.2007.00555.x] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.9] [Indexed: 12/30/2022]
Abstract
We analyze 558 compounds with selectivity against members of different protein families using two-dimensional molecular fingerprint methods. The calculations target compounds selective for 13 targets belonging to three families. These compound sets were especially designed for selectivity studies. The identification of compounds displaying different selectivity patterns against related protein targets is a prerequisite for chemical genetics and genomics applications to specifically interfere with functions of individual members of protein families. Thus far, computational methods have only little impact on the search for selective compounds. This is in part due to the fact that selectivity is more difficult to study computationally than activity because selectivity analysis requires the evaluation of compounds binding to multiple targets. Here, we investigate the ability of state-of-the-art two-dimensional molecular fingerprints to detect compounds having different selectivity. The results of systematic similarity search calculations reveal that two-dimensional fingerprints are capable of identifying compounds having different selectivity against closely related target proteins, although fingerprints were originally not developed for such applications. In addition to target-selective molecules, fingerprints are also found to preferentially recognize compounds that are active at the target family level. Our findings suggest that similarity methods should merit further exploration in the study of compound selectivity across target families.
Affiliation(s)
- Ingo Vogt
- Department of Life Science Informatics, B-IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Dahlmannstr. 2, D-53113 Bonn, Germany
|
34
|
Affiliation(s)
- Daniel P Walsh
- Department of Chemistry, New York University, New York, New York 10003, USA
|
35
|
Ekins S, Mestres J, Testa B. In silico pharmacology for drug discovery: methods for virtual ligand screening and profiling. Br J Pharmacol 2007; 152:9-20. [PMID: 17549047 PMCID: PMC1978274 DOI: 10.1038/sj.bjp.0707305] [Citation(s) in RCA: 399] [Impact Index Per Article: 23.5] [Indexed: 02/07/2023] Open
Abstract
Pharmacology over the past 100 years has had a rich tradition of scientists with the ability to form qualitative or semi-quantitative relations between molecular structure and activity in cerebro. To test these hypotheses they have consistently used traditional pharmacology tools such as in vivo and in vitro models. Increasingly over the last decade however we have seen that computational (in silico) methods have been developed and applied to pharmacology hypothesis development and testing. These in silico methods include databases, quantitative structure-activity relationships, pharmacophores, homology models and other molecular modeling approaches, machine learning, data mining, network analysis tools and data analysis tools that use a computer. In silico methods are primarily used alongside the generation of in vitro data both to create the model and to test it. Such models have seen frequent use in the discovery and optimization of novel molecules with affinity to a target, the clarification of absorption, distribution, metabolism, excretion and toxicity properties as well as physicochemical characterization. The aim of this review is to illustrate some of the in silico methods for pharmacology that are used in drug discovery. Further applications of these methods to specific targets and their limitations will be discussed in the second accompanying part of this review.
Affiliation(s)
- S Ekins
- ACT LLC, 1 Penn Plaza, New York, NY 10119, USA.
|
36
|
Keiser MJ, Roth BL, Armbruster BN, Ernsberger P, Irwin JJ, Shoichet BK. Relating protein pharmacology by ligand chemistry. Nat Biotechnol 2007; 25:197-206. [PMID: 17287757 DOI: 10.1038/nbt1284] [Citation(s) in RCA: 1392] [Impact Index Per Article: 81.9] [Indexed: 11/09/2022]
Abstract
The identification of protein function based on biological information is an area of intense research. Here we consider a complementary technique that quantitatively groups and relates proteins based on the chemical similarity of their ligands. We began with 65,000 ligands annotated into sets for hundreds of drug targets. The similarity score between each set was calculated using ligand topology. A statistical model was developed to rank the significance of the resulting similarity scores, which are expressed as a minimum spanning tree to map the sets together. Although these maps are connected solely by chemical similarity, biologically sensible clusters nevertheless emerged. Links among unexpected targets also emerged, among them that methadone, emetine and loperamide (Imodium) may antagonize muscarinic M3, alpha2 adrenergic and neurokinin NK2 receptors, respectively. These predictions were subsequently confirmed experimentally. Relating receptors by ligand chemistry organizes biology to reveal unexpected relationships that may be assayed using the ligands themselves.
Affiliation(s)
- Michael J Keiser
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, San Francisco California 94143-2550, USA
|
37
|
Good AC, Hermsmeier MA. Measuring CAMD Technique Performance. 2. How “Druglike” Are Drugs? Implications of Random Test Set Selection Exemplified Using Druglikeness Classification Models. J Chem Inf Model 2006; 47:110-4. [PMID: 17238255 DOI: 10.1021/ci6003493] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.3] [Indexed: 11/28/2022]
Abstract
Research into the advancement of computer-aided molecular design (CAMD) has a tendency to focus on the discipline of algorithm development. Such efforts are often wrought to the detriment of the data set selection and analysis used in said algorithm validation. Here we highlight the potential problems this can cause in the context of druglikeness classification. More rigorous efforts are applied to the selection of decoy (nondruglike) molecules from the ACD. Comparisons are made between model performance using the standard technique of random test set creation with test sets derived from explicit ontological separation by drug class. The dangers of viewing druglike space as sufficiently coherent to permit simple classification are highlighted. In addition the issues inherent in applying unfiltered data and random test set selection to (Q)SAR models utilizing large and supposedly heterogeneous databases are discussed.
Affiliation(s)
- Andrew C Good
- Bristol-Myers Squibb, 5 Research Parkway, Wallingford, Connecticut 06492, USA.
|
38
|
Abstract
Chemical biology approaches have a long history in the exploration of the G-protein-coupled receptor (GPCR) family, which represents the largest and most important group of targets for therapeutics. The analysis of the human genome revealed a significant number of new members with unknown physiological function which are today the focus of many reverse pharmacology drug-discovery programs. As the seven hydrophobic transmembrane segments are a defining common structural feature of these receptors, and as signaling through heterotrimeric G proteins is not demonstrated in all cases, these proteins are also referred to as seven transmembrane (7 TM) or serpentine receptors. This review summarizes important historic milestones of GPCR research, from the beginning, when pharmacology was mainly descriptive, to the age of modern molecular biology, with the cloning of the first receptor and now the availability of the entire human GPCR repertoire at the sequence and protein level. It shows how GPCR-directed drug discovery was initially based on the careful testing of a few specifically made chemical compounds and is today pursued with modern drug-discovery approaches, including combinatorial library design, structural biology, molecular informatics, and advanced screening technologies for the identification of new compounds that activate or inhibit GPCRs specifically. Such compounds, in conjunction with other new technologies, allow us to study the role of receptors in physiology and medicine, and will hopefully result in novel therapies. We also outline how basic research on the signaling and regulatory mechanisms of GPCRs is advancing, leading to the discovery of new GPCR-interacting proteins and thus opening new perspectives for drug development. Practical examples from GPCR expression studies, HTS (high-throughput screening), and the design of monoamine-related GPCR-focused combinatorial libraries illustrate ongoing GPCR chemical biology research. 
Finally, we outline future progress that may relate today's discoveries to the development of new medicines.
Affiliation(s)
- Edgar Jacoby
- Novartis Institutes for Biomedical Research, 4002 Basel, Switzerland.
|
39
|
Hert J, Willett P, Wilton DJ, Acklin P, Azzaoui K, Jacoby E, Schuffenhauer A. New methods for ligand-based virtual screening: use of data fusion and machine learning to enhance the effectiveness of similarity searching. J Chem Inf Model 2006; 46:462-70. [PMID: 16562973 DOI: 10.1021/ci050348j] [Citation(s) in RCA: 165] [Impact Index Per Article: 9.2] [Indexed: 11/28/2022]
Abstract
Similarity searching using a single bioactive reference structure is a well-established technique for accessing chemical structure databases. This paper describes two extensions of the basic approach. First, we discuss the use of group fusion to combine the results of similarity searches when multiple reference structures are available. We demonstrate that this technique is notably more effective than conventional similarity searching in scaffold-hopping searches for structurally diverse sets of active molecules; conversely, the technique will do little to improve the search performance if the actives are structurally homogeneous. Second, we make the assumption that the nearest neighbors resulting from a similarity search, using a single bioactive reference structure, are also active and use this assumption to implement approximate forms of group fusion, substructural analysis, and binary kernel discrimination. This approach, called turbo similarity searching, is notably more effective than conventional similarity searching.
Affiliation(s)
- Jérôme Hert
- Krebs Institute for Biomolecular Research and Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, U.K
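The group fusion idea summarized in the abstract above (combining similarity searches run from several reference structures) can be illustrated with a minimal sketch. The abstract does not name a similarity coefficient or a fusion rule, so the Tanimoto coefficient and MAX fusion used here are assumptions, as are the set-of-bits fingerprint representation and all function and compound names.

```python
from typing import Callable, Dict, FrozenSet, List, Sequence, Tuple

# Hypothetical representation: a fingerprint is the set of its "on" bit positions.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient between two bit-set fingerprints."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def group_fusion(
    references: Sequence[Fingerprint],
    database: Dict[str, Fingerprint],
    rule: Callable[[List[float]], float] = max,  # assumed fusion rule (MAX)
) -> List[Tuple[str, float]]:
    """Score each database compound against every reference, fuse the
    per-reference similarities with `rule`, and rank by the fused score."""
    scored = []
    for name, fp in database.items():
        sims = [tanimoto(ref, fp) for ref in references]
        scored.append((name, rule(sims)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy data: two structurally unrelated references and three database compounds.
refs = [frozenset({1, 2, 3, 4}), frozenset({10, 11, 12})]
db = {
    "cmpd_a": frozenset({1, 2, 3, 9}),
    "cmpd_b": frozenset({10, 11, 12, 13}),
    "cmpd_c": frozenset({20, 21}),
}
ranking = group_fusion(refs, db)
```

With MAX fusion a compound ranks highly if it resembles any one reference, which is one plausible reading of why the abstract reports a benefit for structurally diverse actives and little benefit for homogeneous ones.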
|
40
|
Cummings MD, Farnum MA, Nelen MI. Universal screening methods and applications of ThermoFluor. J Biomol Screen 2006; 11:854-63. [PMID: 16943390 DOI: 10.1177/1087057106292746] [Citation(s) in RCA: 139] [Impact Index Per Article: 7.7] [Indexed: 11/16/2022]
Abstract
The genomics revolution has unveiled a wealth of poorly characterized proteins. Scientists are often able to produce milligram quantities of proteins for which function is unknown or hypothetical, based only on very distant sequence homology. Broadly applicable tools for functional characterization are essential to the illumination of these orphan proteins. An additional challenge is the direct detection of inhibitors of protein-protein interactions (and allosteric effectors). Both of these research problems are relevant to, among other things, the challenge of finding and validating new protein targets for drug action. Screening collections of small molecules has long been used in the pharmaceutical industry as 1 method of discovering drug leads. Screening in this context typically involves a function-based assay. Given a sufficient quantity of a protein of interest, significant effort may still be required for functional characterization, assay development, and assay configuration for screening. Increasingly, techniques are being reported that facilitate screening for specific ligands for a protein of unknown function. Such techniques also allow for function-independent screening with better characterized proteins. ThermoFluor, a screening instrument based on monitoring ligand effects on temperature-dependent protein unfolding, can be applied when protein function is unknown. This technology has proven useful in the decryption of an essential bacterial enzyme and in the discovery of a series of inhibitors of a cancer-related, protein-protein interaction. The authors review some of the tools relevant to these research problems in drug discovery, and describe our experiences with 2 different proteins.
Affiliation(s)
- Maxwell D Cummings
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Exton, PA 19341, USA.
|
41
|
Paolini GV, Shapland RHB, van Hoorn WP, Mason JS, Hopkins AL. Global mapping of pharmacological space. Nat Biotechnol 2006; 24:805-15. [PMID: 16841068 DOI: 10.1038/nbt1228] [Citation(s) in RCA: 564] [Impact Index Per Article: 31.3] [Indexed: 11/10/2022]
Abstract
We present the global mapping of pharmacological space by the integration of several vast sources of medicinal chemistry structure-activity relationships (SAR) data. Our comprehensive mapping of pharmacological space enables us to identify confidently the human targets for which chemical tools and drugs have been discovered to date. The integration of SAR data from diverse sources by unique canonical chemical structure, protein sequence and disease indication enables the construction of a ligand-target matrix to explore the global relationships between chemical structure and biological targets. Using the data matrix, we are able to catalog the links between proteins in chemical space as a polypharmacology interaction network. We demonstrate that probabilistic models can be used to predict pharmacology from a large knowledge base. The relationships between proteins, chemical structures and drug-like properties provide a framework for developing a probabilistic approach to drug discovery that can be exploited to increase research productivity.
Affiliation(s)
- Gaia V Paolini
- The Department of Knowledge Discovery, Pfizer Global Research and Development, Sandwich, Kent CT13 9NJ, UK
|
42
|
Stahl M, Guba W, Kansy M. Integrating molecular design resources within modern drug discovery research: the Roche experience. Drug Discov Today 2006; 11:326-33. [PMID: 16580974 DOI: 10.1016/j.drudis.2006.02.008] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Received: 12/05/2005] [Revised: 01/24/2006] [Accepted: 02/20/2006] [Indexed: 01/28/2023]
Abstract
Various computational disciplines, such as cheminformatics, ADME modeling, virtual screening, chemogenomics search strategies and classic structure-based design, should be seen as one multifaceted discipline contributing to the early drug discovery process. Although significant resources enabling these activities have been established, their true integration into daily research should not be taken for granted. This article reviews value-adding activities from target assessment to lead optimization, and highlights the technical and process-related aspects that can be considered essential for performance and alignment within the research organization.
Affiliation(s)
- Martin Stahl
- F. Hoffmann-La Roche Ltd, Pharmaceuticals Division, PRBD-CM, CH-4070 Basel, Switzerland.
|
43
|
Surgand JS, Rodrigo J, Kellenberger E, Rognan D. A chemogenomic analysis of the transmembrane binding cavity of human G-protein-coupled receptors. Proteins 2006; 62:509-38. [PMID: 16294340 DOI: 10.1002/prot.20768] [Citation(s) in RCA: 189] [Impact Index Per Article: 10.5] [Indexed: 11/10/2022]
Abstract
The amino acid sequences of 369 human nonolfactory G-protein-coupled receptors (GPCRs) have been aligned at the seven transmembrane (TM) domains and used to extract the nature of 30 critical residues supposed, from the X-ray structure of bovine rhodopsin bound to retinal, to line the TM binding cavity of ground-state receptors. Interestingly, the clustering of human GPCRs from these 30 residues mirrors the recently described phylogenetic tree of full-sequence human GPCRs (Fredriksson et al., Mol Pharmacol 2003;63:1256-1272) with few exceptions. A TM cavity could be found for all investigated GPCRs, with physicochemical properties matching those of their cognate ligands. The current approach allows a very fast comparison of most human GPCRs from the focused perspective of the predicted TM cavity and permits easy detection of key residues that drive ligand selectivity or promiscuity.
|
44
|
Schnur DM, Hermsmeier MA, Tebben AJ. Are Target-Family-Privileged Substructures Truly Privileged? J Med Chem 2006; 49:2000-9. [PMID: 16539387 DOI: 10.1021/jm0502900] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.8] [Indexed: 11/28/2022]
Abstract
One of the early and effective approaches to G-protein-coupled receptor target family library design was the analysis of a set of ligands for frequently occurring chemical moieties or substructures. Various methods ranging from frameworks analysis to pharmacophores have been employed to find these so-called target-family-privileged substructures. Although the use of these substructures is common practice in combinatorial library design and has produced leads, the methods used for finding them rarely verified their selectivity for the particular target family from which they were derived. The frequency of occurrence among ligands associated with a target receptor family is not a sufficient criterion for those substructures to receive the label of target-family-privileged substructure. This study explores the question of selectivity of ClassPharmer-generated fragments for a series of target families: GPCRs, nuclear hormone receptors, serine proteases, protein kinases, and ligand-gated ion channels. In addition, a GPCR-focused library and a random set of 10k compounds are examined in terms of their target-family-privileged-substructure composition. The results challenge the combinatorial chemistry concept of target-family-privileged substructures and suggest that many of these fragments may simply be drug-like or attractive for various receptors in accordance with the original definition of privileged substructures.
Affiliation(s)
- Dora M Schnur
- Computer Aided Drug Design and Lead Discovery, Pharmaceutical Research Institute, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543-5400, USA.
|
45
|
Guba W. Chemogenomics strategies for G-protein coupled receptor hit finding. ERNST SCHERING RESEARCH FOUNDATION WORKSHOP 2006:21-9. [PMID: 16708996 DOI: 10.1007/978-3-540-37635-4_2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/09/2023]
Abstract
Targeting protein superfamilies via chemogenomics is based on a similarity clustering of gene sequences and molecular structures of ligands. Both target and ligand clusters are linked by generating binding affinity profiles of chemotypes vs a target panel. The application of this multidimensional similarity paradigm will be described in the context of Lead Generation to identify novel chemical hit classes for G-protein coupled receptors.
Affiliation(s)
- W Guba
- F. Hoffmann-La Roche Ltd., Basel, Switzerland.
|
46
|
Abstract
Chemogenomics aims towards the systematic identification of small molecules that interact with the products of the genome and modulate their biological function. While historically the approach is based on efforts that systematically explore target gene families like kinases, today additional knowledge-based systematization principles are followed within early drug discovery projects which aim to biologically validate the targets and to identify starting points for chemical lead optimization. While the expectations of chemogenomics are very high, the reality of drug discovery is quite sobering with very high project attrition rates. This article summarizes the different knowledge-based chemogenomics strategies that are followed and outlines the challenges and potential opportunities that will impact drug discovery.
Affiliation(s)
- Edgar Jacoby
- Novartis Institutes for BioMedical Research, Lichtstrasse 35, Basel, CH-4056, Switzerland.
|
47
|
Savchuk NP, Balakin KV, Tkachenko SE. Exploring the chemogenomic knowledge space with annotated chemical libraries. Curr Opin Chem Biol 2005; 8:412-7. [PMID: 15288252 DOI: 10.1016/j.cbpa.2004.06.003] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 01/04/2023]
Abstract
The recent human genome initiatives have led to the discovery of a multitude of genes that are potentially associated with various pathologic conditions and, thus, have opened new horizons in drug discovery. Simultaneously, annotated chemical libraries have emerged as information-rich databases to integrate biological and chemical data. They can be useful for the discovery of new pharmaceutical leads, the validation of new biotargets and the determination of the structural basis of ligand selectivity within target families. Annotated libraries provide a strong information basis for computational design of target-directed combinatorial libraries, which are a key component of modern drug discovery. Today, the rational design of chemical libraries enhanced with chemogenomics data is a new area of progressive research.
Affiliation(s)
- Nikolay P Savchuk
- Chemical Diversity Labs, Inc., 11558 Sorrento Valley Road, San Diego, California 92121, USA.
|
48
|
Jenkins JL, Glick M, Davies JW. A 3D similarity method for scaffold hopping from known drugs or natural ligands to new chemotypes. J Med Chem 2005; 47:6144-59. [PMID: 15566286 DOI: 10.1021/jm049654z] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.6] [Indexed: 11/30/2022]
Abstract
A primary goal of 3D similarity searching is to find compounds with similar bioactivity to a reference ligand but with different chemotypes, i.e., "scaffold hopping". However, an adequate description of chemical structures in 3D conformational space is difficult due to the high-dimensionality of the problem. We present an automated method that simplifies flexible 3D chemical descriptions in which clustering techniques traditionally used in data mining are exploited to create "fuzzy" molecular representations called FEPOPS (feature point pharmacophores). The representations can be used for flexible 3D similarity searching given one or more active compounds without a priori knowledge of bioactive conformations or pharmacophores. We demonstrate that similarity searching with FEPOPS significantly enriches for actives taken from in-house high-throughput screening datasets and from MDDR activity classes COX-2, 5-HT3A, and HIV-RT, while also scaffold or ring-system hopping to new chemical frameworks. Further, inhibitors of target proteins (dopamine 2 and retinoic acid receptor) are recalled by FEPOPS by scaffold hopping from their associated endogenous ligands (dopamine and retinoic acid). Importantly, the method excels in comparison to commonly used 2D similarity methods (DAYLIGHT, MACCS, Pipeline Pilot fingerprints) and a commercial 3D method (Pharmacophore Distance Triplets) at finding novel scaffold classes given a single query molecule.
Affiliation(s)
- Jeremy L Jenkins
- Lead Discovery Center, Novartis Institutes for BioMedical Research Inc., Cambridge, Massachusetts 02139, USA.
|
49
|
Sheridan RP, Shpungin J. Calculating similarities between biological activities in the MDL Drug Data Report database. J Chem Inf Comput Sci 2004; 44:727-40. [PMID: 15032555 DOI: 10.1021/ci034245h] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
There are a number of licensed databases that assign biological activities to druglike compounds. The MDL Drug Data Report (MDDR), compiled from the patent literature, is a popular example. It contains several hundred distinct activities, some of which are therapeutic areas (e.g., Antihypertensive) and some of which are related to specific enzymes or receptors (e.g., ACE inhibitor). There are several data mining applications where it would be useful to calculate a similarity between any two activities. Two distinct activity labels can have a significant similarity for a number of reasons: two activities can be nearly synonymous (e.g., CCK B antagonist vs Gastrin antagonist), one activity may be a subset of another (e.g., Dopamine (D2) agonist vs Dopamine agonist), or an activity can be the mechanism by which another activity works (e.g., ACE inhibitor vs Antihypertensive), etc. In an ideal world, similarities for two activities could be calculated simply by comparing the compounds they have in common, but in hand-curated databases such as the MDDR the assignment of activities to compounds is inevitably inconsistent and incomplete. We propose a number of methods of calculating activity-activity similarities that may compensate for errors in hand-curation. Two of these, TIMI and trend vector, show promise. Soft clustering of the activities using a union of similarity methods shows a reasonable association of therapeutic areas with their mechanisms.
Affiliation(s)
- Robert P Sheridan
- RY50S-100 Merck Research Laboratories, Rahway, New Jersey 07065, USA.
|
50
|
Schuffenhauer A, Jacoby E. Annotating and mining the ligand-target chemogenomics knowledge space. Drug Discov Today BIOSILICO 2004. [DOI: 10.1016/s1741-8364(04)02408-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 10/26/2022]
|