Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhang J, Aizawa M, Amari S, Iwasawa Y, Nakano T, Nakata K. Development of KiBank, a database supporting structure-based drug design. Comput Biol Chem 2005;28:401-7. [PMID: 15556481 DOI: 10.1016/j.compbiolchem.2004.09.003] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.1] [Received: 09/01/2004] [Revised: 09/13/2004] [Accepted: 09/15/2004] [Indexed: 11/29/2022]

Cited by Other Article(s)

Yosipof A, Guedes RC, García-Sosa AT. Data Mining and Machine Learning Models for Predicting Drug Likeness and Their Disease or Organ Category. Front Chem 2018;6:162. [PMID: 29868564 PMCID: PMC5954128 DOI: 10.3389/fchem.2018.00162] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 02/28/2018] [Accepted: 04/20/2018] [Indexed: 12/11/2022]

Abstract

Data mining approaches can uncover underlying patterns in chemical and pharmacological property space decisive for drug discovery and development. Two of the most common approaches are visualization and machine learning methods. Visualization methods use dimensionality reduction techniques in order to reduce multi-dimension data into 2D or 3D representations with a minimal loss of information. Machine learning attempts to find correlations between specific activities or classifications for a set of compounds and their features by means of recurring mathematical models. Both models take advantage of the different and deep relationships that can exist between features of compounds, and helpfully provide classification of compounds based on such features or in case of visualization methods uncover underlying patterns in the feature space. Drug-likeness has been studied from several viewpoints, but here we provide the first implementation in chemoinformatics of the t-Distributed Stochastic Neighbor Embedding (t-SNE) method for the visualization and the representation of chemical space, and the use of different machine learning methods separately and together to form a new ensemble learning method called AL Boost. The models obtained from AL Boost synergistically combine decision tree, random forests (RF), support vector machine (SVM), artificial neural network (ANN), k nearest neighbors (kNN), and logistic regression models. In this work, we show that together they form a predictive model that not only improves the predictive force but also decreases bias. This resulted in a corrected classification rate of over 0.81, as well as higher sensitivity and specificity rates for the models. In addition, separation and good models were also achieved for disease categories such as antineoplastic compounds and nervous system diseases, among others. 
Such models can be used to guide decision on the feature landscape of compounds and their likeness to either drugs or other characteristics, such as specific or multiple disease-category(ies) or organ(s) of action of a molecule.

Lagarde N, Zagury JF, Montes M. Benchmarking Data Sets for the Evaluation of Virtual Ligand Screening Methods: Review and Perspectives. J Chem Inf Model 2015;55:1297-307. [PMID: 26038804 DOI: 10.1021/acs.jcim.5b00090] [Citation(s) in RCA: 59] [Impact Index Per Article: 6.6] [Indexed: 12/11/2022]

Meyer T, Knapp EW. Database of protein complexes with multivalent binding ability: Bival-bind. Proteins 2013;82:744-51. [DOI: 10.1002/prot.24453] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 08/28/2013] [Revised: 10/15/2013] [Accepted: 10/21/2013] [Indexed: 01/13/2023]

Mavridis L, Mitchell JB. Predicting the protein targets for athletic performance-enhancing substances. J Cheminform 2013;5:31. [PMID: 23800040 PMCID: PMC3701582 DOI: 10.1186/1758-2946-5-31] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 04/16/2013] [Accepted: 06/17/2013] [Indexed: 12/02/2022]

Abstract

Background

The World Anti-Doping Agency (WADA) publishes the Prohibited List, a manually compiled international standard of substances and methods prohibited in-competition, out-of-competition and in particular sports. It would be ideal to be able to identify all substances that have one or more performance-enhancing pharmacological actions in an automated, fast and cost effective way. Here, we use experimental data derived from the ChEMBL database (~7,000,000 activity records for 1,300,000 compounds) to build a database model that takes into account both structure and experimental information, and use this database to predict both on-target and off-target interactions between these molecules and targets relevant to doping in sport.

Results

The ChEMBL database was screened and eight well populated categories of activities (K_i, K_d, EC50, ED50, activity, potency, inhibition and IC50) were used for a rule-based filtering process to define the labels “active” or “inactive”. The “active” compounds for each of the ChEMBL families were thereby defined and these populated our bioactivity-based filtered families. A structure-based clustering step was subsequently performed in order to split families with more than one distinct chemical scaffold. This produced refined families, whose members share both a common chemical scaffold and bioactivity against a common target in ChEMBL.

Conclusions

We have used the Parzen-Rosenblatt machine learning approach to test whether compounds in ChEMBL can be correctly predicted to belong to their appropriate refined families. Validation tests using the refined families gave a significant increase in predictivity compared with the filtered or with the original families. Out of 61,660 queries in our Monte Carlo cross-validation, belonging to 19,639 refined families, 41,300 (66.98%) had the parent family as the top prediction and 53,797 (87.25%) had the parent family in the top four hits. Having thus validated our approach, we used it to identify the protein targets associated with the WADA prohibited classes. For compounds where we do not have experimental data, we use their computed patterns of interaction with protein targets to make predictions of bioactivity. We hope that other groups will test these predictions experimentally in the future.

García-Sosa AT, Maran U. Drugs, non-drugs, and disease category specificity: organ effects by ligand pharmacology. SAR and QSAR in Environmental Research 2013;24:319-331. [PMID: 23534612 DOI: 10.1080/1062936x.2013.773373] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 06/02/2023]

García-Sosa AT, Oja M, Hetényi C, Maran U. DrugLogit: logistic discrimination between drugs and nondrugs including disease-specificity by assigning probabilities based on molecular properties. J Chem Inf Model 2012;52:2165-80. [PMID: 22830445 DOI: 10.1021/ci200587h] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 11/28/2022]

Abstract

The increasing knowledge of both structure and activity of compounds provides a good basis for enhancing the pharmacological characterization of chemical libraries. In addition, pharmacology can be seen as incorporating both advances from molecular biology as well as chemical sciences, with innovative insight provided from studying target-ligand data from a ligand molecular point of view. Predictions and profiling of libraries of drug candidates have previously focused mainly on certain cases of oral bioavailability. Inclusion of other administration routes and disease-specificity would improve the precision of drug profiling. In this work, recent data are extended, and a probability-based approach is introduced for quantitative and gradual classification of compounds into categories of drugs/nondrugs, as well as for disease- or organ-specificity. Using experimental data of over 1067 compounds and multivariate logistic regressions, the classification shows good performance in training and independent test cases. The regressions have high statistical significance in terms of the robustness of coefficients and 95% confidence intervals provided by a 1000-fold bootstrapping resampling. Besides their good predictive power, the classification functions remain chemically interpretable, containing only one to five variables in total, and the physicochemical terms involved can be easily calculated. The present approach is useful for an improved description and filtering of compound libraries. It can also be applied sequentially or in combinations of filters, as well as adapted to particular use cases. The scores and equations may be able to suggest possible routes for compound or library modification. The data is made available for reuse by others, and the equations are freely accessible at http://hermes.chem.ut.ee/~alfx/druglogit.html.

García-Sosa AT, Oja M, Hetényi C, Maran U. Disease-Specific Differentiation Between Drugs and Non-Drugs Using Principal Component Analysis of Their Molecular Descriptor Space. Mol Inform 2012;31:369-83. [DOI: 10.1002/minf.201100094] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/07/2011] [Accepted: 01/25/2012] [Indexed: 01/04/2023]

Xue M, Zheng M, Xiong B, Li Y, Jiang H, Shen J. Knowledge-based scoring functions in drug design. 1. Developing a target-specific method for kinase-ligand interactions. J Chem Inf Model 2010;50:1378-86. [PMID: 20681607 DOI: 10.1021/ci100182c] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]

Ogawa T, Nakano T. The Extended Universal Force Field (XUFF): Theory and Applications. Chem-Bio Informatics Journal 2010. [DOI: 10.1273/cbij.10.111] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]

Novikov FN, Stroylov VS, Stroganov OV, Chilov GG. Improving performance of docking-based virtual screening by structural filtration. J Mol Model 2009;16:1223-30. [PMID: 20041273 DOI: 10.1007/s00894-009-0633-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 08/06/2009] [Accepted: 11/16/2009] [Indexed: 10/20/2022]

Saravanan SE, Karthi R, Sathish K, Kokila K, Sabarinathan R, Sekar K. MLDB: macromolecule ligand database. J Appl Crystallogr 2009. [DOI: 10.1107/s0021889809048626] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022]

Søndergaard CR, Garrett AE, Carstensen T, Pollastri G, Nielsen JE. Structural artifacts in protein-ligand X-ray structures: implications for the development of docking scoring functions. J Med Chem 2009;52:5673-84. [PMID: 19711919 DOI: 10.1021/jm8016464] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]

Doddareddy MR, van Westen GJP, van der Horst E, Peironcely JE, Corthals F, Ijzerman AP, Emmerich M, Jenkins JL, Bender A. Chemogenomics: Looking at biology through the lens of chemistry. Stat Anal Data Min 2009. [DOI: 10.1002/sam.10046] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/09/2022]

Kirchmair J, Markt P, Distinto S, Schuster D, Spitzer GM, Liedl KR, Langer T, Wolber G. The Protein Data Bank (PDB), its related services and software tools as key components for in silico guided drug discovery. J Med Chem 2009;51:7021-40. [PMID: 18975926 DOI: 10.1021/jm8005977] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.7] [Indexed: 11/30/2022]

Stroganov OV, Novikov FN, Stroylov VS, Kulkov V, Chilov GG. Lead finder: an approach to improve accuracy of protein-ligand docking, binding energy estimation, and virtual screening. J Chem Inf Model 2009;48:2371-85. [PMID: 19007114 DOI: 10.1021/ci800166p] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Indexed: 11/30/2022]

Miteva MA, Alexov E, Villoutreix BO. Protein structure analysis online. Curr Protoc Protein Sci 2008;Chapter 2:Unit 2.13. [PMID: 18429316 DOI: 10.1002/0471140864.ps0213s50] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]

Irwin JJ. Community benchmarks for virtual screening. J Comput Aided Mol Des 2008;22:193-9. [DOI: 10.1007/s10822-008-9189-4] [Citation(s) in RCA: 133] [Impact Index Per Article: 8.3] [Received: 11/28/2007] [Accepted: 01/30/2008] [Indexed: 11/24/2022]

Senger S, Leach AR. SAR Knowledge Bases in Drug Discovery. Annu Rep Comput Chem 2008. [DOI: 10.1016/s1574-1400(08)00011-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/07/2023]

Benson ML, Smith RD, Khazanov NA, Dimcheff B, Beaver J, Dresslar P, Nerothin J, Carlson HA. Binding MOAD, a high-quality protein-ligand database. Nucleic Acids Res 2007;36:D674-8. [PMID: 18055497 PMCID: PMC2238910 DOI: 10.1093/nar/gkm911] [Citation(s) in RCA: 119] [Impact Index Per Article: 7.0] [Indexed: 11/13/2022]

Li H, Yap CW, Ung CY, Xue Y, Li ZR, Han LY, Lin HH, Chen YZ. Machine learning approaches for predicting compounds that interact with therapeutic and ADMET related proteins. J Pharm Sci 2007;96:2838-60. [PMID: 17786989 DOI: 10.1002/jps.20985] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]

Nakata K, Tanaka Y, Nakano T, Adachi T, Tanaka H, Kaminuma T, Ishikawa T. Nuclear receptor-mediated transcriptional regulation in Phase I, II, and III xenobiotic metabolizing systems. Drug Metab Pharmacokinet 2007;21:437-57. [PMID: 17220560 DOI: 10.2133/dmpk.21.437] [Citation(s) in RCA: 146] [Impact Index Per Article: 8.6] [Indexed: 12/19/2022]

Huang N, Shoichet BK, Irwin JJ. Benchmarking sets for molecular docking. J Med Chem 2007;49:6789-801. [PMID: 17154509 PMCID: PMC3383317 DOI: 10.1021/jm0608356] [Citation(s) in RCA: 970] [Impact Index Per Article: 57.1] [Indexed: 02/08/2023]

Liu T, Lin Y, Wen X, Jorissen RN, Gilson MK. BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Res 2006;35:D198-201. [PMID: 17145705 PMCID: PMC1751547 DOI: 10.1093/nar/gkl999] [Citation(s) in RCA: 1203] [Impact Index Per Article: 66.8] [Indexed: 11/21/2022]

Strömbergsson H, Kryshtafovych A, Prusis P, Fidelis K, Wikberg JES, Komorowski J, Hvidsten TR. Generalized modeling of enzyme-ligand interactions using proteochemometrics and local protein substructures. Proteins 2006;65:568-79. [PMID: 16948162 DOI: 10.1002/prot.21163] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Indexed: 11/12/2022]

Abstract

Modeling and understanding protein-ligand interactions is one of the most important goals in computational drug discovery. To this end, proteochemometrics uses structural and chemical descriptors from several proteins and several ligands to induce interaction-models. Here, we present a new and generalized approach in which proteins varying greatly in terms of sequence and structure are represented by a library of local substructures. Using linear regression and rule-based learning, we combine such local substructures with chemical descriptors from the ligands to model binding affinity for a training set of hydrolase and lyase enzymes. We evaluate the predictive performance of these models using cross validation and sets of unseen ligand with unknown three-dimensional structure. The models are shown to generalize by outperforming models using descriptors from only proteins or only ligands, or models using global structure similarities rather than local similarities. Thus, we demonstrate that this approach is capable of describing dependencies between local structural properties and ligands in otherwise dissimilar protein structures. These dependencies are often, but not always, associated with local substructures that are in contact with the ligands. Finally, we show that strongly bound enzyme-ligand complexes require the presence of particular local substructures, while weakly bound complexes may be described by the absence of certain properties. The results demonstrate that the alignment-independent approach using local substructures is capable of describing protein-ligand interaction for largely different proteins and hence opens up for proteochemometrics-analysis of the interaction-space of entire proteomes. Current approaches are limited to families of closely related proteins.

Strachan RT, Ferrara G, Roth BL. Screening the receptorome: an efficient approach for drug discovery and target validation. Drug Discov Today 2006;11:708-16. [PMID: 16846798 DOI: 10.1016/j.drudis.2006.06.012] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.7] [Received: 01/11/2006] [Revised: 06/02/2006] [Accepted: 06/16/2006] [Indexed: 11/18/2022]

Miteva MA, Violas S, Montes M, Gomez D, Tuffery P, Villoutreix BO. FAF-Drugs: free ADME/tox filtering of compound collections. Nucleic Acids Res 2006;34:W738-44. [PMID: 16845110 PMCID: PMC1538885 DOI: 10.1093/nar/gkl065] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.3] [Received: 02/08/2006] [Revised: 02/22/2006] [Accepted: 03/01/2006] [Indexed: 12/21/2022]

Han L, Cui J, Lin H, Ji Z, Cao Z, Li Y, Chen Y. Recent progresses in the application of machine learning approach for predicting protein functional class independent of sequence similarity. Proteomics 2006;6:4023-37. [PMID: 16791826 DOI: 10.1002/pmic.200500938] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Indexed: 01/22/2023]

Block P, Sotriffer CA, Dramburg I, Klebe G. AffinDB: a freely accessible database of affinities for protein-ligand complexes from the PDB. Nucleic Acids Res 2006;34:D522-6. [PMID: 16381925 PMCID: PMC1347402 DOI: 10.1093/nar/gkj039] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.2] [Indexed: 11/24/2022]

Tobita M, Horiuchi K, Araki K, Nemoto M, Shimada H, Nishikawa T. BirdsAnts: A protein-small molecule interaction viewer. Chem-Bio Informatics Journal 2006. [DOI: 10.1273/cbij.6.17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]

Nakata K, Amari S, Nakano T. Application of KiBank Database. Chem-Bio Informatics Journal 2006. [DOI: 10.1273/cbij.6.47] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]

Li H, Yap CW, Xue Y, Li ZR, Ung CY, Han LY, Chen YZ. Statistical learning approach for predicting specific pharmacodynamic, pharmacokinetic, or toxicological properties of pharmaceutical agents. Drug Dev Res 2005. [DOI: 10.1002/ddr.20044] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]

In Brief. Nat Rev Drug Discov 2005. [DOI: 10.1038/nrd1673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]