51
Abstract
IMPORTANCE OF THE FIELD: PubChem is a public molecular information repository, a scientific showcase of the NIH Roadmap Initiative. The PubChem database holds over 27 million records of unique chemical structures of compounds (CID) derived from nearly 70 million substance depositions (SID), and contains more than 449,000 bioassay records covering thousands of in vitro biochemical and cell-based screening bioassays, targeting more than 7000 proteins and genes and linking to over 1.8 million substances. AREAS COVERED IN THIS REVIEW: This review builds on recent PubChem-related computational chemistry research reported by other authors while providing readers with an overview of the PubChem database, focusing on its increasing role in cheminformatics, virtual screening and toxicity prediction modeling. WHAT THE READER WILL GAIN: The publicly available datasets in PubChem provide great opportunities for scientists to perform cheminformatics and virtual screening research for computer-aided drug design. However, the high volume and complexity of the datasets, in particular the bioassay-associated false positives/negatives and highly imbalanced datasets in PubChem, also create major challenges. Several approaches to modeling PubChem datasets and developing virtual screening models for bioactivity and toxicity prediction are also reviewed. TAKE HOME MESSAGE: Novel data-mining cheminformatics tools and virtual screening algorithms are being developed and used to retrieve, annotate and analyze the large-scale and highly complex PubChem biological screening data for drug design.
Affiliation(s)
- Xiang-Qun Xie
- Department of Pharmaceutical Sciences, School of Pharmacy; Drug Discovery Institute/Pittsburgh Molecular Library Screening Center (PMLSC); Pittsburgh Chemical Methodologies & Library Development (PCMLD) Center; Departments of Computational Biology and Structural Biology; University of Pittsburgh, Pittsburgh, PA 15260, USA
52
Torres-Piedra M, Ortiz-Andrade R, Villalobos-Molina R, Singh N, Medina-Franco JL, Webster SP, Binnie M, Navarrete-Vázquez G, Estrada-Soto S. A comparative study of flavonoid analogues on streptozotocin–nicotinamide induced diabetic rats: Quercetin as a potential antidiabetic agent acting via 11β-Hydroxysteroid dehydrogenase type 1 inhibition. Eur J Med Chem 2010; 45:2606-12. [DOI: 10.1016/j.ejmech.2010.02.049] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Received: 09/08/2009] [Revised: 12/21/2009] [Accepted: 02/19/2010] [Indexed: 10/19/2022]
53
Kharchevnikova NV, Blinova VG, Dobrynin DA, Fedorova N, Novich M, Vrachko M. Data mining on carcinogenicity of chemical compounds by the JSM method. Automatic Documentation and Mathematical Linguistics 2010. [DOI: 10.3103/s000510550906003x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/30/2022]
54
Rastogi RP, Sinha RP. Biotechnological and industrial significance of cyanobacterial secondary metabolites. Biotechnol Adv 2009; 27:521-39. [DOI: 10.1016/j.biotechadv.2009.04.009] [Citation(s) in RCA: 173] [Impact Index Per Article: 11.5] [Received: 02/12/2009] [Revised: 04/13/2009] [Accepted: 04/14/2009] [Indexed: 01/22/2023]
55
Nigsch F, Bender A, Jenkins JL, Mitchell JBO. Ligand-Target Prediction Using Winnow and Naive Bayesian Algorithms and the Implications of Overall Performance Statistics. J Chem Inf Model 2008; 48:2313-25. [DOI: 10.1021/ci800079x] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Indexed: 11/29/2022]
Affiliation(s)
- Florian Nigsch, Andreas Bender, Jeremy L. Jenkins, John B. O. Mitchell
- Unilever Centre for Molecular Science Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom; Lead Discovery Informatics, Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, 250 Massachusetts Avenue, Cambridge, Massachusetts 02139; and Division of Medicinal Chemistry, Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC, Leiden, The Netherlands
56
Dunkel M, Günther S, Ahmed J, Wittig B, Preissner R. SuperPred: drug classification and target prediction. Nucleic Acids Res 2008; 36:W55-9. [PMID: 18499712 PMCID: PMC2447784 DOI: 10.1093/nar/gkn307] [Citation(s) in RCA: 104] [Impact Index Per Article: 6.5] [Indexed: 11/16/2022]
Abstract
The drug classification scheme of the World Health Organization (WHO) [Anatomical Therapeutic Chemical (ATC)-code] connects chemical classification and therapeutic approach. It is generally accepted that compounds with similar physicochemical properties exhibit similar biological activity. If this hypothesis holds true for drugs, then the ATC-code, the putative medical indication area and potentially the medical target should be predictable on the basis of structural similarity. We have validated that the prediction of the drug class is reliable for WHO-classified drugs. The reliability of the predicted medical effects of the compounds increases with a rising number of (physico-) chemical properties similar to a drug with known function. The web-server translates a user-defined molecule into a structural fingerprint that is compared to about 6300 drugs, which are enriched by 7300 links to molecular targets of the drugs, derived through text mining followed by manual curation. Links to the affected pathways are provided. The similarity to the medical compounds is expressed by the Tanimoto coefficient, which gives the structural similarity of two compounds. A similarity score higher than 0.85 results in correct ATC prediction in 81% of all cases. Because the biological effect is well predictable when the structural similarity is sufficient, the web-server allows prognoses about the medical indication area of novel compounds and the identification of new leads for known targets. Availability: the system is freely accessible at http://bioinformatics.charite.de/superpred. SuperPred can be obtained via a Creative Commons Attribution Noncommercial-Share Alike 3.0 License.
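As a rough illustration of the similarity measure named in this abstract, the Tanimoto coefficient of two binary fingerprints is the number of bits set in both divided by the number of bits set in either. The sketch below assumes fingerprints are represented as Python sets of on-bit indices; the function name `tanimoto`, the example bit sets and the constant `SIMILARITY_CUTOFF` (set to the 0.85 threshold quoted in the abstract) are hypothetical and are not taken from SuperPred's implementation.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    return len(fp_a & fp_b) / union


# Threshold quoted in the abstract; how it is applied here is an assumption.
SIMILARITY_CUTOFF = 0.85

# Made-up fingerprints: 6 shared bits, 8 distinct bits overall.
query = {1, 4, 7, 9, 12, 15, 20}
known_drug = {1, 4, 7, 9, 12, 15, 21}

score = tanimoto(query, known_drug)
above_cutoff = score > SIMILARITY_CUTOFF
print(f"Tanimoto = {score:.2f}, above cutoff: {above_cutoff}")
```

In practice the fingerprints would come from a cheminformatics toolkit rather than hand-written sets; the set-based form is used here only to keep the arithmetic visible.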
Affiliation(s)
- Mathias Dunkel
- Institute of Molecular Biology and Bioinformatics, Charité - University Medicine Berlin, Arnimallee 22, 14195 Berlin, Germany
57
Fjodorova N, Novich M, Vrachko M, Smirnov V, Kharchevnikova N, Zholdakova Z, Novikov S, Skvortsova N, Filimonov D, Poroikov V, Benfenati E. Directions in QSAR modeling for regulatory uses in OECD member countries, EU and in Russia. Journal of Environmental Science and Health. Part C, Environmental Carcinogenesis & Ecotoxicology Reviews 2008; 26:201-236. [PMID: 18569330 DOI: 10.1080/10590500802135578] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 05/26/2023]
Abstract
The aim of this article is to show the main aspects of quantitative structure-activity relationship (QSAR) modeling for regulatory purposes. We try to answer the question: what makes QSAR models suitable for regulatory uses? The article focuses on directions in QSAR modeling in the European Union (EU) and Russia. Difficulties in validating models are discussed.
58
Utilizing high throughput screening data for predictive toxicology models: protocols and application to MLSCN assays. J Comput Aided Mol Des 2008; 22:367-84. [DOI: 10.1007/s10822-008-9192-9] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 10/04/2007] [Accepted: 01/30/2008] [Indexed: 01/27/2023]
59
Filz O, Lagunin A, Filimonov D, Poroikov V. Computer-aided prediction of QT-prolongation. SAR and QSAR in Environmental Research 2008; 19:81-90. [PMID: 18311636 DOI: 10.1080/10629360701844183] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 05/26/2023]
Abstract
Drug-induced cardiac arrhythmia is acknowledged as a serious obstacle to the successful development of new drugs. Several methods for in silico prediction of acquired long QT syndrome (LQTS) caused by the pharmacological blockade of human hERG K+ channels are discussed in the literature. We propose to use the computer program PASS, which estimates the probabilities of about 3000 biological activities, not only for prediction of hERG blockade and QT-prolongation but also for the analysis of indirect mechanisms of these actions. After 163 compounds with data on QT-prolongation were added to the PASS training set and the program was re-trained, the accuracy of prediction was 87.1% and 81.8% for hERG blockade and QT-prolongation, respectively. Using the computer program PharmaExpert, we found that in the predicted biological activity spectra there was a certain correlation between hERG blockade and some other molecular mechanisms of action. The possible role of 1-phosphatidylinositol-4-phosphate 5-kinase, dimethylargininase and progesterone 11 alpha-monooxygenase inhibition in hERG blockade is discussed.
Affiliation(s)
- O Filz
- Institute of Biomedical Chemistry of Rus. Acad. Med. Sci., Moscow, Russia.
60
Devillers J, Doré JC, Guyot M, Poroikov V, Gloriozova T, Lagunin A, Filimonov D. Prediction of biological activity profiles of cyanobacterial secondary metabolites. SAR and QSAR in Environmental Research 2007; 18:629-643. [PMID: 18038364 DOI: 10.1080/10629360701698704] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 05/25/2023]
Abstract
Over the past decade cyanobacteria have become an interesting source of new classes of pharmacologically active natural products. Some cyanobacterial secondary metabolites (CSMs) are also well known for their toxic effects on living species. The PASS (Prediction of Activity Spectra for Substances) computer program, which is able to simultaneously predict more than one thousand biological and toxicological activities from only the structural formulas of the chemicals, was used to predict the biological activity profile of 681 CSMs. Multivariate methods were employed to structure and analyse this wealth of biological and chemical information. PASS predictions were successfully compared to the available information on the pharmacological and toxicological activity of these compounds.
Affiliation(s)
- J Devillers
- CTIS, 3 Chemin de la Gravière, 69140 Rillieux La Pape, France.
61
Devillers J, Marchand-Geneste N, Doré JC, Porcher JM, Poroikov V. Endocrine disruption profile analysis of 11,416 chemicals from chemometrical tools. SAR and QSAR in Environmental Research 2007; 18:181-93. [PMID: 17514564 DOI: 10.1080/10629360701303669] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/15/2023]
Abstract
A number of chemicals released into the environment have the potential to disturb the normal functioning of the endocrine system. These chemicals termed endocrine disruptors (EDs) act by mimicking or antagonizing the normal functions of natural hormones and may pose serious threats to the reproductive capability and development of living species. Batteries of laboratory bioassays exist for detecting these chemicals. However, due to time and cost limitations, they cannot be used for all the chemicals which can be found in the ecosystems. SAR and QSAR models are particularly suited to overcome this problem but they only deal with specific targets/endpoints. The interest to account for profiles of endocrine activities instead of unique endpoints to better gauge the complexity of endocrine disruption is discussed through a SAR study performed on 11,416 chemicals retrieved from the US-NCI database and for which 13 different PASS (Prediction of Activity Spectra for Substances) endocrine activities were available. Various multivariate analyses and graphical displays were used for deriving structure-activity relationships based on specific structural features.
Affiliation(s)
- J Devillers
- CTIS, 3 Chemin de la Gravière, 69140 Rillieux La Pape, France.