151
Xu Y, Dai Z, Chen F, Gao S, Pei J, Lai L. Deep Learning for Drug-Induced Liver Injury. J Chem Inf Model 2015; 55:2085-93. [PMID: 26437739 DOI: 10.1021/acs.jcim.5b00238] [Citation(s) in RCA: 187] [Impact Index Per Article: 20.8] [Indexed: 11/29/2022]
Abstract
Drug-induced liver injury (DILI) has been the single most frequent cause of safety-related drug marketing withdrawals for the past 50 years. Recently, deep learning (DL) has been successfully applied in many fields due to its exceptional and automatic learning ability. In this study, DILI prediction models were developed using DL architectures, and the best model trained on 475 drugs predicted an external validation set of 198 drugs with an accuracy of 86.9%, sensitivity of 82.5%, specificity of 92.9%, and area under the curve of 0.955, which is better than the performance of previously described DILI prediction models. Furthermore, with deep analysis, we also identified important molecular features that are related to DILI. Such DL models could improve the prediction of DILI risk in humans. The DL DILI prediction models are freely available at http://www.repharma.cn/DILIserver/DILI_home.php.
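The accuracy, sensitivity, and specificity quoted in this abstract all derive from a single confusion matrix. The sketch below shows that relationship. The counts are hypothetical: they were chosen so that they sum to the 198 external drugs and reproduce the reported percentages, but they are not taken from the paper, and the function name is illustrative only.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,      # fraction of all compounds classified correctly
        "sensitivity": tp / (tp + fn),      # fraction of true positives recovered
        "specificity": tn / (tn + fp),      # fraction of true negatives recovered
    }

# Hypothetical counts (assumption, not from the paper): 114 positives, 84 negatives.
metrics = classification_metrics(tp=94, fn=20, tn=78, fp=6)
for name, value in metrics.items():
    print(f"{name}: {value * 100:.1f}%")
```

The area under the curve cannot be recovered from these four counts, because it depends on the ranking of predicted scores rather than on a single decision threshold.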
Affiliation(s)
- Youjun Xu
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Ziwei Dai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Fangjin Chen
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Shuaishi Gao
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China; Beijing National Laboratory for Molecular Sciences, State Key Laboratory for Structural Chemistry of Unstable and Stable Species, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
152
Wang T, Wu MB, Lin JP, Yang LR. Quantitative structure–activity relationship: promising advances in drug discovery platforms. Expert Opin Drug Discov 2015; 10:1283-300. [DOI: 10.1517/17460441.2015.1083006] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Indexed: 12/13/2022]
153
Su BH, Tu YS, Lin C, Shao CY, Lin OA, Tseng YJ. Rule-Based Prediction Models of Cytochrome P450 Inhibition. J Chem Inf Model 2015; 55:1426-34. [DOI: 10.1021/acs.jcim.5b00130] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Affiliation(s)
- Bo-Han Su
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Yi-shu Tu
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Chieh Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Chi-Yu Shao
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Olivia A. Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
- Yufeng J. Tseng
- Graduate Institute of Biomedical Electronics and Bioinformatics and Department of Computer Science and Information Engineering, National Taiwan University, No. 1 Sec. 4, Roosevelt Road, Taipei, Taiwan 106
154
Wang J, Hou T. Advances in computationally modeling human oral bioavailability. Adv Drug Deliv Rev 2015; 86:11-6. [PMID: 25582307 PMCID: PMC4490973 DOI: 10.1016/j.addr.2015.01.001] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 09/17/2014] [Revised: 11/03/2014] [Accepted: 01/05/2015] [Indexed: 12/15/2022]
Abstract
Although significant progress has been made in experimental high-throughput screening (HTS) of ADME (absorption, distribution, metabolism, excretion) and pharmacokinetic properties, in silico modeling of ADME and toxicity (ADME-Tox) is still indispensable in drug discovery, as it can guide the selection of drug candidates prior to expensive ADME screenings and clinical trials. Compared to other ADME-Tox properties, human oral bioavailability (HOBA) is particularly important but extremely difficult to predict. In this paper, the advances in human oral bioavailability modeling are reviewed. Moreover, our insights into how to construct more accurate and reliable HOBA QSAR and classification models are also discussed.
Affiliation(s)
- Junmei Wang
- Green Center for Systems Biology, The University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX 75390, USA.
- Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
155
Ruiz-Blanco YB, Paz W, Green J, Marrero-Ponce Y. ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins. BMC Bioinformatics 2015; 16:162. [PMID: 25982853 PMCID: PMC4432771 DOI: 10.1186/s12859-015-0586-0] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Received: 01/15/2015] [Accepted: 04/22/2015] [Indexed: 11/10/2022]
Abstract
Background: The exponential growth of protein structural and sequence databases is enabling multifaceted approaches to understanding the long-sought sequence-structure-function relationship. Advances in computation now make it possible to apply well-established data mining and pattern recognition techniques to these data to learn models that effectively relate structure and function. However, extracting meaningful numerical descriptors of protein sequence and structure is a key issue that requires an efficient and widely available solution. Results: We here introduce ProtDCal, a new computational software suite capable of generating tens of thousands of features considering both sequence-based and 3D-structural descriptors. We demonstrate, by means of principal component analysis and Shannon entropy tests, how ProtDCal's sequence-based descriptors provide new and more relevant information not encoded by currently available servers for sequence-based protein feature generation. The wide diversity of the 3D-structure-based features generated by ProtDCal is shown to provide additional complementary information and effectively completes its general protein encoding capability. As a demonstration of the utility of ProtDCal's features, prediction models of N-linked glycosylation sites are trained and evaluated. Classification performance compares favourably with that of contemporary predictors of N-linked glycosylation sites, in spite of not using domain-specific features as input information. Conclusions: ProtDCal provides a friendly and cross-platform graphical user interface, developed in the Java programming language, and is freely available at: http://bioinf.sce.carleton.ca/ProtDCal/. ProtDCal introduces local and group-based encoding, which enhances the diversity of the information captured by the computed features. Furthermore, we have shown that adding structure-based descriptors contributes non-redundant additional information to the features-based characterization of polypeptide systems. This software is intended to provide a useful tool for general-purpose encoding of protein sequences and structures for applications in protein classification, similarity analyses and function prediction.
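The abstract mentions Shannon entropy tests as a way to judge how much information a descriptor carries. A minimal sketch of that idea follows, assuming equal-width binning of the descriptor values across a dataset. The binning scheme, the bin count, and the function name are assumptions for illustration and are not ProtDCal's actual procedure.

```python
import math
from collections import Counter

def shannon_entropy(values, n_bins=10):
    """Shannon entropy (in bits) of a descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / n_bins
    # Map each value to a bin index; the maximum value falls into the last bin.
    bins = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in bins.values())

# Hypothetical descriptor columns (illustrative values only).
constant_descriptor = [3.0, 3.0, 3.0, 3.0]
two_level_descriptor = [0.0, 0.0, 1.0, 1.0]

print(shannon_entropy(constant_descriptor))
print(shannon_entropy(two_level_descriptor, n_bins=2))
```

Under this scheme a descriptor that spreads molecules evenly over many bins scores high, while one that takes nearly the same value everywhere scores close to zero.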
Affiliation(s)
- Yasser B Ruiz-Blanco
- Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada.
- Waldo Paz
- Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Centre of Informatics Studies (CEI), Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba.
- James Green
- Department of Systems and Computer Engineering, Carleton University, Ottawa, ON, Canada.
- Yovani Marrero-Ponce
- Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Facultad de Química y Farmacia, Universidad Central "Marta Abreu" de Las Villas, Road to Camajuani km 5 ½, Santa Clara, CP: 54830, Villa Clara, Cuba; Grupo de Investigación Microbiología y Ambiente (GIMA), Programa de Bacteriología, Facultad Ciencias de la Salud, Universidad de San Buenaventura, Calle Real de Ternera, Cartagena (Bolivar), Colombia.
156
Appell M, Bosma WB. Assessment of the electronic structure and properties of trichothecene toxins using density functional theory. J Hazard Mater 2015; 288:113-123. [PMID: 25698572 DOI: 10.1016/j.jhazmat.2015.01.051] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 09/13/2014] [Revised: 01/17/2015] [Accepted: 01/21/2015] [Indexed: 06/04/2023]
Abstract
A comprehensive quantum chemical study was carried out on 35 type A and B trichothecenes and biosynthetic precursors, including selected derivatives of deoxynivalenol and T-2 toxin. Quantum chemical properties, Natural Bond Orbital (NBO) analysis, and molecular parameters were calculated on structures geometry-optimized at the B3LYP/6-311+G** level. Type B trichothecenes possessed a significantly larger electrophilicity index than the type A trichothecenes studied. Certain hydroxyl groups of deoxynivalenol, nivalenol, and T-2 toxin exhibited considerable rotation during molecular dynamics simulations (5 ps) at the B3LYP/6-31G** level in implicit aqueous solvent. Quantitative structure-activity relationship (QSAR) models were developed to evaluate toxicity and detection using genetic algorithm, principal component, and multilinear analyses. The models suggest that electronegativity and several 2-dimensional topological descriptors contain important information related to trichothecene cytotoxicity, phytotoxicity, immunochemical detection, and cross-reactivity.
Affiliation(s)
- Michael Appell
- Bacterial Foodborne Pathogens and Mycology Research, USDA, ARS, National Center for Agricultural Utilization Research, 1815 N. University St., Peoria, IL 61604, USA.
- Wayne B Bosma
- Mund-Lagowski Department of Chemistry and Biochemistry, Bradley University, 1501 W. Bradley Ave., Peoria, IL 61625, USA.
157
Kurdekar V, Jadhav HR. A new open source data analysis python script for QSAR study and its validation. Med Chem Res 2015. [DOI: 10.1007/s00044-014-1240-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
158
IMMAN: free software for information theory-based chemometric analysis. Mol Divers 2015; 19:305-19. [DOI: 10.1007/s11030-014-9565-z] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Received: 08/29/2014] [Accepted: 12/24/2014] [Indexed: 11/27/2022]
159
Shao CY, Su BH, Tu YS, Lin C, Lin OA, Tseng YJ. CypRules: a rule-based P450 inhibition prediction server. Bioinformatics 2015; 31:1869-71. [DOI: 10.1093/bioinformatics/btv043] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 10/13/2014] [Accepted: 01/18/2015] [Indexed: 11/14/2022]
160
Liu Y, Buendía-Rodríguez G, Peñuelas-Rívas CG, Tan Z, Rívas-Guevara M, Tenorio-Borroto E, Munteanu CR, Pazos A, González-Díaz H. Experimental and computational studies of fatty acid distribution networks. Mol Biosyst 2015; 11:2964-77. [DOI: 10.1039/c5mb00325c] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/01/2023]
Abstract
A new PT-LFER model is useful for predicting a distribution network in terms of specific fatty acid distribution.
Affiliation(s)
- Yong Liu
- Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, Mexico; Key Laboratory of Subtropical Agro-ecological Engineering
- Germán Buendía-Rodríguez
- National Center for Disciplinary Research on Animal Physiology and Breeding, National Institute of Forestry, Agriculture and Livestock Research, Queretaro, Mexico
- Zhiliang Tan
- Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, P. R. China
- María Rívas-Guevara
- Ethnobiology and Biodiversity Research Center, Chapingo Autonomous University, Texcoco, Mexico
- Esvieta Tenorio-Borroto
- Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, Mexico
- Humberto González-Díaz
- Department of Organic Chemistry II, Faculty of Science and Technology, University of the Basque Country UPV/EHU, Leioa, Spain
161
Cortés-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Méndez-Lucio O, IJzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, Bender A. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. MedChemComm 2015. [DOI: 10.1039/c4md00216d] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Indexed: 11/21/2022]
Abstract
Proteochemometric (PCM) modelling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously.
Affiliation(s)
- Isidro Cortés-Ciriano
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Qurrat Ul Ain
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
- Eelke B. Lenselink
- Division of Medicinal Chemistry, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Oscar Méndez-Lucio
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
- Adriaan P. IJzerman
- Division of Medicinal Chemistry, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Gerd Wohlfahrt
- Computer-Aided Drug Design, Orion Pharma, FIN-02101 Espoo, Finland
- Peteris Prusis
- Computer-Aided Drug Design, Orion Pharma, FIN-02101 Espoo, Finland
- Thérèse E. Malliavin
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Gerard J. P. van Westen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Andreas Bender
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
162
García-Jacas CR, Aguilera-Mendoza L, González-Pérez R, Marrero-Ponce Y, Acevedo-Martínez L, Barigye SJ, Avdeenko T. Multi-Server Approach for High-Throughput Molecular Descriptors Calculation based on Multi-Linear Algebraic Maps. Mol Inform 2014; 34:60-9. [DOI: 10.1002/minf.201400086] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 06/22/2014] [Accepted: 09/17/2014] [Indexed: 11/10/2022]
163
Baumann D, Baumann K. Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation. J Cheminform 2014; 6:47. [PMID: 25506400 PMCID: PMC4260165 DOI: 10.1186/s13321-014-0047-1] [Citation(s) in RCA: 83] [Impact Index Per Article: 8.3] [Received: 07/08/2014] [Accepted: 10/30/2014] [Indexed: 01/17/2023]
Abstract
Background: Generally, QSAR modelling requires both model selection and validation since there is no a priori knowledge about the optimal QSAR model. Prediction errors (PE) are frequently used to select and to assess the models under study. Reliable estimation of prediction errors is challenging, especially under model uncertainty, and requires independent test objects. These test objects must not be involved in model building or in model selection. Double cross-validation, sometimes also termed nested cross-validation, offers an attractive possibility to generate test data and to select QSAR models since it uses the data very efficiently. Nevertheless, there is a controversy in the literature with respect to the reliability of double cross-validation under model uncertainty. Moreover, systematic studies investigating the adequate parameterization of double cross-validation are still missing. Here, the cross-validation design in the inner loop and the influence of the test set size in the outer loop are systematically studied for regression models in combination with variable selection. Methods: Simulated and real data are analysed with double cross-validation to identify important factors for the resulting model quality. For the simulated data, a bias-variance decomposition is provided. Results: The prediction errors of QSAR/QSPR regression models in combination with variable selection depend to a large degree on the parameterization of double cross-validation. While the parameters for the inner loop of double cross-validation mainly influence bias and variance of the resulting models, the parameters for the outer loop mainly influence the variability of the resulting prediction error estimate. Conclusions: Double cross-validation reliably and unbiasedly estimates prediction errors under model uncertainty for regression models. Compared to a single test set, double cross-validation provided a more realistic picture of model quality and should be preferred over a single test set. Electronic supplementary material: The online version of this article (doi:10.1186/s13321-014-0047-1) contains supplementary material, which is available to authorized users.
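The structure of double (nested) cross-validation described in this abstract can be sketched in a few lines. The example below is a toy under stated assumptions: it swaps the paper's variable selection for a much simpler model-selection step (choosing the penalty of a one-dimensional, no-intercept ridge regression), uses synthetic data, and all function names are hypothetical. The point it illustrates is that the outer test fold never takes part in model building or in model selection.

```python
import random

def fit_ridge_1d(xs, ys, lam):
    """Slope of a no-intercept 1-D ridge fit: argmin sum (y - b*x)^2 + lam*b^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / (sxx + lam)

def mse(slope, xs, ys):
    return sum((y - slope * x) ** 2 for x, y in zip(xs, ys)) / len(xs)

def k_folds(indices, k):
    """Split a list of indices into k disjoint, nearly equal folds."""
    return [indices[i::k] for i in range(k)]

def select_lambda(xs, ys, lambdas, k_inner):
    """Inner loop: pick the penalty with the lowest cross-validated MSE."""
    idx = list(range(len(xs)))
    folds = k_folds(idx, k_inner)
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for fold in folds:
            held = set(fold)
            tr = [i for i in idx if i not in held]
            slope = fit_ridge_1d([xs[i] for i in tr], [ys[i] for i in tr], lam)
            err += mse(slope, [xs[i] for i in fold], [ys[i] for i in fold])
        if err / k_inner < best_err:
            best_lam, best_err = lam, err / k_inner
    return best_lam

def double_cv(xs, ys, lambdas, k_outer=5, k_inner=4, seed=0):
    """Outer loop: estimate the prediction error on folds unseen by model selection."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    outer_errors = []
    for test in k_folds(idx, k_outer):
        held = set(test)
        train = [i for i in idx if i not in held]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        lam = select_lambda(tx, ty, lambdas, k_inner)  # selection uses training data only
        slope = fit_ridge_1d(tx, ty, lam)
        outer_errors.append(mse(slope, [xs[i] for i in test], [ys[i] for i in test]))
    return sum(outer_errors) / len(outer_errors)

# Synthetic data (assumption): y = 2x plus Gaussian noise with standard deviation 0.1.
rng = random.Random(1)
xs = [rng.uniform(-1, 1) for _ in range(60)]
ys = [2.0 * x + rng.gauss(0, 0.1) for x in xs]
estimate = double_cv(xs, ys, lambdas=[0.0, 0.1, 1.0, 10.0])
print(f"double-CV estimate of the prediction MSE: {estimate:.4f}")
```

The values of `k_inner` and `k_outer` correspond to the inner-loop and outer-loop parameters whose influence the abstract discusses.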
Affiliation(s)
- Désirée Baumann
- Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D-38106 Braunschweig, Germany
- Knut Baumann
- Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, Beethovenstrasse 55, D-38106 Braunschweig, Germany
164
Ng HW, Zhang W, Shu M, Luo H, Ge W, Perkins R, Tong W, Hong H. Competitive molecular docking approach for predicting estrogen receptor subtype α agonists and antagonists. BMC Bioinformatics 2014; 15 Suppl 11:S4. [PMID: 25349983 PMCID: PMC4251048 DOI: 10.1186/1471-2105-15-s11-s4] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Indexed: 12/19/2022]
Abstract
Background: Endocrine disrupting chemicals (EDCs) are exogenous compounds that interfere with the endocrine system of vertebrates, often through direct or indirect interactions with nuclear receptor proteins. Estrogen receptors (ERs) are particularly important protein targets and many EDCs are ER binders, capable of altering normal homeostatic transcription and signaling pathways. An estrogenic xenobiotic can bind ER as either an agonist or antagonist to increase or inhibit transcription, respectively. The receptor conformations in the complexes of ER bound with agonists and antagonists are different and dependent on interactions with co-regulator proteins that vary across tissue type. Assessment of chemical endocrine disruption potential depends not only on binding affinity to ERs, but also on changes that may alter the receptor conformation and its ability to subsequently bind DNA response elements and initiate transcription. Using both agonist and antagonist conformations of the ERα, we developed an in silico approach that can be used to differentiate agonist versus antagonist status of potential binders. Methods: The approach combined separate molecular docking models for ER agonist and antagonist conformations. The ability of this approach to differentiate agonists and antagonists was first evaluated using true agonists and antagonists extracted from the crystal structures available in the protein data bank (PDB), and then further validated using a larger set of ligands from the literature. The usefulness of the approach was demonstrated with enrichment analysis in data sets with a large number of decoy ligands. Results: The performance of individual agonist and antagonist docking models was found comparable to similar models in the literature. When combined in a competitive docking approach, they provided the ability to discriminate agonists from antagonists with good accuracy, as well as the ability to efficiently select true agonists and antagonists from decoys during enrichment analysis. Conclusion: This approach enables evaluation of potential ER biological function changes caused by chemicals bound to the receptor which, in turn, allows the assessment of a chemical's endocrine disrupting potential. The approach can be used not only by regulatory authorities to perform risk assessments on potential EDCs but also by the industry in drug discovery projects to screen for potential agonists and antagonists.
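One plausible reading of the competitive docking idea is that a ligand is docked into both receptor conformations and assigned to whichever gives the more favourable score. The sketch below shows only that comparison step. The decision rule, the score convention (lower means better), the function name, and the example scores are all assumptions for illustration; the abstract does not spell out the exact rule, and no docking program is called here.

```python
def classify_by_competitive_docking(agonist_score, antagonist_score):
    """Assign a ligand to the receptor conformation with the better docking score.

    Assumed convention: lower (more negative) scores mean more favourable binding.
    """
    if agonist_score < antagonist_score:
        return "agonist"
    if antagonist_score < agonist_score:
        return "antagonist"
    return "undetermined"

# Hypothetical scores for two made-up ligands (not from the paper).
ligand_scores = {
    "ligand_A": {"agonist": -9.1, "antagonist": -7.4},
    "ligand_B": {"agonist": -6.0, "antagonist": -8.8},
}
calls = {
    name: classify_by_competitive_docking(s["agonist"], s["antagonist"])
    for name, s in ligand_scores.items()
}
print(calls)
```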
165
García-Jacas CR, Marrero-Ponce Y, Acevedo-Martínez L, Barigye SJ, Valdés-Martiní JR, Contreras-Torres E. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps. J Comput Chem 2014; 35:1395-409. [PMID: 24889018 DOI: 10.1002/jcc.23640] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/18/2014] [Revised: 04/22/2014] [Accepted: 04/23/2014] [Indexed: 11/12/2022]
Abstract
The present report introduces the QuBiLS-MIDAS software, part of the ToMoCoMD-CARDD suite, for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. Thus, it is a unique software package that computes these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations, and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. The software comprises a user-friendly desktop interface and an Application Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow easy integration into other software for chemoinformatics applications. The program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The complexity studies of the main algorithms demonstrate that they were implemented efficiently relative to their trivial implementation. Lastly, the performance tests reveal that the software behaves suitably when the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps, and it can be used freely to perform chemoinformatics studies.
Affiliation(s)
- César R García-Jacas
- Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba; Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
166
Melagraki G, Afantitis A. Enalos InSilicoNano platform: an online decision support tool for the design and virtual screening of nanoparticles. RSC Adv 2014. [DOI: 10.1039/c4ra07756c] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Indexed: 11/21/2022]
Abstract
A QNAR model, available online through Enalos InSilicoNano platform, has been developed and validated for the risk assessment of nanoparticles (NPs).
167
Lagunin AA, Goel RK, Gawande DY, Pahwa P, Gloriozova TA, Dmitriev AV, Ivanov SM, Rudik AV, Konova VI, Pogodin PV, Druzhilovsky DS, Poroikov VV. Chemo- and bioinformatics resources for in silico drug discovery from medicinal plants beyond their traditional use: a critical review. Nat Prod Rep 2014; 31:1585-611. [DOI: 10.1039/c4np00068d] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Indexed: 01/06/2023]
Abstract
An overview of databases and in silico tools for discovery of the hidden therapeutic potential of medicinal plants.
Affiliation(s)
- Alexey A. Lagunin
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia; Russian National Research Medical University, Medico-Biologic Faculty, Moscow, Russia
- Rajesh K. Goel
- Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala-147002, India
- Dinesh Y. Gawande
- Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala-147002, India
- Priynka Pahwa
- Department of Pharmaceutical Sciences and Drug Research, Punjabi University, Patiala-147002, India
- Sergey M. Ivanov
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia
- Anastassia V. Rudik
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia
- Varvara I. Konova
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia
- Pavel V. Pogodin
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia; Russian National Research Medical University, Medico-Biologic Faculty, Moscow, Russia
- Vladimir V. Poroikov
- Orekhovich Institute of Biomedical Chemistry of Rus. Acad. Med. Sci, Moscow, Russia; Russian National Research Medical University, Medico-Biologic Faculty, Moscow, Russia
168
Abstract
Computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades. These methods are broadly classified as either structure-based or ligand-based methods. Structure-based methods are in principle analogous to high-throughput screening in that both target and ligand structure information is imperative. Structure-based approaches include ligand docking, pharmacophore, and ligand design methods. The article discusses the theory behind the most important methods and recent successful applications. Ligand-based methods use only ligand information for predicting activity depending on its similarity/dissimilarity to previously known active ligands. We review widely used ligand-based methods such as ligand-based pharmacophores, molecular descriptors, and quantitative structure-activity relationships. In addition, important tools such as target/ligand databases, homology modeling, and ligand fingerprint methods, which are necessary for successful implementation of various computer-aided drug discovery/design methods in a drug discovery campaign, are discussed. Finally, computational methods for toxicity prediction and optimization for favorable physiologic properties are discussed with successful examples from the literature.
Affiliation(s)
- Gregory Sliwoski
- Center for Structural Biology, 465 21st Ave South, BIOSCI/MRBIII, Room 5144A, Nashville, TN 37232-8725.
169
Subramanian V, Prusis P, Pietilä LO, Xhaard H, Wohlfahrt G. Visually interpretable models of kinase selectivity related features derived from field-based proteochemometrics. J Chem Inf Model 2013; 53:3021-30. [PMID: 24116714 DOI: 10.1021/ci400369z] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/16/2023]
Abstract
Achieving selectivity for small organic molecules toward biological targets is a main focus of drug discovery but has proven difficult, for example, for kinases because of the high similarity of their ATP binding pockets. To support the design of more selective inhibitors with fewer side effects or with altered target profiles for improved efficacy, we developed a method combining ligand- and receptor-based information. Conventional QSAR models enable one to study the interactions of multiple ligands toward a single protein target, but in order to understand the interactions between multiple ligands and multiple proteins, we have used proteochemometrics, a multivariate statistics method that aims to combine and correlate both ligand and protein descriptions with affinity to receptors. The superimposed binding sites of 50 unique kinases were described by molecular interaction fields derived from knowledge-based potentials and Schrödinger's WaterMap software. Eighty ligands were described by Mold2, Open Babel, and Volsurf descriptors. Partial least-squares regression including cross-terms, which describe the selectivity, was used for model building. This combination of methods allows interpretation and easy visualization of the models within the context of ligand binding pockets, which can be translated readily into the design of novel inhibitors.
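The cross-terms mentioned in this abstract are commonly built as products of ligand and protein descriptors, so that a model can capture effects that depend on a specific ligand-protein pairing. The sketch below shows one simple way to assemble such a feature vector. The function name and the descriptor values are hypothetical, and the pairwise-product construction is an assumption about the general technique rather than the authors' exact implementation; the regression step itself is not shown.

```python
def pcm_features(ligand_desc, protein_desc):
    """Concatenate ligand descriptors, protein descriptors and their pairwise products."""
    # One cross-term per (ligand descriptor, protein descriptor) pair.
    cross_terms = [l * p for l in ligand_desc for p in protein_desc]
    return list(ligand_desc) + list(protein_desc) + cross_terms

# Hypothetical descriptor vectors for one ligand-kinase pair (illustrative values only).
ligand = [1.0, 2.0]
kinase = [3.0, 0.5]
features = pcm_features(ligand, kinase)
print(features)
```

With m ligand descriptors and n protein descriptors this yields m + n + m*n features per pair, which is one reason a latent-variable method such as partial least squares is a natural fit for the regression step.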
|
170
|
Chen M, Hong H, Fang H, Kelly R, Zhou G, Borlak J, Tong W. Quantitative Structure-Activity Relationship Models for Predicting Drug-Induced Liver Injury Based on FDA-Approved Drug Labeling Annotation and Using a Large Collection of Drugs. Toxicol Sci 2013; 136:242-9. [DOI: 10.1093/toxsci/kft189] [Citation(s) in RCA: 69] [Impact Index Per Article: 6.3] [Indexed: 01/08/2023] Open
|
171
|
Shen J, Xu L, Fang H, Richard AM, Bray JD, Judson RS, Zhou G, Colatsky TJ, Aungst JL, Teng C, Harris SC, Ge W, Dai SY, Su Z, Jacobs AC, Harrouk W, Perkins R, Tong W, Hong H. EADB: an estrogenic activity database for assessing potential endocrine activity. Toxicol Sci 2013; 135:277-91. [PMID: 23897986 DOI: 10.1093/toxsci/kft164] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.7] [Indexed: 11/13/2022] Open
Abstract
Endocrine-active chemicals can potentially have adverse effects on both humans and wildlife. They can interfere with the body's endocrine system through direct or indirect interactions with many protein targets. Estrogen receptors (ERs) are one of the major targets, and many endocrine disruptors are estrogenic and affect the normal estrogen signaling pathways. However, ERs can also serve as therapeutic targets for various medical conditions, such as menopausal symptoms, osteoporosis, and ER-positive breast cancer. Because of the decades-long interest in the safety and therapeutic utility of estrogenic chemicals, a large number of chemicals have been assayed for estrogenic activity, but these data exist in various sources and different formats that restrict the ability of regulatory and industry scientists to utilize them fully for assessing risk-benefit. To address this issue, we have developed an Estrogenic Activity Database (EADB; http://www.fda.gov/ScienceResearch/BioinformaticsTools/EstrogenicActivityDatabaseEADB/default.htm) and made it freely available to the public. EADB contains 18,114 estrogenic activity data points collected for 8212 chemicals tested in 1284 binding, reporter gene, cell proliferation, and in vivo assays in 11 different species. The chemicals cover a broad chemical structure space and the data span a wide range of activities. A set of tools allow users to access EADB and evaluate potential endocrine activity of chemicals. As a case study, a classification model was developed using EADB for predicting ER binding of chemicals.
Affiliation(s)
- Jie Shen
- * Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas 72079
|
172
|
Choi SS, Kim JS, Valerio LG, Sadrieh N. In silico modeling to predict drug-induced phospholipidosis. Toxicol Appl Pharmacol 2013; 269:195-204. [DOI: 10.1016/j.taap.2013.03.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 12/08/2012] [Revised: 03/01/2013] [Accepted: 03/03/2013] [Indexed: 11/26/2022]
|
173
|
Zakharov AV, Peach ML, Sitzmann M, Filippov IV, McCartney HJ, Smith LH, Pugliese A, Nicklaus MC. Computational tools and resources for metabolism-related property predictions. 2. Application to prediction of half-life time in human liver microsomes. Future Med Chem 2012; 4:1933-44. [PMID: 23088274 PMCID: PMC4117347 DOI: 10.4155/fmc.12.152] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.3] [Indexed: 01/18/2023] Open
Abstract
BACKGROUND: The most important factor affecting metabolic excretion of compounds from the body is their half-life time. This provides an indication of compound stability of, for example, drug molecules. We report on our efforts to develop QSAR models for metabolic stability of compounds, based on in vitro half-life assay data measured in human liver microsomes. METHOD: A variety of QSAR models generated using different statistical methods and descriptor sets implemented in both open-source and commercial programs (KNIME, GUSAR and StarDrop) were analyzed. The models obtained were compared using four different external validation sets from public and commercial data sources, including two smaller sets of in vivo half-life data in humans. CONCLUSION: In many cases, the accuracy of prediction achieved on one external test set did not correspond to the results achieved with another test set. The most predictive models were used for predicting the metabolic stability of compounds from the open NCI database, the results of which are publicly available on the NCI/CADD Group web server (http://cactus.nci.nih.gov).
Affiliation(s)
- Alexey V Zakharov
- CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Megan L Peach
- Basic Science Program, SAIC-Frederick, SAIC, Inc., CADD Group, Chemical Biology Laboratory, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Markus Sitzmann
- CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Igor V Filippov
- Basic Science Program, SAIC-Frederick, SAIC, Inc., CADD Group, Chemical Biology Laboratory, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Heather J McCartney
- CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Interdisciplinary Graduate Program, Biomedical Sciences, Vanderbilt University, Nashville, TN 37240, USA
- Layton H Smith
- Conrad Prebys Center for Chemical Genomics, Sanford Burnham Medical Research Institute, 6400 Sanger Road, Orlando, FL 32827, USA
- Angelo Pugliese
- CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
- Computer-Aided Drug Design at Cancer Research UK, Beatson Laboratories, Drug Discovery Programme, Switchback Road, Bearsden, Glasgow, G61 1BD, UK
- Marc C Nicklaus
- CADD Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, Frederick National Laboratory for Cancer Research, Building 376, 376 Boyles Street, Frederick, MD 21702, USA
|
174
|
Barigye SJ, Marrero-Ponce Y, Martínez-López Y, Torrens F, Artiles-Martínez LM, Pino-Urias RW, Martínez-Santiago O. Relations frequency hypermatrices in mutual, conditional, and joint entropy-based information indices. J Comput Chem 2012; 34:259-74. [DOI: 10.1002/jcc.23123] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 05/09/2012] [Revised: 07/05/2012] [Accepted: 08/22/2012] [Indexed: 11/10/2022]
|
175
|
A computational study on thiourea analogs as potent MK-2 inhibitors. Int J Mol Sci 2012; 13:7057-7079. [PMID: 22837679 PMCID: PMC3397511 DOI: 10.3390/ijms13067057] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/23/2012] [Revised: 05/25/2012] [Accepted: 05/28/2012] [Indexed: 11/17/2022] Open
Abstract
Mitogen-activated protein kinase-activated protein kinase 2 (MK-2) has been identified as a drug target for the treatment of inflammatory diseases. In this work, a series of thiourea analogs acting as potent MK-2 inhibitors was studied using 3D-QSAR, molecular docking, and molecular dynamics simulations, with the aim of further improving activity. The optimal 3D models exhibit high statistical significance, especially the CoMFA model, with r²(ncv) and q² values of 0.974 and 0.536 for the internal validation, and r²(pred) and r²(m) values of 0.910 and 0.723 for the external validation and Roy's index, respectively. In addition, more rigorous validation criteria suggested by Tropsha were also employed to check the built models. Graphic representation of the results, as contoured 3D coefficient plots, also provides clues for reasonable modification of the molecules: (i) a bulky, electron-rich substituent at the C5 position of the pyrazine ring is required to enhance the potency; (ii) an H-bond acceptor group at the C3 position of the pyrazine ring is likely to increase MK-2 inhibition; (iii) a small, electropositive substituent acting as a hydrogen bond donor at the C2 position of the oxazolone ring is favored. In addition, several amino acid residues were identified as playing an important role in MK-2 inhibition. The agreement between 3D-QSAR, molecular docking and molecular dynamics simulations also supports the rationality of the developed models. These results, we hope, may be helpful in designing novel and potential MK-2 inhibitors.
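The internal-validation statistics quoted above (a non-cross-validated r² and a cross-validated q²) can be illustrated with a small sketch. It assumes a one-descriptor ordinary least-squares model and leave-one-out cross-validation, with q² defined as 1 − PRESS/SS_tot using the mean of all observed activities; the abstract does not state which cross-validation scheme or convention was used, and the activity data below are hypothetical, not taken from the paper.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def r2_ncv(xs, ys):
    """Non-cross-validated r2: fit and evaluate on the same compounds."""
    a, b = fit_line(xs, ys)
    my = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q2_loo(xs, ys):
    """Leave-one-out q2: each compound is predicted by a model fitted without it."""
    my = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - press / ss_tot


# Hypothetical descriptor values and activities (e.g. pIC50) for six compounds.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
print(round(r2_ncv(xs, ys), 3), round(q2_loo(xs, ys), 3))
```

For a least-squares fit, q² computed this way never exceeds the non-cross-validated r², which is why a large gap between the two is usually read as a sign of overfitting.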
|
176
|
Hao M, Zhang S, Qiu J. Toward the prediction of FBPase inhibitory activity using chemoinformatic methods. Int J Mol Sci 2012; 13:7015-7037. [PMID: 22837677 PMCID: PMC3397509 DOI: 10.3390/ijms13067015] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/23/2012] [Revised: 05/18/2012] [Accepted: 05/31/2012] [Indexed: 01/08/2023] Open
Abstract
Chemoinformatic methods are used here to predict FBPase inhibitory activity. A genetic algorithm-random forest coupled method (GA-RF) was proposed to predict fructose 1,6-bisphosphatase (FBPase) inhibitors to treat type 2 diabetes mellitus using the Mold2 molecular descriptors. A data set of 126 oxazole and thiazole analogs was used to derive the GA-RF model, yielding a non-cross-validated correlation coefficient r²(ncv) of 0.96 and a cross-validated r²(cv) of 0.67 for the training set. The statistically significant model was validated by a test set of 64 compounds, producing a prediction correlation coefficient r²(pred) of 0.90. More importantly, the resulting GA-RF model also satisfied various criteria suggested by Tropsha and Roy, with r²(o) and r²(m) values of 0.90 and 0.83, respectively. For comparison with the GA-RF model, a pure RF model based on the full descriptor set was also developed for the same data set. The resulting GA-RF model, with significant internal and external predictive capacity, is useful for predicting potential oxazole and thiazole FBPase inhibitors prior to chemical synthesis in drug discovery programs.
Affiliation(s)
- Jieshan Qiu
- Author to whom correspondence should be addressed; Tel.: +86-411-84986024; Fax: +86-411-84986080
|
177
|
Tie Y, McPhail B, Hong H, Pearce BA, Schnackenberg LK, Ge W, Buzatu DA, Wilkes JG, Fuscoe JC, Tong W, Fowler BA, Beger RD, Demchuk E. Modeling chemical interaction profiles: II. Molecular docking, spectral data-activity relationship, and structure-activity relationship models for potent and weak inhibitors of cytochrome P450 CYP3A4 isozyme. Molecules 2012; 17:3407-60. [PMID: 22421793 PMCID: PMC6268819 DOI: 10.3390/molecules17033407] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 01/20/2012] [Revised: 02/27/2012] [Accepted: 02/28/2012] [Indexed: 01/15/2023] Open
Abstract
Polypharmacy increasingly has become a topic of public health concern, particularly as the U.S. population ages. Drug labels often contain insufficient information to enable the clinician to safely use multiple drugs. Because many of the drugs are bio-transformed by cytochrome P450 (CYP) enzymes, inhibition of CYP activity has long been associated with potentially adverse health effects. In an attempt to reduce the uncertainty pertaining to CYP-mediated drug-drug/chemical interactions, an interagency collaborative group developed a consensus approach to prioritizing information concerning CYP inhibition. The consensus involved computational molecular docking, spectral data-activity relationship (SDAR), and structure-activity relationship (SAR) models that addressed the clinical potency of CYP inhibition. The models were built upon chemicals that were categorized as either potent or weak inhibitors of the CYP3A4 isozyme. The categorization was carried out using information from clinical trials because currently available in vitro high-throughput screening data were not fully representative of the in vivo potency of inhibition. During categorization it was found that compounds that break the Lipinski rule of five by molecular weight were about twice as likely to be inhibitors of CYP3A4 as those that obey the rule. Similarly, among inhibitors that break the rule, potent inhibitors were 2–3 times more frequent. The molecular docking classification relied on logistic regression, by which the docking scores from different docking algorithms, CYP3A4 three-dimensional structures, and binding sites on them were combined in a unified probabilistic model. The SDAR models employed a multiple linear regression approach applied to binned 1D 13C-NMR and 1D 15N-NMR spectral descriptors. Structure-based and physical-chemical descriptors were used as the basis for developing SAR models by the decision forest method.
Thirty-three potent inhibitors and 88 weak inhibitors of CYP3A4 were used to train the models. Using these models, a synthetic majority rules consensus classifier was implemented, while the confidence of estimation was assigned following the percent agreement strategy. The classifier was applied to a testing set of 120 inhibitors not included in the development of the models. Five compounds of the test set, including known strong inhibitors dalfopristin and tioconazole, were classified as probable potent inhibitors of CYP3A4. Other known strong inhibitors, such as lopinavir, oltipraz, quercetin, raloxifene, and troglitazone, were among 18 compounds classified as plausible potent inhibitors of CYP3A4. The consensus estimation of inhibition potency is expected to aid in the nomination of pharmaceuticals, dietary supplements, environmental pollutants, and occupational and other chemicals for in-depth evaluation of the CYP3A4 inhibitory activity. It may serve also as an estimate of chemical interactions via CYP3A4 metabolic pharmacokinetic pathways occurring through polypharmacy and nutritional and environmental exposures to chemical mixtures.
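The majority-rules consensus with percent-agreement confidence described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each of the three model families (docking, SDAR, SAR) contributes one categorical vote per compound, and the function name `consensus` and the example votes are hypothetical.

```python
from collections import Counter


def consensus(votes):
    """Return (majority label, fraction of models agreeing with it).

    Assumption: an odd number of binary votes, so no ties can occur.
    """
    counts = Counter(votes)
    label, n_agree = counts.most_common(1)[0]
    return label, n_agree / len(votes)


# Hypothetical votes for one compound from docking, SDAR, and SAR models.
votes = ["potent", "weak", "potent"]
label, agreement = consensus(votes)
print(label, f"{agreement:.0%}")
```

Under this reading, full agreement among the models and a bare majority would map to different confidence levels; the paper's labels "probable" and "plausible" suggest such a grading, but the exact thresholds are not given in the abstract.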
Affiliation(s)
- Yunfeng Tie
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (Y.T.); (B.M.); (B.A.F.)
- Brooks McPhail
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (Y.T.); (B.M.); (B.A.F.)
- Huixiao Hong
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Bruce A. Pearce
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Laura K. Schnackenberg
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Weigong Ge
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Dan A. Buzatu
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Jon G. Wilkes
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- James C. Fuscoe
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Weida Tong
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Bruce A. Fowler
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (Y.T.); (B.M.); (B.A.F.)
- Richard D. Beger
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (D.A.B.); (J.G.W.); (J.C.F.); (W.T.); (R.D.B.)
- Eugene Demchuk
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (Y.T.); (B.M.); (B.A.F.)
- Department of Basic Pharmaceutical Sciences, West Virginia University, Morgantown, WV 26506-9530, USA
- Author to whom correspondence should be addressed; Tel.: +1-770-488-3327; Fax: +1-404-248-4142
|
178
|
McPhail B, Tie Y, Hong H, Pearce BA, Schnackenberg LK, Ge W, Fuscoe JC, Tong W, Buzatu DA, Wilkes JG, Fowler BA, Demchuk E, Beger RD. Modeling chemical interaction profiles: I. Spectral data-activity relationship and structure-activity relationship models for inhibitors and non-inhibitors of cytochrome P450 CYP3A4 and CYP2D6 isozymes. Molecules 2012; 17:3383-406. [PMID: 22421792 PMCID: PMC6268752 DOI: 10.3390/molecules17033383] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 01/20/2012] [Revised: 02/27/2012] [Accepted: 02/28/2012] [Indexed: 02/07/2023] Open
Abstract
An interagency collaboration was established to model chemical interactions that may cause adverse health effects when an exposure to a mixture of chemicals occurs. Many of these chemicals (drugs, pesticides, and environmental pollutants) interact at the level of metabolic biotransformations mediated by cytochrome P450 (CYP) enzymes. In the present work, spectral data-activity relationship (SDAR) and structure-activity relationship (SAR) approaches were used to develop machine-learning classifiers of inhibitors and non-inhibitors of the CYP3A4 and CYP2D6 isozymes. The models were built upon 602 reference pharmaceutical compounds whose interactions have been deduced from clinical data, and 100 additional chemicals that were used to evaluate model performance in an external validation (EV) test. SDAR is an innovative modeling approach that relies on discriminant analysis applied to binned nuclear magnetic resonance (NMR) spectral descriptors. In the present work, both 1D 13C and 1D 15N-NMR spectra were used together in a novel implementation of the SDAR technique. It was found that increasing the binning size of 1D 13C-NMR and 15N-NMR spectra caused an increase in the tenfold cross-validation (CV) performance in terms of both the rate of correct classification and sensitivity. The results of SDAR modeling were verified using SAR. For SAR modeling, a decision forest approach involving from 6 to 17 Mold2 descriptors in a tree was used. Average rates of correct classification of SDAR and SAR models in a hundred CV tests were 60% and 61% for CYP3A4, and 62% and 70% for CYP2D6, respectively. The rates of correct classification of SDAR and SAR models in the EV test were 73% and 86% for CYP3A4, and 76% and 90% for CYP2D6, respectively. Thus, both SDAR and SAR methods demonstrated a comparable performance in modeling a large set of structurally diverse data.
Based on unique NMR structural descriptors, the new SDAR modeling method complements the existing SAR techniques, providing an independent estimator that can increase confidence in a structure-activity assessment. When modeling was applied to hazardous environmental chemicals, it was found that up to 20% of them may be substrates and up to 10% of them may be inhibitors of the CYP3A4 and CYP2D6 isoforms. The developed models provide a rare opportunity for the environmental health branch of the public health service to extrapolate to hazardous chemicals directly from human clinical data. Therefore, the pharmacological and environmental health branches are both expected to benefit from these reported models.
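The binned NMR spectral descriptors that SDAR relies on can be sketched as a fixed-width histogram over chemical shifts. The sketch below covers only the binning step, not the discriminant analysis; the 0 to 240 ppm range, the bin widths, the function name `bin_shifts`, and the peak list are all assumptions chosen for illustration and are not taken from the paper.

```python
def bin_shifts(shifts_ppm, lo=0.0, hi=240.0, width=2.0):
    """Count chemical shifts (ppm) falling into fixed-width bins on [lo, hi)."""
    n_bins = round((hi - lo) / width)
    counts = [0] * n_bins
    for shift in shifts_ppm:
        if lo <= shift < hi:
            # Clamp guards against floating-point edge cases near `hi`.
            index = min(int((shift - lo) // width), n_bins - 1)
            counts[index] += 1
    return counts


# Hypothetical 13C peak positions (ppm) for one compound.
peaks = [21.3, 128.5, 128.9, 170.2]
narrow = bin_shifts(peaks, width=2.0)   # 120 descriptors
wide = bin_shifts(peaks, width=10.0)    # 24 descriptors
print(len(narrow), len(wide))
```

Wider bins give fewer, denser descriptors, which is one plausible reason the abstract reports better cross-validation performance as bin size increases; the abstract itself does not explain the effect.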
Affiliation(s)
- Brooks McPhail
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (B.M.); (Y.T.); (B.A.F.)
- Yunfeng Tie
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (B.M.); (Y.T.); (B.A.F.)
- Huixiao Hong
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Bruce A. Pearce
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Laura K. Schnackenberg
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Weigong Ge
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- James C. Fuscoe
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Weida Tong
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Dan A. Buzatu
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Jon G. Wilkes
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
- Bruce A. Fowler
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (B.M.); (Y.T.); (B.A.F.)
- Eugene Demchuk
- Division of Toxicology and Environmental Medicine, Agency for Toxic Substances and Disease Registry, Atlanta, GA 30333, USA; (B.M.); (Y.T.); (B.A.F.)
- Department of Basic Pharmaceutical Sciences, West Virginia University, Morgantown, WV 26506-9530, USA
- Author to whom correspondence should be addressed; Tel.: +1-770-488-3327; Fax: +1-404-248-4142
- Richard D. Beger
- Division of Systems Biology, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA; (H.H.); (B.A.P.); (L.K.S.); (W.G.); (J.C.F.); (W.T.); (D.A.B.); (J.G.W.); (R.D.B.)
|
179
|
Sacan A, Ekins S, Kortagere S. Applications and limitations of in silico models in drug discovery. Methods Mol Biol 2012; 910:87-124. [PMID: 22821594 DOI: 10.1007/978-1-61779-965-5_6] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 02/07/2023]
Abstract
Drug discovery in the late twentieth and early twenty-first century has witnessed a myriad of changes that were adopted to predict whether a compound is likely to be successful, or conversely enable identification of molecules with liabilities as early as possible. These changes include integration of in silico strategies for lead design and optimization that perform complementary roles to that of the traditional in vitro and in vivo approaches. The in silico models are facilitated by the availability of large datasets associated with high-throughput screening, bioinformatics algorithms to mine and annotate the data from a target perspective, and chemoinformatics methods to integrate chemistry methods into the lead design process. This chapter highlights the applications of some of these methods and their limitations. We hope this serves as an introduction to in silico drug discovery.
Affiliation(s)
- Ahmet Sacan
- School of Biomedical Engineering, Drexel University, Philadelphia, PA, USA
|
180
|
Abstract
Computer-aided drug design plays a vital role in drug discovery and development and has become an indispensable tool in the pharmaceutical industry. Computational medicinal chemists can take advantage of all kinds of software and resources in the computer-aided drug design field for the purposes of discovering and optimizing biologically active compounds. This article reviews software and other resources related to computer-aided drug design approaches, putting particular emphasis on structure-based drug design, ligand-based drug design, chemical databases and chemoinformatics tools.
|
181
|
Liu Z, Kelly R, Fang H, Ding D, Tong W. Comparative analysis of predictive models for nongenotoxic hepatocarcinogenicity using both toxicogenomics and quantitative structure-activity relationships. Chem Res Toxicol 2011; 24:1062-70. [PMID: 21627106 DOI: 10.1021/tx2000637] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Abstract
The primary testing strategy to identify nongenotoxic carcinogens largely relies on the 2-year rodent bioassay, which is time-consuming and labor-intensive. There is an increasing effort to develop alternative approaches to prioritize the chemicals for, supplement, or even replace the cancer bioassay. In silico approaches based on quantitative structure-activity relationships (QSAR) are rapid and inexpensive and thus have been investigated for such purposes. A slightly more expensive approach based on short-term animal studies with toxicogenomics (TGx) represents another attractive option for this application. Thus, the primary questions are how much better predictive performance using short-term TGx models can be achieved compared to that of QSAR models, and what length of exposure is sufficient for high quality prediction based on TGx. In this study, we developed predictive models for rodent liver carcinogenicity using gene expression data generated from short-term animal models at different time points and QSAR. The study was focused on the prediction of nongenotoxic carcinogenicity since the genotoxic chemicals can be inexpensively removed from further development using various in vitro assays individually or in combination. We identified 62 chemicals whose hepatocarcinogenic potential was available from the National Center for Toxicological Research liver cancer database (NCTRlcdb). The gene expression profiles of liver tissue obtained from rats treated with these chemicals at different time points (1 day, 3 days, and 5 days) are available from the Gene Expression Omnibus (GEO) database. Both TGx and QSAR models were developed on the basis of the same set of chemicals using the same modeling approach, a nearest-centroid method with a minimum redundancy and maximum relevancy-based feature selection with performance assessed using compound-based 5-fold cross-validation. We found that the TGx models outperformed QSAR in every aspect of modeling. 
For example, the TGx models' predictive accuracy (0.77, 0.77, and 0.82 for the 1-day, 3-day, and 5-day models, respectively) was much higher for an independent validation set than that of a QSAR model (0.55). Permutation tests confirmed the statistical significance of the model's prediction performance. The study concluded that a short-term 5-day TGx animal model holds the potential to predict nongenotoxic hepatocarcinogenicity.
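The nearest-centroid method named in the abstract above can be sketched in a few lines. This is a generic illustration using Euclidean distance on already-selected features; it omits the minimum redundancy and maximum relevancy feature selection and the cross-validation loop, and the feature values and class labels are hypothetical.

```python
from math import dist


def fit_centroids(X, y):
    """Compute the per-class mean feature vector."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, row):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(row, centroids[label]))


# Hypothetical two-feature profiles (e.g. expression values or descriptors).
X = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]]
y = ["non-carcinogen", "non-carcinogen", "carcinogen", "carcinogen"]
centroids = fit_centroids(X, y)
print(predict(centroids, [1.0, 1.0]), predict(centroids, [5.0, 4.0]))
```

Because the same classifier can be trained on gene expression features or on structural descriptors, it allows the kind of like-for-like TGx versus QSAR comparison the study describes.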
Affiliation(s)
- Zhichao Liu
- Center of Excellence for Bioinformatics, National Center for Toxicological Research, US Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, USA
|
182
|
Hao M, Li Y, Wang Y, Zhang S. Prediction of P2Y12 antagonists using a novel genetic algorithm-support vector machine coupled approach. Anal Chim Acta 2011; 690:53-63. [DOI: 10.1016/j.aca.2011.02.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 11/11/2010] [Revised: 01/26/2011] [Accepted: 02/01/2011] [Indexed: 12/15/2022]
|
183
|
Hao M, Li Y, Wang Y, Zhang S. A classification study of respiratory Syncytial Virus (RSV) inhibitors by variable selection with random forest. Int J Mol Sci 2011; 12:1259-80. [PMID: 21541057 PMCID: PMC3083704 DOI: 10.3390/ijms12021259] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/13/2010] [Revised: 02/10/2011] [Accepted: 02/11/2011] [Indexed: 12/29/2022] Open
Abstract
Experimental pEC50s for 216 selective respiratory syncytial virus (RSV) inhibitors are used to develop classification models as a potential screening tool for a large library of target compounds. Variable selection algorithm coupled with random forests (VS-RF) is used to extract the physicochemical features most relevant to the RSV inhibition. Based on the selected small set of descriptors, four other widely used approaches, i.e., support vector machine (SVM), Gaussian process (GP), linear discriminant analysis (LDA) and k nearest neighbors (kNN) routines are also employed and compared with the VS-RF method in terms of several rigorous evaluation criteria. The obtained results indicate that the VS-RF model is a powerful tool for classification of RSV inhibitors, producing the highest overall accuracy of 94.34% for the external prediction set, which significantly outperforms the other four methods with the average accuracy of 80.66%. The proposed model with excellent prediction capacity from internal to external quality should be important for screening and optimization of potential RSV inhibitors prior to chemical synthesis in drug development.
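Overall accuracy, the criterion used above to compare classifiers, is usually reported alongside sensitivity and specificity. The sketch below computes all three from predicted and true labels; it is a generic illustration, the abstract does not list which evaluation criteria were used beyond overall accuracy, and the labels are hypothetical rather than results from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for a binary classifier."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against classes that are absent from the evaluation set.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Hypothetical external-set labels: 1 = active, 0 = inactive.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(classification_metrics(y_true, y_pred))
```

Reporting the three values together matters when classes are imbalanced, since a high overall accuracy can hide poor recognition of the minority class.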
Affiliation(s)
- Ming Hao
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
| | - Yan Li
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
- Author to whom correspondence should be addressed; Tel.: +86-411-84986062; Fax: +86-411-84986063
| | - Yonghua Wang
- Center of Bioinformatics, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Shuwei Zhang
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
|
184
|
Abstract
INTRODUCTION: A frightening increase in the number of isolated multidrug resistant bacterial strains linked to the decline in novel antimicrobial drugs entering the market is a great cause for concern. Cationic antimicrobial peptides (AMPs) have lately been introduced as a potential new class of antimicrobial drugs, and computational methods utilizing molecular descriptors can significantly accelerate the development of new peptide drug candidates. AREAS COVERED: This paper gives a broad overview of peptide and amino-acid scale descriptors available for AMP modeling and highlights which of these are currently being used in quantitative structure-activity relationship (QSAR) studies for AMP optimization. Additionally, some key commercial computational tools are discussed, and both successful and less successful studies are referenced, illustrating some of the challenges facing AMP scientists. Through examples of different peptide QSAR studies, this review highlights some of the missing links and illuminates some of the questions that would be interesting to challenge in a more systematic fashion. EXPERT OPINION: Computer-aided peptide QSAR using molecular descriptors may provide the necessary edge to peptide drug discovery, enabling successful design of a new generation of anti-infective drug molecules. However, if this wonderful scenario is to play out, computational chemists and peptide microbiologists would need to start playing together and not just side by side.
Affiliation(s)
- Håvard Jenssen
- Roskilde University, Institute of Science, Systems and Models, Universitetsvej 1, Building 17.1, DK-4000 Roskilde, Denmark; +45 4674 2877; +45 4674 3010
|
185
|
Schüller A, Goh GB, Kim H, Lee JS, Chang YT. Quantitative Structure-Fluorescence Property Relationship Analysis of a Large BODIPY Library. Mol Inform 2010; 29:717-29. [DOI: 10.1002/minf.201000089] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 08/11/2010] [Accepted: 09/28/2010] [Indexed: 12/31/2022]
|
186
|
Goyal RK, Dureja H, Singh G, Madan AK. Models for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. Sci Pharm 2010; 78:791-820. [PMID: 21179317 PMCID: PMC3007618 DOI: 10.3797/scipharm.1006-03] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/13/2010] [Accepted: 08/12/2010] [Indexed: 11/26/2022] Open
Abstract
The relationship between topological indices and antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines has been investigated. A data set consisting of 31 analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines was selected for the present study. The values of numerous topostructural and topochemical indices for each of 31 differently substituted analogues of the data set were computed using an in-house computer program. Resulting data was analyzed and suitable models were developed through decision tree, random forest and moving average analysis (MAA). The goodness of the models was assessed by calculating overall accuracy of prediction, sensitivity, specificity and Matthews correlation coefficient. Pendentic eccentricity index (a novel, highly discriminating, non-correlating, pendenticity-based topochemical descriptor) was also conceptualized and successfully utilized for the development of a model for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. The proposed index exhibited not only high sensitivity towards both the presence as well as relative position(s) of pendent/heteroatom(s) but also led to significant reduction in degeneracy. Random forest correctly classified the analogues into active and inactive with an accuracy of 67.74%. A decision tree was also employed for determining the importance of molecular descriptors. The decision tree learned the information from the input data with an accuracy of 100% and correctly predicted the cross-validated (10 fold) data with accuracy up to 77.4%. Statistical significance of proposed models was also investigated using intercorrelation analysis. Accuracy of prediction of proposed MAA models ranged from 90.4 to 91.6%.
Affiliation(s)
- Rakesh K Goyal
- Faculty of Pharmaceutical Sciences, Pt. B.D. Sharma University of Health Sciences, Rohtak, 124 001, India.
|
187
|
Hao M, Li Y, Wang Y, Zhang S. Prediction of PKCθ inhibitory activity using the Random Forest Algorithm. Int J Mol Sci 2010; 11:3413-33. [PMID: 20957104 PMCID: PMC2956104 DOI: 10.3390/ijms11093413] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 07/19/2010] [Revised: 08/24/2010] [Accepted: 09/03/2010] [Indexed: 12/14/2022] Open
Abstract
This work is devoted to the prediction of a series of 208 structurally diverse PKCθ inhibitors using the Random Forest (RF) based on the Mold2 molecular descriptors. The RF model was established and identified as a robust predictor of the experimental pIC50 values, producing a good external R²(pred) of 0.72 and a standard error of prediction (SEP) of 0.45 for an external prediction set of 51 inhibitors which were not used in the development of QSAR models. By using the RF built-in measure of the relative importance of the descriptors, an important predictor, the number of group donor atoms for H-bonds (with N and O), has been identified to play a crucial role in PKCθ inhibitory activity. We hope that the developed RF model will be helpful in the screening and prediction of novel unknown PKCθ inhibitory activity.
Affiliation(s)
- Ming Hao
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
| | - Yan Li
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
| | - Yonghua Wang
- Center of Bioinformatics, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Shuwei Zhang
- School of Chemical Engineering, Dalian University of Technology, Dalian, Liaoning 116012, China
|
188
|
Gupta RR, Gifford EM, Liston T, Waller CL, Hohman M, Bunin BA, Ekins S. Using open source computational tools for predicting human metabolic stability and additional absorption, distribution, metabolism, excretion, and toxicity properties. Drug Metab Dispos 2010; 38:2083-90. [PMID: 20693417 DOI: 10.1124/dmd.110.034918] [Citation(s) in RCA: 72] [Impact Index Per Article: 5.1] [Indexed: 12/22/2022] Open
Abstract
Ligand-based computational models could be more readily shared between researchers and organizations if they were generated with open source molecular descriptors [e.g., chemistry development kit (CDK)] and modeling algorithms, because this would negate the requirement for proprietary commercial software. We initially evaluated open source descriptors and model building algorithms using a training set of approximately 50,000 molecules and a test set of approximately 25,000 molecules with human liver microsomal metabolic stability data. A C5.0 decision tree model demonstrated that CDK descriptors together with a set of Smiles Arbitrary Target Specification (SMARTS) keys had good statistics [κ = 0.43, sensitivity = 0.57, specificity = 0.91, and positive predicted value (PPV) = 0.64], equivalent to those of models built with commercial Molecular Operating Environment 2D (MOE2D) and the same set of SMARTS keys (κ = 0.43, sensitivity = 0.58, specificity = 0.91, and PPV = 0.63). Extending the dataset to ∼193,000 molecules and generating a continuous model using Cubist with a combination of CDK and SMARTS keys or MOE2D and SMARTS keys confirmed this observation. When the continuous predictions and actual values were binned to get a categorical score we observed a similar κ statistic (0.42). The same combination of descriptor set and modeling method was applied to passive permeability and P-glycoprotein efflux data with similar model testing statistics. In summary, open source tools demonstrated predictive results comparable to those of commercial software with attendant cost savings. We discuss the advantages and disadvantages of open source descriptors and the opportunity for their use as a tool for organizations to share data precompetitively, avoiding repetition and assisting drug discovery.
Affiliation(s)
- Rishi R Gupta
- Pfizer Global Research and Development, Groton, Connecticut, USA
|
189
|
Ekins S, Bradford J, Dole K, Spektor A, Gregory K, Blondeau D, Hohman M, Bunin BA. A collaborative database and computational models for tuberculosis drug discovery. Mol Biosyst 2010; 6:840-51. [PMID: 20567770 DOI: 10.1039/b917766c] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Indexed: 12/22/2022]
Abstract
The search for molecules with activity against Mycobacterium tuberculosis (Mtb) is employing many approaches in parallel including high throughput screening and computational methods. We have developed a database (CDD TB) to capture public and private Mtb data while enabling data mining and collaborations with other researchers. We have used the public data along with several cheminformatics approaches to produce models that describe active and inactive compounds. We have compared these datasets to those for known FDA approved drugs and between Mtb active and inactive compounds. The distribution of polar surface area and pKa of active compounds was found to be a statistically significant determinant of activity against Mtb. Hydrophobicity was not always statistically significant. Bayesian classification models for 220,463 molecules were generated and tested with external molecules, and enabled the discrimination of active or inactive substructures from other datasets in the CDD TB. Computational pharmacophores based on known Mtb drugs were able to map to and retrieve a small subset of some of the Mtb datasets, including a high percentage of Mtb actives. The combination of the database, dataset analysis, Bayesian and pharmacophore models provides new insights into molecular properties and features that are determinants of activity in whole cells. This study provides novel insights into the key 1D molecular descriptors, 2D chemical substructures and 3D pharmacophores which can be used to mine the chemistry space, prioritizing those molecules with a higher probability of activity against Mtb.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94403, USA.
|
190
|
Filimonov DA, Zakharov AV, Lagunin AA, Poroikov VV. QNA-based 'Star Track' QSAR approach. SAR QSAR Environ Res 2009; 20:679-709. [PMID: 20024804 DOI: 10.1080/10629360903438370] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.3] [Indexed: 05/21/2023]
Abstract
In the existing quantitative structure-activity relationship (QSAR) methods any molecule is represented as a single point in a many-dimensional space of molecular descriptors. We propose a new QSAR approach based on Quantitative Neighbourhoods of Atoms (QNA) descriptors, which characterize each atom of a molecule and depend on the whole molecule structure. In the 'Star Track' methodology any molecule is represented as a set of points in a two-dimensional space of QNA descriptors. With our new method the estimate of the target property of a chemical compound is calculated as the average value of the function of QNA descriptors in the points of the atoms of a molecule in QNA descriptor space. Substantially, we propose the use of only two descriptors rather than more than 3000 molecular descriptors that apply in the QSAR method. On the basis of this approach we have developed the computer program GUSAR and compared it with several widely used QSAR methods including CoMFA, CoMSIA, Golpe/GRID, HQSAR and others, using ten data sets representing various chemical series and diverse types of biological activity. We show that in the majority of cases the accuracy and predictivity of GUSAR models appears to be better than those for the reference QSAR methods. High predictive ability and robustness of GUSAR are also shown in the leave-20%-out cross-validation procedure.
Affiliation(s)
- D A Filimonov
- Institute of Biomedical Chemistry of Russian Academy of Medical Sciences, Moscow, Russia.
|
191
|
Wang J, Hou T. Chapter 5 Recent Advances on in silico ADME Modeling. Annu Rep Comput Chem 2009. [DOI: 10.1016/s1574-1400(09)00505-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2023]
|