Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Willett P. Chemoinformatics - similarity and diversity in chemical libraries. Curr Opin Biotechnol 2000;11:85-8. [PMID: 10679335 DOI: 10.1016/s0958-1669(99)00059-2] [Citation(s) in RCA: 81] [Impact Index Per Article: 3.4] [Indexed: 11/18/2022]

Cited by Other Article(s)

Comparative analysis of an anthraquinone and chalcone derivatives-based virtual combinatorial library. A cheminformatics "proof-of-concept" study. J Mol Graph Model 2022;117:108307. [PMID: 36096064 DOI: 10.1016/j.jmgm.2022.108307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 08/08/2022] [Accepted: 08/08/2022] [Indexed: 01/14/2023]

Abstract

A Laplacian scoring algorithm for gene selection and the Gini coefficient to identify the genes whose expression varied least across a large set of samples were the state-of-the-art methods used here. These methods have not been trialed for their feasibility in cheminformatics. This was a maiden attempt to investigate a complete comparative analysis of an anthraquinone and chalcone derivatives-based virtual combinatorial library. This computational "proof-of-concept" study illustrated the combinatorial approach used to explain how the structure of the selected natural products (NPs) undergoes molecular diversity analysis. A virtual combinatorial library (1.6 M) based on 20 anthraquinones and 24 chalcones was enumerated. The resulting compounds were optimized to the near drug-likeness properties, and the physicochemical descriptors were calculated for all datasets including FDA, Non-FDA, and NPs from ZINC 15. UMAP and PCA were applied to compare and represent the chemical space coverage of each dataset. Subsequently, the Laplacian score and Gini coefficient were applied to delineate feature selection and selectivity among properties, respectively. Finally, we demonstrated the diversity between the datasets by employing Murcko's and the central scaffolds systems, calculating three fingerprint descriptors and analyzing their diversity by PCA and SOM. The optimized enumeration resulted in 1,610,268 compounds with NP-Likeness, and synthetic feasibility mean scores close to FDA, Non-FDA, and NPs datasets. The overlap between the chemical space of the 1.6 M database was more prominent than with the NPs dataset. A Laplacian score prioritized NP-likeness and hydrogen bond acceptor properties (1.0 and 0.923), respectively, while the Gini coefficient showed that all properties have selective effects on datasets (0.81-0.93). Scaffold and fingerprint diversity indicated that the descending order for the tested datasets was FDA, Non-FDA, NPs and 1.6 M. 
Virtual combinatorial libraries based on NPs can be considered a source of combinatorial compounds with NP-like properties. Furthermore, molecular diversity should be measured by several different methods to allow for comparison and better judgment.
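
The abstract above reports Gini coefficients of 0.81-0.93 but does not state the formula used. As a rough illustration only, the following Python sketch computes the textbook Gini coefficient from rank-weighted sorted values. The function name `gini` and the sample inputs are hypothetical and are not taken from the cited study.

```python
def gini(values):
    """Textbook Gini coefficient for non-negative values.

    Returns 0.0 when every value is equal and (n - 1) / n when a single
    item holds the entire total. This is an assumed definition; the cited
    abstract does not specify its exact formula.
    """
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    total = sum(xs)
    if total == 0:
        raise ValueError("sum of values must be positive")
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


if __name__ == "__main__":
    # Hypothetical property values, for illustration only.
    print(gini([1, 1, 1, 1]))   # evenly spread
    print(gini([0, 0, 0, 1]))   # concentrated in one item
```

Values near 1 indicate that a property is concentrated in a few items, which is one way to read the "selective effects" the abstract mentions.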

Nainwal LM, Shaququzzaman M, Akhter M, Husain A, Parvez S, Tasneem S, Iqubal A, Alam MM. Synthesis, and reverse screening of 6‐(3,4,5‐trimethoxyphenyl)pyrimidine‐5‐carbonitrile derivatives as anticancer agents: Part‐II. J Heterocycl Chem 2021. [DOI: 10.1002/jhet.4421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]

Xu X, Zou X. Dissimilar Ligands Bind in a Similar Fashion: A Guide to Ligand Binding-Mode Prediction with Application to CELPP Studies. Int J Mol Sci 2021;22:12320. [PMID: 34830201 PMCID: PMC8625032 DOI: 10.3390/ijms222212320] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/01/2021] [Revised: 11/11/2021] [Accepted: 11/12/2021] [Indexed: 11/25/2022] [Open access]

Hackett WE, Zaia J. Calculating Glycoprotein Similarities From Mass Spectrometric Data. Mol Cell Proteomics 2021;20:100028. [PMID: 32883803 PMCID: PMC8724611 DOI: 10.1074/mcp.r120.002223] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/11/2020] [Revised: 08/24/2020] [Accepted: 09/03/2020] [Indexed: 12/23/2022] [Open access]

Nainwal LM, Shaququzzaman M, Akhter M, Husain A, Parvez S, Khan F, Naematullah M, Alam MM. Synthesis, ADMET prediction and reverse screening study of 3,4,5-trimethoxy phenyl ring pendant sulfur-containing cyanopyrimidine derivatives as promising apoptosis inducing anticancer agents. Bioorg Chem 2020;104:104282. [PMID: 33010624 DOI: 10.1016/j.bioorg.2020.104282] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/11/2020] [Revised: 09/03/2020] [Accepted: 09/12/2020] [Indexed: 02/09/2023]

Cao Y, Park SJ, Im W. A systematic analysis of protein-carbohydrate interactions in the Protein Data Bank. Glycobiology 2020;31:126-136. [PMID: 32614943 DOI: 10.1093/glycob/cwaa062] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/10/2020] [Revised: 06/26/2020] [Accepted: 06/26/2020] [Indexed: 12/17/2022] [Open access]

Marrero-Ponce Y, Teran JE, Contreras-Torres E, García-Jacas CR, Perez-Castillo Y, Cubillan N, Peréz-Giménez F, Valdés-Martini JR. LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs. J Theor Biol 2020;485:110039. [DOI: 10.1016/j.jtbi.2019.110039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/08/2019] [Revised: 09/11/2019] [Accepted: 10/02/2019] [Indexed: 11/28/2022]

Bologa CG, Ursu O, Oprea TI. How to Prepare a Compound Collection Prior to Virtual Screening. Methods Mol Biol 2019;1939:119-138. [PMID: 30848459 DOI: 10.1007/978-1-4939-9089-4_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]

Byrne R, Schneider G. In Silico Target Prediction for Small Molecules. Methods Mol Biol 2019;1888:273-309. [PMID: 30519953 DOI: 10.1007/978-1-4939-8891-4_16] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 06/09/2023]

Ashenden SK. Screening Library Design. Methods Enzymol 2018;610:73-96. [PMID: 30390806 DOI: 10.1016/bs.mie.2018.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]

Reker D, Brown JB. Selection of Informative Examples in Chemogenomic Datasets. Methods Mol Biol 2018;1825:369-410. [PMID: 30334214 DOI: 10.1007/978-1-4939-8639-2_13] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 06/08/2023]

Naveja JJ, Oviedo-Osornio CI, Trujillo-Minero NN, Medina-Franco JL. Chemoinformatics: a perspective from an academic setting in Latin America. Mol Divers 2017;22:247-258. [PMID: 29204824 DOI: 10.1007/s11030-017-9802-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/05/2017] [Accepted: 11/26/2017] [Indexed: 12/13/2022]

Taraji M, Haddad PR, Amos RIJ, Talebi M, Szucs R, Dolan JW, Pohl CA. Chemometric-assisted method development in hydrophilic interaction liquid chromatography: A review. Anal Chim Acta 2017;1000:20-40. [PMID: 29289311 DOI: 10.1016/j.aca.2017.09.041] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 06/07/2017] [Revised: 09/22/2017] [Accepted: 09/24/2017] [Indexed: 02/09/2023]

Varela JN, Lammoglia Cobo MF, Pawar SV, Yadav VG. Cheminformatic Analysis of Antimalarial Chemical Space Illuminates Therapeutic Mechanisms and Offers Strategies for Therapy Development. J Chem Inf Model 2017;57:2119-2131. [PMID: 28810125 DOI: 10.1021/acs.jcim.7b00072] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/23/2022]

Abstract

The clear and present danger of malaria, which has been amplified in recent years by climate change, and the progressive thinning of our drug arsenal over the past two decades raise uncomfortable questions about the current state and future of antimalarial drug development. Besides suffering from many of the same technical challenges that affect drug development in other disease areas, the quest for new antimalarial therapies is also hindered by the complex, dynamic life cycle of the malaria parasite, P. falciparum, in its mosquito and human hosts, and its role thereof in the elicitation of drug resistance. New strategies are needed in order to ensure economical and expeditious development of new, more efficacious treatments. In the present study, we employ open-source cheminformatics tools to analyze the chemical space traversed by approved antimalarial drugs and promising candidates at various stages of development to uncover insights that could shape future endeavors in the field. Our scaffold-centric analysis reveals that the antimalarial chemical space is disjointed and segregated into a few dominant structural groups. In fact, the structures of antimalarial drugs and drug candidates are distributed according to Pareto's principle. This structural convergence can potentially be exploited for future drug discovery by incorporating it into bioinformatics workflows that are typically employed for solving problems in structural biology. Significantly, we demonstrate how molecular scaffold hunting can be applied to unearth putative mechanisms of action of drugs whose activities remain a mystery, and how scaffold-centric analysis of drug space can also provide a recipe for combination therapies that minimize the likelihood of emergence of drug resistance, as well as identify areas on which to focus efforts. 
Finally, we also observe that over half of the molecules in the antimalarial space bear no resemblance to other molecules in the collection, which suggests that the pharmacobiology of antimalarial drugs has not been entirely surveyed.

Consensus Diversity Plots: a global diversity analysis of chemical libraries. J Cheminform 2016;8:63. [PMID: 27895718 PMCID: PMC5105260 DOI: 10.1186/s13321-016-0176-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 08/31/2016] [Accepted: 10/27/2016] [Indexed: 01/14/2023] [Open access]

The Consensus Diversity Plot is a novel data mining tool that represents, in two dimensions, the global diversity of compound data sets using multiple metrics.

Kasahara K, Kinoshita K. Landscape of protein-small ligand binding modes. Protein Sci 2016;25:1659-71. [PMID: 27327045 PMCID: PMC5338237 DOI: 10.1002/pro.2971] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/27/2015] [Revised: 06/04/2016] [Accepted: 06/15/2016] [Indexed: 11/15/2022]

Armacost KA, Goh GB, Brooks CL. Biasing Potential Replica Exchange Multisite λ-Dynamics for Efficient Free Energy Calculations. J Chem Theory Comput 2016;11:1267-77. [PMID: 26579773 DOI: 10.1021/ct500894k] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 01/16/2023]

Abstract

Traditional free energy calculation methods are well-known for their drawbacks in scalability and speed in converging results particularly for calculations with large perturbations. In the present work, we report on the development of biasing potential replica exchange multisite λ-dynamics (BP-REX MSλD), which is a free energy method that is capable of performing simultaneous alchemical free energy transformations, including perturbations between flexible moieties. BP-REX MSλD and the original MSλD are applied to a series of symmetrical 2,5-benzoquinone derivatives covering a diverse chemical space and range of conformational flexibility. Improved λ-space sampling is observed for the BP-REX MSλD simulations, yielding a 2-5-fold increase in the number of transitions between substituents compared to traditional MSλD. We also demonstrate the efficacy of varying the value of c, the parameter that controls the ruggedness of the landscape mediating the sampling of λ-states, based on the flexibility of the fragment. Finally, we developed a protocol for maximizing the transition frequency between fragments. This protocol reduces the "kinetic barrier" for alchemically transforming fragments by grouping and ordering based on volume. These findings are applied to a challenging test set involving a series of geldanamycin-based inhibitors of heat shock protein 90 (Hsp90). Even though the perturbations span volume changes by as large as 60 Å³, the values for the free energy change achieve an average unsigned error (AUE) of 1.5 kcal/mol relative to experimental Kd measurements with a reasonable correlation (R = 0.56). Our results suggest that the BP-REX MSλD algorithm is a highly efficient and scalable free energy method, which when utilized will enable routine calculations on the order of hundreds of compounds using only a few simulations.
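
The abstract above summarizes accuracy with an average unsigned error (AUE) and a correlation coefficient R against experimental Kd values, without spelling out the arithmetic. A minimal Python sketch of those two statistics follows, together with the standard relation ΔG = RT ln(Kd) for converting a dissociation constant into a binding free energy. All function names, the 298.15 K default temperature, and any sample numbers are assumptions for illustration; the abstract does not state the exact conversion or conditions used.

```python
import math

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K)


def kd_to_delta_g(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from Kd in mol/L, relative to a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def average_unsigned_error(predicted, experimental):
    """Mean of |predicted - experimental| over paired values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of paired values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

Because relative free energies are usually compared, a constant offset between predicted and experimental values affects the AUE but not R, which is why both numbers are typically reported together.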

Exploration of Scaffolds from Natural Products with Antiplasmodial Activities, Currently Registered Antimalarial Drugs and Public Malarial Screen Data. Molecules 2016;21:104. [PMID: 26784165 PMCID: PMC6273396 DOI: 10.3390/molecules21010104] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 11/09/2015] [Revised: 01/06/2016] [Accepted: 01/12/2016] [Indexed: 01/07/2023] [Open access]

Hähnke VD, Bolton EE, Bryant SH. PubChem atom environments. J Cheminform 2015;7:41. [PMID: 26300985 PMCID: PMC4540750 DOI: 10.1186/s13321-015-0076-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/29/2015] [Accepted: 05/20/2015] [Indexed: 02/05/2023] [Open access]

Abstract

BACKGROUND

Atom environments and fragments find wide-spread use in chemical information and cheminformatics. They are the basis of prediction models, an integral part in similarity searching, and employed in structure search techniques. Most of these methods were developed and evaluated on the relatively small sets of chemical structures available at the time. An analysis of fragment distributions representative of most known chemical structures was published in the 1970s using the Chemical Abstracts Service data system. More recently, advances in automated synthesis of chemicals allow millions of chemicals to be synthesized by a single organization. In addition, open chemical databases are readily available containing tens of millions of chemical structures from a multitude of data sources, including chemical vendors, patents, and the scientific literature, making it possible for scientists to readily access most known chemical structures. With this availability of information, one can now address interesting questions, such as: what chemical fragments are known today? How do these fragments compare to earlier studies? How unique are chemical fragments found in chemical structures?

RESULTS

For our analysis, after hydrogen suppression, atoms were characterized by atomic number, formal charge, implicit hydrogen count, explicit degree (number of neighbors), valence (bond order sum), and aromaticity. Bonds were differentiated as single, double, triple or aromatic bonds. Atom environments were created in a circular manner focused on a central atom with radii from 0 (atom types) up to 3 (representative of ECFP_6 fragments). In total, combining atom types and atom environments that include up to three spheres of nearest neighbors, our investigation identified 28,462,319 unique fragments in the 46 million structures found in the PubChem Compound database as of January 2013. We could identify several factors inflating the number of environments involving transition metals, with many seemingly due to erroneous interpretation of structures from patent data. Compared to fragmentation statistics published 40 years ago, the exponential growth in chemistry is mirrored in a nearly eightfold increase in the number of unique chemical fragments; however, this result is clearly an upper bound estimate as earlier studies employed structure sampling approaches and this study shows that a relatively high rate of atom fragments are found in only a single chemical structure (singletons). In addition, the percentage of singletons grows as the size of the chemical fragment is increased.

CONCLUSIONS

The observed growth of the numbers of unique fragments over time suggests that many chemically possible connections of atom types to larger fragments have yet to be explored by chemists. A dramatic drop in the relative rate of increase of atom environments from smaller to larger fragments shows that larger fragments mainly consist of diverse combinations of a limited subset of smaller fragments. This is further supported by the observed concomitant increase of singleton atom environments. Combined, these findings suggest that there is considerable opportunity for chemists to combine known fragments to novel chemical compounds. The comparison of PubChem to an older study of known chemical structures shows noticeable differences. The changes suggest advances in synthetic capabilities of chemists to combine atoms in new patterns. Log-log plots of fragment incidence show small numbers of fragments are found in many structures and that large numbers of fragments are found in very few structures, with nearly half being novel using the methods in this work. The relative decrease in the count of new fragments as a function of size further suggests considerable opportunity for more novel chemicals exists. Lastly, the differences in atom environment diversity between PubChem Substance and Compound showcase the effect of PubChem standardization protocols, but also indicate that a normalization procedure for atom types, functional groups, and tautomeric/resonance forms based on atom environments is possible. The complete sets of atom types and atom environments are supplied as supporting information.
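
The RESULTS paragraph above describes atom environments grown in a circular manner around a central atom, with radii from 0 to 3. The Python sketch below illustrates that idea on a toy molecular graph. It is a deliberately simplified stand-in, not the authors' procedure: atoms are labeled by element symbol only (no charge, hydrogen count, valence, or aromaticity), bond orders are ignored, and each environment is summarized as a per-shell tuple of sorted labels rather than a canonical subgraph. The function names and the ethanol example are hypothetical.

```python
from collections import deque


def atom_environment(atoms, bonds, center, radius):
    """Shell-by-shell labels of all atoms within `radius` bonds of `center`.

    atoms: list of element symbols; bonds: list of (i, j) index pairs.
    Returns a tuple with one sorted tuple of labels per shell 0..radius.
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Breadth-first search, stopping once the requested radius is reached.
    distance = {center: 0}
    queue = deque([center])
    while queue:
        current = queue.popleft()
        if distance[current] == radius:
            continue
        for nxt in neighbors[current]:
            if nxt not in distance:
                distance[nxt] = distance[current] + 1
                queue.append(nxt)

    shells = [[] for _ in range(radius + 1)]
    for index, d in distance.items():
        shells[d].append(atoms[index])
    return tuple(tuple(sorted(shell)) for shell in shells)


def count_environments(atoms, bonds, max_radius=3):
    """Count how often each (radius, environment) pair occurs in one molecule."""
    counts = {}
    for center in range(len(atoms)):
        for radius in range(max_radius + 1):
            key = (radius, atom_environment(atoms, bonds, center, radius))
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    # Heavy atoms of ethanol (C-C-O), hydrogens suppressed.
    atoms = ["C", "C", "O"]
    bonds = [(0, 1), (1, 2)]
    for key, n in sorted(count_environments(atoms, bonds, max_radius=1).items()):
        print(key, n)
```

Merging such counts across many molecules and tallying the keys that occur exactly once would give a rough analogue of the singleton statistics discussed in the abstract.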

Reker D, Schneider G. Active-learning strategies in computer-assisted drug discovery. Drug Discov Today 2015;20:458-65. [DOI: 10.1016/j.drudis.2014.12.004] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 09/15/2014] [Revised: 11/13/2014] [Accepted: 12/02/2014] [Indexed: 12/20/2022]

Liu X, Campillos M. Unveiling new biological relationships using shared hits of chemical screening assay pairs. Bioinformatics 2015;30:i579-86. [PMID: 25161250 PMCID: PMC4147921 DOI: 10.1093/bioinformatics/btu468] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]

Copeland JC, Zehr LJ, Cerny RL, Powers R. The applicability of molecular descriptors for designing an electrospray ionization mass spectrometry compatible library for drug discovery. Comb Chem High Throughput Screen 2014;15:806-15. [PMID: 22708878 DOI: 10.2174/138620712803901180] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 04/06/2012] [Revised: 05/25/2012] [Accepted: 06/08/2012] [Indexed: 11/22/2022]

Bologa CG, Oprea TI. Compound collection preparation for virtual screening. Methods Mol Biol 2013;910:125-43. [PMID: 22821595 DOI: 10.1007/978-1-61779-965-5_7] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 02/10/2023]

Conway KR, Boddy CN. ClusterMine360: a database of microbial PKS/NRPS biosynthesis. Nucleic Acids Res 2012;41:D402-7. [PMID: 23104377 PMCID: PMC3531105 DOI: 10.1093/nar/gks993] [Citation(s) in RCA: 91] [Impact Index Per Article: 7.6] [Indexed: 12/02/2022] [Open access]

Entzeroth M, Flotow H, Condron P. Overview of high-throughput screening. Curr Protoc Pharmacol 2012;Chapter 9:Unit 9.4. [PMID: 22294406 DOI: 10.1002/0471141755.ph0904s44] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.2] [Indexed: 11/06/2022]

Le Guilloux V, Colliandre L, Bourg S, Guénegou G, Dubois-Chevalier J, Morin-Allory L. Visual characterization and diversity quantification of chemical libraries: 1. creation of delimited reference chemical subspaces. J Chem Inf Model 2011;51:1762-74. [PMID: 21761916 DOI: 10.1021/ci200051r] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]

Zhou JZ. Chemoinformatics and library design. Methods Mol Biol 2011;685:27-52. [PMID: 20981517 DOI: 10.1007/978-1-60761-931-4_2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 02/19/2023]

Schnur DM, Beno BR, Tebben AJ, Cavallaro C. Methods for combinatorial and parallel library design. Methods Mol Biol 2011;672:387-434. [PMID: 20838978 DOI: 10.1007/978-1-60761-839-3_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]

Chen H, Engkvist O, Blomberg N. Combinatorial library design from reagent pharmacophore fingerprints. Methods Mol Biol 2011;685:135-152. [PMID: 20981522 DOI: 10.1007/978-1-60761-931-4_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]

Cheng TJR, Wu YT, Yang ST, Lo KH, Chen SK, Chen YH, Huang WI, Yuan CH, Guo CW, Huang LY, Chen KT, Shih HW, Cheng YSE, Cheng WC, Wong CH. High-throughput identification of antibacterials against methicillin-resistant Staphylococcus aureus (MRSA) and the transglycosylase. Bioorg Med Chem 2010;18:8512-29. [PMID: 21075637 DOI: 10.1016/j.bmc.2010.10.036] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 08/20/2010] [Revised: 10/11/2010] [Accepted: 10/14/2010] [Indexed: 12/01/2022]

Fischer JD, Holliday GL, Rahman SA, Thornton JM. The structures and physicochemical properties of organic cofactors in biocatalysis. J Mol Biol 2010;403:803-24. [PMID: 20850456 DOI: 10.1016/j.jmb.2010.09.018] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 06/08/2010] [Revised: 09/03/2010] [Accepted: 09/06/2010] [Indexed: 10/19/2022]

Xi L, Li S, Liu H, Li J, Lei B, Yao X. Global and local prediction of protein folding rates based on sequence autocorrelation information. J Theor Biol 2010;264:1159-68. [DOI: 10.1016/j.jtbi.2010.03.042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/04/2009] [Revised: 03/28/2010] [Accepted: 03/29/2010] [Indexed: 11/24/2022]

Verma J, Malde A, Khedkar S, Iyer R, Coutinho E. Local Indices for Similarity Analysis (LISA)—A 3D-QSAR Formalism Based on Local Molecular Similarity. J Chem Inf Model 2009;49:2695-707. [DOI: 10.1021/ci900224u] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]

Powers R. Advances in Nuclear Magnetic Resonance for Drug Discovery. Expert Opin Drug Discov 2009;4:1077-1098. [PMID: 20333269 PMCID: PMC2843924 DOI: 10.1517/17460440903232623] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/05/2022]

Rahman SA, Bashton M, Holliday GL, Schrader R, Thornton JM. Small Molecule Subgraph Detector (SMSD) toolkit. J Cheminform 2009;1:12. [PMID: 20298518 PMCID: PMC2820491 DOI: 10.1186/1758-2946-1-12] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Received: 05/11/2009] [Accepted: 08/10/2009] [Indexed: 11/10/2022] [Open access]

Abstract

BACKGROUND

Finding one small molecule (query) in a large target library is a challenging task in computational chemistry. Although several heuristic approaches are available using fragment-based chemical similarity searches, they fail to identify exact atom-bond equivalence between the query and target molecules and thus cannot be applied to complex chemical similarity searches, such as searching a complete or partial metabolic pathway. In this paper we present a new Maximum Common Subgraph (MCS) tool: SMSD (Small Molecule Subgraph Detector) to overcome the issues with current heuristic approaches to small molecule similarity searches. The MCS search implemented in SMSD incorporates chemical knowledge (atom type match with bond sensitive and insensitive information) while searching molecular similarity. We also propose a novel method by which solutions obtained by each MCS run can be ranked using chemical filters such as stereochemistry, bond energy, etc.

RESULTS

In order to benchmark and test the tool, we performed a 50,000 pair-wise comparison between KEGG ligands and PDB HET Group atoms. In both cases the SMSD was shown to be more efficient than the widely used MCS module implemented in the Chemistry Development Kit (CDK) in generating MCS solutions from our test cases.

CONCLUSION

Presently this tool can be applied to various areas of bioinformatics and chemo-informatics for finding exhaustive MCS matches. For example, it can be used to analyse metabolic networks by mapping the atoms between reactants and products involved in reactions. It can also be used to detect the MCS/substructure searches in small molecules reported by metabolome experiments, as well as in the screening of drug-like compounds with similar substructures. Thus, we present a robust tool that can be used for multiple applications, including the discovery of new drug molecules. This tool is freely available on http://www.ebi.ac.uk/thornton-srv/software/SMSD/

Tanikawa T, Fridman M, Zhu W, Faulk B, Joseph IC, Kahne D, Wagner BK, Clemons PA. Using biological performance similarity to inform disaccharide library design. J Am Chem Soc 2009;131:5075-83. [PMID: 19298063 DOI: 10.1021/ja806583y] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]

Chen H, Börjesson U, Engkvist O, Kogej T, Svensson MA, Blomberg N, Weigelt D, Burrows JN, Lange T. ProSAR: A New Methodology for Combinatorial Library Design. J Chem Inf Model 2009;49:603-14. [DOI: 10.1021/ci800231d] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]

Affiliation(s)

Hongming Chen, Ulf Börjesson, Ola Engkvist, Thierry Kogej, Mats A. Svensson, Niklas Blomberg, Dirk Weigelt, Jeremy N. Burrows, Tim Lange: DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden

Rupp M, Schneider P, Schneider G. Distance phenomena in high-dimensional chemical descriptor spaces: Consequences for similarity-based approaches. J Comput Chem 2009;30:2285-96. [DOI: 10.1002/jcc.21218] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]

Lipkus AH, Yuan Q, Lucas KA, Funk SA, Bartelt WF, Schenck RJ, Trippe AJ. Structural diversity of organic chemistry. A scaffold analysis of the CAS Registry. J Org Chem 2008;73:4443-51. [PMID: 18505297 DOI: 10.1021/jo8001276] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.0] [Indexed: 01/31/2023]

Houghten RA, Pinilla C, Giulianotti MA, Appel JR, Dooley CT, Nefzi A, Ostresh JM, Yu Y, Maggiora GM, Medina-Franco JL, Brunner D, Schneider J. Strategies for the use of mixture-based synthetic combinatorial libraries: scaffold ranking, direct testing in vivo, and enhanced deconvolution by computational methods. J Comb Chem 2007;10:3-19. [PMID: 18067268 DOI: 10.1021/cc7001205] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Indexed: 12/24/2022]

Rosania GR, Crippen G, Woolf P, States D, Shedden K. A Cheminformatic Toolkit for Mining Biomedical Knowledge. Pharm Res 2007;24:1791-802. [PMID: 17385012 DOI: 10.1007/s11095-007-9285-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 01/11/2007] [Accepted: 02/27/2007] [Indexed: 01/31/2023]

Zhang S, Golbraikh A, Oloff S, Kohn H, Tropsha A. A novel automated lazy learning QSAR (ALL-QSAR) approach: method development, applications, and virtual screening of chemical databases using validated ALL-QSAR models. J Chem Inf Model 2006;46:1984-95. [PMID: 16995729 PMCID: PMC2536695 DOI: 10.1021/ci060132x] [Citation(s) in RCA: 169] [Impact Index Per Article: 9.4] [Indexed: 11/29/2022]

In Silico Design in Homogeneous Catalysis Using Descriptor Modelling. Int J Mol Sci 2006. [DOI: 10.3390/i7090375] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.4] [Indexed: 02/04/2023] Open

Maldonado AG, Doucet JP, Petitjean M, Fan BT. Molecular similarity and diversity in chemoinformatics: from theory to applications. Mol Divers 2006;10:39-79. [PMID: 16404528 DOI: 10.1007/s11030-006-8697-1] [Citation(s) in RCA: 179] [Impact Index Per Article: 9.9] [Received: 12/17/2004] [Accepted: 06/14/2005] [Indexed: 01/04/2023]

Oloff S, Zhang S, Sukumar N, Breneman C, Tropsha A. Chemometric analysis of ligand receptor complementarity: identifying Complementary Ligands Based on Receptor Information (CoLiBRI). J Chem Inf Model 2006;46:844-51. [PMID: 16563016 PMCID: PMC2755506 DOI: 10.1021/ci050065r] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]

Abstract

We have developed a novel structure-based chemoinformatics approach to search for Complementary Ligands Based on Receptor Information (CoLiBRI). CoLiBRI is based on the representation of both receptor binding sites and their respective ligands in a space of universal chemical descriptors. The binding site atoms involved in the interaction with ligands are identified by means of a computational geometry technique known as Delaunay tessellation, applied to X-ray characterized ligand-receptor complexes. TAE/RECON multiple chemical descriptors are calculated independently for each ligand as well as for its active site atoms. The representation of both ligands and active sites using chemical descriptors allows the application of well-known chemometric techniques to correlate chemical similarities between active sites and their respective ligands. We have established a protocol to map patterns of nearest neighbor active site vectors in a multidimensional TAE/RECON space onto those of their complementary ligands and vice versa. This protocol affords the prediction of a virtual complementary ligand vector in the ligand chemical space from the position of a known active site vector. This prediction is followed by chemical similarity calculations between this virtual ligand vector and those calculated for molecules in a chemical database to identify real compounds most similar to the virtual ligand. Consequently, knowledge of the receptor active site structure affords straightforward and efficient identification of its complementary ligands in large databases of chemical compounds using rapid chemical similarity searches. Conversely, starting from the ligand chemical structure, one may identify possible complementary receptor cavities as well. We have applied the CoLiBRI approach to a data set of 800 X-ray characterized ligand-receptor complexes in the PDBbind database. Using a k nearest neighbor (kNN) pattern recognition approach and variable selection, we have shown that knowledge of the active site structure affords identification of its complementary ligand among the top 1% of a large chemical database in over 90% of all test active sites when a binding site of the same protein family was present in the training set. When test receptors are highly dissimilar to, and not represented among, the receptor families in the training set, the prediction accuracy decreases; however, CoLiBRI was still able to quickly eliminate 75% of the chemical database as improbable ligands. CoLiBRI affords rapid prefiltering of a large chemical database to eliminate compounds that have little chance of binding to a receptor active site.
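
The predict-then-search protocol described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, small hand-made vectors stand in for TAE/RECON descriptors, the virtual ligand is assumed to be a plain average of the ligand vectors paired with the k nearest training active sites, and Euclidean distance is assumed for the similarity search. Variable selection is omitted.

```python
from math import dist  # Euclidean distance, Python 3.8+


def predict_virtual_ligand(query_site, train_sites, train_ligands, k=2):
    """Predict a virtual ligand vector for a query active site.

    Assumption: the prediction is the component-wise mean of the ligand
    vectors paired with the k training sites nearest to the query.
    """
    nearest = sorted(
        range(len(train_sites)),
        key=lambda i: dist(train_sites[i], query_site),
    )[:k]
    n_dims = len(train_ligands[0])
    return [
        sum(train_ligands[i][d] for i in nearest) / len(nearest)
        for d in range(n_dims)
    ]


def rank_database(virtual_ligand, database):
    """Return database indices ordered from most to least similar."""
    return sorted(
        range(len(database)),
        key=lambda j: dist(database[j], virtual_ligand),
    )


# Toy data (hypothetical): each training site is paired with one ligand.
train_sites = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
train_ligands = [(1.0, 1.0), (3.0, 1.0), (20.0, 20.0)]
database = [(20.0, 20.0), (2.0, 1.5), (0.0, 0.0)]

virtual = predict_virtual_ligand((0.4, 0.0), train_sites, train_ligands, k=2)
ranking = rank_database(virtual, database)
```

Under these assumptions, a known ligand counts as retrieved in the top 1% when its position in `ranking` falls within the first 1% of database entries, and compounds at the far end of the ranking are the ones a prefilter would discard.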

Lumley J. Compound Selection and Filtering in Library Design. QSAR Comb Sci 2005. [DOI: 10.1002/qsar.200520136] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 01/28/2023]

Fechner U, Paetz J, Schneider G. Comparison of Three Holographic Fingerprint Descriptors and their Binary Counterparts. QSAR Comb Sci 2005. [DOI: 10.1002/qsar.200530118] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]

Mauser H, Roche O, Stahl M, Müller S. Prediction of UV and ESI−MS Signal Intensities. J Chem Inf Model 2005;45:1039-46. [PMID: 16045299 DOI: 10.1021/ci0496548] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]

Mercier KA, Powers R. Determining the optimal size of small molecule mixtures for high throughput NMR screening. J Biomol NMR 2005;31:243-58. [PMID: 15803397 DOI: 10.1007/s10858-005-0948-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 12/06/2004] [Accepted: 01/06/2005] [Indexed: 05/24/2023]

Jónsdóttir SO, Jørgensen FS, Brunak S. Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates. Bioinformatics 2005;21:2145-60. [PMID: 15713739 DOI: 10.1093/bioinformatics/bti314] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open