1
Comparative analysis of an anthraquinone and chalcone derivatives-based virtual combinatorial library. A cheminformatics "proof-of-concept" study. J Mol Graph Model 2022; 117:108307. [PMID: 36096064 DOI: 10.1016/j.jmgm.2022.108307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 08/08/2022] [Accepted: 08/08/2022] [Indexed: 01/14/2023]
Abstract
A Laplacian scoring algorithm for gene selection and the Gini coefficient to identify the genes whose expression varied least across a large set of samples were the state-of-the-art methods used here. These methods have not been trialed for their feasibility in cheminformatics. This was a maiden attempt to investigate a complete comparative analysis of an anthraquinone and chalcone derivatives-based virtual combinatorial library. This computational "proof-of-concept" study illustrated the combinatorial approach used to explain how the structure of the selected natural products (NPs) undergoes molecular diversity analysis. A virtual combinatorial library (1.6 M) based on 20 anthraquinones and 24 chalcones was enumerated. The resulting compounds were optimized to the near drug-likeness properties, and the physicochemical descriptors were calculated for all datasets including FDA, Non-FDA, and NPs from ZINC 15. UMAP and PCA were applied to compare and represent the chemical space coverage of each dataset. Subsequently, the Laplacian score and Gini coefficient were applied to delineate feature selection and selectivity among properties, respectively. Finally, we demonstrated the diversity between the datasets by employing Murcko's and the central scaffolds systems, calculating three fingerprint descriptors and analyzing their diversity by PCA and SOM. The optimized enumeration resulted in 1,610,268 compounds with NP-Likeness, and synthetic feasibility mean scores close to FDA, Non-FDA, and NPs datasets. The overlap between the chemical space of the 1.6 M database was more prominent than with the NPs dataset. A Laplacian score prioritized NP-likeness and hydrogen bond acceptor properties (1.0 and 0.923), respectively, while the Gini coefficient showed that all properties have selective effects on datasets (0.81-0.93). Scaffold and fingerprint diversity indicated that the descending order for the tested datasets was FDA, Non-FDA, NPs and 1.6 M. 
Virtual combinatorial libraries based on NPs can be considered as a source of the combinatorial compound with NP-likeness properties. Furthermore, measuring molecular diversity is supposed to be performed by different methods to allow for comparison and better judgment.
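As an illustration of the Gini coefficient mentioned in the abstract above, the following is a minimal sketch of the standard rank-based formula. It is not taken from the paper: the function name `gini` and the input values are hypothetical, and the abstract does not state how the authors arranged their property data before computing the coefficient.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    0.0 means the values are perfectly even; values approaching 1.0
    mean the total is concentrated in a few entries.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values, with ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


# Hypothetical values of one property across four datasets (illustrative only).
even = [2.0, 2.0, 2.0, 2.0]
skewed = [0.0, 0.0, 0.0, 8.0]
print(gini(even), gini(skewed))
```

A coefficient near 0 would suggest a property that barely distinguishes the datasets, while a high coefficient would suggest a property concentrated in a few of them.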
2
Nainwal LM, Shaququzzaman M, Akhter M, Husain A, Parvez S, Tasneem S, Iqubal A, Alam MM. Synthesis, and reverse screening of 6‐(3,4,5‐trimethoxyphenyl)pyrimidine‐5‐carbonitrile derivatives as anticancer agents: Part‐II. J Heterocycl Chem 2021. [DOI: 10.1002/jhet.4421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/04/2023]
Affiliation(s)
- Lalit Mohan Nainwal
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Mohammad Shaququzzaman
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Mymoona Akhter
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Asif Husain
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Suhel Parvez
  - Department of Toxicology, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi, India
- Sharba Tasneem
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Ashif Iqubal
  - Department of Pharmacology, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
- Mohammad Mumtaz Alam
  - Drug Design and Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi, India
3
Xu X, Zou X. Dissimilar Ligands Bind in a Similar Fashion: A Guide to Ligand Binding-Mode Prediction with Application to CELPP Studies. Int J Mol Sci 2021; 22:12320. [PMID: 34830201 PMCID: PMC8625032 DOI: 10.3390/ijms222212320] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/01/2021] [Revised: 11/11/2021] [Accepted: 11/12/2021] [Indexed: 11/25/2022] Open
Abstract
The molecular similarity principle has achieved great successes in the field of drug design/discovery. Existing studies have focused on similar ligands, while the behaviors of dissimilar ligands remain unknown. In this study, we developed an intercomparison strategy in order to compare the binding modes of ligands with different molecular structures. A systematic analysis of a newly constructed protein–ligand complex structure dataset showed that ligands with similar structures tended to share a similar binding mode, which is consistent with the Molecular Similarity Principle. More importantly, the results revealed that dissimilar ligands can also bind in a similar fashion. This finding may open another avenue for drug discovery. Furthermore, a template-guiding method was introduced for predicting protein–ligand complex structures. With the use of dissimilar ligands as templates, our method significantly outperformed the traditional molecular docking methods. The newly developed template-guiding method was further applied to recent CELPP studies.
Affiliation(s)
- Xianjin Xu
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211, USA
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211, USA
  - Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
  - Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA
- Xiaoqin Zou
  - Dalton Cardiovascular Research Center, University of Missouri, Columbia, MO 65211, USA
  - Department of Physics and Astronomy, University of Missouri, Columbia, MO 65211, USA
  - Department of Biochemistry, University of Missouri, Columbia, MO 65211, USA
  - Institute for Data Science and Informatics, University of Missouri, Columbia, MO 65211, USA
  - Correspondence:
4
Hackett WE, Zaia J. Calculating Glycoprotein Similarities From Mass Spectrometric Data. Mol Cell Proteomics 2021; 20:100028. [PMID: 32883803 PMCID: PMC8724611 DOI: 10.1074/mcp.r120.002223] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 07/11/2020] [Revised: 08/24/2020] [Accepted: 09/03/2020] [Indexed: 12/23/2022] Open
Abstract
Complex protein glycosylation occurs through biosynthetic steps in the secretory pathway that create macro- and microheterogeneity of structure and function. Required for all life forms, glycosylation diversifies and adapts protein interactions with binding partners that underpin interactions at cell surfaces and pericellular and extracellular environments. Because these biological effects arise from heterogeneity of structure and function, it is necessary to measure their changes as part of the quest to understand nature. Quite often, however, the assumption behind proteomics that posttranslational modifications are discrete additions that can be modeled using the genome as a template does not apply to protein glycosylation. Rather, it is necessary to quantify the glycosylation distribution at each glycosite and to aggregate this information into a population of mature glycoproteins that exist in a given biological system. To date, mass spectrometric methods for assigning singly glycosylated peptides are well-established. But it is necessary to quantify glycosylation heterogeneity accurately in order to gauge the alterations that occur during biological processes. The task is to quantify the glycosylated peptide forms as accurately as possible and then apply appropriate bioinformatics algorithms to the calculation of micro- and macro-similarities. In this review, we summarize current approaches for protein quantification as they apply to this glycoprotein similarity problem.
Affiliation(s)
- William E Hackett
  - Bioinformatics Program, Boston University, Boston, Massachusetts, USA
- Joseph Zaia
  - Bioinformatics Program, Boston University, Boston, Massachusetts, USA
  - Department of Biochemistry, Boston University, Boston, Massachusetts, USA
5
Nainwal LM, Shaququzzaman M, Akhter M, Husain A, Parvez S, Khan F, Naematullah M, Alam MM. Synthesis, ADMET prediction and reverse screening study of 3,4,5-trimethoxy phenyl ring pendant sulfur-containing cyanopyrimidine derivatives as promising apoptosis inducing anticancer agents. Bioorg Chem 2020; 104:104282. [PMID: 33010624 DOI: 10.1016/j.bioorg.2020.104282] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/11/2020] [Revised: 09/03/2020] [Accepted: 09/12/2020] [Indexed: 02/09/2023]
Abstract
Cancer remains one of the leading global health problems, owing either to the meagre and suboptimal therapeutic response to chemotherapeutic agents or to the emergence of spontaneous, complex multidrug resistance in cancer cells. This has created a persistent need for the development of new anticancer agents. Encouraged by the high success rate of natural product-based drug discovery and the current research scenario, we synthesized a new series of 3,4,5-trimethoxy phenyl ring pendant sulfur-containing cyanopyrimidine derivatives clubbed with different amines, with the aim of finding an anticancer lead compound. To probe the anti-proliferative spectrum of the synthesized derivatives, an in vitro evaluation was carried out against a panel of 60 cancer cell lines at the National Cancer Institute (NCI), representing the major types of cancer. Most of the derivatives showed good to moderate anti-proliferative activity. Compound 4e displayed the most promising broad-spectrum anticancer activity, with high growth inhibition of various cell lines representing multiple cancers. Mechanistic investigation of compound 4e in human breast cancer MDA-MB-231 cells showed that it triggers cell death through the induction of apoptosis. ADMET studies and reverse screening were also performed to identify the potential targets of the designed molecules. It was concluded that the 3,4,5-trimethoxy phenyl ring pendant sulfur-containing cyanopyrimidine derivative 4e could act as a promising hit molecule for further development of novel anticancer therapeutics.
Affiliation(s)
- Lalit Mohan Nainwal
  - Drug Design & Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
- Mohammad Shaququzzaman
  - Drug Design & Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
- Mymoona Akhter
  - Drug Design & Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
- Asif Husain
  - Drug Design & Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
- Suhel Parvez
  - Department of Toxicology, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi 110062, India
- Farah Khan
  - Department of Biochemistry, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi 110062, India
- Md Naematullah
  - Department of Biochemistry, School of Chemical and Life Sciences, Jamia Hamdard, New Delhi 110062, India
- Mohammad Mumtaz Alam
  - Drug Design & Medicinal Chemistry Lab, Department of Pharmaceutical Chemistry, School of Pharmaceutical Education and Research, Jamia Hamdard, New Delhi 110062, India
6
Cao Y, Park SJ, Im W. A systematic analysis of protein-carbohydrate interactions in the Protein Data Bank. Glycobiology 2020; 31:126-136. [PMID: 32614943 DOI: 10.1093/glycob/cwaa062] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/10/2020] [Revised: 06/26/2020] [Accepted: 06/26/2020] [Indexed: 12/17/2022] Open
Abstract
Protein-carbohydrate interactions underlie essential biological processes. Elucidating the mechanism of protein-carbohydrate recognition is a prerequisite for modeling and optimizing protein-carbohydrate interactions, which will help in discovery of carbohydrate-derived therapeutics. In this work, we present a survey of a curated database consisting of 6,402 protein-carbohydrate complexes in the Protein Data Bank (PDB). We performed an all-against-all comparison of a subset of nonredundant binding sites, and the result indicates that the interaction pattern similarity is not completely relevant to the binding site structural similarity. Investigation of both binding site and ligand promiscuities reveals that the geometry of chemical feature points is more important than local backbone structure in determining protein-carbohydrate interactions. A further analysis on the frequency and geometry of atomic interactions shows that carbohydrate functional groups are not equally involved in binding interactions. Finally, we discuss the usefulness of protein-carbohydrate complexes in the PDB with acknowledgement that the carbohydrates in many structures are incomplete.
Affiliation(s)
- Yiwei Cao
  - Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Sang-Jun Park
  - Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA
- Wonpil Im
  - Departments of Biological Sciences, Chemistry, Bioengineering, and Computer Sciences and Engineering, Lehigh University, Bethlehem, PA 18015, USA
  - School of Computational Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
7
Marrero-Ponce Y, Teran JE, Contreras-Torres E, García-Jacas CR, Perez-Castillo Y, Cubillan N, Peréz-Giménez F, Valdés-Martini JR. LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs. J Theor Biol 2020; 485:110039. [DOI: 10.1016/j.jtbi.2019.110039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/08/2019] [Revised: 09/11/2019] [Accepted: 10/02/2019] [Indexed: 11/28/2022]
8
Bologa CG, Ursu O, Oprea TI. How to Prepare a Compound Collection Prior to Virtual Screening. Methods Mol Biol 2019; 1939:119-138. [PMID: 30848459 DOI: 10.1007/978-1-4939-9089-4_7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/23/2022]
Abstract
Virtual screening is a well-established technique that has proven to be successful in the identification of novel biologically active molecules, including drug repurposing. Whether for ligand-based or for structure-based virtual screening, a chemical collection needs to be properly processed prior to in silico evaluation. Here we describe our step-by-step procedure for handling very large collections (up to billions) of compounds prior to virtual screening.
Affiliation(s)
- Cristian G Bologa
  - Division of Translational Informatics, Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
- Oleg Ursu
  - Merck Research Laboratories, Boston, MA, USA
  - Division of Translational Informatics, Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
- Tudor I Oprea
  - Division of Translational Informatics, Department of Internal Medicine, University of New Mexico School of Medicine, Albuquerque, NM, USA
9
Abstract
Drugs modulate disease states through their actions on targets in the body. Determining these targets aids the focused development of new treatments, and helps to better characterize those already employed. One means of accomplishing this is through the deployment of in silico methodologies, harnessing computational analytical and predictive power to produce educated hypotheses for experimental verification. Here, we provide an overview of the current state of the art, describe some of the well-established methods in detail, and reflect on how they, and emerging technologies promoting the incorporation of complex and heterogeneous data-sets, can be employed to improve our understanding of (poly)pharmacology.
Affiliation(s)
- Ryan Byrne
  - Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
10
Ashenden SK. Screening Library Design. Methods Enzymol 2018; 610:73-96. [PMID: 30390806 DOI: 10.1016/bs.mie.2018.09.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
Thanks to technological advances and a greater understanding of the biological and chemical natures of targets and related diseases, high-throughput screening (HTS) has become faster, cheaper, and more accessible. Yet, despite these advances in technology and understanding, the number of novel drugs approved each year has not been increasing. 2017 was considered a "bumper" year with a total of 46 approved drugs, over double that of the previous year. However, it is thought that part of the reason HTS has not lived up to expectations lies in the contents of current chemical libraries. Therefore, new methods to design screening libraries are of great interest.
Affiliation(s)
- Stephanie Kay Ashenden
  - Department of Chemistry, Cambridge University, Cambridge, United Kingdom
  - Discovery Sciences, IMed Biotech Unit, AstraZeneca R&D, Cambridge, United Kingdom
11
Abstract
High-throughput and high-content screening campaigns have resulted in the creation of large chemogenomic matrices. These matrices form the training data which is used to build ligand-target interaction models for pharmacological and chemical biology research. While academic, government, and industrial efforts continuously add to the ligand-target data pairs available for modeling, major research efforts are devoted to improving machine learning techniques to cope with the sparseness, heterogeneity, and size of available datasets as well as inherent noise and bias. This "race of arms" has led to the creation of algorithms to generate highly complex models with high prediction performance at the cost of training efficiency as well as interpretability. In contrast, recent studies have challenged the necessity for "big data" in chemogenomic modeling and found that models built on larger numbers of examples do not necessarily result in better predictive abilities. Automated adaptive selection of the training data (ligand-target instances) used for model creation can result in considerably smaller training sets that retain prediction performance on par with training using hundreds of thousands of data points. In this chapter, we describe the protocols used for one such iterative chemogenomic selection technique, including model construction and update as well as possible techniques for evaluations of constructed models and analysis of the iterative model construction.
Affiliation(s)
- Daniel Reker
  - Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, USA
- J B Brown
  - Life Science Informatics Research Unit, Laboratory of Molecular Biosciences, Kyoto University Graduate School of Medicine, Kyoto, Japan
12
Naveja JJ, Oviedo-Osornio CI, Trujillo-Minero NN, Medina-Franco JL. Chemoinformatics: a perspective from an academic setting in Latin America. Mol Divers 2017; 22:247-258. [PMID: 29204824 DOI: 10.1007/s11030-017-9802-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 10/05/2017] [Accepted: 11/26/2017] [Indexed: 12/13/2022]
Abstract
This perspective discusses the current progress of a chemoinformatics group in a major university in Latin America. Three major aspects are discussed in a critical manner: research, education, and collaboration with industry and other public research networks. An overview of the progress in applied research and in the development of research concepts is also presented. Efforts to teach chemoinformatics at the undergraduate and graduate levels are discussed. We address how partnership with industry and other not-for-profit research institutions not only brings additional sources of funding but, more importantly, increases the impact of the multidisciplinary work and gives students exposure to other research environments. We also discuss the main perspectives and challenges that remain to be addressed in these settings.
Affiliation(s)
- J Jesús Naveja
  - School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
  - PECEM, Facultad de Medicina, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- C Iluhí Oviedo-Osornio
  - School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- Nicole N Trujillo-Minero
  - School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
- José L Medina-Franco
  - School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510, Mexico City, Mexico
13
Taraji M, Haddad PR, Amos RIJ, Talebi M, Szucs R, Dolan JW, Pohl CA. Chemometric-assisted method development in hydrophilic interaction liquid chromatography: A review. Anal Chim Acta 2017; 1000:20-40. [PMID: 29289311 DOI: 10.1016/j.aca.2017.09.041] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 06/07/2017] [Revised: 09/22/2017] [Accepted: 09/24/2017] [Indexed: 02/09/2023]
Abstract
With an enormous growth in the application of hydrophilic interaction liquid chromatography (HILIC), there has also been significant progress in HILIC method development. HILIC is a chromatographic method that utilises hydro-organic mobile phases with a high organic content, and a hydrophilic stationary phase. It has been applied predominantly in the determination of small polar compounds. Theoretical studies in computer-aided modelling tools, most importantly the predictive, quantitative structure retention relationship (QSRR) modelling methods, have attracted the attention of researchers and these approaches greatly assist the method development process. This review focuses on the application of computer-aided modelling tools in understanding the retention mechanism, the classification of HILIC stationary phases, prediction of retention times in HILIC systems, optimisation of chromatographic conditions, and description of the interaction effects of the chromatographic factors in HILIC separations. Additionally, what has been achieved in the potential application of QSRR methodology in combination with experimental design philosophy in the optimisation of chromatographic separation conditions in the HILIC method development process is communicated. Developing robust predictive QSRR models will undoubtedly facilitate more application of this chromatographic mode in a broader variety of research areas, significantly minimising cost and time of the experimental work.
Affiliation(s)
- Maryam Taraji
  - Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
- Paul R Haddad
  - Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
- Ruth I J Amos
  - Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
- Mohammad Talebi
  - Australian Centre for Research on Separation Science (ACROSS), School of Physical Sciences-Chemistry, University of Tasmania, Private Bag 75, Hobart 7001, Australia
- Roman Szucs
  - Pfizer Global Research and Development, CT13 9NJ, Sandwich, UK
- John W Dolan
  - LC Resources, 1795 NW Wallace Rd., McMinnville, OR 97128, USA
14
Varela JN, Lammoglia Cobo MF, Pawar SV, Yadav VG. Cheminformatic Analysis of Antimalarial Chemical Space Illuminates Therapeutic Mechanisms and Offers Strategies for Therapy Development. J Chem Inf Model 2017; 57:2119-2131. [PMID: 28810125 DOI: 10.1021/acs.jcim.7b00072] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/23/2022]
Abstract
The clear and present danger of malaria, which has been amplified in recent years by climate change, and the progressive thinning of our drug arsenal over the past two decades raise uncomfortable questions about the current state and future of antimalarial drug development. Besides suffering from many of the same technical challenges that affect drug development in other disease areas, the quest for new antimalarial therapies is also hindered by the complex, dynamic life cycle of the malaria parasite, P. falciparum, in its mosquito and human hosts, and its role thereof in the elicitation of drug resistance. New strategies are needed in order to ensure economical and expeditious development of new, more efficacious treatments. In the present study, we employ open-source cheminformatics tools to analyze the chemical space traversed by approved antimalarial drugs and promising candidates at various stages of development to uncover insights that could shape future endeavors in the field. Our scaffold-centric analysis reveals that the antimalarial chemical space is disjointed and segregated into a few dominant structural groups. In fact, the structures of antimalarial drugs and drug candidates are distributed according to Pareto's principle. This structural convergence can potentially be exploited for future drug discovery by incorporating it into bioinformatics workflows that are typically employed for solving problems in structural biology. Significantly, we demonstrate how molecular scaffold hunting can be applied to unearth putative mechanisms of action of drugs whose activities remain a mystery, and how scaffold-centric analysis of drug space can also provide a recipe for combination therapies that minimize the likelihood of emergence of drug resistance, as well as identify areas on which to focus efforts. 
Finally, we also observe that over half of the molecules in the antimalarial space bear no resemblance to other molecules in the collection, which suggests that the pharmacobiology of antimalarial drugs has not been entirely surveyed.
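The Pareto-style concentration of structures described in the abstract above can be checked with a short calculation. The following is a minimal sketch, not the authors' workflow: it assumes each compound has already been reduced to a scaffold identifier (for example a scaffold SMILES string), and the function name `scaffold_coverage`, the `top_fraction` parameter, and the toy data are hypothetical.

```python
from collections import Counter


def scaffold_coverage(scaffolds, top_fraction=0.2):
    """Share of compounds covered by the most frequent scaffolds.

    `scaffolds` holds one scaffold identifier per compound; `top_fraction`
    is the fraction of distinct scaffolds, ranked by frequency, to include.
    """
    if not scaffolds:
        return 0.0
    counts = Counter(scaffolds)
    # Frequencies of the distinct scaffolds, most populated first.
    ranked = [count for _, count in counts.most_common()]
    k = max(1, round(top_fraction * len(ranked)))
    return sum(ranked[:k]) / len(scaffolds)


# Toy collection: one dominant scaffold "A" and four singletons.
toy = ["A"] * 8 + ["B", "C", "D", "E"]
print(scaffold_coverage(toy))  # share of compounds held by the top 20% of scaffolds
```

A value well above `top_fraction` would indicate that a few scaffolds account for most of the collection.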
Affiliation(s)
- Julia Nogueira Varela
  - Department of Chemical & Biological Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
- María Fernanda Lammoglia Cobo
  - Department of Chemical & Biological Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
  - Life Sciences Department, Monterrey Institute of Technology and Higher Education, Mexico City Campus, Mexico City, Mexico, 14380
- Sandip V Pawar
  - Department of Chemical & Biological Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
- Vikramaditya G Yadav
  - Department of Chemical & Biological Engineering, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
  - Neglected Global Diseases Initiative, The University of British Columbia, Vancouver, BC, Canada, V6T 1Z3
15
Consensus Diversity Plots: a global diversity analysis of chemical libraries. J Cheminform 2016; 8:63. [PMID: 27895718 PMCID: PMC5105260 DOI: 10.1186/s13321-016-0176-9] [Citation(s) in RCA: 51] [Impact Index Per Article: 6.4] [Received: 08/31/2016] [Accepted: 10/27/2016] [Indexed: 01/14/2023] Open
Abstract
Background: Measuring the structural diversity of compound databases is relevant in drug discovery and many other areas of chemistry. Since molecular diversity depends on molecular representation, comprehensive chemoinformatic analysis of the diversity of libraries uses multiple criteria. For instance, the diversity of the molecular libraries is typically evaluated employing molecular scaffolds, structural fingerprints, and physicochemical properties. However, the assessment with each criterion is analyzed independently and it is not straightforward to provide an evaluation of the “global diversity”.
Results: Herein the Consensus Diversity Plot (CDP) is proposed as a novel method to represent in low dimensions the diversity of chemical libraries considering simultaneously multiple molecular representations. We illustrate the application of CDPs to classify eight compound data sets and two subsets with different sizes and compositions using molecular scaffolds, structural fingerprints, and physicochemical properties.
Conclusions: CDPs are general data mining tools that represent in two-dimensions the global diversity of compound data sets using multiple metrics. These plots can be constructed using single or combined measures of diversity. An online version of the CDPs is freely available at: https://consensusdiversityplots-difacquim-unam.shinyapps.io/RscriptsCDPlots/.
Graphical abstract: Consensus Diversity Plot is a novel data mining tool that represents in two-dimensions the global diversity of compound data sets using multiple metrics.
Electronic supplementary material: The online version of this article (doi:10.1186/s13321-016-0176-9) contains supplementary material, which is available to authorized users.
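To make the idea of fingerprint-based diversity in the abstract above concrete, the following is a minimal sketch of one common single-criterion measure, the median pairwise Tanimoto similarity of a data set. It is an illustration under stated assumptions, not the CDP implementation: fingerprints are represented here as plain Python sets of on-bit indices, and the function names and toy fingerprints are hypothetical.

```python
from itertools import combinations
from statistics import median


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (a convention chosen here).
    return len(a & b) / union if union else 1.0


def median_pairwise_similarity(fingerprints):
    """Median Tanimoto similarity over all unordered pairs; None if fewer than 2."""
    sims = [tanimoto(a, b) for a, b in combinations(fingerprints, 2)]
    return median(sims) if sims else None


# Toy fingerprints (illustrative bit indices only).
library = [{1, 2, 3}, {2, 3, 4}, {7, 8}]
print(median_pairwise_similarity(library))
```

A lower median similarity indicates a more diverse set under this one representation; combining it with scaffold and property measures is the step the abstract describes.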
|
16
|
Kasahara K, Kinoshita K. Landscape of protein-small ligand binding modes. Protein Sci 2016; 25:1659-71. [PMID: 27327045 PMCID: PMC5338237 DOI: 10.1002/pro.2971] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/27/2015] [Revised: 06/04/2016] [Accepted: 06/15/2016] [Indexed: 11/15/2022]
Abstract
Elucidating the mechanisms of specific small-molecule (ligand) recognition by proteins is a long-standing conundrum. While the structures of these molecules, proteins and ligands, have been extensively studied, protein-ligand interactions, or binding modes, have not been comprehensively analyzed. Although methods for assessing similarities of binding site structures have been extensively developed, the methods for the computational treatment of binding modes have not been well established. Here, we developed a computational method for encoding the information about binding modes as graphs, and assessing their similarities. An all-against-all comparison of 20,040 protein-ligand complexes provided the landscape of the protein-ligand binding modes and its relationships with protein and chemical spaces. While similar proteins in the same SCOP Family tend to bind relatively similar ligands with similar binding modes, the correlation between ligand and binding similarities was not very high (R² = 0.443). We found many pairs with novel relationships, in which two evolutionarily distant proteins recognize dissimilar ligands by similar binding modes (757,474 pairs out of 200,790,780 pairs were categorized into this relationship in our dataset). In addition, there was an abundance of pairs of homologous proteins binding to similar ligands with different binding modes (68,217 pairs). Our results showed that many interesting relationships between protein-ligand complexes are still hidden in the structure database, and that our new method for assessing binding mode similarities is effective for finding them.
Affiliation(s)
- Kota Kasahara
- College of Life Sciences, Ritsumeikan University, Kusatsu, Shiga 525‐8577, Japan
| | - Kengo Kinoshita
- Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi 980‐8597, Japan
- Tohoku Medical Megabank Organization, Tohoku University, Sendai, Miyagi 980‐8573, Japan
- Institute of Development, Aging and Cancer, Tohoku University, Sendai, Miyagi 980‐8575, Japan
|
17
|
Armacost KA, Goh GB, Brooks CL. Biasing Potential Replica Exchange Multisite λ-Dynamics for Efficient Free Energy Calculations. J Chem Theory Comput 2016; 11:1267-77. [PMID: 26579773 DOI: 10.1021/ct500894k] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 01/16/2023]
Abstract
Traditional free energy calculation methods are well known for their drawbacks in scalability and in the speed of converging results, particularly for calculations with large perturbations. In the present work, we report on the development of biasing potential replica exchange multisite λ-dynamics (BP-REX MSλD), which is a free energy method that is capable of performing simultaneous alchemical free energy transformations, including perturbations between flexible moieties. BP-REX MSλD and the original MSλD are applied to a series of symmetrical 2,5-benzoquinone derivatives covering a diverse chemical space and range of conformational flexibility. Improved λ-space sampling is observed for the BP-REX MSλD simulations, yielding a 2-5-fold increase in the number of transitions between substituents compared to traditional MSλD. We also demonstrate the efficacy of varying the value of c, the parameter that controls the ruggedness of the landscape mediating the sampling of λ-states, based on the flexibility of the fragment. Finally, we developed a protocol for maximizing the transition frequency between fragments. This protocol reduces the "kinetic barrier" for alchemically transforming fragments by grouping and ordering based on volume. These findings are applied to a challenging test set involving a series of geldanamycin-based inhibitors of heat shock protein 90 (Hsp90). Even though the perturbations span volume changes as large as 60 Å³, the values for the free energy change achieve an average unsigned error (AUE) of 1.5 kcal/mol relative to experimental Kd measurements with a reasonable correlation (R = 0.56). Our results suggest that the BP-REX MSλD algorithm is a highly efficient and scalable free energy method, which when utilized will enable routine calculations on the order of hundreds of compounds using only a few simulations.
Affiliation(s)
- Kira A Armacost
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
| | - Garrett B Goh
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
| | - Charles L Brooks
- Department of Chemistry, University of Michigan, 930 North University Avenue, Ann Arbor, Michigan 48109, United States
|
18
|
Exploration of Scaffolds from Natural Products with Antiplasmodial Activities, Currently Registered Antimalarial Drugs and Public Malarial Screen Data. Molecules 2016; 21:104. [PMID: 26784165 PMCID: PMC6273396 DOI: 10.3390/molecules21010104] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 11/09/2015] [Revised: 01/06/2016] [Accepted: 01/12/2016] [Indexed: 01/07/2023] Open
Abstract
In light of current resistance to antimalarial drugs, there is a need to discover new classes of antimalarial agents with unique mechanisms of action. Identification of unique scaffolds from natural products with in vitro antiplasmodial activities may be the starting point for such new classes of antimalarial agents. We therefore conducted scaffold diversity and comparison analysis of natural products with in vitro antiplasmodial activities (NAA), currently registered antimalarial drugs (CRAD) and malaria screen data from Medicines for Malaria Venture (MMV). The scaffold diversity analyses on the three datasets were performed using scaffold counts and cumulative scaffold frequency plots. Scaffolds from the NAA were compared to those from CRAD and MMV. A Scaffold Tree was also generated for each of the datasets, and the scaffold diversity of NAA was found to be higher than that of MMV. Among the NAA compounds, we identified unique scaffolds that were not contained in any of the other compound datasets. These scaffolds from NAA also possess desirable drug-like properties, making them ideal starting points for antimalarial drug design considerations. The Scaffold Tree showed the preponderance of ring systems in NAA and identified virtual scaffolds, which may be potential bioactive compounds.
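The scaffold counts and cumulative scaffold frequency plots mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `cumulative_scaffold_frequency` and the toy scaffold labels are hypothetical, and scaffold extraction itself is assumed to have been done elsewhere.

```python
from collections import Counter


def cumulative_scaffold_frequency(scaffolds):
    """Return (fraction_of_scaffolds, fraction_of_compounds) points.

    `scaffolds` holds one scaffold identifier per compound. Scaffolds are
    ranked from most to least populated, as in a cumulative frequency plot.
    """
    if not scaffolds:
        return []
    ranked = sorted(Counter(scaffolds).values(), reverse=True)
    total_compounds = sum(ranked)
    total_scaffolds = len(ranked)
    points = []
    covered = 0
    for rank, n_compounds in enumerate(ranked, start=1):
        covered += n_compounds
        points.append((rank / total_scaffolds, covered / total_compounds))
    return points


# Hypothetical toy data set: the labels are placeholders, not real scaffolds.
toy = ["A", "A", "A", "A", "B", "B", "C", "D"]
curve = cumulative_scaffold_frequency(toy)
for x, y in curve:
    print(f"{x:.2f} of scaffolds cover {y:.3f} of compounds")
```

A curve that rises steeply means a few scaffolds account for most compounds (low scaffold diversity); a curve close to the diagonal means compounds are spread evenly over scaffolds.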
|
19
|
Abstract
BACKGROUND Atom environments and fragments find wide-spread use in chemical information and cheminformatics. They are the basis of prediction models, an integral part in similarity searching, and employed in structure search techniques. Most of these methods were developed and evaluated on the relatively small sets of chemical structures available at the time. An analysis of fragment distributions representative of most known chemical structures was published in the 1970s using the Chemical Abstracts Service data system. More recently, advances in automated synthesis of chemicals allow millions of chemicals to be synthesized by a single organization. In addition, open chemical databases are readily available containing tens of millions of chemical structures from a multitude of data sources, including chemical vendors, patents, and the scientific literature, making it possible for scientists to readily access most known chemical structures. With this availability of information, one can now address interesting questions, such as: what chemical fragments are known today? How do these fragments compare to earlier studies? How unique are chemical fragments found in chemical structures? RESULTS For our analysis, after hydrogen suppression, atoms were characterized by atomic number, formal charge, implicit hydrogen count, explicit degree (number of neighbors), valence (bond order sum), and aromaticity. Bonds were differentiated as single, double, triple or aromatic bonds. Atom environments were created in a circular manner focused on a central atom with radii from 0 (atom types) up to 3 (representative of ECFP_6 fragments). In total, combining atom types and atom environments that include up to three spheres of nearest neighbors, our investigation identified 28,462,319 unique fragments in the 46 million structures found in the PubChem Compound database as of January 2013. 
We could identify several factors inflating the number of environments involving transition metals, with many seemingly due to erroneous interpretation of structures from patent data. Compared to fragmentation statistics published 40 years ago, the exponential growth in chemistry is mirrored in a nearly eightfold increase in the number of unique chemical fragments; however, this result is clearly an upper bound estimate as earlier studies employed structure sampling approaches and this study shows that a relatively high rate of atom fragments are found in only a single chemical structure (singletons). In addition, the percentage of singletons grows as the size of the chemical fragment is increased. CONCLUSIONS The observed growth of the numbers of unique fragments over time suggests that many chemically possible connections of atom types to larger fragments have yet to be explored by chemists. A dramatic drop in the relative rate of increase of atom environments from smaller to larger fragments shows that larger fragments mainly consist of diverse combinations of a limited subset of smaller fragments. This is further supported by the observed concomitant increase of singleton atom environments. Combined, these findings suggest that there is considerable opportunity for chemists to combine known fragments to novel chemical compounds. The comparison of PubChem to an older study of known chemical structures shows noticeable differences. The changes suggest advances in synthetic capabilities of chemists to combine atoms in new patterns. Log-log plots of fragment incidence show small numbers of fragments are found in many structures and that large numbers of fragments are found in very few structures, with nearly half being novel using the methods in this work. The relative decrease in the count of new fragments as a function of size further suggests considerable opportunity for more novel chemicals exists. 
Lastly, the differences in atom environment diversity between PubChem Substance and Compound showcase the effect of PubChem standardization protocols, but also indicate that a normalization procedure for atom types, functional groups, and tautomeric/resonance forms based on atom environments is possible. The complete sets of atom types and atom environments are supplied as supporting information.
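The circular atom environments described in this abstract (a central atom plus all atoms within a given number of bonds, radii 0 to 3) can be sketched with a breadth-first search over a bond list. This is a simplified illustration, not the authors' procedure: it captures only topological reach and ignores the atom and bond typing (atomic number, charge, aromaticity, bond order) that the study used. The function name `atom_environment` and the toy graph are hypothetical.

```python
from collections import deque


def atom_environment(bonds, center, radius):
    """Return the sorted atom indices within `radius` bonds of `center`."""
    neighbors = {}
    for a, b in bonds:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    distance = {center: 0}
    queue = deque([center])
    while queue:
        atom = queue.popleft()
        if distance[atom] == radius:
            continue  # do not expand beyond the requested radius
        for nxt in neighbors.get(atom, ()):
            if nxt not in distance:
                distance[nxt] = distance[atom] + 1
                queue.append(nxt)
    return sorted(distance)


# Hypothetical hydrogen-suppressed graph for ethanol: C0-C1-O2.
ethanol_bonds = [(0, 1), (1, 2)]
for r in range(3):
    print(r, atom_environment(ethanol_bonds, 0, r))
```

Radius 0 corresponds to the bare atom type; each additional sphere adds the next shell of neighbors, which is why the number of distinct environments grows with radius.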
Affiliation(s)
- Volker D Hähnke
- Department of Health and Human Services, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Evan E Bolton
- Department of Health and Human Services, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
| | - Stephen H Bryant
- Department of Health and Human Services, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894 USA
|
20
|
Reker D, Schneider G. Active-learning strategies in computer-assisted drug discovery. Drug Discov Today 2015; 20:458-65. [DOI: 10.1016/j.drudis.2014.12.004] [Citation(s) in RCA: 100] [Impact Index Per Article: 11.1] [Received: 09/15/2014] [Revised: 11/13/2014] [Accepted: 12/02/2014] [Indexed: 12/20/2022]
|
21
|
Liu X, Campillos M. Unveiling new biological relationships using shared hits of chemical screening assay pairs. Bioinformatics 2015; 30:i579-86. [PMID: 25161250 PMCID: PMC4147921 DOI: 10.1093/bioinformatics/btu468] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 01/09/2023]
Abstract
Motivation: Although the integration and analysis of the activity of small molecules across multiple chemical screens is a common approach to determine the specificity and toxicity of hits, the suitability of these approaches to reveal novel biological information is less explored. Here, we test the hypothesis that assays sharing selective hits are biologically related. Results: We annotated the biological activities (i.e. biological processes or molecular activities) measured in assays and constructed chemical hit profiles with sets of compounds differing in their selectivity level for 1640 assays of the ChemBank repository. We compared the similarity of chemical hit profiles of pairs of assays with their biological relationships and observed that assay pairs sharing non-promiscuous chemical hits tend to be biologically related. A detailed analysis of a network containing assay pairs with the highest hit similarity confirmed biologically meaningful relationships. Furthermore, the biological roles of predicted molecular targets of the shared hits reinforced the biological associations between assay pairs. Contact: monica.campillos@helmholtz-muenchen.de Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xueping Liu
- Institute of Bioinformatics and Systems Biology and German Center for Diabetes Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
| | - Monica Campillos
- Institute of Bioinformatics and Systems Biology and German Center for Diabetes Research, Helmholtz Center Munich, 85764 Neuherberg, Germany
|
22
|
Copeland JC, Zehr LJ, Cerny RL, Powers R. The applicability of molecular descriptors for designing an electrospray ionization mass spectrometry compatible library for drug discovery. Comb Chem High Throughput Screen 2014; 15:806-15. [PMID: 22708878 DOI: 10.2174/138620712803901180] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 04/06/2012] [Revised: 05/25/2012] [Accepted: 06/08/2012] [Indexed: 11/22/2022]
Abstract
Detecting a small molecular-weight compound by electrospray ionization mass spectrometry (ESI-MS) requires the compound to obtain a charge. Factors such as gas-phase proton affinities and analyte surface activity are correlated with a positive ESI-MS response, but unfortunately it is extremely challenging to predict from a chemical structure alone if a compound is likely to yield an observable molecular-ion peak in an ESI-MS spectrum. Thus, the design of a chemical library for an ESI-MS ligand-affinity screen is particularly daunting. Only 56.9% of the compounds from our FAST-NMR functional library [1] were detectable by ESI-MS. An analysis of ~1,600 molecular descriptors did not identify any correlation with a positive ESI-MS response that cannot be attributed to a skewed population distribution. Unfortunately, our results suggest that molecular descriptors are not a valuable approach for designing a chemical library for an MS-based ligand affinity screen.
Affiliation(s)
- Jennifer C Copeland
- Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE 68588-0304, USA
|
23
|
Abstract
Virtual screening is an established technique that has successfully been deployed in the identification of novel biologically active molecules. Whether for ligand-based or for structure-based virtual screening, a chemical collection needs to be properly processed prior to in silico evaluation. Here we describe our step-by-step procedure for handling large collections of compounds prior to virtual screening.
Affiliation(s)
- Cristian G Bologa
- Department of Biochemistry and Molecular Biology, University of New Mexico School of Medicine, Albuquerque, NM, USA
|
24
|
Abstract
ClusterMine360 (http://www.clustermine360.ca/) is a database of microbial polyketide and non-ribosomal peptide gene clusters. It takes advantage of crowd-sourcing by allowing members of the community to make contributions while automation is used to help achieve high data consistency and quality. The database currently has >200 gene clusters from >185 compound families. It also features a unique sequence repository containing >10 000 polyketide synthase/non-ribosomal peptide synthetase domains. The sequences are filterable and downloadable as individual or multiple sequence FASTA files. We are confident that this database will be a useful resource for members of the polyketide synthases/non-ribosomal peptide synthetases research community, enabling them to keep up with the growing number of sequenced gene clusters and rapidly mine these clusters for functional information.
Affiliation(s)
- Kyle R Conway
- Department of Chemistry, Center for Advanced Research in Environmental Genomics, University of Ottawa, Ottawa, Ontario K1N 6N5, Canada
|
25
|
Abstract
High-throughput screening (HTS) is a key process used in drug discovery to identify hits from compound libraries that may become leads for medicinal chemistry optimization. This updated overview discusses the utilization of compound libraries, compounds derived from combinatorial and parallel synthesis campaigns and natural product sources; creation of mother and daughter plates; and compound storage, handling, and bar coding in HTS. The unit also presents an overview of established and emerging assay technologies (i.e., time-resolved fluorescence, fluorescence polarization, fluorescence-correlation spectroscopy, functional whole cell assays, and high-content assays) and their integration in automation hardware and IT systems. This revised unit provides updated descriptions of state-of-the-art instrumentation and technologies in this rapidly changing environment. The section on assay methodologies now also covers enzyme complementation assays and methods for high-throughput screening of ion channel activities. Finally, a section on criteria for assay robustness is included discussing the Z'-factor, which is now a widely accepted criterion for evaluation and validation of high throughput screening assays.
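The Z'-factor named at the end of this abstract is commonly defined from the means and standard deviations of the positive and negative controls as Z' = 1 − 3(σp + σn) / |μp − μn|. The sketch below computes it with the Python standard library; the function name `z_prime` and the control readings are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z'-factor from positive- and negative-control readings.

    Uses the sample standard deviation (an assumption of this sketch).
    """
    mu_p, mu_n = mean(positive), mean(negative)
    if mu_p == mu_n:
        raise ValueError("control means are identical; Z' is undefined")
    sd_p, sd_n = stdev(positive), stdev(negative)
    return 1.0 - 3.0 * (sd_p + sd_n) / abs(mu_p - mu_n)


# Hypothetical plate controls (arbitrary signal units), not measured data.
positive_controls = [100, 102, 98, 101, 99]
negative_controls = [10, 12, 8, 11, 9]
print(round(z_prime(positive_controls, negative_controls), 3))
```

Values approaching 1 indicate a wide separation band between controls relative to their variability; values between 0.5 and 1 are conventionally taken as an excellent assay.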
Affiliation(s)
- Michael Entzeroth
- Experimental Therapeutics Centre, Agency for Science, Technology, and Research (A*STAR), Singapore
|
26
|
Le Guilloux V, Colliandre L, Bourg S, Guénegou G, Dubois-Chevalier J, Morin-Allory L. Visual characterization and diversity quantification of chemical libraries: 1. creation of delimited reference chemical subspaces. J Chem Inf Model 2011; 51:1762-74. [PMID: 21761916 DOI: 10.1021/ci200051r] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]
Abstract
High-throughput screening (HTS) is a well-established technology which can test up to several million compounds in a few weeks. Despite these appealing capabilities, available resources and high costs may limit the number of molecules screened, making diversity analysis a method of choice to design and prioritize screening libraries. With a constantly increasing number of molecules available for screening, chemical space has become a key concept for visualizing, analyzing, and comparing chemical libraries. In this first article, we present a new method to build delimited reference chemical subspaces (DRCS). A set of 16 million screening compounds from 73 chemical providers has been gathered, resulting in a database of 6.63 million standardized and unique molecules. These molecules have been used to create three DRCS using three different sets of chemical descriptors. A robust principal component analysis model for each space has been obtained, whereby molecules are projected in a reduced two-dimensional viewable space. The specificity of our approach is that each reduced space has been delimited by a representative contour encompassing a very large proportion of molecules and reflecting its overall shape. The methodology is illustrated by mapping and comparing various chemical libraries. Several tools used in these studies are made freely available, thus enabling any user to compute DRCS matching specific requirements.
Affiliation(s)
- Vincent Le Guilloux
- Institut de Chimie Organique et Analytique (ICOA), Université d'Orléans, rue de Chartres, 45067 Orléans Cedex 2, France
|
27
|
Abstract
This chapter provides a brief overview of chemoinformatics and its applications to chemical library design. It is meant to be a quick starter and to serve as an invitation to readers for more in-depth exploration of the field. The topics covered in this chapter are chemical representation, chemical data and data mining, molecular descriptors, chemical space and dimension reduction, quantitative structure-activity relationship, similarity, diversity, and multiobjective optimization.
|
28
|
Schnur DM, Beno BR, Tebben AJ, Cavallaro C. Methods for combinatorial and parallel library design. Methods Mol Biol 2011; 672:387-434. [PMID: 20838978 DOI: 10.1007/978-1-60761-839-3_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
Diversity has historically played a critical role in design of combinatorial libraries, screening sets and corporate collections for lead discovery. Large library design dominated the field in the 1990s with methods ranging anywhere from purely arbitrary through property based reagent selection to product based approaches. In recent years, however, there has been a downward trend in library size. This was due to increased information about the desirable targets gleaned from the genomics revolution and to the ever growing availability of target protein structures from crystallography and homology modeling. Creation of libraries directed toward families of receptors such as GPCRs, kinases, nuclear hormone receptors, proteases, etc., replaced the generation of libraries based primarily on diversity while single target focused library design has remained an important objective. Concurrently, computing grids and cpu clusters have facilitated the development of structure based tools that screen hundreds of thousands of molecules. Smaller "smarter" combinatorial and focused parallel libraries replaced those early un-focused large libraries in the twenty-first century drug design paradigm. While diversity still plays a role in lead discovery, the focus of current library design methods has shifted to receptor based methods, scaffold hopping/bio-isostere searching, and a much needed emphasis on synthetic feasibility. Methods such as "privileged substructures based design" and pharmacophore based design still are important methods for parallel and small combinatorial library design. This chapter discusses some of the possible design methods and presents examples where they are available.
Affiliation(s)
- Dora M Schnur
- Computer Aided Drug Design, Pharmaceutical Research Institute, Bristol-Myers Squibb Company, Princeton, NJ, USA
|
29
|
Chen H, Engkvist O, Blomberg N. Combinatorial library design from reagent pharmacophore fingerprints. Methods Mol Biol 2011; 685:135-152. [PMID: 20981522 DOI: 10.1007/978-1-60761-931-4_7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]
Abstract
Combinatorial and parallel chemical synthesis technologies are powerful tools in early drug discovery projects. Over the past couple of years an increased emphasis on targeted lead generation libraries and focussed screening libraries in the pharmaceutical industry has driven a surge in computational methods to explore molecular frameworks to establish new chemical equity. In this chapter we describe a complementary technique in the library design process, termed ProSAR, to effectively cover the accessible pharmacophore space around a given scaffold. With this method reagents are selected such that each R-group on the scaffold has an optimal coverage of pharmacophoric features. This is achieved by optimising the Shannon entropy, i.e. the information content, of the topological pharmacophore distribution for the reagents. As this method enumerates compounds with a systematic variation of user-defined pharmacophores to the attachment point on the scaffold, the enumerated compounds may serve as a good starting point for deriving a structure-activity relationship (SAR).
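The Shannon entropy that this abstract uses as its optimisation target is H = −Σ pᵢ log₂ pᵢ over the observed feature distribution. The sketch below shows only that calculation, not the ProSAR method itself; the function name `shannon_entropy` and the pharmacophore labels assigned to the toy reagent sets are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the distribution of `labels`."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical reagent sets, each reagent tagged with one pharmacophore label.
balanced = ["donor", "acceptor", "aromatic", "hydrophobe"]
redundant = ["donor", "donor", "donor", "donor"]
print(shannon_entropy(balanced), shannon_entropy(redundant))
```

A reagent set that spreads evenly over the available pharmacophore features has higher entropy than one that repeats a single feature, which is the sense in which maximising entropy favours broad pharmacophore coverage.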
Affiliation(s)
- Hongming Chen
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Mölndal, Sweden.
|
30
|
Cheng TJR, Wu YT, Yang ST, Lo KH, Chen SK, Chen YH, Huang WI, Yuan CH, Guo CW, Huang LY, Chen KT, Shih HW, Cheng YSE, Cheng WC, Wong CH. High-throughput identification of antibacterials against methicillin-resistant Staphylococcus aureus (MRSA) and the transglycosylase. Bioorg Med Chem 2010; 18:8512-29. [PMID: 21075637 DOI: 10.1016/j.bmc.2010.10.036] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 08/20/2010] [Revised: 10/11/2010] [Accepted: 10/14/2010] [Indexed: 12/01/2022]
Abstract
To identify new transglycosylase inhibitors with potent anti-methicillin-resistant Staphylococcus aureus (MRSA) activities, a high-throughput screen against Staphylococcus aureus was conducted to look for antibacterial cores in our 2M compound library, which consists of natural products, a proprietary collection, and synthetic molecules. About 3600 hits were identified in the primary screening, and the subsequent confirmation resulted in a total of 252 compounds in 84 clusters that showed anti-MRSA activities with MIC values as low as 0.1 μg/ml. Subsequent screening targeting bacterial transglycosylase identified a salicylanilide-based core that inhibited the lipid II polymerization and the moenomycin-binding activities of transglycosylase. Among the collected analogues, potent inhibitors with IC50 values below 10 μM against transglycosylase were identified. The non-carbohydrate scaffold reported in this study suggests a new direction for development of bacterial transglycosylase inhibitors.
Affiliation(s)
- Ting-Jen Rachel Cheng
- Genomics Research Center, Academia Sinica, 128 Sec 2 Academia Road, Nankang, Taipei 115, Taiwan
|
31
|
Fischer JD, Holliday GL, Rahman SA, Thornton JM. The structures and physicochemical properties of organic cofactors in biocatalysis. J Mol Biol 2010; 403:803-24. [PMID: 20850456 DOI: 10.1016/j.jmb.2010.09.018] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 06/08/2010] [Revised: 09/03/2010] [Accepted: 09/06/2010] [Indexed: 10/19/2022]
Abstract
Many crucial biochemical reactions in the cell require not only enzymes for catalysis but also organic cofactors or metal ions. Here, we analyse the physicochemical properties, chemical structures and functions of organic cofactors. Based on a thorough analysis of the literature complemented by our quantitative characterisation and classification, we found that most of these molecules are constructed from nucleotide and amino-acid-type building blocks, as well as some recurring cofactor-specific chemical scaffolds. We show that, as expected, organic cofactors are on average significantly more polar and slightly larger than other metabolites in the cell, yet they cover the full spectrum of physicochemical properties found in the metabolome. Furthermore, we have identified intrinsic groupings among the cofactors, based on their molecular properties, structures and functions, that represent a new way of considering cofactors. Although some classes of cofactors, as defined by their physicochemical properties, exhibit clear structural commonalities, cofactors with similar structures can have diverse functional and physicochemical profiles. Finally, we show that the molecular functions of the cofactors not only may duplicate reactions performed by inorganic metal cofactors and amino acids, the cell's other catalytic tools, but also provide novel chemistries for catalysis.
Affiliation(s)
- Julia D Fischer
- EMBL European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK.
|
32
|
Xi L, Li S, Liu H, Li J, Lei B, Yao X. Global and local prediction of protein folding rates based on sequence autocorrelation information. J Theor Biol 2010; 264:1159-68. [DOI: 10.1016/j.jtbi.2010.03.042] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 10/04/2009] [Revised: 03/28/2010] [Accepted: 03/29/2010] [Indexed: 11/24/2022]
|
33
|
Verma J, Malde A, Khedkar S, Iyer R, Coutinho E. Local Indices for Similarity Analysis (LISA)—A 3D-QSAR Formalism Based on Local Molecular Similarity. J Chem Inf Model 2009; 49:2695-707. [DOI: 10.1021/ci900224u] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Affiliation(s)
- Jitender Verma, Alpeshkumar Malde, Santosh Khedkar, Radhakrishnan Iyer, Evans Coutinho
- Department of Pharmaceutical Chemistry, Bombay College of Pharmacy, Kalina, Santacruz (E), Mumbai 400 098, India, and Spring Bank Pharmaceuticals, Inc., 113 Cedar Street, Milford, Massachusetts 01757
|
34
|
Abstract
BACKGROUND: Drug discovery is a complex and unpredictable endeavor with a high failure rate. Current trends in the pharmaceutical industry have exacerbated these challenges and are contributing to the dramatic decline in productivity observed over the last decade. The industrialization of science by forcing the drug discovery process to adhere to assembly-line protocols is imposing unnecessary restrictions, such as short project time-lines. Recent advances in nuclear magnetic resonance are responding to these self-imposed limitations and are providing opportunities to increase the success rate of drug discovery. OBJECTIVE/METHOD: A review of recent advancements in NMR technology that have the potential of significantly impacting and benefiting the drug discovery process will be presented. These include fast NMR data collection protocols and high-throughput protein structure determination, rapid protein-ligand co-structure determination, lead discovery using fragment-based NMR affinity screens, NMR metabolomics to monitor in vivo efficacy and toxicity for lead compounds, and the identification of new therapeutic targets through the functional annotation of proteins by FAST-NMR. CONCLUSION: NMR is a critical component of the drug discovery process, where the versatility of the technique enables it to continually expand and evolve its role. NMR is expected to maintain this growth over the next decade with advancements in automation, speed of structure calculation, in-cell imaging techniques, and the expansion of NMR amenable targets.
Affiliation(s)
- Robert Powers
- Department of Chemistry, University of Nebraska Lincoln, Lincoln, NE 68588
|
35
|
Rahman SA, Bashton M, Holliday GL, Schrader R, Thornton JM. Small Molecule Subgraph Detector (SMSD) toolkit. J Cheminform 2009; 1:12. [PMID: 20298518 PMCID: PMC2820491 DOI: 10.1186/1758-2946-1-12] [Citation(s) in RCA: 86] [Impact Index Per Article: 5.7] [Received: 05/11/2009] [Accepted: 08/10/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Finding one small molecule (query) in a large target library is a challenging task in computational chemistry. Although several heuristic approaches are available using fragment-based chemical similarity searches, they fail to identify exact atom-bond equivalence between the query and target molecules and thus cannot be applied to complex chemical similarity searches, such as searching a complete or partial metabolic pathway. In this paper we present a new Maximum Common Subgraph (MCS) tool, SMSD (Small Molecule Subgraph Detector), to overcome the issues with current heuristic approaches to small molecule similarity searches. The MCS search implemented in SMSD incorporates chemical knowledge (atom type match with bond sensitive and insensitive information) while searching molecular similarity. We also propose a novel method by which solutions obtained by each MCS run can be ranked using chemical filters such as stereochemistry, bond energy, etc. RESULTS In order to benchmark and test the tool, we performed 50,000 pairwise comparisons between KEGG ligands and PDB HET Group atoms. In both cases the SMSD was shown to be more efficient than the widely used MCS module implemented in the Chemistry Development Kit (CDK) in generating MCS solutions from our test cases. CONCLUSION Presently this tool can be applied to various areas of bioinformatics and chemo-informatics for finding exhaustive MCS matches. For example, it can be used to analyse metabolic networks by mapping the atoms between reactants and products involved in reactions. It can also be used to detect the MCS/substructure searches in small molecules reported by metabolome experiments, as well as in the screening of drug-like compounds with similar substructures. Thus, we present a robust tool that can be used for multiple applications, including the discovery of new drug molecules. This tool is freely available at http://www.ebi.ac.uk/thornton-srv/software/SMSD/
Affiliation(s)
- Syed Asad Rahman
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
36
|
Tanikawa T, Fridman M, Zhu W, Faulk B, Joseph IC, Kahne D, Wagner BK, Clemons PA. Using biological performance similarity to inform disaccharide library design. J Am Chem Soc 2009; 131:5075-83. [PMID: 19298063 DOI: 10.1021/ja806583y] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Abstract
Designing better small-molecule discovery libraries requires having methods to assess the consequences of different synthesis decisions on the biological performance of resulting library members. Since we are particularly interested in how stereochemistry affects performance in biological assays, we prepared a disaccharide library containing systematic stereochemical variations, assayed the library for different biological effects, and developed methods to assess the similarity of performance between members across multiple assays. These methods allow us to ask which subsets of stereochemical features best predict similarity in patterns of biological performance between individual members and which features produce the greatest variation of outcomes. We anticipate that the data-analysis approach presented here can be generalized to other sets of biological assays and other chemical descriptors. Methods to assess which structural features of library members produce the greatest similarity in performance for a given set of biological assays should help prioritize synthesis decisions in second-generation library development targeting the underlying cell-biological processes. Methods to assess which structural features of library members produce the greatest variation in performance should help guide decisions about what synthetic methods need to be developed to make optimal small-molecule screening collections.
Affiliation(s)
- Tetsuya Tanikawa
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, Massachusetts 02138, USA
|
37
|
Chen H, Börjesson U, Engkvist O, Kogej T, Svensson MA, Blomberg N, Weigelt D, Burrows JN, Lange T. ProSAR: A New Methodology for Combinatorial Library Design. J Chem Inf Model 2009; 49:603-14. [DOI: 10.1021/ci800231d] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/29/2022]
Affiliation(s)
- Hongming Chen, Ulf Börjesson, Ola Engkvist, Thierry Kogej, Mats A. Svensson, Niklas Blomberg, Dirk Weigelt, Jeremy N. Burrows, Tim Lange
- DECS GCS Computational Chemistry, AstraZeneca R&D Mölndal, Pepparedsleden 1, SE-43183 Mölndal, Sweden, and Medicinal Chemistry, AstraZeneca R&D Södertälje, SE-151 85 Södertälje, Sweden
|
38
|
Rupp M, Schneider P, Schneider G. Distance phenomena in high-dimensional chemical descriptor spaces: Consequences for similarity-based approaches. J Comput Chem 2009; 30:2285-96. [DOI: 10.1002/jcc.21218] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
|
39
|
Lipkus AH, Yuan Q, Lucas KA, Funk SA, Bartelt WF, Schenck RJ, Trippe AJ. Structural diversity of organic chemistry. A scaffold analysis of the CAS Registry. J Org Chem 2008; 73:4443-51. [PMID: 18505297 DOI: 10.1021/jo8001276] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.0] [Indexed: 01/31/2023]
Abstract
By analyzing the scaffold content of the CAS Registry, we attempt to characterize in a comprehensive way the structural diversity of organic chemistry. The scaffold of a molecule is taken to be its framework, defined as all its ring systems and all the linkers that connect them. Framework data from more than 24 million organic compounds is analyzed. The distribution of frameworks among compounds is found to be top-heavy, i.e., a small percentage of frameworks occur in a large percentage of compounds. When frameworks are analyzed at the graph level, an even more top-heavy distribution is found: half of the compounds can be described by only 143 framework shapes. The most significant finding is that the framework distribution conforms almost exactly to a power law. This suggests that the more often a framework has been used as the basis for a compound, the more likely it is to be used in another compound. This may be explained by the cost of synthesis: making a new derivative of a framework is probably less costly if many other derivatives are known. We believe this power law is evidence that the minimization of synthetic cost has been a key factor in shaping the known universe of organic chemistry.
Affiliation(s)
- Alan H Lipkus
- Chemical Abstracts Service, Columbus, OH 43210, USA.
|
40
|
Houghten RA, Pinilla C, Giulianotti MA, Appel JR, Dooley CT, Nefzi A, Ostresh JM, Yu Y, Maggiora GM, Medina-Franco JL, Brunner D, Schneider J. Strategies for the use of mixture-based synthetic combinatorial libraries: scaffold ranking, direct testing in vivo, and enhanced deconvolution by computational methods. J Comb Chem 2007; 10:3-19. [PMID: 18067268 DOI: 10.1021/cc7001205] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Indexed: 12/24/2022]
Affiliation(s)
- Richard A Houghten
- Torrey Pines Institute for Molecular Studies, 3550 General Atomics Court, San Diego, California 92121, USA.
|
41
|
Rosania GR, Crippen G, Woolf P, States D, Shedden K. A Cheminformatic Toolkit for Mining Biomedical Knowledge. Pharm Res 2007; 24:1791-802. [PMID: 17385012 DOI: 10.1007/s11095-007-9285-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Received: 01/11/2007] [Accepted: 02/27/2007] [Indexed: 01/31/2023]
Abstract
PURPOSE Cheminformatics can be broadly defined to encompass any activity related to the application of information technology to the study of properties, effects and uses of chemical agents. One of the most important current challenges in cheminformatics is to allow researchers to search databases of biomedical knowledge, using chemical structures as input. MATERIALS AND METHODS An important step towards this goal was the establishment of PubChem, an open, centralized database of small molecules accessible through the World Wide Web. While PubChem is primarily intended to serve as a repository for high throughput screening data from federally-funded screening centers and academic research laboratories, the major impact of PubChem could also reside in its ability to serve as a chemical gateway to biomedical databases such as PubMed. CONCLUSION This article will review cheminformatic tools that can be applied to facilitate annotation of PubChem through links to the scientific literature; to integrate PubChem with transcriptomic, proteomic, and metabolomic datasets; to incorporate results of numerical simulations of physiological systems into PubChem annotation; and ultimately, to translate data of chemical genomics screening efforts into information that will benefit biomedical researchers and physician scientists across all therapeutic areas.
Affiliation(s)
- Gus R Rosania
- Department of Pharmaceutical Sciences, University of Michigan College of Pharmacy, 428 Church Street, Ann Arbor, MI 48109, USA.
|
42
|
Zhang S, Golbraikh A, Oloff S, Kohn H, Tropsha A. A novel automated lazy learning QSAR (ALL-QSAR) approach: method development, applications, and virtual screening of chemical databases using validated ALL-QSAR models. J Chem Inf Model 2006; 46:1984-95. [PMID: 16995729 PMCID: PMC2536695 DOI: 10.1021/ci060132x] [Citation(s) in RCA: 169] [Impact Index Per Article: 9.4] [Indexed: 11/29/2022]
Abstract
A novel automated lazy learning quantitative structure-activity relationship (ALL-QSAR) modeling approach has been developed on the basis of the lazy learning theory. The activity of a test compound is predicted from a locally weighted linear regression model using chemical descriptors and the biological activity of the training set compounds most chemically similar to this test compound. The weights with which training set compounds are included in the regression depend on the similarity of those compounds to a test compound. We have applied the ALL-QSAR method to several experimental chemical data sets including 48 anticonvulsant agents with known ED50 values, 48 dopamine D1-receptor antagonists with known competitive binding affinities (Ki), and a Tetrahymena pyriformis data set containing 250 phenolic compounds with toxicity IGC50 values. When applied to database screening, models developed for anticonvulsant agents identified several known anticonvulsant compounds that were not only absent in the training set but highly chemically dissimilar to the training set compounds. This initial success indicates that ALL-QSAR can be further exploited as a general tool for accurate bioactivity prediction and database screening in drug design and discovery. Because of its local nature, the ALL-QSAR approach appears to be especially well-suited for the development of highly predictive models for the sparse or unevenly distributed data sets.
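The locally weighted regression idea summarized in this abstract can be illustrated with a minimal sketch. It is not the authors' ALL-QSAR implementation: the Euclidean distance, the Gaussian weighting kernel, the `k` and `bandwidth` parameters, and the function names `solve_linear` and `lazy_local_predict` are all assumptions chosen for illustration.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular local system; try more neighbours")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def lazy_local_predict(X, y, query, k=5, bandwidth=1.0):
    """Predict the activity of `query` from its k nearest training compounds.

    A weighted linear model (intercept plus one coefficient per descriptor)
    is fitted at query time; closer neighbours receive larger weights.
    The Gaussian kernel is an illustrative assumption.
    """
    dists = [math.dist(row, query) for row in X]
    nearest = sorted(range(len(X)), key=dists.__getitem__)[:k]
    p = len(query) + 1
    ata = [[0.0] * p for _ in range(p)]  # weighted normal equations, left side
    aty = [0.0] * p                      # weighted normal equations, right side
    for i in nearest:
        w = math.exp(-((dists[i] / bandwidth) ** 2))
        row = [1.0] + list(X[i])
        for r in range(p):
            aty[r] += w * row[r] * y[i]
            for c in range(p):
                ata[r][c] += w * row[r] * row[c]
    coef = solve_linear(ata, aty)
    return coef[0] + sum(c * q for c, q in zip(coef[1:], query))
```

Because the model is built per query from a small neighbourhood, no global fit is stored, which is the sense in which the approach is "lazy". The neighbourhood must contain more compounds than descriptors, otherwise the local system is singular.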
Affiliation(s)
- Alexander Tropsha
- Corresponding author, School of Pharmacy, Campus Box 7360, 327 Beard Hall, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7360. Telephone (919) 966-2955, FAX: (919) 966-0204
|
43
|
|
44
|
Maldonado AG, Doucet JP, Petitjean M, Fan BT. Molecular similarity and diversity in chemoinformatics: from theory to applications. Mol Divers 2006; 10:39-79. [PMID: 16404528 DOI: 10.1007/s11030-006-8697-1] [Citation(s) in RCA: 179] [Impact Index Per Article: 9.9] [Received: 12/17/2004] [Accepted: 06/14/2005] [Indexed: 01/04/2023]
Abstract
This review is dedicated to a survey on molecular similarity and diversity. Key findings reported in recent investigations are selectively highlighted and summarized. Even if this overview is mainly centered in chemoinformatics, applications in other areas (pharmaceutical and medical chemistry, combinatorial chemistry, chemical databases management, etc.) are also introduced. The approaches used to define and describe the concepts of molecular similarity and diversity in the context of chemoinformatics are discussed in the first part of this review. We introduce, in the second and third parts, the descriptions and analyses of different methods and techniques. Finally, current applications and problems are enumerated and discussed in the last part.
Affiliation(s)
- Ana G Maldonado
- ITODYS, Université Paris 7--Denis Diderot, CNRS UMR-7086, 1 rue Guy de la Brosse, 75005, Paris, France
|
45
|
Oloff S, Zhang S, Sukumar N, Breneman C, Tropsha A. Chemometric analysis of ligand receptor complementarity: identifying Complementary Ligands Based on Receptor Information (CoLiBRI). J Chem Inf Model 2006; 46:844-51. [PMID: 16563016 PMCID: PMC2755506 DOI: 10.1021/ci050065r] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 11/28/2022]
Abstract
We have developed a novel structure-based chemoinformatics approach to search for Complementary Ligands Based on Receptor Information (CoLiBRI). CoLiBRI is based on the representation of both receptor binding sites and their respective ligands in a space of universal chemical descriptors. The binding site atoms involved in the interaction with ligands are identified by means of a computational geometry technique known as Delaunay tessellation as applied to X-ray characterized ligand-receptor complexes. TAE/RECON multiple chemical descriptors are calculated independently for each ligand as well as for its active site atoms. The representation of both ligands and active sites using chemical descriptors allows the application of well-known chemometric techniques in order to correlate chemical similarities between active sites and their respective ligands. We have established a protocol to map patterns of nearest neighbor active site vectors in a multidimensional TAE/RECON space onto those of their complementary ligands and vice versa. This protocol affords the prediction of a virtual complementary ligand vector in the ligand chemical space from the position of a known active site vector. This prediction is followed by chemical similarity calculations between this virtual ligand vector and those calculated for molecules in a chemical database to identify real compounds most similar to the virtual ligand. Consequently, the knowledge of the receptor active site structure affords straightforward and efficient identification of its complementary ligands in large databases of chemical compounds using rapid chemical similarity searches. Conversely, starting from the ligand chemical structure, one may identify possible complementary receptor cavities as well. We have applied the CoLiBRI approach to a data set of 800 X-ray characterized ligand-receptor complexes in the PDBbind database. Using a k nearest neighbor (kNN) pattern recognition approach and variable selection, we have shown that knowledge of the active site structure affords identification of its complementary ligand among the top 1% of a large chemical database in over 90% of all test active sites when a binding site of the same protein family was present in the training set. In the case where test receptors are highly dissimilar and not present among the receptor families in the training set, the prediction accuracy is decreased; however, CoLiBRI was still able to quickly eliminate 75% of the chemical database as improbable ligands. CoLiBRI affords rapid prefiltering of a large chemical database to eliminate compounds that have little chance of binding to a receptor active site.
Affiliation(s)
- Scott Oloff
- Laboratory for Molecular Modeling, School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA
|
46
|
|
47
|
Fechner U, Paetz J, Schneider G. Comparison of Three Holographic Fingerprint Descriptors and their Binary Counterparts. QSAR Comb Sci 2005. [DOI: 10.1002/qsar.200530118] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
|
48
|
Mauser H, Roche O, Stahl M, Müller S. Prediction of UV and ESI−MS Signal Intensities. J Chem Inf Model 2005; 45:1039-46. [PMID: 16045299 DOI: 10.1021/ci0496548] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
All major pharmaceutical companies maintain large collections of compounds that are used either for screening against biological targets or as synthetic precursors. The quality assessment of these compounds is typically done by liquid chromatography combined with mass spectroscopy (LC/MS) and UV purity control. To facilitate the analysis of the analytical data, we have built computational models to predict UV and MS signal intensities under experimental LC/MS conditions. The discriminant partial-least-squares technique was used for classifying compounds into those most likely to yield an MS signal and others where the signal is below the detection limit (94% and 88% correct predictions, respectively). In the case of UV prediction, we compared this statistical linear-regression technique to a knowledge-based approach. A combination of both techniques proved to be the most reliable (96/98% correct predictions of UV-active/UV-inactive compounds). Both models have been incorporated into the automated compound integrity profiling at F. Hoffmann-La Roche.
Affiliation(s)
- Harald Mauser
- F. Hoffmann-La Roche Ltd., Pharmaceuticals Division, CH-4070 Basel, Switzerland.
|
49
|
Mercier KA, Powers R. Determining the optimal size of small molecule mixtures for high throughput NMR screening. J Biomol NMR 2005; 31:243-258. [PMID: 15803397 DOI: 10.1007/s10858-005-0948-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 12/06/2004] [Accepted: 01/06/2005] [Indexed: 05/24/2023]
Abstract
High-throughput screening (HTS) using NMR spectroscopy has become a common component of the drug discovery effort and is widely used throughout the pharmaceutical industry. NMR provides additional information about the nature of small molecule-protein interactions compared to traditional HTS methods. In order to achieve comparable efficiency, small molecules are often screened as mixtures in NMR-based assays. Nevertheless, an analysis of the efficiency of mixtures and a corresponding determination of the optimum mixture size (OMS) that minimizes the amount of material and instrumentation time required for an NMR screen has been lacking. A model for calculating OMS based on the application of the hypergeometric distribution function to determine the probability of a "hit" for various mixture sizes and hit rates is presented. An alternative method for the deconvolution of large screening mixtures is also discussed. These methods have been applied in a high-throughput NMR screening assay using a small, directed library.
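The hypergeometric calculation mentioned in this abstract can be illustrated with a short sketch. It shows only the probability that a mixture contains at least one hit when compounds are drawn without replacement; it is not the authors' full optimum-mixture-size model, and the function name `prob_mixture_has_hit` and its parameters are hypothetical.

```python
from math import comb


def prob_mixture_has_hit(library_size, n_actives, mixture_size):
    """Probability that a random mixture contains at least one active.

    Compounds are drawn without replacement, so the number of actives in a
    mixture follows a hypergeometric distribution. The chance of zero actives
    is C(N - K, m) / C(N, m), with N compounds, K actives and mixture size m.
    """
    if not 0 <= n_actives <= library_size:
        raise ValueError("n_actives must lie between 0 and library_size")
    if not 0 <= mixture_size <= library_size:
        raise ValueError("mixture_size must lie between 0 and library_size")
    p_none = comb(library_size - n_actives, mixture_size) / comb(
        library_size, mixture_size
    )
    return 1.0 - p_none


if __name__ == "__main__":
    # Illustrative numbers only: a 1000-compound library with a 1% hit rate.
    for size in (1, 5, 10, 20, 50):
        print(size, round(prob_mixture_has_hit(1000, 10, size), 3))
```

Larger mixtures are more likely to contain a hit and therefore more likely to need deconvolution, which is the trade-off that an optimum mixture size has to balance against the number of samples measured.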
Affiliation(s)
- Kelly A Mercier
- Department of Chemistry, University of Nebraska Lincoln, 722 Hamilton Hall, Lincoln, NE 68522-0304, USA
|
50
|
Jónsdóttir SO, Jørgensen FS, Brunak S. Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates. Bioinformatics 2005; 21:2145-60. [PMID: 15713739 DOI: 10.1093/bioinformatics/bti314] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022]
Abstract
MOTIVATION To gather information about available databases and chemoinformatics methods for prediction of properties relevant to the drug discovery and optimization process. RESULTS We present an overview of the most important databases with 2-dimensional and 3-dimensional structural information about drugs and drug candidates, and of databases with relevant properties. Access to experimental data and numerical methods for selecting and utilizing these data is crucial for developing accurate predictive in silico models. Many interesting predictive methods for classifying the suitability of chemical compounds as potential drugs, as well as for predicting their physico-chemical and ADMET properties have been proposed in recent years. These methods are discussed, and some possible future directions in this rapidly developing field are described.
Affiliation(s)
- Svava Osk Jónsdóttir
- Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
|