1
Weidlich IE, Filippov IV. Using the gini coefficient to measure the chemical diversity of small-molecule libraries. J Comput Chem 2016; 37:2091-7. [PMID: 27353971 DOI: 10.1002/jcc.24423] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 02/15/2016] [Revised: 05/10/2016] [Accepted: 05/17/2016] [Indexed: 11/12/2022]
Abstract
Modern databases of small organic molecules contain tens of millions of structures. The size of theoretically available chemistry is even larger. However, despite the large amount of chemical information, the "big data" moment for chemistry has not yet provided the corresponding payoff of cheaper computer-predicted medicine or robust machine-learning models for the determination of efficacy and toxicity. Here, we present a study of the diversity of chemical datasets using a measure that is commonly used in socioeconomic studies. We demonstrate the use of this diversity measure on several datasets that were constructed to contain various congeneric subsets of molecules as well as randomly selected molecules. We also apply our method to a number of well-known databases that are frequently used for structure-activity relationship modeling. Our results show the poor diversity of the common sources of potential lead compounds compared to actual known drugs. © 2016 Wiley Periodicals, Inc.
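The socioeconomic measure named in the title of this entry, the Gini coefficient, can be illustrated with a minimal sketch. The abstract does not say which per-library quantity the authors feed into the coefficient, so the input used here (counts of molecules per structural class) is purely a hypothetical choice, and the function name `gini` is an illustrative assumption rather than the authors' code.

```python
def gini(values):
    """Gini coefficient of a list of non-negative numbers.

    0.0 means the total is spread evenly over all entries;
    values approaching 1.0 mean a few entries hold almost everything.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form: G = 2*sum(i*x_i)/(n*sum(x)) - (n+1)/n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


# Hypothetical input: number of molecules in each structural class of a library.
even_library = [5, 5, 5, 5]       # every class equally populated
skewed_library = [0, 0, 0, 20]    # one class dominates

print(gini(even_library))
print(gini(skewed_library))
```

Under this assumed input, a library dominated by one congeneric series scores close to 1, while an evenly populated library scores 0.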
Affiliation(s)
- Iwona E Weidlich
- Computational Drug Design Systems (CODDES) LLC, Gaithersburg, Maryland, 20877
2
Pipkorn R, Braun K, Wiessler M, Waldeck W, Schrenk HH, Koch M, Semmler W, Komljenovic D. A peptide & peptide nucleic acid synthesis technology for transporter molecules and theranostics--the SPPS. Int J Med Sci 2014; 11:697-706. [PMID: 24843319 PMCID: PMC4025169 DOI: 10.7150/ijms.8168] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/19/2013] [Accepted: 03/25/2014] [Indexed: 11/20/2022]
Abstract
Advances in imaging diagnostics using magnetic resonance tomography (MRT), positron emission tomography (PET) and fluorescence imaging including near infrared (NIR) imaging methods are facilitated by constant improvement of the concepts of peptide synthesis. Feasible patient-specific theranostic platforms in the personalized medicine are particularly dependent on efficient and clinically applicable peptide constructs. The role of peptides in the interrelations between the structure and function of proteins is widely investigated, especially by using computer-assisted methods. Nowadays the solid phase synthesis (SPPS) chemistry emerges as a key technology and is considered as a promising methodology to design peptides for the investigation of molecular pharmacological processes at the transcriptional level. SPPS syntheses could be carried out in core facilities producing peptides for large-scale scientific implementations as presented here.
Affiliation(s)
- Ruediger Pipkorn
- 1. German Cancer Research Center, Dept. of Translational Immunology, INF 410, D-69120 Heidelberg, Germany
- Klaus Braun
- 2. German Cancer Research Center, Dept. of Medical Physics in Radiology, INF 280, D-69120 Heidelberg, Germany
- Manfred Wiessler
- 2. German Cancer Research Center, Dept. of Medical Physics in Radiology, INF 280, D-69120 Heidelberg, Germany
- Waldemar Waldeck
- 3. German Cancer Research Center, Division of Biophysics of Macromolecules, INF 580, D-69120 Heidelberg, Germany
- Hans-Hermann Schrenk
- 2. German Cancer Research Center, Dept. of Medical Physics in Radiology, INF 280, D-69120 Heidelberg, Germany
- Mario Koch
- 1. German Cancer Research Center, Dept. of Translational Immunology, INF 410, D-69120 Heidelberg, Germany
- Wolfhard Semmler
- 2. German Cancer Research Center, Dept. of Medical Physics in Radiology, INF 280, D-69120 Heidelberg, Germany
- Dorde Komljenovic
- 2. German Cancer Research Center, Dept. of Medical Physics in Radiology, INF 280, D-69120 Heidelberg, Germany
4
Schnur DM, Beno BR, Tebben AJ, Cavallaro C. Methods for combinatorial and parallel library design. Methods Mol Biol 2011; 672:387-434. [PMID: 20838978 DOI: 10.1007/978-1-60761-839-3_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
Diversity has historically played a critical role in design of combinatorial libraries, screening sets and corporate collections for lead discovery. Large library design dominated the field in the 1990s with methods ranging anywhere from purely arbitrary through property based reagent selection to product based approaches. In recent years, however, there has been a downward trend in library size. This was due to increased information about the desirable targets gleaned from the genomics revolution and to the ever growing availability of target protein structures from crystallography and homology modeling. Creation of libraries directed toward families of receptors such as GPCRs, kinases, nuclear hormone receptors, proteases, etc., replaced the generation of libraries based primarily on diversity while single target focused library design has remained an important objective. Concurrently, computing grids and cpu clusters have facilitated the development of structure based tools that screen hundreds of thousands of molecules. Smaller "smarter" combinatorial and focused parallel libraries replaced those early un-focused large libraries in the twenty-first century drug design paradigm. While diversity still plays a role in lead discovery, the focus of current library design methods has shifted to receptor based methods, scaffold hopping/bio-isostere searching, and a much needed emphasis on synthetic feasibility. Methods such as "privileged substructures based design" and pharmacophore based design still are important methods for parallel and small combinatorial library design. This chapter discusses some of the possible design methods and presents examples where they are available.
Affiliation(s)
- Dora M Schnur
- Computer Aided Drug Design, Pharmaceutical Research Institute, Bristol-Myers Squibb Company, Princeton, NJ, USA
6
Akella LB, DeCaprio D. Cheminformatics approaches to analyze diversity in compound screening libraries. Curr Opin Chem Biol 2010; 14:325-30. [PMID: 20457001 DOI: 10.1016/j.cbpa.2010.03.017] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.3] [Received: 12/21/2009] [Revised: 02/16/2010] [Accepted: 03/21/2010] [Indexed: 10/19/2022]
Abstract
As high-throughput screening matures as a discipline, cheminformatics is playing an increasingly important role in selecting new compounds for diverse screening libraries. New visualization techniques such as multi-fusion similarity maps, scaffold trees, and principal moments of inertia plots provide complementary information on compound libraries and enable identification of unexplored regions of chemical space with potential biological relevance. Quantitative metrics have been developed to analyze libraries for properties such as natural product-likeness and shape complexity. Analysis of high-throughput screening results and drug discovery programs identify compounds problematic for screening. Taken together these approaches allow us to increase the diversity of biological outcomes available in compound screening libraries and improve the success rates of high-throughput screening against new targets without making significant increases in the size of compound libraries.
Affiliation(s)
- Lakshmi B Akella
- Broad Institute of MIT and Harvard, 7 Cambridge Ctr., Cambridge, MA 02142, USA
7
Singh N, Guha R, Giulianotti MA, Pinilla C, Houghten RA, Medina-Franco JL. Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository. J Chem Inf Model 2009; 49:1010-24. [PMID: 19301827 DOI: 10.1021/ci800426u] [Citation(s) in RCA: 114] [Impact Index Per Article: 7.6] [Indexed: 11/30/2022]
Abstract
A multiple criteria approach is presented, that is used to perform a comparative analysis of four recently developed combinatorial libraries to drugs, Molecular Libraries Small Molecule Repository (MLSMR) and natural products. The compound databases were assessed in terms of physicochemical properties, scaffolds, and fingerprints. The approach enables the analysis of property space coverage, degree of overlap between collections, scaffold and structural diversity, and overall structural novelty. The degree of overlap between combinatorial libraries and drugs was assessed using the R-NN curve methodology, which measures the density of chemical space around a query molecule embedded in the chemical space of a target collection. The combinatorial libraries studied in this work exhibit scaffolds that were not observed in the drug, MLSMR, and natural products databases. The fingerprint-based comparisons indicate that these combinatorial libraries are structurally different than current drugs. The R-NN curve methodology revealed that a proportion of molecules in the combinatorial libraries is located within the property space of the drugs. However, the R-NN analysis also showed that there are a significant number of molecules in several combinatorial libraries that are located in sparse regions of the drug space.
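The abstract describes the R-NN curve as a measure of the density of chemical space around a query molecule embedded in a target collection. A minimal sketch of that idea follows, assuming molecules are represented as numeric property vectors and that density is counted with Euclidean distance. Both choices, and the name `rnn_curve`, are illustrative assumptions and not taken from the paper.

```python
import math


def rnn_curve(query, targets, radii):
    """For each radius R, count the target molecules lying within
    distance R of the query point in property space."""
    # Euclidean distance in descriptor space is an assumption of this sketch.
    distances = [math.dist(query, t) for t in targets]
    return [sum(d <= r for d in distances) for r in radii]


# Hypothetical two-descriptor property vectors.
query_molecule = (0.0, 0.0)
drug_collection = [(1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
radii = [1.0, 2.0, 5.0]

print(rnn_curve(query_molecule, drug_collection, radii))
```

In this reading, a curve that rises quickly at small radii places the query in a dense region of the target collection, and a curve that stays flat until large radii places it in a sparse region.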
Affiliation(s)
- Narender Singh
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987, USA
8
Dolle RE, Bourdonnec BL, Goodman AJ, Morales GA, Thomas CJ, Zhang W. Comprehensive Survey of Chemical Libraries for Drug Discovery and Chemical Biology: 2007. J Comb Chem 2008; 10:753-802. [PMID: 18991466 DOI: 10.1021/cc800119z] [Citation(s) in RCA: 92] [Impact Index Per Article: 5.8] [Indexed: 12/13/2022]
Affiliation(s)
- Roland E. Dolle, Bertrand Le Bourdonnec, Allan J. Goodman, Guillermo A. Morales, Craig J. Thomas, Wei Zhang
- Adolor Corporation, 700 Pennsylvania Drive, Exton, Pennsylvania 19341, Semafore Pharmaceuticals Inc., 8496 Georgetown Road, Indianapolis, Indiana 46268, NIH Chemical Genomics Center, National Human Genome Research Institute, National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, and Department of Chemistry, University of Massachusetts, 100 Morrissey Boulevard, Boston, Massachusetts 02125
10
Gillet VJ. New directions in library design and analysis. Curr Opin Chem Biol 2008; 12:372-8. [PMID: 18331851 DOI: 10.1016/j.cbpa.2008.02.015] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 02/06/2008] [Accepted: 02/06/2008] [Indexed: 10/22/2022]
Abstract
The high costs associated with high-throughput screening (HTS) coupled with the limited coverage and bias of current screening collections is such that diversity analysis continues to be an important criterion in lead generation. Whereas early approaches to diversity analysis were based on traditional descriptors such as two-dimensional fingerprints a recent emphasis has been on assessing scaffold coverage to ensure that a variety of different chemotypes are represented. Moreover, whether designing diverse or focused libraries, it is widely recognised that designs should aim to achieve a balance in a number of different properties and multiobjective optimisation provides an effective way of achieving such designs.
Affiliation(s)
- Valerie J Gillet
- Department of Information Studies, University of Sheffield, Sheffield, UK.