1
Brodney MD, Bakken G, Butler CR, Klug-McLeod J, Owen R, Sng ST. Integrated design environment: A multi-use platform for design idea capture, evaluation, and tracking in medicinal chemistry. J Comput Chem 2023; 44:788-800. [PMID: 36471909 DOI: 10.1002/jcc.27041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Revised: 09/07/2022] [Accepted: 10/24/2022] [Indexed: 12/12/2022]
Abstract
An integrated design environment (IDE) has been developed that allows the capture of design ideas, virtual compounds, and design hypotheses for medicinal chemistry projects. Specific consideration for rational molecular design, including design strategy and tactics, as well as comparator reference compounds have been incorporated to more easily convey the proposed design idea. A hierarchical tree architecture and customizable layouts allow for facile browsing across multiple programs and rapid examination of both ongoing and newly designed virtual compounds enabling centralized team discussions to ensure the most efficient prosecution of a queue of these target compounds. Additionally, a "whiteboard" module was incorporated for the rapid evaluation of virtual compounds against a suite of computational models enabling real-time design and triage. Finally, aggregation of cross-project design data enables broader analyses that can indicate portfolio-wide design challenges.
Affiliation(s)
- Marian D Brodney
  - Molecular Informatics, Pfizer Global, Cambridge, Massachusetts, USA
- Gregory Bakken
  - Research and Development, Cadence Design Systems Inc, San Jose, California, USA
- Christopher R Butler
  - Medicinal Chemistry, Vertex Pharmaceuticals Incorporated, Boston, Massachusetts, USA
- Shao-Tien Sng
  - Pfizer Global Research and Development, Groton, Connecticut, USA
2
Schaub J, Zander J, Zielesny A, Steinbeck C. Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK). J Cheminform 2022; 14:79. [PMID: 36357931 PMCID: PMC9650898 DOI: 10.1186/s13321-022-00656-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/24/2022] [Accepted: 10/30/2022] [Indexed: 11/12/2022]
Abstract
The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of molecular scaffolds, scaffold trees and networks. The new library is based on the Chemistry Development Kit (CDK) and highly customisable through multiple settings, e.g. five different structural framework definitions are available. For display of scaffold hierarchies, the open GraphStream Java library is utilised. Performance snapshots with natural products (NP) from the COCONUT (COlleCtion of Open Natural prodUcTs) database and drug molecules from DrugBank are reported. The generation of a scaffold network from more than 450,000 NP can be achieved within a single day.
Affiliation(s)
- Jonas Schaub
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany
- Julian Zander
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany
- Achim Zielesny
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany
3
Naveja JJ, Vogt M. Automatic Identification of Analogue Series from Large Compound Data Sets: Methods and Applications. Molecules 2021; 26:5291. [PMID: 34500724 PMCID: PMC8433811 DOI: 10.3390/molecules26175291] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/06/2021] [Revised: 08/27/2021] [Accepted: 08/28/2021] [Indexed: 01/21/2023]
Abstract
Analogue series play a key role in drug discovery. They arise naturally in lead optimization efforts where analogues are explored based on one or a few core structures. However, it is much harder to accurately identify and extract pairs or series of analogue molecules in large compound databases with no predefined core structures. This methodological review outlines the most common and recent methodological developments to automatically identify analogue series in large libraries. Initial approaches focused on using predefined rules to extract scaffold structures, such as the popular Bemis-Murcko scaffold. Later on, the matched molecular pair concept led to efficient algorithms to identify similar compounds sharing a common core structure by exploring many putative scaffolds for each compound. Further developments of these ideas yielded, on the one hand, approaches for hierarchical scaffold decomposition and, on the other hand, algorithms for the extraction of analogue series based on single-site modifications (so-called matched molecular series) by exploring potential scaffold structures based on systematic molecule fragmentation. Eventually, further development of these approaches resulted in methods for extracting analogue series defined by a single core structure with several substitution sites that allow convenient representations, such as R-group tables. These methods enable the efficient analysis of large data sets with hundreds of thousands or even millions of compounds and have spawned many related methodological developments.
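The core-indexing idea behind matched molecular series can be illustrated with a minimal sketch. It assumes the single-cut fragmentations have already been computed by a cheminformatics toolkit and are supplied as (core, substituent) string pairs; the function name `analogue_series`, the `min_size` parameter, and the toy data are hypothetical and are not taken from the review.

```python
from collections import defaultdict


def analogue_series(fragmentations, min_size=2):
    """Group compounds that share a core after a single-site cut.

    `fragmentations` maps a compound ID to a list of (core, substituent)
    pairs. Producing these pairs requires a real fragmentation step, which
    is assumed to have happened elsewhere.
    """
    index = defaultdict(dict)
    for compound_id, pairs in fragmentations.items():
        for core, substituent in pairs:
            # Every putative core becomes a key; compounds sharing it collect here.
            index[core][compound_id] = substituent
    # A core only defines a series if enough compounds share it.
    return {core: members for core, members in index.items()
            if len(members) >= min_size}


# Toy input: "[*]" marks the attachment point (illustrative strings only).
toy_fragmentations = {
    "toluene": [("[*]c1ccccc1", "[*]C")],
    "phenol": [("[*]c1ccccc1", "[*]O")],
    "aniline": [("[*]c1ccccc1", "[*]N")],
    "cyclohexanol": [("[*]C1CCCCC1", "[*]O")],
}
series = analogue_series(toy_fragmentations)
```

In this toy input the three phenyl compounds form one series, while the cyclohexyl core is dropped because only one compound carries it.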
Affiliation(s)
- José J. Naveja
  - Instituto de Química, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5-6, 53115 Bonn, Germany
4
Manelfi C, Gemei M, Talarico C, Cerchia C, Fava A, Lunghini F, Beccari AR. "Molecular Anatomy": a new multi-dimensional hierarchical scaffold analysis tool. J Cheminform 2021; 13:54. [PMID: 34301327 PMCID: PMC8299179 DOI: 10.1186/s13321-021-00526-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/30/2020] [Accepted: 06/13/2021] [Indexed: 11/10/2022]
Abstract
The scaffold representation is widely employed to classify bioactive compounds on the basis of common core structures or to correlate compound classes with specific biological activities. In this paper, we present a novel approach called "Molecular Anatomy" as a flexible and unbiased molecular scaffold-based metric to cluster large sets of compounds. We introduce a set of nine molecular representations at different abstraction levels, combined with fragmentation rules, to define a multi-dimensional network of hierarchically interconnected molecular frameworks. We demonstrate that the introduction of a flexible scaffold definition and multiple pruning rules is an effective method to identify relevant chemical moieties. This approach allows active molecules belonging to different molecular classes to be clustered together, capturing most of the structure-activity information, in particular when libraries containing a huge number of singletons are analyzed. We also propose a procedure to derive a network visualization that allows a full graphical representation of a compound dataset, permitting efficient navigation of scaffold space and contributing significantly to high-quality SAR analysis. The protocol is freely available as a web interface at https://ma.exscalate.eu.
Affiliation(s)
- Candida Manelfi
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Marica Gemei
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmine Talarico
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Carmen Cerchia
  - Department of Pharmacy, University of Naples "Federico II", 80131, Napoli, Italy
- Anna Fava
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
- Filippo Lunghini
  - Dompé Farmaceutici SpA, Via Campo di Pile, 67100, L'Aquila, Italy
5
Saldívar-González FI, Huerta-García CS, Medina-Franco JL. Chemoinformatics-based enumeration of chemical libraries: a tutorial. J Cheminform 2020; 12:64. [PMID: 33372622 PMCID: PMC7590480 DOI: 10.1186/s13321-020-00466-z] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/22/2020] [Accepted: 10/05/2020] [Indexed: 11/10/2022]
Abstract
Virtual compound libraries are increasingly being used in computer-assisted drug discovery applications and have led to numerous successful cases. This paper aims to examine the fundamental concepts of library design and describe how to enumerate virtual libraries using open source tools. To exemplify the enumeration of chemical libraries, we emphasize the use of pre-validated or reported reactions and accessible chemical reagents. This tutorial shows a step-by-step procedure for anyone interested in designing and building chemical libraries with or without chemoinformatics experience. The aim is to explore various methodologies proposed by synthetic organic chemists and explore affordable chemical space using open-access chemoinformatics tools. As part of the tutorial, we discuss three examples of design: a Diversity-Oriented-Synthesis library based on lactams, a bis-heterocyclic combinatorial library, and a set of target-oriented molecules: isoindolinone based compounds as potential acetylcholinesterase inhibitors. This manuscript also seeks to contribute to the critical task of teaching and learning chemoinformatics.
Affiliation(s)
- Fernanda I. Saldívar-González
  - DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- C. Sebastian Huerta-García
  - School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
- José L. Medina-Franco
  - DIFACQUIM Research Group, School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Avenida Universidad 3000, 04510 Mexico, Mexico
6
Kruger F, Stiefl N, Landrum GA. rdScaffoldNetwork: The Scaffold Network Implementation in RDKit. J Chem Inf Model 2020; 60:3331-3335. [PMID: 32584031 DOI: 10.1021/acs.jcim.0c00296] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 01/03/2023]
Abstract
We present an implementation of the scaffold network in the open source cheminformatics toolkit RDKit. Scaffold networks have been introduced in the literature as a powerful method to navigate and analyze large screening data sets in medicinal chemistry. Such a network can be created by iteratively applying predefined fragmentation rules to the investigated set of small molecules and by linking the produced fragments according to their descendence. This procedure results in a network graph, where the nodes correspond to the fragments and the edges correspond to the operations producing one fragment from another. In extension to the scaffold network implementations suggested in the literature, the presented implementation in RDKit allows an enhanced flexibility in terms of customizing the fragmentation rules and enables the inclusion of atom- and bond-generic scaffolds into the network. The output, providing node and edge information on the network, enables a simple and elegant navigation through the network, laying the basis to organize and better understand the data set being investigated.
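The iterative construction described in this abstract can be sketched without a chemistry toolkit. The example below is a toy stand-in and not the RDKit implementation: each "molecule" is assumed to be a set of ring labels, and the only fragmentation rule is "remove one ring". Real scaffold networks must also check that the remaining fragment stays connected, which this sketch ignores. The function name `build_scaffold_network` is hypothetical.

```python
from collections import deque


def build_scaffold_network(molecules):
    """Build a toy scaffold network from sets of ring labels.

    Returns (nodes, edges), where each node is a frozenset of ring labels
    and each edge (parent, child) records that the child was produced from
    the parent by removing exactly one ring.
    """
    nodes = set()
    edges = set()
    queue = deque(frozenset(m) for m in molecules)
    while queue:
        parent = queue.popleft()
        if parent in nodes:
            continue  # already fragmented; shared fragments are visited once
        nodes.add(parent)
        if len(parent) <= 1:
            continue  # a single ring cannot be fragmented further
        for ring in parent:
            child = parent - {ring}
            edges.add((parent, child))
            queue.append(child)
    return nodes, edges


# Two toy molecules that share the two-ring fragment {B, C}.
nodes, edges = build_scaffold_network([{"A", "B", "C"}, {"B", "C", "D"}])
```

Because fragments are stored in a set, the shared fragment {B, C} appears as a single node linked to both parent molecules, which is what makes the result a network and not a tree.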
Affiliation(s)
- Franziska Kruger
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
7
Kunkel C, Schober C, Oberhofer H, Reuter K. Knowledge discovery through chemical space networks: the case of organic electronics. J Mol Model 2019; 25:87. [DOI: 10.1007/s00894-019-3950-6] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 11/13/2018] [Accepted: 01/29/2019] [Indexed: 12/14/2022]
8
Shang J, Sun H, Liu H, Chen F, Tian S, Pan P, Li D, Kong D, Hou T. Comparative analyses of structural features and scaffold diversity for purchasable compound libraries. J Cheminform 2017; 9:25. [PMID: 29086044 PMCID: PMC5400773 DOI: 10.1186/s13321-017-0212-4] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 11/29/2016] [Accepted: 04/09/2017] [Indexed: 11/30/2022]
Abstract
Large purchasable screening libraries of small molecules afforded by commercial vendors are indispensable sources for virtual screening (VS). Selecting an optimal screening library for a specific VS campaign is quite important to improve the success rates and avoid wasting resources in later experimental phases. Analysis of the structural features and molecular diversity for different screening libraries can provide valuable information to the decision making process when selecting screening libraries for VS. In this study, the structural features and scaffold diversity of eleven purchasable screening libraries and the Traditional Chinese Medicine Compound Database (TCMCD) were analyzed and compared. Their scaffold diversity represented by the Murcko frameworks and Level 1 scaffolds was characterized by the scaffold counts and cumulative scaffold frequency plots, and visualized by Tree Maps and SAR Maps. The analysis demonstrates that, based on the standardized subsets with similar molecular weight distributions, Chembridge, ChemicalBlock, Mcule, TCMCD and VitasM are more structurally diverse than the others. Compared with all purchasable screening libraries, TCMCD indeed has the highest structural complexity but more conservative molecular scaffolds. Moreover, we found that some representative scaffolds were important components of drug candidates against different drug targets, such as kinases and guanosine-binding protein coupled receptors, and therefore the molecules containing pharmacologically important scaffolds found in screening libraries might be potential inhibitors against the relevant targets. This study may provide valuable perspective on which purchasable compound libraries are better for you to screen.
Graphical abstract: Selecting diverse compound libraries with scaffold analyses.
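A cumulative scaffold frequency curve of the kind mentioned in this abstract can be computed with a short sketch. It assumes each molecule has already been reduced to a scaffold identifier (for example a canonical scaffold string); the function names and the toy library are hypothetical, and the exact plotting conventions of the paper may differ.

```python
from collections import Counter


def cumulative_scaffold_frequency(scaffolds):
    """Return (fraction of scaffolds, fraction of molecules covered) points.

    Scaffolds are ranked from most to least populated, so a curve that
    rises steeply indicates a library dominated by a few scaffolds.
    """
    counts = Counter(scaffolds)
    total = sum(counts.values())
    running = 0
    curve = []
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        curve.append((rank / len(counts), running / total))
    return curve


def scaffolds_to_cover(scaffolds, fraction=0.5):
    """Number of top-ranked scaffolds needed to cover `fraction` of molecules."""
    counts = Counter(scaffolds)
    total = sum(counts.values())
    running = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running / total >= fraction:
            return rank
    return len(counts)


# Toy library: one scaffold label per molecule (10 molecules, 4 scaffolds).
library = ["A"] * 5 + ["B"] * 3 + ["C"] + ["D"]
curve = cumulative_scaffold_frequency(library)
```

A library in which few scaffolds are needed to cover half of the molecules is less scaffold-diverse than one of the same size that needs many.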
Affiliation(s)
- Jun Shang
  - State Key Laboratory of Agricultural Microbiology and Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Huiyong Sun
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Hui Liu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Fu Chen
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Sheng Tian
  - College of Pharmaceutical Sciences, Soochow University, Suzhou, 215021, Jiangsu, China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Dan Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Dexin Kong
  - State Key Laboratory of Agricultural Microbiology and Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Hubei, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
  - State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, Zhejiang, China
9
Velkoborsky J, Hoksza D. Scaffold analysis of PubChem database as background for hierarchical scaffold-based visualization. J Cheminform 2016; 8:74. [PMID: 28090217 PMCID: PMC5199768 DOI: 10.1186/s13321-016-0186-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 09/19/2016] [Accepted: 12/02/2016] [Indexed: 11/25/2022]
Abstract
Background: Visualization of large molecular datasets is a challenging yet important topic utilised in diverse fields of chemistry ranging from material engineering to drug design. Especially in drug design, modern methods of high-throughput screening generate large amounts of molecular data that call for methods enabling their analysis. One such method is classification of compounds based on their molecular scaffolds, a concept widely used by medicinal chemists to group molecules of similar properties. This classification can then be utilized for intuitive visualization of compounds. Results: In this paper, we propose a scaffold hierarchy as a result of large-scale analysis of the PubChem Compound database. The analysis not only provided insights into scaffold diversity of the PubChem Compound database, but also enables scaffold-based hierarchical visualization of user compound data sets on the background of empirical chemical space, as defined by the PubChem data, or on the background of any other user-defined data set. The visualization is performed by a web based client-server application called Scaffvis. It provides an interactive zoomable tree map visualization of data sets of up to hundreds of thousands of molecules. Scaffvis is free to use and its source code has been published under an open source license.
Affiliation(s)
- Jakub Velkoborsky
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
- David Hoksza
  - Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic
10
Mok NY, Brown N. Applications of Systematic Molecular Scaffold Enumeration to Enrich Structure-Activity Relationship Information. J Chem Inf Model 2016; 57:27-35. [PMID: 27990817 PMCID: PMC6152611 DOI: 10.1021/acs.jcim.6b00386] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 01/02/2023]
Abstract
Establishing structure–activity relationships (SARs) in hit identification during early stage drug discovery is important in accelerating hit confirmation and expansion. We describe the development of EnCore, a systematic molecular scaffold enumeration protocol using single atom mutations, to enhance the application of objective scaffold definitions and to enrich SAR information from analysis of high-throughput screening output. A list of 43 literature medicinal chemistry compound series, each containing a minimum of 100 compounds, published in the Journal of Medicinal Chemistry was collated to validate the protocol. Analysis using the top representative Level 1 scaffolds of this list of literature compound series demonstrated that EnCore could mimic the scaffold exploration conducted when establishing SAR. When EnCore was applied to analyze an HTS library containing over 200 000 compounds, we observed that over 70% of the molecular scaffolds matched extant scaffolds within the library after enumeration. In particular, over 60% of the singleton scaffolds with only one representative compound were found to have structurally related compounds after enumeration. These results illustrate the potential of EnCore to enrich SAR information. A case study using literature cyclooxygenase-2 inhibitors further demonstrates the advantage of EnCore application in establishing SAR from structurally related scaffolds. EnCore complements literature enumeration methods in enabling changes to the physicochemical properties of molecular scaffolds and structural modifications to aliphatic rings and linkers. The enumerated scaffold clusters generated would constitute a comprehensive collection of scaffolds for scaffold morphing and hopping.
Affiliation(s)
- N Yi Mok
  - Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
- Nathan Brown
  - Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K.
11
Weskamp N. Guided Iterative Substructure Search (GI-SSS) - A New Trick for an Old Dog. Mol Inform 2016; 35:286-92. [PMID: 27492243 DOI: 10.1002/minf.201600063] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/27/2016] [Accepted: 06/09/2016] [Indexed: 11/10/2022]
Abstract
Substructure search (SSS) is a fundamental technique supported by various chemical information systems. Many users apply it in an iterative manner: they modify their queries to shape the composition of the retrieved hit sets according to their needs. We propose and evaluate two heuristic extensions of SSS aimed at simplifying these iterative query modifications by collecting additional information during query processing and visualizing this information in an intuitive way. This gives the user convenient feedback on how certain changes to the query would affect the retrieved hit set and reduces the number of trial-and-error cycles needed to generate an optimal search result. The proposed heuristics are simple yet surprisingly effective, and can be easily added to existing SSS implementations.
Affiliation(s)
- Nils Weskamp
  - Boehringer Ingelheim Pharma GmbH & Co. KG, Discovery Research, Lead Identification and Optimization Support, Computational Chemistry, Birkendorfer Straße 65, 88397, Biberach an der Riss, Germany
12
Abstract
How to design a ligand to bind multiple targets, rather than a single target, is the focus of this review. Rational polypharmacology draws on knowledge that is both broad ranging and hierarchical. Computer-aided multitarget ligand design methods are described according to their nested knowledge level. Ligand-only and then receptor-ligand strategies are described first, followed by the metabolic network viewpoint. Subsequently, strategies that view infectious diseases as multigenomic targets are discussed, and finally the disease level interpretation of medicinal therapy is considered. As yet there is no consensus on how best to proceed in designing a multitarget ligand. The current methodologies are brought together in an attempt to give a practical overview of how polypharmacology design might best be initiated.
13
Grygorenko OO, Babenko P, Volochnyuk DM, Raievskyi O, Komarov IV. Following Ramachandran: exit vector plots (EVP) as a tool to navigate chemical space covered by 3D bifunctional scaffolds. The case of cycloalkanes. RSC Adv 2016. [DOI: 10.1039/c5ra19958a] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
An approach to the analysis and visualization of chemical space covered by disubstituted scaffolds, based on exit vector plots (EVP), is used for the analysis of cycloalkanes. Four clearly defined regions (α, β, γ and δ) are found in their EVP.
Affiliation(s)
- Pavlo Babenko
  - Taras Shevchenko National University of Kyiv, Kyiv 01601, Ukraine
- Dmitry M. Volochnyuk
  - Institute of Organic Chemistry, National Academy of Sciences of Ukraine, Kyiv 02094, Ukraine
- Oleksii Raievskyi
  - Institute of Molecular Biology and Genetics, National Academy of Sciences of Ukraine, Kyiv 03680, Ukraine
  - Life Chemicals, Life Chemicals Group
- Igor V. Komarov
  - Taras Shevchenko National University of Kyiv, Kyiv 01601, Ukraine
14
Segall M, Champness E, Leeding C, Chisholm J, Hunt P, Elliott A, Garcia-Martinez H, Foster N, Dowling S. Breaking free from chemical spreadsheets. Drug Discov Today 2015; 20:1093-103. [DOI: 10.1016/j.drudis.2015.03.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 01/08/2015] [Revised: 03/05/2015] [Accepted: 03/13/2015] [Indexed: 01/24/2023]
15
Osolodkin DI, Radchenko EV, Orlov AA, Voronkov AE, Palyulin VA, Zefirov NS. Progress in visual representations of chemical space. Expert Opin Drug Discov 2015; 10:959-73. [DOI: 10.1517/17460441.2015.1060216] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Indexed: 01/06/2023]
16
Klein K, Koch O, Kriege N, Mutzel P, Schäfer T. Visual Analysis of Biological Activity Data with Scaffold Hunter. Mol Inform 2013; 32:964-75. [DOI: 10.1002/minf.201300087] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 05/08/2013] [Accepted: 07/25/2013] [Indexed: 02/03/2023]
17
Systematic mining of analog series with related core structures in multi-target activity space. J Comput Aided Mol Des 2013; 27:665-74. [DOI: 10.1007/s10822-013-9671-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 05/18/2013] [Accepted: 08/05/2013] [Indexed: 10/26/2022]
18
Tian S, Li Y, Wang J, Xu X, Xu L, Wang X, Chen L, Hou T. Drug-likeness analysis of traditional Chinese medicines: 2. Characterization of scaffold architectures for drug-like compounds, non-drug-like compounds, and natural compounds from traditional Chinese medicines. J Cheminform 2013; 5:5. [PMID: 23336706 PMCID: PMC3561156 DOI: 10.1186/1758-2946-5-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 10/03/2012] [Accepted: 01/08/2013] [Indexed: 01/08/2023]
Abstract
Background: In order to better understand the structural features of natural compounds from traditional Chinese medicines, the scaffold architectures of drug-like compounds in MACCS-II Drug Data Report (MDDR), non-drug-like compounds in Available Chemical Directory (ACD), and natural compounds in Traditional Chinese Medicine Compound Database (TCMCD) were explored and compared. Results: First, the different scaffolds were extracted from ACD, MDDR and TCMCD by using three scaffold representations, including Murcko frameworks, Scaffold Tree, and ring systems with different complexity and side chains. Then, by examining the accumulative frequency of the scaffolds in each dataset, we observed that the Level 1 scaffolds of the Scaffold Tree offer advantages over the other scaffold architectures to represent the scaffold diversity of the compound libraries. By comparing the similarity of the scaffold architectures presented in MDDR, ACD and TCMCD, structural overlaps were observed not only between MDDR and TCMCD but also between MDDR and ACD. Finally, Tree Maps were used to cluster the Level 1 scaffolds of the Scaffold Tree and visualize the scaffold space of the three datasets. Conclusion: The analysis of the scaffold architectures of MDDR, ACD and TCMCD shows that, on average, drug-like molecules in MDDR have the highest diversity while natural compounds in TCMCD have the highest complexity. According to the Tree Maps, it can be observed that the Level 1 scaffolds present in MDDR have higher diversity than those presented in TCMCD and ACD. However, some representative scaffolds in MDDR with high frequency show structural similarities to those in TCMCD and ACD, suggesting that some scaffolds in TCMCD and ACD may be potentially drug-like fragments for fragment-based and de novo drug design.
Affiliation(s)
- Sheng Tian
  - Institute of Functional Nano & Soft Materials (FUNSOM) and Jiangsu Key Laboratory for Carbon-Based Functional Materials & Devices, Soochow University, Suzhou, Jiangsu, 215123, China
19
20
Abstract
Understanding structure-activity relationships (SARs) for a given set of molecules allows one to rationally explore chemical space and develop a chemical series optimizing multiple physicochemical and biological properties simultaneously, for instance, improving potency, reducing toxicity, and ensuring sufficient bioavailability. In silico methods allow rapid and efficient characterization of SARs and facilitate building a variety of models to capture and encode one or more SARs, which can then be used to predict activities for new molecules. By coupling these methods with in silico modifications of structures, one can easily prioritize large screening decks or even generate new compounds de novo and ascertain whether they belong to the SAR being studied. Computational methods can provide a guide for the experienced user by integrating and summarizing large amounts of preexisting data to suggest useful structural modifications. This chapter highlights the different types of SAR modeling methods and how they support the task of exploring chemical space to elucidate and optimize SARs in a drug discovery setting. In addition to considering modeling algorithms, I briefly discuss how to use databases as a source of SAR data to inform and enhance the exploration of SAR trends. I also review common modeling techniques that are used to encode SARs, recent work in the area of structure-activity landscapes, the role of SAR databases, and alternative approaches to exploring SAR data that do not involve explicit model development.
Affiliation(s)
- Rajarshi Guha
  - NIH Center for Advancing Translational Science, Rockville, MD, USA
21
Titarenko Z, Vasilevich N, Zernov V, Kirpichenok M, Genis D. Oxygen-containing fragments in natural products. J Comput Aided Mol Des 2012; 27:125-60. [PMID: 23271273 DOI: 10.1007/s10822-012-9629-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 09/25/2012] [Accepted: 12/17/2012] [Indexed: 01/08/2023]
Abstract
An analysis of the chemical environment of the oxygen atoms in the DNP database compared to the CMC and SCD databases was performed. Some structural clusters were identified which are predominant among the natural products and can be considered as distinctive features of NPs. Fifty-three oxygen-containing structural fragments that are distinctive for the DNP (distinctive set of fragments, DSF) in comparison with the SCD have been identified. A new descriptor, Mc, was introduced for describing the ratio of atoms involved in the DSF to the total number of heavy atoms. A significant difference in the Mc values among the reference databases allowed the use of a specific cluster of the DSF as a tool for performing similarity searches for oxygen-containing NP molecules, or for evaluation or comparison of databases according to their NP-likeness. An example is provided illustrating that the suggested approach could not only estimate NP-likeness but also serve as a tool for designing new NP-like compounds. The suggested approach for NP-likeness evaluation moves away from the traditional ideas of scaffolds, cycles, linkers and substituents.
Affiliation(s)
- Zoya Titarenko
- ASINEX, 20 Geroev Panfilovtsev Str., Moscow 125480, Russia
|
22
|
Rabal O, Oyarzabal J. Biologically Relevant Chemical Space Navigator: From Patent and Structure–Activity Relationship Analysis to Library Acquisition and Design. J Chem Inf Model 2012; 52:3123-37. [DOI: 10.1021/ci3004539] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/30/2022]
Affiliation(s)
- Obdulia Rabal
- Small Molecule Discovery Platform, Center for Applied Medical Research (CIMA), University of Navarra, Avda. Pio XII 55, E-31008 Pamplona, Spain
- Julen Oyarzabal
- Small Molecule Discovery Platform, Center for Applied Medical Research (CIMA), University of Navarra, Avda. Pio XII 55, E-31008 Pamplona, Spain
|
23
|
Vogt M, Bajorath J. Chemoinformatics: A view of the field and current trends in method development. Bioorg Med Chem 2012; 20:5317-23. [DOI: 10.1016/j.bmc.2012.03.030] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 01/23/2012] [Revised: 03/09/2012] [Accepted: 03/12/2012] [Indexed: 12/18/2022]
|
24
|
Wassermann AM, Haebel P, Weskamp N, Bajorath J. SAR matrices: automated extraction of information-rich SAR tables from large compound data sets. J Chem Inf Model 2012; 52:1769-76. [PMID: 22657271 DOI: 10.1021/ci300206e] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 01/19/2023]
Abstract
We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.
Affiliation(s)
- Anne Mai Wassermann
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
25
|
Baede EJ, den Bekker E, Boiten JW, Cronin D, van Gammeren R, de Vlieg J. Integrated Project Views: Decision Support Platform for Drug Discovery Project Teams. J Chem Inf Model 2012; 52:1438-49. [DOI: 10.1021/ci200253g] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Affiliation(s)
- Eric J. Baede
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
- Ernest den Bekker
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
- Jan-Willem Boiten
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
- Deborah Cronin
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
- Rob van Gammeren
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
- Jacob de Vlieg
- Discovery Informatics, Molecular Design and Informatics Department, MSD, Molenstraat 110, 5342 CC Oss, The Netherlands
|
26
|
Gupta-Ostermann D, Hu Y, Bajorath J. Introducing the LASSO Graph for Compound Data Set Representation and Structure–Activity Relationship Analysis. J Med Chem 2012; 55:5546-53. [DOI: 10.1021/jm3004762] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Affiliation(s)
- Disha Gupta-Ostermann
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Ye Hu
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
27
|
Hastings J, Magka D, Batchelor C, Duan L, Stevens R, Ennis M, Steinbeck C. Structure-based classification and ontology in chemistry. J Cheminform 2012; 4:8. [PMID: 22480202 PMCID: PMC3361486 DOI: 10.1186/1758-2946-4-8] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Received: 01/16/2012] [Accepted: 04/05/2012] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: Recent years have seen an explosion in the availability of data in the chemistry domain. With this information explosion, however, retrieving relevant results from the available information, and organising those results, become even harder problems. Computational processing is essential to filter and organise the available resources so as to better facilitate the work of scientists. Ontologies encode expert domain knowledge in a hierarchically organised machine-processable format. One such ontology for the chemical domain is ChEBI. ChEBI provides a classification of chemicals based on their structural features and a role or activity-based classification. An example of a structure-based class is 'pentacyclic compound' (compounds containing five-ring structures), while an example of a role-based class is 'analgesic', since many different chemicals can act as analgesics without sharing structural features. Structure-based classification in chemistry exploits elegant regularities and symmetries in the underlying chemical domain. As yet, there has been neither a systematic analysis of the types of structural classification in use in chemistry nor a comparison to the capabilities of available technologies.
RESULTS: We analyze the different categories of structural classes in chemistry, presenting a list of patterns for features found in class definitions. We compare these patterns of class definition to tools which allow for automation of hierarchy construction within cheminformatics and within logic-based ontology technology, going into detail in the latter case with respect to the expressive capabilities of the Web Ontology Language and recent extensions for modelling structured objects. Finally we discuss the relationships and interactions between cheminformatics approaches and logic-based approaches.
CONCLUSION: Systems that perform intelligent reasoning tasks on chemistry data require a diverse set of underlying computational utilities including algorithmic, statistical and logic-based tools. For the task of automatic structure-based classification of chemical entities, essential to managing the vast swathes of chemical data being brought online, systems which are capable of hybrid reasoning combining several different approaches are crucial. We provide a thorough review of the available tools and methodologies, and identify areas of open research.
Affiliation(s)
- Janna Hastings
- Cheminformatics and Metabolism, European Bioinformatics Institute, Hinxton, UK
- Swiss Center for Affective Sciences, University of Geneva, Geneva, Switzerland
- Despoina Magka
- Department of Computer Science, University of Oxford, Oxford, UK
- Lian Duan
- Cheminformatics and Metabolism, European Bioinformatics Institute, Hinxton, UK
- ETH, Zürich, Switzerland
- Robert Stevens
- School of Computer Science, University of Manchester, Manchester, UK
- Marcus Ennis
- Cheminformatics and Metabolism, European Bioinformatics Institute, Hinxton, UK
- Christoph Steinbeck
- Cheminformatics and Metabolism, European Bioinformatics Institute, Hinxton, UK
|
28
|
Sheng C, Zhang W. Fragment Informatics and Computational Fragment-Based Drug Design: An Overview and Update. Med Res Rev 2012; 33:554-98. [DOI: 10.1002/med.21255] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Indexed: 12/18/2022]
Affiliation(s)
- Chunquan Sheng
- Department of Medicinal Chemistry; School of Pharmacy; Second Military Medical University; 325 Guohe Road Shanghai 200433 People's Republic of China
- Wannian Zhang
- Department of Medicinal Chemistry; School of Pharmacy; Second Military Medical University; 325 Guohe Road Shanghai 200433 People's Republic of China
|
29
|
Peptide Scaffolds: Flexible Molecular Structures With Diverse Therapeutic Potentials. Int J Pept Res Ther 2012. [DOI: 10.1007/s10989-011-9286-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 10/14/2022]
|
30
|
Lounkine E, Nigsch F, Jenkins JL, Glick M. Activity-Aware Clustering of High Throughput Screening Data and Elucidation of Orthogonal Structure–Activity Relationships. J Chem Inf Model 2011; 51:3158-68. [DOI: 10.1021/ci2004994] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
Affiliation(s)
- Eugen Lounkine
- Novartis Institutes for Biomedical Research, 250 Massachusetts Ave., Cambridge, Massachusetts 02139, United States
- Florian Nigsch
- Novartis Institutes for Biomedical Research, Novartis Campus, Forum 1, CH-4056 Basel, Switzerland
- Jeremy L. Jenkins
- Novartis Institutes for Biomedical Research, 250 Massachusetts Ave., Cambridge, Massachusetts 02139, United States
- Meir Glick
- Novartis Institutes for Biomedical Research, 250 Massachusetts Ave., Cambridge, Massachusetts 02139, United States
|
31
|
Hack MD, Rassokhin DN, Buyck C, Seierstad M, Skalkin A, ten Holte P, Jones TK, Mirzadegan T, Agrafiotis DK. Library Enhancement through the Wisdom of Crowds. J Chem Inf Model 2011; 51:3275-86. [DOI: 10.1021/ci200446y] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
Affiliation(s)
- Michael D. Hack
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Row, San Diego, California 92121, United States
- Dmitrii N. Rassokhin
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Christophe Buyck
- Janssen Research & Development, Division of Janssen Pharmaceutica NV, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Mark Seierstad
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Row, San Diego, California 92121, United States
- Andrew Skalkin
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Peter ten Holte
- Janssen Research & Development, Division of Janssen Pharmaceutica NV, Turnhoutseweg 30, B-2340 Beerse, Belgium
- Todd K. Jones
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Row, San Diego, California 92121, United States
- Todd Jones Consulting, San Diego, California
- Taraneh Mirzadegan
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Row, San Diego, California 92121, United States
- Dimitris K. Agrafiotis
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
|
32
|
Agrafiotis DK, Lobanov VS, Shemanarev M, Rassokhin DN, Izrailev S, Jaeger EP, Alex S, Farnum M. Efficient Substructure Searching of Large Chemical Libraries: The ABCD Chemical Cartridge. J Chem Inf Model 2011; 51:3113-30. [DOI: 10.1021/ci200413e] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Affiliation(s)
- Dimitris K. Agrafiotis
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Victor S. Lobanov
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Maxim Shemanarev
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Dmitrii N. Rassokhin
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Sergei Izrailev
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Edward P. Jaeger
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Simson Alex
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Michael Farnum
- Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
|
33
|
Langdon SR, Brown N, Blagg J. Scaffold diversity of exemplified medicinal chemistry space. J Chem Inf Model 2011; 51:2174-85. [PMID: 21877753 PMCID: PMC3180201 DOI: 10.1021/ci2001428] [Citation(s) in RCA: 102] [Impact Index Per Article: 7.8] [Received: 03/25/2011] [Indexed: 11/28/2022]
Abstract
The scaffold diversity of 7 representative commercial and proprietary compound libraries is explored for the first time using both Murcko frameworks and Scaffold Trees. We show that Level 1 of the Scaffold Tree is useful for the characterization of scaffold diversity in compound libraries and offers advantages over the use of Murcko frameworks. This analysis also demonstrates that the majority of compounds in the libraries we analyzed contain only a small number of well represented scaffolds and that a high percentage of singleton scaffolds represent the remaining compounds. We use Tree Maps to clearly visualize the scaffold space of representative compound libraries, for example, to display highly populated scaffolds and clusters of structurally similar scaffolds. This study further highlights the need for diversification of compound libraries used in hit discovery by focusing library enrichment on the synthesis of compounds with novel or underrepresented scaffolds.
Affiliation(s)
- Sarah R. Langdon
- Cancer Research UK Cancer Therapeutics Unit, The Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey SM2 5NG, U.K.
- Nathan Brown
- Cancer Research UK Cancer Therapeutics Unit, The Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey SM2 5NG, U.K.
- Julian Blagg
- Cancer Research UK Cancer Therapeutics Unit, The Institute of Cancer Research, 15 Cotswold Road, Sutton, Surrey SM2 5NG, U.K.
|
34
|
Schuffenhauer A, Varin T. Rule-Based Classification of Chemical Structures by Scaffold. Mol Inform 2011; 30:646-64. [PMID: 27467257 DOI: 10.1002/minf.201100078] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 05/10/2011] [Accepted: 07/14/2011] [Indexed: 01/25/2023]
Abstract
Databases for small organic chemical molecules usually contain millions of structures. The screening decks of pharmaceutical companies contain more than a million structures. Nevertheless, chemical substructure searching in these databases can be performed interactively in seconds. Because of this, nobody has really missed structural classification of these databases for the purpose of finding data for individual chemical substructures. However, a full-deck high-throughput screen also produces activity data for more than a million substances. How can this amount of data be analyzed? Which are the active scaffolds identified by an assay? To answer such questions, systematic classifications of molecules by scaffolds are needed. This review describes how molecules can be hierarchically classified by their scaffolds and explains how such classifications can be used to identify active scaffolds in an HTS data set. Once active classes are identified, they need to be visualized in the context of related scaffolds in order to understand SAR. Consequently, such visualizations are another topic of this review. In addition, scaffold-based diversity measures are discussed, and an outlook is given on the potential impact of structural classifications on a chemically aware semantic web.
Affiliation(s)
- Ansgar Schuffenhauer
- Novartis Institutes for BioMedical Research, CPC/LFP, WSJ-88.11.11, Postfach, Basel, Switzerland, CH-4002; phone: +41 61 32 45385
- Thibault Varin
- Novartis Institutes for BioMedical Research, CPC/LFP, WSJ-88.11.11, Postfach, Basel, Switzerland, CH-4002; phone: +41 61 32 45385
|
35
|
Varin T, Schuffenhauer A, Ertl P, Renner S. Mining for bioactive scaffolds with scaffold networks: improved compound set enrichment from primary screening data. J Chem Inf Model 2011; 51:1528-38. [PMID: 21615076 DOI: 10.1021/ci2000924] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Indexed: 11/29/2022]
Abstract
Identification of meaningful chemical patterns in the increasing amounts of high-throughput-generated bioactivity data available today is an increasingly important challenge for successful drug discovery. Herein, we present the scaffold network as a novel approach for mapping and navigation of chemical and biological space. A scaffold network represents the chemical space of a library of molecules consisting of all molecular scaffolds and smaller "parent" scaffolds generated therefrom by the pruning of rings, effectively leading to a network of common scaffold substructure relationships. This algorithm provides an extension of the scaffold tree algorithm that, instead of a network, generates a tree relationship between a heuristically rule-based selected subset of parent scaffolds. The approach was evaluated for the identification of statistically significantly active scaffolds from primary screening data for which the scaffold tree approach has already been shown to be successful. Because of the exhaustive enumeration of smaller scaffolds and the full enumeration of relationships between them, about twice as many statistically significantly active scaffolds were identified compared to the scaffold-tree-based approach. We suggest visualizing scaffold networks as islands of active scaffolds.
Affiliation(s)
- Thibault Varin
- Novartis Institutes for BioMedical Research, Forum 1, Novartis Campus, CH-4056 Basel, Switzerland
|
36
|
Agrafiotis DK, Wiener JJM, Skalkin A, Kolpak J. Single R-Group Polymorphisms (SRPs) and R-Cliffs: An Intuitive Framework for Analyzing and Visualizing Activity Cliffs in a Single Analog Series. J Chem Inf Model 2011; 51:1122-31. [DOI: 10.1021/ci200054u] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Affiliation(s)
- Dimitris K. Agrafiotis
- Informatics, Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- John J. M. Wiener
- Medicinal Chemistry, Johnson & Johnson Pharmaceutical Research & Development, L.L.C., 3210 Merryfield Road, San Diego, California 92121, United States
- Andrew Skalkin
- Informatics, Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
- Jeremy Kolpak
- Informatics, Johnson & Johnson Pharmaceutical Research & Development, L.L.C., Welsh & McKean Roads, Spring House, Pennsylvania 19477, United States
|
37
|
Tsantili-Kakoulidou A, Agrafiotis DK. The 18th European Symposium on Quantitative Structure–Activity Relationships. Expert Opin Drug Discov 2011; 6:453-6. [DOI: 10.1517/17460441.2011.560604] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/05/2022]
|
38
|
Glick M, Jacoby E. The role of computational methods in the identification of bioactive compounds. Curr Opin Chem Biol 2011; 15:540-6. [PMID: 21411361 DOI: 10.1016/j.cbpa.2011.02.021] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 01/06/2011] [Revised: 02/01/2011] [Accepted: 02/21/2011] [Indexed: 10/18/2022]
Abstract
Computational methods play an ever increasing role in lead finding. A vast repertoire of molecular design and virtual screening methods emerged in the past two decades and are today routinely used. There is increasing awareness that there is no single best computational protocol and correspondingly there is a shift recommending the combination of complementary methods. A promising trend for the application of computational methods in lead finding is to take advantage of the vast amounts of HTS (High Throughput Screening) data to allow lead assessment by detailed systems-based data analysis, especially for phenotypic screens where the identification of compound-target pairs is the primary goal. Herein, we review trends and provide examples of successful applications of computational methods in lead finding.
Affiliation(s)
- Meir Glick
- Novartis Institutes for BioMedical Research, Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
|