1
|
Manna S, Das K, Santra S, Nosova EV, Zyryanov GV, Halder S. Structural and Synthetic Aspects of Small Ring Oxa- and Aza-Heterocyclic Ring Systems as Antiviral Activities. Viruses 2023; 15:1826. [PMID: 37766233 PMCID: PMC10536032 DOI: 10.3390/v15091826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Revised: 08/17/2023] [Accepted: 08/21/2023] [Indexed: 09/29/2023] Open
Abstract
Antiviral properties of different oxa- and aza-heterocycles are identified and properly correlated with their structural features and discussed in this review article. The primary objective is to explore the activity of such ring systems as antiviral agents, as well as their synthetic routes and biological significance. Eventually, the structure-activity relationship (SAR) of the heterocyclic compounds, along with their salient characteristics are exhibited to build a suitable platform for medicinal chemists and biotechnologists. The synergistic conclusions are extremely important for the introduction of a newer tool for the future drug discovery program.
Collapse
Affiliation(s)
- Sibasish Manna
- Department of Chemistry, Visvesvaraya National Institute of Technology, Nagpur 440010, India
| | - Koushik Das
- Department of Chemistry, Visvesvaraya National Institute of Technology, Nagpur 440010, India
| | - Sougata Santra
- Department of Organic and Biomolecular Chemistry, Chemical Engineering Institute, Ural Federal University, 19 Mira Street, 620002 Yekaterinburg, Russia; (S.S.); (E.V.N.); (G.V.Z.)
| | - Emily V. Nosova
- Department of Organic and Biomolecular Chemistry, Chemical Engineering Institute, Ural Federal University, 19 Mira Street, 620002 Yekaterinburg, Russia; (S.S.); (E.V.N.); (G.V.Z.)
- I. Ya. Postovskiy Institute of Organic Synthesis, Ural Division of the Russian Academy of Sciences, 22 S. Kovalevskoy Street, 620219 Yekaterinburg, Russia
| | - Grigory V. Zyryanov
- Department of Organic and Biomolecular Chemistry, Chemical Engineering Institute, Ural Federal University, 19 Mira Street, 620002 Yekaterinburg, Russia; (S.S.); (E.V.N.); (G.V.Z.)
- I. Ya. Postovskiy Institute of Organic Synthesis, Ural Division of the Russian Academy of Sciences, 22 S. Kovalevskoy Street, 620219 Yekaterinburg, Russia
| | - Sandipan Halder
- Department of Chemistry, Visvesvaraya National Institute of Technology, Nagpur 440010, India
| |
Collapse
|
2
|
Abstract
DNA-encoded libraries (DELs) are widely used in the discovery of drug candidates, and understanding their design principles is critical for accessing better libraries. Most DELs are combinatorial in nature and are synthesized by assembling sets of building blocks in specific topologies. In this study, different aspects of library topology were explored and their effect on DEL properties and chemical diversity was analyzed. We introduce a descriptor for DEL topological assignment (DELTA) and use it to examine the landscape of possible DEL topologies and their coverage in the literature. A generative topographic mapping analysis revealed that the impact of library topology on chemical space coverage is secondary to building block selection. Furthermore, it became apparent that the descriptor used to analyze chemical space dictates how structures cluster, with the effects of topology being apparent when using three-dimensional descriptors but not with common two-dimensional descriptors. This outcome points to potential challenges of attempts to predict DEL productivity based on chemical space analyses alone. While topology is rather inconsequential for defining the chemical space of encoded compounds, it greatly affects possible interactions with target proteins as illustrated in docking studies using NAD/NADP binding proteins as model receptors.
Collapse
Affiliation(s)
- William K Weigel
- Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
| | - Alba L Montoya
- Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
| | - Raphael M Franzini
- Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
- Huntsman Cancer Institute, University of Utah, 2000 Circle of Hope Dr., Salt Lake City, Utah 84112, United States
| |
Collapse
|
3
|
Pikalyova R, Zabolotna Y, Horvath D, Marcou G, Varnek A. Chemical Library Space: Definition and DNA-Encoded Library Comparison Study Case. J Chem Inf Model 2023. [PMID: 37368824 DOI: 10.1021/acs.jcim.3c00520] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/29/2023]
Abstract
The development of DNA-encoded library (DEL) technology introduced new challenges for the analysis of chemical libraries. It is often useful to consider a chemical library as a stand-alone chemoinformatic object─represented both as a collection of independent molecules, and yet an individual entity─in particular, when they are inseparable mixtures, like DELs. Herein, we introduce the concept of chemical library space (CLS), in which resident items are individual chemical libraries. We define and compare four vectorial library representations obtained using generative topographic mapping. These allow for an effective comparison of libraries, with the ability to tune and chemically interpret the similarity relationships. In particular, property-tuned CLS encodings enable to simultaneously compare libraries with respect to both property and chemotype distributions. We apply the various CLS encodings for the selection problem of DELs that optimally "match" a reference collection (here ChEMBL28), showing how the choice of the CLS descriptors may help to fine-tune the "matching" (overlap) criteria. Hence, the proposed CLS may represent a new efficient way for polyvalent analysis of thousands of chemical libraries. Selection of an easily accessible compound collection for drug discovery, as a substitute for a difficult to produce reference library, can be tuned for either primary or target-focused screening, also considering property distributions of compounds. Alternatively, selection of libraries covering novel regions of the chemical space with respect to a reference compound subspace may serve for library portfolio enrichment.
Collapse
Affiliation(s)
- Regina Pikalyova
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Yuliana Zabolotna
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Dragos Horvath
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, University of Strasbourg, 4, rue B. Pascal, Strasbourg 67081, France
| |
Collapse
|
4
|
Thakur M, Bateman A, Brooksbank C, Freeberg M, Harrison M, Hartley M, Keane T, Kleywegt G, Leach A, Levchenko M, Morgan S, McDonagh E, Orchard S, Papatheodorou I, Velankar S, Vizcaino J, Witham R, Zdrazil B, McEntyre J. EMBL's European Bioinformatics Institute (EMBL-EBI) in 2022. Nucleic Acids Res 2023; 51:D9-D17. [PMID: 36477213 PMCID: PMC9825486 DOI: 10.1093/nar/gkac1098] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/13/2022] [Revised: 10/21/2022] [Accepted: 10/31/2022] [Indexed: 12/13/2022] Open
Abstract
The European Molecular Biology Laboratory's European Bioinformatics Institute (EMBL-EBI) is one of the world's leading sources of public biomolecular data. Based at the Wellcome Genome Campus in Hinxton, UK, EMBL-EBI is one of six sites of the European Molecular Biology Laboratory (EMBL), Europe's only intergovernmental life sciences organisation. This overview summarises the status of services that EMBL-EBI data resources provide to scientific communities globally. The scale, openness, rich metadata and extensive curation of EMBL-EBI added-value databases makes them particularly well-suited as training sets for deep learning, machine learning and artificial intelligence applications, a selection of which are described here. The data resources at EMBL-EBI can catalyse such developments because they offer sustainable, high-quality data, collected in some cases over decades and made openly availability to any researcher, globally. Our aim is for EMBL-EBI data resources to keep providing the foundations for tools and research insights that transform fields across the life sciences.
Collapse
Affiliation(s)
| | - Alex Bateman
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Cath Brooksbank
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Mallory Freeberg
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Melissa Harrison
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Matthew Hartley
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Thomas Keane
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Gerard Kleywegt
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Andrew Leach
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Mariia Levchenko
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Sarah Morgan
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Ellen M McDonagh
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
- OpenTargets, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Sandra Orchard
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Irene Papatheodorou
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Sameer Velankar
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Juan Antonio Vizcaino
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Rick Witham
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | - Barbara Zdrazil
- Data Services Teams, EMBL’s European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton CB10 1SD, UK
| | | |
Collapse
|
5
|
Pikalyova R, Zabolotna Y, Volochnyuk D, Horvath D, Gilles M, Varnek A. Exploration of the chemical space of DNA-encoded libraries. Mol Inform 2022; 41:e2100289. [PMID: 34981643 DOI: 10.1002/minf.202100289] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2021] [Accepted: 01/03/2022] [Indexed: 11/09/2022]
Abstract
DNA-Encoded Library (DEL) technology has emerged as an alternative method for bioactive molecules discovery in medicinal chemistry. It enables the simple synthesis and screening of compound libraries of enormous size. Even though it gains more and more popularity each day, there are almost no reports of chemoinformatics analysis of DEL chemical space. Therefore, in this project, we aimed to generate and analyze the ultra-large chemical space of DEL. Around 2500 DELs were designed using commercially available BBs resulting in 2,5B DEL compounds that were compared to biologically relevant compounds from ChEMBL using Generative Topographic Mapping. This allowed to choose several optimal DELs covering the chemical space of ChEMBL to the highest extent and thus containing the maximum possible percentage of biologically relevant chemotypes. Different combinations of DELs were also analyzed to identify a set of mutually complementary libraries allowing to attain even higher coverage of ChEMBL than it is possible with one single DEL.
Collapse
|
6
|
Serafim MSM, Dos Santos Júnior VS, Gertrudes JC, Maltarollo VG, Honorio KM. Machine learning techniques applied to the drug design and discovery of new antivirals: a brief look over the past decade. Expert Opin Drug Discov 2021; 16:961-975. [PMID: 33957833 DOI: 10.1080/17460441.2021.1918098] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Introduction: Drug design and discovery of new antivirals will always be extremely important in medicinal chemistry, taking into account known and new viral diseases that are yet to come. Although machine learning (ML) have shown to improve predictions on the biological potential of chemicals and accelerate the discovery of drugs over the past decade, new methods and their combinations have improved their performance and established promising perspectives regarding ML in the search for new antivirals.Areas covered: The authors consider some interesting areas that deal with different ML techniques applied to antivirals. Recent innovative studies on ML and antivirals were selected and analyzed in detail. Also, the authors provide a brief look at the past to the present to detect advances and bottlenecks in the area.Expert opinion: From classical ML techniques, it was possible to boost the searches for antivirals. However, from the emergence of new algorithms and the improvement in old approaches, promising results will be achieved every day, as we have observed in the case of SARS-CoV-2. Recent experience has shown that it is possible to use ML to discover new antiviral candidates from virtual screening and drug repurposing.
Collapse
Affiliation(s)
- Mateus Sá Magalhães Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | | | - Jadson Castro Gertrudes
- Departamento de Computação, Instituto de Ciências Exatas e Biológicas, Universidade Federal de Ouro Preto (UFOP), Ouro Preto, Brazil
| | - Vinícius Gonçalves Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Kathia Maria Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil.,Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
| |
Collapse
|
7
|
Shrivastava AD, Kell DB. FragNet, a Contrastive Learning-Based Transformer Model for Clustering, Interpreting, Visualizing, and Navigating Chemical Space. Molecules 2021; 26:2065. [PMID: 33916824 PMCID: PMC8038408 DOI: 10.3390/molecules26072065] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/01/2021] [Revised: 03/29/2021] [Accepted: 04/01/2021] [Indexed: 12/12/2022] Open
Abstract
The question of molecular similarity is core in cheminformatics and is usually assessed via a pairwise comparison based on vectors of properties or molecular fingerprints. We recently exploited variational autoencoders to embed 6M molecules in a chemical space, such that their (Euclidean) distance within the latent space so formed could be assessed within the framework of the entire molecular set. However, the standard objective function used did not seek to manipulate the latent space so as to cluster the molecules based on any perceived similarity. Using a set of some 160,000 molecules of biological relevance, we here bring together three modern elements of deep learning to create a novel and disentangled latent space, viz transformers, contrastive learning, and an embedded autoencoder. The effective dimensionality of the latent space was varied such that clear separation of individual types of molecules could be observed within individual dimensions of the latent space. The capacity of the network was such that many dimensions were not populated at all. As before, we assessed the utility of the representation by comparing clozapine with its near neighbors, and we also did the same for various antibiotics related to flucloxacillin. Transformers, especially when as here coupled with contrastive learning, effectively provide one-shot learning and lead to a successful and disentangled representation of molecular latent spaces that at once uses the entire training set in their construction while allowing "similar" molecules to cluster together in an effective and interpretable way.
Collapse
Affiliation(s)
- Aditya Divyakant Shrivastava
- Department of Computer Science and Engineering, Nirma University, Ahmedabad 382481, India;
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St., Liverpool L69 7ZB, UK
| | - Douglas B. Kell
- Department of Biochemistry and Systems Biology, Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Crown St., Liverpool L69 7ZB, UK
- Novo Nordisk Foundation Centre for Biosustainability, Technical University of Denmark, Building 220, Kemitorvet, 2800 Kgs Lyngby, Denmark
- Mellizyme Ltd., Liverpool Science Park, IC1, 131 Mount Pleasant, Liverpool L3 5TF, UK
| |
Collapse
|
8
|
Bort W, Baskin II, Gimadiev T, Mukanov A, Nugmanov R, Sidorov P, Marcou G, Horvath D, Klimchuk O, Madzhidov T, Varnek A. Discovery of novel chemical reactions by deep generative recurrent neural network. Sci Rep 2021; 11:3178. [PMID: 33542271 PMCID: PMC7862614 DOI: 10.1038/s41598-021-81889-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2020] [Accepted: 01/06/2021] [Indexed: 12/18/2022] Open
Abstract
The "creativity" of Artificial Intelligence (AI) in terms of generating de novo molecular structures opened a novel paradigm in compound design, weaknesses (stability & feasibility issues of such structures) notwithstanding. Here we show that "creative" AI may be as successfully taught to enumerate novel chemical reactions that are stoichiometrically coherent. Furthermore, when coupled to reaction space cartography, de novo reaction design may be focused on the desired reaction class. A sequence-to-sequence autoencoder with bidirectional Long Short-Term Memory layers was trained on on-purpose developed "SMILES/CGR" strings, encoding reactions of the USPTO database. The autoencoder latent space was visualized on a generative topographic map. Novel latent space points were sampled around a map area populated by Suzuki reactions and decoded to corresponding reactions. These can be critically analyzed by the expert, cleaned of irrelevant functional groups and eventually experimentally attempted, herewith enlarging the synthetic purpose of popular synthetic pathways.
Collapse
Affiliation(s)
- William Bort
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
| | - Igor I Baskin
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
- Department of Materials Science and Engineering, Technion - Israel Institute of Technology, 3200003, Haifa, Israel
| | - Timur Gimadiev
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan
| | - Artem Mukanov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
| | - Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
| | - Pavel Sidorov
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan
| | - Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
| | - Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
| | - Olga Klimchuk
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
| | - Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France.
- Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan.
| |
Collapse
|
9
|
Horvath D, Orlov A, Osolodkin DI, Ishmukhametov AA, Marcou G, Varnek A. A Chemographic Audit of anti-Coronavirus Structure-activity Information from Public Databases (ChEMBL). Mol Inform 2020; 39:e2000080. [PMID: 32363750 PMCID: PMC7267182 DOI: 10.1002/minf.202000080] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/22/2020] [Accepted: 04/26/2020] [Indexed: 01/30/2023]
Abstract
Discovery of drugs against newly emerged pathogenic agents like the SARS-CoV-2 coronavirus (CoV) must be based on previous research against related species. Scientists need to get acquainted with and develop a global oversight over so-far tested molecules. Chemography (herein used Generative Topographic Mapping, in particular) places structures on a human-readable 2D map (obtained by dimensionality reduction of the chemical space of molecular descriptors) and is thus well suited for such an audit. The goal is to map medicinal chemistry efforts so far targeted against CoVs. This includes comparing libraries tested against various virus species/genera, predicting their polypharmacological profiles and highlighting often encountered chemotypes. Maps are challenged to provide predictive activity landscapes against viral proteins. Definition of "anti-CoV" map zones led to selection of therein residing 380 potential anti-CoV agents, out of a vast pool of 800 M organic compounds.
Collapse
Affiliation(s)
- Dragos Horvath
- Chemoinformatics LaboratoryUMR 7140 CNRS/University of Strasbourg4, rue Blaise Pascal67000Strasbourg
| | - Alexey Orlov
- Chemoinformatics LaboratoryUMR 7140 CNRS/University of Strasbourg4, rue Blaise Pascal67000Strasbourg
- FSBSI “Chumakov FSC R&D IBP RAS”Poselok Instituta Poliomielita 8 bd. 1Poselenie MoskovskyMoscow108819Russia
| | - Dmitry I. Osolodkin
- FSBSI “Chumakov FSC R&D IBP RAS”Poselok Instituta Poliomielita 8 bd. 1Poselenie MoskovskyMoscow108819Russia
- Institute of Translational Medicine and BiotechnologySechenov First Moscow State Medical UniversityTrubetskaya ul. 8Moscow119991Russia
| | - Aydar A. Ishmukhametov
- FSBSI “Chumakov FSC R&D IBP RAS”Poselok Instituta Poliomielita 8 bd. 1Poselenie MoskovskyMoscow108819Russia
- Institute of Translational Medicine and BiotechnologySechenov First Moscow State Medical UniversityTrubetskaya ul. 8Moscow119991Russia
| | - Gilles Marcou
- Chemoinformatics LaboratoryUMR 7140 CNRS/University of Strasbourg4, rue Blaise Pascal67000Strasbourg
| | - Alexandre Varnek
- Chemoinformatics LaboratoryUMR 7140 CNRS/University of Strasbourg4, rue Blaise Pascal67000Strasbourg
| |
Collapse
|
10
|
Maggiora G, Medina-Franco JL, Iqbal J, Vogt M, Bajorath J. From Qualitative to Quantitative Analysis of Activity and Property Landscapes. J Chem Inf Model 2020; 60:5873-5880. [PMID: 33205984 DOI: 10.1021/acs.jcim.0c01249] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Activity or, more generally, property landscapes (PLs) have been considered as an attractive way to visualize and explore structure-property relationships (SPRs) contained in large data sets of chemical compounds. For graphical analysis, three-dimensional representations reminiscent of natural landscapes are particularly intuitive. So far, the use of such landscape models has essentially been confined to qualitative assessment. We describe recent efforts to analyze PLs in a more quantitative manner, which make it possible to calculate topographical similarity values for comparison of landscape models as a measure of relative SPR information content.
Collapse
Affiliation(s)
- Gerald Maggiora
- University of Arizona BIO5 Institute, 1657 East Helen Street, Tucson, Arizona 85721-0240, United States
| | - José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
| | - Javed Iqbal
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
| | - Martin Vogt
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
| |
Collapse
|
11
|
Singh N, Decroly E, Khatib AM, Villoutreix BO. Structure-based drug repositioning over the human TMPRSS2 protease domain: search for chemical probes able to repress SARS-CoV-2 Spike protein cleavages. Eur J Pharm Sci 2020; 153:105495. [PMID: 32730844 PMCID: PMC7384984 DOI: 10.1016/j.ejps.2020.105495] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2020] [Revised: 07/16/2020] [Accepted: 07/27/2020] [Indexed: 12/28/2022]
Abstract
In December 2019, a new coronavirus was identified in the Hubei province of central china and named SARS-CoV-2. This new virus induces COVID-19, a severe respiratory disease with high death rate. A putative target to interfere with the virus is the host transmembrane serine protease family member II (TMPRSS2). This enzyme is critical for the entry of coronaviruses into human cells by cleaving and activating the spike protein (S) of SARS-CoV-2. Repositioning approved, investigational and experimental drugs on the serine protease domain of TMPRSS2 could thus be valuable. There is no experimental structure for TMPRSS2 but it is possible to develop quality structural models for the serine protease domain using comparative modeling strategies as such domains are highly structurally conserved. Beside the TMPRSS2 catalytic site, we predicted on our structural models a main exosite that could be important for the binding of protein partners and/or substrates. To block the catalytic site or the exosite of TMPRSS2 we used structure-based virtual screening computations and two different collections of approved, investigational and experimental drugs. We propose a list of 156 molecules that could bind to the catalytic site and 100 compounds that may interact with the exosite. These small molecules should now be tested in vitro to gain novel insights over the roles of TMPRSS2 or as starting point for the development of second generation analogs.
Collapse
Affiliation(s)
- Natesh Singh
- Univ. Lille, INSERM, Institut Pasteur de Lille, U1177, F-59000 Lille, France
| | | | - Abdel-Majid Khatib
- Univ. Bordeaux, Allée Geoffroy St Hilaire, 33615 Pessac, France
- INSERM, LAMC, UMR 1029, Allée Geoffroy St Hilaire, 33615 Pessac, France
- Corresponding authors.
| | - Bruno O. Villoutreix
- Univ. Lille, INSERM, Institut Pasteur de Lille, U1177, F-59000 Lille, France
- Corresponding authors.
| |
Collapse
|
12
|
Horvath D, Marcou G, Varnek A. Generative topographic mapping in drug design. DRUG DISCOVERY TODAY. TECHNOLOGIES 2019; 32-33:99-107. [PMID: 33386101 DOI: 10.1016/j.ddtec.2020.06.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/07/2020] [Revised: 06/10/2020] [Accepted: 06/18/2020] [Indexed: 06/12/2023]
Abstract
This is a review article of Generative Topographic Mapping (GTM) - a non-linear dimensionality reduction technique producing generative 2D maps of high-dimensional vector spaces - and its specific applications in Drug Design (chemical space cartography, compound library design and analysis, virtual screening, pharmacological profiling, de novo drug design, conformational space & docking interaction cartography, etc.) Written by chemoinformaticians for potential users among medicinal chemists and biologists, the article purposely avoids all underlying mathematics. First, the GTM concept is intuitively explained, based on the strong analogies with the rather popular Self-Organizing Maps (SOMs), which are well established library analysis tools. GTM is basically a fuzzy-logics-based generalization of SOMs. The second part of the review, some of published GTM applications in drug design are briefly revisited.
Collapse
Affiliation(s)
- Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France.
| | - Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France.
| |
Collapse
|
13
|
Lin A, Beck B, Horvath D, Marcou G, Varnek A. Diversifying chemical libraries with generative topographic mapping. J Comput Aided Mol Des 2019; 34:805-815. [DOI: 10.1007/s10822-019-00215-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2019] [Accepted: 07/15/2019] [Indexed: 01/28/2023]
|
14
|
In silico identification of endogenous and exogenous agonists of Estrogen-related receptor α. ACTA ACUST UNITED AC 2019. [DOI: 10.1016/j.comtox.2019.01.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023]
|
15
|
Sattarov B, Baskin II, Horvath D, Marcou G, Bjerrum EJ, Varnek A. De Novo Molecular Design by Combining Deep Autoencoder Recurrent Neural Networks with Generative Topographic Mapping. J Chem Inf Model 2019; 59:1182-1196. [PMID: 30785751 DOI: 10.1021/acs.jcim.8b00751] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Here we show that Generative Topographic Mapping (GTM) can be used to explore the latent space of the SMILES-based autoencoders and generate focused molecular libraries of interest. We have built a sequence-to-sequence neural network with Bidirectional Long Short-Term Memory layers and trained it on the SMILES strings from ChEMBL23. Very high reconstruction rates of the test set molecules were achieved (>98%), which are comparable to the ones reported in related publications. Using GTM, we have visualized the autoencoder latent space on the two-dimensional topographic map. Targeted map zones can be used for generating novel molecular structures by sampling associated latent space points and decoding them to SMILES. The sampling method based on a genetic algorithm was introduced to optimize compound properties "on the fly". The generated focused molecular libraries were shown to contain original and a priori feasible compounds which, pending actual synthesis and testing, showed encouraging behavior in independent structure-based affinity estimation procedures (pharmacophore matching, docking).
Collapse
Affiliation(s)
- Boris Sattarov
- Laboratory of Chemoinformatics , UMR 7177 University of Strasbourg/CNRS , 4 rue B. Pascal , 67000 Strasbourg , France
| | - Igor I Baskin
- Faculty of Physics , M.V. Lomonosov Moscow State University , Leninskie Gory , Moscow 19991 , Russia
| | - Dragos Horvath
- Laboratory of Chemoinformatics , UMR 7177 University of Strasbourg/CNRS , 4 rue B. Pascal , 67000 Strasbourg , France
| | - Gilles Marcou
- Laboratory of Chemoinformatics , UMR 7177 University of Strasbourg/CNRS , 4 rue B. Pascal , 67000 Strasbourg , France
| | - Esben Jannik Bjerrum
- Wildcard Pharmaceutical Consulting, Zeaborg Science Center, Frødings Allé 41 , 2860 Søborg , Denmark
| | - Alexandre Varnek
- Laboratory of Chemoinformatics , UMR 7177 University of Strasbourg/CNRS , 4 rue B. Pascal , 67000 Strasbourg , France
| |
Collapse
|
16
|
Orlov AA, Khvatov EV, Koruchekov AA, Nikitina AA, Zolotareva AD, Eletskaya AA, Kozlovskaya LI, Palyulin VA, Horvath D, Osolodkin DI, Varnek A. Getting to Know the Neighbours with GTM: The Case of Antiviral Compounds. Mol Inform 2019; 38:e1800166. [DOI: 10.1002/minf.201800166] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2018] [Accepted: 02/02/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Alexey A. Orlov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | | | - Alexander A. Koruchekov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia A. Nikitina
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia D. Zolotareva
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Anastasia A. Eletskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Liubov I. Kozlovskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | | | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of ChemistryUniversity of Strasbourg Strasbourg 67081 France
| | - Dmitry I. Osolodkin
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of ChemistryUniversity of Strasbourg Strasbourg 67081 France
| |
Collapse
|
17
|
Nikitina AA, Orlov AA, Kozlovskaya LI, Palyulin VA, Osolodkin DI. Enhanced taxonomy annotation of antiviral activity data from ChEMBL. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2019; 2019:5308407. [PMID: 30753475 PMCID: PMC6367519 DOI: 10.1093/database/bay139] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/21/2018] [Accepted: 12/09/2018] [Indexed: 11/14/2022]
Abstract
The discovery of antiviral drugs is a rapidly developing area of medicinal chemistry research. The emergence of resistant variants and outbreaks of poorly studied viral diseases make this area constantly developing. The amount of antiviral activity data available in ChEMBL consistently grows, but virus taxonomy annotation of these data is not sufficient for thorough studies of antiviral chemical space. We developed a procedure for semi-automatic extraction of antiviral activity data from ChEMBL and mapped them to the virus taxonomy developed by the International Committee for Taxonomy of Viruses (ICTV). The procedure is based on the lists of virus-related values of ChEMBL annotation fields and a dictionary of virus names and acronyms mapped to ICTV taxa. Application of this data extraction procedure allows retrieving from ChEMBL 1.6 times more assays linked to 2.5 times more compounds and data points than ChEMBL web interface allows. Mapping of these data to ICTV taxa allows analyzing all the compounds tested against each viral species. Activity values and structures of the compounds were standardized, and the antiviral activity profile was created for each standard structure. Data set compiled using this algorithm was called ViralChEMBL. As case studies, we compared descriptor and scaffold distributions for the full ChEMBL and its `viral' and `non-viral' subsets, identified the most studied compounds and created a self-organizing map for ViralChEMBL. Our approach to data annotation appeared to be a very efficient tool for the study of antiviral chemical space.
Collapse
Affiliation(s)
- Anastasia A Nikitina
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia.,Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
| | - Alexey A Orlov
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia.,Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
| | - Liubov I Kozlovskaya
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia.,Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Moscow, Russia
| | | | - Dmitry I Osolodkin
- FSBSI "Chumakov FSC R&D IBP RAS", Moscow, Russia.,Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia.,Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Moscow, Russia
| |
Collapse
|
18
|
Casciuc I, Zabolotna Y, Horvath D, Marcou G, Bajorath J, Varnek A. Virtual Screening with Generative Topographic Maps: How Many Maps Are Required? J Chem Inf Model 2018; 59:564-572. [DOI: 10.1021/acs.jcim.8b00650] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Affiliation(s)
- Iuri Casciuc
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Yuliana Zabolotna
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Dragos Horvath
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Gilles Marcou
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Jürgen Bajorath
- B-IT, Limes, Unit Chem. Biol. & Med. Chem., University of Bonn, 53115 Bonn, Germany
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| |
Collapse
|
19
|
Miyao T, Funatsu K, Bajorath J. Three-Dimensional Activity Landscape Models of Different Design and Their Application to Compound Mapping and Potency Prediction. J Chem Inf Model 2018; 59:993-1004. [DOI: 10.1021/acs.jcim.8b00661] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]
Affiliation(s)
- Tomoyuki Miyao
- Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
| | - Kimito Funatsu
- Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
| |
Collapse
|
20
|
Sidorov P, Davioud-Charvet E, Marcou G, Horvath D, Varnek A. AntiMalarial Mode of Action (AMMA) Database: Data Selection, Verification and Chemical Space Analysis. Mol Inform 2018; 37:e1800021. [DOI: 10.1002/minf.201800021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2018] [Accepted: 04/14/2018] [Indexed: 12/15/2022]
Affiliation(s)
- Pavel Sidorov
- Laboratoire de Chemoinformatique; UMR 7140 CNRS-Univ. Strasbourg; 1 rue Blaise Pascal Strasbourg 67000 France
| | - Elisabeth Davioud-Charvet
- Laboratoire d'Innovation Moléculaire et Applications (LIMA); UMR7042 CNRS-Unistra-UHA; Bioorganic and Medicinal Chemistry Team, European School of Chemistry, Polymers and Materials (ECPM); 25, rue Becquerel Strasbourg F-67087 France
| | - Gilles Marcou
- Laboratoire de Chemoinformatique; UMR 7140 CNRS-Univ. Strasbourg; 1 rue Blaise Pascal Strasbourg 67000 France
| | - Dragos Horvath
- Laboratoire de Chemoinformatique; UMR 7140 CNRS-Univ. Strasbourg; 1 rue Blaise Pascal Strasbourg 67000 France
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique; UMR 7140 CNRS-Univ. Strasbourg; 1 rue Blaise Pascal Strasbourg 67000 France
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry; Kazan Federal University; Kazan Russia
| |
Collapse
|
21
|
Lin A, Horvath D, Afonina V, Marcou G, Reymond JL, Varnek A. Mapping of the Available Chemical Space versus the Chemical Universe of Lead-Like Compounds. ChemMedChem 2018; 13:540-554. [PMID: 29154440 DOI: 10.1002/cmdc.201700561] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/15/2017] [Revised: 11/07/2017] [Indexed: 12/15/2022]
Abstract
This is, to our knowledge, the most comprehensive analysis to date based on generative topographic mapping (GTM) of fragment-like chemical space (40 million molecules with no more than 17 heavy atoms, both from the theoretically enumerated GDB-17 and real-world PubChem/ChEMBL databases). The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of a limited (≪105 ) maximal number of compounds ("frame set") usable for fitting the GTM manifold. An evolutionary map building strategy has been updated with a "coverage check" step, which discards manifolds failing to accommodate compounds outside the frame set. The evolved map has a good propensity to separate actives from inactives for more than 20 external structure-activity sets. It was proven to properly accommodate the entire collection of 40 m compounds. Next, it served as a library comparison tool to highlight biases of real-world molecules (PubChem and ChEMBL) versus the universe of all possible species represented by FDB-17, a fragment-like subset of GDB-17 containing 10 million molecules. Specific patterns, proper to some libraries and absent from others (diversity holes), were highlighted.
Collapse
Affiliation(s)
- Arkadii Lin
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Valentina Afonina
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France.,Laboratory of Chemoinformatics and Molecular Modeling, Department of Organic Chemistry, A.M. Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya str., 420008, Kazan, Russia
| | - Gilles Marcou
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, 3 Freiestrasse, 3012, Berne, Switzerland
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| |
Collapse
|
22
|
From bird’s eye views to molecular communities: two-layered visualization of structure–activity relationships in large compound data sets. J Comput Aided Mol Des 2017; 31:961-977. [DOI: 10.1007/s10822-017-0070-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2017] [Accepted: 09/21/2017] [Indexed: 01/18/2023]
|
23
|
Nowotka MM, Gaulton A, Mendez D, Bento AP, Hersey A, Leach A. Using ChEMBL web services for building applications and data processing workflows relevant to drug discovery. Expert Opin Drug Discov 2017; 12:757-767. [PMID: 28602100 DOI: 10.1080/17460441.2017.1339032] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/31/2023]
Abstract
INTRODUCTION ChEMBL is a manually curated database of bioactivity data on small drug-like molecules, used by drug discovery scientists. Among many access methods, a REST API provides programmatic access, allowing the remote retrieval of ChEMBL data and its integration into other applications. This approach allows scientists to move from a world where they go to the ChEMBL web site to search for relevant data, to one where ChEMBL data can be simply integrated into their everyday tools and work environment. Areas covered: This review highlights some of the audiences who may benefit from using the ChEMBL API, and the goals they can address, through the description of several use cases. The examples cover a team communication tool (Slack), a data analytics platform (KNIME), batch job management software (Luigi) and Rich Internet Applications. Expert opinion: The advent of web technologies, cloud computing and micro services oriented architectures have made REST APIs an essential ingredient of modern software development models. The widespread availability of tools consuming RESTful resources have made them useful for many groups of users. The ChEMBL API is a valuable resource of drug discovery bioactivity data for professional chemists, chemistry students, data scientists, scientific and web developers.
Collapse
Affiliation(s)
- Michał M Nowotka
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| | - Anna Gaulton
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| | - David Mendez
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| | - A Patricia Bento
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| | - Anne Hersey
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| | - Andrew Leach
- a European Molecular Biology Laboratory - European Bioinformatics Institute , Wellcome Genome Campus , Hinxton , UK
| |
Collapse
|
24
|
Kayastha S, Horvath D, Gilberg E, Gütschow M, Bajorath J, Varnek A. Privileged Structural Motif Detection and Analysis Using Generative Topographic Maps. J Chem Inf Model 2017; 57:1218-1232. [DOI: 10.1021/acs.jcim.7b00128] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Shilva Kayastha
- Department
of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology
and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Laboratoire
de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue
Blaise Pascal, Strasbourg 67000, France
| | - Dragos Horvath
- Laboratoire
de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue
Blaise Pascal, Strasbourg 67000, France
| | - Erik Gilberg
- Department
of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology
and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Pharmaceutical
Institute, University of Bonn, An der Immenburg 4, 53121 Bonn, Germany
| | - Michael Gütschow
- Pharmaceutical
Institute, University of Bonn, An der Immenburg 4, 53121 Bonn, Germany
| | - Jürgen Bajorath
- Department
of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology
and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| | - Alexandre Varnek
- Laboratoire
de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue
Blaise Pascal, Strasbourg 67000, France
| |
Collapse
|
25
|
Horvath D, Baskin I, Marcou G, Varnek A. Generative Topographic Mapping of Conformational Space. Mol Inform 2017; 36. [PMID: 28421706 DOI: 10.1002/minf.201700036] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2017] [Accepted: 03/31/2017] [Indexed: 12/17/2022]
Abstract
Herein, Generative Topographic Mapping (GTM) was challenged to produce planar projections of the high-dimensional conformational space of complex molecules (the 1LE1 peptide). GTM is a probability-based mapping strategy, and its capacity to support property prediction models serves to objectively assess map quality (in terms of regression statistics). The properties to predict were total, non-bonded and contact energies, surface area and fingerprint darkness. Map building and selection was controlled by a previously introduced evolutionary strategy allowed to choose the best-suited conformational descriptors, options including classical terms and novel atom-centric autocorrellograms. The latter condensate interatomic distance patterns into descriptors of rather low dimensionality, yet precise enough to differentiate between close favorable contacts and atom clashes. A subset of 20 K conformers of the 1LE1 peptide, randomly selected from a pool of 2 M geometries (generated by the S4MPLE tool) was employed for map building and cross-validation of property regression models. The GTM build-up challenge reached robust three-fold cross-validated determination coefficients of Q2 =0.7…0.8, for all modeled properties. Mapping of the full 2 M conformer set produced intuitive and information-rich property landscapes. Functional and folding subspaces appear as well-separated zones, even though RMSD with respect to the PDB structure was never used as a selection criterion of the maps.
Collapse
Affiliation(s)
- Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
| | | | - Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
| |
Collapse
|
26
|
QSAR modeling and chemical space analysis of antimalarial compounds. J Comput Aided Mol Des 2017; 31:441-451. [PMID: 28374255 DOI: 10.1007/s10822-017-0019-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2016] [Accepted: 03/18/2017] [Indexed: 10/19/2022]
Abstract
Generative topographic mapping (GTM) has been used to visualize and analyze the chemical space of antimalarial compounds as well as to build predictive models linking structure of molecules with their antimalarial activity. For this, a database, including ~3000 molecules tested in one or several of 17 anti-Plasmodium activity assessment protocols, has been compiled by assembling experimental data from in-house and ChEMBL databases. GTM classification models built on subsets corresponding to individual bioassays perform similarly to the earlier reported SVM models. Zones preferentially populated by active and inactive molecules, respectively, clearly emerge in the class landscapes supported by the GTM model. Their analysis resulted in identification of privileged structural motifs of potential antimalarial compounds. Projection of marketed antimalarial drugs on this map allowed us to delineate several areas in the chemical space corresponding to different mechanisms of antimalarial activity. This helped us to make a suggestion about the mode of action of the molecules populating these zones.
Collapse
|
27
|
Kireeva N, Pervov VS. Materials space of solid-state electrolytes: unraveling chemical composition–structure–ionic conductivity relationships in garnet-type metal oxides using cheminformatics virtual screening approaches. Phys Chem Chem Phys 2017; 19:20904-20918. [DOI: 10.1039/c7cp00518k] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Abstract
Several candidate garnet-related compounds have been recommended for synthesis as potential materials for solid-state electrolytes.
Collapse
Affiliation(s)
- Natalia Kireeva
- Frumkin Institute of Physical Chemistry and Electrochemistry Russian Academy of Sciences
- Moscow
- Russia
- Moscow Institute of Physics and Technology (State University)
- Dolgoprudny
| | - Vladislav S. Pervov
- Kurnakov Institute of General and Inorganic Chemistry Russian Academy of Sciences
- Moscow
- Russia
| |
Collapse
|
28
|
Horvath D, Marcou G, Varnek A. Generative Topographic Mapping Approach to Chemical Space Analysis. CHALLENGES AND ADVANCES IN COMPUTATIONAL CHEMISTRY AND PHYSICS 2017. [DOI: 10.1007/978-3-319-56850-8_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]
|