1. Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024; 19:1043-1069. [PMID: 39004919; DOI: 10.1080/17460441.2024.2376643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION: Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy - profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology.
AREAS COVERED: This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples.
EXPERT OPINION: Polypharmacology prediction has made impressive progress over the last decades and contributed to identify many off-targets. However, data incompleteness currently limits most approaches to comprehensively predict selectivity. Moreover, our limited agreement on model assessment challenges the identification of the best algorithms - which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI hold much potential to better polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Albert A Antolin
  - Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
  - Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK
2. Rana D, Pflüger PM, Hölter NP, Tan G, Glorius F. Standardizing Substrate Selection: A Strategy toward Unbiased Evaluation of Reaction Generality. ACS Central Science 2024; 10:899-906. [PMID: 38680564; PMCID: PMC11046462; DOI: 10.1021/acscentsci.3c01638] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2023] [Revised: 03/14/2024] [Accepted: 03/18/2024] [Indexed: 05/01/2024]
Abstract
With over 10,000 new reaction protocols arising every year, only a handful of these procedures transition from academia to application. A major reason for this gap stems from the lack of comprehensive knowledge about a reaction's scope, i.e., to which substrates the protocol can or cannot be applied. Even though chemists invest substantial effort to assess the scope of new protocols, the resulting scope tables involve significant biases, reducing their expressiveness. Herein we report a standardized substrate selection strategy designed to mitigate these biases and evaluate the applicability, as well as the limits, of any chemical reaction. Unsupervised learning is utilized to map the chemical space of industrially relevant molecules. Subsequently, potential substrate candidates are projected onto this universal map, enabling the selection of a structurally diverse set of substrates with optimal relevance and coverage. By testing our methodology on different chemical reactions, we were able to demonstrate its effectiveness in finding general reactivity trends by using a few highly representative examples. The developed methodology empowers chemists to showcase the unbiased applicability of novel methodologies, facilitating their practical applications. We hope that this work will trigger interdisciplinary discussions about biases in synthetic chemistry, leading to improved data quality.
Affiliation(s)
- Debanjan Rana
  - Universität Münster, Organisch-Chemisches Institut, Corrensstraße 36, 48149 Münster, Germany
- Philipp M. Pflüger
  - Universität Münster, Organisch-Chemisches Institut, Corrensstraße 36, 48149 Münster, Germany
- Niklas P. Hölter
  - Universität Münster, Organisch-Chemisches Institut, Corrensstraße 36, 48149 Münster, Germany
- Guangying Tan
  - Universität Münster, Organisch-Chemisches Institut, Corrensstraße 36, 48149 Münster, Germany
- Frank Glorius
  - Universität Münster, Organisch-Chemisches Institut, Corrensstraße 36, 48149 Münster, Germany
3. Ruiz-Moreno AJ, Reyes-Romero A, Dömling A, Velasco-Velázquez MA. In Silico Design and Selection of New Tetrahydroisoquinoline-Based CD44 Antagonist Candidates. Molecules 2021; 26:1877. [PMID: 33810348; PMCID: PMC8037692; DOI: 10.3390/molecules26071877] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/27/2021] [Revised: 03/13/2021] [Accepted: 03/14/2021] [Indexed: 02/07/2023]
Abstract
CD44 promotes metastasis, chemoresistance, and stemness in different types of cancer and is a target for the development of new anti-cancer therapies. All CD44 isoforms share a common N-terminal domain that binds to hyaluronic acid (HA). Herein, we used a computational approach to design new potential CD44 antagonists and evaluate their target-binding ability. By analyzing 30 crystal structures of the HA-binding domain (CD44HAbd), we characterized a subdomain that binds to 1,2,3,4-tetrahydroisoquinoline (THQ)-containing compounds and is adjacent to residues essential for HA interaction. By computational combinatorial chemistry (CCC), we designed 168,190 molecules and compared their conformers to a pharmacophore containing the key features of the crystallographic THQ binding mode. Approximately 0.01% of the compounds matched the pharmacophore and were analyzed by computational docking and molecular dynamics (MD). We identified two compounds, Can125 and Can159, that bound to human CD44HAbd (hCD44HAbd) in explicit-solvent MD simulations and therefore may elicit CD44 blockage. These compounds can be easily synthesized by multicomponent reactions for activity testing and their binding mode, reported here, could be helpful in the design of more potent CD44 antagonists.
Affiliation(s)
- Angel J. Ruiz-Moreno
  - Departamento de Farmacología, Facultad de Medicina, Universidad Nacional Autónoma de Mexico (UNAM), Ciudad de Mexico 04510, Mexico
  - Unidad Periférica de Investigación en Biomedicina Translacional, Facultad de Medicina, Universidad Nacional Autónoma de México (UNAM), Félix Cuevas 540, Ciudad de Mexico 03229, Mexico
  - Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México (UNAM), Ciudad de Mexico 04510, Mexico
  - Drug Design Group, Department of Pharmacy, University of Groningen, 9700 AD Groningen, The Netherlands
- Atilio Reyes-Romero
  - Drug Design Group, Department of Pharmacy, University of Groningen, 9700 AD Groningen, The Netherlands
- Alexander Dömling
  - Drug Design Group, Department of Pharmacy, University of Groningen, 9700 AD Groningen, The Netherlands
  - Correspondence: Tel.: +31-50-363-330 (A.D.); +52-55-5623-2282 (M.A.V.-V.)
- Marco A. Velasco-Velázquez
  - Departamento de Farmacología, Facultad de Medicina, Universidad Nacional Autónoma de Mexico (UNAM), Ciudad de Mexico 04510, Mexico
  - Unidad Periférica de Investigación en Biomedicina Translacional, Facultad de Medicina, Universidad Nacional Autónoma de México (UNAM), Félix Cuevas 540, Ciudad de Mexico 03229, Mexico
  - Correspondence: Tel.: +31-50-363-330 (A.D.); +52-55-5623-2282 (M.A.V.-V.)
4.
5. Yang T, Li Z, Chen Y, Feng D, Wang G, Fu Z, Ding X, Tan X, Zhao J, Luo X, Chen K, Jiang H, Zheng M. DrugSpaceX: a large screenable and synthetically tractable database extending drug space. Nucleic Acids Res 2021; 49:D1170-D1178. [PMID: 33104791; PMCID: PMC7778939; DOI: 10.1093/nar/gkaa920] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 07/24/2020] [Revised: 09/11/2020] [Accepted: 10/05/2020] [Indexed: 02/07/2023]
Abstract
One of the most prominent topics in drug discovery is efficient exploration of the vast drug-like chemical space to find synthesizable and novel chemical structures with desired biological properties. To address this challenge, we created the DrugSpaceX (https://drugspacex.simm.ac.cn/) database based on expert-defined transformations of approved drug molecules. The current version of DrugSpaceX contains >100 million transformed chemical products for virtual screening, with outstanding characteristics in terms of structural novelty, diversity and large three-dimensional chemical space coverage. To illustrate its practical application in drug discovery, we used a case study of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, to show DrugSpaceX performing a quick search of initial hit compounds. Additionally, for ligand identification and optimization purposes, DrugSpaceX also provides several subsets for download, including a 10% diversity subset, an extended drug-like subset, a drug-like subset, a lead-like subset, and a fragment-like subset. In addition to chemical properties and transformation instructions, DrugSpaceX can locate the position of transformation, which will enable medicinal chemists to easily integrate strategy planning and protection design.
Affiliation(s)
- Tianbiao Yang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
- Zhaojun Li
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Yingjia Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Dan Feng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Chemistry, College of Sciences, Shanghai University, Shanghai, China
- Guangchao Wang
  - School of Information Management, Dezhou University, No. 566 University Rd. West, Dezhou 253023, Shandong, China
- Zunyun Fu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Nanjing University of Chinese Medicine, 138 Xianlin Road, Jiangsu, Nanjing 210023, China
- Xiaoyu Ding
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaoqin Tan
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Jihui Zhao
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Xiaomin Luo
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Kaixian Chen
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
- Hualiang Jiang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
  - School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou 310024, China
  - School of Life Science and Technology, ShanghaiTech University, 393 Huaxiazhong Road, Shanghai 200031, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - Department of Pharmacy, University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing 100049, China
6. Donmez A, Rifaioglu AS, Acar A, Doğan T, Cetin-Atalay R, Atalay V. iBioProVis: interactive visualization and analysis of compound bioactivity space. Bioinformatics 2020; 36:4227-4230. [PMID: 32407491; PMCID: PMC7454317; DOI: 10.1093/bioinformatics/btaa496] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 12/10/2019] [Revised: 03/25/2020] [Accepted: 05/08/2020] [Indexed: 12/23/2022]
Abstract
SUMMARY: iBioProVis is an interactive tool for visual analysis of the compound bioactivity space in the context of target proteins, drugs and drug candidate compounds. The iBioProVis tool takes target protein identifiers and, optionally, compound SMILES as input, and uses the state-of-the-art non-linear dimensionality reduction method t-Distributed Stochastic Neighbor Embedding (t-SNE) to plot the distribution of compounds embedded in a 2D map, based on the similarity of structural properties of compounds and in the context of compounds' cognate targets. Similar compounds, which are embedded to proximate points on the 2D map, may bind the same or similar target proteins. Thus, iBioProVis can be used to easily observe the structural distribution of one or two target proteins' known ligands on the 2D compound space, and to infer new binders to the same protein, or to infer new potential target(s) for a compound of interest, based on this distribution. Principal component analysis (PCA) projection of the input compounds is also provided. Hence the user can interactively observe the same compound, or a group of selected compounds, both projected by PCA and embedded by t-SNE. iBioProVis also provides detailed information about drugs and drug candidate compounds through cross-references to widely used and well-known databases, in the form of linked table views. Two use-case studies were demonstrated, one being on the angiotensin-converting enzyme 2 (ACE2) protein, which is the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Spike protein receptor. ACE2 binding compounds and seven antiviral drugs were closely embedded, two of which have been under clinical trial for Coronavirus disease 19 (COVID-19).
AVAILABILITY AND IMPLEMENTATION: iBioProVis and its carefully filtered dataset are available at https://ibpv.kansil.org/ for public use.
CONTACT: vatalay@metu.edu.tr.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ataberk Donmez
  - Department of Computer Engineering, METU, Ankara 06800, Turkey
- Ahmet Sureyya Rifaioglu
  - Department of Computer Engineering, METU, Ankara 06800, Turkey
  - Department of Computer Engineering, İskenderun Technical University, Hatay 31200, Turkey
- Aybar Acar
  - Department of Health Informatics, KanSiL, Graduate School of Informatics, METU
- Tunca Doğan
  - Department of Computer Engineering, Hacettepe University, 06800 Ankara, Turkey
  - Institute of Informatics, Hacettepe University, 06800 Ankara, Turkey
- Rengul Cetin-Atalay
  - Department of Health Informatics, KanSiL, Graduate School of Informatics, METU
  - Department of Medicine, Section of Pulmonary and Critical Care Medicine, the University of Chicago, Chicago, IL 60637, USA
- Volkan Atalay
  - Department of Computer Engineering, METU, Ankara 06800, Turkey
7. Probst D, Reymond JL. Visualization of very large high-dimensional data sets as minimum spanning trees. J Cheminform 2020; 12:12. [PMID: 33431043; PMCID: PMC7015965; DOI: 10.1186/s13321-020-0416-x] [Citation(s) in RCA: 135] [Impact Index Per Article: 27.0] [Received: 01/09/2020] [Accepted: 02/04/2020] [Indexed: 01/10/2023]
Abstract
The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (http://tmap.gdb.tools). Visualizations based on TMAP are better suited than t-SNE or UMAP for the exploration and interpretation of large data sets due to their tree-like nature, increased local and global neighborhood and structure preservation, and the transparency of the methods the algorithm is based on. We apply TMAP to the most used chemistry data sets including databases of molecules such as ChEMBL, FDB17, the Natural Products Atlas, DSSTox, as well as to the MoleculeNet benchmark collection of data sets. We also show its broad applicability with further examples from biology, particle physics, and literature.
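The title of this entry refers to minimum spanning trees. As a rough illustration of that general idea only, the sketch below builds a minimum spanning tree over a handful of toy feature sets using Kruskal's algorithm and Jaccard distance. It is not the TMAP implementation: the brute-force pairwise comparison, the choice of Jaccard distance, the function names, and the toy data are all assumptions made here for illustration, and the approach shown would not scale to the millions of points the abstract mentions.

```python
from itertools import combinations


def jaccard_distance(a: frozenset, b: frozenset) -> float:
    """Return 1 - |a & b| / |a | b|; defined as 0.0 for two empty sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def minimum_spanning_tree(points):
    """Kruskal's algorithm over the complete pairwise-distance graph.

    Returns a list of (i, j, distance) edges. All O(n^2) pairs are
    enumerated, so this is only suitable for small illustrative inputs.
    """
    parent = list(range(len(points)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (jaccard_distance(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # edge joins two components, so keep it
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


# Hypothetical toy "fingerprints": sets of feature identifiers.
toy = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({7, 8, 10}),
]
tree = minimum_spanning_tree(toy)
```

In this toy input the two pairs of similar sets are joined first, and a single longer edge then connects the two groups, which is the kind of tree-like grouping of near neighbours that a spanning-tree layout makes visible.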
Affiliation(s)
- Daniel Probst
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
8. Dimandja JM. Introduction and historical background: the “inside” story of comprehensive two-dimensional gas chromatography. Sep Sci Technol 2020. [DOI: 10.1016/b978-0-12-813745-1.00001-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/24/2022]
9. Borrel A, Kleinstreuer NC, Fourches D. Exploring drug space with ChemMaps.com. Bioinformatics 2019; 34:3773-3775. [PMID: 29790904; DOI: 10.1093/bioinformatics/bty412] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/12/2018] [Accepted: 05/18/2018] [Indexed: 11/12/2022]
Abstract
Motivation: Easily navigating chemical space has become more important due to the increasing size and diversity of publicly-accessible databases such as DrugBank, ChEMBL or Tox21. To do so, modelers typically rely on complex projection techniques using molecular descriptors computed for all the chemicals to be visualized. However, the multiple cheminformatics steps required to prepare, characterize, compute and explore those molecules are technical, typically necessitate scripting skills, and thus represent a real obstacle for non-specialists.
Results: We developed the ChemMaps.com webserver to easily browse, navigate and mine chemical space. The first version of ChemMaps.com features more than 8000 approved, in development, and rejected drugs, as well as over 47 000 environmental chemicals.
Availability and implementation: The webserver is freely available at http://www.chemmaps.com.
Affiliation(s)
- Alexandre Borrel
  - Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
  - Division of Intramural Research/Biostatistics and Computational Biology Branch
- Nicole C Kleinstreuer
  - Division of Intramural Research/Biostatistics and Computational Biology Branch
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, NIEHS, RTP, Research Triangle, NC, USA
- Denis Fourches
  - Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
10. Capecchi A, Awale M, Probst D, Reymond JL. PubChem and ChEMBL beyond Lipinski. Mol Inform 2019; 38:e1900016. [PMID: 30844149; DOI: 10.1002/minf.201900016] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/30/2019] [Accepted: 02/18/2019] [Indexed: 12/13/2022]
Abstract
Seven million of the currently 94 million entries in the PubChem database break at least one of the four Lipinski constraints for oral bioavailability, 183,185 of which are also found in the ChEMBL database. These non-Lipinski PubChem (NLP) and ChEMBL (NLC) subsets are interesting because they contain new modalities that can display biological properties not accessible to small molecule drugs. Unfortunately, the current search tools in PubChem and ChEMBL are designed for small molecules and are not well suited to explore these subsets, which therefore remain poorly appreciated. Herein we report MXFP (macromolecule extended atom-pair fingerprint), a 217-D fingerprint tailored to analyze large molecules in terms of molecular shape and pharmacophores. We implement MXFP in two web-based applications, the first one to visualize NLP and NLC interactively using Faerun (http://faerun.gdb.tools/), the second one to perform MXFP nearest neighbor searches in NLP and NLC (http://similaritysearch.gdb.tools/). We show that these tools provide a meaningful insight into the diversity of large molecules in NLP and NLC. The interactive tools presented here are publicly available at http://gdb.unibe.ch and can be used freely to explore and better understand the diversity of non-Lipinski molecules in PubChem and ChEMBL.
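The abstract above defines its subsets by molecules that break at least one of the four Lipinski constraints. As a minimal sketch of such a filter, the code below checks the commonly cited rule-of-five thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The exact descriptors and cut-offs used by the authors are not given in the abstract, so these thresholds, the `Properties` class, the function names, and the example values are assumptions for illustration; property values are taken as precomputed inputs rather than derived from structures.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Properties:
    """Precomputed descriptors for one molecule (hypothetical container)."""
    mol_weight: float      # in Da
    logp: float            # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(p: Properties) -> List[str]:
    """List which of the four commonly cited Lipinski limits are exceeded."""
    violations = []
    if p.mol_weight > 500:
        violations.append("mol_weight")
    if p.logp > 5:
        violations.append("logp")
    if p.h_bond_donors > 5:
        violations.append("h_bond_donors")
    if p.h_bond_acceptors > 10:
        violations.append("h_bond_acceptors")
    return violations


def is_non_lipinski(p: Properties) -> bool:
    """True if at least one constraint is broken."""
    return bool(lipinski_violations(p))


# Hypothetical example values, not taken from PubChem or ChEMBL.
small = Properties(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Properties(mol_weight=1202.6, logp=2.9, h_bond_donors=5, h_bond_acceptors=12)

small_result = lipinski_violations(small)
large_result = lipinski_violations(large)
```

Returning the list of broken constraints, rather than only a boolean, makes it easy to see why a given molecule falls outside the limits.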
Affiliation(s)
- Alice Capecchi
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Mahendra Awale
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Daniel Probst
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland
11. Karlov DS, Sosnin S, Tetko IV, Fedorov MV. Chemical space exploration guided by deep neural networks. RSC Adv 2019; 9:5151-5157. [PMID: 35514634; PMCID: PMC9060647; DOI: 10.1039/c8ra10182e] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/11/2018] [Accepted: 01/29/2019] [Indexed: 11/21/2022]
Abstract
A parametric t-SNE approach based on deep feed-forward neural networks was applied to the chemical space visualization problem. It is able to retain more information than certain dimensionality reduction techniques used for this purpose (principal component analysis (PCA), multidimensional scaling (MDS)). The applicability of this method to some chemical space navigation tasks (activity cliffs and activity landscapes identification) is discussed. We created a simple web tool to illustrate our work (http://space.syntelly.com).
Affiliation(s)
- Dmitry S. Karlov
  - Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- Sergey Sosnin
  - Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
  - Syntelly LLC
- Igor V. Tetko
  - Helmholtz Zentrum München – Research Center for Environmental Health (GmbH), Institute of Structural Biology, Germany
  - BIGCHEM GmbH, Germany
- Maxim V. Fedorov
  - Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
  - Syntelly LLC
12.
Abstract
The recent general availability of low-cost virtual reality headsets and accompanying three-dimensional (3D) engine support presents an opportunity to bring the concept of chemical space into virtual environments. While virtual reality applications represent a category of widespread tools in other fields, their use in the visualization and exploration of abstract data such as chemical spaces has been experimental. In our previous work, we established the concept of interactive two-dimensional (2D) maps of chemical spaces followed by interactive web-based 3D visualizations, culminating in the interactive web-based 3D visualization of extremely large chemical spaces. Virtual reality chemical spaces are a natural extension of these concepts. As 2D and 3D embeddings and projections of high-dimensional chemical fingerprint spaces have been shown to be valuable tools in chemical space visualization and exploration, existing pipelines of data mining and preparation can be extended to be used in virtual reality applications. Here we present an application based on the Unity engine and the Virtual Reality Toolkit, allowing for the interactive exploration of chemical space populated by DrugBank compounds in virtual reality. The source code of the application as well as the most recent build are available on GitHub (https://github.com/reymond-group/virtual-reality-chemical-space).
Affiliation(s)
- Daniel Probst
  - Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
13. Probst D, Reymond JL. SmilesDrawer: Parsing and Drawing SMILES-Encoded Molecular Structures Using Client-Side JavaScript. J Chem Inf Model 2018; 58:1-7. [PMID: 29257869; DOI: 10.1021/acs.jcim.7b00425] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Indexed: 01/18/2023]
Abstract
Here we present SmilesDrawer, a dependency-free JavaScript component capable of both parsing and drawing SMILES-encoded molecular structures client-side, developed to be easily integrated into web projects and to display organic molecules in large numbers and fast succession. SmilesDrawer can draw structurally and stereochemically complex structures such as maitotoxin and C60 without using templates, yet has an exceptionally small computational footprint and low memory usage without the requirement for loading images or any other form of client-server communication, making it easy to integrate even in secure (intranet, firewalled) or offline applications. These features allow the rendering of thousands of molecular structure drawings on a single web page within seconds on a wide range of hardware supporting modern browsers. The source code as well as the most recent build of SmilesDrawer is available on GitHub (http://doc.gdb.tools/smilesDrawer/). Both yarn and npm packages are also available.
Affiliation(s)
- Daniel Probst
  - Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, National Center for Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
14. Awale M, Probst D, Reymond JL. WebMolCS: A Web-Based Interface for Visualizing Molecules in Three-Dimensional Chemical Spaces. J Chem Inf Model 2017; 57:643-649. [PMID: 28316236; DOI: 10.1021/acs.jcim.6b00690] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 11/28/2022]
Abstract
The concept of chemical space provides a convenient framework to analyze large collections of molecules by placing them in property spaces where distances represent similarities. Here we report webMolCS, a new type of web-based interface visualizing up to 5000 user-defined molecules in six different three-dimensional (3D) chemical spaces obtained by principal component analysis or similarity mapping of multidimensional property spaces describing composition (MQN: 42D molecular quantum numbers, SMIfp: 34D SMILES fingerprint), shapes and pharmacophores (APfp: 20D atom pair fingerprint, Xfp: 55D category extended atom pair fingerprint), and substructures (Sfp: 1024D binary substructure fingerprint, ECfp4: 1024D extended connectivity fingerprint). Each molecule is shown as a sphere, and its structure appears on mouse over. The sphere is color-coded by similarity to the first compound in the list, by the list rank, or by a user-defined value, which reveals the relationship between any property encoded by these values and structural similarities. WebMolCS is freely available at www.gdb.unibe.ch.
Affiliation(s)
- Mahendra Awale
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Daniel Probst
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland
- Jean-Louis Reymond
  - Department of Chemistry and Biochemistry, National Center of Competence in Research NCCR TransCure, University of Berne, Freiestrasse 3, 3012 Berne, Switzerland