1
Iqbal J, Vogt M, Bajorath J. Computational Method for Quantitative Comparison of Activity Landscapes on the Basis of Image Data. Molecules 2020; 25:E3952. [PMID: 32872506 PMCID: PMC7504767 DOI: 10.3390/molecules25173952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 08/21/2020] [Accepted: 08/27/2020] [Indexed: 01/31/2023] Open
Abstract
Activity landscape (AL) models are used for visualizing and interpreting structure-activity relationships (SARs) in compound datasets. Therefore, ALs are designed to present chemical similarity and compound potency information in context. Different two- or three-dimensional (2D or 3D) AL representations have been introduced. For SAR analysis, 3D AL models are particularly intuitive. In these models, an interpolated potency surface is added as a third dimension to a 2D projection of chemical space. Accordingly, AL topology can be associated with characteristic SAR features. Going beyond visualization and a qualitative assessment of SARs, it would be very helpful to compare 3D ALs of different datasets in more quantitative terms. However, quantitative AL analysis is still in its infancy. Recently, it has been shown that 3D AL models with pre-defined topologies can be correctly classified using machine learning. Classification was facilitated on the basis of AL image feature representations learned with convolutional neural networks. Therefore, we have further investigated image analysis for quantitative comparison of 3D ALs and devised an approach to determine (dis)similarity relationships for ALs representing different compound datasets. Herein, we report this approach and demonstrate proof-of-principle. The methodology makes it possible to computationally compare 3D ALs and quantify topological differences reflecting varying SAR information content. For SAR exploration in drug design, this adds a quantitative measure of AL (dis)similarity to graphical analysis.
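As a rough illustration of the interpolated potency surface described in the abstract above, the sketch below estimates a potency value at an arbitrary position of a 2D projection from the potencies of the projected compounds. The abstract does not say which interpolation scheme is used, so inverse distance weighting is an assumption made here for simplicity. The function name `interpolate_potency`, the `power` parameter, and the toy coordinates and potency values are hypothetical and not taken from the paper.

```python
def interpolate_potency(point, coords, potencies, power=2.0):
    """Inverse-distance-weighted potency estimate at a 2D position.

    Assumption: inverse distance weighting stands in for whatever
    interpolation the original 3D AL models use.
    """
    if not coords:
        raise ValueError("at least one compound is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), potency in zip(coords, potencies):
        d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2
        if d2 == 0.0:
            # The query position coincides with a compound: return its potency.
            return potency
        weight = 1.0 / d2 ** (power / 2.0)
        weighted_sum += weight * potency
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical toy data: two compounds on a 2D projection with pKi-like values.
coords = [(0.0, 0.0), (2.0, 0.0)]
potencies = [5.0, 9.0]

# Evaluating the surface on a coarse grid gives the third (potency) dimension.
surface = [
    [interpolate_potency((gx * 0.5, gy * 0.5), coords, potencies) for gx in range(5)]
    for gy in range(3)
]
```

In this reading, smooth regions of the resulting surface would correspond to gradual potency changes among similar compounds, and steep regions to large potency differences between close neighbors.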
Affiliation(s)
- Javed Iqbal
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
2
Kausar S, Falcao AO. A visual approach for analysis and inference of molecular activity spaces. J Cheminform 2019; 11:63. [PMID: 33430986 PMCID: PMC6805449 DOI: 10.1186/s13321-019-0386-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/01/2018] [Accepted: 10/05/2019] [Indexed: 11/25/2022] Open
Abstract
BACKGROUND Molecular space visualization can help to explore the diversity of large heterogeneous chemical data, which may ultimately improve the understanding of structure-activity relationships (SAR) in drug discovery projects. Visual SAR analysis can therefore be useful for library design, for classifying chemicals ahead of biological evaluation, and for virtual screening to select compounds for synthesis or in vitro testing. As such, computational approaches for molecular space visualization have become an important topic in cheminformatics research. The proposed approach uses molecular similarity as the sole input for computing a probabilistic surface of molecular activity (PSMA). The similarity matrix is projected into 2D using different dimension reduction algorithms (Principal Coordinates Analysis (PCooA), Kruskal multidimensional scaling, Sammon mapping and t-SNE). From this projection, a kernel density function is applied to compute the probability of activity for each coordinate in the new projected space.

RESULTS This methodology was tested on four different quantitative structure-activity relationship (QSAR) binary classification data sets, and a PSMA was computed for each combination of data set and projection method. The generated maps showed internal consistency, with active molecules grouped together for all data sets and all dimensionality reduction algorithms. To validate the quality of the generated maps, the 2D coordinates of test molecules were projected into the new reference space using a data transformation matrix. In total, sixteen PSMAs were built, and their performance was assessed using the Area Under Curve (AUC) and the Matthews correlation coefficient (MCC). For the best projections for each data set, AUC testing results ranged from 0.87 to 0.98 and the MCC scores ranged from 0.33 to 0.77, suggesting this methodology can validly capture the complexities of the molecular activity space. All four mapping functions provided generally good results, yet the overall performance of PCooA and t-SNE was slightly better than that of Sammon mapping and Kruskal multidimensional scaling.

CONCLUSIONS Our results showed that, by using an appropriate combination of metric space representation and dimensionality reduction applied over metric spaces, it is possible to produce a visual PSMA whose consistency has been validated by using the map as a classification model. The produced maps can be used as prediction tools, as it is simple to project any molecule into this new reference space as long as its similarities to the molecules used to compute the initial similarity matrix can be computed.
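The kernel density step of the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the kernel or how densities are turned into probabilities, so the Gaussian kernel and the "kernel-weighted fraction of actives" used here are assumptions. The function name `activity_probability`, the `bandwidth` parameter, and the toy 2D coordinates are hypothetical; in the described workflow the coordinates would come from a projection of the similarity matrix (for example PCooA or t-SNE).

```python
import math


def activity_probability(point, coords, labels, bandwidth=1.0):
    """Kernel-weighted fraction of active molecules around a 2D position.

    Assumption: a Gaussian kernel, with the probability taken as the ratio
    of the kernel mass of actives to the kernel mass of all molecules.
    """
    active_mass = 0.0
    total_mass = 0.0
    for (x, y), is_active in zip(coords, labels):
        d2 = (point[0] - x) ** 2 + (point[1] - y) ** 2
        weight = math.exp(-d2 / (2.0 * bandwidth ** 2))
        active_mass += weight * is_active
        total_mass += weight
    # Far from every molecule the weights can underflow to zero.
    return active_mass / total_mass if total_mass > 0.0 else 0.0


# Hypothetical projected coordinates: two actives (1) and two inactives (0).
coords = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 4.0)]
labels = [1, 1, 0, 0]

p_near_actives = activity_probability((0.0, 0.0), coords, labels)
p_near_inactives = activity_probability((5.0, 5.0), coords, labels)
p_midpoint = activity_probability((2.5, 2.5), coords, labels)
```

Thresholding such a probability is one way a map of this kind could be used as a binary classifier for a newly projected molecule, which is the use the abstract evaluates with AUC and MCC.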
Affiliation(s)
- Samina Kausar
  - LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI: Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
- Andre O. Falcao
  - LaSIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
  - BioISI: Biosystems & Integrative Sciences Institute, Faculdade de Ciências, Universidade de Lisboa, 1749-016 Lisboa, Portugal
3
Medina-Franco JL, Naveja JJ, López-López E. Reaching for the bright StARs in chemical space. Drug Discov Today 2019; 24:2162-2169. [PMID: 31557448 DOI: 10.1016/j.drudis.2019.09.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 08/10/2019] [Revised: 09/10/2019] [Accepted: 09/17/2019] [Indexed: 02/07/2023]
Abstract
Visualization of activity data in chemical space is common in drug discovery. Navigating the space in a systematic manner is not trivial, given its size and huge coverage. To this end, methods for data visualization have been developed charting biological activity into chemical space. Herein, we review the progress in different visualization approaches to explore the chemical space aiming at reaching insightful structure-activity relationships (SARs) in the chemical space. We discuss recent methods including consensus diversity plots, ChemMaps, and constellation plots. Several of the methods we review can be extended to analyze other properties of interest in medicinal chemistry, such as structure-toxicity relationships, and can be adapted to postprocess results of virtual screening (VS) of large compound libraries.
Affiliation(s)
- José L Medina-Franco
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
- J Jesús Naveja
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico; PECEM, School of Medicine, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Edgar López-López
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
4
Miyao T, Funatsu K, Bajorath J. Three-Dimensional Activity Landscape Models of Different Design and Their Application to Compound Mapping and Potency Prediction. J Chem Inf Model 2018; 59:993-1004. [DOI: 10.1021/acs.jcim.8b00661] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Affiliation(s)
- Tomoyuki Miyao
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Kimito Funatsu
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
  - Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
5
Abstract
INTRODUCTION Activity landscapes (ALs) are representations and models of compound data sets annotated with a target-specific activity. In contrast to quantitative structure-activity relationship (QSAR) models, ALs aim at characterizing structure-activity relationships (SARs) on a large-scale level encompassing all active compounds for specific targets. The popularity of AL modeling has grown substantially with the public availability of large activity-annotated compound data sets. AL modeling crucially depends on molecular representations and similarity metrics used to assess structural similarity.

Areas covered: The concepts of AL modeling are introduced and its basis in quantitatively assessing molecular similarity is discussed. The different types of AL modeling approaches are introduced. AL designs can broadly be divided into three categories: compound-pair based, dimensionality reduction, and network approaches. Recent developments for each of these categories are discussed focusing on the application of mathematical, statistical, and machine learning tools for AL modeling. AL modeling using chemical space networks is covered in more detail.

Expert opinion: AL modeling has remained a largely descriptive approach for the analysis of SARs. Beyond mere visualization, the application of analytical tools from statistics, machine learning and network theory has aided in the sophistication of AL designs and provides a step forward in transforming ALs from descriptive to predictive tools. To this end, optimizing representations that encode activity relevant features of molecules might prove to be a crucial step.
Affiliation(s)
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
6
Métivier JP, Cuissart B, Bureau R, Lepailleur A. The Pharmacophore Network: A Computational Method for Exploring Structure–Activity Relationships from a Large Chemical Data Set. J Med Chem 2018; 61:3551-3564. [DOI: 10.1021/acs.jmedchem.7b01890] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Indexed: 01/27/2023]
Affiliation(s)
- Jean-Philippe Métivier
  - Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
  - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
- Bertrand Cuissart
  - Groupe de Recherche en Informatique, Image, Automatique et Instrumentation de Caen, Normandie Univ, UNICAEN, ENSICAEN, CNRS, GREYC, 14000 Caen, France
- Ronan Bureau
  - Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
- Alban Lepailleur
  - Centre d’Etudes et de Recherche sur le Médicament de Normandie, Normandie Univ, UNICAEN, CERMN, 14000 Caen, France
7
From bird’s eye views to molecular communities: two-layered visualization of structure–activity relationships in large compound data sets. J Comput Aided Mol Des 2017; 31:961-977. [DOI: 10.1007/s10822-017-0070-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 06/01/2017] [Accepted: 09/21/2017] [Indexed: 01/18/2023]
8
Yang D, Wan C, He M, Che C, Xiao Y, Fu B, Qin Z. Design, synthesis, crystal structure and fungicidal activity of (E)-5-(methoxyimino)-3,5-dihydrobenzo[e][1,2]oxazepin-4(1H)-one analogues. MedChemComm 2017; 8:1007-1014. [PMID: 30108816 DOI: 10.1039/c7md00025a] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 01/15/2017] [Accepted: 03/01/2017] [Indexed: 12/22/2022]
Abstract
A practical method of four-step synthesis towards novel (E)-5-(methoxyimino)-3,5-dihydrobenzo[e][1,2]oxazepin-4(1H)-one antifungals is presented, where a commercially available pesticide and pharmacology intermediate, (E)-methyl 2-(2-(bromomethyl)phenyl)-2-(methoxyimino)acetate (1), was used as starting material. These compounds were confirmed by 1H NMR, 13C NMR, high-resolution mass spectroscopy and X-ray crystal structure. Via in vitro fungicidal evaluation, the moderate to high activities of several compounds against eight phytopathogenic fungi were demonstrated. Especially, the fungicidal activities of compounds 5-03 and 5-09 were comparable to those of the controls azoxystrobin and trifloxystrobin in precise virulence measurements for four fungi. These results suggested that dihydrobenzo[e][1,2]oxazepin-4(1H)-one analogues could be considered as potential fungicidal candidates for crop protection.
Affiliation(s)
- Dongyan Yang
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- Chuan Wan
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- MengMeng He
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- Chuanliang Che
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- Yumei Xiao
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- Bin Fu
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
- Zhaohai Qin
  - College of Science, China Agricultural University, Beijing 100193, China; Tel: +86 (0)10 62732958
9
Kunimoto R, Vogt M, Bajorath J. Tracing compound pathways using chemical space networks. MedChemComm 2017; 8:376-384. [PMID: 30108753 PMCID: PMC6072420 DOI: 10.1039/c6md00628k] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/09/2016] [Accepted: 12/06/2016] [Indexed: 11/21/2022]
Abstract
Similarity-based compound networks are used as coordinate-free representations of chemical space. In so-called chemical space networks (CSNs), nodes represent compounds and edges pairwise similarity relationships. Nodes can be annotated with activity information, which enables visualization of structure-activity relationship (SAR) patterns. A major determinant of CSN structure and topology is the way in which similarity relationships are determined. Using different similarity measures, a number of CSN variants have been generated previously. Herein, we report a new type of CSN with an asymmetric similarity metric based upon the maximum common substructure of compound pairs. While CSNs have thus far mostly been used for SAR visualization, the new CSN variant was designed for another medicinal chemistry application, i.e. the identification of compound pathways in data sets. In this CSN, pathways consisting of structurally related compounds with increasing size can be systematically traced, which represent models of compound optimization paths. Compound series forming such paths can be extracted from the CSN. The network-based identification of hit-to-lead or lead optimization series in compound data sets is intuitive and thought to provide valuable information for medicinal chemistry.
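The idea of tracing pathways of increasing compound size through a network with an asymmetric similarity measure can be sketched as below. This is a toy model under stated assumptions, not the method of the paper: computing a maximum common substructure requires a cheminformatics toolkit, so each compound is represented here by a hypothetical set of structural feature labels, and the shared features stand in for the common substructure. The names `asymmetric_similarity`, `build_network`, `trace_paths`, the `threshold` parameter, and the example compounds are all illustrative.

```python
def asymmetric_similarity(a, b):
    """Fraction of a's features also found in b (generally not symmetric)."""
    return len(a & b) / len(a) if a else 0.0


def build_network(compounds, threshold=1.0):
    """Directed edges from each compound to strictly larger, related compounds."""
    edges = {name: [] for name in compounds}
    for source, src_feats in compounds.items():
        for target, tgt_feats in compounds.items():
            if source == target or len(tgt_feats) <= len(src_feats):
                continue
            if asymmetric_similarity(src_feats, tgt_feats) >= threshold:
                edges[source].append(target)
    return edges


def trace_paths(edges, start):
    """Enumerate all maximal paths from `start`; size grows along each path."""
    paths = []

    def walk(node, path):
        successors = edges[node]
        if not successors:
            paths.append(path)
            return
        for nxt in successors:
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths


# Hypothetical compounds described by sets of structural feature labels.
compounds = {
    "A": {"ring"},
    "B": {"ring", "OH"},
    "C": {"ring", "OH", "Cl"},
    "D": {"amide"},
}
edges = build_network(compounds)
paths = trace_paths(edges, "A")
```

Because edges only point from smaller to strictly larger compounds, the network is acyclic and the recursion terminates. Each returned path is a series of related compounds of increasing size, loosely analogous to the optimization paths mentioned in the abstract.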
Affiliation(s)
- Ryo Kunimoto
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany; Tel: +49 228 2699 306
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany; Tel: +49 228 2699 306
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany; Tel: +49 228 2699 306