1
Baygi SF, Barupal DK. IDSL_MINT: a deep learning framework to predict molecular fingerprints from mass spectra. J Cheminform 2024; 16:8. [PMID: 38238779 PMCID: PMC10797927 DOI: 10.1186/s13321-024-00804-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 01/14/2024] [Indexed: 01/22/2024] Open
Abstract
The majority of tandem mass spectrometry (MS/MS) spectra in untargeted metabolomics and exposomics studies lack any annotation. Our deep learning framework, Integrated Data Science Laboratory for Metabolomics and Exposomics-Mass INTerpreter (IDSL_MINT), can translate MS/MS spectra into molecular fingerprint descriptors. IDSL_MINT allows users to leverage the power of the transformer model for mass spectrometry data, similar to large language models. Models are trained on user-provided reference MS/MS libraries using any customizable molecular fingerprint descriptors. IDSL_MINT was benchmarked using the LipidMaps database and improved the annotation rate of a test study for MS/MS spectra that were not originally annotated using existing mass spectral libraries. IDSL_MINT may improve the overall annotation rates in untargeted metabolomics and exposomics studies. The IDSL_MINT framework and tutorials are available in the GitHub repository at https://github.com/idslme/IDSL_MINT. Scientific contribution statement: Structural annotation of MS/MS spectra from untargeted metabolomics and exposomics datasets is a major bottleneck in gaining new biological insights. Machine learning models that convert spectra into molecular fingerprints can help in the annotation process. Here, we present IDSL_MINT, a new, easy-to-use and customizable deep-learning framework to train and utilize new models that predict molecular fingerprints from spectra for compound annotation workflows.
Affiliation(s)
- Sadjad Fakouri Baygi
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, CAM Building, 3rd Floor, 17 E 102 St, New York, NY, 10029, USA
- Dinesh Kumar Barupal
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, CAM Building, 3rd Floor, 17 E 102 St, New York, NY, 10029, USA
2
Olmedo DA, Durant-Archibold AA, López-Pérez JL, Medina-Franco JL. Design and Diversity Analysis of Chemical Libraries in Drug Discovery. Comb Chem High Throughput Screen 2024; 27:502-515. [PMID: 37409545 DOI: 10.2174/1386207326666230705150110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/30/2023] [Accepted: 05/30/2023] [Indexed: 07/07/2023]
Abstract
Chemical libraries and compound data sets are among the main inputs to start the drug discovery process at universities, research institutes, and the pharmaceutical industry. The approach used in the design of compound libraries, the chemical information they possess, and the representation of structures play a fundamental role in studies in chemoinformatics, food informatics, in silico pharmacokinetics, computational toxicology, bioinformatics, and molecular modeling that generate computational hits for the subsequent optimization of drug candidates. The prospects for growth in drug discovery and development processes in chemical, biotechnological, and pharmaceutical companies began a few years ago with the integration of computational tools and artificial intelligence methodologies. This integration is anticipated to increase the number of drugs approved by regulatory agencies in the near future.
Affiliation(s)
- Dionisio A Olmedo
  - Centro de Investigaciones Farmacognósticas de la Flora Panameña (CIFLORPAN), Facultad de Farmacia, Universidad de Panamá, Ciudad de Panamá, Apartado, 0824-00178, Panamá
  - Sistema Nacional de Investigación (SNI), Secretaria Nacional de Ciencia, Tecnología e Innovación (SENACYT), Ciudad del Saber, Clayton, Panamá
- Armando A Durant-Archibold
  - Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Apartado, 0843-01103, Panamá
  - Departamento de Bioquímica, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Ciudad de Panamá, Panamá
- José Luis López-Pérez
  - CESIFAR, Departamento de Farmacología, Facultad de Medicina, Universidad de Panamá, Ciudad de Panamá, Panamá
  - Departamento de Ciencias Farmacéuticas, Facultad de Farmacia, Universidad de Salamanca, Avda. Campo Charro s/n, 37071 Salamanca, España
- José Luis Medina-Franco
  - DIFACQUIM Grupo de Investigación, Departamento de Farmacia, Escuela de Química, Universidad Nacional Autónoma de México, Ciudad de México, Apartado, 04510, México
3
Weber JK, Morrone JA, Bagchi S, Pabon JDE, Kang SG, Zhang L, Cornell WD. Simplified, interpretable graph convolutional neural networks for small molecule activity prediction. J Comput Aided Mol Des 2021; 36:391-404. [PMID: 34817762 PMCID: PMC9325818 DOI: 10.1007/s10822-021-00421-6] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/09/2021] [Accepted: 09/24/2021] [Indexed: 12/11/2022]
Abstract
We here present a streamlined, explainable graph convolutional neural network (gCNN) architecture for small molecule activity prediction. We first conduct a hyperparameter optimization across nearly 800 protein targets that produces a simplified gCNN QSAR architecture, and we observe that such a model can yield performance improvements over both standard gCNN and RF methods on difficult-to-classify test sets. Additionally, we discuss how reductions in convolutional layer dimensions potentially speak to the “anatomical” needs of gCNNs with respect to radial coarse graining of molecular substructure. We augment this simplified architecture with saliency map technology that highlights molecular substructures relevant to activity, and we perform saliency analysis on nearly 100 data-rich protein targets. We show that resultant substructural clusters are useful visualization tools for understanding substructure-activity relationships. We go on to highlight connections between our models’ saliency predictions and observations made in the medicinal chemistry literature, focusing on four case studies of past lead finding and lead optimization campaigns.
Affiliation(s)
- Jeffrey K Weber
  - IBM Thomas J Watson Research Center, Yorktown Heights, NY, USA
- Sugato Bagchi
  - IBM Thomas J Watson Research Center, Yorktown Heights, NY, USA
- Seung-Gu Kang
  - IBM Thomas J Watson Research Center, Yorktown Heights, NY, USA
- Leili Zhang
  - IBM Thomas J Watson Research Center, Yorktown Heights, NY, USA
- Wendy D Cornell
  - IBM Thomas J Watson Research Center, Yorktown Heights, NY, USA
4
Spiers RC, Kalivas JH. Reliable Model Selection without Reference Values by Utilizing Model Diversity with Prediction Similarity. J Chem Inf Model 2021; 61:2220-2230. [PMID: 33900749 DOI: 10.1021/acs.jcim.0c01493] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Abstract
Predictive modeling (calibration or training) with various data formats, such as near-infrared (NIR) spectra and quantitative structure-activity relationship (QSAR) data, provides essential information if a proper model is selected. Similarly, with a general model selection approach, spectral model maintenance (updating) from original modeling conditions to new conditions can be performed for dynamic modeling. Fundamental modeling (partial least-squares (PLS) and others) and maintenance processes (domain adaptation or transfer learning and others) require selection of tuning parameter(s) values to isolate models that can accurately predict new samples or molecules, e.g., number of PLS latent variables to predict analyte concentration. Regardless of the modeling task, model selection is complex and without a reliable protocol. Tuning parameter selection typically depends on only one model quality measure assessing model bias using prediction accuracy. Developed in this paper is a generic model selection process using concepts from consensus modeling and QSAR activity landscapes. It is a consensus filtering approach that prioritizes model diversity (MD) while conserving prediction similarity (PS) fused with a common bias-variance trade-off measure. A significant feature of MDPS is that a cross-validation scheme is not needed because models are selected relative to predicting new samples or molecules, i.e., model selection uses unlabeled samples (without reference values) for active predictions. The versatility and reliability of MDPS model selection is shown using four NIR data sets and a QSAR data set. The study also substantiates the Rashomon effect where there is not one best model tuning parameter value that provides accurate predictions.
Affiliation(s)
- Robert C Spiers
  - Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
- John H Kalivas
  - Department of Chemistry, Idaho State University, Pocatello, Idaho 83209, United States
5
Maggiora G, Medina-Franco JL, Iqbal J, Vogt M, Bajorath J. From Qualitative to Quantitative Analysis of Activity and Property Landscapes. J Chem Inf Model 2020; 60:5873-5880. [PMID: 33205984 DOI: 10.1021/acs.jcim.0c01249] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
Activity or, more generally, property landscapes (PLs) have been considered as an attractive way to visualize and explore structure-property relationships (SPRs) contained in large data sets of chemical compounds. For graphical analysis, three-dimensional representations reminiscent of natural landscapes are particularly intuitive. So far, the use of such landscape models has essentially been confined to qualitative assessment. We describe recent efforts to analyze PLs in a more quantitative manner, which make it possible to calculate topographical similarity values for comparison of landscape models as a measure of relative SPR information content.
Affiliation(s)
- Gerald Maggiora
  - University of Arizona BIO5 Institute, 1657 East Helen Street, Tucson, Arizona 85721-0240, United States
- José L Medina-Franco
  - Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Javed Iqbal
  - Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
6
Iqbal J, Vogt M, Bajorath J. Quantitative Comparison of Three-Dimensional Activity Landscapes of Compound Data Sets Based upon Topological Features. ACS Omega 2020; 5:24111-24117. [PMID: 32984733 PMCID: PMC7513547 DOI: 10.1021/acsomega.0c03659] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/30/2020] [Accepted: 08/27/2020] [Indexed: 05/07/2023]
Abstract
Visualization of structure-activity relationships (SARs) in compound data sets substantially contributes to their systematic analysis. For SAR visualization, different types of activity landscape (AL) representations have been introduced. Three-dimensional (3D) AL models in which an activity hypersurface is constructed in chemical space are particularly intuitive because these 3D ALs are reminiscent of "true" (geographical) landscapes. Accordingly, the topologies of 3D AL representations can be immediately associated with different SAR characteristics of compound data sets. However, the comparison of 3D ALs has thus far been confined to visual inspection and qualitative analysis. We have focused on image analysis as a possible approach to facilitate a quantitative comparison of 3D ALs, which would further increase their utility for SAR exploration. Herein, we introduce a new computational methodology for quantifying topological relationships between 3D ALs. Images of color-coded 3D ALs were converted into top-down views of these ALs. From transformed images, different categories of shape features were systematically extracted, and multilevel shape correspondence was determined as a measure of AL similarity. This made it possible to differentiate between 3D ALs in quantitative terms.
Affiliation(s)
- Javed Iqbal
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
7
Iqbal J, Vogt M, Bajorath J. Computational Method for Quantitative Comparison of Activity Landscapes on the Basis of Image Data. Molecules 2020; 25:E3952. [PMID: 32872506 PMCID: PMC7504767 DOI: 10.3390/molecules25173952] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/20/2020] [Revised: 08/21/2020] [Accepted: 08/27/2020] [Indexed: 01/31/2023] Open
Abstract
Activity landscape (AL) models are used for visualizing and interpreting structure-activity relationships (SARs) in compound datasets. Therefore, ALs are designed to present chemical similarity and compound potency information in context. Different two- or three-dimensional (2D or 3D) AL representations have been introduced. For SAR analysis, 3D AL models are particularly intuitive. In these models, an interpolated potency surface is added as a third dimension to a 2D projection of chemical space. Accordingly, AL topology can be associated with characteristic SAR features. Going beyond visualization and a qualitative assessment of SARs, it would be very helpful to compare 3D ALs of different datasets in more quantitative terms. However, quantitative AL analysis is still in its infancy. Recently, it has been shown that 3D AL models with pre-defined topologies can be correctly classified using machine learning. Classification was facilitated on the basis of AL image feature representations learned with convolutional neural networks. Therefore, we have further investigated image analysis for quantitative comparison of 3D ALs and devised an approach to determine (dis)similarity relationships for ALs representing different compound datasets. Herein, we report this approach and demonstrate proof-of-principle. The methodology makes it possible to computationally compare 3D ALs and quantify topological differences reflecting varying SAR information content. For SAR exploration in drug design, this adds a quantitative measure of AL (dis)similarity to graphical analysis.
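The idea of adding an interpolated potency surface as a third dimension to a 2D projection of chemical space can be illustrated with a minimal sketch. The abstract does not state which interpolation scheme is used, so inverse distance weighting (IDW) is swapped in here purely as a simple stand-in. The function names `idw_potency` and `potency_surface`, the `power` parameter, the grid size, and the toy coordinates are all hypothetical.

```python
def idw_potency(x, y, points, power=2.0):
    """Interpolate potency at (x, y) by inverse distance weighting.

    points: list of (px, py, potency), where (px, py) are 2D
    chemical-space coordinates of a compound. IDW is an assumption
    made for illustration, not the method used in the paper.
    """
    num = den = 0.0
    for px, py, potency in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        if d2 == 0.0:
            return potency  # exactly on a data point
        weight = 1.0 / d2 ** (power / 2.0)
        num += weight * potency
        den += weight
    return num / den


def potency_surface(points, n=5):
    """Evaluate the interpolated surface on an n x n grid (n >= 2)
    spanning the bounding box of the projected compounds."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]

    def axis(lo, hi):
        return [lo + (hi - lo) * i / (n - 1) for i in range(n)]

    return [[idw_potency(x, y, points, 2.0) for x in axis(min(xs), max(xs))]
            for y in axis(min(ys), max(ys))]


# Toy 2D projection (hypothetical): (x, y, potency such as pKi)
toy_points = [(0.0, 0.0, 5.0), (1.0, 0.0, 9.0), (0.0, 1.0, 6.0)]
grid = potency_surface(toy_points, n=5)
for row in grid:
    print([round(v, 2) for v in row])
```

Each grid row holds surface heights along x at one fixed y value. Steep changes between neighbouring cells would correspond to rugged landscape regions, and flat stretches to smooth ones.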
Affiliation(s)
- Javed Iqbal
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
8
López-López E, Rabal O, Oyarzabal J, Medina-Franco JL. Towards the understanding of the activity of G9a inhibitors: an activity landscape and molecular modeling approach. J Comput Aided Mol Des 2020; 34:659-669. [PMID: 32060676 DOI: 10.1007/s10822-020-00298-x] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 08/06/2019] [Accepted: 02/07/2020] [Indexed: 11/26/2022]
Abstract
In this work, we analyze the structure-activity relationships (SAR) of epigenetic inhibitors (lysine mimetics) against lysine methyltransferase (G9a or EHMT2) using a combined activity landscape, molecular docking and molecular dynamics approach. The study was based on a set of 251 G9a inhibitors with reported experimental activity. The activity landscape analysis rapidly led to the identification of activity cliffs, scaffold hops and other active and inactive molecules with distinct SAR. Structure-based analysis of activity cliffs, scaffold hops and other selected active and inactive G9a inhibitors by means of docking followed by molecular dynamics simulations led to the identification of interactions with key residues involved in activity against G9a, for instance with ASP 1083, LEU 1086, ASP 1088, TYR 1154 and PHE 1158. The outcome of this work is expected to further advance the development of G9a inhibitors.
Affiliation(s)
- Edgar López-López
  - Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510, Mexico City, Mexico
- Obdulia Rabal
  - Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research, CIMA, University of Navarra, Pio XII, 55, 31008, Pamplona, Spain
- Julen Oyarzabal
  - Small Molecule Discovery Platform, Molecular Therapeutics Program, Center for Applied Medical Research, CIMA, University of Navarra, Pio XII, 55, 31008, Pamplona, Spain
- José L Medina-Franco
  - Department of Pharmacy, School of Chemistry, National Autonomous University of Mexico, 04510, Mexico City, Mexico
9
Predicting protein-ligand interactions based on bow-pharmacological space and Bayesian additive regression trees. Sci Rep 2019; 9:7703. [PMID: 31118426 PMCID: PMC6531441 DOI: 10.1038/s41598-019-43125-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 06/29/2017] [Accepted: 04/12/2019] [Indexed: 02/08/2023] Open
Abstract
Identifying potential protein-ligand interactions is central to the field of drug discovery as it facilitates the identification of potential novel drug leads, contributes to advancement from hits to leads, predicts potential off-target explanations for side effects of approved drugs or candidates, as well as de-orphans phenotypic hits. For the rapid identification of protein-ligand interactions, we here present a novel chemogenomics algorithm for the prediction of protein-ligand interactions using a new machine learning approach and novel class of descriptor. The algorithm applies Bayesian Additive Regression Trees (BART) on a newly proposed proteochemical space, termed the bow-pharmacological space. The space spans three distinctive sub-spaces that cover the protein space, the ligand space, and the interaction space. Thereby, the model extends the scope of classical target prediction or chemogenomic modelling that relies on one or two of these subspaces. Our model demonstrated excellent prediction power, reaching accuracies of up to 94.5–98.4% when evaluated on four human target datasets constituting enzymes, nuclear receptors, ion channels, and G-protein-coupled receptors. BART provided a reliable probabilistic description of the likelihood of interaction between proteins and ligands, which can be used in the prioritization of assays to be performed in both discovery and vigilance phases of small molecule development.
10
Activity Landscape and Molecular Modeling to Explore the SAR of Dual Epigenetic Inhibitors: A Focus on G9a and DNMT1. Molecules 2018; 23:molecules23123282. [PMID: 30544967 PMCID: PMC6321328 DOI: 10.3390/molecules23123282] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 11/24/2018] [Revised: 12/09/2018] [Accepted: 12/10/2018] [Indexed: 11/17/2022] Open
Abstract
In this work we discuss the insights from activity landscape, docking and molecular dynamics towards the understanding of the structure-activity relationships of dual inhibitors of major epigenetic targets: lysine methyltransferase (G9a) and DNA methyltransferase 1 (DNMT1). The study was based on a novel data set of 50 published compounds with reported experimental activity for both targets. The activity landscape analysis revealed the presence of activity cliffs, i.e., pairs of compounds with high structure similarity but large activity differences. Activity cliffs were further rationalized at the molecular level by means of molecular docking and dynamics simulations that led to the identification of interactions with key residues involved in the dual activity or selectivity with the epigenetic targets.
11
Miyao T, Funatsu K, Bajorath J. Three-Dimensional Activity Landscape Models of Different Design and Their Application to Compound Mapping and Potency Prediction. J Chem Inf Model 2018; 59:993-1004. [DOI: 10.1021/acs.jcim.8b00661] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/13/2022]
Affiliation(s)
- Tomoyuki Miyao
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Kimito Funatsu
  - Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
  - Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Jürgen Bajorath
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
12
Abstract
Introduction: Activity landscapes (ALs) are representations and models of compound data sets annotated with a target-specific activity. In contrast to quantitative structure-activity relationship (QSAR) models, ALs aim at characterizing structure-activity relationships (SARs) on a large-scale level encompassing all active compounds for specific targets. The popularity of AL modeling has grown substantially with the public availability of large activity-annotated compound data sets. AL modeling crucially depends on molecular representations and similarity metrics used to assess structural similarity. Areas covered: The concepts of AL modeling are introduced and its basis in quantitatively assessing molecular similarity is discussed. The different types of AL modeling approaches are introduced. AL designs can broadly be divided into three categories: compound-pair based, dimensionality reduction, and network approaches. Recent developments for each of these categories are discussed focusing on the application of mathematical, statistical, and machine learning tools for AL modeling. AL modeling using chemical space networks is covered in more detail. Expert opinion: AL modeling has remained a largely descriptive approach for the analysis of SARs. Beyond mere visualization, the application of analytical tools from statistics, machine learning and network theory has aided in the sophistication of AL designs and provides a step forward in transforming ALs from descriptive to predictive tools. To this end, optimizing representations that encode activity relevant features of molecules might prove to be a crucial step.
Affiliation(s)
- Martin Vogt
  - Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
13
Saldívar-González FI, Naveja JJ, Palomino-Hernández O, Medina-Franco JL. Getting SMARt in drug discovery: chemoinformatics approaches for mining structure–multiple activity relationships. RSC Adv 2017. [DOI: 10.1039/c6ra26230a] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/24/2022] Open
Abstract
In light of the high relevance of polypharmacology, multi-target screening is a major trend in drug discovery.
Affiliation(s)
- Fernanda I. Saldívar-González
  - Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510
- J. Jesús Naveja
  - Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510
- Oscar Palomino-Hernández
  - Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510
- José L. Medina-Franco
  - Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510
14
García-Sánchez MO, Cruz-Monteagudo M, Medina-Franco JL. Quantitative Structure-Epigenetic Activity Relationships. Challenges and Advances in Computational Chemistry and Physics 2017. [DOI: 10.1007/978-3-319-56850-8_8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/14/2022]
15
Activity and property landscape modeling is at the interface of chemoinformatics and medicinal chemistry. Future Med Chem 2016; 7:1197-211. [PMID: 26132526 DOI: 10.4155/fmc.15.51] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Indexed: 01/07/2023] Open
Abstract
Property landscape modeling (PLM) methods are at the interface of experimental sciences and computational chemistry. PLM methods are becoming a common strategy to systematically describe structure-property relationships of datasets. Thus far, PLM has been used mainly in medicinal chemistry and drug discovery. Herein, we survey advances on key topics in PLM, with emphasis on questions often raised regarding the outcomes of property landscape studies. We also emphasize concepts of PLM that are being extended to other experimental areas beyond drug discovery. Topics discussed in this paper include applications of PLM to further characterize protein-ligand interactions, the utility of PLM as a quantitative and descriptive approach, and the statistical validation of property cliffs.
16
Naveja JJ, Medina-Franco JL. Activity landscape of DNA methyltransferase inhibitors bridges chemoinformatics with epigenetic drug discovery. Expert Opin Drug Discov 2015; 10:1059-70. [DOI: 10.1517/17460441.2015.1073257] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/11/2023]
17
Pérez-Villanueva J, Méndez-Lucio O, Soria-Arteche O, Medina-Franco JL. Activity cliffs and activity cliff generators based on chemotype-related activity landscapes. Mol Divers 2015; 19:1021-35. [PMID: 26150300 DOI: 10.1007/s11030-015-9609-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 04/15/2015] [Accepted: 06/24/2015] [Indexed: 12/26/2022]
Abstract
Activity cliffs have a large impact in drug discovery; therefore, their detection and quantification are of major importance. This work introduces the metric activity cliff enrichment factor and expands the previously reported activity cliff generator concept by adding chemotype information to representations of the activity landscape. To exemplify these concepts, three molecular databases with multiple biological activities were characterized. Compounds in each database were grouped into chemotype classes. Then, pairwise comparisons of structure similarities and activity differences were calculated for each compound and used to construct chemotype-based structure-activity similarity (SAS) maps. Different landscape distributions among four major regions of the SAS maps were observed for different subsets of molecules grouped in chemotypes. Based on this observation, the activity cliff enrichment factor was calculated to numerically detect chemotypes enriched in activity cliffs. Several chemotype classes were detected with a higher proportion of activity cliffs than the entire database. In addition, some chemotype classes comprising compounds with smooth structure-activity relationships (SAR) were detected. Finally, the activity cliff generator concept was applied to compounds grouped in chemotypes to extract valuable SAR information.
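The pairwise comparison behind an SAS map can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: fingerprints are represented as plain sets of on-bit indices, similarity is Tanimoto, and the two cutoffs (`sim_cut`, `act_cut`) are arbitrary illustrative values. The four region labels follow names commonly used in the SAS-map literature but are not taken from this abstract. All function names and the toy data are hypothetical.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(fp_a | fp_b)
    return len(fp_a & fp_b) / union if union else 0.0


def sas_region(similarity, activity_diff, sim_cut=0.7, act_cut=1.0):
    """Assign a compound pair to one of four SAS-map regions.

    Both cutoffs are illustrative assumptions, not values from the paper.
    """
    high_sim = similarity >= sim_cut
    high_diff = activity_diff >= act_cut
    if high_sim and high_diff:
        return "activity cliff"   # similar structures, very different activity
    if high_sim:
        return "smooth SAR"       # similar structures, similar activity
    if high_diff:
        return "nondescript"      # dissimilar structures, different activity
    return "scaffold hop"         # dissimilar structures, similar activity


def sas_map(compounds, sim_cut=0.7, act_cut=1.0):
    """Return {(id_a, id_b): (similarity, activity_diff, region)} for all pairs."""
    result = {}
    for (id_a, (fp_a, act_a)), (id_b, (fp_b, act_b)) in combinations(compounds.items(), 2):
        sim = tanimoto(fp_a, fp_b)
        diff = abs(act_a - act_b)
        result[(id_a, id_b)] = (sim, diff, sas_region(sim, diff, sim_cut, act_cut))
    return result


# Toy data (hypothetical): id -> (on-bit set, activity such as pIC50)
compounds = {
    "A": ({1, 2, 3, 4, 5, 6, 7, 8}, 8.0),
    "B": ({1, 2, 3, 4, 5, 6, 7, 9}, 6.5),
    "C": ({1, 2, 3, 4, 5, 6, 7, 8, 10}, 7.8),
    "D": ({20, 21, 22, 23}, 7.9),
}
pairs = sas_map(compounds)
for pair, (sim, diff, region) in pairs.items():
    print(pair, round(sim, 2), round(diff, 2), region)
```

Restricting `compounds` to the members of one chemotype class before calling `sas_map` would give a chemotype-based map of the kind the abstract describes. The fraction of pairs labelled "activity cliff" in that subset could then be compared with the fraction in the full database.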
Affiliation(s)
- Jaime Pérez-Villanueva
  - División de Ciencias Biológicas y de la Salud, Departamento de Sistemas Biológicos, Universidad Autónoma Metropolitana Unidad Xochimilco (UAM-X), 04960, Mexico, DF, Mexico
- Oscar Méndez-Lucio
  - Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México (UNAM), 04510, Mexico, DF, Mexico
  - Unilever Centre for Molecular Science Informatics Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge, CB2 1EW, UK
- Olivia Soria-Arteche
  - División de Ciencias Biológicas y de la Salud, Departamento de Sistemas Biológicos, Universidad Autónoma Metropolitana Unidad Xochimilco (UAM-X), 04960, Mexico, DF, Mexico
- José L Medina-Franco
  - Departamento de Farmacia, Facultad de Química, Universidad Nacional Autónoma de México (UNAM), 04510, Mexico, DF, Mexico
18
Méndez-Lucio O, Kooistra AJ, de Graaf C, Bender A, Medina-Franco JL. Analyzing Multitarget Activity Landscapes Using Protein–Ligand Interaction Fingerprints: Interaction Cliffs. J Chem Inf Model 2015; 55:251-62. [DOI: 10.1021/ci500721x] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Affiliation(s)
- Oscar Méndez-Lucio
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Albert J. Kooistra
  - Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Chris de Graaf
  - Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- José L. Medina-Franco
  - Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
19
|
Naveja JJ, Medina-Franco JL. Activity landscape sweeping: insights into the mechanism of inhibition and optimization of DNMT1 inhibitors. RSC Adv 2015. [DOI: 10.1039/c5ra12339a] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 11/21/2022] Open
Abstract
Inhibitors of DNA methyltransferases have distinct structure–activity relationships as revealed by the activity landscape sweeping study discussed in this work.
Affiliation(s)
- J. Jesús Naveja
- Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, México, México
- José L. Medina-Franco
- Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, México, México
|
20
|
Cruz-Monteagudo M, Medina-Franco JL, Pérez-Castillo Y, Nicolotti O, Cordeiro MND, Borges F. Activity cliffs in drug discovery: Dr Jekyll or Mr Hyde? Drug Discov Today 2014; 19:1069-80. [DOI: 10.1016/j.drudis.2014.02.003] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.7] [Received: 11/22/2013] [Revised: 01/23/2014] [Accepted: 02/10/2014] [Indexed: 10/25/2022]
|
21
|
Rojas-Aguirre Y, Medina-Franco JL. Analysis of structure-Caco-2 permeability relationships using a property landscape approach. Mol Divers 2014; 18:599-610. [DOI: 10.1007/s11030-014-9514-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/03/2013] [Accepted: 02/28/2014] [Indexed: 12/14/2022]
|
22
|
Guha R, Medina-Franco JL. On the validity versus utility of activity landscapes: are all activity cliffs statistically significant? J Cheminform 2014; 6:11. [PMID: 24694189 PMCID: PMC4021161 DOI: 10.1186/1758-2946-6-11] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/25/2013] [Accepted: 03/25/2014] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND: Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs, investigating the predictive abilities of activity landscape methods and so on. However, all these publications have worked under the assumption that the activity landscape models are "real" (i.e., statistically significant). RESULTS: The current study addresses for the first time, in a quantitative manner, the significance of a landscape or individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations. CONCLUSIONS: We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be a) used to interpret the SAR or b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models and, in particular, activity cliffs is key prior to the use of such models.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
- José L Medina-Franco
- Circuito Exterior, Instituto de Química, Universidad Nacional Autónoma de México, Ciudad Universitaria, México D.F. 04510, Mexico; Current address: Mayo Clinic, 13400 East Shea Boulevard, Scottsdale, AZ 85259, USA
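The question raised in the abstract above, whether an observed landscape differs from a randomly generated one, can be sketched as a permutation test. SALI for a pair is the absolute activity difference divided by one minus the structural similarity. The randomization scheme (shuffling activities over a fixed similarity matrix), the test statistic (maximum SALI), and all function names are assumptions made for illustration; the abstract does not describe the exact procedure used in the study.

```python
import random


def sali(act_i, act_j, sim):
    """Structure-activity landscape index for one pair (requires sim < 1)."""
    return abs(act_i - act_j) / (1.0 - sim)


def max_sali(activities, sim):
    """Largest SALI over all pairs; pairs with identical structures are skipped."""
    best = 0.0
    n = len(activities)
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] < 1.0:
                best = max(best, sali(activities[i], activities[j], sim[i][j]))
    return best


def permutation_p_value(activities, sim, n_perm=1000, seed=0):
    """Assumed test: shuffle activities and compare max SALI with the observed value."""
    rng = random.Random(seed)
    observed = max_sali(activities, sim)
    shuffled = list(activities)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_sali(shuffled, sim) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


activities = [5.0, 7.0, 5.1, 6.0]
similarity = [
    [1.0, 0.9, 0.3, 0.4],
    [0.9, 1.0, 0.2, 0.5],
    [0.3, 0.2, 1.0, 0.6],
    [0.4, 0.5, 0.6, 1.0],
]
print(permutation_p_value(activities, similarity, n_perm=200, seed=1))
```

A small p-value would suggest that the steepest cliff is unlikely to arise from a random assignment of activities under this particular representation.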
|
23
|
Medina-Franco JL, Méndez-Lucio O, Martinez-Mayorga K. The Interplay Between Molecular Modeling and Chemoinformatics to Characterize Protein–Ligand and Protein–Protein Interactions Landscapes for Drug Discovery. Adv Protein Chem Struct Biol 2014; 96:1-37. [DOI: 10.1016/bs.apcsb.2014.06.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
|
24
|
Santos RG, Giulianotti MA, Houghten RA, Medina-Franco JL. Conditional probabilistic analysis for prediction of the activity landscape and relative compound activities. J Chem Inf Model 2013; 53:2613-25. [PMID: 23971977 PMCID: PMC3850180 DOI: 10.1021/ci400243e] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/28/2022]
Abstract
Structure-property relationships and structure-activity relationships play an important role in many research areas, such as medicinal chemistry and drug discovery. Such methods, however, have focused on providing post-hoc descriptions of such relationships based on known data. The ability of these descriptions to remain relevant when considering compounds of unknown activity, and thus the prediction of activity and property landscapes using existing data, remains little explored. In this study, we present a novel method of evaluating the ability of a compound comparison methodology to provide accurate information about a set of unknown compounds and also explore the ability of these predicted activity landscapes to prioritize active compounds over inactive ones. These methods are applied to three distinct and diverse sets of compounds, each with activity data for multiple targets, for a total of eight target-compound set pairs. Six methodologically distinct compound comparison methods were evaluated. We show that overall, all compound comparison methods provided an improvement in structure-activity relationship prediction over random and were able to prioritize compounds in a superior manner to random sampling, but the degree of success and therefore applicability varied markedly.
Affiliation(s)
- Radleigh G. Santos
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987
- Marc A. Giulianotti
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987
- Richard A. Houghten
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987
- José L. Medina-Franco
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987
|
25
|
Yongye AB, Medina-Franco JL. Toward an Efficient Approach to Identify Molecular Scaffolds Possessing Selective or Promiscuous Compounds. Chem Biol Drug Des 2013; 82:367-75. [DOI: 10.1111/cbdd.12162] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 12/19/2012] [Revised: 04/12/2013] [Accepted: 04/17/2013] [Indexed: 01/09/2023]
Affiliation(s)
- Austin B. Yongye
- Torrey Pines Institute for Molecular Studies; 11350 SW Village Parkway Port St. Lucie FL 34987 USA
- José L. Medina-Franco
- Torrey Pines Institute for Molecular Studies; 11350 SW Village Parkway Port St. Lucie FL 34987 USA
- Instituto de Química; Universidad Nacional Autónoma de México; Circuito Exterior; Ciudad Universitaria; México D.F 04510 Mexico
|
26
|
Abstract
The analysis of structure–activity relationships (SARs) is a central task in medicinal chemistry. Traditionally, SAR exploration has concentrated on individual compound series. This conventional approach is complemented by large-scale SAR analysis, which puts strong emphasis on data mining and SAR visualization. This contribution reviews recent concepts for large-scale SAR analysis including numerical functions to characterize global and local SAR information content of compound data sets, alternative activity landscape representations and data mining strategies.
|
27
|
Vogt M, Iyer P, Maggiora GM, Bajorath J. Conditional Probabilities of Activity Landscape Features for Individual Compounds. J Chem Inf Model 2013; 53:1602-12. [DOI: 10.1021/ci400288r] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Preeti Iyer
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Gerald M. Maggiora
- College of Pharmacy & BIO5 Institute, University of Arizona, Translational Genomics Research Institute, 1295 North Martin, P.O. Box 210202, Tucson, Arizona 85721, United States, and 445 North Fifth Street, Phoenix, Arizona 85004, United States
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
28
|
Medina-Franco JL, Edwards BS, Pinilla C, Appel JR, Giulianotti MA, Santos RG, Yongye AB, Sklar LA, Houghten RA. Rapid scanning structure-activity relationships in combinatorial data sets: identification of activity switches. J Chem Inf Model 2013; 53:1475-85. [PMID: 23705689 DOI: 10.1021/ci400192y] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/11/2023]
Abstract
We present a general approach to describe the structure-activity relationships (SAR) of combinatorial data sets with activity for two biological endpoints with emphasis on the rapid identification of substitutions that have a large impact on activity and selectivity. The approach uses dual-activity difference (DAD) maps that represent a visual and quantitative analysis of all pairwise comparisons of one, two, or more substitutions around a molecular template. Scanning the SAR of data sets using DAD maps allows the visual and quantitative identification of activity switches defined as specific substitutions that have an opposite effect on the activity of the compounds against two targets. The approach also rapidly identifies single- and double-target R-cliffs, i.e., compounds where a single or double substitution around the central scaffold dramatically modifies the activity for one or two targets, respectively. The approach introduced in this report can be applied to any analogue series with two biological activity endpoints. To illustrate the approach, we discuss the SAR of 106 pyrrolidine bis-diketopiperazines tested against two formylpeptide receptors obtained from positional scanning deconvolution methods of mixture-based libraries.
Affiliation(s)
- José L Medina-Franco
- Torrey Pines Institute for Molecular Studies, Port St. Lucie, Florida 34987, USA.
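The dual-activity difference (DAD) map described in the abstract above compares, for every compound pair, the potency difference against each of two targets. The sketch below assumes potencies are given as log-scale values per target. The 1.0 log-unit cutoff, the category labels, and the function names are hypothetical choices for illustration, not the thresholds or terminology of the cited study.

```python
from itertools import combinations


def classify_pair(d1, d2, cut=1.0):
    """Label a pair from its potency differences against target 1 and target 2."""
    big1 = abs(d1) >= cut
    big2 = abs(d2) >= cut
    if big1 and big2:
        # Opposite signs: the change helps one target and hurts the other.
        return "activity_switch" if d1 * d2 < 0 else "dual_target_cliff"
    if big1 or big2:
        return "single_target_cliff"
    return "similar_activity"


def dad_map(potencies, cut=1.0):
    """Return (a, b, d_target1, d_target2, label) for every compound pair.

    potencies: dict mapping compound name -> (potency_target1, potency_target2).
    """
    rows = []
    for a, b in combinations(sorted(potencies), 2):
        d1 = potencies[a][0] - potencies[b][0]
        d2 = potencies[a][1] - potencies[b][1]
        rows.append((a, b, d1, d2, classify_pair(d1, d2, cut)))
    return rows


# Hypothetical potencies for three analogues.
example = {"A": (8.0, 5.0), "B": (6.0, 7.0), "C": (6.2, 6.8)}
for row in dad_map(example):
    print(row)
```

Plotting the two differences against each other for all pairs would give the map itself; the labels here only summarize which region each pair falls in.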
|
29
|
Stumpfe D, Dimova D, Heikamp K, Bajorath J. Compound Pathway Model To Capture SAR Progression: Comparison of Activity Cliff-Dependent and -Independent Pathways. J Chem Inf Model 2013; 53:1067-72. [DOI: 10.1021/ci400141w] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/20/2022]
Affiliation(s)
- Dagmar Stumpfe
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Dilyana Dimova
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Kathrin Heikamp
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
|
30
|
Yongye AB, Medina-Franco JL. Systematic characterization of structure-activity relationships and ADMET compliance: a case study. Drug Discov Today 2013; 18:732-9. [PMID: 23583765 DOI: 10.1016/j.drudis.2013.04.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 12/24/2012] [Revised: 03/18/2013] [Accepted: 04/04/2013] [Indexed: 01/29/2023]
Abstract
Traditionally, activity landscape modeling has been focused on analyzing SAR, despite the fact that lead optimization in drug discovery involves concurrent enhancements of activity and ADMET properties of leads. As a case study, we discuss the systematic analysis of activity landscapes, incorporating ADMET considerations, using a dataset of 166 compounds screened for kappa-opioid receptor activity. Pairwise MACCS/Tanimoto structure similarities, property similarities utilizing 33 ADMET descriptors and a 35-dimensional 'violation bit vector' representing drug-likeness are analyzed. We address the question about the range of ADMET property violations that arise from structural changes, subtle and significant. Pairs of compounds are identified bearing identical, comparable and significantly different drug-likeness in the three informative regions of structure-activity landscapes.
Affiliation(s)
- Austin B Yongye
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, FL 34987, USA.
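The pairwise fingerprint comparison mentioned in the abstract above rests on the Tanimoto coefficient, the number of shared on-bits divided by the number of bits set in either fingerprint. The sketch below takes fingerprints as sets of on-bit indices and does not generate MACCS keys, which would require a cheminformatics toolkit. Comparing two violation bit vectors by counting differing positions is an assumption for illustration; the abstract does not state how the 35-dimensional vectors were compared.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient for two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def violation_difference(vec_a, vec_b):
    """Number of positions where two equal-length violation bit vectors differ."""
    if len(vec_a) != len(vec_b):
        raise ValueError("bit vectors must have the same length")
    return sum(x != y for x, y in zip(vec_a, vec_b))


# Hypothetical on-bit indices and short violation vectors for two compounds.
fp_1 = {1, 2, 3}
fp_2 = {2, 3, 4}
print(tanimoto(fp_1, fp_2))
print(violation_difference([0, 1, 0], [1, 1, 0]))
```

A pair with a high Tanimoto value but a large violation difference would correspond to a small structural change that shifts drug-likeness, which is the kind of case the abstract sets out to examine.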
|
31
|
Medina-Franco JL, Yoo J. Docking of a novel DNA methyltransferase inhibitor identified from high-throughput screening: insights to unveil inhibitors in chemical databases. Mol Divers 2013; 17:337-44. [PMID: 23447100 DOI: 10.1007/s11030-013-9428-z] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 11/18/2012] [Accepted: 02/07/2013] [Indexed: 12/21/2022]
Abstract
Inhibitors of DNA methyltransferase (DNMT) are attractive compounds not only as potential therapeutic agents for the treatment of cancer and other diseases, but also as research tools to investigate the role of DNMTs in epigenetic events. Recent advances in high-throughput screening (HTS) for epigenetic targets and the availability of the first crystallographic structure of human DNMT1 encourage the integration of research strategies to uncover and optimize the activity of DNMT inhibitors. Herein, we present a binding model of a novel small-molecule DNMT1 inhibitor obtained by HTS, recently released in a public database. The docking model is in agreement with key interactions previously identified for established inhibitors using extensive computational studies including molecular dynamics and structure-based pharmacophore modeling. Based on the chemical structure of the novel inhibitor, a sequential computational screening of five chemical databases was performed to identify candidate compounds for testing. Similarity searching followed by molecular docking of chemical databases such as approved drugs, natural products, a DNMT-focused library, and a general screening collection, identified at least 108 molecules with promising DNMT inhibitory activity. The chemical structures of all hit compounds are disclosed to encourage the research community working on epigenetics to test experimentally the enzymatic and demethylating activity in vivo. Five candidate hits are drugs approved for other indications and represent potential starting points of a drug repurposing strategy.
Affiliation(s)
- José L Medina-Franco
- Instituto de Química, Universidad Nacional Autónoma de México, Circuito Exterior, Ciudad Universitaria, 04510 México, D.F., Mexico.
|
32
|
Pérez-Villanueva J, Méndez-Lucio O, Soria-Arteche O, Izquierdo T, Concepción Lozada M, Gloria-Greimel WA, Medina-Franco JL. Cyclic Systems Distribution Along Similarity Measures: Insights for an Application to Activity Landscape Modeling. Mol Inform 2013; 32:179-90. [DOI: 10.1002/minf.201200127] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/22/2012] [Accepted: 12/21/2012] [Indexed: 12/12/2022]
|
33
|
Abstract
Understanding structure-activity relationships (SARs) for a given set of molecules allows one to rationally explore chemical space and develop a chemical series optimizing multiple physicochemical and biological properties simultaneously, for instance, improving potency, reducing toxicity, and ensuring sufficient bioavailability. In silico methods allow rapid and efficient characterization of SARs and facilitate building a variety of models to capture and encode one or more SARs, which can then be used to predict activities for new molecules. By coupling these methods with in silico modifications of structures, one can easily prioritize large screening decks or even generate new compounds de novo and ascertain whether they belong to the SAR being studied. Computational methods can provide a guide for the experienced user by integrating and summarizing large amounts of preexisting data to suggest useful structural modifications. This chapter highlights the different types of SAR modeling methods and how they support the task of exploring chemical space to elucidate and optimize SARs in a drug discovery setting. In addition to considering modeling algorithms, I briefly discuss how to use databases as a source of SAR data to inform and enhance the exploration of SAR trends. I also review common modeling techniques that are used to encode SARs, recent work in the area of structure-activity landscapes, the role of SAR databases, and alternative approaches to exploring SAR data that do not involve explicit model development.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Advancing Translational Science, Rockville, MD, USA
|
34
|
Medina-Franco JL, Yongye AB, Pérez-Villanueva J, Houghten RA, Martínez-Mayorga K. Activity-difference maps and consensus similarity measure characterize structure-activity relationships. J Cheminform 2012. [PMCID: PMC3341234 DOI: 10.1186/1758-2946-4-s1-p24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022] Open
|
35
|
Medina-Franco JL, Martínez-Mayorga K, Peppard TL, Del Rio A. Chemoinformatic analysis of GRAS (Generally Recognized as Safe) flavor chemicals and natural products. PLoS One 2012; 7:e50798. [PMID: 23226386 PMCID: PMC3511266 DOI: 10.1371/journal.pone.0050798] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Received: 09/28/2012] [Accepted: 10/24/2012] [Indexed: 12/15/2022] Open
Abstract
Food materials designated as "Generally Recognized as Safe" (GRAS) are attracting the attention of researchers in their attempts to systematically identify compounds with putative health-related benefits. In particular, there is currently a great deal of interest in exploring possible secondary benefits of flavor ingredients, such as those relating to health and wellness. One step in this direction is the comprehensive characterization of the chemical structures contained in databases of flavoring substances. Herein, we report a comprehensive analysis of the recently updated FEMA GRAS list of flavoring substances (discrete chemical entities only). Databases of natural products, approved drugs and a large set of commercial molecules were used as references. Remarkably, natural products continue to be an important source of bioactive compounds for drug discovery and nutraceutical purposes. The comparison of five collections of compounds of interest was performed using molecular properties, rings, atom counts and structural fingerprints. It was found that the molecular size of the GRAS flavoring substances is, in general, smaller than that of members of the other databases analyzed. The lipophilicity profile of the GRAS database, a key property to predict human bioavailability, is similar to that of approved drugs. Several GRAS chemicals overlap with a broad region of the property space occupied by drugs. The GRAS list analyzed in this work has high structural diversity, comparable to approved drugs, natural products and libraries of screening compounds. This study represents one step towards the use of the distinctive features of the flavoring chemicals contained in the GRAS list and natural products to systematically search for compounds with potential health-related benefits.
Affiliation(s)
- José L Medina-Franco
- Torrey Pines Institute for Molecular Studies, Port St. Lucie, Florida, United States of America.
|
36
|
Méndez-Lucio O, Pérez-Villanueva J, Castillo R, Medina-Franco JL. Identifying Activity Cliff Generators of PPAR Ligands Using SAS Maps. Mol Inform 2012; 31:837-46. [DOI: 10.1002/minf.201200078] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 07/21/2012] [Accepted: 10/06/2012] [Indexed: 01/27/2023]
|
37
|
Medina-Franco JL. Interrogating Novel Areas of Chemical Space for Drug Discovery using Chemoinformatics. Drug Dev Res 2012. [DOI: 10.1002/ddr.21034] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Indexed: 12/16/2022]
|
38
|
Medina-Franco JL. Scanning structure-activity relationships with structure-activity similarity and related maps: from consensus activity cliffs to selectivity switches. J Chem Inf Model 2012; 52:2485-93. [PMID: 22989212 DOI: 10.1021/ci300362x] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 01/18/2023]
Abstract
Systematic description of structure-activity relationships (SARs) of data sets and structure-property relationships (SPRs) is of paramount importance in medicinal chemistry and other research fields. To this end, structure-activity similarity (SAS) maps are one of the first tools proposed to describe SARs using the concept of activity landscape modeling. One of the major goals of the SAS maps is to identify activity cliffs, defined as chemical compounds with highly similar structures but unexpectedly very different biological activity. Since the first publication of the SAS maps more than ten years ago, these tools have evolved and adapted over the years to analyze various types of compound collections, including structurally diverse and combinatorial sets with activity for one or multiple biological end points. The development of SAS maps has led to general concepts that are applicable to other activity landscape methods such as "consensus activity cliffs" (activity cliffs common to a series of representations or descriptors) and "selectivity switches" (structural changes that completely invert the selectivity pattern of similar compounds against two biological end points). Herein, we review the development, practical applications, limitations, and perspectives of the SAS and related maps, which are intuitive and powerful informatics tools to computationally analyze SPRs.
Affiliation(s)
- José L Medina-Franco
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987, USA.
|
39
|
Waddell J, Medina-Franco JL. Bioactivity landscape modeling: Chemoinformatic characterization of structure–activity relationships of compounds tested across multiple targets. Bioorg Med Chem 2012; 20:5443-52. [DOI: 10.1016/j.bmc.2011.11.051] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 10/20/2011] [Revised: 11/01/2011] [Accepted: 11/23/2011] [Indexed: 12/14/2022]
|
40
|
Vogt M, Bajorath J. Chemoinformatics: A view of the field and current trends in method development. Bioorg Med Chem 2012; 20:5317-23. [DOI: 10.1016/j.bmc.2012.03.030] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 01/23/2012] [Revised: 03/09/2012] [Accepted: 03/12/2012] [Indexed: 12/18/2022]
|
41
|
Yongye AB, Medina-Franco JL. Data mining of protein-binding profiling data identifies structural modifications that distinguish selective and promiscuous compounds. J Chem Inf Model 2012; 52:2454-61. [PMID: 22856455 DOI: 10.1021/ci3002606] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Activity profiling of compound collections across multiple targets is increasingly being used in probe and drug discovery. Herein, we discuss an approach to systematically analyzing the structure-activity relationships of large screening profile data with emphasis on identifying structural changes that have a significant impact on the number of proteins to which a compound binds. As a case study, we analyzed a recently released public data set of more than 15 000 compounds screened across 100 sequence-unrelated proteins. The screened compounds have different origins and include natural products, synthetic molecules from academic groups, and commercial compounds. Similar synthetic structures from academic groups showed, overall, greater promiscuity differences than did natural products and commercial compounds. The method implemented in this work readily identified structural changes that differentiated highly specific from promiscuous compounds. This approach is general and can be applied to analyze any other large-scale protein-binding profile data.
Affiliation(s)
- Austin B Yongye
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987, USA
|
42
|
Activity landscape modeling of PPAR ligands with dual-activity difference maps. Bioorg Med Chem 2012; 20:3523-32. [DOI: 10.1016/j.bmc.2012.04.005] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 02/15/2012] [Revised: 03/27/2012] [Accepted: 04/04/2012] [Indexed: 01/19/2023]
|
43
|
López-Vallejo F, Martínez-Mayorga K. Furin inhibitors: importance of the positive formal charge and beyond. Bioorg Med Chem 2012; 20:4462-71. [PMID: 22682919 DOI: 10.1016/j.bmc.2012.05.029] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 03/10/2012] [Revised: 05/03/2012] [Accepted: 05/12/2012] [Indexed: 02/02/2023]
Abstract
Furin is the prototype member of the proprotein convertases superfamily. Proprotein convertases are associated with hormonal response, neural degeneration, viral and bacterial activation, and cancer. Several studies over the last decade have examined small molecules, natural products, peptides and peptide derivatives as furin inhibitors. Currently, subnanomolar inhibition of furin is possible. Herein, we report the analysis of 115 furin inhibitors reported in the literature. Analysis of the physicochemical properties of these compounds highlights the dependence of the inhibitory potency on the total formal charge and also shows how the most potent (peptide-based) furin inhibitors have physicochemical properties similar to drugs. In addition, we report docking studies of 26 furin inhibitors using Glide XP. Inspection of binding interactions shows that the two putative binding modes derived from our study are reasonable. Analysis of the binding modes and protein-ligand interaction fingerprints, used here as a postdocking procedure, shows that electrostatic interactions predominate in the S1, S2 and S4 subsites but are seldom observed in S3. Our models also show that the benzimidamide group, present in the most active inhibitors, can be accommodated in the S1 subsite. These results are valuable for the design of new furin inhibitors.
Affiliation(s)
- Fabian López-Vallejo
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, FL 34987, USA
|
44
|
Gupta-Ostermann D, Hu Y, Bajorath J. Introducing the LASSO Graph for Compound Data Set Representation and Structure–Activity Relationship Analysis. J Med Chem 2012; 55:5546-53. [DOI: 10.1021/jm3004762] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/29/2022]
Affiliation(s)
- Disha Gupta-Ostermann
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Ye Hu
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|
45
|
A large scale classification of molecular fingerprints for the chemical space representation and SAR analysis. J Cheminform 2012. [PMCID: PMC3341238 DOI: 10.1186/1758-2946-4-s1-p26] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/10/2022] Open
|
46
|
Guha R. Exploring Structure-Activity Data Using the Landscape Paradigm. Wiley Interdiscip Rev Comput Mol Sci 2012; 2. [PMID: 24163705 DOI: 10.1002/wcms.1087] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Indexed: 12/15/2022]
Abstract
In this article we present an overview of the origin and applications of the activity landscape view of structure-activity relationship (SAR) data as conceived by Maggiora. Within this landscape, different regions exemplify different aspects of SAR trends, ranging from smoothly varying trends to discontinuous trends (also termed activity cliffs). We discuss the various definitions of landscapes and cliffs that have been proposed as well as different approaches to the numerical quantification of a landscape. We then highlight some of the landscape visualization approaches that have been developed, followed by a review of the various applications of activity landscapes and cliffs to topics in medicinal chemistry and SAR analysis.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Translational Therapeutics 9800 Medical Center Drive Rockville, MD 20850
|
47
|
López-Vallejo F, Giulianotti MA, Houghten RA, Medina-Franco JL. Expanding the medicinally relevant chemical space with compound libraries. Drug Discov Today 2012; 17:718-26. [PMID: 22515962 DOI: 10.1016/j.drudis.2012.04.001] [Citation(s) in RCA: 90] [Impact Index Per Article: 7.5] [Received: 01/31/2012] [Revised: 03/01/2012] [Accepted: 04/02/2012] [Indexed: 02/04/2023]
Abstract
Analysis of marketed drugs and commercial vendor libraries used in high-throughput screening suggests that the medicinally relevant chemical space may be expanded to unexplored regions. Novel regions of the chemical space can be conveniently explored with structurally unique molecules with increased complexity and balanced physicochemical properties. As a case study, we discuss the chemoinformatic profile of natural products in the Traditional Chinese Medicine (TCM) database and a large collection assembled from 30 small-molecule combinatorial libraries, with emphasis on assessing molecular complexity. The combinatorial libraries surveyed here have been successfully used over the past 20 years to identify novel bioactive compounds across different therapeutic areas. Combinatorial libraries and natural products are suitable sources to expand the traditional, medicinally relevant chemical space.
Affiliation(s)
- Fabian López-Vallejo
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, FL 34987, USA
|
48
|
|
49
|
Identification of benzoylisoquinolines as potential anti-Chagas agents. Bioorg Med Chem 2012; 20:2587-94. [DOI: 10.1016/j.bmc.2012.02.046] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/16/2011] [Revised: 02/14/2012] [Accepted: 02/20/2012] [Indexed: 12/15/2022]
|
50
|
Affiliation(s)
- Dagmar Stumpfe
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstrasse 2, D-53113 Bonn, Germany
|