1
Dolciami D, Ziolek RM, Davies DW, Carter M, Mok NY, Sherhod R. Exploiting Vector Pattern Diversity of Molecular Scaffolds for Cheminformatics Tasks in Drug Discovery. J Chem Inf Model 2024; 64:1966-1974. [PMID: 38437714 DOI: 10.1021/acs.jcim.3c01674] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/06/2024]
Abstract
Chemical diversity is challenging to describe objectively. Despite this, various notions of chemical diversity are used throughout the medicinal chemistry optimization process in drug discovery. In this work, we show the usefulness of considering exploited vectors during different phases of the drug design process to provide a quantitative and objective description of chemical diversity. We have developed a concise and fast approach to enumerate and analyze the exploited vector patterns (EVPs) of molecular compound series, which can then be used in archetypal compound selection tasks, from hit matter identification to hit expansion and lead optimization. We first show that EVPs can be used to assess the progressibility of compounds in a fragment library design exercise. By considering EVPs, we then show how a set of compounds can be prioritized for hit expansion using EVP-based, customizable diversity sampling approaches, reducing the time taken and mitigating human biases. We also show that EVPs are a useful tool to analyze SAR data, offering the chance to uncover correlations between different vectors without predetermining the molecular scaffold structures. The codes used to perform these tasks are presented as easy-to-use Jupyter notebooks, which can be readily adapted for further related tasks.
Affiliation(s)
- N Yi Mok
- BenevolentAI, 4-8 Maple Street, London W1T 5HD, U.K.
2
Yu T, Chong LC, Nantasenamat C, Anuwongcharoen N, Piacham T. Machine learning approaches to study the structure-activity relationships of LpxC inhibitors. EXCLI JOURNAL 2023; 22:975-991. [PMID: 38023567 PMCID: PMC10630528 DOI: 10.17179/excli2023-6356] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 09/01/2023] [Indexed: 12/01/2023]
Abstract
Antimicrobial resistance (AMR) has emerged as one of the global threats to human health in the 21st century. Discovering inhibitors against novel targets, rather than conventional bacterial targets, has been considered an unavoidable strategy against the growing threat of AMR infections. In this study, we applied quantitative structure-activity relationship (QSAR) modeling to LpxC inhibitors to predict their inhibitory activity. In addition, we performed various cheminformatics analyses, consisting of exploration of the chemical space, identification of chemotypes, analysis of the structure-activity landscape and activity cliffs, and construction of the Structure-Activity Similarity (SAS) map. We built a total of 24 QSAR classification models using PubChem and MACCS fingerprints with 12 different machine learning algorithms. The best model with the PubChem fingerprint was the Extreme Gradient Boosting model (accuracy on the training set: 0.937; accuracy on the 10-fold cross-validation set: 0.795; accuracy on the test set: 0.799). The best model using the MACCS fingerprint was the Random Forest model (accuracy on the training set: 0.955; accuracy on the 10-fold cross-validation set: 0.803; accuracy on the test set: 0.785). In addition, we identified eight consensus activity cliff generators that are highly informative for further SAR investigations. It is hoped that the findings presented herein can provide guidance for further lead optimization of LpxC inhibitors.
Affiliation(s)
- Tianshi Yu
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
- Li Chuin Chong
- Beykoz Institute of Life Sciences and Biotechnology, Bezmialem Vakif University, Beykoz, Istanbul, Türkiye
- Chanin Nantasenamat
- Streamlit Open Source, Snowflake Inc., San Mateo, California 94402, United States
- Nuttapat Anuwongcharoen
- Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
- Theeraphon Piacham
- Department of Clinical Microbiology and Applied Technology, Faculty of Medical Technology, Mahidol University, Bangkok, Thailand
3
Vivek-Ananth RP, Sahoo AK, Baskaran SP, Ravichandran J, Samal A. Identification of activity cliffs in structure-activity landscape of androgen receptor binding chemicals. THE SCIENCE OF THE TOTAL ENVIRONMENT 2023; 873:162263. [PMID: 36801331 DOI: 10.1016/j.scitotenv.2023.162263] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 12/07/2022] [Revised: 02/09/2023] [Accepted: 02/11/2023] [Indexed: 06/18/2023]
Abstract
Androgen-mimicking environmental chemicals can bind to the androgen receptor (AR) and can cause severe effects on the reproductive health of males. Predicting such endocrine disrupting chemicals (EDCs) in the human exposome is vital for improving current chemical regulations. To this end, QSAR models have been developed to predict androgen binders. However, a continuous structure-activity relationship (SAR), wherein chemicals with similar structures have similar activity, does not always hold. Activity landscape analysis can help map the structure-activity landscape and identify unique features such as activity cliffs. Here we performed a systematic investigation of the chemical diversity, along with the global and local structure-activity landscape, of a curated list of 144 AR binding chemicals. Specifically, we clustered the AR binding chemicals and visualized the associated chemical space. Thereafter, a consensus diversity plot was used to assess the global diversity of the chemical space. Subsequently, the structure-activity landscape was investigated using SAS maps, which capture the activity difference and structural similarity among the AR binders. This analysis led to a subset of 41 AR binding chemicals forming 86 activity cliffs, of which 14 are activity cliff generators. Additionally, SALI scores were computed for all pairs of AR binding chemicals, and the SALI heatmap was also used to evaluate the activity cliffs identified using the SAS map. Finally, we provide a classification of the 86 activity cliffs into six categories using structural information of the chemicals at different levels. Overall, this investigation reveals the heterogeneous nature of the structure-activity landscape of AR binding chemicals and provides insights that will be crucial in preventing false prediction of chemicals as androgen binders and in developing predictive computational toxicity models in the future.
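As a reading aid for the SALI scores mentioned in this abstract, the sketch below uses the commonly cited definition of the structure-activity landscape index: the absolute activity difference of a compound pair divided by one minus their structural similarity. It is a minimal illustration, not the authors' code. The function name, the activity scale (for example pIC50), the handling of identical structures, and the example values are all assumptions made here.

```python
import math


def sali(activity_i: float, activity_j: float, similarity: float) -> float:
    """Structure-activity landscape index for one compound pair.

    SALI = |A_i - A_j| / (1 - sim(i, j)), with similarity in [0, 1].
    Large values flag structurally similar pairs whose activities differ
    strongly, i.e. candidate activity cliffs.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity must lie in [0, 1]")
    diff = abs(activity_i - activity_j)
    if similarity == 1.0:
        # Assumption for this sketch: identical fingerprints give 0 when the
        # activities also match, and infinity otherwise.
        return 0.0 if diff == 0.0 else math.inf
    return diff / (1.0 - similarity)


# Hypothetical pair: activities 8.0 and 5.0, fingerprint similarity 0.75.
print(sali(8.0, 5.0, 0.75))  # 3.0 / 0.25 = 12.0
```

Computing this value for every pair and arranging the results in a square matrix gives the kind of SALI heatmap the abstract refers to.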
Affiliation(s)
- R P Vivek-Ananth
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Ajaya Kumar Sahoo
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Shanmuga Priya Baskaran
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Janani Ravichandran
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Areejit Samal
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
4
Maggiora G, Medina-Franco JL, Iqbal J, Vogt M, Bajorath J. From Qualitative to Quantitative Analysis of Activity and Property Landscapes. J Chem Inf Model 2020; 60:5873-5880. [PMID: 33205984 DOI: 10.1021/acs.jcim.0c01249] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
Activity or, more generally, property landscapes (PLs) have been considered as an attractive way to visualize and explore structure-property relationships (SPRs) contained in large data sets of chemical compounds. For graphical analysis, three-dimensional representations reminiscent of natural landscapes are particularly intuitive. So far, the use of such landscape models has essentially been confined to qualitative assessment. We describe recent efforts to analyze PLs in a more quantitative manner, which make it possible to calculate topographical similarity values for comparison of landscape models as a measure of relative SPR information content.
Affiliation(s)
- Gerald Maggiora
- University of Arizona BIO5 Institute, 1657 East Helen Street, Tucson, Arizona 85721-0240, United States
- José L Medina-Franco
- Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
- Javed Iqbal
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
- Martin Vogt
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, Bonn D-53115, Germany
6
Vogt M, Bajorath J. ccbmlib - a Python package for modeling Tanimoto similarity value distributions. F1000Res 2020; 9:Chem Inf Sci-100. [PMID: 32161645 PMCID: PMC7050271 DOI: 10.12688/f1000research.22292.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Accepted: 02/04/2020] [Indexed: 11/15/2023]
Abstract
The ccbmlib Python package is a collection of modules for modeling similarity value distributions based on Tanimoto coefficients for fingerprints available in RDKit. It can be used to assess the statistical significance of Tanimoto coefficients and evaluate how molecular similarity is reflected when different fingerprint representations are used. Significance measures derived from p-values allow a quantitative comparison of similarity scores obtained from different fingerprint representations that might have very different value ranges. Furthermore, the package models conditional distributions of similarity coefficients for a given reference compound. The conditional significance score estimates where a test compound would be ranked in a similarity search. The models are based on the statistical analysis of feature distributions and feature correlations of fingerprints of a reference database. The resulting models have been evaluated for 11 RDKit fingerprints, taking a collection of ChEMBL compounds as a reference data set. For most fingerprints, highly accurate models were obtained, with differences of 1% or less for Tanimoto coefficients indicating high similarity.
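To make the two quantities in this abstract concrete, the sketch below computes a Tanimoto coefficient for fingerprints stored as sets of "on" bit positions, and an empirical p-value as the fraction of background similarity values at least as large as the observed one. It is a plain-Python illustration and not the ccbmlib API, which, per the abstract, fits statistical models of the similarity distribution instead of counting background values directly. The function names and the toy data are assumptions.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient: shared on-bits / on-bits present in either."""
    if not fp_a and not fp_b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def empirical_p_value(observed: float, background: list[float]) -> float:
    """Fraction of background similarities that are >= the observed value."""
    if not background:
        raise ValueError("background must not be empty")
    return sum(value >= observed for value in background) / len(background)


# Toy fingerprints (hypothetical bit positions).
fp_ref = {1, 4, 7, 9}
fp_test = {1, 4, 8}
tc = tanimoto(fp_ref, fp_test)
print(tc)  # 2 shared bits / 5 distinct bits = 0.4

# Toy background of similarity values from a reference collection.
print(empirical_p_value(tc, [0.1, 0.2, 0.4, 0.8]))  # 2 of 4 values >= 0.4 -> 0.5
```

A small p-value means that few reference pairs reach the observed similarity, which is what allows scores from fingerprints with very different value ranges to be compared on one scale.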
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, University of Bonn, Endenicher Allee 19c, Bonn, NRW, 53115, Germany
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, University of Bonn, Endenicher Allee 19c, Bonn, NRW, 53115, Germany
7
Miranda PHDS, Lourenço EMG, Morais AMS, de Oliveira PIC, Silverio PSDSN, Jordão AK, Barbosa EG. Molecular modeling of a series of dehydroquinate dehydratase type II inhibitors of Mycobacterium tuberculosis and design of new binders. Mol Divers 2019; 25:1-12. [PMID: 31820222 DOI: 10.1007/s11030-019-10020-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/28/2019] [Accepted: 11/22/2019] [Indexed: 11/24/2022]
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), is still responsible for a large number of fatal cases, especially in developing countries, with alarming rates of incidence and prevalence worldwide. Mycobacterium tuberculosis has a remarkable ability to develop new resistance mechanisms to conventional antimicrobial treatment. Because of this, there is an urgent need for novel bioactive compounds for its treatment. Dehydroquinate dehydratase II (DHQase II) is considered a key enzyme of the shikimate pathway and is a promising target for the design of new bioactive compounds with antibacterial action. The aim of this work was the construction of QSAR models to aid the design of new potential DHQase II inhibitors. For that purpose, various molecular modeling approaches, such as activity cliff analysis, QSAR models and computer-aided ligand design, were utilized. A predictive in silico 4D-QSAR model was built using a database comprising 86 inhibitors of DHQase II, and the model was used to predict the activity of the designed ligands. The obtained model predicted DHQase II inhibition well for an external validation dataset ([Formula: see text] = 0.72). The activity cliff analysis also shed light on important structural features, which were applied to the ligand design.
Affiliation(s)
- Paulo H de S Miranda
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Estela M G Lourenço
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Alexander M S Morais
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Pedro I C de Oliveira
- Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Alessandro K Jordão
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
- Euzébio G Barbosa
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil; Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
8
Naveja JJ, Oviedo-Osornio CI, Medina-Franco JL. Computational Methods for Epigenetic Drug Discovery: A Focus on Activity Landscape Modeling. COMPUTATIONAL MOLECULAR MODELLING IN STRUCTURAL BIOLOGY 2018; 113:65-83. [DOI: 10.1016/bs.apcsb.2018.01.001] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 12/15/2022]
9
Activity landscape analysis of novel 5α-reductase inhibitors. Mol Divers 2016; 20:771-80. [DOI: 10.1007/s11030-016-9659-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/31/2015] [Accepted: 01/12/2016] [Indexed: 01/21/2023]
10
Egieyeh SA, Syce J, Malan SF, Christoffels A. Prioritization of anti-malarial hits from nature: chemo-informatic profiling of natural products with in vitro antiplasmodial activities and currently registered anti-malarial drugs. Malar J 2016; 15:50. [PMID: 26823078 PMCID: PMC4731946 DOI: 10.1186/s12936-016-1087-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 08/22/2015] [Accepted: 01/09/2016] [Indexed: 01/28/2023]
Abstract
BACKGROUND A large number of natural products have shown in vitro antiplasmodial activities. Early identification and prioritization of the natural products with potential for a novel mechanism of action, desirable pharmacokinetics and likelihood of development into drugs is advantageous. Chemo-informatic profiling of these natural products was conducted and compared to that of currently registered anti-malarial drugs (CRAD). METHODS Natural products with in vitro antiplasmodial activities (NAA) were compiled from various sources. These natural products were sub-divided into four groups based on inhibitory concentration (IC50). Key molecular descriptors and physicochemical properties were computed for these compounds, and analysis of variance was used to assess statistical significance amongst the sets of compounds. Molecular similarity analysis, estimation of drug-likeness, in silico pharmacokinetic profiling, and exploration of the structure-activity landscape were also carried out on these sets of compounds. RESULTS A total of 1040 natural products were selected and a total of 13 molecular descriptors were analysed. Significant differences were observed among the sub-groups of NAA and CRAD for at least 11 of the molecular descriptors, including number of hydrogen bond donors and acceptors, molecular weight, polar and hydrophobic surface areas, chiral centres, oxygen and nitrogen atoms, and shape index. The remaining molecular descriptors, including clogP, number of rotatable bonds and number of aromatic rings, did not show any significant difference when comparing the two compound sets. Molecular similarity and chemical space analysis identified natural products that were structurally diverse from CRAD. Prediction of the pharmacokinetic properties and drug-likeness of these natural products identified over 50% with desirable drug-like properties. Nearly 70% of all natural products were identified as potentially promiscuous compounds. Structure-activity landscape analysis highlighted compound pairs that form 'activity cliffs'. In all, prioritization strategies for the NAA were proposed. CONCLUSIONS Chemo-informatic profiling of NAA and CRAD has produced a wealth of information that may guide decisions and facilitate anti-malarial drug development from natural products. Articulation of the information provided within an interactive data-mining environment led to a prioritized list of NAA.
Affiliation(s)
- Samuel Ayodele Egieyeh
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, Cape Town, South Africa; School of Pharmacy, University of the Western Cape, Bellville, Cape Town, South Africa
- James Syce
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town, South Africa
- Sarel F Malan
- School of Pharmacy, University of the Western Cape, Bellville, Cape Town, South Africa
- Alan Christoffels
- South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape, Bellville, Cape Town, South Africa
11
Naveja JJ, Medina-Franco JL. Activity landscape of DNA methyltransferase inhibitors bridges chemoinformatics with epigenetic drug discovery. Expert Opin Drug Discov 2015; 10:1059-70. [DOI: 10.1517/17460441.2015.1073257] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/11/2023]
12
Currin A, Swainston N, Day PJ, Kell DB. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev 2015; 44:1172-239. [PMID: 25503938 PMCID: PMC4349129 DOI: 10.1039/c4cs00351a] [Citation(s) in RCA: 251] [Impact Index Per Article: 27.9] [Received: 10/20/2014] [Indexed: 12/21/2022]
Abstract
The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.
Affiliation(s)
- Andrew Currin
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Neil Swainston
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- School of Computer Science, The University of Manchester, Manchester M13 9PL, UK
- Philip J. Day
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
- Faculty of Medical and Human Sciences, The University of Manchester, Manchester M13 9PT, UK
- Douglas B. Kell
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131 Princess St, Manchester M1 7DN, UK
13
Méndez-Lucio O, Kooistra AJ, Graaf CD, Bender A, Medina-Franco JL. Analyzing Multitarget Activity Landscapes Using Protein–Ligand Interaction Fingerprints: Interaction Cliffs. J Chem Inf Model 2015; 55:251-62. [DOI: 10.1021/ci500721x] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 12/16/2022]
Affiliation(s)
- Oscar Méndez-Lucio
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Albert J. Kooistra
- Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Chris de Graaf
- Division of Medicinal Chemistry, Faculty of Sciences, Amsterdam Institute for Molecules, Medicines and Systems (AIMMS), VU University Amsterdam, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- José L. Medina-Franco
- Facultad de Química, Departamento de Farmacia, Universidad Nacional Autónoma de México, Avenida Universidad 3000, Mexico City 04510, Mexico
14
Willett P. The Calculation of Molecular Structural Similarity: Principles and Practice. Mol Inform 2014; 33:403-13. [DOI: 10.1002/minf.201400024] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 02/18/2014] [Accepted: 03/14/2014] [Indexed: 01/28/2023]
15
Nonlinear Dimensionality Reduction for Visualizing Toxicity Data: Distance-Based Versus Topology-Based Approaches. ChemMedChem 2014; 9:1047-59. [DOI: 10.1002/cmdc.201400027] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 01/10/2014] [Indexed: 01/11/2023]
16
Guha R, Medina-Franco JL. On the validity versus utility of activity landscapes: are all activity cliffs statistically significant? J Cheminform 2014; 6:11. [PMID: 24694189 PMCID: PMC4021161 DOI: 10.1186/1758-2946-6-11] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/25/2013] [Accepted: 03/25/2014] [Indexed: 02/06/2023]
Abstract
BACKGROUND Most work on the topic of activity landscapes has focused on their quantitative description and visual representation, with the aim of aiding navigation of SAR. Recent developments have addressed applications such as quantifying the proportion of activity cliffs and investigating the predictive abilities of activity landscape methods. However, all these publications have worked under the assumption that the activity landscape models are "real" (i.e., statistically significant). RESULTS The current study addresses for the first time, in a quantitative manner, the significance of a landscape or of individual cliffs in the landscape. In particular, we question whether the activity landscape derived from observed (experimental) activity data is different from a randomly generated landscape. To address this, we used the SALI measure with six different data sets tested against one or more molecular targets. We also assessed the significance of the landscapes for single and multiple representations. CONCLUSIONS We find that non-random landscapes are data set and molecular representation dependent. For the data sets and representations used in this work, our results suggest that not all representations lead to non-random landscapes. This indicates that not all molecular representations should be a) used to interpret the SAR or b) combined to generate consensus models. Our results suggest that significance testing of activity landscape models, and in particular of activity cliffs, is key prior to the use of such models.
Affiliation(s)
- Rajarshi Guha
- NIH Center for Advancing Translational Science, 9800 Medical Center Drive, Rockville, MD 20850, USA
- José L Medina-Franco
- Circuito Exterior, Instituto de Química, Universidad Nacional Autónoma de México, Ciudad Universitaria, México D.F. 04510, Mexico; Current address: Mayo Clinic, 13400 East Shea Boulevard, Scottsdale, AZ 85259, USA
17
Santos CBR, Lobato CC, Braga FS, Morais SSS, Santos CF, Fernandes CP, Brasil DSB, Hage-Melim LIS, Macêdo WJC, Carvalho JCT. Application of Hartree-Fock Method for Modeling of Bioactive Molecules Using SAR and QSPR. Computational Molecular Bioscience 2014. [DOI: 10.4236/cmb.2014.41001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
18
Medina-Franco JL. Activity Cliffs: Facts or Artifacts? Chem Biol Drug Des 2013; 81:553-6. [DOI: 10.1111/cbdd.12115] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 12/19/2012] [Revised: 01/17/2013] [Accepted: 01/27/2013] [Indexed: 01/12/2023]
Affiliation(s)
- José L. Medina-Franco
- Instituto de Química, Universidad Nacional Autónoma de México, Circuito Exterior, Ciudad Universitaria, México D.F. 04510, Mexico
19
Iyer P, Stumpfe D, Vogt M, Bajorath J, Maggiora GM. Activity Landscapes, Information Theory, and Structure - Activity Relationships. Mol Inform 2013; 32:421-30. [DOI: 10.1002/minf.201200120] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 10/12/2012] [Accepted: 12/13/2012] [Indexed: 12/16/2022]
20
Medina-Franco JL. Scanning structure-activity relationships with structure-activity similarity and related maps: from consensus activity cliffs to selectivity switches. J Chem Inf Model 2012; 52:2485-93. [PMID: 22989212 DOI: 10.1021/ci300362x] [Citation(s) in RCA: 53] [Impact Index Per Article: 4.4] [Indexed: 01/18/2023]
Abstract
Systematic description of structure-activity relationships (SARs) and structure-property relationships (SPRs) of data sets is of paramount importance in medicinal chemistry and other research fields. To this end, structure-activity similarity (SAS) maps were among the first tools proposed to describe SARs using the concept of activity landscape modeling. One of the major goals of SAS maps is to identify activity cliffs, defined as chemical compounds with highly similar structures but unexpectedly very different biological activity. Since the first publication of SAS maps more than ten years ago, these tools have evolved and been adapted to analyze various types of compound collections, including structurally diverse and combinatorial sets with activity for one or multiple biological end points. The development of SAS maps has led to general concepts that are applicable to other activity landscape methods, such as "consensus activity cliffs" (activity cliffs common to a series of representations or descriptors) and "selectivity switches" (structural changes that completely invert the selectivity pattern of similar compounds against two biological end points). Herein, we review the development, practical applications, limitations, and perspectives of SAS and related maps, which are intuitive and powerful informatics tools to computationally analyze SPRs.
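The SAS map described here places every compound pair on two axes, structural similarity and activity difference, and reads the four quadrants differently. The sketch below labels one pair according to that scheme. It is only an illustration: the function name, the region labels, and both cut-offs (0.7 for similarity, 2.0 for activity difference) are hypothetical choices and are not taken from the review.

```python
def sas_region(similarity: float, activity_diff: float,
               sim_cut: float = 0.7, act_cut: float = 2.0) -> str:
    """Assign a compound pair to one of the four SAS-map quadrants.

    similarity    -- structural similarity of the pair, in [0, 1]
    activity_diff -- absolute activity difference (e.g. in pIC50 units)
    sim_cut, act_cut -- hypothetical thresholds separating low from high
    """
    high_sim = similarity >= sim_cut
    high_diff = activity_diff >= act_cut
    if high_sim and high_diff:
        return "activity cliff"   # similar structures, very different activity
    if high_sim:
        return "smooth SAR"       # similar structures, similar activity
    if high_diff:
        return "nondescript"      # dissimilar structures, different activity
    return "scaffold hop"         # dissimilar structures, similar activity


# Hypothetical pairs given as (similarity, activity difference).
for pair in [(0.85, 3.1), (0.85, 0.4), (0.30, 2.5), (0.30, 0.2)]:
    print(pair, sas_region(*pair))
```

Running the same labelling with several fingerprint types and keeping only the pairs flagged as cliffs by all of them is one way to read the "consensus activity cliffs" idea mentioned in the abstract.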
Affiliation(s)
- José L Medina-Franco
- Torrey Pines Institute for Molecular Studies, 11350 SW Village Parkway, Port St. Lucie, Florida 34987, USA