1
Manen-Freixa L, Antolin AA. Polypharmacology prediction: the long road toward comprehensively anticipating small-molecule selectivity to de-risk drug discovery. Expert Opin Drug Discov 2024; 19:1043-1069. [PMID: 39004919 DOI: 10.1080/17460441.2024.2376643] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Accepted: 07/02/2024] [Indexed: 07/16/2024]
Abstract
INTRODUCTION Small molecules often bind to multiple targets, a behavior termed polypharmacology. Anticipating polypharmacology is essential for drug discovery since unknown off-targets can modulate safety and efficacy - profoundly affecting drug discovery success. Unfortunately, experimental methods to assess selectivity present significant limitations and drugs still fail in the clinic due to unanticipated off-targets. Computational methods are a cost-effective, complementary approach to predict polypharmacology. AREAS COVERED This review aims to provide a comprehensive overview of the state of polypharmacology prediction and discuss its strengths and limitations, covering both classical cheminformatics methods and bioinformatic approaches. The authors review available data sources, paying close attention to their different coverage. The authors then discuss major algorithms grouped by the types of data that they exploit using selected examples. EXPERT OPINION Polypharmacology prediction has made impressive progress over the last decades and contributed to the identification of many off-targets. However, data incompleteness currently limits the ability of most approaches to comprehensively predict selectivity. Moreover, our limited agreement on model assessment challenges the identification of the best algorithms - which at present show modest performance in prospective real-world applications. Despite these limitations, the exponential increase of multidisciplinary Big Data and AI holds much potential to improve polypharmacology prediction and de-risk drug discovery.
Affiliation(s)
- Leticia Manen-Freixa
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Albert A Antolin
- Oncobell Division, Bellvitge Biomedical Research Institute (IDIBELL) and ProCURE Department, Catalan Institute of Oncology (ICO), Barcelona, Spain
- Center for Cancer Drug Discovery, The Division of Cancer Therapeutics, The Institute of Cancer Research, London, UK
2
Yin Y, Hu H, Yang J, Ye C, Goh WWB, Kong AWK, Wu J. OLB-AC: toward optimizing ligand bioactivities through deep graph learning and activity cliffs. Bioinformatics 2024; 40:btae365. [PMID: 38889277 PMCID: PMC11208724 DOI: 10.1093/bioinformatics/btae365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 05/14/2024] [Accepted: 06/14/2024] [Indexed: 06/20/2024] [Open Access]
Abstract
MOTIVATION Deep graph learning (DGL) has been widely employed in the realm of ligand-based virtual screening. Within this field, a key hurdle is the existence of activity cliffs (ACs), where minor chemical alterations can lead to significant changes in bioactivity. In response, several DGL models have been developed to enhance ligand bioactivity prediction in the presence of ACs. Yet, there remains a largely unexplored opportunity within ACs for optimizing ligand bioactivity, making it an area ripe for further investigation. RESULTS We present a novel approach to simultaneously predict and optimize ligand bioactivities through DGL and ACs (OLB-AC). OLB-AC possesses the capability to optimize ligand molecules located near ACs, providing a direct reference for optimizing ligand bioactivities with the matching of original ligands. To accomplish this, a novel attentive graph reconstruction neural network and ligand optimization scheme are proposed. The attentive graph reconstruction neural network reconstructs original ligands and optimizes them through adversarial representations derived from their bioactivity prediction process. Experimental results on nine drug targets reveal that out of the 667 molecules generated through OLB-AC optimization on datasets comprising 974 low-activity, noninhibitor, or highly toxic ligands, 49 are recognized as known highly active, inhibitor, or nontoxic ligands beyond the datasets' scope. Of the 49 matched molecular pairs generated by OLB-AC, 27 reveal novel transformations not present in their training sets. The adversarial representations employed for ligand optimization originate from the gradients of bioactivity predictions. Therefore, we also assess OLB-AC's prediction accuracy across 33 different bioactivity datasets. Results show that OLB-AC achieves the best Pearson correlation coefficient (r²) on 27/33 datasets, with an average improvement of 7.2%-22.9% against the state-of-the-art bioactivity prediction methods.
AVAILABILITY AND IMPLEMENTATION The code and dataset developed in this work are available at github.com/Yueming-Yin/OLB-AC.
Affiliation(s)
- Yueming Yin
- School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
- College of Computing and Data Science, Nanyang Technological University, 639798, Singapore
- Haifeng Hu
- School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
- Jitao Yang
- School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
- Chun Ye
- School of Telecommunications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China
- Wilson Wen Bin Goh
- Lee Kong Chian School of Medicine, Nanyang Technological University, 637551, Singapore
- School of Biological Sciences, Nanyang Technological University, 637551, Singapore
- Center for Biomedical Informatics, Nanyang Technological University, 637551, Singapore
- Center for AI in Medicine, Nanyang Technological University, 639798, Singapore
- Division of Neurology, Department of Brain Sciences, Faculty of Medicine, Imperial College London, London W12 0NN, U.K
- Adams Wai-Kin Kong
- College of Computing and Data Science, Nanyang Technological University, 639798, Singapore
- Jiansheng Wu
- School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
3
Kerstjens A, De Winter H. Molecule auto-correction to facilitate molecular design. J Comput Aided Mol Des 2024; 38:10. [PMID: 38363377 PMCID: PMC10873457 DOI: 10.1007/s10822-024-00549-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Accepted: 01/11/2024] [Indexed: 02/17/2024]
Abstract
Ensuring that computationally designed molecules are chemically reasonable is at best cumbersome. We present a molecule correction algorithm that morphs invalid molecular graphs into structurally related valid analogs. The algorithm is implemented as a tree search, guided by a set of policies to minimize its cost. We showcase how the algorithm can be applied to molecular design, either as a post-processing step or as an integral part of molecule generators.
Affiliation(s)
- Alan Kerstjens
- Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium
- Hans De Winter
- Laboratory of Medicinal Chemistry, Department of Pharmaceutical Sciences, University of Antwerp, Universiteitslaan 1, 2610, Wilrijk, Belgium
4
Vivek-Ananth RP, Sahoo AK, Baskaran SP, Ravichandran J, Samal A. Identification of activity cliffs in structure-activity landscape of androgen receptor binding chemicals. The Science of the Total Environment 2023; 873:162263. [PMID: 36801331 DOI: 10.1016/j.scitotenv.2023.162263] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/07/2022] [Revised: 02/09/2023] [Accepted: 02/11/2023] [Indexed: 06/18/2023]
Abstract
Androgen-mimicking environmental chemicals can bind to the androgen receptor (AR) and cause severe effects on the reproductive health of males. Predicting such endocrine disrupting chemicals (EDCs) in the human exposome is vital for improving current chemical regulations. To this end, QSAR models have been developed to predict androgen binders. However, a continuous structure-activity relationship (SAR), wherein chemicals with similar structure have similar activity, does not always hold. Activity landscape analysis can help map the structure-activity landscape and identify unique features such as activity cliffs. Here we performed a systematic investigation of the chemical diversity along with the global and local structure-activity landscape of a curated list of 144 AR binding chemicals. Specifically, we clustered the AR binding chemicals and visualized the associated chemical space. Thereafter, a consensus diversity plot was used to assess the global diversity of the chemical space. Subsequently, the structure-activity landscape was investigated using structure-activity similarity (SAS) maps, which capture the activity difference and structural similarity among the AR binders. This analysis led to a subset of 41 AR binding chemicals forming 86 activity cliffs, of which 14 are activity cliff generators. Additionally, structure-activity landscape index (SALI) scores were computed for all pairs of AR binding chemicals, and the SALI heatmap was also used to evaluate the activity cliffs identified using the SAS map. Finally, we provide a classification of the 86 activity cliffs into six categories using structural information of chemicals at different levels. Overall, this investigation reveals the heterogeneous nature of the structure-activity landscape of AR binding chemicals and provides insights which will be crucial in preventing false prediction of chemicals as androgen binders and developing predictive computational toxicity models in the future.
Affiliation(s)
- R P Vivek-Ananth
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Ajaya Kumar Sahoo
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Shanmuga Priya Baskaran
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Janani Ravichandran
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Areejit Samal
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India.
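The SALI scoring step named in this entry's abstract can be illustrated with a short sketch. It assumes the commonly used definition SALI(i, j) = |A_i - A_j| / (1 - sim(i, j)) with Tanimoto similarity on binary fingerprints; the study's exact fingerprints and activity units are not given here, and the compound names, bit sets, and activity values below are invented purely for illustration.

```python
import math
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def sali(act_i: float, act_j: float, sim: float) -> float:
    """Structure-activity landscape index for one compound pair."""
    if sim >= 1.0:
        # Identical fingerprints: the index is undefined, report infinity.
        return math.inf
    return abs(act_i - act_j) / (1.0 - sim)


def rank_pairs(compounds: dict) -> list:
    """Return (SALI, name_i, name_j) tuples, highest SALI first."""
    scored = []
    for (n1, (fp1, a1)), (n2, (fp2, a2)) in combinations(compounds.items(), 2):
        scored.append((sali(a1, a2, tanimoto(fp1, fp2)), n1, n2))
    return sorted(scored, reverse=True)


# Hypothetical toy data: name -> (fingerprint bits, activity such as pIC50).
toy_compounds = {
    "A": (frozenset({1, 2, 3, 4}), 8.0),
    "B": (frozenset({1, 2, 3, 5}), 5.0),
    "C": (frozenset({7, 8, 9}), 5.5),
}

ranked = rank_pairs(toy_compounds)
for score, n1, n2 in ranked:
    print(f"{n1}-{n2}: SALI = {score:.2f}")
```

In this toy set the A-B pair ranks first because the two compounds share most bits yet differ strongly in activity, which is the signature of an activity cliff; any threshold for calling a pair a cliff would be a further assumption.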
5
Huang B, Fong LWR, Chaudhari R, Zhang S. Development and evaluation of a java-based deep neural network method for drug response predictions. Front Artif Intell 2023; 6:1069353. [PMID: 37035534 PMCID: PMC10076891 DOI: 10.3389/frai.2023.1069353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Accepted: 03/03/2023] [Indexed: 04/11/2023] [Open Access]
Abstract
Accurate prediction of drug response is a crucial step in personalized medicine. Recently, deep learning techniques have achieved significant breakthroughs in a variety of areas including biomedical research and chemogenomic applications. This motivated us to develop a novel deep learning platform to accurately and reliably predict the response of cancer cells to different drug treatments. In the present work, we describe a Java-based implementation of a deep neural network method, termed JavaDL, to predict cancer responses to drugs solely based on their chemical features. To this end, we devised a novel cost function and added a regularization term which suppresses overfitting. We also adopted an early stopping strategy to further reduce overfitting and improve the accuracy and robustness of our models. To evaluate our method, we compared it with several popular machine learning and deep neural network programs and observed that JavaDL either outperformed those methods in model building or obtained comparable predictions. Finally, JavaDL was employed to predict drug responses of several aggressive breast cancer cell lines, and the results showed robust and accurate predictions with r² as high as 0.81.
6
van Tilborg D, Alenicheva A, Grisoni F. Exposing the Limitations of Molecular Machine Learning with Activity Cliffs. J Chem Inf Model 2022; 62:5938-5951. [PMID: 36456532 PMCID: PMC9749029 DOI: 10.1021/acs.jcim.2c01073] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Received: 08/23/2022] [Indexed: 12/03/2022]
Abstract
Machine learning has become a crucial tool in drug discovery and chemistry at large, e.g., to predict molecular properties, such as bioactivity, with high accuracy. However, activity cliffs (pairs of molecules that are highly similar in their structure but exhibit large differences in potency) have received limited attention for their effect on model performance. Not only are these edge cases informative for molecule discovery and optimization, but models that are well equipped to accurately predict the potency of activity cliffs also have increased potential for prospective applications. Our work aims to fill the current knowledge gap on best-practice machine learning methods in the presence of activity cliffs. We benchmarked a total of 24 machine and deep learning approaches on curated bioactivity data from 30 macromolecular targets for their performance on activity cliff compounds. While all methods struggled in the presence of activity cliffs, machine learning approaches based on molecular descriptors outperformed more complex deep learning methods. Our findings highlight large case-by-case differences in performance, advocating for (a) the inclusion of dedicated "activity-cliff-centered" metrics during model development and evaluation and (b) the development of novel algorithms to better predict the properties of activity cliffs. To this end, the methods, metrics, and results of this study have been encapsulated into an open-access benchmarking platform named MoleculeACE (Activity Cliff Estimation, available on GitHub at: https://github.com/molML/MoleculeACE). MoleculeACE is designed to steer the community toward addressing the pressing but overlooked limitation of molecular machine learning models posed by activity cliffs.
Affiliation(s)
- Derek van Tilborg
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
- Francesca Grisoni
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
7
Trapotsi MA, Hosseini-Gerami L, Bender A. Computational analyses of mechanism of action (MoA): data, methods and integration. RSC Chem Biol 2022; 3:170-200. [PMID: 35360890 PMCID: PMC8827085 DOI: 10.1039/d1cb00069a] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 03/30/2021] [Accepted: 12/09/2021] [Indexed: 12/15/2022] [Open Access]
Abstract
The elucidation of a compound's Mechanism of Action (MoA) is a challenging task in the drug discovery process, but it is important in order to rationalise phenotypic findings and to anticipate potential side-effects. Bioinformatic approaches, advances in machine learning techniques and the increasing deposition of high-throughput data in public databases have significantly contributed to recent advances in the field, but it is not straightforward to decide which data and methods are most suitable to use in a given case. In this review, we focus on these methods and data and their applications in generating MoA hypotheses for subsequent experimental validation. We discuss compound-specific data such as -omics, cell morphology and bioactivity data, as well as commonly used supplementary prior knowledge such as network and pathway data, and provide information on databases where these data can be accessed. In terms of methodologies, we discuss both well-established methods (connectivity mapping, pathway enrichment) and emerging methods (neural networks and multi-omics integration). Finally, we review case studies where the MoA of a compound was successfully suggested from computational analysis by incorporating multiple data modalities and/or methodologies. Our aim for this review is to provide researchers with insights into the benefits and drawbacks of both the data and methods in terms of level of understanding, biases and interpretation, and to highlight future avenues of investigation which we foresee will improve the field of MoA elucidation, including greater public access to -omics data and methodologies which are capable of data integration.
Affiliation(s)
- Maria-Anna Trapotsi
- Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge UK
- Layla Hosseini-Gerami
- Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge UK
- Andreas Bender
- Centre for Molecular Informatics, Yusuf Hamied Department of Chemistry, University of Cambridge UK
8
Ye Q, Hsieh CY, Yang Z, Kang Y, Chen J, Cao D, He S, Hou T. A unified drug-target interaction prediction framework based on knowledge graph and recommendation system. Nat Commun 2021; 12:6775. [PMID: 34811351 PMCID: PMC8635420 DOI: 10.1038/s41467-021-27137-3] [Citation(s) in RCA: 91] [Impact Index Per Article: 22.8] [Received: 06/29/2021] [Accepted: 11/05/2021] [Indexed: 02/06/2023] [Open Access]
Abstract
Prediction of drug-target interactions (DTI) plays a vital role in drug development in various areas, such as virtual screening, drug repurposing and identification of potential drug side effects. Although extensive efforts have been invested in perfecting DTI prediction, existing methods still suffer from the high sparsity of DTI datasets and the cold start problem. Here, we develop KGE_NFM, a unified framework for DTI prediction that combines a knowledge graph (KG) and a recommendation system. This framework first learns a low-dimensional representation for various entities in the KG, and then integrates the multimodal information via a neural factorization machine (NFM). KGE_NFM is evaluated under three realistic scenarios, and achieves accurate and robust predictions on four benchmark datasets, especially in the scenario of the cold start for proteins. Our results indicate that KGE_NFM provides valuable insight into integrating KG and recommendation system-based techniques into a unified framework for novel DTI discovery.
Affiliation(s)
- Qing Ye
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China; College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027 Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang 310058, China
- Chang-Yu Hsieh
- Tencent Quantum Laboratory, Shenzhen, 518057 Guangdong, China
- Ziyi Yang
- Tencent Quantum Laboratory, Shenzhen, 518057 Guangdong, China
- Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
- Jiming Chen
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027 Zhejiang, China
- Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China
- Shibo He
- College of Control Science and Engineering, Zhejiang University, Hangzhou, 310027, Zhejiang, China
- Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, Zhejiang, 310058, China
9
SimilarityLab: Molecular Similarity for SAR Exploration and Target Prediction on the Web. Processes (Basel) 2021. [DOI: 10.3390/pr9091520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/23/2022] [Open Access]
Abstract
Exploration of chemical space around hit, experimental, and known active compounds is an important step in the early stages of drug discovery. In academia, where access to chemical synthesis efforts is restricted in comparison to the pharma-industry, hits from primary screens are typically followed up through purchase and testing of similar compounds, before further funding is sought to begin medicinal chemistry efforts. Rapid exploration of druglike similars and structure–activity relationship profiles can be achieved through our new webservice SimilarityLab. In addition to searching for commercially available molecules similar to a query compound, SimilarityLab also enables the search of compounds with recorded activities, generating consensus counts of activities, which enables target and off-target prediction. In contrast to other online offerings utilizing the USRCAT similarity measure, SimilarityLab’s set of commercially available small molecules is consistently updated, currently containing over 12.7 million unique small molecules, and not relying on published databases which may be many years out of date. This ensures researchers have access to up-to-date chemistries and synthetic processes enabling greater diversity and access to a wider area of commercial chemical space. All source code is available in the SimilarityLab source repository.
10
Miranda-Quintana RA, Bajusz D, Rácz A, Héberger K. Differential Consistency Analysis: Which Similarity Measures can be Applied in Drug Discovery? Mol Inform 2021; 40:e2060017. [PMID: 33891369 DOI: 10.1002/minf.202060017] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 02/26/2021] [Accepted: 03/01/2021] [Indexed: 12/16/2022]
Abstract
Similarity measures are widely used in various areas from taxonomy to cheminformatics. To this end, a large number of similarity and distance measures (or, collectively, comparative measures) have been introduced, with only a few studies directed to revealing their inner relationships. We present a thorough analytical study of the conditions leading to two comparative measures providing equivalent results over a given set of molecules. A key part of this work is the introduction of a novel way to study the consistency between comparative measures: the differential consistency analysis (DCA). This tool reveals how the consistency can be established in an analytical way with minimal (or no) assumptions. We found that the consensus between the Tanimoto and Cosine coefficients improved by choosing a reference whose similarity to the rest of the molecules varies less, or by representing the molecules in a way that does not depend strongly on their size (i.e., bit frequency in the chosen fingerprint representation). The presented derivations are just some generic examples; DCA can be applied widely and for all binary similarity coefficients introduced so far, independently from the molecular representations.
Affiliation(s)
- Dávid Bajusz
- Medicinal Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
- Anita Rácz
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
- Károly Héberger
- Plasma Chemistry Research Group, Research Centre for Natural Sciences, Magyar tudósok krt. 2, 1117, Budapest, Hungary
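The point in this entry's abstract about consistency between the Tanimoto and Cosine coefficients can be made concrete with a small sketch. It uses the standard binary forms, Tanimoto = c / (a + b - c) and Cosine = c / sqrt(a * b), where a and b are the numbers of on-bits and c the shared on-bits. The fingerprints below are invented toy bit sets, and the example only shows that the two measures can rank molecules differently when their sizes differ; it is not the DCA procedure itself.

```python
import math


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Binary Tanimoto coefficient: c / (a + b - c)."""
    c = len(a & b)
    denom = len(a) + len(b) - c
    return c / denom if denom else 1.0


def cosine(a: frozenset, b: frozenset) -> float:
    """Binary Cosine coefficient: c / sqrt(a * b)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


# Hypothetical fingerprints given as sets of on-bit indices.
reference = frozenset(range(10))                               # 10 bits set
small = frozenset({0, 1})                                      # 2 bits, both shared
large = frozenset({0, 1, 2, 3}) | frozenset(range(10, 18))     # 12 bits, 4 shared

candidates = {"small": small, "large": large}


def ranking(measure) -> list:
    """Candidate names ordered from most to least similar to the reference."""
    return sorted(candidates,
                  key=lambda name: measure(reference, candidates[name]),
                  reverse=True)


print(ranking(tanimoto))  # ['large', 'small']
print(ranking(cosine))    # ['small', 'large']
```

Here the two coefficients disagree on which candidate is closer to the reference, and the candidates differ sharply in their number of set bits, which is in line with the abstract's remark that size-dependent representations reduce consensus.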
11
Trapotsi MA, Mervin LH, Afzal AM, Sturm N, Engkvist O, Barrett IP, Bender A. Comparison of Chemical Structure and Cell Morphology Information for Multitask Bioactivity Predictions. J Chem Inf Model 2021; 61:1444-1456. [PMID: 33661004 DOI: 10.1021/acs.jcim.0c00864] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 11/28/2022]
Abstract
The understanding of the mechanism-of-action (MoA) of compounds and the prediction of potential drug targets play an important role in small-molecule drug discovery. The aim of this work was to compare chemical and cell morphology information for bioactivity prediction. The comparison was performed using bioactivity data from the ExCAPE database, image data (in the form of CellProfiler features) from the Cell Painting data set (the largest publicly available data set of cell images with ∼30,000 compound perturbations), and extended connectivity fingerprints (ECFPs) using the multitask Bayesian matrix factorization (BMF) approach Macau. We found that the BMF Macau and random forest (RF) performance were overall similar when ECFPs were used as compound descriptors. However, BMF Macau outperformed RF in 159 out of 224 targets (71%) when image data were used as compound information. Using BMF Macau, 100 (corresponding to about 45%) and 90 (about 40%) of the 224 targets were predicted with high predictive performance (AUC > 0.8) with ECFP data and image data as side information, respectively. There were targets better predicted by image data as side information, such as β-catenin, and others better predicted by fingerprint-based side information, such as proteins belonging to the G-protein-Coupled Receptor 1 family, which could be rationalized from the underlying data distributions in each descriptor domain. In conclusion, both cell morphology changes and chemical structure information contain information about compound bioactivity, which is also partially complementary, and can hence contribute to in silico MoA analysis.
Affiliation(s)
- Maria-Anna Trapotsi
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
- Lewis H Mervin
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0WG, U.K
- Avid M Afzal
- Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0WG, U.K
- Noé Sturm
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Ola Engkvist
- Hit Discovery, Discovery Sciences, R&D, AstraZeneca, Gothenburg SE-43183, Sweden
- Ian P Barrett
- Data Sciences & Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge CB4 0WG, U.K
- Andreas Bender
- Department of Chemistry, Centre for Molecular Informatics, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, U.K
12
Veale CGL. Into the Fray! A Beginner's Guide to Medicinal Chemistry. ChemMedChem 2021; 16:1199-1225. [PMID: 33591595 DOI: 10.1002/cmdc.202000929] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 11/27/2020] [Indexed: 12/31/2022]
Abstract
Modern medicinal chemistry is a complex, multidimensional discipline that operates at the interface of the chemical and biological sciences. The medicinal chemistry contribution to drug discovery is typically described in the context of the well-recited linear progression of the drug discovery pipeline. However, compound optimization is idiosyncratic to each project, and clear definitions of hit and lead molecules and the subsequent progress along the pipeline become easily blurred. In addition, this description lacks insight into the entangled relationship between chemical and pharmacological properties, and thus provides limited guidance on how innovative medicinal chemistry strategies can be applied to solve optimization problems, regardless of the stage in the pipeline. Through discussion and illustrative examples, this article seeks to provide insights into the finesse of medicinal chemistry and the subtlety of balancing chemical properties and pharmacology. In so doing, it aims to serve as an accessible and simple-to-digest guide for anyone who wishes to learn about the underlying principles of medicinal chemistry, in a context that has been decoupled from the pipeline description.
Affiliation(s)
- Clinton G L Veale
- School of Chemistry and Physics, Pietermaritzburg Campus, University of KwaZulu-Natal, Private Bag X01, Pietermaritzburg, Scottsville, 3209, South Africa
13
Coley CW, Eyke NS, Jensen KF. Autonome Entdeckung in den chemischen Wissenschaften, Teil II: Ausblick [Autonomous Discovery in the Chemical Sciences, Part II: Outlook]. Angew Chem Int Ed Engl 2020. [DOI: 10.1002/ange.201909989] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/25/2022]
Affiliation(s)
- Connor W. Coley
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
- Natalie S. Eyke
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
- Klavs F. Jensen
- Department of Chemical Engineering Massachusetts Institute of Technology Cambridge MA 02139 USA
14
Cheirdaris DG. Artificial Neural Networks in Computer-Aided Drug Design: An Overview of Recent Advances. Advances in Experimental Medicine and Biology 2020; 1194:115-125. [PMID: 32468528 DOI: 10.1007/978-3-030-32622-7_10] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]
Abstract
Computer-aided drug design (CADD) is the framework in which the huge amount of data accumulated by high-throughput experimental methods used in drug design is quantitatively studied. Its objectives include pattern recognition, biomarker identification and/or classification, etc. In order to achieve these objectives, machine learning algorithms, and especially artificial neural networks (ANNs), have been used for ADMET factor testing and QSAR modeling evaluation. This paper provides an overview of the current trends in CADD-applied ANNs since their use was re-boosted over a decade ago.
|
15
|
Coley CW, Eyke NS, Jensen KF. Autonomous Discovery in the Chemical Sciences Part II: Outlook. Angew Chem Int Ed Engl 2020; 59:23414-23436. [PMID: 31553509 DOI: 10.1002/anie.201909989] [Citation(s) in RCA: 105] [Impact Index Per Article: 21.0] [Received: 08/06/2019] [Indexed: 01/19/2023]
Abstract
This two-part Review examines how automation has contributed to different aspects of discovery in the chemical sciences. In this second part, we reflect on a selection of exemplary studies. It is increasingly important to articulate what the role of automation and computation has been in the scientific process and how that has or has not accelerated discovery. One can argue that even the best automated systems have yet to "discover" despite being incredibly useful as laboratory assistants. We must carefully consider how they have been and can be applied to future problems of chemical discovery in order to effectively design and interact with future autonomous platforms. The majority of this Review defines a large set of open research directions, including improving our ability to work with complex data, build empirical models, automate both physical and computational experiments for validation, select experiments, and evaluate whether we are making progress towards the ultimate goal of autonomous discovery. Addressing these practical and methodological challenges will greatly advance the extent to which autonomous systems can make meaningful discoveries.
Affiliation(s)
- Connor W Coley
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Natalie S Eyke
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
| | - Klavs F Jensen
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA
|
16
|
Martinez-Mayorga K, Madariaga-Mazon A, Medina-Franco JL, Maggiora G. The impact of chemoinformatics on drug discovery in the pharmaceutical industry. Expert Opin Drug Discov 2020; 15:293-306. [PMID: 31965870 DOI: 10.1080/17460441.2020.1696307] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 01/18/2023]
Abstract
Introduction: Even though there have been substantial advances in our understanding of biological systems, research in drug discovery is only just now beginning to utilize this type of information. The single-target paradigm, which exemplifies the reductionist approach, remains a mainstay of drug research today. A deeper view of the complexity involved in drug discovery is necessary to advance this field. Areas covered: This perspective provides a summary of research areas where cheminformatics has played a key role in drug discovery, including the available resources, as well as a personal perspective on the challenges still faced in the field. Expert opinion: Although great strides have been made in the handling and analysis of biological and pharmacological data, more must be done to link the data to biological pathways. This is crucial if one is to understand how drugs modify disease phenotypes, although this will involve a shift from the single drug/single target paradigm that remains a mainstay of drug research. Moreover, such a shift would require an increased awareness of the role of physiology in the mechanism of drug action, which will require the introduction of new mathematical, computer, and biological methods for chemoinformaticians to be trained in.
Affiliation(s)
| | | | - José L Medina-Franco
- Facultad de Química, Universidad Nacional Autónoma de México, Mexico City, Mexico
|
17
|
Alberga D, Trisciuzzi D, Mansouri K, Mangiatordi GF, Nicolotti O. Prediction of Acute Oral Systemic Toxicity Using a Multifingerprint Similarity Approach. Toxicol Sci 2020; 167:484-495. [PMID: 30371864 DOI: 10.1093/toxsci/kfy255] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 11/12/2022] Open
Abstract
The implementation of nonanimal approaches is of particular importance to regulatory agencies for the prediction of potential hazards associated with acute exposures to chemicals. This work was carried out in the framework of an international modeling initiative organized by the Acute Toxicity Workgroup (ATWG) of the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) with the participation of 32 international groups across government, industry, and academia. Our contribution was to develop a multifingerprint similarity approach for predicting five relevant toxicology endpoints related to acute oral systemic toxicity: the median lethal dose (LD50) point prediction, the "nontoxic" (LD50 > 2000 mg/kg) and "very toxic" (LD50 < 50 mg/kg) binary classifications, and the multiclass categorization of chemicals based on the United States Environmental Protection Agency and Globally Harmonized System of Classification and Labeling of Chemicals schemes. Provided by the ICCVAM's ATWG, the training set used to develop the models consisted of 8944 chemicals having high-quality rat acute oral lethality data. The proposed approach integrates the results coming from a similarity search based on 19 different fingerprint definitions to return a consensus prediction value. Moreover, the algorithm described herein is tailored to properly handle so-called toxicity cliffs, that is, molecular pairs for which a large gap in LD50 values exists despite high structural similarity. An external validation set made available by ICCVAM and consisting of 2896 chemicals was employed to further evaluate the selected models. This work returned high-accuracy predictions based on the evaluations conducted by ICCVAM's ATWG.
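To make the two binary endpoints and the toxicity-cliff idea from the abstract above concrete, the following is a minimal sketch and not the authors' implementation. The LD50 thresholds (2000 and 50 mg/kg) come from the abstract; the function names, the similarity cutoff of 0.8, and the one-log-unit LD50 gap are illustrative assumptions.

```python
import math


def classify_ld50(ld50_mg_per_kg):
    """Binary endpoints as stated in the abstract:
    'nontoxic' if LD50 > 2000 mg/kg, 'very toxic' if LD50 < 50 mg/kg."""
    return {
        "nontoxic": ld50_mg_per_kg > 2000,
        "very_toxic": ld50_mg_per_kg < 50,
    }


def is_toxicity_cliff(similarity, ld50_a, ld50_b,
                      min_similarity=0.8, min_log_gap=1.0):
    """Flag a pair that is structurally similar yet far apart in LD50.

    min_similarity and min_log_gap are hypothetical cutoffs chosen for
    illustration; the abstract does not state the values actually used.
    """
    log_gap = abs(math.log10(ld50_a) - math.log10(ld50_b))
    return similarity >= min_similarity and log_gap >= min_log_gap


if __name__ == "__main__":
    print(classify_ld50(2500))
    print(is_toxicity_cliff(0.9, 30.0, 3000.0))
```

Measuring the gap on a log10 scale is a design choice of this sketch: it treats a tenfold difference in LD50 the same way at any dose level.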
Affiliation(s)
- Domenico Alberga
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
| | - Kamel Mansouri
- ScitoVation LLC, Research Triangle Park, North Carolina 27709; Integrated Laboratory Systems, Morrisville, NC 27560
| | - Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy; Istituto Tumori IRCCS Giovanni Paolo II, 70124 Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
|
18
|
Ferraro M, Decherchi S, De Simone A, Recanatini M, Cavalli A, Bottegoni G. Multi-target dopamine D3 receptor modulators: Actionable knowledge for drug design from molecular dynamics and machine learning. Eur J Med Chem 2019; 188:111975. [PMID: 31940507 DOI: 10.1016/j.ejmech.2019.111975] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 06/05/2019] [Revised: 12/02/2019] [Accepted: 12/16/2019] [Indexed: 10/25/2022]
Abstract
Local changes in the structure of G-protein coupled receptor (GPCR) binders largely affect their pharmacological profile. While the sought efficacy can be empirically obtained by introducing local modifications, the underlying structural explanation can remain elusive. Here, molecular dynamics (MD) simulations of the eticlopride-bound inactive state of the Dopamine D3 Receptor (D3DR) have been clustered using a machine learning-based approach in an attempt to rationalize the efficacy change in four congeneric modulators. Accumulating extended MD trajectories of receptor-ligand complexes, we observed how the increase in ligand flexibility progressively destabilized the crystal structure of the inactivated receptor. To prospectively validate this model, a partial agonist was rationally designed based on structural insights and computational modeling, and eventually synthesized and tested. Results turned out to be in line with the predictions. This case study suggests that the investigation of ligand flexibility in the framework of extended MD simulations can assist and inform drug design strategies, highlighting its potential role as a powerful in silico counterpart to functional assays.
Affiliation(s)
- Mariarosaria Ferraro
- Istituto di Chimica Del Riconoscimento Molecolare, Consiglio Nazionale Delle Ricerche (ICRM-CNR), Via Mario Bianco 9, 20131, Milan, Italy.
| | - Sergio Decherchi
- Computational & Chemical Biology, Italian Institute of Technology, Via Morego 30, 16163, Genoa, Italy.
| | - Alessio De Simone
- Sygnature Discovery Ltd, Bio City, Pennyfoot St, Nottingham NG1 1GR, United Kingdom.
| | - Maurizio Recanatini
- Dept. of Pharmacy and Biotechnology, University of Bologna, Via Belmeloro 6, 40126, Bologna, Italy.
| | - Andrea Cavalli
- Computational & Chemical Biology, Italian Institute of Technology, Via Morego 30, 16163, Genoa, Italy; Dept. of Pharmacy and Biotechnology, University of Bologna, Via Belmeloro 6, 40126, Bologna, Italy.
| | - Giovanni Bottegoni
- School of Pharmacy, University of Birmingham, Sir Robert Aitken Institute for Clinical Research, Edgbaston, B15 2TT, United Kingdom.
|
19
|
Abramyan TM, An Y, Kireev D. Off-Pocket Activity Cliffs: A Puzzling Facet of Molecular Recognition. J Chem Inf Model 2019; 60:152-161. [PMID: 31790251 DOI: 10.1021/acs.jcim.9b00731] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/29/2023]
Abstract
While accurate quantitative prediction of ligand-protein binding affinity remains an elusive goal, high-affinity ligands to therapeutic targets are being designed through heuristic optimization of ligand-protein contacts. However, herein, through large-scale data mining and analyses, we demonstrate that a ligand's binding can also be strongly affected through modifying its solvent-exposed portion that does not make contacts with the protein, thus resulting in "off-pocket activity cliffs" (OAC). We then exposed the roots of the OAC phenomenon by means of molecular dynamics (MD) simulations and MD data analyses. We expect OAC to extend our knowledge of molecular recognition and enhance the drug designer's toolkit.
Affiliation(s)
- Tigran M Abramyan
- Center for Integrative Chemical Biology and Drug Discovery, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599-7363
| | - Yi An
- Center for Integrative Chemical Biology and Drug Discovery, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599-7363
| | - Dmitri Kireev
- Center for Integrative Chemical Biology and Drug Discovery, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27599-7363
|
20
|
Miranda PHDS, Lourenço EMG, Morais AMS, de Oliveira PIC, Silverio PSDSN, Jordão AK, Barbosa EG. Molecular modeling of a series of dehydroquinate dehydratase type II inhibitors of Mycobacterium tuberculosis and design of new binders. Mol Divers 2019; 25:1-12. [PMID: 31820222 DOI: 10.1007/s11030-019-10020-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/28/2019] [Accepted: 11/22/2019] [Indexed: 11/24/2022]
Abstract
Tuberculosis, caused by Mycobacterium tuberculosis (M. tuberculosis), is still responsible for a large number of fatal cases, especially in developing countries, with alarming rates of incidence and prevalence worldwide. M. tuberculosis has a remarkable ability to develop new resistance mechanisms to conventional antimicrobial treatment. Because of this, there is an urgent need for novel bioactive compounds for its treatment. Dehydroquinate dehydratase II (DHQase II) is considered a key enzyme of the shikimate pathway and is a promising target for the design of new bioactive compounds with antibacterial action. The aim of this work was to construct QSAR models to aid the design of new potential DHQase II inhibitors. For that purpose, various molecular modeling approaches, such as activity cliff analysis, QSAR modeling, and computer-aided ligand design, were utilized. A predictive in silico 4D-QSAR model was built using a database comprising 86 inhibitors of DHQase II, and the model was used to predict the activity of the designed ligands. The obtained model predicted DHQase II inhibition well for an external validation dataset ([Formula: see text] = 0.72). Also, the activity cliff analysis shed light on important structural features that were applied to the ligand design.
Affiliation(s)
- Paulo H de S Miranda
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
| | - Estela M G Lourenço
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
| | - Alexander M S Morais
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
| | - Pedro I C de Oliveira
- Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
| | | | - Alessandro K Jordão
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil
| | - Euzébio G Barbosa
- Departamento de Farmácia, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil; Programa de Pós-Graduação em Bioinformática, Universidade Federal do Rio Grande do Norte, Natal, RN, Brazil.
|
21
|
Matsuzaka Y, Uesawa Y. Prediction Model with High-Performance Constitutive Androstane Receptor (CAR) Using DeepSnap-Deep Learning Approach from the Tox21 10K Compound Library. Int J Mol Sci 2019; 20:ijms20194855. [PMID: 31574921 PMCID: PMC6801383 DOI: 10.3390/ijms20194855] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 09/14/2019] [Revised: 09/23/2019] [Accepted: 09/27/2019] [Indexed: 12/30/2022] Open
Abstract
The constitutive androstane receptor (CAR) plays pivotal roles in drug-induced liver injury through the transcriptional regulation of drug-metabolizing enzymes and transporters. Thus, identifying regulatory factors for CAR activation is important for understanding its mechanisms. Numerous previous studies of CAR activation and its toxicity focused on in vivo or in vitro analyses, which are expensive, time consuming, and require many animals. We developed a computational model that predicts agonists for the CAR using the Toxicology in the 21st Century (Tox21) 10K library. Additionally, we evaluated the prediction performance of a novel deep learning (DL)-based quantitative structure-activity relationship analysis called the DeepSnap-DL approach, a procedure that generates omnidirectional snapshots portraying the three-dimensional (3D) structures of chemical compounds. The CAR prediction model, which applies chemical structures generated and optimized with the 3D structure generator tool CORINA in DeepSnap-DL, demonstrated better performance than existing methods using molecular descriptors. These results indicate that preparing suitable 3D chemical structures as input data may be important for achieving high performance with the DeepSnap-DL approach and for enabling the identification of modulators of the CAR.
Affiliation(s)
- Yasunari Matsuzaka
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo 204-8588, Japan.
| | - Yoshihiro Uesawa
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, Tokyo 204-8588, Japan.
|
22
|
Sydow D, Morger A, Driller M, Volkamer A. TeachOpenCADD: a teaching platform for computer-aided drug design using open source packages and data. J Cheminform 2019; 11:29. [PMID: 30963287 PMCID: PMC6454689 DOI: 10.1186/s13321-019-0351-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 12/19/2018] [Accepted: 03/27/2019] [Indexed: 11/25/2022] Open
Abstract
Owing to the increase in freely available software and data for cheminformatics and structural bioinformatics, research for computer-aided drug design (CADD) is more and more built on modular, reproducible, and easy-to-share pipelines. While documentation for such tools is available, there are only a few freely accessible examples that teach the underlying concepts focused on CADD, especially addressing users new to the field. Here, we present TeachOpenCADD, a teaching platform developed by students for students, using open source compound and protein data as well as basic and CADD-related Python packages. We provide interactive Jupyter notebooks for central CADD topics, integrating theoretical background and practical code. TeachOpenCADD is freely available on GitHub: https://github.com/volkamerlab/TeachOpenCADD.
Affiliation(s)
- Dominique Sydow
- In Silico Toxicology, Institute of Physiology, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
| | - Andrea Morger
- In Silico Toxicology, Institute of Physiology, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
| | - Maximilian Driller
- In Silico Toxicology, Institute of Physiology, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany
| | - Andrea Volkamer
- In Silico Toxicology, Institute of Physiology, Charité - Universitätsmedizin Berlin, Charitéplatz 1, 10117, Berlin, Germany.
|
23
|
Affiliation(s)
- Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
|
24
|
Sydow D, Burggraaff L, Szengel A, van Vlijmen HWT, IJzerman AP, van Westen GJP, Volkamer A. Advances and Challenges in Computational Target Prediction. J Chem Inf Model 2019; 59:1728-1742. [DOI: 10.1021/acs.jcim.8b00832] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Affiliation(s)
- Dominique Sydow
- In silico Toxicology, Institute of Physiology, Charité − Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
| | - Lindsey Burggraaff
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
| | - Angelika Szengel
- In silico Toxicology, Institute of Physiology, Charité − Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
| | - Herman W. T. van Vlijmen
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
- Computational Chemistry, Janssen Research & Development, Turnhoutseweg 30, B-2340 Beerse, Belgium
| | - Adriaan P. IJzerman
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
| | - Gerard J. P. van Westen
- Division of Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden University, P.O. Box 9502, 2300 RA, Leiden, The Netherlands
| | - Andrea Volkamer
- In silico Toxicology, Institute of Physiology, Charité − Universitätsmedizin Berlin, Charitéplatz 1, 10117 Berlin, Germany
|
25
|
Pérez-Benito L, Casajuana-Martin N, Jiménez-Rosés M, van Vlijmen H, Tresadern G. Predicting Activity Cliffs with Free-Energy Perturbation. J Chem Theory Comput 2019; 15:1884-1895. [DOI: 10.1021/acs.jctc.8b01290] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/16/2022]
Affiliation(s)
- Laura Pérez-Benito
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Nil Casajuana-Martin
- Laboratori de Medicina Computacional, Unitat de Bioestadistica, Facultat de Medicina, Universitat Autonoma de Barcelona, Bellaterra 08193, Spain
| | - Mireia Jiménez-Rosés
- Laboratori de Medicina Computacional, Unitat de Bioestadistica, Facultat de Medicina, Universitat Autonoma de Barcelona, Bellaterra 08193, Spain
| | - Herman van Vlijmen
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, Beerse B-2340, Belgium
| | - Gary Tresadern
- Computational Chemistry, Janssen Research & Development, Janssen Pharmaceutica N. V., Turnhoutseweg 30, Beerse B-2340, Belgium
|
26
|
Saldívar-González FI, Lenci E, Trabocchi A, Medina-Franco JL. Exploring the chemical space and the bioactivity profile of lactams: a chemoinformatic study. RSC Adv 2019; 9:27105-27116. [PMID: 35528563 PMCID: PMC9070607 DOI: 10.1039/c9ra04841c] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 06/26/2019] [Accepted: 08/17/2019] [Indexed: 01/04/2023] Open
Abstract
Lactams are a class of compounds important for drug design, due to their great variety of potential therapeutic applications, spanning cancer, diabetes, and infectious diseases. So far, the biological profile and chemical diversity of lactams have not been characterized in a systematic and detailed manner. In this work, we report the chemoinformatic analysis of beta-, gamma-, delta- and epsilon-lactams present in databases of approved drugs, natural products, and bioactive compounds from the large public database ChEMBL. We identified the main biological targets in which the lactams have been evaluated according to their chemical classification. We also identified the most frequent scaffolds and those that can be prioritized in chemical synthesis, since they are scaffolds with potential biological activity but with few reported analogs. Results of the biological and chemoinformatic analysis of lactams indicate that spiro- and bridged-lactams belong to classes with the lowest number of compounds and unique scaffolds, and some showing activity against specific targets. Information obtained from this analysis allows focusing the design of new chemical structures in less explored spaces and with increased possibilities of success.
Affiliation(s)
| | - Elena Lenci
- Department of Chemistry “Ugo Schiff”, University of Florence, 50019 Sesto Fiorentino, Italy
| | - Andrea Trabocchi
- Department of Chemistry “Ugo Schiff”, University of Florence, 50019 Sesto Fiorentino, Italy
- Interdepartmental Center for Preclinical Development of Molecular Imaging (CISPIM)
| | - José L. Medina-Franco
- School of Chemistry, Department of Pharmacy, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
|
27
|
Alberga D, Trisciuzzi D, Montaruli M, Leonetti F, Mangiatordi GF, Nicolotti O. A New Approach for Drug Target and Bioactivity Prediction: The Multifingerprint Similarity Search Algorithm (MuSSeL). J Chem Inf Model 2018; 59:586-596. [PMID: 30485097 DOI: 10.1021/acs.jcim.8b00698] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.9] [Indexed: 01/13/2023]
Abstract
We present MuSSeL, a multifingerprint similarity search algorithm able to predict putative drug targets for a given query small molecule and to return a quantitative assessment of its bioactivity in terms of Ki or IC50 values. Predictions are made automatically by exploiting a large collection of high-quality experimental bioactivity data available from ChEMBL (version 22.1) and by combining, in a consensus-like approach, the predictions resulting from similarity searches performed using 13 different fingerprint definitions. Importantly, the proposed algorithm is also effective in detecting and handling activity cliffs. A calibration set including small molecules present in the latest version of ChEMBL (version 23) was employed to properly tune the algorithm parameters. Three randomly built external sets were then used to challenge model performance. The potential use of MuSSeL was also challenged by a prospective exercise for the prediction of five bioactive compounds taken from articles published in the Journal of Medicinal Chemistry just a few months ago. The paper emphasizes the importance of implementing multifingerprint consensus strategies to increase the confidence in the predictions of similarity search algorithms and provides a fast and easy-to-run tool for drug target and bioactivity prediction.
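The consensus idea described in this abstract can be sketched roughly as follows. This is not the MuSSeL algorithm itself: it is a simplified nearest-neighbour majority vote, fingerprints are represented as plain Python sets of on-bits, and the names `tanimoto`, `consensus_target`, and the `min_similarity` cutoff are assumptions made for illustration.

```python
from collections import Counter


def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two sets of on-bits."""
    if not bits_a and not bits_b:
        return 0.0
    return len(bits_a & bits_b) / len(bits_a | bits_b)


def consensus_target(query, references, min_similarity=0.5):
    """Majority vote over per-fingerprint nearest neighbours.

    query:      {fingerprint_name: set_of_bits}
    references: list of (target_name, {fingerprint_name: set_of_bits})
    Returns (target, fraction_of_fingerprints_agreeing); target is None
    when no neighbour reaches min_similarity (a hypothetical cutoff).
    """
    votes = Counter()
    for fp_name, query_bits in query.items():
        best_target, best_sim = None, 0.0
        for target, fingerprints in references:
            sim = tanimoto(query_bits, fingerprints[fp_name])
            if sim > best_sim:
                best_target, best_sim = target, sim
        if best_target is not None and best_sim >= min_similarity:
            votes[best_target] += 1
    if not votes:
        return None, 0.0
    target, count = votes.most_common(1)[0]
    return target, count / len(query)


if __name__ == "__main__":
    # Toy data: two fingerprint definitions, two annotated reference ligands.
    query = {"fp_a": {1, 2, 3, 4}, "fp_b": {10, 11, 12}}
    references = [
        ("D3", {"fp_a": {1, 2, 3, 5}, "fp_b": {10, 11, 12}}),
        ("COX-2", {"fp_a": {7, 8, 9}, "fp_b": {10, 20, 30}}),
    ]
    print(consensus_target(query, references))
```

The returned fraction gives a crude confidence signal: a target supported by every fingerprint definition is more trustworthy than one supported by a single definition.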
Affiliation(s)
- Domenico Alberga
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
| | - Michele Montaruli
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
| | - Francesco Leonetti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
| | - Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro", Via E. Orabona, 4, I-70126 Bari, Italy
|
28
|
Miyao T, Funatsu K, Bajorath J. Three-Dimensional Activity Landscape Models of Different Design and Their Application to Compound Mapping and Potency Prediction. J Chem Inf Model 2018; 59:993-1004. [DOI: 10.1021/acs.jcim.8b00661] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]
Affiliation(s)
- Tomoyuki Miyao
- Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
| | - Kimito Funatsu
- Data Science Center and Graduate School of Science and Technology, Nara Institute of Science and Technology, 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan
- Department of Chemical System Engineering, School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany
|
29
|
Sturm N, Sun J, Vandriessche Y, Mayr A, Klambauer G, Carlsson L, Engkvist O, Chen H. Application of Bioactivity Profile-Based Fingerprints for Building Machine Learning Models. J Chem Inf Model 2018; 59:962-972. [DOI: 10.1021/acs.jcim.8b00550] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Indexed: 01/22/2023]
Affiliation(s)
- Noé Sturm
- Hit Discovery, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Pepparedsleden 1, 43153 Mölndal, Sweden
| | - Jiangming Sun
- Hit Discovery, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Pepparedsleden 1, 43153 Mölndal, Sweden
| | - Yves Vandriessche
- Intel Corporation, Data Center Group, Veldkant 31, 2550 Kontich, Belgium
| | - Andreas Mayr
- LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenbergerstr 69, 4040 Linz, Austria
| | - Günter Klambauer
- LIT AI Lab & Institute for Machine Learning, Johannes Kepler University Linz, Altenbergerstr 69, 4040 Linz, Austria
| | - Lars Carlsson
- Quantitative Biology, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Pepparedsleden 1, 43153 Mölndal, Sweden
| | - Ola Engkvist
- Hit Discovery, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Pepparedsleden 1, 43153 Mölndal, Sweden
| | - Hongming Chen
- Hit Discovery, Discovery Sciences, IMED Biotech Unit, AstraZeneca, Pepparedsleden 1, 43153 Mölndal, Sweden
|
30
|
Réau M, Lagarde N, Zagury JF, Montes M. Nuclear Receptors Database Including Negative Data (NR-DBIND): A Database Dedicated to Nuclear Receptors Binding Data Including Negative Data and Pharmacological Profile. J Med Chem 2018; 62:2894-2904. [PMID: 30354114 DOI: 10.1021/acs.jmedchem.8b01105] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 02/06/2023]
Abstract
Nuclear receptors (NRs) are transcription factors that regulate gene expression in various physiological processes through their interactions with small hydrophobic molecules. They constitute an important class of targets for drugs and endocrine disruptors and are widely studied for both health and environmental concerns. Since the integration of negative data can be critical for accurate modeling of ligand activity profiles, we manually collected and annotated NR interaction data (positive and negative) through a careful review of the corresponding literature. A total of 15,116 positive and negative interaction data points are provided for 28 NRs, together with 593 PDB structures, in the freely available Nuclear Receptors Database Including Negative Data (http://nr-dbind.drugdesign.fr). The NR-DBIND contains the most extensive information about interaction data on NRs, which should bring valuable information to chemists, biologists, pharmacologists and toxicologists.
Affiliation(s)
- Manon Réau
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, 2 Rue Conté, 75003 Paris, France
| | - Nathalie Lagarde
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, 2 Rue Conté, 75003 Paris, France; Université Paris Diderot, Sorbonne Paris Cité, Molécules Thérapeutiques in Silico, INSERM UMR-S 973, 75205 Paris, France
| | - Jean-François Zagury
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, 2 Rue Conté, 75003 Paris, France
| | - Matthieu Montes
- Laboratoire GBA, EA4627, Conservatoire National des Arts et Métiers, 2 Rue Conté, 75003 Paris, France
|
31
|
Cortés-Ciriano I, Bender A. Deep Confidence: A Computationally Efficient Framework for Calculating Reliable Prediction Errors for Deep Neural Networks. J Chem Inf Model 2018; 59:1269-1281. [DOI: 10.1021/acs.jcim.8b00542] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Indexed: 02/05/2023]
Affiliation(s)
- Isidro Cortés-Ciriano
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| |
|
32
|
Cortés-Ciriano I, Firth NC, Bender A, Watson O. Discovering Highly Potent Molecules from an Initial Set of Inactives Using Iterative Screening. J Chem Inf Model 2018; 58:2000-2014. [PMID: 30130102 DOI: 10.1021/acs.jcim.8b00376] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Indexed: 02/07/2023]
Abstract
The versatility of similarity searching and quantitative structure-activity relationships to model the activity of compound sets within given bioactivity ranges (i.e., interpolation) is well established. However, their relative performance in the common early-stage drug discovery scenario where many inactive data points but no active data points are available (i.e., extrapolation from the low-activity to the high-activity range) has not yet been thoroughly examined. To this end, we have designed an iterative virtual screening strategy, which was evaluated on 25 diverse bioactivity data sets from ChEMBL. We benchmark the efficiency of random forest (RF), multiple linear regression, ridge regression, similarity searching, and random selection of compounds to identify a highly active molecule in the test set among a large number of low-potency compounds. We use the number of iterations required to find this active molecule to evaluate the performance of each experimental setup. We show that linear and ridge regression often outperform RF and similarity searching, reducing the number of iterations needed to find an active compound by a factor of 2 or more. Even simple regression methods seem better able to extrapolate to high-bioactivity ranges than RF, which only provides output values in the range covered by the training set. In addition, examination of the scaffold diversity in the data sets used shows that in some cases similarity searching and RF require twice as many iterations as random selection, depending on the chemical space covered in the initial training data. Lastly, we show using bioactivity data for COX-1 and COX-2 that our framework can be extended to multitarget drug discovery, where compounds are selected by concomitantly considering their activity against multiple targets.
Overall, this study provides an approach for iterative screening when only inactive data are available in the early stages of drug discovery, in order to discover highly potent compounds and to identify the best experimental setup in which to do so.
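The iterative screening loop described in this abstract can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the single descriptor, the synthetic activities, the threshold, and the helper names are hypothetical, and a nearest-neighbour predictor stands in for methods whose outputs stay within the training range (it is not the authors' RF or similarity-searching setup).

```python
def fit_ridge_1d(xs, ys, lam=1.0):
    """Closed-form ridge regression on one descriptor (intercept not penalised)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / (sxx + lam)
    return lambda x: my + slope * (x - mx)


def fit_nearest_neighbour(xs, ys):
    """Predict the activity of the closest training compound.

    Predictions can never exceed the highest activity seen in training.
    """
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def iterations_to_hit(fit, train, pool, threshold):
    """Test the top-predicted pool compound each round until one is active."""
    train, pool = list(train), list(pool)
    for iteration in range(1, len(pool) + 1):
        model = fit([x for x, _ in train], [y for _, y in train])
        best = max(pool, key=lambda c: model(c[0]))
        pool.remove(best)
        train.append(best)  # the measured activity is now known
        if best[1] >= threshold:
            return iteration
    return None


def activity(x):
    """Hypothetical pIC50-like activity that grows with the descriptor."""
    return 4.0 + 0.5 * x


# Training set contains only low-activity compounds; the pool hides one potent one.
train = [(x, activity(x)) for x in (0.0, 1.0, 2.0)]
pool = [(x, activity(x)) for x in (0.5, 1.5, 2.5, 3.0, 4.0, 5.0, 6.0, 10.0)]

ridge_iters = iterations_to_hit(fit_ridge_1d, train, pool, threshold=8.5)
nn_iters = iterations_to_hit(fit_nearest_neighbour, train, pool, threshold=8.5)
print("ridge:", ridge_iters, "nearest neighbour:", nn_iters)
```

In this toy setting the linear model extrapolates beyond the training range and ranks the potent compound first, while the range-bounded predictor has to walk toward it step by step. Real data sets are multi-dimensional and noisy, so the gap will vary.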
Affiliation(s)
- Isidro Cortés-Ciriano
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Nicholas C Firth
- Centre for Medical Image Computing, Department of Computer Science, UCL, London WC1E 6BT, United Kingdom; Evariste Technologies Ltd, Goring on Thames RG8 9AL, United Kingdom
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Oliver Watson
- Evariste Technologies Ltd, Goring on Thames RG8 9AL, United Kingdom
| |
|
33
|
Abstract
INTRODUCTION Activity landscapes (ALs) are representations and models of compound data sets annotated with a target-specific activity. In contrast to quantitative structure-activity relationship (QSAR) models, ALs aim at characterizing structure-activity relationships (SARs) on a large-scale level encompassing all active compounds for specific targets. The popularity of AL modeling has grown substantially with the public availability of large activity-annotated compound data sets. AL modeling crucially depends on molecular representations and similarity metrics used to assess structural similarity. Areas covered: The concepts of AL modeling are introduced and its basis in quantitatively assessing molecular similarity is discussed. The different types of AL modeling approaches are introduced. AL designs can broadly be divided into three categories: compound-pair based, dimensionality reduction, and network approaches. Recent developments for each of these categories are discussed focusing on the application of mathematical, statistical, and machine learning tools for AL modeling. AL modeling using chemical space networks is covered in more detail. Expert opinion: AL modeling has remained a largely descriptive approach for the analysis of SARs. Beyond mere visualization, the application of analytical tools from statistics, machine learning and network theory has aided in the sophistication of AL designs and provides a step forward in transforming ALs from descriptive to predictive tools. To this end, optimizing representations that encode activity relevant features of molecules might prove to be a crucial step.
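As an illustration of the compound-pair based category, the sketch below scores pairs with the structure-activity landscape index (SALI), defined as the activity difference divided by one minus the structural similarity. The abstract does not name a specific index, so the choice of SALI is an assumption; the fingerprints (sets of made-up feature keys), the activities, and the compound names are hypothetical.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto coefficient between two sets of structural features."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def sali(act_i, act_j, sim):
    """SALI = |A_i - A_j| / (1 - sim); large values flag activity cliffs."""
    diff = abs(act_i - act_j)
    if sim >= 1.0:
        # Identical fingerprints: the index is undefined, so treat it as unbounded.
        return float("inf") if diff > 0 else 0.0
    return diff / (1.0 - sim)


# Hypothetical compounds: (feature set, pIC50-like activity).
compounds = {
    "cpd_A": ({"f1", "f2", "f3", "f4"}, 8.5),
    "cpd_B": ({"f1", "f2", "f3", "f5"}, 5.0),
    "cpd_C": ({"f6", "f7", "f8"}, 5.2),
}

# Score every pair and rank from the steepest to the flattest landscape region.
ranked = sorted(
    (
        ((i, j), sali(compounds[i][1], compounds[j][1],
                      tanimoto(compounds[i][0], compounds[j][0])))
        for i, j in combinations(sorted(compounds), 2)
    ),
    key=lambda item: item[1],
    reverse=True,
)

for pair, score in ranked:
    print(pair, round(score, 2))
```

The top-ranked pair is the structurally similar one with a large potency gap, which is the kind of pair such an index is meant to surface. As the abstract stresses, the result depends on the molecular representation and similarity metric chosen.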
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
| |
|
34
|
Filimonov D, Druzhilovskiy D, Lagunin A, Gloriozova T, Rudik A, Dmitriev A, Pogodin P, Poroikov V. Computer-aided prediction of biological activity spectra for chemical compounds: opportunities and limitations. Biomed Chem Res Methods 2018. [DOI: 10.18097/bmcrm00004] [Citation(s) in RCA: 77] [Impact Index Per Article: 11.0] [Indexed: 11/23/2022]
Abstract
An essential characteristic of chemical compounds is their biological activity, since its presence can become the basis for the therapeutic use of a substance or, on the contrary, limit its practical application because of side effects and toxicity. Computer assessment of biological activity spectra makes it possible to determine the most promising directions for studying the pharmacological action of particular substances, and to filter out potentially dangerous molecules at the early stages of research. For more than 25 years, we have been developing and improving the computer program PASS (Prediction of Activity Spectra for Substances), designed to predict the biological activity spectrum of a substance based on the structural formula of its molecules. The prediction is carried out by analysis of structure-activity relationships for the training set, which currently contains information on structures and known biological activities for more than one million molecules. The structure of an organic compound is represented in PASS using Multilevel Neighborhoods of Atoms (MNA) descriptors; the activity prediction for new compounds is performed by a naive Bayes classifier using the structure-activity relationships determined by analysis of the training set. We have created and improved both local versions of the PASS program and freely available web resources based on PASS (http://www.way2drug.com).
They predict several thousand biological activities (pharmacological effects, molecular mechanisms of action, specific toxicity and adverse effects, interaction with unwanted targets, metabolism and action on molecular transport), cytotoxicity for tumor and non-tumor cell lines, carcinogenicity, induced changes in gene expression profiles, sites of metabolism by the major enzymes of the first and second phases of xenobiotic biotransformation, and whether a compound is a substrate and/or metabolite of metabolic enzymes. The web resource Way2Drug is used by over 18,000 researchers from more than 90 countries, who have obtained over 600,000 predictions and published about 500 papers describing the results. Analysis of the published works shows that in some cases the authors' interpretation of the prediction results requires adjustment. In this work, we provide the theoretical basis and consider, using particular examples, the opportunities and limitations of computer-aided prediction of biological activity spectra.
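The general idea of scoring an activity from structural descriptors with a naive Bayes classifier can be sketched as follows. This is a generic textbook variant with Laplace smoothing, not the PASS algorithm: PASS uses MNA descriptors and its own estimator, while the descriptor tokens (`d1`, `d2`, ...), the tiny training set, and the function names here are hypothetical.

```python
import math
from collections import Counter


def train_naive_bayes(training_set):
    """Count descriptor occurrences among actives and inactives.

    training_set: list of (set_of_descriptor_tokens, is_active) pairs.
    """
    n_active = sum(1 for _, active in training_set if active)
    n_inactive = len(training_set) - n_active
    counts_active, counts_inactive = Counter(), Counter()
    for descriptors, active in training_set:
        (counts_active if active else counts_inactive).update(descriptors)
    return n_active, n_inactive, counts_active, counts_inactive


def log_odds_active(model, descriptors):
    """Log-odds that a compound is active, summed over its descriptors.

    Simplification: only descriptors present in the query contribute.
    """
    n_active, n_inactive, counts_active, counts_inactive = model
    score = math.log((n_active + 1) / (n_inactive + 1))  # smoothed prior odds
    for d in descriptors:
        p_active = (counts_active[d] + 1) / (n_active + 2)
        p_inactive = (counts_inactive[d] + 1) / (n_inactive + 2)
        score += math.log(p_active / p_inactive)
    return score


# Hypothetical training set for a single activity.
training_set = [
    ({"d1", "d2"}, True),
    ({"d1", "d3"}, True),
    ({"d4", "d2"}, False),
    ({"d4", "d5"}, False),
]
model = train_naive_bayes(training_set)

likely_active = log_odds_active(model, {"d1", "d2"})
likely_inactive = log_odds_active(model, {"d4", "d5"})
print(likely_active, likely_inactive)
```

A positive score means the query shares more descriptors with known actives than with inactives. Predicting an activity spectrum would repeat this scoring once per activity type, which is one reason the simplicity of naive Bayes suits very large training sets.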
Affiliation(s)
| | | | - A.A. Lagunin
- Institute of Biomedical Chemistry; Pirogov Russian National Research Medical University, Moscow, Russia
| | | | - A.V. Rudik
- Institute of Biomedical Chemistry, Moscow, Russia
| | | | - P.V. Pogodin
- Institute of Biomedical Chemistry, Moscow, Russia
| | | |
|