151
|
Zeng Y, Chen X, Luo Y, Li X, Peng D. Deep drug-target binding affinity prediction with multiple attention blocks. Brief Bioinform 2021; 22:6231754. [PMID: 33866349 PMCID: PMC8083346 DOI: 10.1093/bib/bbab117] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 12/16/2021] [Revised: 02/12/2021] [Accepted: 03/13/2021] [Indexed: 11/23/2022] Open
Abstract
Drug-target interaction (DTI) prediction has drawn increasing interest due to its substantial position in the drug discovery process. Many studies have introduced computational models to treat DTI prediction as a regression task, which directly predict the binding affinity of drug-target pairs. However, existing studies (i) ignore the essential correlations between atoms when encoding drug compounds and (ii) model the interaction of drug-target pairs simply by concatenation. Based on those observations, in this study, we propose an end-to-end model with multiple attention blocks to predict the binding affinity scores of drug-target pairs. Our proposed model offers the abilities to (i) encode the correlations between atoms by a relation-aware self-attention block and (ii) model the interaction of drug representations and target representations by the multi-head attention block. Experimental results of DTI prediction on two benchmark datasets show our approach outperforms existing methods, benefiting from the correlation information encoded by the relation-aware self-attention block and the interaction information extracted by the multi-head attention block. Moreover, we conduct experiments on the effects of the max relative position length and find the best max relative position length value to be $k \in \{3, 5\}$. Furthermore, we apply our model to predict the binding affinity of Corona Virus Disease 2019 (COVID-19)-related genome sequences and 3137 FDA-approved drugs.
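The max relative position length k mentioned in the abstract can be illustrated with a small sketch. It assumes the commonly used formulation of relation-aware self-attention in which the offset between positions i and j is clipped to the range [-k, k] and then used to look up one of 2k + 1 relation embeddings; the abstract itself does not spell out the paper's exact definition, and the function name `relative_position_index` is hypothetical.

```python
def relative_position_index(n_atoms, k):
    """Return an n_atoms x n_atoms matrix of clipped relative offsets.

    Entry [i][j] is clip(j - i, -k, k) shifted by +k, so every value lies
    in 0..2k and can index a table of 2k + 1 relation embeddings.
    This clipping scheme is an assumption, not taken from the paper.
    """
    return [
        [max(-k, min(k, j - i)) + k for j in range(n_atoms)]
        for i in range(n_atoms)
    ]


# Six atom positions with max relative position length k = 3.
index = relative_position_index(n_atoms=6, k=3)
for row in index:
    print(row)
```

With a small k, all atom pairs farther apart than k positions share the same relation embedding, which is why the choice of k affects how much positional detail the attention block can encode.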
Affiliation(s)
- Yuni Zeng
- College of Computer Science, Sichuan University, Chengdu, Sichuan, 610065, China
| | - Xiangru Chen
- College of Computer Science, Sichuan University, Chengdu, Sichuan, 610065, China
| | - Yujie Luo
- Shenzhen Peng Cheng Laboratory, Shenzhen, 518052, China
| | - Xuedong Li
- Chengdu Sobey Digital Technology Co., Ltd, Chengdu, 610041, China
| | - Dezhong Peng
- College of Computer Science, Sichuan University, Chengdu, Sichuan, 610065, China
| |
|
152
|
Multi-PLI: interpretable multi-task deep learning model for unifying protein-ligand interaction datasets. J Cheminform 2021; 13:30. [PMID: 33858485 PMCID: PMC8051026 DOI: 10.1186/s13321-021-00510-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/08/2021] [Accepted: 04/08/2021] [Indexed: 11/11/2022] Open
Abstract
The assessment of protein–ligand interactions is critical at the early stage of drug discovery. Computational approaches for efficiently predicting such interactions facilitate drug development. Recently, methods based on deep learning, including structure- and sequence-based models, have achieved impressive performance on several different datasets. However, their application still suffers from a generalizability issue because of insufficient data, especially for structure-based models, as well as a heterogeneity problem because of different label measurements and varying proteins across datasets. Here, we present an interpretable multi-task model to evaluate protein–ligand interaction (Multi-PLI). The model can run classification (binding or not) and regression (binding affinity) tasks concurrently by unifying different datasets. The model outperforms traditional docking and machine learning on both binary classification and regression tasks and achieves competitive results compared with some structure-based deep learning methods, even with the same training set size. Furthermore, combined with the proposed occlusion algorithm, the model can predict the important amino acids of proteins that are crucial for binding, thus providing a biological interpretation.
|
153
|
Lim S, Lu Y, Cho CY, Sung I, Kim J, Kim Y, Park S, Kim S. A review on compound-protein interaction prediction methods: Data, format, representation and model. Comput Struct Biotechnol J 2021; 19:1541-1556. [PMID: 33841755 PMCID: PMC8008185 DOI: 10.1016/j.csbj.2021.03.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 08/31/2020] [Revised: 02/28/2021] [Accepted: 03/01/2021] [Indexed: 01/27/2023] Open
Abstract
There has recently been rapid progress in computational methods for determining the protein targets of small-molecule drugs, a task termed compound-protein interaction (CPI) prediction. In this review, we comprehensively review topics related to the computational prediction of CPI. Data for CPI have been accumulated and curated significantly in both quantity and quality. Computational methods have become more powerful than ever for analyzing such complex data. Thus, recent successes in the improved quality of CPI prediction are due to the use of both sophisticated computational techniques and higher-quality information in the databases. The goal of this article is to provide reviews of topics related to CPI, ranging from data, format and representation to computational models, so that researchers can take full advantage of these resources to develop novel prediction methods. Chemical compounds and protein data from various resources are discussed in terms of data formats and encoding schemes. For the CPI methods, we group prediction methods into five categories, from traditional machine learning techniques to state-of-the-art deep learning techniques. In closing, we discuss emerging machine learning topics to help both experimental and computational scientists leverage the current knowledge and strategies to develop more powerful and accurate CPI prediction methods.
Affiliation(s)
- Sangsoo Lim
- Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea
| | - Yijingxiu Lu
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Chang Yun Cho
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
| | - Inyoung Sung
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
| | - Jungwoo Kim
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Youngkuk Kim
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Sungjoon Park
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
| | - Sun Kim
- Bioinformatics Institute, Seoul National University, Seoul, Republic of Korea
- Department of Computer Science and Engineering, College of Engineering, Seoul National University, Seoul, Republic of Korea
- Institute of Engineering Research, Seoul National University, Seoul, Republic of Korea
- Interdisciplinary Program in Bioinformatics, College of Natural Sciences, Seoul National University, Seoul, Republic of Korea
| |
|
154
|
Shim J, Hong ZY, Sohn I, Hwang C. Prediction of drug-target binding affinity using similarity-based convolutional neural network. Sci Rep 2021; 11:4416. [PMID: 33627791 PMCID: PMC7904939 DOI: 10.1038/s41598-021-83679-y] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 10/20/2020] [Accepted: 01/18/2021] [Indexed: 12/02/2022] Open
Abstract
Identifying novel drug–target interactions (DTIs) plays an important role in drug discovery. Most of the computational methods developed for predicting DTIs use binary classification, whose goal is to determine whether or not a drug–target (DT) pair interacts. However, it is more meaningful but also more challenging to predict the binding affinity that describes the strength of the interaction between a DT pair. If the binding affinity is not sufficiently large, the drug may not be useful. Therefore, methods for predicting DT binding affinities are very valuable. The increase in novel public affinity data available in the DT-related databases enables advanced deep learning techniques to be used to predict binding affinities. In this paper, we propose a similarity-based model that applies a 2-dimensional (2D) convolutional neural network (CNN) to the outer products between column vectors of two similarity matrices for the drugs and targets to predict DT binding affinities. To the best of our knowledge, this is the first application of a 2D CNN in similarity-based DT binding affinity prediction. The validation results on multiple public datasets show that the proposed model is an effective approach for DT binding affinity prediction and can be quite helpful in the drug development process.
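The input construction described in this abstract (an outer product between one column of the drug similarity matrix and one column of the target similarity matrix) can be sketched as follows. The similarity values and the helper names `column`, `outer_product` and `pair_feature_map` are illustrative assumptions; the CNN that would consume the resulting 2D map is omitted.

```python
def column(matrix, j):
    """Return column j of a row-major matrix as a list."""
    return [row[j] for row in matrix]


def outer_product(u, v):
    """Return the len(u) x len(v) outer product of two vectors."""
    return [[a * b for b in v] for a in u]


def pair_feature_map(drug_sim, target_sim, d, t):
    """Build the 2D input for drug d and target t.

    The drug is described by its similarities to all drugs, the target by
    its similarities to all targets; their outer product gives a 2D map
    that a 2D CNN could take as input.
    """
    return outer_product(column(drug_sim, d), column(target_sim, t))


# Toy, made-up similarity matrices: 3 drugs and 2 targets.
drug_sim = [[1.0, 0.2, 0.5],
            [0.2, 1.0, 0.1],
            [0.5, 0.1, 1.0]]
target_sim = [[1.0, 0.3],
              [0.3, 1.0]]

fmap = pair_feature_map(drug_sim, target_sim, d=0, t=1)
print(len(fmap), "x", len(fmap[0]))
```

Each drug-target pair therefore yields a map whose height is the number of drugs and whose width is the number of targets, independent of the raw molecular or sequence length.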
Affiliation(s)
- Jooyong Shim
- Department of Statistics, Institute of Statistical Information, Inje University, Gimhae, Gyeongsangnamdo, South Korea
| | | | | | - Changha Hwang
- Department of Applied Statistics, Dankook University, Yongin, Gyeonggido, 16890, South Korea.
| |
|
155
|
Abbasi K, Razzaghi P, Poso A, Amanlou M, Ghasemi JB, Masoudi-Nejad A. DeepCDA: deep cross-domain compound-protein affinity prediction through LSTM and convolutional neural networks. Bioinformatics 2021; 36:4633-4642. [PMID: 32462178 DOI: 10.1093/bioinformatics/btaa544] [Citation(s) in RCA: 81] [Impact Index Per Article: 27.0] [Received: 11/30/2019] [Revised: 04/29/2020] [Accepted: 05/22/2020] [Indexed: 02/07/2023] Open
Abstract
MOTIVATION An essential part of drug discovery is the accurate prediction of the binding affinity of new compound-protein pairs. Most of the standard computational methods assume that compounds or proteins of the test data are observed during the training phase. However, in real-world situations, the test and training data are sampled from different domains with different distributions. To cope with this challenge, we propose a deep learning-based approach that consists of three steps. In the first step, the training encoder network learns a novel representation of compounds and proteins. To this end, we combine convolutional layers and long-short-term memory layers so that the occurrence patterns of local substructures through a protein and a compound sequence are learned. Also, to encode the interaction strength of the protein and compound substructures, we propose a two-sided attention mechanism. In the second phase, to deal with the different distributions of the training and test domains, a feature encoder network is learned for the test domain by utilizing an adversarial domain adaptation approach. In the third phase, the learned test encoder network is applied to new compound-protein pairs to predict their binding affinity. RESULTS To evaluate the proposed approach, we applied it to KIBA, Davis and BindingDB datasets. The results show that the proposed method learns a more reliable model for the test domain in more challenging situations. AVAILABILITY AND IMPLEMENTATION https://github.com/LBBSoft/DeepCDA.
Affiliation(s)
- Karim Abbasi
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran 1417614411, Iran
| | - Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan 4513766731, Iran
| | - Antti Poso
- School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio 80100, Finland
| | - Massoud Amanlou
- Department of Medicinal Chemistry, Drug Design and Development Research Center, Tehran University of Medical Sciences, Tehran 1416753955, Iran
| | - Jahan B Ghasemi
- Chemistry Department, Faculty of Sciences, University of Tehran, Tehran 1417614418, Iran
| | - Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran 1417614411, Iran
| |
|
156
|
Majumdar S, Nandi SK, Ghosal S, Ghosh B, Mallik W, Roy ND, Biswas A, Mukherjee S, Pal S, Bhattacharyya N. Deep Learning-Based Potential Ligand Prediction Framework for COVID-19 with Drug-Target Interaction Model. Cognit Comput 2021:1-13. [PMID: 33552306 PMCID: PMC7852055 DOI: 10.1007/s12559-021-09840-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/18/2020] [Accepted: 01/15/2021] [Indexed: 11/11/2022]
Abstract
To fight the present COVID-19 pandemic, medication with drugs and vaccines is essential in addition to ventilation support. In this paper, we present a list of ligands that are expected to have the highest binding affinity with the S-glycoprotein of 2019-nCoV and thus can be used to make a drug for the novel coronavirus. Here, we implemented an architecture using 1D convolutional networks to predict drug-target interaction (DTI) values. The network was trained on the KIBA (Kinase Inhibitor Bioactivity) dataset. With this network, we predicted the KIBA scores (which give a measure of binding affinity) of a list of ligands against the S-glycoprotein of 2019-nCoV. Based on these KIBA scores, we propose a list of ligands (the 33 top ligands based on best interactions) that have a high binding affinity with the S-glycoprotein of 2019-nCoV and thus can be used for the formulation of drugs.
Affiliation(s)
- Shatadru Majumdar
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Soumik Kumar Nandi
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Shuvam Ghosal
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Bavrabi Ghosh
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Writam Mallik
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Nilanjana Dutta Roy
- Department of Computer Science and Engineering, Institute of Engineering and Management, Kolkata, India
| | - Arindam Biswas
- Department of Information Technology, Indian Institute of Engineering Science and Technology, Shibpur, India
| | - Subhankar Mukherjee
- Agri and Environmental Electronics (AEE), Centre for Development of Advanced Computing, Kolkata, India
| | - Souvik Pal
- Agri and Environmental Electronics (AEE), Centre for Development of Advanced Computing, Kolkata, India
| | - Nabarun Bhattacharyya
- Agri and Environmental Electronics (AEE), Centre for Development of Advanced Computing, Kolkata, India
| |
|
157
|
Pan Y, Chen Z, Qi F, Liu J. Identification of drug compounds for keloids and hypertrophic scars: drug discovery based on text mining and DeepPurpose. ANNALS OF TRANSLATIONAL MEDICINE 2021; 9:347. [PMID: 33708974 PMCID: PMC7944324 DOI: 10.21037/atm-21-218] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
Abstract
Background Keloids (KL) and hypertrophic scars (HS) are forms of abnormal cutaneous scarring characterized by excessive deposition of extracellular matrix and fibroblast proliferation. Currently, the efficacy of drug therapies for KL and HS is limited. The present study aimed to investigate new drug therapies for KL and HS by using computational methods. Methods Text mining and GeneCodis were used to mine genes closely related to KL and HS. Protein-protein interaction analysis was performed using Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) and Cytoscape. The selection of drugs targeting the genes closely related to KL and HS was carried out using Pharmaprojects. Drug-target interaction prediction was performed using DeepPurpose, through which candidate drugs with the highest predicted binding affinity were finally obtained. Results Our analysis using text mining identified 69 KL- and HS-related genes. Gene enrichment analysis generated 25 genes, representing 7 pathways and 130 targeting drugs. DeepPurpose recommended 14 drugs as the final drug list, including 2 phosphatidylinositol-4,5-bisphosphate 3-kinase (PI3K) inhibitors, 10 prostaglandin-endoperoxide synthase 2 (PTGS2) inhibitors and 2 vascular endothelial growth factor A (VEGFA) antagonists. Conclusions Drug discovery using in silico text mining and DeepPurpose may be a powerful and effective way to identify drugs targeting the genes related to KL and HS.
Affiliation(s)
- Yuyan Pan
- Department of Plastic Surgery, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Zhiwei Chen
- Big Data and Artificial Intelligence Center, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Fazhi Qi
- Department of Plastic Surgery, Zhongshan Hospital, Fudan University, Shanghai, China
| | - Jiaqi Liu
- Department of Plastic Surgery, Zhongshan Hospital, Fudan University, Shanghai, China; Artificial Intelligence Center for Plastic Surgery and Cutaneous Soft Tissue Cancers, Zhongshan Hospital, Fudan University, Shanghai, China
| |
|
158
|
Shen C, Luo J, Ouyang W, Ding P, Chen X. IDDkin: Network-based influence deep diffusion model for enhancing prediction of kinase inhibitors. Bioinformatics 2020; 36:5481-5491. [PMID: 33367525 DOI: 10.1093/bioinformatics/btaa1058] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/09/2020] [Revised: 11/09/2020] [Accepted: 12/10/2020] [Indexed: 01/01/2023] Open
Abstract
MOTIVATION Protein kinases have been the focus of drug discovery research for many years because they play a causal role in many human diseases. Understanding the binding profile of kinase inhibitors is a prerequisite for drug discovery, and traditional methods of predicting kinase inhibitors are time-consuming and inefficient. Calculation-based predictive methods provide a relatively low-cost and high-efficiency approach to the rapid development and effective understanding of the binding profile of kinase inhibitors. In particular, the continuous improvement of network pharmacology methods provides unprecedented opportunities for drug discovery: network-based computational methods can be employed to aggregate the effective information from heterogeneous sources, and they have become a new way of predicting the binding profile of kinase inhibitors. RESULTS In this study, we proposed a network-based influence deep diffusion model, named IDDkin, for enhancing the prediction of kinase inhibitors. IDDkin uses deep graph convolutional networks, graph attention networks and adaptive weighting methods to diffuse the effective information of heterogeneous networks. The updated kinase and compound representations are used to predict potential compound-kinase pairs. The experimental results show that the performance of IDDkin is superior to the comparison methods, including the state-of-the-art kinase inhibitor prediction method and the classic model widely used in relationship prediction. In experiments conducted to verify its generalizability and in case studies, the IDDkin model also shows excellent performance. All of these results demonstrate the powerful predictive ability of the IDDkin model in the field of kinase inhibitors. AVAILABILITY AND IMPLEMENTATION Source code and data can be downloaded from https://github.com/CS-BIO/IDDkin. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cong Shen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, China
| | - Wenjue Ouyang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, China
| | - Pingjian Ding
- School of Computer Science, University of South China, Hengyang, 421001, China
| | - Xiangtao Chen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410083, China
| |
|
159
|
Karimi M, Wu D, Wang Z, Shen Y. Explainable Deep Relational Networks for Predicting Compound-Protein Affinities and Contacts. J Chem Inf Model 2020; 61:46-66. [PMID: 33347301 DOI: 10.1021/acs.jcim.0c00866] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Indexed: 12/21/2022]
Abstract
Predicting compound-protein affinity is beneficial for accelerating drug discovery. Doing so without the often-unavailable structure data is gaining interest. However, recent progress in structure-free affinity prediction, made by machine learning, focuses on accuracy but leaves much to be desired for interpretability. Defining intermolecular contacts underlying affinities as a vehicle for interpretability, our large-scale interpretability assessment finds previously used attention mechanisms inadequate. We thus formulate a hierarchical multiobjective learning problem, where predicted contacts form the basis for predicted affinities. We solve the problem by embedding protein sequences (by hierarchical recurrent neural networks) and compound graphs (by graph neural networks) with joint attentions between protein residues and compound atoms. We further introduce three methodological advances to enhance interpretability: (1) structure-aware regularization of attentions using protein sequence-predicted solvent exposure and residue-residue contact maps; (2) supervision of attentions using known intermolecular contacts in training data; and (3) an intrinsically explainable architecture where atomic-level contacts or "relations" lead to molecular-level affinity prediction. The first two and all three advances result in DeepAffinity+ and DeepRelations, respectively. Our methods show generalizability in affinity prediction for molecules that are new and dissimilar to training examples. Moreover, they show superior interpretability compared to state-of-the-art interpretable methods: with similar or better affinity prediction, they boost the AUPRC of contact prediction by around 33-, 35-, 10-, and 9-fold for the default test, new-compound, new-protein, and both-new sets, respectively. We further demonstrate their potential utilities in contact-assisted docking, structure-free binding site prediction, and structure-activity relationship studies without docking.
Our study represents the first model development and systematic model assessment dedicated to interpretable machine learning for structure-free compound-protein affinity prediction.
Affiliation(s)
- Mostafa Karimi
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843, United States; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, Texas 77840, United States
| | - Di Wu
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843, United States
| | - Zhangyang Wang
- Department of Computer Science and Engineering, Texas A&M University, College Station, Texas 77843, United States; Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas 78712, United States
| | - Yang Shen
- Department of Electrical and Computer Engineering, Texas A&M University, College Station, Texas 77843, United States; TEES-AgriLife Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, Texas 77840, United States
| |
|
160
|
Gao D, Chen Q, Zeng Y, Jiang M, Zhang Y. Applications of Machine Learning in Drug Target Discovery. Curr Drug Metab 2020; 21:790-803. [PMID: 32723266 DOI: 10.2174/1567201817999200728142023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/19/2020] [Revised: 03/12/2020] [Accepted: 05/13/2020] [Indexed: 12/15/2022]
Abstract
Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.
Affiliation(s)
- Dongrui Gao
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Qingyuan Chen
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Yuanqi Zeng
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| | - Meng Jiang
- School of Mechanical Automotive Engineering, Nanyang Institute of Technology, Nanyang 473000, China
| | - Yongqing Zhang
- School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
| |
|
161
|
Özçelik R, Öztürk H, Özgür A, Ozkirimli E. ChemBoost: A Chemical Language Based Approach for Protein - Ligand Binding Affinity Prediction. Mol Inform 2020; 40:e2000212. [PMID: 33225594 DOI: 10.1002/minf.202000212] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/15/2020] [Accepted: 11/20/2020] [Indexed: 11/07/2022]
Abstract
Identification of high affinity drug-target interactions is a major research question in drug discovery. Proteins are generally represented by their structures or sequences. However, structures are available only for a small subset of biomolecules and sequence similarity is not always correlated with functional similarity. We propose ChemBoost, a chemical language based approach for affinity prediction using SMILES syntax. We hypothesize that SMILES is a codified language and ligands are documents composed of chemical words. These documents can be used to learn chemical word vectors that represent words in similar contexts with similar vectors. In ChemBoost, the ligands are represented via chemical word embeddings, while the proteins are represented through sequence-based features and/or chemical words of their ligands. Our aim is to process the patterns in SMILES as a language to predict protein-ligand affinity, even when we cannot infer the function from the sequence. We used eXtreme Gradient Boosting to predict protein-ligand affinities in KIBA and BindingDB data sets. ChemBoost was able to predict drug-target binding affinity as well as or better than state-of-the-art machine learning systems. When powered with ligand-centric representations, ChemBoost was more robust to the changes in protein sequence similarity and successfully captured the interactions between a protein and a ligand, even if the protein has low sequence similarity to the known targets of the ligand.
Affiliation(s)
- Rıza Özçelik
- Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
| | - Hakime Öztürk
- Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
| | - Arzucan Özgür
- Department of Computer Engineering, Boğaziçi University, Istanbul, Turkey
| | - Elif Ozkirimli
- Department of Chemical Engineering, Boğaziçi University, Istanbul, Turkey; Data and Analytics Chapter, Pharma International Informatics, F. Hoffmann-La Roche AG, Switzerland
| |
|
162
|
Agyemang B, Wu WP, Kpiebaareh MY, Lei Z, Nanor E, Chen L. Multi-view self-attention for interpretable drug–target interaction prediction. J Biomed Inform 2020; 110:103547. [DOI: 10.1016/j.jbi.2020.103547] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 04/24/2020] [Revised: 08/21/2020] [Accepted: 08/24/2020] [Indexed: 01/08/2023]
|
163
|
Abdel-Basset M, Hawash H, Elhoseny M, Chakrabortty RK, Ryan M. DeepH-DTA: Deep Learning for Predicting Drug-Target Interactions: A Case Study of COVID-19 Drug Repurposing. IEEE ACCESS : PRACTICAL INNOVATIONS, OPEN SOLUTIONS 2020; 8:170433-170451. [PMID: 34786289 PMCID: PMC8545313 DOI: 10.1109/access.2020.3024238] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/19/2020] [Accepted: 09/11/2020] [Indexed: 05/04/2023]
Abstract
The rapid spread of novel coronavirus pneumonia (COVID-19) has led to a dramatically increased mortality rate worldwide. Despite many efforts, the rapid development of an effective vaccine for this novel virus will take considerable time and relies on the identification of drug-target (DT) interactions utilizing commercially available medication to identify potential inhibitors. Motivated by this, we propose a new framework, called DeepH-DTA, for predicting DT binding affinities for heterogeneous drugs. We propose a heterogeneous graph attention (HGAT) model to learn topological information of compound molecules and bidirectional ConvLSTM layers for modeling spatio-sequential information in simplified molecular-input line-entry system (SMILES) sequences of drug data. For protein sequences, we propose a squeezed-excited dense convolutional network for learning hidden representations within amino acid sequences, while utilizing advanced embedding techniques for encoding both kinds of input sequences. The performance of DeepH-DTA is evaluated through extensive experiments against cutting-edge approaches utilising two public datasets (Davis and KIBA), which comprise eclectic samples of the kinase protein family and the pertinent inhibitors. DeepH-DTA attains the highest Concordance Index (CI) of 0.924 and 0.927 and also achieves a mean square error (MSE) of 0.195 and 0.111 on the Davis and KIBA datasets, respectively. Moreover, a study using FDA-approved drugs from the Drug Bank database is performed using DeepH-DTA to predict the affinity scores of drugs against SARS-CoV-2 amino acid sequences, and the results show that the model can predict some of the SARS-CoV-2 inhibitors that have been recently approved in many clinical studies.
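The two metrics reported in this abstract, the Concordance Index (CI) and the mean square error (MSE), can be computed as in the sketch below. It uses the usual pairwise definition of CI (the fraction of comparable pairs whose predicted order matches the true order, with prediction ties counted as one half); the toy affinity values are made up for illustration and are not taken from the paper.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs that the predictions rank correctly.

    A pair is comparable when its true affinities differ. A correctly
    ordered pair scores 1, a tied prediction scores 0.5, otherwise 0.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # pairs with equal true affinity are skipped
            comparable += 1
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5
    return concordant / comparable if comparable else float("nan")


# Made-up affinities for four drug-target pairs.
y_true = [5.0, 6.0, 7.0, 8.0]
y_pred = [5.5, 5.9, 7.2, 7.1]

ci = concordance_index(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
print(f"CI = {ci:.3f}, MSE = {mse:.4f}")
```

CI measures only ranking quality, so a model can reach a high CI while still having a large MSE, which is why affinity-prediction studies typically report both.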
Affiliation(s)
- Hossam Hawash
- Faculty of Computers and Informatics, Zagazig University, Zagazig 44519, Egypt
- Mohamed Elhoseny
- Department of Computer Science, College of Computer Information Technology, American University in the Emirates, Dubai 503000, United Arab Emirates
- Faculty of Computers and Information, Mansoura University, Mansoura 35516, Egypt
- Ripon K Chakrabortty
- Capability Systems Centre, School of Engineering and IT, University of New South Wales Canberra, Canberra ACT 2612, Australia
- Michael Ryan
- Capability Systems Centre, School of Engineering and IT, University of New South Wales Canberra, Canberra ACT 2612, Australia
|
164
|
Lennox M, Robertson N, Devereux B. Expanding the Vocabulary of a Protein: Application of Subword Algorithms to Protein Sequence Modelling. Annu Int Conf IEEE Eng Med Biol Soc 2020; 2020:2361-2367. [PMID: 33018481 DOI: 10.1109/embc44109.2020.9176380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
Deep learning has proven to be a useful tool for modelling protein properties. However, given the variability in the length of proteins, it can be difficult to summarise the sequence of amino acids effectively. In many cases, as a result of using fixed-length representations, information about long proteins can be lost through truncation, or model training can be slow due to the use of excessive padding. In this work, we aim to overcome these problems by expanding upon the original vocabulary used to represent the protein sequence. To this end, we utilise two prominent subword algorithms that have been previously used to reach state-of-the-art results in various Natural Language Processing tasks. The algorithms are used to encode the original protein sequence into a set of subsequences before they are analysed by a Doc2Vec model. The pre-trained encodings produced by each algorithm are tested on a variety of downstream tasks: four protein property prediction tasks (plasma membrane localization, thermostability, peak absorption wavelength, enantioselectivity) as well as drug-target affinity prediction tasks over two datasets. Our results significantly improve on the state-of-the-art for these tasks, demonstrating the benefits of using subword compression algorithms for modelling proteins.
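The abstract does not name the two subword algorithms it uses. As an illustration of the general idea only, the sketch below assumes byte-pair encoding (BPE), a widely used subword method, and applies it to toy amino acid sequences. The sequences and the number of merges are hypothetical.

```python
from collections import Counter


def apply_merge(tokens, pair):
    """Replace each left-to-right occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged


def learn_bpe_merges(sequences, num_merges):
    """Learn up to `num_merges` merges, each joining the most frequent adjacent pair."""
    corpus = [list(seq) for seq in sequences]  # start from single residues
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for tokens in corpus:
            pair_counts.update(zip(tokens, tokens[1:]))
        if not pair_counts:
            break
        best = max(pair_counts, key=pair_counts.get)
        merges.append(best)
        corpus = [apply_merge(tokens, best) for tokens in corpus]
    return merges


def encode(sequence, merges):
    """Split a sequence into residues, then apply the learned merges in order."""
    tokens = list(sequence)
    for pair in merges:
        tokens = apply_merge(tokens, pair)
    return tokens


# Toy sequences, for illustration only.
toy_proteins = ["MKTAYIAKQR", "MKTLLVAKQR", "GSMKTAYAKQ"]
merges = learn_bpe_merges(toy_proteins, num_merges=4)
tokens = encode("MKTAYIAKQR", merges)
```

Because frequent residue runs collapse into single tokens, the encoded sequence is shorter than the residue-level one, which is the property the abstract relies on to reduce truncation and padding.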
|
165
|
Wang DD, Zhu M, Yan H. Computationally predicting binding affinity in protein-ligand complexes: free energy-based simulations and machine learning-based scoring functions. Brief Bioinform 2020; 22:5860693. [PMID: 32591817 DOI: 10.1093/bib/bbaa107] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/07/2020] [Revised: 04/20/2020] [Accepted: 05/05/2020] [Indexed: 12/18/2022] Open
Abstract
Accurately predicting protein-ligand binding affinities can substantially facilitate the drug discovery process, but it remains a difficult problem. To tackle the challenge, many computational methods have been proposed. Among these methods, free energy-based simulations and machine learning-based scoring functions can potentially provide accurate predictions. In this paper, we review these two classes of methods, following a number of thermodynamic cycles for the free energy-based simulations and a feature-representation taxonomy for the machine learning-based scoring functions. More recent deep learning-based predictions, where a hierarchy of feature representations is generally extracted, are also reviewed. Strengths and weaknesses of the two classes of methods, coupled with future directions for improvements, are comparatively discussed.
Affiliation(s)
- Debby D Wang
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology
- Mengxu Zhu
- Department of Electrical Engineering, City University of Hong Kong
- Hong Yan
- College of Science and Engineering, City University of Hong Kong
|
166
|
Chen L, Tan X, Wang D, Zhong F, Liu X, Yang T, Luo X, Chen K, Jiang H, Zheng M. TransformerCPI: improving compound–protein interaction prediction by sequence-based deep learning with self-attention mechanism and label reversal experiments. Bioinformatics 2020; 36:4406-4414. [DOI: 10.1093/bioinformatics/btaa524] [Citation(s) in RCA: 79] [Impact Index Per Article: 19.8] [Received: 02/26/2020] [Revised: 04/13/2020] [Accepted: 05/14/2020] [Indexed: 12/13/2022] Open
Abstract
Motivation
Identifying compound–protein interaction (CPI) is a crucial task in drug discovery and chemogenomics studies, and proteins without three-dimensional structure account for a large part of potential biological targets, which requires developing methods using only protein sequence information to predict CPI. However, sequence-based CPI models may face some specific pitfalls, including using inappropriate datasets, hidden ligand bias and splitting datasets inappropriately, resulting in overestimation of their prediction performance.
Results
To address these issues, we here constructed new datasets specific for CPI prediction, proposed a novel transformer neural network named TransformerCPI, and introduced a more rigorous label reversal experiment to test whether a model learns true interaction features. TransformerCPI achieved much improved performance on the new experiments, and it can be deconvolved to highlight important interacting regions of protein sequences and compound atoms, which may provide chemical biology studies with useful guidance for further ligand structural optimization.
Availability and implementation
https://github.com/lifanchen-simm/transformerCPI.
Affiliation(s)
- Lifan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoqin Tan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Dingyan Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Feisheng Zhong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaohong Liu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 200031, China
- Tianbiao Yang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaomin Luo
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 200031, China
- Hualiang Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Shanghai Institute for Advanced Immunochemical Studies, School of Life Science and Technology, ShanghaiTech University, Shanghai 200031, China
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
|
167
|
Hassan-Harrirou H, Zhang C, Lemmin T. RosENet: Improving Binding Affinity Prediction by Leveraging Molecular Mechanics Energies with an Ensemble of 3D Convolutional Neural Networks. J Chem Inf Model 2020; 60:2791-2802. [DOI: 10.1021/acs.jcim.0c00075] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Indexed: 11/29/2022]
Affiliation(s)
- Hussein Hassan-Harrirou
- DS3Lab, System Group, Department of Computer Sciences, ETH Zurich, CH-8092 Zurich, Switzerland
- Ce Zhang
- DS3Lab, System Group, Department of Computer Sciences, ETH Zurich, CH-8092 Zurich, Switzerland
- Thomas Lemmin
- DS3Lab, System Group, Department of Computer Sciences, ETH Zurich, CH-8092 Zurich, Switzerland
- Institute of Medical Virology, University of Zurich (UZH), CH-8057 Zurich, Switzerland
|
168
|
Wang X, Liu Y, Lu F, Li H, Gao P, Wei D. Dipeptide Frequency of Word Frequency and Graph Convolutional Networks for DTA Prediction. Front Bioeng Biotechnol 2020; 8:267. [PMID: 32318557 PMCID: PMC7147459 DOI: 10.3389/fbioe.2020.00267] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/30/2020] [Accepted: 03/13/2020] [Indexed: 11/13/2022] Open
Abstract
Deep learning is an effective method to capture drug-target binding affinity, but low accuracy is still an obstacle to be overcome. Thus, we propose a novel predictor for drug-target binding affinity based on dipeptide frequency of word frequency encoding and a hybrid graph convolutional network. Word frequency characteristics of natural language are used to improve the frequency characteristics of peptides to express target proteins. For each drug molecule, five different features of drug atoms and the atomic bond relationships are expressed as graphs. The obtained protein features and graph structure are used as the input of a convolution neural network and the input of a graph convolution neural network, respectively. A prediction model is established to predict the drug affinity by calculating the hidden relationship. In the KIBA data set test experiment, the consistency coefficient of the model is 0.901, which is 0.01 higher than the existing model, and the MSE (mean square error) of the model is 0.126, which is 5% lower than the existing model. In the Davis data set test experiment, the consistency coefficient of the model is 0.895, which is 0.006 higher than the existing model, and the MSE of the model is 0.220, which is 4% lower than the existing model. These results show that our proposed method can not only predict the affinity better than those existing models, but also outperform unitary deep learning approaches.
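The protein encoding in this abstract builds on dipeptide frequencies. The sketch below shows only the plain dipeptide frequency vector, as a baseline under stated assumptions; the word-frequency adjustment described by the authors is not reproduced here, and the example sequence is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All ordered pairs of the 20 standard residues: 400 dipeptides.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def dipeptide_frequencies(sequence):
    """Return a 400-dimensional vector of relative dipeptide frequencies."""
    counts = dict.fromkeys(DIPEPTIDES, 0)
    total = 0
    for i in range(len(sequence) - 1):
        pair = sequence[i:i + 2]
        if pair in counts:  # skip pairs containing non-standard residues
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


# Toy sequence, for illustration only.
features = dipeptide_frequencies("MKTAYIAKQRQISFVK")
```

The vector has a fixed length regardless of protein length, so it can be fed to a network without padding or truncation.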
Affiliation(s)
- Xianfang Wang
- School of Computer Science and Technology, Henan Institute of Technology, Xinxiang, China
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Yifeng Liu
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Fan Lu
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Hongfei Li
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Peng Gao
- School of Computer and Information Engineering, Henan Normal University, Xinxiang, China
- Dongqing Wei
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai, China
|
169
|
Beck BR, Shin B, Choi Y, Park S, Kang K. Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model. Comput Struct Biotechnol J 2020; 18:784-790. [PMID: 32280433 PMCID: PMC7118541 DOI: 10.1016/j.csbj.2020.03.025] [Citation(s) in RCA: 384] [Impact Index Per Article: 96.0] [Received: 02/22/2020] [Revised: 03/23/2020] [Accepted: 03/25/2020] [Indexed: 12/15/2022] Open
Abstract
The MT-DTI deep learning model was used to identify potent drugs for SARS-CoV-2. Atazanavir, remdesivir, and Kaletra were predicted to inhibit SARS-CoV-2. Rapamycin and tiotropium bromide may also be effective for SARS-CoV-2.
The infection of a novel coronavirus found in Wuhan of China (SARS-CoV-2) is rapidly spreading, and the incidence rate is increasing worldwide. Due to the lack of effective treatment options for SARS-CoV-2, various strategies are being tested in China, including drug repurposing. In this study, we used our pre-trained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to identify commercially available drugs that could act on viral proteins of SARS-CoV-2. The result showed that atazanavir, an antiretroviral medication used to treat and prevent the human immunodeficiency virus (HIV), is the best chemical compound, showing an inhibitory potency with Kd of 94.94 nM against the SARS-CoV-2 3C-like proteinase, followed by remdesivir (113.13 nM), efavirenz (199.17 nM), ritonavir (204.05 nM), and dolutegravir (336.91 nM). Interestingly, lopinavir, ritonavir, and darunavir are all designed to target viral proteinases. However, in our prediction, they may also bind to the replication complex components of SARS-CoV-2 with an inhibitory potency with Kd < 1000 nM. In addition, we also found that several antiviral agents, such as Kaletra (lopinavir/ritonavir), could be used for the treatment of SARS-CoV-2. Overall, we suggest that the list of antiviral drugs identified by the MT-DTI model should be considered, when establishing effective treatment strategies for SARS-CoV-2.
Affiliation(s)
- Bonggun Shin
- Deargen, Inc., Daejeon, Republic of Korea
- Department of Computer Science, Emory University, Atlanta, GA, United States
- Keunsoo Kang
- Department of Microbiology, College of Science & Technology, Dankook University, Cheonan, Republic of Korea
|
170
|
Ellingson SR, Davis B, Allen J. Machine learning and ligand binding predictions: A review of data, methods, and obstacles. Biochim Biophys Acta Gen Subj 2020; 1864:129545. [PMID: 32057823 DOI: 10.1016/j.bbagen.2020.129545] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/03/2019] [Revised: 12/21/2019] [Accepted: 01/30/2020] [Indexed: 10/25/2022]
Abstract
Computational prediction of ligand binding is a difficult problem, with more accurate methods being extremely computationally expensive. The use of machine learning for drug binding predictions could possibly leverage the use of biomedical big data in exchange for time-intensive simulations. This paper reviews current trends in the use of machine learning for drug binding predictions, data sources to develop machine learning algorithms, and potential problems that may lead to overfitting and ungeneralizable models. A few popular datasets that can be used to develop virtual high-throughput screening models are characterized using spatial statistics to quantify potential biases. We can see from evaluating some common benchmarks that good performance correlates with models with high-predicted bias scores and that models with low bias scores do not have much predictive power. A better understanding of the limits of available data sources and how to fix them will lead to more generalizable models that will lead to novel drug discovery.
Affiliation(s)
- Sally R Ellingson
- College of Medicine, Division of Biomedical Informatics, University of Kentucky, Lexington, KY, United States of America
- Markey Cancer Center, Lexington, KY, United States of America
- Brian Davis
- Markey Cancer Center, Lexington, KY, United States of America
- Jonathan Allen
- Lawrence Livermore National Laboratory, Livermore, CA, United States of America
|
171
|
Bagherian M, Sabeti E, Wang K, Sartor MA, Nikolovska-Coleska Z, Najarian K. Machine learning approaches and databases for prediction of drug-target interaction: a survey paper. Brief Bioinform 2020; 22:247-269. [PMID: 31950972 PMCID: PMC7820849 DOI: 10.1093/bib/bbz157] [Citation(s) in RCA: 148] [Impact Index Per Article: 37.0] [Received: 09/04/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 12/12/2022] Open
Abstract
The task of predicting the interactions between drugs and targets plays a key role in the process of drug discovery. There is a need to develop novel and efficient prediction approaches in order to avoid determining drug–target interactions (DTIs) through costly, laborious, and not-always-deterministic experiments alone. These approaches should be capable of identifying the potential DTIs in a timely manner. In this article, we describe the data required for the task of DTI prediction followed by a comprehensive catalog consisting of machine learning methods and databases, which have been proposed and utilized to predict DTIs. The advantages and disadvantages of each set of methods are also briefly discussed. Lastly, the challenges one may face in prediction of DTI using machine learning approaches are highlighted and we conclude by shedding some light on important future research directions.
Affiliation(s)
- Maryam Bagherian
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, 48109, USA
- Elyas Sabeti
- Michigan Institute for Data Science, University of Michigan, Ann Arbor, MI, 48109, USA
- Kai Wang
- Department of Biostatistics, School of Public Health, University of Michigan, Ann Arbor, MI, 48109, USA
- Maureen A Sartor
- Department of Pathology, University of Michigan, Ann Arbor, MI, 48109, USA
- Kayvan Najarian
- Department of Electrical Engineering and Computer Science, College of Engineering, University of Michigan, Ann Arbor, MI, 48109, USA
|
172
|
Zhao L, Wang J, Pang L, Liu Y, Zhang J. GANsDTA: Predicting Drug-Target Binding Affinity Using GANs. Front Genet 2020; 10:1243. [PMID: 31993067 PMCID: PMC6962343 DOI: 10.3389/fgene.2019.01243] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 08/23/2019] [Accepted: 11/11/2019] [Indexed: 01/09/2023] Open
Abstract
The computational prediction of interactions between drugs and targets is a standing challenge in drug discovery. State-of-the-art methods for drug-target interaction prediction are primarily based on supervised machine learning with known label information. However, in biomedicine, obtaining labeled training data is an expensive and laborious process. This paper proposes a semi-supervised generative adversarial networks (GANs)-based method to predict binding affinity. Our method comprises two parts: two GANs for feature extraction and a regression network for prediction. The semi-supervised mechanism allows our model to learn protein and drug features from both labeled and unlabeled data. We evaluate the performance of our method using multiple public datasets. Experimental results demonstrate that our method achieves competitive performance while utilizing freely available unlabeled data. Our results suggest that utilizing such unlabeled data can considerably help improve performance in various biomedical relation extraction processes, for example, drug-target interaction and protein-protein interaction, particularly when only limited labeled data are available in such tasks. To the best of our knowledge, this is the first semi-supervised GANs-based method to predict binding affinity.
Affiliation(s)
- Lingling Zhao
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Junjie Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Long Pang
- Institute of Space Environment and Material Science, Harbin Institute of Technology, Harbin, China
- Yang Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- Jun Zhang
- Department of Rehabilitation, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
|
173
|
Thafar M, Raies AB, Albaradei S, Essack M, Bajic VB. Comparison Study of Computational Prediction Tools for Drug-Target Binding Affinities. Front Chem 2019; 7:782. [PMID: 31824921 PMCID: PMC6879652 DOI: 10.3389/fchem.2019.00782] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 08/09/2019] [Accepted: 10/30/2019] [Indexed: 12/30/2022] Open
Abstract
Drug development is generally arduous and costly, and success rates are low. Thus, the identification of drug-target interactions (DTIs) has become a crucial step in early stages of drug discovery. Consequently, computational approaches capable of identifying potential DTIs with a minimum error rate are increasingly being pursued. These computational approaches aim to narrow down the search space for novel DTIs and shed light on drug functioning context. Most methods developed to date use binary classification to predict whether the interaction between a drug and its target exists or not. However, it is more informative but also more challenging to predict the strength of the binding between a drug and its target. If that strength is not sufficiently strong, such a DTI may not be useful. Therefore, the methods developed to predict drug-target binding affinities (DTBA) are of great value. In this study, we provide a comprehensive overview of the existing methods that predict DTBA. We focus on the methods developed using artificial intelligence (AI), machine learning (ML), and deep learning (DL) approaches, as well as related benchmark datasets and databases. Furthermore, guidance and recommendations are provided that cover the gaps and directions of the upcoming work in this research area. To the best of our knowledge, this is the first comprehensive comparison analysis of tools focused on DTBA with reference to AI/ML/DL.
Affiliation(s)
- Maha Thafar
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- College of Computers and Information Technology, Taif University, Taif, Saudi Arabia
- Arwa Bin Raies
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Somayah Albaradei
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, Saudi Arabia
- Magbubah Essack
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Vladimir B. Bajic
- Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division, Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
|
174
|
Elmadani M, Khan S, Tenhunen O, Magga J, Aittokallio T, Wennerberg K, Kerkelä R. Novel Screening Method Identifies PI3Kα, mTOR, and IGF1R as Key Kinases Regulating Cardiomyocyte Survival. J Am Heart Assoc 2019; 8:e013018. [PMID: 31617439 PMCID: PMC6898841 DOI: 10.1161/jaha.119.013018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/13/2022]
Abstract
Background
Small molecule kinase inhibitors (KIs) are a class of agents currently used for treatment of various cancers. Unfortunately, treatment of cancer patients with some of the KIs is associated with cardiotoxicity, and there is an unmet need for methods to predict their cardiotoxicity. Here, we utilized a novel computational method to identify protein kinases crucial for cardiomyocyte viability.
Methods and Results
One hundred forty KIs were screened for their toxicity in cultured neonatal cardiomyocytes. The kinase targets of KIs were determined based on integrated data from binding assays. The key kinases mediating the toxicity of KIs to cardiomyocytes were identified by using a novel machine learning method for target deconvolution that combines the information from the toxicity screen and from the kinase profiling assays. The top kinases identified by the model were phosphoinositide 3‐kinase catalytic subunit alpha, mammalian target of rapamycin, and insulin‐like growth factor 1 receptor. Knockdown of the individual kinases in cardiomyocytes confirmed their role in regulating cardiomyocyte viability.
Conclusions
Combining the data from analysis of KI toxicity on cardiomyocytes and KI target profiling provides a novel method to predict cardiomyocyte toxicity of KIs.
Affiliation(s)
- Manar Elmadani
- Research Unit of Biomedicine, Department of Pharmacology and Toxicology, University of Oulu, Finland
- Suleiman Khan
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland
- Olli Tenhunen
- Department of Oncology and Radiotherapy, Oulu University Hospital, University of Oulu, Finland
- Johanna Magga
- Research Unit of Biomedicine, Department of Pharmacology and Toxicology, University of Oulu, Finland
- Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland
- Krister Wennerberg
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland
- Risto Kerkelä
- Research Unit of Biomedicine, Department of Pharmacology and Toxicology, University of Oulu, Finland
- Medical Research Center Oulu, Oulu University Hospital and University of Oulu, Finland
|
175
|
Lee I, Keum J, Nam H. DeepConv-DTI: Prediction of drug-target interactions via deep learning with convolution on protein sequences. PLoS Comput Biol 2019; 15:e1007129. [PMID: 31199797 PMCID: PMC6594651 DOI: 10.1371/journal.pcbi.1007129] [Citation(s) in RCA: 220] [Impact Index Per Article: 44.0] [Received: 10/01/2018] [Revised: 06/26/2019] [Accepted: 05/24/2019] [Indexed: 12/04/2022] Open
Abstract
Identification of drug-target interactions (DTIs) plays a key role in drug discovery. The high cost and labor-intensive nature of in vitro and in vivo experiments have highlighted the importance of in silico-based DTI prediction approaches. In several computational models, conventional protein descriptors have been shown to be insufficiently informative to predict DTIs accurately. Thus, in this study, we propose a deep learning-based DTI prediction model capturing local residue patterns of proteins participating in DTIs. When we employ a convolutional neural network (CNN) on raw protein sequences, we perform convolution on amino acid subsequences of various lengths to capture local residue patterns of generalized protein classes. We train our model with large-scale DTI information and demonstrate the performance of the proposed model using an independent dataset that is not seen during the training phase. As a result, our model performs better than previous protein descriptor-based models. Also, our model performs better than the recently developed deep learning models for massive prediction of DTIs. By examining pooled convolution results, we confirmed that our model can detect binding sites of proteins for DTIs. In conclusion, our prediction model for detecting local residue patterns of target proteins successfully enriches the protein features of a raw protein sequence, yielding better prediction results than previous approaches. Our code is available at https://github.com/GIST-CSBL/DeepConv-DTI.
Affiliation(s)
- Ingoo Lee
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Buk-ku, Gwangju, Republic of Korea
- Jongsoo Keum
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Buk-ku, Gwangju, Republic of Korea
- Hojung Nam
- School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Buk-ku, Gwangju, Republic of Korea
|
176
|
Abstract
Motivation
The identification of novel drug-target (DT) interactions is a substantial part of the drug discovery process. Most of the computational methods that have been proposed to predict DT interactions have focused on binary classification, where the goal is to determine whether a DT pair interacts or not. However, protein-ligand interactions assume a continuum of binding strength values, also called binding affinity, and predicting this value still remains a challenge. The increase in the affinity data available in DT knowledge-bases allows the use of advanced learning techniques such as deep learning architectures in the prediction of binding affinities. In this study, we propose a deep-learning based model that uses only sequence information of both targets and drugs to predict DT interaction binding affinities. The few studies that focus on DT binding affinity prediction use either 3D structures of protein-ligand complexes or 2D features of compounds. One novel approach used in this work is the modeling of protein sequences and compound 1D representations with convolutional neural networks (CNNs).
Results
The results show that the proposed deep learning based model that uses the 1D representations of targets and drugs is an effective approach for drug target binding affinity prediction. The model in which high-level representations of a drug and a target are constructed via CNNs achieved the best Concordance Index (CI) performance in one of our larger benchmark datasets, outperforming the KronRLS algorithm and SimBoost, a state-of-the-art method for DT binding affinity prediction.
Availability and implementation
https://github.com/hkmztrk/DeepDTA.
Supplementary information
Supplementary data are available at Bioinformatics online.
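To make the idea of convolving over 1D representations concrete, the following is a minimal pure-Python sketch. It is not the authors' implementation (their code is at the repository named in the abstract). It assumes SMILES strings as the compound 1D representation, uses a truncated hypothetical SMILES vocabulary, and uses random, untrained weights, so only the data flow is meaningful: integer label encoding with padding, embedding, 1D convolution with ReLU, and global max pooling.

```python
import random

SMILES_CHARS = "CNO()=#123cno[]+-@H"  # hypothetical, truncated vocabulary
PROTEIN_CHARS = "ACDEFGHIKLMNPQRSTVWY"


def label_encode(text, alphabet, max_len):
    """Map characters to integers (0 = padding or unknown), then pad or truncate."""
    index = {ch: i + 1 for i, ch in enumerate(alphabet)}
    ids = [index.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))


def conv1d_global_max(ids, embedding, filters):
    """Embed ids, slide each filter along the sequence, apply ReLU, max-pool over positions."""
    x = [embedding[i] for i in ids]  # shape: (length, emb_dim)
    features = []
    for kernel in filters:  # kernel shape: (width, emb_dim)
        width = len(kernel)
        best = 0.0  # starting at 0 makes the running max equal to max(ReLU(activation))
        for start in range(len(x) - width + 1):
            act = sum(
                kernel[k][d] * x[start + k][d]
                for k in range(width)
                for d in range(len(kernel[k]))
            )
            best = max(best, act)
        features.append(best)
    return features


rng = random.Random(0)
EMB_DIM, N_FILTERS, WIDTH = 4, 3, 3  # illustrative sizes, not the published ones


def random_matrix(rows, cols):
    return [[rng.uniform(-1, 1) for _ in range(cols)] for _ in range(rows)]


drug_emb = random_matrix(len(SMILES_CHARS) + 1, EMB_DIM)
prot_emb = random_matrix(len(PROTEIN_CHARS) + 1, EMB_DIM)
drug_filters = [random_matrix(WIDTH, EMB_DIM) for _ in range(N_FILTERS)]
prot_filters = [random_matrix(WIDTH, EMB_DIM) for _ in range(N_FILTERS)]

drug_ids = label_encode("CC(=O)Nc1ccc(O)cc1", SMILES_CHARS, max_len=24)
prot_ids = label_encode("MKTAYIAKQRQISFVK", PROTEIN_CHARS, max_len=20)

# Concatenate the pooled drug and protein features into one joint vector.
joint = (conv1d_global_max(drug_ids, drug_emb, drug_filters)
         + conv1d_global_max(prot_ids, prot_emb, prot_filters))
```

In a trained model, a joint vector of this kind would be passed to a regression head that outputs the predicted affinity; that step is omitted here.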
Collapse
Affiliation(s)
- Hakime Öztürk
- Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Arzucan Özgür
- Department of Computer Engineering, Bogazici University, Istanbul, Turkey
| | - Elif Ozkirimli
- Department of Chemical Engineering, Bogazici University, Istanbul, Turkey
| |
|
177
|
Fukunishi Y, Yamashita Y, Mashimo T, Nakamura H. Prediction of Protein-compound Binding Energies from Known Activity Data: Docking-score-based Method and its Applications. Mol Inform 2018; 37:e1700120. [PMID: 29442436 PMCID: PMC6055825 DOI: 10.1002/minf.201700120] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2017] [Accepted: 01/22/2018] [Indexed: 12/18/2022]
Abstract
We used protein-compound docking simulations to develop a structure-based quantitative structure-activity relationship (QSAR) model. The prediction model used docking scores as descriptors. The binding free energy was approximated by a weighted average of docking scores for multiple proteins. This approximation was based on a pharmacophore model of receptor pockets and compounds. The weights of the docking scores were restricted to small values to avoid unrealistic weights by a regularization term. Additional outlier elimination improved the results. We applied this method to two groups of targets. The first target was the kinase family. The cross-validation results of 107 kinase proteins showed that the RMSE of predicted binding free energies was 1.1 kcal/mol. The second target was the matrix metalloproteinase (MMP) family, which has been difficult for docking programs. MMPs require metal-binding groups in their inhibitor structures in many cases. A quantum effect contributes to the metal-ligand interaction. Despite this difficulty, the present method worked well for the MMPs. This method showed that the RMSE of predicted binding free energies was 1.1 kcal/mol. In comparison, with the original docking method the RMSE was 1.7 kcal/mol. The results suggest that the present QSAR model should be applied to general target proteins.
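The idea of approximating binding free energy by a weighted sum of docking scores, with a regularization term keeping the weights small, can be sketched as follows. This is only an illustration: it assumes an L2 (ridge) penalty fitted by plain gradient descent, which the abstract does not specify, and every name and number below is hypothetical.

```python
import math


def predict(weights, scores):
    """Weighted sum of one compound's docking scores against several proteins."""
    return sum(w * s for w, s in zip(weights, scores))


def fit_weights(score_rows, energies, penalty, lr=0.001, steps=5000):
    """Minimise mean squared error + penalty * ||w||^2 by gradient descent.

    The L2 penalty is an assumed stand-in for the regularization term
    mentioned in the abstract.
    """
    n = len(score_rows)
    weights = [0.0] * len(score_rows[0])
    for _ in range(steps):
        grad = [2.0 * penalty * w for w in weights]
        for row, energy in zip(score_rows, energies):
            err = predict(weights, row) - energy
            for k, s in enumerate(row):
                grad[k] += 2.0 * err * s / n
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


def rmse(y_true, y_pred):
    """Root-mean-square error, in the same units as the inputs (e.g. kcal/mol)."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Hypothetical docking scores (rows: compounds, columns: proteins)
# and hypothetical measured binding free energies in kcal/mol.
score_rows = [
    [-7.0, -6.5, -5.0],
    [-8.2, -7.1, -6.0],
    [-6.1, -5.9, -4.8],
    [-9.0, -8.3, -7.2],
    [-7.5, -6.0, -5.5],
]
energies = [-8.1, -9.4, -7.2, -10.5, -8.3]

weights = fit_weights(score_rows, energies, penalty=1.0)
fitted = [predict(weights, row) for row in score_rows]
print(weights, rmse(energies, fitted))
```

Raising `penalty` shrinks the weights toward zero at the cost of a looser fit. In practice the RMSE would be estimated by cross-validation, as the abstract reports, rather than on the training data as in this toy run.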
Affiliation(s)
- Yoshifumi Fukunishi
- Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | - Yasunobu Yamashita
- Technology Research Association for Next-Generation Natural Products Chemistry, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan
| | - Tadaaki Mashimo
- Technology Research Association for Next-Generation Natural Products Chemistry, 2-3-26 Aomi, Koto-ku, Tokyo 135-0064, Japan
- IMSBIO Co., Ltd., Owl Tower, 4-21-1 Higashi-Ikebukuro, Toshima-ku, Tokyo 170-0013, Japan
| | - Haruki Nakamura
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
| |
|
178
|
Abstract
Phenotypic screens are increasingly utilized in drug discovery for multiple purposes such as lead and/or tool compound finding, and target discovery. Using potent and selective chemical tool compounds against well-defined targets in phenotypic screens can help elucidate biological processes modulating assay phenotypes. Unfortunately, the identification of such tools from large heterogeneous bioactivity databases is nontrivial, and published unselective compounds are repeatedly used as phenotypic tools. Here we describe a computational model, the compound-target tool score (TS), which is an evidence-based quantitative confidence metric that can be used to systematically rank tool compounds for targets. The identified selective and nonselective tool compounds have applications in phenotypic assays for target hypothesis validation as well as assay development.
|
179
|
Drug Target Commons: A Community Effort to Build a Consensus Knowledge Base for Drug-Target Interactions. Cell Chem Biol 2017; 25:224-229.e2. [PMID: 29276046 PMCID: PMC5814751 DOI: 10.1016/j.chembiol.2017.11.009] [Citation(s) in RCA: 74] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2017] [Revised: 10/03/2017] [Accepted: 11/20/2017] [Indexed: 11/23/2022]
Abstract
Knowledge of the full target space of bioactive substances, approved and investigational drugs as well as chemical probes, provides important insights into therapeutic potential and possible adverse effects. The existing compound-target bioactivity data resources are often incomparable due to non-standardized and heterogeneous assay types and variability in endpoint measurements. To extract higher value from the existing and future compound target-profiling data, we implemented an open-data web platform, named Drug Target Commons (DTC), which features tools for crowd-sourced compound-target bioactivity data annotation, standardization, curation, and intra-resource integration. We demonstrate the unique value of DTC with several examples related to both drug discovery and drug repurposing applications and invite researchers to join this community effort to increase the reuse and extension of compound bioactivity data. DTC is a crowd-sourcing-based web platform to annotate drug-target bioactivity data. The open environment improves data harmonization for drug repurposing applications. DTC offers a comprehensive, reproducible, and sustainable bioactivity knowledge base.
|
180
|
Computational-experimental approach to drug-target interaction mapping: A case study on kinase inhibitors. PLoS Comput Biol 2017; 13:e1005678. [PMID: 28787438 PMCID: PMC5560747 DOI: 10.1371/journal.pcbi.1005678] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2017] [Revised: 08/17/2017] [Accepted: 07/11/2017] [Indexed: 01/09/2023] Open
Abstract
Due to relatively high costs and labor required for experimental profiling of the full target space of chemical compounds, various machine learning models have been proposed as cost-effective means to advance this process in terms of predicting the most potent compound-target interactions for subsequent verification. However, most of the model predictions lack direct experimental validation in the laboratory, making their practical benefits for drug discovery or repurposing applications largely unknown. Here, we therefore introduce and carefully test a systematic computational-experimental framework for the prediction and pre-clinical verification of drug-target interactions using a well-established kernel-based regression algorithm as the prediction model. To evaluate its performance, we first predicted unmeasured binding affinities in a large-scale kinase inhibitor profiling study, and then experimentally tested 100 compound-kinase pairs. The relatively high correlation of 0.77 (p < 0.0001) between the predicted and measured bioactivities supports the potential of the model for filling the experimental gaps in existing compound-target interaction maps. Further, we subjected the model to a more challenging task of predicting target interactions for a new candidate drug compound that lacks prior binding profile information. As a specific case study, we used tivozanib, an investigational VEGF receptor inhibitor with currently unknown off-target profile. Among 7 kinases with high predicted affinity, we experimentally validated 4 new off-targets of tivozanib, namely the Src-family kinases FRK and FYN A, the non-receptor tyrosine kinase ABL1, and the serine/threonine kinase SLK. Our subsequent experimental validation protocol effectively avoids any possible information leakage between the training and validation data, and therefore enables rigorous model validation for practical applications.
These results demonstrate that the kernel-based modeling approach offers practical benefits for probing novel insights into the mode of action of investigational compounds, and for the identification of new target selectivities for drug repurposing applications.
|
181
|
Drewry DH, Wells CI, Andrews DM, Angell R, Al-Ali H, Axtman AD, Capuzzi SJ, Elkins JM, Ettmayer P, Frederiksen M, Gileadi O, Gray N, Hooper A, Knapp S, Laufer S, Luecking U, Michaelides M, Müller S, Muratov E, Denny RA, Saikatendu KS, Treiber DK, Zuercher WJ, Willson TM. Progress towards a public chemogenomic set for protein kinases and a call for contributions. PLoS One 2017; 12:e0181585. [PMID: 28767711 PMCID: PMC5540273 DOI: 10.1371/journal.pone.0181585] [Citation(s) in RCA: 99] [Impact Index Per Article: 14.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2017] [Accepted: 07/03/2017] [Indexed: 01/01/2023] Open
Abstract
Protein kinases are highly tractable targets for drug discovery. However, the biological function and therapeutic potential of the majority of the 500+ human protein kinases remain unknown. We have developed physical and virtual collections of small molecule inhibitors, which we call chemogenomic sets, that are designed to inhibit the catalytic function of almost half the human protein kinases. In this manuscript we share our progress towards generation of a comprehensive kinase chemogenomic set (KCGS), release kinome profiling data of a large inhibitor set (Published Kinase Inhibitor Set 2 (PKIS2)), and outline a process through which the community can openly collaborate to create a KCGS that probes the full complement of human protein kinases.
Affiliation(s)
- David H. Drewry
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Carrow I. Wells
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - David M. Andrews
- AstraZeneca, Darwin Building, Cambridge Science Park, Cambridge, United Kingdom
| | - Richard Angell
- Drug Discovery Group, Translational Research Office, University College London School of Pharmacy, 29–39 Brunswick Square, London, United Kingdom
| | - Hassan Al-Ali
- Miami Project to Cure Paralysis, University of Miami Miller School of Medicine, Miami, Florida, United States of America
- Peggy and Harold Katz Family Drug Discovery Center, University of Miami Miller School of Medicine, Miami, Florida, United States of America
| | - Alison D. Axtman
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Stephen J. Capuzzi
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Jonathan M. Elkins
- Structural Genomics Consortium, Universidade Estadual de Campinas—UNICAMP, Campinas, Sao Paulo, Brazil
| | | | - Mathias Frederiksen
- Novartis Institutes for BioMedical Research, Novartis Campus, Basel, Switzerland
| | - Opher Gileadi
- Structural Genomics Consortium and Target Discovery Institute, Nuffield Department of Clinical Medicine, University of Oxford, Oxford, United Kingdom
| | - Nathanael Gray
- Harvard Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Cancer Biology, Dana−Farber Cancer Institute, Boston, Massachusetts, United States of America
| | - Alice Hooper
- Drug Discovery Group, Translational Research Office, University College London School of Pharmacy, 29–39 Brunswick Square, London, United Kingdom
| | - Stefan Knapp
- Structural Genomics Consortium, Buchmann Institute for Molecular Life Sciences, and Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Max-von-Laue-Straße 15, Frankfurt am Main, Germany
| | - Stefan Laufer
- Department of Pharmaceutical Chemistry, Institute of Pharmaceutical Sciences, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 8, Tübingen, Germany
| | - Ulrich Luecking
- Bayer Pharma AG, Drug Discovery, Müllerstrasse 178, Berlin, Germany
| | - Michael Michaelides
- Oncology Chemistry, AbbVie, 1 North Waukegan Road, North Chicago, Illinois, United States of America
| | - Susanne Müller
- Structural Genomics Consortium, Buchmann Institute for Molecular Life Sciences, and Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Max-von-Laue-Straße 15, Frankfurt am Main, Germany
| | - Eugene Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - R. Aldrin Denny
- Worldwide Medicinal Chemistry, Pfizer Inc., Cambridge, Massachusetts, United States of America
| | - Kumar S. Saikatendu
- Global Research Externalization, Takeda California, Inc., 10410 Science Center Drive, San Diego, California, United States of America
| | | | - William J. Zuercher
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| | - Timothy M. Willson
- Structural Genomics Consortium, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
| |
|
182
|
He T, Heidemeyer M, Ban F, Cherkasov A, Ester M. SimBoost: a read-across approach for predicting drug-target binding affinities using gradient boosting machines. J Cheminform 2017; 9:24. [PMID: 29086119 PMCID: PMC5395521 DOI: 10.1186/s13321-017-0209-z] [Citation(s) in RCA: 156] [Impact Index Per Article: 22.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2016] [Accepted: 03/30/2017] [Indexed: 02/06/2023] Open
Abstract
Computational prediction of the interaction between drugs and targets is a standing challenge in the field of drug discovery. A number of rather accurate predictions were reported for various binary drug–target benchmark datasets. However, a notable drawback of a binary representation of interaction data is that missing endpoints for non-interacting drug–target pairs are not differentiated from inactive cases, and that predicted levels of activity depend on pre-defined binarization thresholds. In this paper, we present a method called SimBoost that predicts continuous (non-binary) values of binding affinities of compounds and proteins and thus incorporates the whole interaction spectrum from true negative to true positive interactions. Additionally, we propose a version of the method called SimBoostQuant which computes a prediction interval in order to assess the confidence of the predicted affinity, thus defining the Applicability Domain metrics explicitly. We evaluate SimBoost and SimBoostQuant on two established drug–target interaction benchmark datasets and one new dataset that we propose to use as a benchmark for read-across cheminformatics applications. We demonstrate that our methods outperform the previously reported models across the studied datasets.
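A prediction interval of the kind SimBoostQuant reports can be evaluated as sketched below. The abstract does not say how the interval is computed, so this sketch assumes a quantile-regression setting, where each interval bound is a predicted quantile scored with the pinball (quantile) loss. All function names and numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball loss of predictions for quantile level q (0 < q < 1).

    Under-predictions are weighted by q and over-predictions by 1 - q,
    so the loss is minimised by the q-th conditional quantile.
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def interval_coverage(lower, upper, y_true):
    """Fraction of measured values that fall inside [lower, upper]."""
    inside = sum(1 for lo, hi, y in zip(lower, upper, y_true) if lo <= y <= hi)
    return inside / len(y_true)


# Hypothetical interval bounds and measured affinities for three pairs.
lower = [5.0, 6.0, 7.0]
upper = [7.0, 8.0, 9.0]
measured = [6.0, 6.5, 10.0]
print(interval_coverage(lower, upper, measured))
print(pinball_loss(measured, upper, q=0.95))
```

A narrow interval with coverage close to its nominal level suggests a confident prediction, while a wide interval flags a pair that may lie outside the model's applicability domain.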
Affiliation(s)
- Tong He
- School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
| | - Marten Heidemeyer
- School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada
| | - Fuqiang Ban
- Faculty of Medicine, Vancouver Prostate Center, University of British Columbia, Vancouver, BC, V6H 3Z6, Canada
| | - Artem Cherkasov
- Faculty of Medicine, Vancouver Prostate Center, University of British Columbia, Vancouver, BC, V6H 3Z6, Canada
| | - Martin Ester
- School of Computing Science, Simon Fraser University, 8888 University Drive, Burnaby, BC, V5A 1S6, Canada.
| |
|
183
|
Fukunishi Y, Yamasaki S, Yasumatsu I, Takeuchi K, Kurosawa T, Nakamura H. Quantitative Structure-activity Relationship (QSAR) Models for Docking Score Correction. Mol Inform 2017; 36:1600013. [PMID: 28001004 PMCID: PMC5297997 DOI: 10.1002/minf.201600013] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2016] [Accepted: 04/01/2016] [Indexed: 01/26/2023]
Abstract
In order to improve docking score correction, we developed several structure-based quantitative structure activity relationship (QSAR) models by protein-drug docking simulations and applied these models to public affinity data. The prediction models used descriptor-based regression, and the compound descriptor was a set of docking scores against multiple (∼600) proteins including nontargets. The binding free energy that corresponded to the docking score was approximated by a weighted average of docking scores for multiple proteins, and we tried linear, weighted linear and polynomial regression models considering the compound similarities. In addition, we tried a combination of these regression models for individual data sets such as IC50, Ki, and % inhibition values. The cross-validation results showed that the weighted linear model was more accurate than the simple linear regression model. Thus, the QSAR approaches based on the affinity data of public databases should improve docking scores.
Affiliation(s)
- Yoshifumi Fukunishi
- Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Satoshi Yamasaki
- Technology Research Association for Next-Generation Natural Products Chemistry, 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Isao Yasumatsu
- Technology Research Association for Next-Generation Natural Products Chemistry, 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Daiichi Sankyo RD Novare Co., Ltd., 1-16-13, Kita-Kasai, Edogawa-ku, Tokyo, 134-8630, Japan
| | - Koh Takeuchi
- Molecular Profiling Research Center for Drug Discovery (molprof), National Institute of Advanced Industrial Science and Technology (AIST), 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
| | - Takashi Kurosawa
- Technology Research Association for Next-Generation Natural Products Chemistry, 2-3-26, Aomi, Koto-ku, Tokyo, 135-0064, Japan
- Hitachi Solutions East Japan, 12-1 Ekimaehoncho, Kawasaki-ku, Kanagawa, 210-0007, Japan
| | - Haruki Nakamura
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka, 565-0871, Japan
| |
|
184
|
Tang J. Informatics Approaches for Predicting, Understanding, and Testing Cancer Drug Combinations. Methods Mol Biol 2017; 1636:485-506. [PMID: 28730498 DOI: 10.1007/978-1-4939-7154-1_30] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]
Abstract
Making cancer treatment more effective is one of the grand challenges in our health care system. However, many drugs have entered clinical trials but so far showed limited efficacy or induced rapid development of resistance. We urgently need multi-targeted drug combinations, which shall selectively inhibit the cancer cells and block the emergence of drug resistance. The book chapter focuses on mathematical and computational tools to facilitate the discovery of the most promising drug combinations to improve efficacy and prevent resistance. Data integration approaches that leverage drug-target interactions, cancer molecular features, and signaling pathways for predicting, understanding, and testing drug combinations are critically reviewed.
Affiliation(s)
- Jing Tang
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Tukholmankatu 8, 00290, Helsinki, Finland
- Department of Mathematics and Statistics, University of Turku, Turku, Finland
| |
|
185
|
Merget B, Turk S, Eid S, Rippmann F, Fulle S. Profiling Prediction of Kinase Inhibitors: Toward the Virtual Assay. J Med Chem 2016; 60:474-485. [PMID: 27966949 DOI: 10.1021/acs.jmedchem.6b01611] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Kinome-wide screening would have the advantage of providing structure-activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.7) was achieved for ∼200 kinases by (1) combining open with proprietary data, (2) choosing Random Forest over alternative tested Machine Learning methods, and (3) balancing the training data sets. Tests on left-out and external data indicate a high value for virtual screening projects. Importantly, the derived models are evenly distributed across the kinome tree, allowing reliable profiling prediction for all kinase branches. The prediction quality was further improved by employing experimental bioactivity fingerprints of a small kinase subset. Overall, the generated models can support various hit identification tasks, including virtual screening, compound repurposing, and the detection of potential off-targets.
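The quality threshold quoted in this abstract (AUC > 0.7) can be made concrete with a small sketch. It assumes AUC refers to the area under the ROC curve of a binary active/inactive classifier, computed here as the probability that a randomly chosen active compound is scored above a randomly chosen inactive one. The function name, labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of actives and inactives.

    labels: 1 for active, 0 for inactive. Tied scores count as half a win.
    """
    actives = [s for label, s in zip(labels, scores) if label == 1]
    inactives = [s for label, s in zip(labels, scores) if label == 0]
    if not actives or not inactives:
        raise ValueError("need at least one active and one inactive compound")
    wins = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                wins += 1.0
            elif a == i:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Hypothetical activity labels and model scores for one kinase.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
auc = roc_auc(labels, scores)
print(auc, auc > 0.7)
```

Because this measure depends only on how actives rank relative to inactives, it is insensitive to the active/inactive ratio of the evaluation set, which makes it a convenient companion to the training-set balancing the abstract mentions.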
Affiliation(s)
- Benjamin Merget
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
| | - Samo Turk
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
| | - Sameh Eid
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
| | - Friedrich Rippmann
- Global Computational Chemistry, Merck KGaA, Frankfurter Strasse 250, 64293 Darmstadt, Germany
| | - Simone Fulle
- BioMed X Innovation Center, Im Neuenheimer Feld 515, 69120 Heidelberg, Germany
| |
|
186
|
Malani D, Murumägi A, Yadav B, Kontro M, Eldfors S, Kumar A, Karjalainen R, Majumder MM, Ojamies P, Pemovska T, Wennerberg K, Heckman C, Porkka K, Wolf M, Aittokallio T, Kallioniemi O. Enhanced sensitivity to glucocorticoids in cytarabine-resistant AML. Leukemia 2016; 31:1187-1195. [PMID: 27833094 PMCID: PMC5420795 DOI: 10.1038/leu.2016.314] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2016] [Revised: 09/22/2016] [Accepted: 09/26/2016] [Indexed: 12/20/2022]
Abstract
We sought to identify drugs that could counteract cytarabine resistance in acute myeloid leukemia (AML) by generating eight resistant variants from MOLM-13 and SHI-1 AML cell lines by long-term drug treatment. These cells were compared with 66 ex vivo chemorefractory samples from cytarabine-treated AML patients. The models and patient cells were subjected to genomic and transcriptomic profiling and high-throughput testing with 250 emerging and clinical oncology compounds. Genomic profiling uncovered deletion of the deoxycytidine kinase (DCK) gene in both MOLM-13- and SHI-1-derived cytarabine-resistant variants and in an AML patient sample. Cytarabine-resistant SHI-1 variants and a subset of chemorefractory AML patient samples showed increased sensitivity to glucocorticoids that are often used in treatment of lymphoid leukemia but not AML. Paired samples taken from AML patients before treatment and at relapse also showed acquisition of glucocorticoid sensitivity. Enhanced glucocorticoid sensitivity was only seen in AML patient samples that were negative for the FLT3 mutation (P=0.0006). Our study shows that development of cytarabine resistance is associated with increased sensitivity to glucocorticoids in a subset of AML, suggesting a new therapeutic strategy that should be explored in a clinical trial of chemorefractory AML patients carrying wild-type FLT3.
Affiliation(s)
- D Malani
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - A Murumägi
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - B Yadav
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - M Kontro
- Hematology Research Unit Helsinki, Department of Hematology, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
| | - S Eldfors
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - A Kumar
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - R Karjalainen
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - M M Majumder
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - P Ojamies
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - T Pemovska
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - K Wennerberg
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - C Heckman
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - K Porkka
- Hematology Research Unit Helsinki, Department of Hematology, University of Helsinki and Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
| | - M Wolf
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
| | - T Aittokallio
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
- Department of Mathematics and Statistics, University of Turku, Turku, Finland
| | - O Kallioniemi
- Institute for Molecular Medicine Finland, FIMM, University of Helsinki, Helsinki, Finland
- Science for Life Laboratory, Department of Oncology and Pathology, Karolinska Institutet, Solna, Sweden
| |
|
187
|
Yadav B, Gopalacharyulu P, Pemovska T, Khan SA, Szwajda A, Tang J, Wennerberg K, Aittokallio T. From drug response profiling to target addiction scoring in cancer cell models. Dis Model Mech 2016; 8:1255-64. [PMID: 26438695 PMCID: PMC4610238 DOI: 10.1242/dmm.021105] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/04/2023] Open
Abstract
Deconvoluting the molecular target signals behind observed drug response phenotypes is an important part of phenotype-based drug discovery and repurposing efforts. We demonstrate here how our network-based deconvolution approach, named target addiction score (TAS), provides insights into the functional importance of druggable protein targets in cell-based drug sensitivity testing experiments. Using cancer cell line profiling data sets, we constructed a functional classification across 107 cancer cell models, based on their common and unique target addiction signatures. The pan-cancer addiction correlations could not be explained by the tissue of origin, and only correlated in part with molecular and genomic signatures of the heterogeneous cancer cells. The TAS-based cancer cell classification was also shown to be robust to drug response data resampling, as well as predictive of the transcriptomic patterns in an independent set of cancer cells that shared similar addiction signatures with the 107 cancers. The critical protein targets identified by the integrated approach were also shown to have clinically relevant mutation frequencies in patients with various cancer subtypes, including not only well-established pan-cancer genes, such as PTEN tumor suppressor, but also a number of targets that are less frequently mutated in specific cancer types, including ABL1 oncoprotein in acute myeloid leukemia. An application to leukemia patient primary cell models demonstrated how the target deconvolution approach offers functional insights into patient-specific addiction patterns, such as those indicative of their receptor-type tyrosine-protein kinase FLT3 internal tandem duplication (FLT3-ITD) status and co-addiction partners, which may lead to clinically actionable, personalized drug treatment developments. 
To promote its application to future drug testing studies, we have made available an open-source implementation of the TAS calculation in the form of a stand-alone R package.
Affiliation(s)
- Bhagwan Yadav
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Peddinti Gopalacharyulu
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Tea Pemovska
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Suleiman A Khan
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Agnieszka Szwajda
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Jing Tang
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Krister Wennerberg
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| | - Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), Nordic EMBL Partnership for Molecular Medicine, University of Helsinki, FI-00014 Helsinki, Finland
| |
|
188
|
Wang Y, Cornett A, King FJ, Mao Y, Nigsch F, Paris CG, McAllister G, Jenkins JL. Evidence-Based and Quantitative Prioritization of Tool Compounds in Phenotypic Drug Discovery. Cell Chem Biol 2016; 23:862-874. [PMID: 27427232 DOI: 10.1016/j.chembiol.2016.05.016] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2015] [Revised: 04/29/2016] [Accepted: 05/13/2016] [Indexed: 01/07/2023]
Abstract
The use of potent and selective chemical tools with well-defined targets can help elucidate biological processes driving phenotypes in phenotypic screens. However, identification of selective compounds en masse to create targeted screening sets is non-trivial. A systematic approach is needed to prioritize probes, which prevents the repeated use of published but unselective compounds. Here we performed a meta-analysis of integrated large-scale, heterogeneous bioactivity data to create an evidence-based, quantitative metric to systematically rank tool compounds for targets. Our tool score (TS) was then tested on hundreds of compounds by assessing their activity profiles in a panel of 41 cell-based pathway assays. We demonstrate that high-TS tools show more reliably selective phenotypic profiles than lower-TS compounds. Additionally we highlight frequently tested compounds that are non-selective tools and distinguish target family polypharmacology from cross-family promiscuity. TS can therefore be used to prioritize compounds from heterogeneous databases for phenotypic screening.
Affiliation(s)
- Yuan Wang
- Novartis Institutes for BioMedical Research Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
- Allen Cornett
- Novartis Institutes for BioMedical Research Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
- Fred J King
- Genomics Institute of the Novartis Research Foundation, 10675 John Jay Hopkins Drive, San Diego, CA 92121, USA
- Yi Mao
- Harvard T.H. Chan School of Public Health, 677 Huntington Avenue, Boston, MA 02115, USA
- Florian Nigsch
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, Basel 4056, Switzerland
- C Gregory Paris
- Novartis Institutes for BioMedical Research Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
- Gregory McAllister
- Novartis Institutes for BioMedical Research Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
- Jeremy L Jenkins
- Novartis Institutes for BioMedical Research Inc., 250 Massachusetts Avenue, Cambridge, MA 02139, USA
189
Kangaspeska S, Hultsch S, Jaiswal A, Edgren H, Mpindi JP, Eldfors S, Brück O, Aittokallio T, Kallioniemi O. Systematic drug screening reveals specific vulnerabilities and co-resistance patterns in endocrine-resistant breast cancer. BMC Cancer 2016; 16:378. [PMID: 27378269 PMCID: PMC4932681 DOI: 10.1186/s12885-016-2452-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 06/18/2015] [Revised: 05/31/2016] [Accepted: 06/15/2016] [Indexed: 11/24/2022] Open
Abstract
Background: The estrogen receptor (ER) inhibitor tamoxifen reduces breast cancer mortality by 31 % and has served as the standard treatment for ER-positive breast cancers for decades. However, 50 % of advanced ER-positive cancers display de novo resistance to tamoxifen, and acquired resistance evolves in 40 % of patients who initially respond. Mechanisms underlying resistance development remain poorly understood and new therapeutic opportunities are urgently needed. Here, we report the generation and characterization of seven tamoxifen-resistant breast cancer cell lines from four parental strains. Methods: Using high throughput drug sensitivity and resistance testing (DSRT) with 279 approved and investigational oncology drugs, exome-sequencing and network analysis, we, for the first time, systematically determine the drug response profiles specific to tamoxifen resistance. Results: We discovered emerging vulnerabilities towards specific drugs, such as ERK1/2-, proteasome- and BCL-family inhibitors, as the cells became tamoxifen-resistant. Co-resistance to other drugs such as the survivin inhibitor YM155 and the chemotherapeutic agent paclitaxel also occurred. Conclusion: This study indicates that multiple molecular mechanisms dictate endocrine resistance, resulting in unexpected vulnerabilities to initially ineffective drugs, as well as in emerging co-resistances. Thus, combatting drug-resistant tumors will require patient-tailored strategies in order to identify new drug vulnerabilities, and to understand the associated co-resistance patterns. Electronic supplementary material: The online version of this article (doi:10.1186/s12885-016-2452-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Sara Kangaspeska
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland; present address: Helsinki Innovation Services, Tukholmankatu 8 A, 00290, Helsinki, Finland
- Susanne Hultsch
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Alok Jaiswal
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Henrik Edgren
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland; present address: MediSapiens Ltd, Erottajankatu 19B, 00130, Helsinki, Finland
- John-Patrick Mpindi
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Samuli Eldfors
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Oscar Brück
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland
- Olli Kallioniemi
- Institute for Molecular Medicine Finland (FIMM), Biomedicum 2U, Tukholmankatu 8, 00290, Helsinki, Finland; present address: Science for Life Laboratory, Department Oncology-Pathology, Karolinska Institutet, Tomtebodavägen 23, 171 65, Solna, Sweden
190
Callahan A, Abeyruwan SW, Al-Ali H, Sakurai K, Ferguson AR, Popovich PG, Shah NH, Visser U, Bixby JL, Lemmon VP. RegenBase: a knowledge base of spinal cord injury biology for translational research. Database: The Journal of Biological Databases and Curation 2016; 2016:baw040. [PMID: 27055827 PMCID: PMC4823819 DOI: 10.1093/database/baw040] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 09/03/2015] [Accepted: 03/03/2016] [Indexed: 12/20/2022]
Abstract
Spinal cord injury (SCI) research is a data-rich field that aims to identify the biological mechanisms resulting in loss of function and mobility after SCI, as well as develop therapies that promote recovery after injury. SCI experimental methods, data and domain knowledge are locked in the largely unstructured text of scientific publications, making large scale integration with existing bioinformatics resources and subsequent analysis infeasible. The lack of standard reporting for experiment variables and results also makes experiment replicability a significant challenge. To address these challenges, we have developed RegenBase, a knowledge base of SCI biology. RegenBase integrates curated literature-sourced facts and experimental details, raw assay data profiling the effect of compounds on enzyme activity and cell growth, and structured SCI domain knowledge in the form of the first ontology for SCI, using Semantic Web representation languages and frameworks. RegenBase uses consistent identifier schemes and data representations that enable automated linking among RegenBase statements and also to other biological databases and electronic resources. By querying RegenBase, we have identified novel biological hypotheses linking the effects of perturbagens to observed behavioral outcomes after SCI. RegenBase is publicly available for browsing, querying and download. Database URL: http://regenbase.org
Affiliation(s)
- Alison Callahan
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305
- Hassan Al-Ali
- Miami Project to Cure Paralysis, University of Miami School of Medicine, Miami, FL 33136
- Kunie Sakurai
- Miami Project to Cure Paralysis, University of Miami School of Medicine, Miami, FL 33136
- Adam R Ferguson
- Brain and Spinal Injury Center (BASIC), Department of Neurological Surgery, University of California, San Francisco; San Francisco Veterans Affairs Medical Center, San Francisco, CA 94143
- Phillip G Popovich
- Center for Brain and Spinal Cord Repair and the Department of Neuroscience, The Ohio State University, Columbus, OH 43210
- Nigam H Shah
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, CA 94305
- Ubbo Visser
- Department of Computer Science, University of Miami, Coral Gables, FL 33146
- John L Bixby
- Miami Project to Cure Paralysis, University of Miami School of Medicine, Miami, FL 33136; Center for Computational Science, University of Miami, Coral Gables, FL 33146; Department of Cellular and Molecular Pharmacology, University of Miami School of Medicine, Miami, FL 33136, USA
- Vance P Lemmon
- Miami Project to Cure Paralysis, University of Miami School of Medicine, Miami, FL 33136; Center for Computational Science, University of Miami, Coral Gables, FL 33146
191
Kibble M, Khan SA, Saarinen N, Iorio F, Saez-Rodriguez J, Mäkelä S, Aittokallio T. Transcriptional response networks for elucidating mechanisms of action of multitargeted agents. Drug Discov Today 2016; 21:1063-75. [PMID: 26979547 DOI: 10.1016/j.drudis.2016.03.001] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 11/27/2015] [Revised: 02/15/2016] [Accepted: 03/04/2016] [Indexed: 01/08/2023]
Abstract
Drug discovery is moving away from the single target-based approach towards harnessing the potential of polypharmacological agents that modulate the activity of multiple nodes in the complex networks of deregulations underlying disease phenotypes. Computational network pharmacology methods that use systems-level drug-response phenotypes, such as those originating from genome-wide transcriptomic profiles, have proved particularly effective for elucidating the mechanisms of action of multitargeted compounds. Here, we show, via the case study of the natural product pinosylvin, how the combination of two complementary network-based methods can provide novel, unexpected mechanistic insights. This case study also illustrates that elucidating the mechanism of action of multitargeted natural products through transcriptional response-based approaches is a challenging endeavor, often requiring multiple computational-experimental iterations.
Affiliation(s)
- Milla Kibble
- Institute for Molecular Medicine Finland (FIMM), Biomedicum Helsinki 2U, Tukholmankatu 8, University of Helsinki, Helsinki 00014, Finland
- Suleiman A Khan
- Institute for Molecular Medicine Finland (FIMM), Biomedicum Helsinki 2U, Tukholmankatu 8, University of Helsinki, Helsinki 00014, Finland
- Niina Saarinen
- Institute of Biomedicine, Turku Center for Disease Modeling & Functional Foods Forum, University of Turku, Turku 20014, Finland
- Francesco Iorio
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
- Julio Saez-Rodriguez
- European Molecular Biology Laboratory - European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK; Joint Research Centre for Computational Biomedicine (JRC-COMBINE) - RWTH Aachen University, Faculty of Medicine, D-52074 Aachen, Germany
- Sari Mäkelä
- Institute of Biomedicine, Turku Center for Disease Modeling & Functional Foods Forum, University of Turku, Turku 20014, Finland
- Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), Biomedicum Helsinki 2U, Tukholmankatu 8, University of Helsinki, Helsinki 00014, Finland; Department of Mathematics and Statistics, Quantum, University of Turku, Turku 20014, Finland
192
Glaab E. Building a virtual ligand screening pipeline using free software: a survey. Brief Bioinform 2016; 17:352-66. [PMID: 26094053 PMCID: PMC4793892 DOI: 10.1093/bib/bbv037] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.9] [Received: 03/05/2015] [Revised: 05/20/2015] [Indexed: 12/17/2022] Open
Abstract
Virtual screening, the search for bioactive compounds via computational methods, provides a wide range of opportunities to speed up drug development and reduce the associated risks and costs. While virtual screening is already a standard practice in pharmaceutical companies, its applications in preclinical academic research still remain under-exploited, in spite of an increasing availability of dedicated free databases and software tools. In this survey, an overview of recent developments in this field is presented, focusing on free software and data repositories for screening as alternatives to their commercial counterparts, and outlining how available resources can be interlinked into a comprehensive virtual screening pipeline using typical academic computing facilities. Finally, to facilitate the set-up of corresponding pipelines, a downloadable software system is provided, using platform virtualization to integrate pre-installed screening tools and scripts for reproducible application across different operating systems.
193
Guo J, Liu H, Zheng J. SynLethDB: synthetic lethality database toward discovery of selective and sensitive anticancer drug targets. Nucleic Acids Res 2015; 44:D1011-7. [PMID: 26516187 PMCID: PMC4702809 DOI: 10.1093/nar/gkv1108] [Citation(s) in RCA: 90] [Impact Index Per Article: 10.0] [Received: 08/15/2015] [Accepted: 10/09/2015] [Indexed: 02/03/2023] Open
Abstract
Synthetic lethality (SL) is a type of genetic interaction between two genes such that simultaneous perturbations of the two genes result in cell death or a dramatic decrease of cell viability, while a perturbation of either gene alone is not lethal. SL reflects the biologically endogenous difference between cancer cells and normal cells, and thus the inhibition of SL partners of genes with cancer-specific mutations could selectively kill cancer cells but spare normal cells. Therefore, SL is emerging as a promising anticancer strategy that could potentially overcome the drawbacks of traditional chemotherapies by reducing severe side effects. Researchers have developed experimental technologies and computational prediction methods to identify SL gene pairs in human and a few model species. However, there has not been a comprehensive database dedicated to collecting SL pairs and related knowledge. In this paper, we propose a comprehensive database, SynLethDB (http://histone.sce.ntu.edu.sg/SynLethDB/), which contains SL pairs collected from biochemical assays, other related databases, computational predictions and text mining results on human and four model species, i.e. mouse, fruit fly, worm and yeast. For each SL pair, a confidence score was calculated by integrating individual scores derived from different evidence sources. We also developed a statistical analysis module to estimate the druggability and sensitivity of cancer cells upon drug treatments targeting human SL partners, based on large-scale genomic data, gene expression profiles and drug sensitivity profiles on more than 1000 cancer cell lines. To help users access and mine the wealth of the data, we developed other practical functionalities, such as search and filtering, orthology search and gene set enrichment analysis. Furthermore, a user-friendly web interface has been implemented to facilitate data analysis and interpretation. With the integrated data sets and analytics functionalities, SynLethDB would be a useful resource for the biomedical research community and the pharmaceutical industry.
Affiliation(s)
- Jing Guo
- School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore
- Hui Liu
- School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore; Lab of Information Management, Changzhou University, Jiangsu 213164, China
- Jie Zheng
- School of Computer Engineering, Nanyang Technological University, Singapore 639798, Singapore; Genome Institute of Singapore (GIS), Biopolis, Singapore 138672, Singapore
194
Kibble M, Saarinen N, Tang J, Wennerberg K, Mäkelä S, Aittokallio T. Network pharmacology applications to map the unexplored target space and therapeutic potential of natural products. Nat Prod Rep 2015; 32:1249-66. [PMID: 26030402 DOI: 10.1039/c5np00005j] [Citation(s) in RCA: 277] [Impact Index Per Article: 30.8] [Indexed: 12/14/2022]
Abstract
It is widely accepted that drug discovery often requires a systems-level polypharmacology approach to tackle problems such as lack of efficacy and emerging resistance of single-targeted compounds. Network pharmacology approaches are increasingly being developed and applied to find new therapeutic opportunities and to re-purpose approved drugs. However, these recent advances have been relatively slow to be translated into the field of natural products. Here, we argue that a network pharmacology approach would enable an effective mapping of the yet unexplored target space of natural products, hence providing a systematic means to extend the druggable space of proteins implicated in various complex diseases. We give an overview of the key network pharmacology concepts and recent experimental-computational approaches that have been successfully applied to natural product research, including unbiased elucidation of mechanisms of action as well as systematic prediction of effective therapeutic combinations. We focus specifically on anticancer applications that use in vivo and in vitro functional phenotypic measurements, such as genome-wide transcriptomic response profiles, which enable a global modelling of the multi-target activity at the level of the biological pathways and interaction networks. We also provide representative examples of other disease applications, databases and tools as well as existing and emerging resources, which may prove useful for future natural product research. Finally, we offer our personal view of the current limitations, prospective developments and open questions in this exciting field.
Affiliation(s)
- Milla Kibble
- Institute for Molecular Medicine Finland (FIMM), Biomedicum Helsinki 2U, 00014 University of Helsinki, Finland.
195
Cichonska A, Rousu J, Aittokallio T. Identification of drug candidates and repurposing opportunities through compound-target interaction networks. Expert Opin Drug Discov 2015; 10:1333-45. [PMID: 26429153 DOI: 10.1517/17460441.2015.1096926] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Indexed: 12/25/2022]
Abstract
Introduction: System-wide identification of both on- and off-targets of chemical probes provides improved understanding of their therapeutic potential and possible adverse effects, thereby accelerating and de-risking the drug discovery process. Given the high costs of experimental profiling of the complete target space of drug-like compounds, computational models offer systematic means for guiding these mapping efforts. These models suggest the most potent interactions for further experimental or pre-clinical evaluation both in cell line models and in patient-derived material. Areas covered: The authors focus here on network-based machine learning models and their use in the prediction of novel compound-target interactions both in target-based and phenotype-based drug discovery applications. While currently being used mainly in complementing the experimentally mapped compound-target networks for drug repurposing applications, such as extending the target space of already approved drugs, these network pharmacology approaches may also suggest completely unexpected and novel investigational probes for drug development. Expert opinion: Although the studies reviewed here have already demonstrated that network-centric modeling approaches have the potential to identify candidate compounds and selective targets in disease networks, many challenges still remain. In particular, these challenges include how to incorporate the cellular context and genetic background into the disease networks to enable more stratified and selective target predictions, as well as how to make the prediction models more realistic for practical drug discovery and therapeutic applications.
Affiliation(s)
- Anna Cichonska
- University of Helsinki, Institute for Molecular Medicine Finland FIMM, Helsinki, Finland; Aalto University, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Espoo, Finland
- Juho Rousu
- Aalto University, Helsinki Institute for Information Technology HIIT, Department of Computer Science, Espoo, Finland
- Tero Aittokallio
- University of Helsinki, Institute for Molecular Medicine Finland FIMM, Helsinki, Finland; University of Turku, Department of Mathematics and Statistics, Turku, Finland
196
Ekins S, Litterman NK, Lipinski CA, Bunin BA. Thermodynamic Proxies to Compensate for Biases in Drug Discovery Methods. Pharm Res 2015; 33:194-205. [DOI: 10.1007/s11095-015-1779-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/10/2015] [Accepted: 08/13/2015] [Indexed: 11/24/2022]
197
He L, Wennerberg K, Aittokallio T, Tang J. TIMMA-R: an R package for predicting synergistic multi-targeted drug combinations in cancer cell lines or patient-derived samples. Bioinformatics 2015; 31:1866-8. [PMID: 25638808 PMCID: PMC4443685 DOI: 10.1093/bioinformatics/btv067] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 10/17/2014] [Accepted: 01/26/2015] [Indexed: 11/14/2022]
Abstract
Summary: Network pharmacology-based prediction of multi-targeted drug combinations is becoming a promising strategy to improve anticancer efficacy and safety. We developed a logic-based network algorithm, called Target Inhibition Interaction using Maximization and Minimization Averaging (TIMMA), which predicts the effects of drug combinations based on their binary drug-target interactions and single-drug sensitivity profiles in a given cancer sample. Here, we report the R implementation of the algorithm (TIMMA-R), which is much faster than the original MATLAB code. The major extensions include modeling of multiclass drug-target profiles and network visualization. We also show that the TIMMA-R predictions are robust to the intrinsic noise in the experimental data, thus making it a promising high-throughput tool to prioritize drug combinations in various cancer types for follow-up experimentation or clinical applications. Availability and implementation: TIMMA-R source code is freely available at http://cran.r-project.org/web/packages/timma/. Contact: jing.tang@helsinki.fi. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Liye He
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Tukholmankatu 8, FI-00290, Helsinki, Finland
- Krister Wennerberg
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Tukholmankatu 8, FI-00290, Helsinki, Finland
- Tero Aittokallio
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Tukholmankatu 8, FI-00290, Helsinki, Finland
- Jing Tang
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Tukholmankatu 8, FI-00290, Helsinki, Finland
198
Cortés-Ciriano I, Ain QU, Subramanian V, Lenselink EB, Méndez-Lucio O, IJzerman AP, Wohlfahrt G, Prusis P, Malliavin TE, van Westen GJP, Bender A. Polypharmacology modelling using proteochemometrics (PCM): recent methodological developments, applications to target families, and future prospects. MedChemComm 2015. [DOI: 10.1039/c4md00216d] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Indexed: 11/21/2022]
Abstract
Proteochemometric (PCM) modelling is a computational method to model the bioactivity of multiple ligands against multiple related protein targets simultaneously.
Affiliation(s)
- Isidro Cortés-Ciriano
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Qurrat Ul Ain
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
- Eelke B. Lenselink
- Division of Medicinal Chemistry, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Oscar Méndez-Lucio
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
- Adriaan P. IJzerman
- Division of Medicinal Chemistry, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- Gerd Wohlfahrt
- Computer-Aided Drug Design, Orion Pharma, FIN-02101 Espoo, Finland
- Peteris Prusis
- Computer-Aided Drug Design, Orion Pharma, FIN-02101 Espoo, Finland
- Thérèse E. Malliavin
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3825, Structural Biology and Chemistry Department, 75 724 Paris, France
- Gerard J. P. van Westen
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, UK
- Andreas Bender
- Unilever Centre for Molecular Informatics, Department of Chemistry, CB2 1EW Cambridge, UK
199
Kennedy EJ. EMBO conference series: Chemical Biology 2014. Chembiochem 2014; 15:2783-7. [PMID: 25318996 DOI: 10.1002/cbic.201402527] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2014] [Indexed: 11/07/2022]
Abstract
Around 300 people from 18 countries took part in the fourth biennial Chemical Biology conference at The European Molecular Biology Laboratory (EMBL) in Heidelberg, from August 20 to 23, 2014. Many advances in the field of chemical biology were presented in talks and poster sessions. Picture: Petra Riedinger (EMBL).
Affiliation(s)
- Eileen J Kennedy
- Department of Pharmaceutical and Biomedical Sciences, College of Pharmacy, University of Georgia, 240 W. Green Street, Athens, GA 30602 (USA).
200
Ferrè F, Palmeri A, Helmer-Citterich M. Computational methods for analysis and inference of kinase/inhibitor relationships. Front Genet 2014; 5:196. [PMID: 25071826 PMCID: PMC4075008 DOI: 10.3389/fgene.2014.00196] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 05/01/2014] [Accepted: 06/13/2014] [Indexed: 12/21/2022] Open
Abstract
The central role of kinases in virtually all signal transduction networks is the driving motivation for the development of compounds modulating their activity. ATP-mimetic inhibitors are essential tools for elucidating signaling pathways and are emerging as promising therapeutic agents. However, off-target ligand binding and complex and sometimes unexpected kinase/inhibitor relationships can occur for seemingly unrelated kinases, stressing that computational approaches are needed for learning the interaction determinants and for the inference of the effect of small compounds on a given kinase. Recently published high-throughput profiling studies assessed the effects of thousands of small compound inhibitors, covering a substantial portion of the kinome. This wealth of data paved the road for computational resources and methods that can offer a major contribution in understanding the reasons for the inhibition, helping in the rational design of more specific molecules, in the in silico prediction of inhibition for those neglected kinases for which no systematic analysis has been carried out yet, in the selection of novel inhibitors with desired selectivity, and offering novel avenues of personalized therapies.
Affiliation(s)
- Fabrizio Ferrè
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Antonio Palmeri
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy
- Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome Tor Vergata, Rome, Italy