101
Li T, Zhao XM, Li L. Co-VAE: Drug-Target Binding Affinity Prediction by Co-Regularized Variational Autoencoders. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2022; 44:8861-8873. [PMID: 34652996 DOI: 10.1109/tpami.2021.3120428] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 06/13/2023]
Abstract
Identifying drug-target interactions has been a key step in drug discovery. Many computational methods have been proposed to directly determine whether drugs and targets can interact or not. Drug-target binding affinity is another type of data, which shows the strength of the binding interaction between a drug and a target. However, it is more challenging to predict drug-target binding affinity, and thus very few studies follow this line. In our work, we propose novel co-regularized variational autoencoders (Co-VAE) to predict drug-target binding affinity based on drug structures and target sequences. The Co-VAE model consists of two VAEs for generating drug SMILES strings and target sequences, respectively, and a co-regularization part for generating the binding affinities. We theoretically prove that the Co-VAE model maximizes the lower bound of the joint likelihood of drug, protein and their affinity. The Co-VAE can predict drug-target affinity and generate new drugs which share similar targets with the input drugs. The experimental results on two datasets show that the Co-VAE predicts drug-target affinity better than existing affinity prediction methods such as DeepDTA and DeepAffinity, and generates more new valid drugs than existing methods such as GAN and VAE.
102
Dong R, Yang H, Ai C, Duan G, Wang J, Guo F. DeepBLI: A Transferable Multichannel Model for Detecting β-Lactamase-Inhibitor Interaction. J Chem Inf Model 2022; 62:5830-5840. [PMID: 36245217 DOI: 10.1021/acs.jcim.2c01008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
Pathogens producing β-lactamase pose a great challenge to antibiotic-resistant infection treatment; thus, it is urgent to discover novel β-lactamase inhibitors for drug development. Conventional high-throughput screening is very costly, and structure-based virtual screening is limited with mechanisms. In this study, we construct a novel multichannel deep neural network (DeepBLI) for β-lactamase inhibitor screening, pretrained with a label reversal KIBA data set and fine-tuned on β-lactamase-inhibitor pairs from BindingDB. First, the pairs of encoders (Conv and Att) fuse the information spatially and sequentially for both enzymes and inhibitors. Then, a co-attention module creates the connection between the inhibitor and enzyme embeddings. Finally, the multichannel outputs are fused with an element-wise product and then fed into 3-layer fully connected networks to predict interactions. Compared with state-of-the-art methods, DeepBLI yields an AUROC of 0.9240 and an AUPRC of 0.9715, which indicates that it can identify new β-lactamase-inhibitor interactions. To demonstrate its prediction ability, an application of DeepBLI is described to screen potential inhibitor compounds for metallo-β-lactamase AIM-1 and repurpose rottlerin for four classes of β-lactamase targets, showing the possibility of being a broad-spectrum inhibitor. DeepBLI provides an effective way for antibacterial drug development, contributing to antibiotic-resistant therapeutics.
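The AUROC quoted in this abstract can be read as the probability that a randomly chosen interacting pair receives a higher score than a randomly chosen non-interacting pair. The sketch below implements that rank-based definition in plain Python. The function name `auroc` and the labels and scores are illustrative assumptions, not DeepBLI outputs or code from the paper.

```python
def auroc(labels, scores):
    """Rank-based AUROC: fraction of (positive, negative) pairs in which
    the positive is scored higher; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = interaction) and model scores for four pairs.
print(auroc([1, 1, 0, 0], [0.9, 0.3, 0.5, 0.1]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which keeps the definition visible; a sorted-rank formulation would be the usual choice for large test sets.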
Affiliation(s)
- Ruihan Dong
- Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Hongpeng Yang
- Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Chengwei Ai
- College of Intelligence and Computing, Tianjin University, Tianjin 300350, China
- Guihua Duan
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Fei Guo
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
103
Nguyen NQ, Jang G, Kim H, Kang J. Perceiver CPI: a nested cross-attention network for compound-protein interaction prediction. Bioinformatics 2022; 39:6842322. [PMID: 36416124 PMCID: PMC9848062 DOI: 10.1093/bioinformatics/btac731] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 05/31/2022] [Revised: 10/18/2022] [Accepted: 11/22/2022] [Indexed: 11/24/2022] Open
Abstract
MOTIVATION: Compound-protein interaction (CPI) plays an essential role in drug discovery and is performed via expensive molecular docking simulations. Many artificial intelligence-based approaches have been proposed in this regard. Recently, two types of models have accomplished promising results in exploiting molecular information: graph convolutional neural networks that construct a learned molecular representation from a graph structure (atoms and bonds), and neural networks that can be applied to compute on descriptors or fingerprints of molecules. However, the superiority of one method over the other is yet to be determined. Modern studies have endeavored to aggregate information that is extracted from compounds and proteins to form the CPI task. Nonetheless, these approaches have used a simple concatenation to combine them, which cannot fully capture the interaction between such information. RESULTS: We propose the Perceiver CPI network, which adopts a cross-attention mechanism to improve the learning ability of the representation of drug and target interactions and exploits the rich information obtained from extended-connectivity fingerprints to improve the performance. We evaluated Perceiver CPI on three main datasets, Davis, KIBA and Metz, to compare the performance of our proposed model with that of state-of-the-art methods. The proposed method achieved satisfactory performance and exhibited significant improvements over previous approaches in all experiments. AVAILABILITY AND IMPLEMENTATION: Perceiver CPI is available at https://github.com/dmis-lab/PerceiverCPI. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
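To make the contrast between simple concatenation and cross-attention concrete, here is a minimal sketch of generic scaled dot-product cross-attention, in which each compound token forms a weighted summary of the protein tokens. It is not the nested architecture of Perceiver CPI: there are no learned projections, and the function names and toy vectors are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: vectors from one modality (e.g. compound tokens).
    keys, values: vectors from the other modality (e.g. protein tokens).
    Returns one attended vector per query.
    """
    d = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended


# Toy inputs: one compound token attending over two protein tokens.
protein = [[1.0, 0.0], [0.0, 1.0]]
print(cross_attention([[10.0, 0.0]], protein, protein))
```

With the query aligned to the first protein token, nearly all of the attention weight lands on that token, which is the pairwise interaction signal that plain concatenation does not expose.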
Affiliation(s)
- Ngoc-Quang Nguyen
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Gwanghoon Jang
- Department of Computer Science and Engineering, Korea University, Seoul 02841, Republic of Korea
- Hajung Kim
- Interdisciplinary Graduate Program in Bioinformatics, Korea University, Seoul 02841, Republic of Korea
104
Zhang L, Wang CC, Chen X. Predicting drug-target binding affinity through molecule representation block based on multi-head attention and skip connection. Brief Bioinform 2022; 23:6782838. [PMID: 36411674 DOI: 10.1093/bib/bbac468] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 08/04/2022] [Revised: 09/13/2022] [Accepted: 09/29/2022] [Indexed: 11/22/2022] Open
Abstract
Existing computational models for drug-target binding affinity prediction have much room for improvement in prediction accuracy, robustness and generalization ability. Most deep learning models lack interpretability analysis and few studies provide application examples. Based on these observations, we presented a novel model named Molecule Representation Block-based Drug-Target binding Affinity prediction (MRBDTA). MRBDTA is composed of embedding and positional encoding, molecule representation block and interaction learning module. The advantages of MRBDTA are reflected in three aspects: (i) developing Trans block to extract molecule features through improving the encoder of transformer, (ii) introducing skip connection at encoder level in Trans block and (iii) enhancing the ability to capture interaction sites between proteins and drugs. The test results on two benchmark datasets show that MRBDTA achieves the best performance compared with 11 state-of-the-art models. Besides, through replacing Trans block with single Trans encoder and removing skip connection in Trans block, we verified that Trans block and skip connection could effectively improve the prediction accuracy and reliability of MRBDTA. Then, relying on multi-head attention mechanism, we performed interpretability analysis to illustrate that MRBDTA can correctly capture part of interaction sites between proteins and drugs. In case studies, we first employed MRBDTA to predict binding affinities between Food and Drug Administration-approved drugs and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) replication-related proteins. Second, we compared true binding affinities between 3C-like proteinase and 185 drugs with those predicted by MRBDTA. The final results of case studies reveal reliable performance of MRBDTA in drug design for SARS-CoV-2.
Affiliation(s)
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
105
Huang L, Lin J, Liu R, Zheng Z, Meng L, Chen X, Li X, Wong KC. CoaDTI: multi-modal co-attention based framework for drug-target interaction annotation. Brief Bioinform 2022; 23:6770087. [PMID: 36274236 DOI: 10.1093/bib/bbac446] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/05/2022] [Revised: 08/26/2022] [Accepted: 09/18/2022] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION: The identification of drug-target interactions (DTIs) plays a vital role for in silico drug discovery, in which the drug is the chemical molecule, and the target is the protein residues in the binding pocket. Manual DTI annotation approaches remain reliable; however, it is notoriously laborious and time-consuming to test each drug-target pair exhaustively. Recently, the rapid growth of labelled DTI data has catalysed interest in high-throughput DTI prediction. Unfortunately, those methods highly rely on the manual features denoted by humans, leading to errors. RESULTS: Here, we developed an end-to-end deep learning framework called CoaDTI to significantly improve the efficiency and interpretability of drug target annotation. CoaDTI incorporates the Co-attention mechanism to model the interaction information from the drug modality and protein modality. In particular, CoaDTI incorporates transformer to learn the protein representations from raw amino acid sequences, and GraphSage to extract the molecule graph features from SMILES. Furthermore, we proposed to employ the transfer learning strategy to encode protein features by pre-trained transformer to address the issue of scarce labelled data. The experimental results demonstrate that CoaDTI achieves competitive performance on three public datasets compared with state-of-the-art models. In addition, the transfer learning strategy further boosts the performance to an unprecedented level. The extended study reveals that CoaDTI can identify novel DTIs such as reactions between candidate drugs and severe acute respiratory syndrome coronavirus 2-associated proteins. The visualization of co-attention scores can illustrate the interpretability of our model for mechanistic insights. AVAILABILITY: Source code is publicly available at https://github.com/Layne-Huang/CoaDTI.
Affiliation(s)
- Lei Huang
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Jiecong Lin
- Department of Pathology, Harvard Medical School, Boston, USA; Department of Computer Science, The University of Hong Kong, Hong Kong SAR
- Rui Liu
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Zetian Zheng
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Lingkuan Meng
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Xingjian Chen
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR
- Xiangtao Li
- School of Artificial Intelligence, Jilin University, China
- Ka-Chun Wong
- Department of Computer Science, City University of Hong Kong, Hong Kong SAR; Hong Kong Institute for Data Science, City University of Hong Kong, Hong Kong SAR
106
Askr H, Elgeldawi E, Aboul Ella H, Elshaier YAMM, Gomaa MM, Hassanien AE. Deep learning in drug discovery: an integrative review and future challenges. Artif Intell Rev 2022; 56:5975-6037. [PMID: 36415536 PMCID: PMC9669545 DOI: 10.1007/s10462-022-10306-1] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Accepted: 10/24/2022] [Indexed: 11/18/2022]
Abstract
Recently, using artificial intelligence (AI) in drug discovery has received much attention since it significantly shortens the time and cost of developing new drugs. Deep learning (DL)-based approaches are increasingly being used in all stages of drug development as DL technology advances and drug-related data grows. Therefore, this paper presents a systematic literature review (SLR) that integrates the recent DL technologies and applications in drug discovery, including drug-target interactions (DTIs), drug-drug similarity interactions (DDIs), drug sensitivity and responsiveness, and drug-side effect predictions. We present a review of more than 300 articles between 2000 and 2022. The benchmark data sets, the databases, and the evaluation measures are also presented. In addition, this paper provides an overview of how explainable AI (XAI) supports drug discovery problems. The drug dosing optimization and success stories are discussed as well. Finally, digital twinning (DT) and open issues are suggested as future research challenges for drug discovery problems. Challenges to be addressed and future research directions are identified, and an extensive bibliography is also included.
Affiliation(s)
- Heba Askr
- Faculty of Computers and Artificial Intelligence, University of Sadat City, Sadat City, Egypt
- Enas Elgeldawi
- Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Heba Aboul Ella
- Faculty of Pharmacy and Drug Technology, Chinese University in Egypt (CUE), Cairo, Egypt
- Mamdouh M. Gomaa
- Computer Science Department, Faculty of Science, Minia University, Minia, Egypt
- Aboul Ella Hassanien
- Faculty of Computers and Artificial Intelligence, Cairo University, Cairo, Egypt
107
Aleb N. A Mutual Attention Model for Drug Target Binding Affinity Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:3224-3232. [PMID: 34665738 DOI: 10.1109/tcbb.2021.3121275] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/13/2023]
Abstract
Various machine learning approaches have been developed for drug-target interaction (DTI) prediction. One class of these approaches, DTBA, is interested in Drug-Target Binding Affinity strength, rather than focusing merely on the presence or absence of interaction. Several machine learning methods have been developed for this purpose. However, almost all depend heavily on the use of increasingly sophisticated inputs to improve their performance. In addition, these methods do not allow any analysis or interpretation due to their black-box characteristic. This work is an attempt to overcome these limitations by taking advantage of the use of attention mechanisms with convolution models. In this paper, we define a new mutual attention based model for DTBA prediction. We represent both compounds and targets by sequences. Our model starts by aligning the drug-target pairs, then a learned masking is performed to retain the most promising regions of both sequences and amplify them with a learned factor in such a way as to make the learning focus more on them. We evaluate the performance of our method on two benchmark datasets, KIBA and Davis. The results show that our mutual attention approach is very effective. Compared to other well-known approaches, it achieved excellent results regarding the considered performance metrics.
108
Liao J, Chen H, Wei L, Wei L. GSAML-DTA: An interpretable drug-target binding affinity prediction model based on graph neural networks with self-attention mechanism and mutual information. Comput Biol Med 2022; 150:106145. [PMID: 37859276 DOI: 10.1016/j.compbiomed.2022.106145] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/10/2022] [Revised: 08/23/2022] [Accepted: 09/24/2022] [Indexed: 11/03/2022]
Abstract
Identifying drug-target affinity (DTA) has great practical importance in the process of designing efficacious drugs for known diseases. Recently, numerous deep learning-based computational methods have been developed to predict drug-target affinity and achieved impressive performance. However, most of them construct the molecule (drug or target) encoder without considering the weights of features of each node (atom or residue). Besides, they generally combine drug and target representations directly, which may contain task-irrelevant information. In this study, we develop GSAML-DTA, an interpretable deep learning framework for DTA prediction. GSAML-DTA integrates a self-attention mechanism and graph neural networks (GNNs) to build representations of drugs and target proteins from the structural information. In addition, mutual information is introduced to filter out redundant information and retain relevant information in the combined representations of drugs and targets. Extensive experimental results demonstrate that GSAML-DTA outperforms state-of-the-art methods for DTA prediction on two benchmark datasets. Furthermore, GSAML-DTA has the interpretation ability to analyze binding atoms and residues, which may be conducive to chemical biology studies from data. Overall, GSAML-DTA can serve as a powerful and interpretable tool suitable for DTA modelling.
Affiliation(s)
- Jiaqi Liao
- School of Software, Shandong University, Jinan, China
- Haoyang Chen
- School of Software, Shandong University, Jinan, China
- Lesong Wei
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Leyi Wei
- School of Software, Shandong University, Jinan, China
109
Deep Neural Networks Compression: a comparative survey and choice recommendations. Neurocomputing 2022. [DOI: 10.1016/j.neucom.2022.11.072] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
110
Zhang Y, Hu Y, Li H, Liu X. Drug-protein interaction prediction via variational autoencoders and attention mechanisms. Front Genet 2022; 13:1032779. [PMID: 36313473 PMCID: PMC9614151 DOI: 10.3389/fgene.2022.1032779] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/31/2022] [Accepted: 09/30/2022] [Indexed: 09/29/2023] Open
Abstract
During the process of drug discovery, exploring drug-protein interactions (DPIs) is a key step. With the rapid development of biological data, computer-aided methods are much faster than biological experiments. Deep learning methods have become popular and are mainly used to extract the characteristics of drugs and proteins for further DPIs prediction. Since the prediction of DPIs through machine learning cannot fully extract effective features, in our work, we propose a deep learning framework that uses variational autoencoders and attention mechanisms; it utilizes convolutional neural networks (CNNs) to obtain local features and attention mechanisms to obtain important information about drugs and proteins, which is very important for predicting DPIs. Compared with some machine learning methods on the C.elegans and human datasets, our approach provides a better effect. On the BindingDB dataset, its accuracy (ACC) and area under the curve (AUC) reach 0.862 and 0.913, respectively. To verify the robustness of the model, multiclass classification tasks are performed on Davis and KIBA datasets, and the ACC values reach 0.850 and 0.841, respectively, thus further demonstrating the effectiveness of the model.
Affiliation(s)
- Yue Zhang
- School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
111
Hierarchical graph representation learning for the prediction of drug-target binding affinity. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.09.043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/01/2023]
112
An interpretable machine learning model for selectivity of small molecules against homologous protein family. Future Med Chem 2022; 14:1441-1453. [PMID: 36169035 DOI: 10.4155/fmc-2022-0075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022] Open
Abstract
Aim: In the early stages of drug discovery, various experimental and computational methods are used to measure the specificity of small molecules against a target protein. The selectivity of small molecules remains a challenge leading to off-target side effects. Methods: We have developed a multitask deep learning model for predicting the selectivity on closely related homologs of the target protein. The model has been tested on the Janus-activated kinase and dopamine receptor families of proteins. Results & conclusion: The feature-based representation (extended connectivity fingerprint 4) with Extreme Gradient Boosting performed better when compared with deep neural network models in most of the evaluation metrics. Both the Extreme Gradient Boosting and deep neural network models outperformed the graph-based models. Furthermore, to decipher the model decision on selectivity, the important fragments associated with each homologous protein were identified.
113
Korlepara DB, Vasavi CS, Jeurkar S, Pal PK, Roy S, Mehta S, Sharma S, Kumar V, Muvva C, Sridharan B, Garg A, Modee R, Bhati AP, Nayar D, Priyakumar UD. PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications. Sci Data 2022; 9:548. [PMID: 36071074 PMCID: PMC9451116 DOI: 10.1038/s41597-022-01631-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/22/2022] [Accepted: 08/15/2022] [Indexed: 11/08/2022] Open
Abstract
Computational methods and, more recently, modern machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models for predicting binding affinities better than the state-of-the-art scoring functions are important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components like electrostatic, van der Waals, polar and non-polar solvation energy calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.
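As a rough illustration of how the listed energy components relate to a binding affinity, the sketch below sums the four terms named in the abstract. It assumes the common MMPBSA simplification in which the entropic contribution is omitted. The class name, field names, units and numeric values are hypothetical and do not reflect the actual PLAS-5k file layout.

```python
from dataclasses import dataclass


@dataclass
class EnergyComponents:
    """Per-complex energy terms, assumed here to be in kcal/mol."""
    electrostatic: float
    van_der_waals: float
    polar_solvation: float
    nonpolar_solvation: float

    def binding_energy(self) -> float:
        # Sum of gas-phase and solvation terms; entropy is deliberately
        # left out, which is an assumption of this sketch.
        return (self.electrostatic + self.van_der_waals
                + self.polar_solvation + self.nonpolar_solvation)


# Made-up values for one protein-ligand complex.
example = EnergyComponents(
    electrostatic=-30.0,
    van_der_waals=-45.0,
    polar_solvation=50.0,
    nonpolar_solvation=-5.0,
)
print(example.binding_energy())  # -30.0
```

Keeping the components as separate fields mirrors the point made in the abstract: a model can be trained on, or optimized against, an individual term rather than only the total.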
Affiliation(s)
- Divya B Korlepara
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- C S Vasavi
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Shruti Jeurkar
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Pradeep Kumar Pal
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Subhajit Roy
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- UM-DAE-Centre For Excellence In Basic Sciences, University of Mumbai, Vidyanagari, Mumbai, India
- Sarvesh Mehta
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Shubham Sharma
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Vishal Kumar
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Charuvaka Muvva
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Bhuvanesh Sridharan
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Akshit Garg
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Rohit Modee
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
- Agastya P Bhati
- Centre for Computational Science, Department of Chemistry, University College London, London, WC1H 0AJ, United Kingdom
- Divya Nayar
- Department of Materials Science and Engineering, Indian Institute of Technology Delhi, Hauz Khas, New Delhi, 110016, India
- U Deva Priyakumar
- Centre for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500032, India
114
Lin S, Shi C, Chen J. GeneralizedDTA: combining pre-training and multi-task learning to predict drug-target binding affinity for unknown drug discovery. BMC Bioinformatics 2022; 23:367. [PMID: 36071406 PMCID: PMC9449940 DOI: 10.1186/s12859-022-04905-6] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/26/2022] [Accepted: 08/23/2022] [Indexed: 12/04/2022] Open
Abstract
Background: Accurately predicting drug-target binding affinity (DTA) in silico plays an important role in drug discovery. Most of the computational methods developed for predicting DTA use machine learning models, especially deep neural networks, and depend on large-scale labelled data. However, it is difficult to learn enough feature representation from tens of millions of compounds and hundreds of thousands of proteins only based on relatively limited labelled drug-target data. There are a large number of unknown drugs, which never appear in the labelled drug-target data. This is a kind of out-of-distribution problem in bio-medicine. Some recent studies adopted self-supervised pre-training tasks to learn structural information of amino acid sequences for enhancing the feature representation of proteins. However, the task gap between pre-training and DTA prediction brings the catastrophic forgetting problem, which hinders the full application of feature representation in DTA prediction and seriously affects the generalization capability of models for unknown drug discovery. Results: To address these problems, we propose the GeneralizedDTA, which is a new DTA prediction model oriented to unknown drug discovery, by combining pre-training and multi-task learning. We introduce self-supervised protein and drug pre-training tasks to learn richer structural information from amino acid sequences of proteins and molecular graphs of drug compounds, in order to alleviate the problem of high variance caused by encoding based on deep neural networks and accelerate the convergence of prediction model on small-scale labelled data. We also develop a multi-task learning framework with a dual adaptation mechanism to narrow the task gap between pre-training and prediction for preventing overfitting and improving the generalization capability of DTA prediction model on unknown drug discovery. To validate the effectiveness of our model, we construct an unknown drug data set to simulate the scenario of unknown drug discovery. Compared with existing DTA prediction models, the experimental results show that our model has higher generalization capability in the DTA prediction of unknown drugs. Conclusions: The advantages of our model are mainly attributed to two kinds of pre-training tasks and the multi-task learning framework, which can learn richer structural information of proteins and drugs from large-scale unlabeled data, and then effectively integrate it into the downstream prediction task for obtaining a high-quality DTA prediction in unknown drug discovery.
Affiliation(s)
- Shaofu Lin
- Faculty of Information Technology, Beijing University of Technology, No. 100, Pingleyuan, Chaoyang District, Beijing, 100124, China
- Chengyu Shi
- Faculty of Information Technology, Beijing University of Technology, No. 100, Pingleyuan, Chaoyang District, Beijing, 100124, China
- Jianhui Chen
- Faculty of Information Technology, Beijing University of Technology, No. 100, Pingleyuan, Chaoyang District, Beijing, 100124, China; Beijing International Collaboration Base on Brain Informatics and Wisdom Services, Beijing University of Technology, No. 100, Pingleyuan, Chaoyang District, Beijing, 100124, China; Beijing Key Laboratory of MRI and Brain Informatics, Beijing University of Technology, No. 100, Pingleyuan, Chaoyang District, Beijing, 100124, China
115
Li Q, Zhang X, Wu L, Bo X, He S, Wang S. PLA-MoRe: A Protein-Ligand Binding Affinity Prediction Model via Comprehensive Molecular Representations. J Chem Inf Model 2022; 62:4380-4390. [PMID: 36054653 DOI: 10.1021/acs.jcim.2c00960] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 11/29/2022]
Abstract
Accurately predicting the binding affinity of protein-ligand pairs is an essential part of drug discovery. Since wet laboratory experiments to determine the binding affinity are expensive and time-consuming, several computational methods for binding affinity prediction have been proposed. In the representation of compounds, most methods focus only on structural properties, such as those encoded in SMILES, and ignore bioactive properties. In this study, we proposed a novel model named PLA-MoRe to predict protein-ligand binding affinity, which represents compounds based on both structural and bioactive properties and mainly contains three feature extractors. First, a structure feature extractor based on the graph isomorphism network was constructed to learn the representations of the molecular graphs. Second, we designed an autoencoder-based bioactive feature extractor to integrate multisource bioactive information, including chemical, target, network, cellular, and clinical data. These two parts aimed to learn representations of compounds in terms of structures and bioactivities, respectively. Third, we constructed a sequence feature extractor to learn embeddings for protein sequences. The outputs of the three extractors were concatenated and fed into a fully connected network for affinity prediction. We compared PLA-MoRe with three state-of-the-art methods, and an ablation study was conducted to test the role of each part of the model. Further attention visualization showed that our model had the potential to locate the binding sites, which might help explain the mechanism of interaction. These results indicate that PLA-MoRe is competitive and reliable. The source code is freely available at the GitHub repository https://github.com/QingyuLiaib/PLA-MoRe.
Affiliation(s)
- Qingyu Li: Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Xiaochang Zhang: Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
- Lianlian Wu: Academy of Medical Engineering and Translational Medicine, Tianjin University, Tianjin 300072, China; Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Xiaochen Bo: Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Song He: Institute of Health Service and Transfusion Medicine, Beijing 100850, China
- Shengqi Wang: Beijing Institute of Microbiology and Epidemiology, Beijing 100850, China
116
Pu Y, Li J, Tang J, Guo F. DeepFusionDTA: Drug-Target Binding Affinity Prediction With Information Fusion and Hybrid Deep-Learning Ensemble Model. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2760-2769. [PMID: 34379594 DOI: 10.1109/tcbb.2021.3103966] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Indexed: 06/13/2023]
Abstract
Identification of drug-target interaction (DTI) is the most important issue in the broad field of drug discovery. Using purely biological experiments to verify drug-target binding profiles takes a lot of time and effort, so computational technologies for this task have great benefits in reducing the drug search space. Most computational methods for predicting DTI are formulated as binary classification, which ignores the influence of binding strength. Therefore, drug-target binding affinity prediction is still a challenging issue. Currently, many studies extract only sequence information, which lacks a feature-rich representation, whereas we consider more spatial features in order to merge various data in the drug and target spaces. In this study, we propose a two-stage deep neural network ensemble model for predicting drug-target binding affinity, called DeepFusionDTA, built from several information analysis modules. The first stage uses sequence and structure information to generate a fusion feature map of a candidate protein-drug pair through deep learning-based analysis modules. The second stage applies a bagging-based ensemble learning strategy for regression prediction, and we obtain outstanding results by combining the advantages of various algorithms in efficient feature abstraction and regression calculation. Compared with the existing prediction tool DeepDTA, DeepFusionDTA delivers a 1.5 percent increase in concordance index (CI) on the KIBA dataset and a 1.0 percent increase on the Davis dataset. Furthermore, the ideas we have offered can be applied to in silico screening of the interaction space to provide novel DTIs that can be experimentally pursued. The code and data are available from https://github.com/guofei-tju/DeepFusionDTA.
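The CI figures quoted in this abstract are conventionally the concordance index: the fraction of comparable drug-target pairs whose predicted affinities are ordered the same way as their measured affinities. The following is a minimal pure-Python sketch of that definition. The function name `concordance_index` and the half-credit treatment of tied predictions are assumptions for illustration, not code from the DeepFusionDTA repository.

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Hypothetical helper for illustration; O(n^2) in the number of samples.
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # pairs with equal measured affinity are not comparable
        comparable += 1
        if p_i == p_j:
            concordant += 0.5  # assumption: tied predictions earn half credit
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

Under this definition a perfect ranking scores 1.0 and a constant predictor scores 0.5, so an improvement of one or two points is a change in the share of correctly ordered pairs rather than in absolute error.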
117
Prediction of Potential Commercially Available Inhibitors against SARS-CoV-2 by Multi-Task Deep Learning Model. Biomolecules 2022; 12:biom12081156. [PMID: 36009050 PMCID: PMC9405964 DOI: 10.3390/biom12081156] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/20/2022] [Revised: 08/16/2022] [Accepted: 08/18/2022] [Indexed: 11/16/2022]
Abstract
The outbreak of COVID-19 caused millions of deaths worldwide, and the number of total infections is still rising. It is necessary to identify some potentially effective drugs that can be used to prevent the development of severe symptoms, or even death for those infected. Fortunately, many efforts have been made and several effective drugs have been identified. The rapidly increasing amount of data is of great help for training an effective and specific deep learning model. In this study, we propose a multi-task deep learning model for the purpose of screening commercially available and effective inhibitors against SARS-CoV-2. First, we pretrained a model on several heterogenous protein-ligand interaction datasets. The model achieved competitive results on some benchmark datasets. Next, a coronavirus-specific dataset was collected and used to fine-tune the model. Then, the fine-tuned model was used to select commercially available drugs against SARS-CoV-2 protein targets. Overall, twenty compounds were listed as potential inhibitors. We further explored the model interpretability and exhibited the predicted important binding sites. Based on this prediction, molecular docking was also performed to visualize the binding modes of the selected inhibitors.
118
Hu F, Jiang J, Yin P. Prediction of Potential Commercially Available Inhibitors against SARS-CoV-2 by Multi-Task Deep Learning Model. Biomolecules 2022. [PMID: 36009050 DOI: 10.48550/arxiv.2003.00728] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/13/2023]
Abstract
The outbreak of COVID-19 caused millions of deaths worldwide, and the number of total infections is still rising. It is necessary to identify some potentially effective drugs that can be used to prevent the development of severe symptoms, or even death for those infected. Fortunately, many efforts have been made and several effective drugs have been identified. The rapidly increasing amount of data is of great help for training an effective and specific deep learning model. In this study, we propose a multi-task deep learning model for the purpose of screening commercially available and effective inhibitors against SARS-CoV-2. First, we pretrained a model on several heterogenous protein-ligand interaction datasets. The model achieved competitive results on some benchmark datasets. Next, a coronavirus-specific dataset was collected and used to fine-tune the model. Then, the fine-tuned model was used to select commercially available drugs against SARS-CoV-2 protein targets. Overall, twenty compounds were listed as potential inhibitors. We further explored the model interpretability and exhibited the predicted important binding sites. Based on this prediction, molecular docking was also performed to visualize the binding modes of the selected inhibitors.
Affiliation(s)
- Fan Hu: Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Jiaxin Jiang: Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- Peng Yin: Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
119
Pandey M, Radaeva M, Mslati H, Garland O, Fernandez M, Ester M, Cherkasov A. Ligand Binding Prediction Using Protein Structure Graphs and Residual Graph Attention Networks. Molecules 2022; 27:molecules27165114. [PMID: 36014351 PMCID: PMC9416537 DOI: 10.3390/molecules27165114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2022] [Revised: 08/03/2022] [Accepted: 08/09/2022] [Indexed: 11/25/2022]
Abstract
Computational prediction of ligand–target interactions is a crucial part of modern drug discovery, as it helps to bypass the high costs and labor demands of in vitro and in vivo screening. As bioactivity data accumulate, they provide opportunities for the development of deep learning (DL) models with increasing predictive power. Conventionally, such models were limited either to very simplified representations of proteins or to ineffective voxelization of their 3D structures. Herein, we present the development of the PSG-BAR (Protein Structure Graph-Binding Affinity Regression) approach, which utilizes 3D structural information of the proteins along with 2D graph representations of ligands. The method also introduces attention scores to selectively weight protein regions that are most important for ligand binding. Results: The developed approach demonstrates state-of-the-art performance on several binding affinity benchmarking datasets. The attention-based pooling of protein graphs enables identification of surface residues as critical residues for protein–ligand binding. Finally, we validate our model predictions against an experimental assay on a viral main protease (Mpro), the hallmark target of the SARS-CoV-2 coronavirus.
Affiliation(s)
- Mohit Pandey: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Mariia Radaeva: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Hazem Mslati: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Olivia Garland: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Michael Fernandez: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
- Martin Ester: School of Computing Science, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Artem Cherkasov: Vancouver Prostate Centre, Department of Urologic Sciences, University of British Columbia, Vancouver, BC V6T 1Z2, Canada
120
Zeng Y, Chen X, Peng D, Zhang L, Huang H. Multi-scaled self-attention for drug-target interaction prediction based on multi-granularity representation. BMC Bioinformatics 2022; 23:314. [PMID: 35922768 PMCID: PMC9347097 DOI: 10.1186/s12859-022-04857-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/23/2022] [Accepted: 07/22/2022] [Indexed: 11/21/2022]
Abstract
Background Drug–target interaction (DTI) prediction plays a crucial role in drug discovery. Although advanced deep learning has shown promising results in predicting DTIs, it still needs improvement in two aspects: (1) the encoding method, since the existing character encoding overlooks the chemical textual information of multi-character atoms and chemical functional groups; and (2) the architecture of the deep model, which should focus on multiple chemical patterns in drug and target representations. Results In this paper, we propose a multi-granularity multi-scaled self-attention (SAN) model to alleviate the above problems. Specifically, in the encoding process, we investigate a segmentation method for drug and protein sequences and then label the segmented groups as the multi-granularity representations. Moreover, in order to enhance the various local patterns in these multi-granularity representations, a multi-scaled SAN is built and exploited to generate deep representations of drugs and targets. Finally, our proposed model predicts DTIs based on the fusion of these deep representations. Our proposed model is evaluated on two benchmark datasets, KIBA and Davis. The experimental results reveal that our proposed model yields better prediction accuracy than strong baseline models. Conclusion Our proposed multi-granularity encoding method and multi-scaled SAN model improve DTI prediction by encoding the chemical textual information of drugs and targets and extracting their various local patterns, respectively.
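To make the encoding problem in this abstract concrete: a character-level encoder splits the chlorine atom `Cl` into `C` and `l`, so the model sees a carbon followed by an unknown symbol. The sketch below shows one common workaround, a regex tokenizer that keeps bracket atoms and two-letter halogens intact before mapping tokens to integer ids. It is an illustration under stated assumptions, not the segmentation method proposed in the paper; the names `tokenize_smiles` and `encode` are hypothetical.

```python
import re

# Order matters: bracket atoms first, then two-letter halogens,
# then two-digit ring closures, then any remaining single character.
_SMILES_TOKEN = re.compile(r"\[[^\]]+\]|Br|Cl|%\d{2}|.")


def tokenize_smiles(smiles):
    """Split a SMILES string into chemically meaningful tokens."""
    return _SMILES_TOKEN.findall(smiles)


def encode(tokens, vocab):
    """Map tokens to integer ids, growing `vocab` as new tokens appear.

    Id 0 is left unused so it can serve as a padding value.
    """
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens]
```

For example, `tokenize_smiles("CC(Cl)Br")` keeps `Cl` and `Br` as single tokens, whereas plain character encoding would produce eight symbols for the same string.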
Affiliation(s)
- Yuni Zeng: School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou, China
- Xiangru Chen: College of Computer Science, Sichuan University, Chengdu, China
- Dezhong Peng: College of Computer Science, Sichuan University, Chengdu, China; Shenzhen Peng Cheng Laboratory, Shenzhen, China; Chengdu Sobey Digital Technology Co., Ltd, Chengdu, China
- Lijun Zhang: Sichuan Zhiqian Technology Co., Ltd, Chengdu, China; Chengdu Ruibei Yingte Information Technology Co., Ltd, Chengdu, China
- Haixiao Huang: Sichuan Provincial Commission of Politics and Law, Chengdu, China
121
Reciprocal perspective as a super learner improves drug-target interaction prediction (MUSDTI). Sci Rep 2022; 12:13237. [PMID: 35918366 PMCID: PMC9344797 DOI: 10.1038/s41598-022-16493-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2022] [Accepted: 07/11/2022] [Indexed: 11/08/2022]
Abstract
The identification of novel drug-target interactions (DTI) is critical to drug discovery and drug repurposing to address contemporary medical and public health challenges presented by emergent diseases. Historically, computational methods have framed DTI prediction as a binary classification problem (indicating whether or not a drug physically interacts with a given protein target); however, framing the problem instead as a regression-based prediction of the physicochemical binding affinity is more meaningful. With growing databases of experimentally derived drug-target interactions (e.g. Davis, BindingDB, and KIBA), deep learning-based DTI predictors can be effectively leveraged to achieve state-of-the-art (SOTA) performance. In this work, we formulated a DTI competition as part of the coursework for a senior undergraduate machine learning course and challenged students to generate component DTI models that might surpass SOTA models and to effectively combine these component models as part of a meta-model using the Reciprocal Perspective (RP) multi-view learning framework. Following 6 weeks of concerted effort, 28 student-produced component deep-learning DTI models were leveraged in this work to produce a new SOTA RP-DTI model, denoted the Meta Undergraduate Student DTI (MUSDTI) model. Through a series of experiments we demonstrate that (1) RP can considerably improve SOTA DTI prediction, (2) our new double-cold experimental design is more appropriate for emergent DTI challenges, (3) our novel MUSDTI meta-model outperforms SOTA models, (4) RP can improve upon individual models as an ensembling method, and finally, (5) RP can be utilized for low-computation transfer learning. This work introduces a number of important findings for the field of DTI prediction and for sequence-based, pairwise prediction in general.
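The abstract does not spell out its "double-cold" design. A common reading, assumed here, is that every test pair involves a drug and a target that are both absent from the training set, which mimics an emergent disease with a new protein and untested compounds. The sketch below implements that assumed definition; the function name, the `test_fraction` parameter, and the choice to discard mixed pairs are illustrative and are not taken from the paper.

```python
import random


def double_cold_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug_id, target_id, affinity) triples so that test drugs
    and test targets never occur in the training set.

    Assumed definition of a double-cold split; hypothetical helper.
    """
    rng = random.Random(seed)
    drugs = sorted({d for d, _, _ in pairs})
    targets = sorted({t for _, t, _ in pairs})
    test_drugs = set(rng.sample(drugs, max(1, int(len(drugs) * test_fraction))))
    test_targets = set(rng.sample(targets, max(1, int(len(targets) * test_fraction))))

    train, test = [], []
    for d, t, y in pairs:
        if d in test_drugs and t in test_targets:
            test.append((d, t, y))  # both entities held out
        elif d not in test_drugs and t not in test_targets:
            train.append((d, t, y))  # both entities seen in training
        # pairs with exactly one held-out entity are discarded
    return train, test
```

The cost of this design is data: with 20% of drugs and 20% of targets held out, only about 4% of a dense interaction matrix lands in the test set and roughly a third of the pairs are dropped.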
122
CSatDTA: Prediction of Drug–Target Binding Affinity Using Convolution Model with Self-Attention. Int J Mol Sci 2022; 23:ijms23158453. [PMID: 35955587 PMCID: PMC9369082 DOI: 10.3390/ijms23158453] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/14/2022] [Revised: 07/27/2022] [Accepted: 07/27/2022] [Indexed: 12/10/2022]
Abstract
Drug discovery, which aims to identify potential novel treatments, spans a broad range of scientific fields, including chemistry, pharmacology, and biology. In the early stages of drug development, predicting drug–target affinity is crucial. The proposed model, CSatDTA (prediction of drug–target affinity using a convolution model with self-attention), applies convolution-based self-attention mechanisms to the molecular drug and target sequences to predict drug–target affinity (DTA) effectively, unlike previous convolution methods, which exhibit significant limitations in this respect. A convolutional neural network (CNN) operates only on a local region of its input and misses global context. Self-attention, on the other hand, is a relatively recent technique for capturing long-range interactions that has been used primarily in sequence modeling tasks. The results of comparative experiments show that CSatDTA surpasses previous sequence-based and other approaches and has outstanding retention abilities.
123
Luo H, Xiang Y, Fang X, Lin W, Wang F, Wu H, Wang H. BatchDTA: implicit batch alignment enhances deep learning-based drug-target affinity estimation. Brief Bioinform 2022; 23:6632927. [PMID: 35794723 DOI: 10.1093/bib/bbac260] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/14/2022] [Revised: 05/23/2022] [Accepted: 06/03/2022] [Indexed: 11/14/2022]
Abstract
Candidate compounds with high binding affinities toward a target protein are likely to be developed as drugs. Deep neural networks (DNNs) have attracted increasing attention for drug-target affinity (DTA) estimation owing to their efficiency. However, the negative impact of batch effects caused by measurement metrics, system technologies and other assay information is seldom discussed when training a DNN model for DTA. Because of the data deviation caused by batch effects, DNN models can only be trained on a small amount of 'clean' data, which makes it challenging for them to provide precise and consistent estimations. We design a batch-sensitive training framework, namely BatchDTA, to train the DNN models. BatchDTA implicitly aligns multiple batches toward the same protein by learning the orders of candidate compounds within each batch, alleviating the impact of the batch effects on the DNN models. Extensive experiments demonstrate that BatchDTA improves the predictive ability and robustness of four mainstream DNN models on multiple DTA datasets (BindingDB, Davis and KIBA). The average concordance index of the DNN models achieves a relative improvement of 4.0%. The case study reveals that BatchDTA can successfully learn the ranking orders of the compounds from multiple batches. In addition, BatchDTA can also be applied to fused data collected from multiple sources to achieve further improvement.
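The central idea in this abstract, learning the order of compounds within a batch instead of regressing absolute values across batches, can be illustrated with a generic pairwise ranking loss. The sketch below uses a standard logistic loss on score differences and only forms pairs from the same batch, so measurements from different assays are never compared directly. This is an assumption-laden illustration of the idea, not the BatchDTA objective; the function name and record layout are hypothetical.

```python
import math
from itertools import combinations


def batch_pairwise_ranking_loss(records):
    """Mean logistic ranking loss over same-batch compound pairs.

    records: iterable of (batch_id, measured_affinity, predicted_score).
    Hypothetical helper for illustration only.
    """
    by_batch = {}
    for batch_id, y, s in records:
        by_batch.setdefault(batch_id, []).append((y, s))

    total, n_pairs = 0.0, 0
    for items in by_batch.values():
        for (y_i, s_i), (y_j, s_j) in combinations(items, 2):
            if y_i == y_j:
                continue  # equal measurements carry no ordering signal
            sign = 1.0 if y_i > y_j else -1.0
            # small when the score difference agrees with the measured order
            total += math.log1p(math.exp(-sign * (s_i - s_j)))
            n_pairs += 1
    return total / n_pairs if n_pairs else 0.0
```

Because the loss depends only on score differences inside a batch, a constant offset between two assays (for example, different measurement protocols) does not change it.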
Affiliation(s)
- Hongyu Luo: PaddleHelix team, Baidu Inc., 518000, Shenzhen, China
- Yingfei Xiang: PaddleHelix team, Baidu Inc., 518000, Shenzhen, China
- Xiaomin Fang: PaddleHelix team, Baidu Inc., 518000, Shenzhen, China
- Wei Lin: PaddleHelix team, Baidu Inc., 518000, Shenzhen, China
- Fan Wang: PaddleHelix team, Baidu Inc., 518000, Shenzhen, China
- Hua Wu: Baidu Inc., 100000, Beijing, China
124
Zhao Q, Yang M, Cheng Z, Li Y, Wang J. Biomedical Data and Deep Learning Computational Models for Predicting Compound-Protein Relations. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2022; 19:2092-2110. [PMID: 33769935 DOI: 10.1109/tcbb.2021.3069040] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
The identification of compound-protein relations (CPRs), which includes compound-protein interactions (CPIs) and compound-protein affinities (CPAs), is critical to drug development. A common method for compound-protein relation identification is the use of in vitro screening experiments. However, the number of compounds and proteins is massive, and in vitro screening experiments are labor-intensive, expensive, and time-consuming with high failure rates. Researchers have developed a computational field called virtual screening (VS) to aid experimental drug development. These methods utilize experimentally validated biological interaction information to generate datasets and use the physicochemical and structural properties of compounds and target proteins as input information to train computational prediction models. At present, deep learning has been widely used in computer vision and natural language processing and has experienced epoch-making progress. At the same time, deep learning has also been used in the field of biomedicine widely, and the prediction of CPRs based on deep learning has developed rapidly and has achieved good results. The purpose of this study is to investigate and discuss the latest applications of deep learning techniques in CPR prediction. First, we describe the datasets and feature engineering (i.e., compound and protein representations and descriptors) commonly used in CPR prediction methods. Then, we review and classify recent deep learning approaches in CPR prediction. Next, a comprehensive comparison is performed to demonstrate the prediction performance of representative methods on classical datasets. Finally, we discuss the current state of the field, including the existing challenges and our proposed future directions. We believe that this investigation will provide sufficient references and insight for researchers to understand and develop new deep learning methods to enhance CPR predictions.
125
Monteiro NRC, Simões CJV, Ávila HV, Abbasi M, Oliveira JL, Arrais JP. Explainable deep drug-target representations for binding affinity prediction. BMC Bioinformatics 2022; 23:237. [PMID: 35715734 PMCID: PMC9204982 DOI: 10.1186/s12859-022-04767-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/02/2022] [Accepted: 05/25/2022] [Indexed: 11/10/2022]
Abstract
Background Several computational advances have been achieved in the drug discovery field, promoting the identification of novel drug–target interactions and new leads. However, most of these methodologies have been overlooking the importance of providing explanations to the decision-making process of deep learning architectures. In this research study, we explore the reliability of convolutional neural networks (CNNs) at identifying relevant regions for binding, specifically binding sites and motifs, and the significance of the deep representations extracted by providing explanations to the model’s decisions based on the identification of the input regions that contributed the most to the prediction. We make use of an end-to-end deep learning architecture to predict binding affinity, where CNNs are exploited in their capacity to automatically identify and extract discriminating deep representations from 1D sequential and structural data. Results The results demonstrate the effectiveness of the deep representations extracted from CNNs in the prediction of drug–target interactions. CNNs were found to identify and extract features from regions relevant for the interaction, where the weight associated with these spots was in the range of those with the highest positive influence given by the CNNs in the prediction. The end-to-end deep learning model achieved the highest performance both in the prediction of the binding affinity and on the ability to correctly distinguish the interaction strength rank order when compared to baseline approaches. Conclusions This research study validates the potential applicability of an end-to-end deep learning architecture in the context of drug discovery beyond the confined space of proteins and ligands with determined 3D structure. Furthermore, it shows the reliability of the deep representations extracted from the CNNs by providing explainability to the decision-making process. 
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04767-y.
Affiliation(s)
- Nelson R C Monteiro: Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
- Henrique V Ávila: Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
- Maryam Abbasi: Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
- José L Oliveira: IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- Joel P Arrais: Univ Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal
126
Jiang M, Wang S, Zhang S, Zhou W, Zhang Y, Li Z. Sequence-based drug-target affinity prediction using weighted graph neural networks. BMC Genomics 2022; 23:449. [PMID: 35715739 PMCID: PMC9205061 DOI: 10.1186/s12864-022-08648-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 01/24/2022] [Accepted: 05/23/2022] [Indexed: 11/10/2022]
Abstract
BACKGROUND Predicting the affinity between a molecule and a protein, usually called drug-target affinity (DTA) prediction, is an important step of virtual screening. Its accuracy directly influences the progress of drug development. Sequence-based drug-target affinity prediction predicts the affinity from the protein sequence, which is fast and can be applied to large datasets. However, due to the lack of protein structure information, its accuracy needs to be improved. RESULTS The proposed model, called WGNN-DTA, is competent in both drug-target affinity (DTA) and compound-protein interaction (CPI) prediction tasks. Various experiments are designed to verify the performance of the proposed method in different scenarios, and they show that WGNN-DTA has the advantages of simplicity and high accuracy. Moreover, because it does not need complex steps such as multiple sequence alignment (MSA), it executes quickly and is suitable for screening large databases. CONCLUSION We construct protein and molecular graphs from sequences and SMILES that can effectively reflect their structures. To utilize the detailed contact information of the protein, a graph neural network is used to extract features and predict the binding affinity based on the graphs; the resulting method is called the weighted graph neural networks drug-target affinity predictor (WGNN-DTA). The proposed method has the advantages of simplicity and high accuracy.
Affiliation(s)
- Mingjian Jiang: School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266525, China
- Shuang Wang: College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Shugang Zhang: College of Computer Science and Technology, Ocean University of China, Qingdao, 266100, China
- Wei Zhou: School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266525, China
- Yuanyuan Zhang: School of Information and Control Engineering, Qingdao University of Technology, Qingdao, 266525, China
- Zhen Li: College of Computer Science and Technology, Qingdao University, Qingdao, 266071, China
127
DeepMHADTA: Prediction of Drug-Target Binding Affinity Using Multi-Head Self-Attention and Convolutional Neural Network. Curr Issues Mol Biol 2022; 44:2287-2299. [PMID: 35678684 PMCID: PMC9164023 DOI: 10.3390/cimb44050155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 05/08/2022] [Accepted: 05/14/2022] [Indexed: 11/17/2022]
Abstract
Drug-target interactions provide insight into drug side effects and drug repositioning. However, wet-lab biochemical experiments are time-consuming and labor-intensive, and are insufficient to meet the pressing demand for drug research and development. With the rapid advancement of deep learning, computational methods are increasingly applied to screen drug-target interactions. Many methods treat this problem as a binary classification task (binding or not) but ignore the quantitative binding affinity. In this paper, we propose a new end-to-end deep learning method called DeepMHADTA, which uses the multi-head self-attention mechanism in a deep residual network to predict drug-target binding affinity. On two benchmark datasets, our method outperformed several current state-of-the-art methods in terms of multiple performance measures, including mean square error (MSE), concordance index (CI), r_m^2, and area under the precision-recall curve (AUPR). The results demonstrated that our method achieved better performance in predicting drug–target binding affinity.
128
Tran HNT, Thomas JJ, Ahamed Hassain Malim NH. DeepNC: a framework for drug-target interaction prediction with graph neural networks. PeerJ 2022; 10:e13163. [PMID: 35578674 PMCID: PMC9107302 DOI: 10.7717/peerj.13163] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/25/2021] [Accepted: 03/03/2022] [Indexed: 01/12/2023]
Abstract
The exploration of drug-target interactions (DTI) is an essential stage in the drug development pipeline. Thanks to the assistance of computational models, notably in the deep learning approach, scientists have been able to shorten the time spent on this stage. Widely practiced deep learning algorithms such as convolutional neural networks and recurrent neural networks are commonly employed in DTI prediction projects. However, they can hardly utilize the natural graph structure of molecular inputs. For that reason, a graph neural network (GNN) is an applicable choice for learning the chemical and structural characteristics of molecules when it represents molecular compounds as graphs and learns the compound features from those graphs. In an effort to construct an advanced deep learning-based model for DTI prediction, we propose Deep Neural Computation (DeepNC), which is a framework utilizing three GNN algorithms: Generalized Aggregation Networks (GENConv), Graph Convolutional Networks (GCNConv), and Hypergraph Convolution-Hypergraph Attention (HypergraphConv). In short, our framework learns the features of drugs and targets by the layers of GNN and 1-D convolution network, respectively. Then, representations of the drugs and targets are fed into fully-connected layers to predict the binding affinity values. The models of DeepNC were evaluated on two benchmarked datasets (Davis, Kiba) and one independently proposed dataset (Allergy) to confirm that they are suitable for predicting the binding affinity of drugs and targets. Moreover, compared to the results of baseline methods that worked on the same problem, DeepNC proves to improve the performance in terms of mean square error and concordance index.
Affiliation(s)
- Huu Ngoc Tran Tran
  - Department of Computing, UOW Malaysia, KDU Penang University College, George Town, Penang, Malaysia
- J. Joshua Thomas
  - Department of Computing, UOW Malaysia, KDU Penang University College, George Town, Penang, Malaysia
|
129
|
Wang H, Liu H, Ning S, Zeng C, Zhao Y. DLSSAffinity: protein-ligand binding affinity prediction via a deep learning model. Phys Chem Chem Phys 2022; 24:10124-10133. [PMID: 35416807 DOI: 10.1039/d1cp05558e] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 01/22/2023]
Abstract
Evaluating the protein-ligand binding affinity is a substantial part of the computer-aided drug discovery process. Most of the proposed computational methods predict protein-ligand binding affinity using either limited full-length protein 3D structures or simple full-length protein sequences as the input features. Thus, protein-ligand binding affinity prediction remains a fundamental challenge in drug discovery. In this study, we proposed a novel deep learning-based approach, DLSSAffinity, to accurately predict the protein-ligand binding affinity. Unlike the existing methods, DLSSAffinity uses the pocket-ligand structural pairs as the local information to predict short-range direct interactions. Besides, DLSSAffinity also uses the full-length protein sequence and ligand SMILES as the global information to predict long-range indirect interactions. We tested DLSSAffinity on the PDBbind benchmark. The results showed that DLSSAffinity achieves Pearson's R = 0.79, RMSE = 1.40, and SD = 1.35 on the test set. Comparing DLSSAffinity with the existing state-of-the-art deep learning-based binding affinity prediction methods, the DLSSAffinity model outperforms other models. These results demonstrate that combining global sequence and local structure information as the input features of a deep learning model can improve the accuracy of protein-ligand binding affinity prediction.
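As an illustration of two of the metrics quoted in this abstract, the following minimal Python sketch computes Pearson's R and RMSE from lists of experimental and predicted affinities. The function names and the toy values in the comments are assumptions made for this example, not taken from the DLSSAffinity code. The abstract does not define SD, so it is left out.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation between measured and predicted affinities."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted affinities."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


# Hypothetical toy values, for illustration only:
# pearson_r([1, 2, 3], [2, 4, 6]) gives a perfect linear correlation,
# while rmse([1, 2, 3], [1, 2, 5]) penalises the single large error.
```

Pearson's R measures how well the predictions track the ranking and spread of the measurements, while RMSE measures absolute error, so a model can score well on one and poorly on the other.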
Affiliation(s)
- Huiwen Wang
  - School of Physics and Engineering, Henan University of Science and Technology, Luoyang 471023, China.
- Haoquan Liu
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China.
- Shangbo Ning
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China.
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China.
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan 430079, China.
|
130
|
Kalakoti Y, Yadav S, Sundar D. Deep Neural Network-Assisted Drug Recommendation Systems for Identifying Potential Drug-Target Interactions. ACS OMEGA 2022; 7:12138-12146. [PMID: 35449922 PMCID: PMC9016825 DOI: 10.1021/acsomega.2c00424] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/21/2022] [Accepted: 03/18/2022] [Indexed: 06/14/2023]
Abstract
In silico methods to identify novel drug-target interactions (DTIs) have gained significant importance over conventional techniques owing to their labor-intensive and low-throughput nature. Here, we present a machine learning-based multiclass classification workflow that segregates interactions between active, inactive, and intermediate drug-target pairs. Drug molecules, protein sequences, and molecular descriptors were transformed into machine-interpretable embeddings to extract critical features from standard datasets. Tools such as CHEMBL web resource, iFeature, and an in-house developed deep neural network-assisted drug recommendation (dNNDR)-featx were employed for data retrieval and processing. The models were trained with large-scale DTI datasets, which reported an improvement in performance over baseline methods. External validation results showed that models based on att-biLSTM and gCNN could help predict novel DTIs. When tested with a completely different dataset, the proposed models significantly outperformed competing methods. The validity of novel interactions predicted by dNNDR was backed by experimental and computational evidence in the literature. The proposed methodology could elucidate critical features that govern the relationship between a drug and its target.
Affiliation(s)
- Yogesh Kalakoti
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
- Shashank Yadav
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
- Durai Sundar
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
  - School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
|
131
|
Graph neural network approaches for drug-target interactions. Curr Opin Struct Biol 2022; 73:102327. [DOI: 10.1016/j.sbi.2021.102327] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/16/2021] [Revised: 11/22/2021] [Accepted: 12/13/2021] [Indexed: 01/06/2023]
|
132
|
Ru X, Ye X, Sakurai T, Zou Q. NerLTR-DTA: drug-target binding affinity prediction based on neighbor relationship and learning to rank. Bioinformatics 2022; 38:1964-1971. [PMID: 35134828 DOI: 10.1093/bioinformatics/btac048] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 10/29/2021] [Revised: 12/20/2021] [Accepted: 01/28/2022] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Drug-target interaction prediction plays an important role in new drug discovery and drug repurposing. Binding affinity indicates the strength of drug-target interactions. Predicting drug-target binding affinity is expected to provide promising candidates for biologists, which can effectively reduce the workload of wet laboratory experiments and speed up the entire process of drug research. Given that numerous new proteins are sequenced and compounds are synthesized, several improved computational methods have been proposed for such predictions, but there are still some challenges. (i) Many methods only discuss and implement one application scenario; they focus on drug repurposing and ignore the discovery of new drugs and targets. (ii) Many methods do not consider the priority order of proteins (or drugs) related to each target drug (or protein). Therefore, it is necessary to develop a comprehensive method that can be used in multiple scenarios and focuses on candidate order. RESULTS In this study, we propose a method called NerLTR-DTA that uses the neighbor relationship of similarity and sharing to extract features, and applies a ranking framework with regression attributes to predict affinity values and priority order of query drug (or query target) and its related proteins (or compounds). It is worth noting that using the characteristics of learning to rank to set different queries can smartly realize the multi-scenario application of the method, including the discovery of new drugs and new targets. Experimental results on two commonly used datasets show that NerLTR-DTA outperforms some state-of-the-art competing methods. NerLTR-DTA achieves excellent performance in all application scenarios mentioned in this study, and the r_m^2(test) values guarantee such excellent performance is not obtained by chance.
Moreover, it can be concluded that NerLTR-DTA can provide accurate ranking lists for the relevant results of most queries through the statistics of the association relationship of each query drug (or query protein). In general, NerLTR-DTA is a powerful tool for predicting drug-target associations and can contribute to new drug discovery and drug repurposing. AVAILABILITY AND IMPLEMENTATION The proposed method is implemented in Python and Java. Source codes and datasets are available at https://github.com/RUXIAOQING964914140/NerLTR-DTA.
Affiliation(s)
- Xiaoqing Ru
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
|
133
|
Affinity2Vec: drug-target binding affinity prediction through representation learning, graph mining, and machine learning. Sci Rep 2022; 12:4751. [PMID: 35306525 PMCID: PMC8934358 DOI: 10.1038/s41598-022-08787-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/22/2021] [Accepted: 03/08/2022] [Indexed: 11/21/2022] Open
Abstract
Drug-target interaction (DTI) prediction plays a crucial role in drug repositioning and virtual drug screening. Most DTI prediction methods cast the problem as a binary classification task to predict if interactions exist or as a regression task to predict continuous values that indicate a drug's ability to bind to a specific target. The regression-based methods provide insight beyond the binary relationship. However, most of these methods require three-dimensional (3D) structural information about the targets, which is still not available for many targets. Despite this bottleneck, only a few methods address the drug-target binding affinity (DTBA) problem from a non-structure-based approach to avoid the 3D structure limitations. Here we propose Affinity2Vec, a novel regression-based method that formulates the entire task as a graph-based problem. To develop this method, we constructed a weighted heterogeneous graph that integrates data from several sources, including drug-drug similarity, target-target similarity, and drug-target binding affinities. Affinity2Vec further combines several computational techniques from feature representation learning, graph mining, and machine learning to generate or extract features, build the model, and predict the binding affinity between the drug and the target with no 3D structural data. We conducted extensive experiments to evaluate and demonstrate the robustness and efficiency of the proposed method on benchmark datasets used in state-of-the-art non-structure-based drug-target binding affinity studies. Affinity2Vec showed superior and competitive results compared to the state-of-the-art methods based on several evaluation metrics, including mean squared error, r_m^2, concordance index, and area under the precision-recall curve.
|
134
|
Wang J, Wen N, Wang C, Zhao L, Cheng L. ELECTRA-DTA: a new compound-protein binding affinity prediction model based on the contextualized sequence encoding. J Cheminform 2022; 14:14. [PMID: 35292100 PMCID: PMC8922401 DOI: 10.1186/s13321-022-00591-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 09/24/2021] [Accepted: 02/17/2022] [Indexed: 12/28/2022] Open
Abstract
Motivation Drug-target binding affinity (DTA) reflects the strength of the drug-target interaction; therefore, predicting the DTA can considerably benefit drug discovery by narrowing the search space and pruning drug-target (DT) pairs with low binding affinity scores. Representation learning using deep neural networks has achieved promising performance compared with traditional machine learning methods; hence, extensive research efforts have been made in learning the feature representation of proteins and compounds. However, such feature representation learning relies on a large-scale labelled dataset, which is not always available. Results We present an end-to-end deep learning framework, ELECTRA-DTA, to predict the binding affinity of drug-target pairs. This framework incorporates an unsupervised learning mechanism to train two ELECTRA-based contextual embedding models, one for protein amino acids and the other for compound SMILES string encoding. In addition, ELECTRA-DTA leverages a squeeze-and-excitation (SE) convolutional neural network block stacked over three fully connected layers to further capture the sequential and spatial features of the protein sequence and SMILES for the DTA regression task. Experimental evaluations show that ELECTRA-DTA outperforms various state-of-the-art DTA prediction models, especially with the challenging, interaction-sparse BindingDB dataset. In target selection and drug repurposing for COVID-19, ELECTRA-DTA also offers competitive performance, suggesting its potential in speeding drug discovery and generalizability for other compound- or protein-related computational tasks. Supplementary Information The online version contains supplementary material available at 10.1186/s13321-022-00591-x.
Affiliation(s)
- Junjie Wang
  - Department of Medical Informatics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, People's Republic of China
- NaiFeng Wen
  - School of Mechanical and Electrical Engineering, Dalian Minzu University, Dalian, People's Republic of China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, People's Republic of China
- Lingling Zhao
  - Faculty of Computing, Harbin Institute of Technology, Harbin, People's Republic of China.
- Liang Cheng
  - NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, People's Republic of China.
  - College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, People's Republic of China.
|
135
|
A Brief Review of Machine Learning-Based Bioactive Compound Research. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12062906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
Bioactive compounds are often used as initial substances for many therapeutic agents. In recent years, both theoretical and practical innovations in hardware-assisted and fast-evolving machine learning (ML) have made it possible to identify desired bioactive compounds in chemical spaces, such as those in natural products (NPs). This review introduces how machine learning approaches can be used for the identification and evaluation of bioactive compounds. It also provides an overview of recent research trends in machine learning-based prediction and the evaluation of bioactive compounds by listing real-world examples along with various input data. In addition, several ML-based approaches to identify specific bioactive compounds for cardiovascular and metabolic diseases are described. Overall, these approaches are important for the discovery of novel bioactive compounds and provide new insights into the machine learning basis for various traditional applications of bioactive compound-related research.
|
136
|
A Novel Deep Neural Network Technique for Drug–Target Interaction. Pharmaceutics 2022; 14:pharmaceutics14030625. [PMID: 35336000 PMCID: PMC8954728 DOI: 10.3390/pharmaceutics14030625] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/08/2022] [Revised: 03/08/2022] [Accepted: 03/08/2022] [Indexed: 01/20/2023] Open
Abstract
Drug discovery (DD) is a time-consuming and expensive process. Thus, the industry employs strategies such as drug repositioning and drug repurposing, which allow already approved drugs to be applied to a different disease, as occurred in the first months of 2020, during the COVID-19 pandemic. The prediction of drug–target interactions (DTIs) is an essential part of the DD process because it can accelerate it and reduce the required costs. In silico DTI prediction has used approaches based on molecular docking simulations, including similarity-based and network- and graph-based ones. This paper presents MPS2IT-DTI, a DTI prediction model obtained from research conducted in the following steps: the definition of a new method for encoding molecule and protein sequences onto images; the definition of a deep-learning approach based on a convolutional neural network in order to create a new method for DTI prediction. Training results obtained with the Davis and KIBA datasets show that MPS2IT-DTI is viable compared to other state-of-the-art (SOTA) approaches in terms of performance and complexity of the neural network model. With the Davis dataset, we obtained 0.876 for the concordance index and 0.276 for the MSE; with the KIBA dataset, we obtained 0.836 and 0.226 for the concordance index and the MSE, respectively. Moreover, the MPS2IT-DTI model represents molecule and protein sequences as images, instead of treating them as an NLP task, and as such, does not employ an embedding layer, which is present in other models.
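The concordance index and MSE reported above can be sketched in a few lines of Python. This is a minimal illustration under common conventions (pairs with equal true affinity are skipped, tied predictions count as half-concordant), which are assumptions here and not taken from the MPS2IT-DTI code; the function names are hypothetical.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order.

    Assumed conventions: pairs with equal true values are not comparable,
    and pairs with equal predictions contribute 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            d_true = y_true[i] - y_true[j]
            if d_true == 0:
                continue  # equal true affinities carry no ordering information
            comparable += 1
            d_pred = y_pred[i] - y_pred[j]
            if d_pred == 0:
                concordant += 0.5
            elif (d_true > 0) == (d_pred > 0):
                concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted affinities."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```

The pairwise loop is quadratic in the number of samples, which is fine for a sketch but would need a faster implementation on datasets the size of KIBA.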
|
137
|
Soh J, Park S, Lee H. HIDTI: integration of heterogeneous information to predict drug-target interactions. Sci Rep 2022; 12:3793. [PMID: 35260608 PMCID: PMC8904809 DOI: 10.1038/s41598-022-07608-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 06/23/2021] [Accepted: 02/16/2022] [Indexed: 11/21/2022] Open
Abstract
Identification of drug-target interactions (DTIs) plays a crucial role in drug development. Traditional laboratory-based DTI discovery is generally costly and time-consuming. Therefore, computational approaches have been developed to predict interactions between drug candidates and disease-causing proteins. We designed a novel method, termed heterogeneous information integration for DTI prediction (HIDTI), based on the concept of predicting vectors for all unknown/unavailable heterogeneous drug- and protein-related information. We applied a residual network in HIDTI to extract features of such heterogeneous information for predicting DTIs, and tested the model using drug-based ten-fold cross-validation to examine the prediction performance for unseen drugs. As a result, HIDTI outperformed existing models using heterogeneous information, demonstrating that our method predicted heterogeneous information on unseen data better than other models. In conclusion, our study suggests that HIDTI has the potential to advance the field of drug development by accurately predicting the targets of new drugs.
Affiliation(s)
- Jihee Soh
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, South Korea
- Sejin Park
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, South Korea
- Hyunju Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, South Korea.
|
138
|
Du BX, Qin Y, Jiang YF, Xu Y, Yiu SM, Yu H, Shi JY. Compound–protein interaction prediction by deep learning: Databases, descriptors and models. Drug Discov Today 2022; 27:1350-1366. [DOI: 10.1016/j.drudis.2022.02.023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Revised: 11/19/2021] [Accepted: 02/28/2022] [Indexed: 11/24/2022]
|
139
|
Nikolaienko T, Gurbych O, Druchok M. Complex machine learning model needs complex testing: Examining predictability of molecular binding affinity by a graph neural network. J Comput Chem 2022; 43:728-739. [PMID: 35201629 DOI: 10.1002/jcc.26831] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/22/2021] [Revised: 01/04/2022] [Accepted: 02/09/2022] [Indexed: 12/12/2022]
Abstract
Drug discovery pipelines typically involve high-throughput screening of large numbers of compounds in search of potential drug candidates. Because the chemical space of small organic molecules is huge, "navigating" it calls for fast and lightweight computational methods, thus promoting machine-learning approaches for processing huge pools of candidates. In this contribution, we present a graph-based deep neural network for prediction of protein-drug binding affinity and assess its predictive power under thorough testing conditions. Within the suggested approach, both protein and drug molecules are represented as graphs and passed to separate graph sub-networks, whose outputs are then concatenated and regressed towards a binding affinity. The neural network is trained on two binding affinity datasets: PDBbind and data imported from the RCSB Protein Data Bank. In order to explore the generalization capabilities of the model, we go beyond traditional random or leave-cluster-out techniques and demonstrate the need for more elaborate model performance assessment: six different strategies for test/train data partitioning (random, time- and property-arranged, protein- and ligand-clustered) with k-fold cross-validation are engaged. Finally, we discuss the model performance in terms of a set of metrics for different split strategies and fold arrangements. Our code is available at https://github.com/SoftServeInc/affinity-by-GNN.
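To make the idea of a clustered test/train partition concrete, here is a minimal Python sketch of a group-aware k-fold split, in which all samples sharing a cluster label (for example a protein or ligand cluster) land in the same fold. It is not the authors' implementation; the function name `group_k_fold`, the greedy balancing rule, and the labels in the usage example are all assumptions made for illustration.

```python
from collections import defaultdict


def group_k_fold(groups, k):
    """Yield (train_idx, test_idx) pairs so that no group spans train and test.

    `groups` holds one cluster label per sample. Groups are assigned greedily,
    largest first, to whichever fold currently has the fewest samples.
    """
    members = defaultdict(list)
    for idx, label in enumerate(groups):
        members[label].append(idx)
    if k < 2 or k > len(members):
        raise ValueError("k must be between 2 and the number of distinct groups")

    folds = [[] for _ in range(k)]
    # Sort by descending size, then by label text, so the split is deterministic.
    for label in sorted(members, key=lambda g: (-len(members[g]), str(g))):
        smallest = min(range(k), key=lambda f: len(folds[f]))
        folds[smallest].extend(members[label])

    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for other in range(k) if other != f for i in folds[other])
        yield train_idx, test_idx


# Usage with hypothetical protein-cluster labels:
# for train_idx, test_idx in group_k_fold(["P1", "P1", "P2", "P3", "P3", "P3"], k=2):
#     ...train on train_idx, evaluate on test_idx...
```

Compared with a random split, this keeps near-duplicate proteins or ligands from appearing on both sides of the partition, which is the kind of leakage the stricter strategies in the abstract are meant to expose.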
Affiliation(s)
- Tymofii Nikolaienko
  - SoftServe, Inc., Lviv, Ukraine
  - Faculty of Physics, Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Oleksandr Gurbych
  - Blackthorn AI Ltd., London, UK
  - Department of Artificial Intelligence Systems, Lviv Polytechnic National University, Lviv, Ukraine
- Maksym Druchok
  - SoftServe, Inc., Lviv, Ukraine
  - Institute for Condensed Matter Physics, NAS of Ukraine, Lviv, Ukraine
|
140
|
Kalakoti Y, Yadav S, Sundar D. TransDTI: Transformer-Based Language Models for Estimating DTIs and Building a Drug Recommendation Workflow. ACS OMEGA 2022; 7:2706-2717. [PMID: 35097268 PMCID: PMC8792915 DOI: 10.1021/acsomega.1c05203] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/19/2021] [Accepted: 12/28/2021] [Indexed: 06/09/2023]
Abstract
The identification of novel drug-target interactions is a labor-intensive and low-throughput process. In silico alternatives have proved to be of immense importance in assisting the drug discovery process. Here, we present TransDTI, a multiclass classification and regression workflow employing transformer-based language models to segregate interactions between drug-target pairs as active, inactive, and intermediate. The models were trained with large-scale drug-target interaction (DTI) data sets and showed an improvement over baseline methods in terms of the area under the receiver operating characteristic curve (auROC), the area under the precision-recall curve (auPR), Matthews correlation coefficient (MCC), and R². The results showed that transformer-based language models effectively predict novel drug-target interactions from sequence data. The proposed models significantly outperformed existing methods like DeepConvDTI, DeepDTA, and DeepDTI on a test data set. Further, the validity of novel interactions predicted by TransDTI was found to be backed by molecular docking and simulation analysis, where the model prediction had similar or better interaction potential for MAP2k and transforming growth factor-β (TGFβ) and their known inhibitors. The proposed approaches can have a significant impact on the development of personalized therapy and clinical decision making.
Affiliation(s)
- Yogesh Kalakoti
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110016, India
- Shashank Yadav
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110016, India
- Durai Sundar
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110016, India
  - School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi 110016, India
|
141
|
Yang Z, Zhong W, Zhao L, Yu-Chian Chen C. MGraphDTA: deep multiscale graph neural network for explainable drug-target binding affinity prediction. Chem Sci 2022; 13:816-833. [PMID: 35173947 PMCID: PMC8768884 DOI: 10.1039/d1sc05180f] [Citation(s) in RCA: 85] [Impact Index Per Article: 42.5] [Received: 09/18/2021] [Accepted: 12/17/2021] [Indexed: 12/22/2022] Open
Abstract
Predicting drug-target affinity (DTA) is beneficial for accelerating drug discovery. Graph neural networks (GNNs) have been widely used in DTA prediction. However, existing shallow GNNs are insufficient to capture the global structure of compounds. Besides, the interpretability of the graph-based DTA models highly relies on the graph attention mechanism, which can not reveal the global relationship between each atom of a molecule. In this study, we proposed a deep multiscale graph neural network based on chemical intuition for DTA prediction (MGraphDTA). We introduced a dense connection into the GNN and built a super-deep GNN with 27 graph convolutional layers to capture the local and global structure of the compound simultaneously. We also developed a novel visual explanation method, gradient-weighted affinity activation mapping (Grad-AAM), to analyze a deep learning model from the chemical perspective. We evaluated our approach using seven benchmark datasets and compared the proposed method to the state-of-the-art deep learning (DL) models. MGraphDTA outperforms other DL-based approaches significantly on various datasets. Moreover, we show that Grad-AAM creates explanations that are consistent with pharmacologists, which may help us gain chemical insights directly from data beyond human perception. These advantages demonstrate that the proposed method improves the generalization and interpretation capability of DTA prediction modeling.
Affiliation(s)
- Ziduo Yang
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Weihe Zhong
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
- Lu Zhao
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
  - Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University Guangzhou 510655 China
- Calvin Yu-Chian Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University Shenzhen 510275 China +862039332153
  - Department of Medical Research, China Medical University Hospital Taichung 40447 Taiwan
  - Department of Bioinformatics and Medical Engineering, Asia University Taichung 41354 Taiwan
|
142
|
Timmons JA, Anighoro A, Brogan RJ, Stahl J, Wahlestedt C, Farquhar DG, Taylor-King J, Volmar CH, Kraus WE, Phillips SM. A human-based multi-gene signature enables quantitative drug repurposing for metabolic disease. eLife 2022; 11:68832. [PMID: 35037854 PMCID: PMC8763401 DOI: 10.7554/elife.68832] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/26/2021] [Accepted: 11/26/2021] [Indexed: 12/22/2022] Open
Abstract
Insulin resistance (IR) contributes to the pathophysiology of diabetes, dementia, viral infection, and cardiovascular disease. Drug repurposing (DR) may identify treatments for IR; however, barriers include uncertainty whether in vitro transcriptomic assays yield quantitative pharmacological data, or how to optimise assay design to best reflect in vivo human disease. We developed a clinical-based human tissue IR signature by combining lifestyle-mediated treatment responses (>500 human adipose and muscle biopsies) with biomarkers of disease status (fasting IR from >1200 biopsies). The assay identified a chemically diverse set of >130 positively acting compounds, highly enriched in true positives, that targeted 73 proteins regulating IR pathways. Our multi-gene RNA assay score reflected the quantitative pharmacological properties of a set of epidermal growth factor receptor-related tyrosine kinase inhibitors, providing insight into drug target specificity; an observation supported by deep learning-based genome-wide predicted pharmacology. Several drugs identified are suitable for evaluation in patients, particularly those with either acute or severe chronic IR.
Affiliation(s)
- James A Timmons
  - William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
  - Augur Precision Medicine LTD, Stirling, United Kingdom
- Jack Stahl
  - Center for Therapeutic Innovation, Miller School of Medicine, University of Miami, Miami, United States
- Claes Wahlestedt
  - Center for Therapeutic Innovation, Miller School of Medicine, University of Miami, Miami, United States
- Claude-Henry Volmar
  - Center for Therapeutic Innovation, Miller School of Medicine, University of Miami, Miami, United States
- Stuart M Phillips
  - Faculty of Science, Kinesiology, McMaster University, Hamilton, Canada
|
143
|
Zhao Q, Zhao H, Zheng K, Wang J. HyperAttentionDTI: improving drug-protein interaction prediction by sequence-based deep learning with attention mechanism. Bioinformatics 2022; 38:655-662. [PMID: 34664614 DOI: 10.1093/bioinformatics/btab715] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Received: 07/06/2021] [Revised: 09/24/2021] [Accepted: 10/13/2021] [Indexed: 02/06/2023] Open
Abstract
MOTIVATION Identifying drug-target interactions (DTIs) is a crucial step in drug repurposing and drug discovery. Accurately identifying DTIs in silico can significantly shorten development time and reduce costs. Recently, many sequence-based methods have been proposed for DTI prediction and have improved performance by introducing the attention mechanism. However, these methods only model single non-covalent inter-molecular interactions among drugs and proteins and ignore the complex interaction between atoms and amino acids. RESULTS In this article, we propose an end-to-end bio-inspired model based on the convolutional neural network (CNN) and attention mechanism, named HyperAttentionDTI, for predicting DTIs. We use deep CNNs to learn the feature matrices of drugs and proteins. To model complex non-covalent inter-molecular interactions among atoms and amino acids, we utilize the attention mechanism on the feature matrices and assign an attention vector to each atom or amino acid. We evaluate HyperAttentionDTI on three benchmark datasets and the results show that our model achieves significantly improved performance compared with the state-of-the-art baselines. Moreover, a case study on the human Gamma-aminobutyric acid receptors confirms that our model can be used as a powerful tool to predict DTIs. AVAILABILITY AND IMPLEMENTATION The codes of our model are available at https://github.com/zhaoqichang/HpyerAttentionDTI and https://zenodo.org/record/5039589. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qichang Zhao: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Haochen Zhao: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Kai Zheng: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jianxin Wang: Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
144
|
Karthikeyan A, Priyakumar UD. Artificial intelligence: machine learning for chemical sciences. J Chem Sci 2021; 134:2. [PMID: 34955617 PMCID: PMC8691161 DOI: 10.1007/s12039-021-01995-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/09/2021] [Revised: 09/08/2021] [Accepted: 09/14/2021] [Indexed: 12/05/2022]
Abstract
Research in molecular sciences witnessed the rise and fall of Artificial Intelligence (AI)/Machine Learning (ML) methods, especially artificial neural networks, a few decades ago. However, the last few years have seen a major resurgence in the use of modern ML methods in scientific research. These methods have had phenomenal success in areas such as computer vision, speech recognition, and natural language processing (NLP). This has inspired chemists and biologists to apply these algorithms to problems in the natural sciences. The availability of high-performance Graphics Processing Unit (GPU) accelerators, large datasets, new algorithms, and libraries has enabled this surge. ML algorithms have been applied successfully to various domains in molecular sciences, providing much faster and sometimes more accurate solutions than traditional methods such as Quantum Mechanical (QM) calculations, Density Functional Theory (DFT), or Molecular Mechanics (MM) based methods. Some of the areas where ML methods have been shown to be effective are drug design, prediction of high-level quantum mechanical energies, molecular design, molecular dynamics materials, and retrosynthesis of organic compounds. This article intends to conceptually introduce various modern ML methods and their relevance and applications in computational natural sciences.
Synopsis: The recent surge in the application of machine learning (ML) methods in fundamental sciences has led to a perspective that these methods may become important tools in chemical science. This perspective provides an overview of the modern ML methods and their successful applications in chemistry during the last few years.
Affiliation(s)
- Akshaya Karthikeyan: Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500 032 India
- U Deva Priyakumar: Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, 500 032 India
|
145
|
Yuan W, Chen G, Chen CYC. FusionDTA: attention-based feature polymerizer and knowledge distillation for drug-target binding affinity prediction. Brief Bioinform 2021; 23:6470967. [PMID: 34929738 DOI: 10.1093/bib/bbab506] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 08/08/2021] [Revised: 10/21/2021] [Accepted: 11/03/2021] [Indexed: 12/29/2022] Open
Abstract
The prediction of drug-target affinity (DTA) plays an increasingly important role in drug discovery. Many current prediction methods focus on the feature encoding of drugs and proteins but ignore the importance of feature aggregation. Moreover, increasingly complex encoder networks lead to the loss of implicit information and excessive model size. To this end, we propose a deep-learning-based approach named FusionDTA. To address the loss of implicit information, a novel multi-head linear attention mechanism is used to replace the rough pooling method. This allows FusionDTA to aggregate global information based on attention weights, instead of selecting the largest value as max-pooling does. To solve the parameter redundancy issue, we apply knowledge distillation in FusionDTA by transferring learnable information from a teacher model to a student model. Results show that FusionDTA performs better than existing models for the test domain on all evaluation metrics. We obtained a concordance index (CI) of 0.913 and 0.906 on the Davis and KIBA datasets, respectively, compared with 0.893 and 0.891 for the previous state-of-the-art model. Under the cold-start constraint, our model proved to be more robust and more effective with unseen inputs than baseline methods. In addition, knowledge distillation saved half of the parameters of the model, with a reduction of only 0.006 in CI. Even FusionDTA with half the parameters could easily exceed the baseline on all metrics. In general, our model has superior performance and improves drug-target interaction (DTI) prediction. The visualization of DTI can effectively help predict the binding region of proteins during structure-based drug design.
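The concordance index (CI) quoted in this abstract measures how often a model ranks a pair of affinities in the correct order. The following is a minimal sketch in plain Python, assuming the common pairwise definition used in DTA benchmarks (pairs with equal true affinity are skipped, tied predictions count as 0.5). The function name `concordance_index` and the toy values are illustrative and are not taken from the FusionDTA code.

```python
def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs whose predicted order matches the true order."""
    concordant = 0.0
    comparable = 0
    n = len(y_true)
    for i in range(n):
        for j in range(i + 1, n):
            if y_true[i] == y_true[j]:
                continue  # equal true affinities carry no ranking information
            comparable += 1
            # hi is the index with the larger true affinity
            hi, lo = (i, j) if y_true[i] > y_true[j] else (j, i)
            if y_pred[hi] > y_pred[lo]:
                concordant += 1.0
            elif y_pred[hi] == y_pred[lo]:
                concordant += 0.5  # tied prediction counts as half
    if comparable == 0:
        raise ValueError("no comparable pairs: all true values are equal")
    return concordant / comparable


# Toy affinities (made-up values): 5 of the 6 pairs are ranked correctly.
true_affinity = [5.0, 7.2, 6.1, 8.4]
predicted = [5.3, 6.9, 7.0, 8.0]
ci = concordance_index(true_affinity, predicted)
```

A CI of 1.0 means every pair is ordered correctly, 0.5 corresponds to random ranking, and the quadratic loop is fine for small examples but would need a faster formulation on full benchmark datasets.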
Affiliation(s)
- Weining Yuan: Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, 510275, China
- Guanxing Chen: Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, 510275, China
- Calvin Yu-Chian Chen: Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, 510275, China; Guangdong Provincial Key Laboratory of Fire Science and Technology, Guangzhou, 510006, China; Department of Medical Research, China Medical University Hospital, Taichung, 40447, Taiwan; Department of Bioinformatics and Medical Engineering, Asia University, Taichung, 41354, Taiwan
|
146
|
Multilevel Attention Models for Drug Target Binding Affinity Prediction. Neural Process Lett 2021. [DOI: 10.1007/s11063-021-10617-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
|
147
|
Jin Y, Lu J, Shi R, Yang Y. EmbedDTI: Enhancing the Molecular Representations via Sequence Embedding and Graph Convolutional Network for the Prediction of Drug-Target Interaction. Biomolecules 2021; 11:biom11121783. [PMID: 34944427 PMCID: PMC8698792 DOI: 10.3390/biom11121783] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/16/2021] [Revised: 11/20/2021] [Accepted: 11/24/2021] [Indexed: 01/09/2023] Open
Abstract
The identification of drug-target interactions (DTIs) plays a key role in drug discovery and development. Benefitting from large-scale drug databases and verified DTI relationships, many machine-learning methods have been developed to predict DTIs. However, because useful information is difficult to extract from molecules, the performance of these methods is limited by the representation of drugs and target proteins. This study proposes a new model called EmbedDTI to enhance the representation of both drugs and target proteins and to improve the performance of DTI prediction. For protein sequences, we leverage language modeling to pretrain the feature embeddings of amino acids and feed them to a convolutional neural network model for further representation learning. For drugs, we build two levels of graphs to represent compound structural information, namely the atom graph and the substructure graph, and adopt a graph convolutional network with an attention module to learn the embedding vectors for the graphs. We compare EmbedDTI with existing DTI predictors on two benchmark datasets. The experimental results show that EmbedDTI outperforms the state-of-the-art models and that the attention module can identify the components crucial for DTIs in compounds.
Affiliation(s)
- Yuan Jin: Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
- Jiarui Lu: School of Chemistry and Chemical Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
- Runhan Shi: Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
- Yang Yang: Center for Brain-Like Computing and Machine Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Rd., Shanghai 200240, China
|
148
|
Anwaar MU, Adnan F, Abro A, Khan RA, Rehman AU, Osama M, Rainville C, Kumar S, Sterner DE, Javed S, Jamal SB, Baig A, Shabbir MR, Ahsan W, Butt TR, Assir MZ. Combined deep learning and molecular docking simulations approach identifies potentially effective FDA approved drugs for repurposing against SARS-CoV-2. Comput Biol Med 2021; 141:105049. [PMID: 34823857 PMCID: PMC8604796 DOI: 10.1016/j.compbiomed.2021.105049] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/05/2021] [Revised: 11/11/2021] [Accepted: 11/15/2021] [Indexed: 01/10/2023]
Abstract
The ongoing pandemic of Coronavirus Disease 2019 (COVID-19) has posed a serious threat to global public health. Drug repurposing is a time-efficient approach to finding effective drugs against SARS-CoV-2 in this emergency. Here, we present a robust experimental design combining deep learning with molecular docking experiments to identify the most promising candidates from the list of FDA-approved drugs that can be repurposed to treat COVID-19. We employed a deep learning-based Drug Target Interaction (DTI) model called DeepDTA, with a few improvements, to predict drug-protein binding affinities, represented as KIBA scores, for 2440 FDA-approved and 8168 investigational drugs against 24 SARS-CoV-2 viral proteins. FDA-approved drugs with the highest KIBA scores were selected for molecular docking simulations. We ran around 50,000 docking simulations for 168 selected drugs against a total of 285 predicted and/or experimentally proven active sites of all 24 SARS-CoV-2 viral proteins. A list of the 49 most promising FDA-approved drugs with the best consensus KIBA scores and binding affinity values against selected SARS-CoV-2 viral proteins was generated. Most importantly, 16 drugs, including anidulafungin, velpatasvir, glecaprevir, rifapentine, flavin adenine dinucleotide (FAD), terlipressin, and selinexor, demonstrated the highest predicted inhibitory potential against key SARS-CoV-2 viral proteins. We further measured the inhibitory activity of 5 compounds (rifapentine, velpatasvir, glecaprevir, anidulafungin, and FAD disodium) on SARS-CoV-2 PLpro using a Ubiquitin-Rhodamine 110 Gly fluorescence intensity assay. The highest inhibition of PLpro activity was seen with rifapentine (IC50: 15.18 μM) and FAD disodium (IC50: 12.39 μM), drugs with high predicted KIBA scores and binding affinities.
Affiliation(s)
- Muhammad U Anwaar: Department of Electrical and Computer Engineering, Technical University Munich, Arcisstraße 21, 80333, München, Germany
- Farjad Adnan: Paderborn University, Warburger Str. 100, 33098, Paderborn, Germany
- Asma Abro: Department of Biotechnology, Faculty of Life Sciences and Informatics, Balochistan University of Information Technology, Engineering and Management Sciences, Quetta, 1800, Pakistan
- Rayyan A Khan: Department of Electrical and Computer Engineering, Technical University Munich, Arcisstraße 21, 80333, München, Germany
- Asad U Rehman: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan; Center for Undiagnosed, Rare and Emerging Diseases, Lahore, 54550, Pakistan
- Muhammad Osama: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan; Center for Undiagnosed, Rare and Emerging Diseases, Lahore, 54550, Pakistan
- Suresh Kumar: Progenra Inc, 271A Great Valley Parkway, Malvern, PA, 19355, USA
- David E Sterner: Progenra Inc, 271A Great Valley Parkway, Malvern, PA, 19355, USA
- Saad Javed: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan; Center for Undiagnosed, Rare and Emerging Diseases, Lahore, 54550, Pakistan
- Syed B Jamal: Department of Biological Sciences, National University of Medical Sciences, Rawalpindi, Pakistan
- Ahmadullah Baig: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan
- Muhammad R Shabbir: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan; Center for Undiagnosed, Rare and Emerging Diseases, Lahore, 54550, Pakistan
- Waseh Ahsan: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan
- Tauseef R Butt: Progenra Inc, 271A Great Valley Parkway, Malvern, PA, 19355, USA
- Muhammad Z Assir: Department of Medicine, Allama Iqbal Medical College, University of Health Sciences, Lahore, 54550, Pakistan; Center for Undiagnosed, Rare and Emerging Diseases, Lahore, 54550, Pakistan; Department of Molecular Biology, Shaheed Zulfiqar Ali Bhutto Medical University, Islamabad, 44000, Pakistan
|
149
|
MacKinnon SS, Madani Tonekaboni SA, Windemuth A. Proteome-Scale Drug-Target Interaction Predictions: Approaches and Applications. Curr Protoc 2021; 1:e302. [PMID: 34794211 DOI: 10.1002/cpz1.302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022]
Abstract
Drug-Target interaction predictions are an important cornerstone of computer-aided drug discovery. While predictive methods around individual targets have a long history, the application of proteome-scale models is relatively recent. In this overview, we will provide the context required to understand advances in this emerging field within computational drug discovery, evaluate emerging technologies for suitability to given tasks, and provide guidelines for the design and implementation of new drug-target interaction prediction models. We will discuss the validation approaches used, and propose a set of key criteria that should be applied to evaluate their validity. We note that we find widespread deficiencies in the existing literature, making it difficult to judge the practical effectiveness of some of the techniques proposed from their publications alone. We hope that this review may help remedy this situation and increase awareness of several sources of bias that may enter into commonly used cross-validation methods. © 2021 Cyclica Inc. Current Protocols published by Wiley Periodicals LLC.
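One commonly discussed source of bias in cross-validation for drug-target models is entity leakage: if pairs are split at random, the same drug can appear in both the training and the test set, which tends to inflate scores. The abstract does not list the specific biases it covers, so treating leakage as one of them is an assumption here. The sketch below shows a "cold drug" split in plain Python, where every test drug is unseen during training. The function name `cold_drug_split`, the `(drug, target, label)` tuple layout, and the toy data are all hypothetical.

```python
import random


def cold_drug_split(pairs, test_fraction=0.2, seed=0):
    """Split (drug, target, label) tuples so no test drug occurs in training."""
    drugs = sorted({drug for drug, _, _ in pairs})
    rng = random.Random(seed)  # local generator keeps the split reproducible
    rng.shuffle(drugs)
    n_test = max(1, round(len(drugs) * test_fraction))
    test_drugs = set(drugs[:n_test])
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


# Toy data set: 5 drugs x 2 targets with made-up binary labels.
pairs = [(f"drug{d}", f"target{t}", (d + t) % 2)
         for d in range(5) for t in range(2)]
train, test = cold_drug_split(pairs, test_fraction=0.2, seed=0)

train_drugs = {p[0] for p in train}
test_drugs = {p[0] for p in test}
```

The same pattern can be applied to targets (a "cold target" split) by grouping on the second tuple element instead of the first.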
|
150
|
Lennox M, Robertson N, Devereux B. Modelling Drug-Target Binding Affinity using a BERT based Graph Neural network. Annu Int Conf IEEE Eng Med Biol Soc 2021; 2021:4348-4353. [PMID: 34892183 DOI: 10.1109/embc46164.2021.9629695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Understanding the interactions between novel drugs and target proteins is fundamentally important in disease research, as discovering drug-protein interactions can be an exceptionally time-consuming and expensive process. Alternatively, this process can be simulated using modern deep learning methods that have the potential to utilise vast quantities of data to reduce the cost and time required to provide accurate predictions. We seek to leverage a set of BERT-style models that have been pre-trained on vast quantities of both protein and drug data. The encodings produced by each model are then utilised as node representations for a graph convolutional neural network, which in turn are used to model the interactions without the need to simultaneously fine-tune both protein and drug BERT models to the task. We evaluate the performance of our approach on two drug-target interaction datasets that were previously used as benchmarks in recent work. Our results significantly improve upon a vanilla BERT baseline approach as well as the former state-of-the-art methods for each task dataset. Our approach builds upon past work in two key areas. First, we take full advantage of two large pre-trained BERT models that provide improved representations of task-relevant properties of both drugs and proteins. Second, inspired by work in natural language processing that investigates how linguistic structure is represented in such models, we perform interpretability analyses that allow us to locate functionally relevant areas of interest within each drug and protein. By modelling the drug-target interactions as a graph as opposed to a set of isolated interactions, we demonstrate the benefits of combining large pre-trained models and a graph neural network to make state-of-the-art predictions on drug-target binding affinity.
|