1
Picard M, Leclercq M, Bodein A, Scott-Boyer MP, Perin O, Droit A. Improving drug repositioning with negative data labeling using large language models. J Cheminform 2025; 17:16. [PMID: 39905466 DOI: 10.1186/s13321-025-00962-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Accepted: 01/20/2025] [Indexed: 02/06/2025] [Open Access]
Abstract
INTRODUCTION Drug repositioning offers numerous advantages, such as faster development timelines, reduced costs, and lower failure rates in drug development. Supervised machine learning is commonly used to score drug candidates but is hindered by the lack of reliable negative data (drugs that fail due to inefficacy or toxicity), which are difficult to access; this lowers prediction accuracy and generalization. Positive-Unlabeled (PU) learning has been used to overcome this issue by either randomly sampling unlabeled drugs or identifying probable negatives, but it still suffers from misclassification or oversimplified decision boundaries. RESULTS We proposed a novel strategy using a Large Language Model (GPT-4) to analyze all clinical trials on prostate cancer and systematically identify true negatives. This approach markedly improved predictive accuracy on independent test sets, with a Matthews Correlation Coefficient of 0.76 (± 0.33) compared to 0.55 (± 0.15) and 0.48 (± 0.18) for two commonly used PU learning approaches. Using our labeling strategy, we created a training set of 26 positive and 54 experimentally validated negative drugs. We then applied a machine learning ensemble to this new dataset to assess the repurposing potential of the remaining 11,043 drugs in the DrugBank database. This analysis identified 980 potential candidates for prostate cancer. A detailed review of the top 30 revealed 9 promising drugs targeting various mechanisms such as genomic instability, p53 regulation, or TMPRSS2-ERG fusion. CONCLUSION By expanding our negative data labeling approach to all diseases within the ClinicalTrials.gov database, our method could greatly advance supervised drug repositioning, offering a more accurate and data-driven path for discovering new treatments.
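For readers unfamiliar with the metric reported in this abstract, the Matthews Correlation Coefficient can be computed directly from a binary confusion matrix. The sketch below is a minimal illustration only: the function name `mcc` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    Returns a value in [-1, 1]; by convention 0.0 is returned when any
    marginal total is zero and the coefficient is undefined.
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return numerator / denominator


# Hypothetical counts for a small test set (not from the paper).
score = mcc(tp=6, tn=3, fp=1, fn=2)
print(round(score, 3))
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, which is why it is often preferred when positives and negatives are imbalanced.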
Affiliation(s)
- Milan Picard
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Mickael Leclercq
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Antoine Bodein
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Marie Pier Scott-Boyer
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada
- Olivier Perin
  - Digital Transformation and Innovation Department, L'Oréal Advanced Research, Aulnay-Sous-Bois, France
- Arnaud Droit
  - Molecular Medicine Department, CHU de Québec Research Center, Université Laval, Québec, QC, Canada.
2
Zuo Y, Wu X, Ge F, Yan H, Fei S, Liang J, Deng Z. Research progress on Drug-Target Interactions in the last five years. Anal Biochem 2025; 697:115691. [PMID: 39455038 DOI: 10.1016/j.ab.2024.115691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2024] [Revised: 10/06/2024] [Accepted: 10/16/2024] [Indexed: 10/28/2024]
Abstract
The identification of Drug-Target Interactions (DTIs) is an important step in drug discovery and drug repositioning. However, the high cost of experimental validation limits how many interactions can be identified. In contrast, computation-based approaches are both economical and efficient. This review first synthesizes existing chemical genomic approaches, provides a comprehensive summary of prevalent databases for predicting DTIs, and categorizes the feature encodings from recent years. This is followed by an overview and brief description of the methods currently in use for predicting DTIs. The strengths and weaknesses of prediction methods proposed in the last five years (2020-2024), including those based on network representation learning and graph neural networks, are then discussed in detail, evaluating the performance of the different methods on a wide range of datasets. Finally, this review explores potential directions for future DTI research, emphasizing how to improve prediction accuracy and efficiency by combining big data and emerging computing technologies.
Affiliation(s)
- Yun Zuo
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China.
- Xubin Wu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Fei Ge
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Hongjin Yan
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Sirui Fei
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Jingwen Liang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, 214000, China.
3
Zhang T, Yu C, Zhang S. CA-SQBG: Cross-attention guided Siamese quantum BiGRU for drug-drug interaction extraction. Comput Biol Med 2025; 186:109655. [PMID: 39864333 DOI: 10.1016/j.compbiomed.2025.109655] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2024] [Revised: 12/14/2024] [Accepted: 01/03/2025] [Indexed: 01/28/2025]
Abstract
Accurate and efficient drug-drug interaction extraction (DDIE) from the medical corpus is essential for pharmacovigilance, drug therapy and drug development. To address the problems of imbalanced datasets and the lack of accurate manual annotations in DDIE, a cross-attention guided Siamese quantum BiGRU (CA-SQBG) is constructed to improve feature representation learning for DDIE. It mainly consists of two quantum BiGRUs (QBiGRUs) and a cross-attention module. The two QBiGRUs are implemented as a Siamese pair in a variational quantum environment to learn the contextual semantic feature representation of drug pairs, and cross-attention is employed to learn mutual information from the Siamese QBiGRUs, which in turn allows the two modules to extract DDIs more collaboratively. Unlike BiGRU, which can only capture dependencies within sequences, the Siamese QBiGRUs use internal and external dependencies in quaternion algebra to map DDI correlations within and between multidimensional features. CA-SQBG is evaluated on the DDIExtraction2013 dataset, and the results demonstrate that it can effectively capture the inter- and intra-dependencies within multimodal features with few parameters, using a small number of training samples, and is superior to the most advanced DDIE methods. CA-SQBG offers potential applications for quantum computing and Siamese networks in the field of DDIE. Code is available at https://github.com/xaycq/CA-SQBG.
Affiliation(s)
- Ting Zhang
  - College of Electronic Information, Xijing University, Xi'an, China
- Changqing Yu
  - College of Electronic Information, Xijing University, Xi'an, China
- Shanwen Zhang
  - College of Electronic Information, Xijing University, Xi'an, China.
4
Ning Q, Wang Y, Zhao Y, Sun J, Jiang L, Wang K, Yin M. DMHGNN: Double multi-view heterogeneous graph neural network framework for drug-target interaction prediction. Artif Intell Med 2025; 159:103023. [PMID: 39579417 DOI: 10.1016/j.artmed.2024.103023] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 10/09/2024] [Accepted: 11/15/2024] [Indexed: 11/25/2024]
Abstract
Accurate identification of drug-target interactions (DTIs) plays a crucial role in drug discovery. Compared with traditional experimental methods, which are labor-intensive and time-consuming, computational methods for drug-target interaction prediction have become more popular in recent years. Conventional computational methods mostly just view the heterogeneous network constructed from drug-related and protein-related datasets instead of comprehensively exploring drug-protein pair (DPP) information. To address this limitation, we proposed a Double Multi-view Heterogeneous Graph Neural Network framework for drug-target interaction prediction (DMHGNN). In DMHGNN, one multi-view heterogeneous graph neural network is based on meta-paths and a denoising autoencoder for learning from the protein- and drug-related heterogeneous network, and another multi-view heterogeneous graph neural network is based on a multi-channel graph convolutional network for learning from the drug-protein pair similarity network. First, a meta-path-based graph encoder with an attention mechanism is used for substructure learning of complex relationships in the heterogeneous network constructed from proteins, drugs, side-effects and diseases, obtaining key information that is easily ignored in global learning of heterogeneous networks; multi-source neighbouring features for drugs and proteins are learned from the heterogeneous network via a denoising autoencoder model. Then, multi-view graphs of DPPs, including the topology graph, semantics graph and collaborative graph with shared weights, are constructed, and the multi-channel graph convolutional network (GCN) is used to learn the deep representation of DPPs. Finally, a multi-layer fully connected network is trained to predict drug-target interactions. Experiments have demonstrated its effectiveness and better performance than state-of-the-art methods.
Affiliation(s)
- Qiao Ning
  - The School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, Jiangsu, China; Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China; Neusoft Education Technology Group, Dalian 116026, Liaoning, China; Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun 130015, Jilin, China.
- Yue Wang
  - Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China
- Yaomiao Zhao
  - Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China
- Jiahao Sun
  - Computer Science and Technology, the Northeast Normal University, Changchun 999078, Jilin, China
- Lu Jiang
  - Information Science and Technology, Dalian Maritime University, Dalian 116026, Liaoning, China; Computer Science and Technology, the Northeast Normal University, Changchun 999078, Jilin, China.
- Kaidi Wang
  - Computer Science and Technology, the Northeast Normal University, Changchun 999078, Jilin, China
- Minghao Yin
  - Computer Science and Technology, the Northeast Normal University, Changchun 999078, Jilin, China.
5
Xu P, Wei Z, Li C, Yuan J, Liu Z, Liu W. Drug-Target Prediction Based on Dynamic Heterogeneous Graph Convolutional Network. IEEE J Biomed Health Inform 2024; 28:6997-7005. [PMID: 39120984 DOI: 10.1109/jbhi.2024.3441324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/11/2024]
Abstract
Novel drug-target interaction (DTI) prediction is crucial in drug discovery and repositioning. Recently, graph neural networks (GNNs) have shown promising results in identifying DTIs by using thresholds to construct heterogeneous graphs. However, an empirically selected threshold can lead to loss of valuable information, especially in sparse networks, a common scenario in DTI prediction. To make full use of the limited information available, we propose a DTI prediction model based on a Dynamic Heterogeneous Graph (DT-DHG). Progressive learning is also introduced to adjust the receptive fields of nodes. The experimental results show that our method significantly improves the performance of the original GNNs and is robust to the choice of backbone. Meanwhile, DT-DHG outperforms the state-of-the-art methods and effectively predicts novel DTIs.
6
Chakraborty C, Bhattacharya M, Lee SS, Wen ZH, Lo YH. The changing scenario of drug discovery using AI to deep learning: Recent advancement, success stories, collaborations, and challenges. Mol Ther Nucleic Acids 2024; 35:102295. [PMID: 39257717 PMCID: PMC11386122 DOI: 10.1016/j.omtn.2024.102295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
Due to the transformation of artificial intelligence (AI) tools and technologies, AI-driven drug discovery has come to the forefront. It reduces time and expenditure. Because of these advantages, pharmaceutical industries are concentrating on AI-driven drug discovery. Several drug molecules have been discovered using AI-based techniques and tools, and several newly AI-discovered drug molecules have already entered clinical trials. In this review, we first present the data and their resources in the pharmaceutical sector for AI-driven drug discovery and illustrate some significant AI and ML algorithms or techniques used in this field. We give an overview of deep neural network (NN) models and compare them with artificial NNs. Then, we illustrate recent advances in the landscape of drug discovery using AI to deep learning, such as the identification of drug targets, prediction of their structure, estimation of drug-target interaction, estimation of drug-target binding affinity, de novo drug design, prediction of drug toxicity, estimation of absorption, distribution, metabolism, excretion, and toxicity, and estimation of drug-drug interaction. Moreover, we highlight the success stories of AI-driven drug discovery and discuss several collaborations and the challenges in this area. The discussions in the article will enrich the pharmaceutical industry.
Affiliation(s)
- Chiranjib Chakraborty
  - Department of Biotechnology, School of Life Science and Biotechnology, Adamas University, Kolkata, West Bengal 700126, India
- Manojit Bhattacharya
  - Department of Zoology, Fakir Mohan University, Vyasa Vihar, Balasore, Odisha 756020, India
- Sang-Soo Lee
  - Institute for Skeletal Aging & Orthopedic Surgery, Hallym University-Chuncheon Sacred Heart Hospital, Chuncheon, Gangwon-Do 24252, Republic of Korea
- Zhi-Hong Wen
  - Department of Marine Biotechnology and Resources, National Sun Yat-sen University, Kaohsiung 80424, Taiwan
- Yi-Hao Lo
  - Department of Family Medicine, Zuoying Armed Forces General Hospital, Kaohsiung 813204, Taiwan
  - Shu-Zen Junior College of Medicine and Management, Kaohsiung 821004, Taiwan
  - Institute of Medical Science and Technology, National Sun Yat-sen University, Kaohsiung 804201, Taiwan
7
Wang W, Yu M, Sun B, Li J, Liu D, Zhang H, Wang X, Zhou Y. SMGCN: Multiple Similarity and Multiple Kernel Fusion Based Graph Convolutional Neural Network for Drug-Target Interactions Prediction. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:143-154. [PMID: 38051618 DOI: 10.1109/tcbb.2023.3339645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Accurately identifying potential drug-target interactions (DTIs) is a critical step in accelerating drug discovery. Despite many studies that have been conducted over the past decades, detecting DTIs remains a highly challenging and complicated process. Therefore, we propose a novel method called SMGCN, which combines multiple similarity and multiple kernel fusion based on Graph Convolutional Network (GCN) to predict DTIs. In order to capture the features of the network structure and fully explore direct or indirect relationships between nodes, we propose the method of multiple similarity, which combines similarity fusion matrices with Random Walk with Restart (RWR) and cosine similarity. Then, we use GCN to extract multi-layer low-dimensional embedding features. Unlike traditional GCN methods, we incorporate Multiple Kernel Learning (MKL). Finally, we use the Dual Laplace Regularized Least Squares method to predict novel DTIs through combinatorial kernels in drug and target spaces. We conduct experiments on a golden standard dataset, and demonstrate the effectiveness of our proposed model in predicting DTIs through showing significant improvements in Area Under the Curve (AUC) and Area Under the Precision-Recall Curve (AUPR). In addition, our model can also discover some new DTIs, which can be verified by the KEGG BRITE Database and relevant literature.
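Random Walk with Restart (RWR), mentioned in this abstract, can be illustrated with a short generic sketch. This is the textbook iteration p ← (1 − r)·W·p + r·p₀ on a column-normalized adjacency matrix, not the SMGCN implementation; the function name, the restart probability of 0.5, and the three-node toy graph are all assumptions chosen for illustration.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at `seed`.

    adj     -- square adjacency matrix as a list of lists (weights >= 0)
    seed    -- index of the restart node
    restart -- probability of jumping back to the seed at each step
    """
    n = len(adj)
    # Column-normalize so each column of W sums to 1 (or stays 0 if isolated).
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]

    p0 = [0.0] * n
    p0[seed] = 1.0
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 (hypothetical data).
toy_graph = [[0, 1, 0],
             [1, 0, 1],
             [0, 1, 0]]
scores = random_walk_with_restart(toy_graph, seed=0)
print([round(s, 4) for s in scores])
```

Nodes closer to the seed receive higher scores, which is what makes the resulting vector usable as a network-aware similarity profile.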
8
Ye Q, Zhang X, Lin X. Drug-Target Interaction Prediction via Graph Auto-Encoder and Multi-Subspace Deep Neural Networks. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:2647-2658. [PMID: 36107905 DOI: 10.1109/tcbb.2022.3206907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/15/2023]
Abstract
Computational prediction of drug-target interaction (DTI) is important for the new drug discovery. Currently, the deep neural network (DNN) has been widely used in DTI prediction. However, parameters of the DNN could be insufficiently trained and features of the data could be insufficiently utilized, because the DTI data is limited and its dimension is very high. To deal with the above problems, in this paper, a graph auto-encoder and multi-subspace deep neural network (GAEMSDNN) is designed. GAEMSDNN enhances its learning ability with a graph auto-encoder, a subspace layer and an ensemble layer. The graph auto-encoder can preserve the reconstruction information. The subspace layer can obtain different strong feature subsets. The ensemble layer in the GAEMSDNN can comprehensively utilize these strong feature subsets in a unified optimization framework. As a result, more features can be extracted from the network input and the DNN network can be better trained. In experiments, the results of GAEMSDNN are significantly improved compared to the previous methods, which validates the effectiveness of our strategies.
9
Yuan Y, Zhang Y, Meng X, Liu Z, Wang B, Miao R, Zhang R, Su W, Liu L. EDC-DTI: An end-to-end deep collaborative learning model based on multiple information for drug-target interactions prediction. J Mol Graph Model 2023; 122:108498. [PMID: 37126908 DOI: 10.1016/j.jmgm.2023.108498] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/15/2023] [Revised: 04/10/2023] [Accepted: 04/17/2023] [Indexed: 05/03/2023]
Abstract
Innovations in drug-target interactions (DTIs) prediction accelerate the progression of drug development. The introduction of deep learning models has a dramatic impact on DTIs prediction, with a distinct influence on saving time and money in drug discovery. This study develops an end-to-end deep collaborative learning model for DTIs prediction, called EDC-DTI, to identify new targets for existing drugs based on multiple drug-target-related information including homogeneous information and heterogeneous information by the way of deep learning. Our end-to-end model is composed of a feature builder and a classifier. Feature builder consists of two collaborative feature construction algorithms that extract the molecular properties and the topology property of networks, and the classifier consists of a feature encoder and a feature decoder which are designed for feature integration and DTIs prediction, respectively. The feature encoder, mainly based on the improved graph attention network, incorporates heterogeneous information into drug features and target features separately. The feature decoder is composed of multiple neural networks for predictions. Compared with six popular baseline models, EDC-DTI achieves highest predictive performance in the case of low computational costs. Robustness tests demonstrate that EDC-DTI is able to maintain strong predictive performance on sparse datasets. As well, we use the model to predict the most likely targets to interact with Simvastatin (DB00641), Nifedipine (DB01115) and Afatinib (DB08916) as examples. Results show that most of the predictions can be confirmed by literature with clear evidence.
Affiliation(s)
- Yongna Yuan
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China.
- Yuhao Zhang
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Xiangbo Meng
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Zhenyu Liu
  - School of Cyberspace Security, Gansu University of Political Science and Law, Anning West Road, Lanzhou, 730070, Gansu, China
- Bohan Wang
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Ruidong Miao
  - School of Life Science, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Ruisheng Zhang
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Wei Su
  - School of Information Science & Engineering, Lanzhou University, South Tianshui Road, Lanzhou, 730000, Gansu, China
- Lei Liu
  - Duzhe Publishing Group Co. Ltd., DuZhe Road, Lanzhou, 730000, Gansu, China
10
Abbasi Mesrabadi H, Faez K, Pirgazi J. Drug-target interaction prediction based on protein features, using wrapper feature selection. Sci Rep 2023; 13:3594. [PMID: 36869062 PMCID: PMC9984486 DOI: 10.1038/s41598-023-30026-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/19/2022] [Accepted: 02/14/2023] [Indexed: 03/05/2023] [Open Access]
Abstract
Drug-target interaction prediction is a vital stage in drug development and is addressed by many methods. Experimental methods that identify these relationships on the basis of clinical remedies are time-consuming, costly, laborious, and complex, which introduces many challenges. Computational methods form a newer group of approaches, and more accurate computational methods can be preferable to experimental methods in terms of total cost and time. In this paper, a new computational model for predicting drug-target interactions (DTIs) is proposed, consisting of three phases: feature extraction, feature selection, and classification. In the feature extraction phase, features such as EAAC and PSSM are extracted from protein sequences, and fingerprint features are extracted from drugs. These extracted features are then combined. In the next step, because of the large amount of extracted data, a wrapper feature selection method named IWSSR is applied. The selected features are then given to a rotation forest classifier for more efficient prediction. The innovation of our work is that we extract different features and then select among them using IWSSR. The accuracy of the rotation forest classifier under tenfold cross-validation on the gold standard datasets (enzymes, ion channels, G-protein-coupled receptors, nuclear receptors) is 98.12, 98.07, 96.82, and 95.64, respectively. The experimental results indicate that the proposed model achieves an acceptable rate in DTI prediction and is comparable with the methods proposed in other papers.
Affiliation(s)
- Hengame Abbasi Mesrabadi
  - Faculty of Computer and Information Technology Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
- Karim Faez
  - Department of Electrical Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran.
- Jamshid Pirgazi
  - Department of Computer Engineering, University of Science and Technology of Mazandaran, Behshahr, Iran
11
Jamali AA, Kusalik A, Wu FX. NMTF-DTI: A Nonnegative Matrix Tri-factorization Approach With Multiple Kernel Fusion for Drug-Target Interaction Prediction. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:586-594. [PMID: 34914594 DOI: 10.1109/tcbb.2021.3135978] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 05/04/2023]
Abstract
Prediction of drug-target interactions (DTIs) plays a significant role in drug development and drug discovery. Although this task requires a large investment in terms of time and cost, especially when it is performed experimentally, the results are not necessarily significant. Computational DTI prediction is a shortcut to reduce the risks of experimental methods. In this study, we propose an effective approach of nonnegative matrix tri-factorization, referred to as NMTF-DTI, to predict the interaction scores between drugs and targets. NMTF-DTI utilizes multiple kernels (similarity measures) for drugs and targets and Laplacian regularization to boost the prediction performance. The performance of NMTF-DTI is evaluated via cross-validation and is compared with existing DTI prediction methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR). We evaluate our method on four gold standard datasets, comparing to other state-of-the-art methods. Cross-validation and a separate, manually created dataset are used to set parameters. The results show that NMTF-DTI outperforms other competing methods. Moreover, the results of a case study also confirm the superiority of NMTF-DTI.
12
Tangmanussukum P, Kawichai T, Suratanee A, Plaimas K. Heterogeneous network propagation with forward similarity integration to enhance drug-target association prediction. PeerJ Comput Sci 2022; 8:e1124. [PMID: 36262151 PMCID: PMC9575853 DOI: 10.7717/peerj-cs.1124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Accepted: 09/14/2022] [Indexed: 06/16/2023]
Abstract
Identification of drug-target interaction (DTI) is a crucial step to reduce time and cost in the drug discovery and development process. Since various biological data are publicly available, DTIs have been identified computationally. To predict DTIs, most existing methods focus on a single similarity measure of drugs and target proteins, whereas some recent methods integrate a particular set of drug and target similarity measures by a single integration function. Therefore, many DTIs are still missing. In this study, we propose heterogeneous network propagation with the forward similarity integration (FSI) algorithm, which systematically selects the optimal integration of multiple similarity measures of drugs and target proteins. Seven drug-drug and nine target-target similarity measures are applied with four distinct integration methods to finally create an optimal heterogeneous network model. Consequently, the optimal model uses the target similarity based on protein sequences and the fused drug similarity, which combines the similarity measures based on chemical structures, the Jaccard scores of drug-disease associations, and the cosine scores of drug-drug interactions. With an accuracy of 99.8%, this model significantly outperforms others that utilize different similarity measures of drugs and target proteins. In addition, the validation of the DTI predictions of this model demonstrates the ability of our method to discover missing potential DTIs.
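The two similarity measures named in this abstract, the Jaccard score and the cosine score, are simple to state in code. The sketch below uses the standard definitions; the function names, the disease labels, and the interaction vectors are hypothetical, and the paper's actual feature construction is not reproduced here.

```python
import math


def jaccard(a, b) -> float:
    """Jaccard score of two collections: |A ∩ B| / |A ∪ B|."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def cosine(u, v) -> float:
    """Cosine score of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical drug-disease association sets for two drugs.
drug_a_diseases = {"disease_1", "disease_2", "disease_3"}
drug_b_diseases = {"disease_2", "disease_3", "disease_4"}
print(jaccard(drug_a_diseases, drug_b_diseases))

# Hypothetical binary drug-drug interaction profiles for the same two drugs.
print(round(cosine([1, 0, 1], [1, 1, 0]), 3))
```

Jaccard suits set-valued annotations such as disease associations, while cosine suits vector-valued profiles such as rows of an interaction matrix.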
Affiliation(s)
- Piyanut Tangmanussukum
  - Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
- Thitipong Kawichai
  - Department of Mathematics and Computer Science, Academic Division, Chulachomklao Royal Military Academy, Nakhon Nayok, Thailand
- Apichat Suratanee
  - Department of Mathematics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
  - Intelligent and Nonlinear Dynamics Innovations Research Center, Science and Technology Research Institute, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
- Kitiporn Plaimas
  - Advanced Virtual and Intelligent Computing (AVIC) Center, Department of Mathematics and Computer Science, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
  - Omics Science and Bioinformatics Center, Faculty of Science, Chulalongkorn University, Bangkok, Thailand
13
Fassio AV, Shub L, Ponzoni L, McKinley J, O’Meara MJ, Ferreira RS, Keiser MJ, de Melo Minardi RC. Prioritizing Virtual Screening with Interpretable Interaction Fingerprints. J Chem Inf Model 2022; 62:4300-4318. [DOI: 10.1021/acs.jcim.2c00695] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Alexandre V. Fassio
  - São Carlos Institute of Physics, University of São Paulo, São Carlos, São Paulo 13563-120, Brazil
  - Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais 31270-901, Brazil
- Laura Shub
  - Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California 94143, United States
- Luca Ponzoni
  - Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California 94143, United States
- Jessica McKinley
  - Gilead Sciences, Inc., Foster City, California 94404, United States
- Matthew J. O’Meara
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan 48109, United States
- Rafaela S. Ferreira
  - Department of Biochemistry and Immunology, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais 31270-901, Brazil
- Michael J. Keiser
  - Department of Pharmaceutical Chemistry, Department of Bioengineering & Therapeutic Sciences, Institute for Neurodegenerative Diseases, Kavli Institute for Fundamental Neuroscience, Bakar Computational Health Sciences Institute, University of California, San Francisco, San Francisco, California 94143, United States
- Raquel C. de Melo Minardi
  - Department of Computer Science, Federal University of Minas Gerais, Belo Horizonte, Minas Gerais 31270-901, Brazil
14
|
Pu Y, Li J, Tang J, Guo F. DeepFusionDTA: Drug-Target Binding Affinity Prediction With Information Fusion and Hybrid Deep-Learning Ensemble Model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2760-2769. [PMID: 34379594 DOI: 10.1109/tcbb.2021.3103966] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 06/13/2023]
Abstract
Identification of drug-target interactions (DTIs) is the most important issue in the broad field of drug discovery. Verifying drug-target binding profiles purely by biological experiments takes considerable time and effort, so computational methods are valuable for reducing the drug search space. Most computational methods for DTI prediction treat the task as binary classification and ignore binding strength, so predicting drug-target binding affinity remains challenging. Many current studies extract only sequence information, which lacks a feature-rich representation; we additionally consider spatial features in order to merge diverse data from the drug and target spaces. In this study, we propose DeepFusionDTA, a two-stage deep neural network ensemble model for predicting drug-target binding affinity. The first stage uses sequence and structure information to generate a fused feature map of each candidate protein-drug pair through several deep-learning-based analysis modules. The second stage applies a bagging-based ensemble learning strategy for regression prediction, combining the strengths of several algorithms in feature abstraction and regression. Compared with the existing prediction tool DeepDTA, DeepFusionDTA delivers a 1.5 percent CI increase on the KIBA dataset and a 1.0 percent increase on the Davis dataset. Furthermore, the ideas presented can be applied to in silico screening of the interaction space to suggest novel DTIs that can be pursued experimentally. The code and data are available from https://github.com/guofei-tju/DeepFusionDTA.
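The CI figures quoted above presumably refer to the concordance index commonly reported for binding-affinity regression. As a minimal sketch of that metric (the function name and the tie-handling convention are assumptions made here for illustration, not taken from the DeepFusionDTA code), it can be computed as the fraction of comparable pairs whose predicted ordering matches the true ordering:

```python
from itertools import combinations


def concordance_index(y_true, y_pred):
    """Fraction of comparable pairs ranked in the correct order.

    Pairs with equal true affinity are skipped; pairs with equal
    predictions count as half-correct (an assumed, common convention).
    """
    concordant = 0.0
    comparable = 0
    for (t_i, p_i), (t_j, p_j) in combinations(zip(y_true, y_pred), 2):
        if t_i == t_j:
            continue  # equal true affinities carry no ordering information
        comparable += 1
        if p_i == p_j:
            concordant += 0.5
        elif (p_i > p_j) == (t_i > t_j):
            concordant += 1.0
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A perfectly ordered prediction gives 1.0 and a fully reversed one gives 0.0, so a gain of one or two percentage points corresponds to a small share of pairs being re-ranked correctly.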
|
15
|
Detecting Drug–Target Interactions with Feature Similarity Fusion and Molecular Graphs. Biology 2022; 11:biology11070967. [PMID: 36101348 PMCID: PMC9312204 DOI: 10.3390/biology11070967] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/10/2022] [Revised: 06/12/2022] [Accepted: 06/24/2022] [Indexed: 12/03/2022]
Abstract
Simple Summary: Accurate identification of potential targets for drugs to interact with can accelerate drug development. The identification of drug–target interactions can provide insights into hidden drug efficacy. This paper presents a prediction model based on feature similarity fusion that can identify crucial features of drugs and targets to help predict drug–target interactions.
Abstract: The key to drug discovery is the identification of a target and a corresponding drug compound. Effective identification of drug–target interactions facilitates drug discovery. In this paper, drug similarity and target similarity are considered, and graphical representations are used to extract internal structural information and intermolecular interaction information about drugs and targets. First, drug similarity and target similarity are fused using the similarity network fusion (SNF) method. Then, the graph isomorphism network (GIN) is used to extract features carrying information about the internal structure of drug molecules. For target proteins, feature extraction is carried out using TextCNN to efficiently capture the features of target protein sequences. Three different data splits (CVD, CVP, CVT) of the standard dataset are used, and experiments are carried out separately on each to validate the performance of the model for drug–target interaction prediction. The experimental results show that our method achieves better results on AUC and AUPR. The docking results also show the superiority of the proposed model in predicting drug–target interactions.
|
16
|
Yu H, Zhao S, Shi J. STNN-DDI: a Substructure-aware Tensor Neural Network to predict Drug-Drug Interactions. Brief Bioinform 2022; 23:6603447. [PMID: 35667078 DOI: 10.1093/bib/bbac209] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 02/22/2022] [Revised: 04/25/2022] [Accepted: 05/05/2022] [Indexed: 11/14/2022]
Abstract
Computational prediction of multiple-type drug-drug interactions (DDIs) helps reduce unexpected side effects in poly-drug treatments. Although existing computational approaches achieve inspiring results, they do not examine which local structures of drugs cause DDIs, and their interpretability is still weak. In this paper, by supposing that the interactions between two given drugs are caused by their local chemical structures (substructures) and that their DDI types are determined by the linkages between different substructure sets, we design a novel Substructure-aware Tensor Neural Network model for DDI prediction (STNN-DDI). The proposed model learns a 3-D tensor of $\langle$substructure, substructure, interaction type$\rangle$ triplets, which characterizes a substructure-substructure interaction (SSI) space. Given a list of predefined substructures with specific chemical meanings, mapping drugs into this SSI space enables STNN-DDI to perform multiple-type DDI prediction in both transductive and inductive scenarios in a unified and explainable manner. Comparison with deep learning-based state-of-the-art baselines demonstrates the superiority of STNN-DDI, with significant improvements in AUC, AUPR, Accuracy and Precision. More importantly, case studies illustrate its interpretability by both revealing an important substructure pair across drugs for a DDI type of interest and uncovering interaction type-specific substructure pairs in a given DDI. In summary, STNN-DDI provides an effective approach to predicting DDIs as well as explaining the interaction mechanisms among drugs. Source code is freely available at https://github.com/zsy-9/STNN-DDI.
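To make the idea of a substructure-substructure-interaction tensor concrete, the sketch below scores a drug pair against every interaction type by contracting two substructure indicator vectors with a 3-D tensor. It is only an illustration under stated assumptions: the function name `ddi_type_scores`, the binary fingerprint encoding, and the plain bilinear contraction are hypothetical choices made here, not the STNN-DDI architecture or its released code.

```python
import numpy as np


def ddi_type_scores(drug_a, drug_b, ssi_tensor):
    """Score each interaction type for a drug pair.

    drug_a, drug_b: 0/1 vectors marking which predefined substructures
        each drug contains (hypothetical encoding).
    ssi_tensor: array of shape (n_sub, n_sub, n_types); entry [i, j, r]
        weights substructure pair (i, j) for interaction type r.
    Returns a vector with one score per interaction type.
    """
    a = np.asarray(drug_a, dtype=float)
    b = np.asarray(drug_b, dtype=float)
    # Sum tensor entries over all substructure pairs present in the two drugs.
    return np.einsum("i,j,ijr->r", a, b, ssi_tensor)
```

Because each score is a sum over substructure pairs, the largest individual terms point to the pairs that drive a predicted interaction type, which is the kind of interpretability the abstract describes.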
Affiliation(s)
- Hui Yu
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- ShiYu Zhao
  - School of Computer Science, Northwestern Polytechnical University, Xi'an 710072, China
- JianYu Shi
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
|
17
|
DeepMHADTA: Prediction of Drug-Target Binding Affinity Using Multi-Head Self-Attention and Convolutional Neural Network. Curr Issues Mol Biol 2022; 44:2287-2299. [PMID: 35678684 PMCID: PMC9164023 DOI: 10.3390/cimb44050155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Revised: 05/08/2022] [Accepted: 05/14/2022] [Indexed: 11/17/2022]
Abstract
Drug-target interactions provide insight into drug side effects and drug repositioning. However, wet-lab biochemical experiments are time-consuming and labor-intensive, and are insufficient to meet the pressing demand for drug research and development. With the rapid advancement of deep learning, computational methods are increasingly applied to screen drug-target interactions. Many methods treat this problem as a binary classification task (binding or not) but ignore the quantitative binding affinity. In this paper, we propose a new end-to-end deep learning method called DeepMHADTA, which uses the multi-head self-attention mechanism in a deep residual network to predict drug-target binding affinity. On two benchmark datasets, our method outperformed several current state-of-the-art methods on multiple performance measures, including mean square error (MSE), concordance index (CI), $r_m^2$, and area under the PR curve (AUPR). The results demonstrate that our method achieves better performance in predicting drug–target binding affinity.
|
18
|
Nag S, Baidya ATK, Mandal A, Mathew AT, Das B, Devi B, Kumar R. Deep learning tools for advancing drug discovery and development. 3 Biotech 2022; 12:110. [PMID: 35433167 PMCID: PMC8994527 DOI: 10.1007/s13205-022-03165-8] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 11/06/2021] [Accepted: 03/18/2022] [Indexed: 12/26/2022]
Abstract
A few decades ago, drug discovery and development were limited to a group of medicinal chemists working in a lab with an enormous amount of testing, validation, and synthetic procedures, all contributing to considerable investments of time and money to get one drug into the clinic. Advances in computational techniques, combined with a boom in multi-omics data, led to the development of various bioinformatics/pharmacoinformatics/cheminformatics tools that have helped speed up the drug development process. With the advent of artificial intelligence (AI), machine learning (ML) and deep learning (DL), the conventional drug discovery process has been further rationalized. Extensive biological data, present as big data in databases across the globe, act as the raw material for ML/DL-based approaches and help in the accurate identification of patterns and models, which can be used to identify therapeutically active molecules with much less investment of time, workforce and money. In this review, we have begun by introducing the general concepts of the drug discovery pipeline, followed by an outline of the areas of the drug discovery process where ML/DL can be used. We have also introduced ML and DL along with their applications, various learning methods, and the training models used to develop ML/DL-based algorithms. Furthermore, we have summarized various publicly available DL-based tools and their application in drug discovery, including DL tools for the identification of drug targets and drug-target interactions such as DeepCPI, DeepDTA, WideDTA, PADME, DeepAffinity, and DeepPocket.
Additionally, we have discussed various DL-based models used in protein structure prediction, de novo design of new chemical scaffolds, virtual screening of chemical libraries for hit identification, absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction, metabolite prediction, clinical trial design, and oral bioavailability prediction. In the end, we have tried to shed light on some of the successful ML/DL-based models used in the drug discovery and development pipeline while also discussing the current challenges and prospects of the application of DL tools in drug discovery and development. We believe that this review will be useful for medicinal and computational chemists searching for DL tools for use in their drug discovery projects.
Affiliation(s)
- Sagorika Nag
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Anurag T. K. Baidya
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Abhimanyu Mandal
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Alen T. Mathew
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bhanuranjan Das
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Bharti Devi
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
- Rajnish Kumar
  - Department of Pharmaceutical Engineering and Technology, Indian Institute of Technology (B.H.U.), Varanasi, UP 221005 India
|
19
|
Multi-TransDTI: Transformer for Drug–Target Interaction Prediction Based on Simple Universal Dictionaries with Multi-View Strategy. Biomolecules 2022; 12:biom12050644. [PMID: 35625572 PMCID: PMC9138327 DOI: 10.3390/biom12050644] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/31/2022] [Revised: 04/19/2022] [Accepted: 04/25/2022] [Indexed: 01/03/2023]
Abstract
Prediction of drug–target interactions has always been a crucial step in drug discovery and repositioning, and has seen tremendous progress in recent years. Despite these efforts, existing representation learning and feature generation approaches for both drugs and proteins remain complicated and high-dimensional. In addition, current methods find it difficult to extract locally important residues from sequence information while remaining focused on global structure. At the same time, massive data are not always easily accessible, which makes learning from small datasets imperative. We therefore propose an end-to-end learning model that uses the SUPD and SUDD methods to encode drugs and proteins, which not only omits the complicated feature extraction process but also greatly reduces the dimension of the embedding matrix. Meanwhile, we use a multi-view strategy with a transformer to extract locally important residues of proteins for better representation learning. Finally, we evaluate our model on the BindingDB dataset against different state-of-the-art models using comprehensive indicators. On 100% of the BindingDB data, our AUC, AUPR, ACC, and F1-score reached 90.9%, 89.8%, 84.2%, and 84.3%, respectively, exceeding the average values of other models by 2.2%, 2.3%, 2.6%, and 2.6%. Moreover, our model also generally surpasses their performance on the 30% and 50% BindingDB datasets.
|
20
|
Kalakoti Y, Yadav S, Sundar D. Deep Neural Network-Assisted Drug Recommendation Systems for Identifying Potential Drug-Target Interactions. ACS Omega 2022; 7:12138-12146. [PMID: 35449922 PMCID: PMC9016825 DOI: 10.1021/acsomega.2c00424] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/21/2022] [Accepted: 03/18/2022] [Indexed: 06/14/2023]
Abstract
In silico methods to identify novel drug-target interactions (DTIs) have gained significant importance over conventional techniques owing to their labor-intensive and low-throughput nature. Here, we present a machine learning-based multiclass classification workflow that segregates interactions between active, inactive, and intermediate drug-target pairs. Drug molecules, protein sequences, and molecular descriptors were transformed into machine-interpretable embeddings to extract critical features from standard datasets. Tools such as CHEMBL web resource, iFeature, and an in-house developed deep neural network-assisted drug recommendation (dNNDR)-featx were employed for data retrieval and processing. The models were trained with large-scale DTI datasets, which reported an improvement in performance over baseline methods. External validation results showed that models based on att-biLSTM and gCNN could help predict novel DTIs. When tested with a completely different dataset, the proposed models significantly outperformed competing methods. The validity of novel interactions predicted by dNNDR was backed by experimental and computational evidence in the literature. The proposed methodology could elucidate critical features that govern the relationship between a drug and its target.
Affiliation(s)
- Yogesh Kalakoti
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
- Shashank Yadav
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
- Durai Sundar
  - DAILAB, Department of Biochemical Engineering & Biotechnology, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
  - School of Artificial Intelligence, Indian Institute of Technology (IIT) Delhi, New Delhi 110 016, India
|
21
|
Staszak M, Staszak K, Wieszczycka K, Bajek A, Roszkowski K, Tylkowski B. Machine learning in drug design: Use of artificial intelligence to explore the chemical structure–biological activity relationship. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1568] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022]
Affiliation(s)
- Maciej Staszak
  - Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Katarzyna Staszak
  - Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Karolina Wieszczycka
  - Institute of Technology and Chemical Engineering, Poznan University of Technology, Poznan, Poland
- Anna Bajek
  - Department of Tissue Engineering, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
- Krzysztof Roszkowski
  - Department of Oncology, Collegium Medicum, Nicolaus Copernicus University, Bydgoszcz, Poland
- Bartosz Tylkowski
  - Department of Chemical Engineering, University Rovira i Virgili, Tarragona, Spain
  - Eurecat, Centre Tecnològic de Catalunya, Chemical Technologies Unit, Tarragona, Spain
|
22
|
Song T, Zhang X, Ding M, Rodriguez-Paton A, Wang S, Wang G. DeepFusion: A Deep Learning Based Multi-Scale Feature Fusion Method for Predicting Drug-Target Interactions. Methods 2022; 204:269-277. [DOI: 10.1016/j.ymeth.2022.02.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/01/2021] [Revised: 01/28/2022] [Accepted: 02/20/2022] [Indexed: 12/15/2022]
|
23
|
Song T, Wang G, Ding M, Rodriguez-Paton A, Wang X, Wang S. Network-Based Approaches for Drug Repositioning. Mol Inform 2021; 41:e2100200. [PMID: 34970871 DOI: 10.1002/minf.202100200] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/09/2021] [Accepted: 12/05/2021] [Indexed: 12/25/2022]
Abstract
With deep learning creeping up into the ranks of big data, new models based on deep learning and massive data have made great leaps forward rapidly in the field of drug repositioning. However, there is no relevant review to summarize the transformations and development process of models and their data in the field of drug repositioning. Among all the computational methods, network-based methods play an extraordinary role. In view of these circumstances, understanding and comparing existing network-based computational methods applied in drug repositioning will help us recognize the cutting-edge technologies and offer valuable information for relevant researchers. Therefore, in this review, we present an interpretation of the series of important network-based methods applied in drug repositioning, together with their comparisons and development process.
Affiliation(s)
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Gan Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
- Mao Ding
  - Department of Neurology Medicine, The Second Hospital, Cheeloo College of Medicine, Shandong University, Ji Nan Shi, Jinan, 250033, China
- Alfonso Rodriguez-Paton
  - Department of Artificial Intelligence, Faculty of Computer Science, Polytechnical University of Madrid, Campus de Montegancedo, Boadilla del Monte, 28660, Madrid, Spain
- Xun Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
  - China High Performance Computer Research Center, Institute of Computer Technology, Chinese Academy of Science, Beijing, 100190, Beijing, China
- Shudong Wang
  - College of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, China
|
24
|
Vaz JM, Balaji S. Convolutional neural networks (CNNs): concepts and applications in pharmacogenomics. Mol Divers 2021; 25:1569-1584. [PMID: 34031788 PMCID: PMC8342355 DOI: 10.1007/s11030-021-10225-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/06/2021] [Accepted: 04/21/2021] [Indexed: 12/17/2022]
Abstract
Convolutional neural networks (CNNs) have been used to extract information from various datasets of different dimensions. This approach has led to accurate interpretations in several subfields of biological research, like pharmacogenomics, addressing issues previously faced by other computational methods. With the rising attention for personalized and precision medicine, scientists and clinicians have now turned to artificial intelligence systems to provide them with solutions for therapeutics development. CNNs have already provided valuable insights into biological data transformation. Due to the rise of interest in precision and personalized medicine, in this review, we have provided a brief overview of the possibilities of implementing CNNs as an effective tool for analyzing one-dimensional biological data, such as nucleotide and protein sequences, as well as small-molecule data, e.g., simplified molecular-input line-entry system (SMILES) strings, InChI, binary fingerprints, etc., to categorize the models based on their objective and also highlight various challenges. The review is organized into specific research domains that participate in pharmacogenomics for a more comprehensive understanding. Furthermore, the future intentions of deep learning are outlined.
Affiliation(s)
- Joel Markus Vaz
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
- S Balaji
  - Department of Biotechnology, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, Karnataka, 576104, India
|
25
|
Logistic matrix factorisation and generative adversarial neural network-based method for predicting drug-target interactions. Mol Divers 2021; 25:1497-1516. [PMID: 34297278 DOI: 10.1007/s11030-021-10273-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/31/2021] [Accepted: 07/04/2021] [Indexed: 12/21/2022]
Abstract
Identifying drug-target protein association pairs is a prerequisite and a crucial task in drug discovery and development. Numerous computational models, based on different assumptions and algorithms, have been proposed as an alternative to the laborious, costly, and time-consuming traditional wet-lab methods. Most proposed methods focus on separated drug and target descriptors, calculated, respectively, from chemical structures and protein sequences, and fail to introduce and extract features where the interaction information is embedded. In this paper, we propose a new three-step method based on matrix factorisation and generative adversarial network (GAN) for drug-target interaction prediction. Firstly, the matrix factorisation technique is used to capture and extract the joint interaction feature, for both drugs and targets, from the drug-target interaction matrix. Then, a GAN is introduced for data augmentation. It generates a fake positive sample similar to the real positive sample (known interactions) in order to balance the samples, allow the exploitation of the entire negative sample, and increase the data size for an accurate prediction. Finally, a fully connected four-layer neural network is built for classification. Experimental results illustrate a higher prediction performance of the proposed method compared to shallow classifiers and to state-of-the-art methods with an accuracy higher than 97%. Moreover, the data generation effect is confirmed by evaluating the proposed method with and without the generation step. These results demonstrated the efficiency of the latent interaction features and data generation on predicting new drugs or repurposing existing drugs.
Graphical abstract: Overview of the WGANMF-DTI workflow for the drug-target interaction prediction task.
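The first step described above, extracting latent interaction features from the drug-target interaction matrix, can be illustrated with a plain logistic matrix factorisation. This is a minimal sketch under assumptions: the function name `logistic_mf`, the hyperparameters, and the full-batch gradient ascent are illustrative choices made here, and the GAN augmentation and classifier stages of the published method are not reproduced.

```python
import numpy as np


def logistic_mf(Y, rank=2, lr=0.1, reg=0.01, epochs=500, seed=0):
    """Factorise a binary drug-target matrix Y as sigmoid(U @ V.T).

    Rows of U are latent drug features and rows of V are latent target
    features, fitted by gradient ascent on the L2-regularised Bernoulli
    log-likelihood. All hyperparameter defaults are illustrative.
    """
    rng = np.random.default_rng(seed)
    n_drugs, n_targets = Y.shape
    U = 0.1 * rng.standard_normal((n_drugs, rank))
    V = 0.1 * rng.standard_normal((n_targets, rank))
    for _ in range(epochs):
        P = 1.0 / (1.0 + np.exp(-(U @ V.T)))  # predicted interaction probabilities
        G = Y - P                             # log-likelihood gradient w.r.t. the logits
        U_next = U + lr * (G @ V - reg * U)
        V = V + lr * (G.T @ U - reg * V)      # uses the old U: simultaneous update
        U = U_next
    return U, V
```

The rows of `U` and `V` could then serve as joint drug and target features for a downstream classifier; in practice one would tune the rank and regularisation strength and hold out interactions for validation.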
|
26
|
Binding affinity prediction for binary drug-target interactions using semi-supervised transfer learning. J Comput Aided Mol Des 2021; 35:883-900. [PMID: 34189637 DOI: 10.1007/s10822-021-00404-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/22/2021] [Accepted: 06/18/2021] [Indexed: 10/21/2022]
Abstract
In the field of drug-target interaction prediction, the majority of approaches formulate the problem as a simple binary classification task and train their models on binary drug-target interaction datasets. However, the prediction of drug-target interactions is inherently a regression problem, because these interactions are identified according to the binding affinity between drugs and targets. This paper deals with binary drug-target interactions and tries to identify them based on the binding strength between a drug and its target. To this end, we propose a semi-supervised transfer learning approach to predict the binding affinity on a continuous spectrum for binary interactions. Due to the lack of training data with continuous binding affinity in the target domain, the proposed method makes use of the information available in other domains (i.e., source domains) via transfer learning. The general framework of our algorithm is based on an objective function that considers the performance in both source and target domains as well as the unlabeled data in the target domain via a regularization term. To optimize this objective function, we use a gradient boosting machine, which constructs the final model. To assess the performance of the proposed method, we used benchmark datasets with binary interactions for four classes of human proteins. Our algorithm identifies interactions in a more realistic setting. According to the experimental results, our regression model performs better than the state-of-the-art methods in some procedures.
|
27
|
Predicting Drug-Target Interactions Based on the Ensemble Models of Multiple Feature Pairs. Int J Mol Sci 2021; 22:ijms22126598. [PMID: 34202954 PMCID: PMC8234024 DOI: 10.3390/ijms22126598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2021] [Revised: 06/09/2021] [Accepted: 06/16/2021] [Indexed: 11/30/2022]
Abstract
Background: The prediction of drug–target interactions (DTIs) is of great significance in drug development. Traditional experimental methods are time-consuming and expensive. Machine learning can reduce the cost of prediction but is limited by imbalanced datasets and the problem of selecting essential features. Methods: A prediction method based on the Ensemble model of Multiple Feature Pairs (Ensemble-MFP) is introduced. First, three negative sets are generated according to the Euclidean distance of three feature pairs. Then, the negative samples of the validation set/test set are randomly selected from the union of the three negative sets in the validation set/test set. At the same time, the weighted ensemble model is optimized and applied to the test set. Results: The area under the receiver operating characteristic curve (area under ROC, AUC) in three out of four sub-datasets of the gold standard datasets was more than 94.0% in the prediction of new drugs. The effectiveness of the proposed method is also shown by comparison with state-of-the-art methods and by demonstration of predicted drug–target pairs. Conclusion: Ensemble-MFP can weight the existing feature pairs and performs well in general prediction for new drugs.
|
28
|
Gao D, Chen Q, Zeng Y, Jiang M, Zhang Y. Applications of Machine Learning in Drug Target Discovery. Curr Drug Metab 2020; 21:790-803. [PMID: 32723266 DOI: 10.2174/1567201817999200728142023] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/19/2020] [Revised: 03/12/2020] [Accepted: 05/13/2020] [Indexed: 12/15/2022]
Abstract
Drug target discovery is a critical step in drug development. It is the basis of modern drug development because it determines the target molecules related to specific diseases in advance. Predicting drug targets by computational methods saves a great deal of financial and material resources compared to in vitro experiments. Therefore, several computational methods for drug target discovery have been designed. Recently, machine learning (ML) methods in biomedicine have developed rapidly. In this paper, we present an overview of drug target discovery methods based on machine learning. Considering that some machine learning methods integrate network analysis to predict drug targets, network-based methods are also introduced in this article. Finally, the challenges and future outlook of drug target discovery are discussed.
Affiliation(s)
- Dongrui Gao
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Qingyuan Chen
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Yuanqi Zeng
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
- Meng Jiang
  - School of Mechanical Automotive Engineering, Nanyang Institute of Technology, Nanyang 473000, China
- Yongqing Zhang
  - School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
|
29
|
Hasan Mahmud SM, Chen W, Jahan H, Dai B, Din SU, Dzisoo AM. DeepACTION: A deep learning-based method for predicting novel drug-target interactions. Anal Biochem 2020; 610:113978. [PMID: 33035462 DOI: 10.1016/j.ab.2020.113978] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/04/2020] [Revised: 09/23/2020] [Accepted: 09/25/2020] [Indexed: 12/13/2022]
Abstract
Drug-target interactions (DTIs) play a key role in drug development and discovery processes. Wet-lab identification of DTIs is time-consuming, expensive, and tedious. Fortunately, computational approaches can identify new interactions (drug-target pairs) and accelerate the process of drug repurposing. However, a vast number of interactions remain undiscovered; therefore, we propose a deep learning-based method (deepACTION) for predicting potential or unknown DTIs. Here, each drug chemical structure and protein sequence is transformed according to structural and sequence information using different descriptors to represent its features correctly. The prediction process faces challenges such as the high dimensionality and class imbalance of the data. To address these problems, we developed the MMIB technique to balance the majority and minority instances in the dataset and used a LASSO model to handle the high dimensionality of the data. In addition, we trained a convolutional neural network on the balanced and reduced features for accurate prediction of DTIs. In this study, the AUC is the primary evaluation metric for comparing the performance of the deepACTION model with that of existing methods under 5-fold cross-validation. On our experimental dataset, obtained from the DrugBank database, the deepACTION model achieved an AUC of 0.9836. The experimental results indicate that the model can predict a significant number of new DTIs and provide information to motivate scientists to develop drugs.
Affiliation(s)
- S M Hasan Mahmud
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Wenyu Chen
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Hosney Jahan
  - College of Computer Science, Sichuan University, Chengdu, 610065, China
- Bo Dai
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Salah Ud Din
  - School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, 611731, China
- Anthony Mackitz Dzisoo
  - Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, 611731, China
|