1
Li Y, Yang Y, Tong Z, Wang Y, Mi Q, Bai M, Liang G, Li B, Shu K. A comparative benchmarking and evaluation framework for heterogeneous network-based drug repositioning methods. Brief Bioinform 2024; 25:bbae172. [PMID: 38647153 PMCID: PMC11033846 DOI: 10.1093/bib/bbae172] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 02/25/2024] [Accepted: 04/02/2024] [Indexed: 04/25/2024]
Abstract
Computational drug repositioning, which involves identifying new indications for existing drugs, is an increasingly attractive research area due to its advantages in reducing both overall cost and development time. As a result, a growing number of computational drug repositioning methods have emerged. Heterogeneous network-based drug repositioning methods have been shown to outperform other approaches. However, there is a dearth of systematic evaluation studies of these methods, encompassing performance, scalability and usability, as well as a standardized process for evaluating new methods. Additionally, previous studies have only compared several methods, with conflicting results. In this context, we conducted a systematic benchmarking study of 28 heterogeneous network-based drug repositioning methods on 11 existing datasets. We developed a comprehensive framework to evaluate their performance, scalability and usability. Our study revealed that methods such as HGIMC, ITRPCA and BNNR exhibit the best overall performance, as they rely on matrix completion or factorization. HINGRL, MLMC, ITRPCA and HGIMC demonstrate the best performance, while NMFDR, GROBMC and SCPMF display superior scalability. For usability, HGIMC, DRHGCN and BNNR are the top performers. Building on these findings, we developed an online tool called HN-DREP (http://hn-drep.lyhbio.com/) to facilitate researchers in viewing all the detailed evaluation results and selecting the appropriate method. HN-DREP also provides an external drug repositioning prediction service for a specific disease or drug by integrating predictions from all methods. Furthermore, we have released a Snakemake workflow named HN-DRES (https://github.com/lyhbio/HN-DRES) to facilitate benchmarking and support the extension of new methods into the field.
Affiliation(s)
- Yinghong Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yinqi Yang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Zhuohao Tong
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Yu Wang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Qin Mi
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Mingze Bai
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
- Guizhao Liang
  - Key Laboratory of Biorheological Science and Technology, Ministry of Education, Bioengineering College, Chongqing University, Chongqing, 400044, P. R. China
- Bo Li
  - College of Life Sciences, Chongqing Normal University, Chongqing 401331, P. R. China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing 400065, P. R. China
2
Habib M, Lalagkas PN, Melamed RD. Mapping drug biology to disease genetics to discover drug impacts on the human phenome. Bioinformatics Advances 2024; 4:vbae038. [PMID: 38736684 PMCID: PMC11087821 DOI: 10.1093/bioadv/vbae038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 01/18/2024] [Accepted: 03/07/2024] [Indexed: 05/14/2024]
Abstract
Motivation Medications can have unexpected effects on disease, including not only harmful drug side effects, but also beneficial drug repurposing. These effects on disease may result from hidden influences of drugs on disease gene networks. Then, discovering how biological effects of drugs relate to disease biology can both provide insight into the mechanism of latent drug effects, and can help predict new effects. Results Here, we develop Draphnet, a model that integrates molecular data on 429 drugs and gene associations of nearly 200 common phenotypes to learn a network that explains drug effects on disease in terms of these molecular signals. We present evidence that our method can both predict drug effects, and can provide insight into the biology of unexpected drug effects on disease. Using Draphnet to map a drug's known molecular effects to downstream effects on the disease genome, we put forward disease genes impacted by drugs, and we suggest a new grouping of drugs based on shared effects on the disease genome. Our approach has multiple applications, including predicting drug uses and learning drug biology, with implications for personalized medicine. Availability and implementation Code to reproduce the analysis is available at https://github.com/RDMelamed/drug-phenome.
Affiliation(s)
- Mamoon Habib
  - Department of Computer Science, University of Massachusetts Lowell, Lowell, MA 01854, United States
- Rachel D Melamed
  - Department of Biological Science, University of Massachusetts Lowell, Lowell, MA 01854, United States
3
Wang S, Li J, Wang D, Xu D, Jin J, Wang Y. Predicting Drug-Disease Associations Through Similarity Network Fusion and Multi-View Feature Projection Representation. IEEE J Biomed Health Inform 2023; 27:5165-5176. [PMID: 37527303 DOI: 10.1109/jbhi.2023.3300717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/03/2023]
Abstract
Predicting drug-disease associations (DDAs) through computational methods has become a prevalent trend in drug development because of their high efficiency and low cost. Existing methods usually focus on constructing heterogeneous networks by collecting multiple data resources to improve prediction ability. However, potential association possibilities of numerous unconfirmed drug-related or disease-related pairs are not sufficiently considered. In this article, we propose a novel computational model to predict new DDAs. First, a heterogeneous network is constructed, including four types of nodes (drugs, targets, cell lines, diseases) and three types of edges (associations, association scores, similarities). Second, an updating and merging-based similarity network fusion method, termed UM-SF, is presented to fuse various similarity networks with diverse weights. Finally, an intermediate layer-mediated multi-view feature projection representation method, termed IM-FP, is proposed to calculate the predicted DDA scores. This method uses multiple association scores to construct multi-view drug features, then projects them into disease space through the intermediate layer, where an intermediate layer similarity constraint is designed to learn the projection matrices. Results of comparative experiments reveal the effectiveness of our innovations. Comparisons with other state-of-the-art models by the 10-fold cross-validation experiment indicate our model's advantage on AUROC and AUPR metrics. Moreover, our proposed model successfully predicted 107 novel high-ranked DDAs.
4
Wang L, Sun C, Xu X, Li J, Zhang W. A neighborhood-regularization method leveraging multiview data for predicting the frequency of drug-side effects. Bioinformatics 2023; 39:btad532. [PMID: 37647657 PMCID: PMC10491955 DOI: 10.1093/bioinformatics/btad532] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 08/24/2023] [Accepted: 08/28/2023] [Indexed: 09/01/2023]
Abstract
MOTIVATION A critical issue in drug benefit-risk assessment is to determine the frequency of side effects, which is done through randomized controlled trials. Computationally predicted frequencies of drug side effects can be used to effectively guide the randomized controlled trials. However, predicting drug side effect frequencies is challenging, and thus only a few studies address this problem. RESULTS In this work, we propose a neighborhood-regularization method (NRFSE) that leverages multiview data on drugs and side effects to predict the frequency of side effects. First, we adopt a class-weighted non-negative matrix factorization to decompose the drug-side effect frequency matrix, in which a Gaussian likelihood is used to model unknown drug-side effect pairs. Second, we design a multiview neighborhood regularization to integrate three drug attributes and two side effect attributes, respectively, so that the most similar drugs and the most similar side effects have similar latent signatures. The regularization can adaptively determine the weights of different attributes. We conduct extensive experiments on one benchmark dataset, and NRFSE improves the prediction performance compared with five state-of-the-art approaches. An independent test set of post-marketing side effects further validates the effectiveness of NRFSE. AVAILABILITY AND IMPLEMENTATION Source code and datasets are available at https://github.com/linwang1982/NRFSE or https://codeocean.com/capsule/4741497/tree/v1.
Affiliation(s)
- Lin Wang
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
- Chenhao Sun
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
- Xianyu Xu
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
- Jia Li
  - College of Artificial Intelligence, Tianjin University of Science and Technology, No. 9, 13th Street, Tianjin Economic-Technological Development Area, Tianjin 300457, China
- Wenjuan Zhang
  - College of General Education, Tianjin Foreign Studies University, No. 117, Machang Road, Hexi District, Tianjin 300204, China
5
Yang X, Yang G, Chu J. Self-Supervised Learning for Label Sparsity in Computational Drug Repositioning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3245-3256. [PMID: 37028367 DOI: 10.1109/tcbb.2023.3254163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Computational drug repositioning aims to discover new uses for marketed drugs, which can accelerate the drug development process and play an important role in the existing drug discovery system. However, the number of validated drug-disease associations is scarce compared to the number of drugs and diseases in the real world. With too few labeled samples, a classification model cannot learn effective latent factors of drugs, resulting in poor generalization performance. In this work, we propose a multi-task self-supervised learning framework for computational drug repositioning. The framework tackles label sparsity by learning a better drug representation. Specifically, we take the drug-disease association prediction problem as the main task, and the auxiliary task uses data augmentation strategies and contrastive learning to mine the internal relationships of the original drug features, so as to automatically learn a better drug representation without supervised labels. Joint training ensures that the auxiliary task improves the prediction accuracy of the main task. More precisely, the auxiliary task improves the drug representation and serves as additional regularization to improve generalization. Furthermore, we design a multi-input decoding network to improve the reconstruction ability of the autoencoder model. We evaluate our model using three real-world datasets. The experimental results demonstrate the effectiveness of the multi-task self-supervised learning framework, and its predictive ability is superior to the state-of-the-art model.
6
Ai C, Yang H, Ding Y, Tang J, Guo F. Low Rank Matrix Factorization Algorithm Based on Multi-Graph Regularization for Detecting Drug-Disease Association. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3033-3043. [PMID: 37159322 DOI: 10.1109/tcbb.2023.3274587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Detecting potential associations between drugs and diseases plays an indispensable role in drug development and has become a research hotspot in recent years. Compared with traditional methods, computational approaches are fast and inexpensive, which greatly accelerates the prediction of drug-disease associations. In this study, we propose a novel similarity-based method of low-rank matrix decomposition based on multi-graph regularization. On the basis of low-rank matrix factorization with L2 regularization, the multi-graph regularization constraint is constructed by combining a variety of similarity matrices from drugs and diseases, respectively. In the experiments, we analyze how different combinations of similarities affect the results and find that combining all the similarity information in the drug space is unnecessary: only a part of the similarity information is needed to achieve the desired performance. Our method is then compared with other existing models on three datasets (Fdataset, Cdataset and LRSSLdataset) and shows a clear advantage in terms of AUPR. In addition, a case study demonstrates our model's superior ability to predict potential disease-related drugs. Finally, we compare our model with several methods on six real-world datasets, where it also performs well.
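The idea of low-rank factorization with L2 and graph regularization can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes a single drug similarity matrix and a single disease similarity matrix (the abstract combines several), plain gradient descent, and hypothetical names and hyperparameters (`factorize`, `lam`, `mu`, `lr`). The objective minimized is ||Y - UV^T||^2 + lam(||U||^2 + ||V||^2) + mu(tr(U^T L_drug U) + tr(V^T L_disease V)), with L = D - S the graph Laplacian of a symmetric similarity matrix S.

```python
import random


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def laplacian(sim):
    """Graph Laplacian L = D - S, where D holds the row sums of S."""
    n = len(sim)
    return [[(sum(sim[i]) if i == j else 0.0) - sim[i][j] for j in range(n)]
            for i in range(n)]


def objective(y, u, v, lap_r, lap_d, lam, mu):
    pred = matmul(u, transpose(v))
    fit = sum((y[i][j] - pred[i][j]) ** 2
              for i in range(len(y)) for j in range(len(y[0])))
    l2 = sum(x * x for row in u + v for x in row)

    def smooth(lap, f):  # tr(F^T L F): small when similar nodes have similar factors
        lf = matmul(lap, f)
        return sum(a * b for f_row, lf_row in zip(f, lf) for a, b in zip(f_row, lf_row))

    return fit + lam * l2 + mu * (smooth(lap_r, u) + smooth(lap_d, v))


def factorize(y, sim_drug, sim_disease, rank=2, lam=0.1, mu=0.1,
              lr=0.01, steps=2000, seed=0):
    """Gradient descent on the regularized objective (all settings are assumptions)."""
    rng = random.Random(seed)
    n, m = len(y), len(y[0])
    u = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(n)]
    v = [[rng.uniform(0, 0.1) for _ in range(rank)] for _ in range(m)]
    lap_r, lap_d = laplacian(sim_drug), laplacian(sim_disease)
    history = []
    for _ in range(steps):
        pred = matmul(u, transpose(v))
        resid = [[y[i][j] - pred[i][j] for j in range(m)] for i in range(n)]
        ru, rv = matmul(resid, v), matmul(transpose(resid), u)
        lu, lv = matmul(lap_r, u), matmul(lap_d, v)
        # Gradients: -2*resid*V + 2*lam*U + 2*mu*L*U (and symmetrically for V).
        u = [[u[i][k] - 2 * lr * (-ru[i][k] + lam * u[i][k] + mu * lu[i][k])
              for k in range(rank)] for i in range(n)]
        v = [[v[j][k] - 2 * lr * (-rv[j][k] + lam * v[j][k] + mu * lv[j][k])
              for k in range(rank)] for j in range(m)]
        history.append(objective(y, u, v, lap_r, lap_d, lam, mu))
    return u, v, history


# Toy data: 4 drugs x 3 diseases, with two similarity blocks on each side.
Y = [[1, 0, 0], [1, 0, 0], [0, 1, 1], [0, 1, 1]]
SIM_DRUG = [[1.0, 0.9, 0.1, 0.1], [0.9, 1.0, 0.1, 0.1],
            [0.1, 0.1, 1.0, 0.9], [0.1, 0.1, 0.9, 1.0]]
SIM_DISEASE = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.8], [0.1, 0.8, 1.0]]

U, V, history = factorize(Y, SIM_DRUG, SIM_DISEASE)
scores = matmul(U, transpose(V))  # predicted drug-disease association scores
```

Entries of `scores` that are high where `Y` is 0 would be the candidate new associations; with several similarity matrices per side, one option is to sum their Laplacian terms with separate weights.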
7
Zhu X, Lu W. Multi-Label Classification With Dual Tail-Node Augmentation for Drug Repositioning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3068-3079. [PMID: 37418410 DOI: 10.1109/tcbb.2023.3292883] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/09/2023]
Abstract
Due to the lengthy and costly process of new drug discovery, increasing attention has been paid to drug repositioning, i.e., identifying new drug-disease associations. Current machine learning methods for drug repositioning mainly leverage matrix factorization or graph neural networks, and have achieved impressive performance. However, they often suffer from insufficient training labels of inter-domain associations, while ignoring the intra-domain associations. Moreover, they often neglect the importance of tail nodes that have few known associations, which limits their effectiveness in drug repositioning. In this paper, we propose a novel multi-label classification model with dual Tail-Node Augmentation for Drug Repositioning (TNA-DR). We incorporate disease-disease similarity and drug-drug similarity information into a k-nearest neighbor (kNN) augmentation module and a contrastive augmentation module, respectively, which effectively complements the weak supervision of drug-disease associations. Furthermore, before employing the two augmentation modules, we filter the nodes by their degrees, so that the two modules are only applied to tail nodes. We conduct 10-fold cross-validation experiments on four different real-world datasets, and our model achieves state-of-the-art performance on all four datasets. We also demonstrate our model's capability of identifying drug candidates for new diseases and discovering potential new links between existing drugs and diseases.
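The degree filter followed by kNN augmentation can be illustrated with a small sketch. It is not the TNA-DR implementation: the function name, the `max_degree` threshold, and the rule of filling soft labels with a similarity-weighted average of the neighbours' associations are all assumptions made for illustration.

```python
def knn_augment_tail_diseases(assoc, disease_sim, k=2, max_degree=1):
    """Add soft labels to tail diseases (few known drugs) from their k most similar diseases.

    assoc[i][j] is 1 if drug i is known to treat disease j, else 0.
    Returns the augmented matrix and the list of tail disease indices.
    """
    n_drugs, n_diseases = len(assoc), len(assoc[0])
    degree = [sum(assoc[i][j] for i in range(n_drugs)) for j in range(n_diseases)]
    tail = [j for j in range(n_diseases) if degree[j] <= max_degree]
    augmented = [[float(x) for x in row] for row in assoc]
    for j in tail:
        # k most similar other diseases, ranked by similarity to disease j.
        neighbours = sorted((c for c in range(n_diseases) if c != j),
                            key=lambda c: disease_sim[j][c], reverse=True)[:k]
        total = sum(disease_sim[j][c] for c in neighbours)
        if total == 0:
            continue
        for i in range(n_drugs):
            # Similarity-weighted vote of the neighbours' original labels.
            soft = sum(disease_sim[j][c] * assoc[i][c] for c in neighbours) / total
            augmented[i][j] = max(augmented[i][j], soft)
    return augmented, tail


# Toy data: 3 drugs x 3 diseases; disease 0 is a head node, diseases 1 and 2 are tail nodes.
ASSOC = [[1, 1, 0],
         [1, 0, 0],
         [0, 0, 0]]
DISEASE_SIM = [[1.0, 0.8, 0.6],
               [0.8, 1.0, 0.2],
               [0.6, 0.2, 1.0]]

augmented, tail = knn_augment_tail_diseases(ASSOC, DISEASE_SIM)
```

Head nodes are left untouched, so the extra supervision only reaches the diseases that have too few known associations to learn from directly.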
8
Gao Z, Ma H, Zhang X, Wang Y, Wu Z. Similarity measures-based graph co-contrastive learning for drug-disease association prediction. Bioinformatics 2023; 39:btad357. [PMID: 37261859 PMCID: PMC10275904 DOI: 10.1093/bioinformatics/btad357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2022] [Revised: 03/14/2023] [Accepted: 05/31/2023] [Indexed: 06/02/2023]
Abstract
MOTIVATION An imperative step in drug discovery is the prediction of drug-disease associations (DDAs), which tries to uncover potential therapeutic possibilities for already validated drugs. It is costly and time-consuming to predict DDAs using wet-lab experiments. Graph Neural Networks, as an emerging technique, have shown superior capacity for DDA prediction. However, existing Graph Neural Network-based DDA prediction methods suffer from sparse supervised signals. As graph contrastive learning has proven effective at mitigating sparse supervised signals, we seek to leverage it to enhance the prediction of DDAs. Unfortunately, most conventional graph contrastive learning-based models corrupt the raw data graph to augment data, which makes them unsuitable for DDA prediction. Meanwhile, these methods cannot model the interactions between nodes effectively, which reduces the accuracy of association predictions. RESULTS A model called Similarity Measures-based Graph Co-contrastive Learning (SMGCL) is proposed to identify potential drug candidates for diseases. For learning embeddings from complicated network topologies, SMGCL includes three essential processes: (i) three views are constructed based on similarities between drugs and diseases and DDA information; (ii) two graph encoders are applied to the three views, so as to model both local and global topologies simultaneously; and (iii) a graph co-contrastive learning method is introduced, which co-trains the representations of nodes to maximize the agreement between them, thus generating high-quality prediction results. Contrastive learning serves as an auxiliary task for improving DDA predictions. Evaluated by cross-validation, SMGCL achieves strong overall performance. Further evidence of SMGCL's practicality is provided by a case study of Alzheimer's disease. AVAILABILITY AND IMPLEMENTATION https://github.com/Jcmorz/SMGCL.
Affiliation(s)
- Zihao Gao
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Huifang Ma
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
  - Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, No.1 Jinji Road, Guilin, 541004, China
- Xiaohui Zhang
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Yike Wang
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
- Zheyu Wu
  - College of Computer Science and Engineering, Northwest Normal University, No.967 Anning East Road, Lanzhou, 730070, China
9
Mai TT. From Bilinear Regression to Inductive Matrix Completion: A Quasi-Bayesian Analysis. Entropy (Basel, Switzerland) 2023; 25:333. [PMID: 36832699 PMCID: PMC9955477 DOI: 10.3390/e25020333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Revised: 02/08/2023] [Accepted: 02/09/2023] [Indexed: 06/18/2023]
Abstract
In this paper, we study the problem of bilinear regression, a type of statistical modeling that deals with multiple variables and multiple responses. One of the main difficulties that arise in this problem is the presence of missing data in the response matrix, a problem known as inductive matrix completion. To address these issues, we propose a novel approach that combines elements of Bayesian statistics with a quasi-likelihood method. Our proposed method starts by addressing the problem of bilinear regression using a quasi-Bayesian approach. The quasi-likelihood method that we employ in this step allows us to handle the complex relationships between the variables in a more robust way. Next, we adapt our approach to the context of inductive matrix completion. We make use of a low-rankness assumption and leverage the powerful PAC-Bayes bound technique to provide statistical properties for our proposed estimators and for the quasi-posteriors. To compute the estimators, we propose a Langevin Monte Carlo method to obtain approximate solutions to the problem of inductive matrix completion in a computationally efficient manner. To demonstrate the effectiveness of our proposed methods, we conduct a series of numerical studies. These studies allow us to evaluate the performance of our estimators under different conditions and provide a clear illustration of the strengths and limitations of our approach.
Affiliation(s)
- The Tien Mai
  - Department of Mathematical Sciences, Norwegian University of Science and Technology, 7034 Trondheim, Norway
10
Liu BM, Gao YL, Zhang DJ, Zhou F, Wang J, Zheng CH, Liu JX. A new framework for drug-disease association prediction combing light-gated message passing neural network and gated fusion mechanism. Brief Bioinform 2022; 23:6775584. [PMID: 36305457 DOI: 10.1093/bib/bbac457] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/26/2022] [Revised: 09/07/2022] [Accepted: 09/23/2022] [Indexed: 12/14/2022]
Abstract
With the development of research on the complex aetiology of many diseases, computational drug repositioning methodology has proven to be a shortcut to costly and inefficient traditional methods. Therefore, developing more promising computational methods is indispensable for finding new candidate diseases to treat with existing drugs. In this paper, a model called GLGMPNN, which integrates a new variant of the message passing neural network and a novel gated fusion mechanism, is proposed for drug-disease association prediction. First, a light-gated message passing neural network (LGMPNN), including message passing, aggregation and updating, is proposed to separately extract multiple pieces of information from the similarity networks and the association network. Then, a gated fusion mechanism consisting of a forget gate and an output gate is applied to integrate the multiple pieces of information to some extent. The forget gate, calculated from the multiple embeddings, is built to integrate the association information into the similarity information. Furthermore, the final node representations are controlled by the output gate, which fuses the topology information of the networks and the initial similarity information. Finally, a bilinear decoder is adopted to reconstruct an adjacency matrix for drug-disease associations. Evaluated by 10-fold cross-validation, GLGMPNN achieves excellent performance compared with current models. Subsequent studies show that our model can effectively discover novel drug-disease associations.
Affiliation(s)
- Bao-Min Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Ying-Lian Gao
  - Qufu Normal University Library, Qufu Normal University, Rizhao, 276826, Shandong, China
- Dai-Jun Zhang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Feng Zhou
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Juan Wang
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Chun-Hou Zheng
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao, 276826, Shandong, China
11
Drug-Disease Association Prediction Using Heterogeneous Networks for Computational Drug Repositioning. Biomolecules 2022; 12:biom12101497. [PMID: 36291706 PMCID: PMC9599692 DOI: 10.3390/biom12101497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2022] [Revised: 10/10/2022] [Accepted: 10/13/2022] [Indexed: 11/18/2022]
Abstract
Drug repositioning, which involves the identification of new therapeutic indications for approved drugs, considerably reduces the time and cost of developing new drugs. Recent computational drug repositioning methods use heterogeneous networks to identify drug–disease associations. This review reveals existing network-based approaches for predicting drug–disease associations in three major categories: graph mining, matrix factorization or completion, and deep learning. We selected eleven methods from the three categories to compare their predictive performances. The experiment was conducted using two uniform datasets on the drug and disease sides, separately. We constructed heterogeneous networks using drug–drug similarities based on chemical structures and ATC codes, ontology-based disease–disease similarities, and drug–disease associations. An improved evaluation metric was used to reflect data imbalance as positive associations are typically sparse. The prediction results demonstrated that methods in the graph mining and matrix factorization or completion categories performed well in the overall assessment. Furthermore, prediction on the drug side had higher accuracy than on the disease side. Selecting and integrating informative drug features in drug–drug similarity measurement are crucial for improving disease-side prediction.
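Why sparse positives call for an imbalance-aware metric can be shown with a small calculation. The abstract does not specify its improved metric, so the sketch below only contrasts two standard measures, AUROC and average precision (a common estimate of the area under the precision-recall curve), on hypothetical scores; the function names and data are assumptions.

```python
def auroc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# 20 candidate drug-disease pairs, only 2 true associations, ranked 2nd and 4th.
scores = [20 - i for i in range(20)]
labels = [0] * 20
labels[1] = labels[3] = 1

roc = auroc(labels, scores)              # 33/36, about 0.92
ap = average_precision(labels, scores)   # (1/2 + 2/4) / 2 = 0.5
```

The same ranking looks strong under AUROC but only moderate under average precision, because half of the top four predictions are false positives; this gap widens as positives become rarer.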
12
Jamali AA, Tan Y, Kusalik A, Wu FX. NTD-DR: Nonnegative tensor decomposition for drug repositioning. PLoS One 2022; 17:e0270852. [PMID: 35862409 PMCID: PMC9302855 DOI: 10.1371/journal.pone.0270852] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/09/2022] [Accepted: 06/20/2022] [Indexed: 12/12/2022]
Abstract
Computational drug repositioning aims to identify potential applications of existing drugs for the treatment of diseases for which they were not designed. This approach can considerably accelerate the traditional drug discovery process by decreasing the required time and costs of drug development. Tensor decomposition enables us to integrate multiple drug- and disease-related data to boost the performance of prediction. In this study, a nonnegative tensor decomposition for drug repositioning, NTD-DR, is proposed. In order to capture the hidden information in drug-target, drug-disease, and target-disease networks, NTD-DR uses these pairwise associations to construct a three-dimensional tensor representing drug-target-disease triplet associations and integrates them with similarity information of drugs, targets, and disease to make a prediction. We compare NTD-DR with recent state-of-the-art methods in terms of the area under the receiver operating characteristic (ROC) curve (AUC) and the area under the precision and recall curve (AUPR) and find that our method outperforms competing methods. Moreover, case studies with five diseases also confirm the reliability of predictions made by NTD-DR. Our proposed method identifies more known associations among the top 50 predictions than other methods. In addition, novel associations identified by NTD-DR are validated by literature analyses.
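Building a three-dimensional tensor from pairwise associations can be sketched as follows. The abstract does not state the exact construction rule, so this example assumes one plausible choice: a drug-target-disease triplet is set to 1 only when all three pairwise associations are known. The function name and toy matrices are hypothetical.

```python
def build_triplet_tensor(drug_target, drug_disease, target_disease):
    """Return tensor[i][j][k] = 1 if drug i, target j and disease k are pairwise associated.

    Assumed rule: the triplet is present only when the drug-target, drug-disease
    and target-disease associations all exist.
    """
    n_drugs = len(drug_target)
    n_targets = len(target_disease)
    n_diseases = len(drug_disease[0])
    return [[[1 if (drug_target[i][j] and drug_disease[i][k] and target_disease[j][k]) else 0
              for k in range(n_diseases)]
             for j in range(n_targets)]
            for i in range(n_drugs)]


# Toy data: 2 drugs, 2 targets, 2 diseases.
DRUG_TARGET = [[1, 0],
               [1, 1]]
DRUG_DISEASE = [[1, 0],
                [0, 1]]
TARGET_DISEASE = [[1, 1],
                  [0, 1]]

tensor = build_triplet_tensor(DRUG_TARGET, DRUG_DISEASE, TARGET_DISEASE)
n_triplets = sum(x for plane in tensor for row in plane for x in row)
```

A nonnegative decomposition of such a tensor, combined with the similarity information, would then score the zero entries as candidate triplets.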
Affiliation(s)
- Ali Akbar Jamali
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
- Yuting Tan
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
  - School of Mathematics and Statistics, Huazhong Normal University, Wuhan, China
- Anthony Kusalik
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
  - * E-mail: (AK); (FXW)
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
  - Department of Computer Science, University of Saskatchewan, Saskatoon, SK, Canada
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
  - * E-mail: (AK); (FXW)
13
Zhang Y, Lei X, Pan Y, Wu FX. Drug Repositioning with GraphSAGE and Clustering Constraints Based on Drug and Disease Networks. Front Pharmacol 2022; 13:872785. [PMID: 35620297 PMCID: PMC9127467 DOI: 10.3389/fphar.2022.872785] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/10/2022] [Accepted: 04/11/2022] [Indexed: 11/29/2022]
Abstract
Understanding therapeutic properties is important in drug repositioning and drug discovery. However, chemical or clinical trials are an expensive and inefficient way to characterize the therapeutic properties of drugs. Recently, artificial intelligence (AI)-assisted algorithms have received extensive attention for discovering the potential therapeutic properties of drugs and speeding up drug development. In this study, we propose a new method based on GraphSAGE and clustering constraints (DRGCC) to investigate the potential therapeutic properties of drugs for drug repositioning. First, the drug structure features and disease symptom features are extracted. Second, the drug–drug interaction network and disease similarity network are constructed according to the drug–gene and disease–gene relationships. Matrix factorization is adopted to extract the clustering features of networks. Then, all the features are fed to GraphSAGE to predict new associations between existing drugs and diseases. Benchmark comparisons on two different datasets show that our method has reliable predictive performance and outperforms six competing methods. We have also conducted case studies on existing drugs and diseases and aimed to predict drugs that may be effective for the novel coronavirus disease 2019 (COVID-19). Among the predicted anti-COVID-19 drug candidates, some drugs are being clinically studied by pharmacologists, and their binding sites to COVID-19-related protein receptors have been found via molecular docking.
Affiliation(s)
- Yuchen Zhang
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an, China
- Yi Pan
  - Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Fang-Xiang Wu
  - Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
14
Yan Y, Yang M, Zhao H, Duan G, Peng X, Wang J. Drug repositioning based on multi-view learning with matrix completion. Brief Bioinform 2022; 23:6548374. [PMID: 35289352 DOI: 10.1093/bib/bbac054] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/08/2021] [Revised: 01/14/2022] [Accepted: 01/31/2022] [Indexed: 12/21/2022]
Abstract
Determining drug indications is a critical part of the drug development process. However, traditional drug discovery is expensive and time-consuming. Drug repositioning aims to find potential indications for existing drugs and is considered an important alternative to traditional drug discovery. In this article, we propose a multi-view learning with matrix completion (MLMC) method to predict the potential associations between drugs and diseases. Specifically, MLMC first learns the comprehensive similarity matrices from five drug similarity matrices and two disease similarity matrices based on multi-view learning (ML) with Laplacian graph regularization, and updates the drug-disease association matrix simultaneously. Then, we introduce matrix completion (MC) to add some positive entries to the original association matrix based on its low-rank structure, and re-execute the multi-view learning algorithm for association prediction. Finally, the prediction results of the above two operations are integrated as the final output. Evaluated by 10-fold cross-validation and de novo tests, MLMC achieves higher prediction accuracy than the current state-of-the-art methods. Moreover, case studies confirm the ability of our method in novel drug-disease association discovery. The code of MLMC is available at https://github.com/BioinformaticsCSU/MLMC. Contact: jxwang@mail.csu.edu.cn.
Affiliation(s)
- Yixin Yan
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Mengyun Yang
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang, Hunan 422000, China
- Haochen Zhao
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Guihua Duan
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Xiaoqing Peng
- Center for Medical Genetics & Hunan Key Laboratory of Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410038, China
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
|
15
|
Gao CQ, Zhou YK, Xin XH, Min H, Du PF. DDA-SKF: Predicting Drug-Disease Associations Using Similarity Kernel Fusion. Front Pharmacol 2022; 12:784171. [PMID: 35095495 PMCID: PMC8792612 DOI: 10.3389/fphar.2021.784171] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/27/2021] [Accepted: 12/20/2021] [Indexed: 12/13/2022]
Abstract
Drug repositioning provides a promising and efficient strategy to discover potential associations between drugs and diseases. Many systematic computational drug-repositioning methods have been introduced, which are based on various similarities of drugs and diseases. In this work, we proposed a new computational model, DDA-SKF (drug-disease associations prediction using similarity kernels fusion), which can predict novel drug indications by utilizing similarity kernel fusion (SKF) and Laplacian regularized least squares (LapRLS) algorithms. DDA-SKF integrated multiple similarities of drugs and diseases. The prediction performance of DDA-SKF is better than, or at least comparable to, that of all state-of-the-art methods. DDA-SKF can work without sufficient similarity information between drug indications. This allows us to predict new purposes for orphan drugs. The source code and benchmarking datasets are deposited in a GitHub repository (https://github.com/GCQ2119216031/DDA-SKF).
Affiliation(s)
- Chu-Qiao Gao
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yuan-Ke Zhou
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Xiao-Hong Xin
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Hui Min
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Pu-Feng Du
- College of Intelligence and Computing, Tianjin University, Tianjin, China
|
16
|
Xu X, Yue L, Li B, Liu Y, Wang Y, Zhang W, Wang L. DSGAT: predicting frequencies of drug side effects by graph attention networks. Brief Bioinform 2022; 23:6511198. [DOI: 10.1093/bib/bbab586] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2021] [Revised: 12/01/2021] [Accepted: 12/20/2021] [Indexed: 12/22/2022]
Abstract
A critical issue in drug risk–benefit evaluation is determining the frequencies of drug side effects. The randomized controlled trial is the conventional method for obtaining the frequencies of side effects, but it is laborious and slow. Therefore, it is necessary to guide trials with computational methods. Existing methods for predicting the frequencies of drug side effects focus on modeling the drug–side effect interaction graph. The inherent disadvantage of these approaches is that their performance is closely linked to the density of the interaction graph, which is highly sparse. More importantly, for a cold start drug that does not appear in the training data, such methods cannot learn the preference embedding of the drug because there is no link to the drug in the interaction graph. In this work, we propose a new method for predicting the frequencies of drug side effects, DSGAT, which uses the drug molecular graph instead of the commonly used interaction graph. This makes it possible to learn embeddings for cold start drugs with graph attention networks. The proposed novel loss function, i.e. a weighted $\varepsilon$-insensitive loss function, could alleviate the sparsity problem. Experimental results on one benchmark dataset demonstrate that DSGAT yields significant improvement for cold start drugs and outperforms the state-of-the-art performance in the warm start scenario. Source code and datasets are available at https://github.com/xxy45/DSGAT.
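The abstract names a weighted $\varepsilon$-insensitive loss but does not give its form. The following is a minimal sketch of the general idea only, not the DSGAT implementation: errors smaller than `eps` cost nothing, and each entry carries its own weight. The function name, the weighting scheme (for example, down-weighting unobserved zero entries) and the default `eps` are assumptions made for illustration.

```python
def weighted_eps_insensitive_loss(y_true, y_pred, weights, eps=0.5):
    """Weighted mean of max(0, |y - y_hat| - eps).

    Hypothetical helper: the weights and eps are illustrative choices,
    not values taken from the DSGAT paper.
    """
    if not (len(y_true) == len(y_pred) == len(weights)):
        raise ValueError("inputs must have equal length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Errors inside the eps-tube contribute zero; larger errors grow linearly.
    penalised = sum(
        w * max(0.0, abs(t - p) - eps)
        for t, p, w in zip(y_true, y_pred, weights)
    )
    return penalised / total_weight


# Example: the second entry is an (assumed) unobserved zero, so it gets a small weight.
loss = weighted_eps_insensitive_loss(
    y_true=[4.0, 0.0, 2.0],
    y_pred=[3.8, 1.0, 3.0],
    weights=[1.0, 0.1, 1.0],
)
```

Giving unobserved entries a small weight is one plausible way such a loss could reduce the influence of the many zeros in a sparse frequency matrix.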
|
17
|
Meng Y, Lu C, Jin M, Xu J, Zeng X, Yang J. A weighted bilinear neural collaborative filtering approach for drug repositioning. Brief Bioinform 2022; 23:6510159. [PMID: 35039838 DOI: 10.1093/bib/bbab581] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 10/13/2021] [Revised: 11/25/2021] [Accepted: 12/19/2021] [Indexed: 02/07/2023]
Abstract
Drug repositioning is an efficient and promising strategy for traditional drug discovery and development. Many research efforts are focused on utilizing deep-learning approaches based on a heterogeneous network for modeling complex drug-disease associations. Similar to traditional latent factor models, which directly factorize drug-disease associations, they assume the neighbors are independent of each other in the network and thus tend to be ineffective at capturing localized information. In this study, we propose a novel neighborhood and neighborhood interaction-based neural collaborative filtering approach (called DRWBNCF) to infer novel potential drugs for diseases. Specifically, we first construct three networks, including the known drug-disease association network and the drug-drug similarity and disease-disease similarity networks (using the nearest neighbors). To take advantage of localized information in the three networks, we then design an integration component by proposing a new weighted bilinear graph convolution operation to integrate the information of the known drug-disease association, the drug's and disease's neighborhood and neighborhood interactions into a unified representation. Lastly, we introduce a prediction component, which utilizes the multi-layer perceptron optimized by the α-balanced focal loss function and graph regularization to model the complex drug-disease associations. Benchmarking comparisons on three datasets verified the effectiveness of DRWBNCF for drug repositioning. Importantly, the unknown drug-disease associations predicted by DRWBNCF were validated against clinical trials and three authoritative databases and we listed several new DRWBNCF-predicted potential drugs for breast cancer (e.g. valrubicin and teniposide) and small cell lung cancer (e.g. valrubicin and cytarabine).
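The α-balanced focal loss mentioned above is commonly written as $-\alpha_t (1 - p_t)^{\gamma} \log p_t$, where $p_t$ is the predicted probability of the true class. The sketch below implements that standard form in plain Python for binary labels; the function name and the default values of `alpha` and `gamma` are illustrative assumptions, and the graph regularization term used alongside it in DRWBNCF is not shown.

```python
import math


def alpha_balanced_focal_loss(y_true, p_pred, alpha=0.25, gamma=2.0, tiny=1e-12):
    """Mean alpha-balanced focal loss for binary labels (0 or 1).

    alpha weights the positive class (1 - alpha the negative class);
    gamma down-weights examples that are already classified well.
    Defaults are illustrative, not taken from the DRWBNCF paper.
    """
    if len(y_true) != len(p_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    losses = []
    for y, p in zip(y_true, p_pred):
        p = min(max(p, tiny), 1.0 - tiny)  # clip to keep log() finite
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        losses.append(-alpha_t * (1.0 - p_t) ** gamma * math.log(p_t))
    return sum(losses) / len(losses)


# A confident correct prediction is penalised far less than an uncertain one.
easy = alpha_balanced_focal_loss([1], [0.9])
hard = alpha_balanced_focal_loss([1], [0.6])
```

With `gamma=0` and `alpha=0.5` the expression reduces to half the ordinary binary cross-entropy, which is a convenient sanity check.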
Affiliation(s)
- Yajie Meng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
- Changcheng Lu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
- Min Jin
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
- Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
- Xiangxiang Zeng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
|
18
|
Ou-Yang L, Lu F, Zhang ZC, Wu M. Matrix factorization for biomedical link prediction and scRNA-seq data imputation: an empirical survey. Brief Bioinform 2021; 23:6447434. [PMID: 34864871 DOI: 10.1093/bib/bbab479] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 08/09/2021] [Revised: 09/25/2021] [Accepted: 10/18/2021] [Indexed: 02/02/2023]
Abstract
Advances in high-throughput experimental technologies have promoted the accumulation of vast amounts of biomedical data. Biomedical link prediction and single-cell RNA-sequencing (scRNA-seq) data imputation are two essential tasks in biomedical data analyses, which can facilitate various downstream studies and provide insights into the mechanisms of complex diseases. Both tasks can be transformed into matrix completion problems. For a variety of matrix completion tasks, matrix factorization has shown promising performance. However, the sparseness and high dimensionality of biomedical networks and scRNA-seq data have raised new challenges. To resolve these issues, various matrix factorization methods have emerged recently. In this paper, we present a comprehensive review of such matrix factorization methods and their usage in biomedical link prediction and scRNA-seq data imputation. Moreover, we select representative matrix factorization methods and conduct a systematic empirical comparison on 15 real data sets to evaluate their performance under different scenarios. By summarizing the experimental results, we provide general guidelines for selecting matrix factorization methods for different biomedical matrix completion tasks and point out some future directions to further improve the performance for biomedical link prediction and scRNA-seq data imputation.
Affiliation(s)
- Le Ou-Yang
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China; Shenzhen Institute of Artificial Intelligence and Robotics for Society, Shenzhen, 518172, China
- Fan Lu
- Guangdong Key Laboratory of Intelligent Information Processing, Shenzhen Key Laboratory of Media Security, and Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), College of Electronics and Information Engineering, Shenzhen University, Shenzhen, 518060, China
- Zi-Chao Zhang
- Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai, 200433, China
- Min Wu
- Institute for Infocomm Research (I2R), A*STAR, 138632, Singapore
|
19
|
Drug–disease associations prediction via Multiple Kernel-based Dual Graph Regularized Least Squares. Appl Soft Comput 2021. [DOI: 10.1016/j.asoc.2021.107811] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/26/2023]
|
20
|
Cai L, Lu C, Xu J, Meng Y, Wang P, Fu X, Zeng X, Su Y. Drug repositioning based on the heterogeneous information fusion graph convolutional network. Brief Bioinform 2021; 22:6347207. [PMID: 34378011 DOI: 10.1093/bib/bbab319] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 05/05/2021] [Revised: 06/30/2021] [Accepted: 07/21/2021] [Indexed: 11/13/2022]
Abstract
In silico reuse of old drugs (also known as drug repositioning) to treat common and rare diseases is increasingly becoming an attractive proposition because it involves the use of de-risked drugs, with potentially lower overall development costs and shorter development timelines. Therefore, there is a pressing need for computational drug repurposing methodologies to facilitate drug discovery. In this study, we propose a new method, called DRHGCN (Drug Repositioning based on the Heterogeneous information fusion Graph Convolutional Network), to discover potential drugs for a certain disease. To make full use of different topology information in different domains (i.e. drug-drug similarity, disease-disease similarity and drug-disease association networks), we first design inter- and intra-domain feature extraction modules by applying graph convolution operations to the networks to learn the embeddings of drugs and diseases, instead of simply integrating the three networks into a heterogeneous network. Afterwards, we fuse the inter- and intra-domain embeddings in parallel to obtain more representative embeddings of drugs and diseases. Lastly, we introduce a layer attention mechanism to combine embeddings from multiple graph convolution layers to further improve the prediction performance. We find that DRHGCN achieves high performance (the average AUROC is 0.934 and the average AUPR is 0.539) in four benchmark datasets, outperforming the current approaches. Importantly, we conducted molecular docking experiments on DRHGCN-predicted candidate drugs, providing several novel approved drugs for Alzheimer's disease (e.g. benzatropine) and Parkinson's disease (e.g. trihexyphenidyl and haloperidol).
Affiliation(s)
- Lijun Cai
- Hunan University, Changsha, Hunan, 410082, China
- Junlin Xu
- Hunan University, Changsha, Hunan, 410082, China
- Yajie Meng
- Hunan University, Changsha, Hunan, 410082, China
- Peng Wang
- Hunan University, Changsha, Hunan, 410082, China
- Yansen Su
- Anhui University, Changsha, Hunan, 410082, China
|
21
|
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 256] [Impact Index Per Article: 85.3] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug designing and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost impose a hurdle and challenges that impact drug design and discovery. Further, complex and big data from genomics, proteomics, microarray data, and clinical trials also impose an obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physiochemical activity. Evidence from the past strengthens the implementation of artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for rational drug design and discovery process, which will eventually impact mankind. The primary concern associated with drug design and development is time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development. 
Artificial intelligence is a superset that includes machine learning, whereas machine learning comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. The artificial neural network, deep neural network, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of the drug design and development process, such as from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive, monitoring drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, bioactivity identification and physiochemical properties, prediction of toxicity, and identification of mode of action.
Affiliation(s)
- Rohan Gupta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Devesh Srivastava
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Mehar Sahu
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Swati Tiwari
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Rashmi K Ambasta
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Pravir Kumar
- Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India.
|
22
|
Zhang T, Zhang SW, Li Y. Identifying Driver Genes for Individual Patients through Inductive Matrix Completion. Bioinformatics 2021; 37:4477-4484. [PMID: 34175939 DOI: 10.1093/bioinformatics/btab477] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/31/2020] [Revised: 04/30/2021] [Accepted: 06/25/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION Driver genes play a key role in the evolutionary process of cancer. Effectively identifying these driver genes is crucial to cancer diagnosis and treatment. However, due to the high heterogeneity of cancers, it remains challenging to identify the driver genes for individual patients. Although some computational methods have been proposed to tackle this problem, they seldom consider the fact that genes functionally similar to the well-established driver genes are likely to play similar roles in the cancer process, which could help driver gene identification. Thus, here we developed a novel approach, IMCDriver, to identify driver genes both for cohorts and for individual patients. RESULTS IMCDriver first considers the well-established driver genes as prior information, and uses multi-omics data (e.g., somatic mutation, gene expression and protein-protein interaction) to compute the similarity between patients/genes. Then, IMCDriver prioritizes the personalized mutated genes according to their functional similarity to the well-established driver genes via Inductive Matrix Completion. Finally, IMCDriver identifies the highly ranked genes as the personalized driver genes. The results on five cancer datasets from TCGA show that IMCDriver outperforms other existing state-of-the-art methods both in cohort and in patient-specific driver gene identification. IMCDriver also reveals some novel driver genes that potentially drive cancer development. In addition, even for driver genes rarely mutated among a population, IMCDriver can still identify them and assign them high priority. AVAILABILITY Code available at https://github.com/NWPU-903PR/IMCDriver. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Tong Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China; School of Electrical and Mechanical Engineering, Pingdingshan University, Pingdingshan, China
- Shao-Wu Zhang
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
- Yan Li
- Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, Xi'an, China
|
23
|
Guo L, Shi K, Wang L. MLPMDA: Multi-layer linear projection for predicting miRNA-disease association. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106718] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 10/22/2022]
|
24
|
Jarada TN, Rokne JG, Alhajj R. SNF-NN: computational method to predict drug-disease interactions using similarity network fusion and neural networks. BMC Bioinformatics 2021; 22:28. [PMID: 33482713 PMCID: PMC7821180 DOI: 10.1186/s12859-020-03950-3] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/12/2020] [Accepted: 12/22/2020] [Indexed: 02/08/2023]
Abstract
BACKGROUND Drug repositioning is an emerging approach in pharmaceutical research for identifying novel therapeutic potentials for approved drugs and discovering therapies for untreated diseases. Due to its time and cost efficiency, drug repositioning plays an instrumental role in optimizing the drug development process compared to the traditional de novo drug discovery process. Advances in genomics, together with the enormous growth of large-scale publicly available data and the availability of high-performance computing capabilities, have further motivated the development of computational drug repositioning approaches. More recently, the rise of machine learning techniques, together with the availability of powerful computers, has made computational drug repositioning an area of intense activity. RESULTS In this study, a novel framework SNF-NN based on deep learning is presented, where novel drug-disease interactions are predicted using drug-related similarity information, disease-related similarity information, and known drug-disease interactions. Heterogeneous similarity information related to drugs and disease is fed to the proposed framework in order to predict novel drug-disease interactions. SNF-NN uses similarity selection, similarity network fusion, and a highly tuned novel neural network model to predict new drug-disease interactions. The robustness of SNF-NN is evaluated by comparing its performance with nine baseline machine learning methods. The proposed framework outperforms all baseline methods (AUC-ROC = 0.867, and AUC-PR = 0.876) using stratified 10-fold cross-validation. To further demonstrate the reliability and robustness of SNF-NN, two datasets are used to fairly validate the proposed framework's performance against seven recent state-of-the-art methods for drug-disease interaction prediction. 
SNF-NN achieves remarkable performance in stratified 10-fold cross-validation with AUC-ROC ranging from 0.879 to 0.931 and AUC-PR from 0.856 to 0.903. Moreover, the efficiency of SNF-NN is verified by validating predicted unknown drug-disease interactions against clinical trials and published studies. CONCLUSION In conclusion, computational drug repositioning research can significantly benefit from integrating similarity measures in heterogeneous networks and deep learning models for predicting novel drug-disease interactions. The data and implementation of SNF-NN are available at http://pages.cpsc.ucalgary.ca/~tnjarada/snf-nn.php.
Affiliation(s)
- Tamer N Jarada
- Department of Computer Science, University of Calgary, Calgary, AB, Canada
- Jon G Rokne
- Department of Computer Science, University of Calgary, Calgary, AB, Canada
- Reda Alhajj
- Department of Computer Science, University of Calgary, Calgary, AB, Canada.
- Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey.
- Department of Health Informatics, University of Southern Denmark, Odense, Denmark.
|
25
|
Yu Z, Huang F, Zhao X, Xiao W, Zhang W. Predicting drug-disease associations through layer attention graph convolutional network. Brief Bioinform 2020; 22:5918381. [PMID: 33078832 DOI: 10.1093/bib/bbaa243] [Citation(s) in RCA: 130] [Impact Index Per Article: 32.5] [Received: 06/29/2020] [Revised: 08/16/2020] [Accepted: 08/31/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND Determining drug-disease associations is an integral part of the drug development process. However, the identification of drug-disease associations through wet experiments is costly and inefficient. Hence, the development of efficient and high-accuracy computational methods for predicting drug-disease associations is of great significance. RESULTS In this paper, we propose a novel computational method named layer attention graph convolutional network (LAGCN) for drug-disease association prediction. Specifically, LAGCN first integrates the known drug-disease associations, drug-drug similarities and disease-disease similarities into a heterogeneous network, and applies the graph convolution operation to the network to learn the embeddings of drugs and diseases. Second, LAGCN combines the embeddings from multiple graph convolution layers using an attention mechanism. Third, the unobserved drug-disease associations are scored based on the integrated embeddings. Evaluated by 5-fold cross-validation, LAGCN achieves an area under the precision-recall curve of 0.3168 and an area under the receiver-operating characteristic curve of 0.8750, which are better than the results of existing state-of-the-art prediction methods and baseline methods. The case study shows that LAGCN can discover novel associations that are not curated in our dataset. CONCLUSION LAGCN is a useful tool for predicting drug-disease associations. This study reveals that embeddings from different convolution layers can reflect the proximities of different orders, and combining the embeddings by the attention mechanism can improve the prediction performance.
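The second step above, combining embeddings from several graph convolution layers with an attention mechanism, can be pictured as a weighted sum of per-layer embeddings. The sketch below is an illustration under stated assumptions rather than the LAGCN code: it assumes one scalar score per layer, normalized with a softmax, and the names `softmax` and `combine_layers` are hypothetical. In a trained model the scores would be learned parameters; here they are passed in by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine_layers(layer_embeddings, layer_scores):
    """Attention-weighted sum of per-layer node embeddings.

    layer_embeddings: list over layers of [n_nodes][dim] nested lists.
    layer_scores: one (assumed learnable) scalar per layer.
    """
    if len(layer_embeddings) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    n_nodes = len(layer_embeddings[0])
    dim = len(layer_embeddings[0][0])
    combined = [[0.0] * dim for _ in range(n_nodes)]
    for weight, embedding in zip(weights, layer_embeddings):
        for i in range(n_nodes):
            for j in range(dim):
                combined[i][j] += weight * embedding[i][j]
    return combined


# Two layers, one node, two dimensions; equal scores give a plain average.
mixed = combine_layers([[[1.0, 0.0]], [[3.0, 2.0]]], [0.0, 0.0])
```

Raising one layer's score shifts the combined embedding toward that layer, which is how such a scheme can favor the proximity order that a given layer captures.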
Affiliation(s)
- Zhouxin Yu
- College of Informatics, Huazhong Agricultural University
- Feng Huang
- College of Informatics, Huazhong Agricultural University
- Xiaohan Zhao
- College of Informatics, Huazhong Agricultural University
- Wen Zhang
- College of Informatics, Huazhong Agricultural University
|