1
|
He H, Xie J, Huang D, Zhang M, Zhao X, Ying Y, Wang J. DRTerHGAT: A drug repurposing method based on the ternary heterogeneous graph attention network. J Mol Graph Model 2024; 130:108783. [PMID: 38677034 DOI: 10.1016/j.jmgm.2024.108783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Revised: 04/21/2024] [Accepted: 04/23/2024] [Indexed: 04/29/2024]
Abstract
Drug repurposing is an effective method to reduce the time and cost of drug development. Computational drug repurposing can quickly screen out the most likely associations from large biological databases to achieve effective drug repurposing. However, building a comprehensive model that integrates drugs, proteins, and diseases for drug repurposing remains challenging. This study proposes a drug repurposing method based on the ternary heterogeneous graph attention network (DRTerHGAT). DRTerHGAT designs a novel protein feature extraction process consisting of a large-scale protein language model and a multi-task autoencoder, so that protein features can be extracted accurately and efficiently from amino acid sequences. The ternary heterogeneous graph of drug-protein-disease comprehensively considers the relationships among the three types of nodes, including three homogeneous and three heterogeneous relationships. Based on the graph and the extracted protein features, the deep features of the drugs and the diseases are extracted by graph convolutional networks (GCN) and heterogeneous graph node attention networks (HGNA). In the experiments, DRTerHGAT is shown to be superior to existing advanced methods and DRTerHGAT variants. DRTerHGAT's powerful ability for drug repurposing is also demonstrated in Alzheimer's disease.
Affiliation(s)
- Hongjian He
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Jiang Xie
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China.
| | - Dingkai Huang
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Mengfei Zhang
- The School of Computer Engineering and Science, Shanghai University, Shanghai, China
| | - Xuyu Zhao
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Yiwei Ying
- School of Life Sciences, Shanghai University, Shanghai, China
| | - Jiao Wang
- School of Life Sciences, Shanghai University, Shanghai, China.
| |
|
2
|
Li D, Xiao Z, Sun H, Jiang X, Zhao W, Shen X. Prediction of Drug-Disease Associations Based on Multi-Kernel Deep Learning Method in Heterogeneous Graph Embedding. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:120-128. [PMID: 38051617 DOI: 10.1109/tcbb.2023.3339189] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2023]
Abstract
Computational drug repositioning can identify potential associations between drugs and diseases. This technology has been shown to be effective in accelerating drug development and reducing experimental costs. Although there has been plenty of research for this task, existing methods are deficient in utilizing complex relationships among biological entities, which may not be conducive to subsequent simulation of drug treatment processes. In this article, we propose a heterogeneous graph embedding method called HMLKGAT to infer novel potential drugs for diseases. More specifically, we first construct a heterogeneous information network by combining drug-disease, drug-protein and disease-protein biological networks. Then, a multi-layer graph attention model is utilized to capture the complex associations in the network to derive representations for drugs and diseases. Finally, to maintain the relationship of nodes in different feature spaces, we propose a multi-kernel learning method to transform and combine the representations. Experimental results demonstrate that HMLKGAT outperforms six state-of-the-art methods in drug-related disease prediction, and case studies of five classical drugs further demonstrate the effectiveness of HMLKGAT.
|
3
|
Huang L, Chen Q, Lan W. Predicting drug-drug interactions based on multi-view and multichannel attention deep learning. Health Inf Sci Syst 2023; 11:50. [PMID: 37941825 PMCID: PMC10628064 DOI: 10.1007/s13755-023-00250-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Accepted: 09/25/2023] [Indexed: 11/10/2023] Open
Abstract
Predicting drug-drug interactions (DDIs) has become a major concern in the drug research field because it helps explore the pharmacological function of drugs and enables the development of new therapeutic drugs. Existing prediction methods simply integrate multiple drug attributes or perform tasks on a biomedical knowledge graph (KG). Though effective, few methods can fully utilize multi-source drug data information. In this paper, a multi-view and multichannel attention deep learning (MMADL) model is proposed, which not only extracts rich drug features containing both drug attributes and drug-related entity information from multi-source databases, but also considers the consistency and complementarity of different drug feature representation learning approaches to improve the effectiveness and accuracy of DDI prediction. A single-layer perceptron encoder is applied to encode multi-source drug information to obtain multi-view drug representation vectors in the same linear space. Then, the multichannel attention mechanism is introduced to obtain the attention weight by adaptively learning the importance of drug features according to their contributions to DDI prediction. Further, the representation vectors of multi-view drug pairs with attention weights are used as inputs of the deep neural network to predict potential DDI. The accuracy and precision-recall curves of MMADL are 93.05 and 95.94, respectively. The results indicate that the proposed method outperforms other state-of-the-art methods.
Affiliation(s)
- Liyu Huang
- School of Computer Science and Engineering, South China University of Technology, Guangzhou, 510006 China
| | - Qingfeng Chen
- School of Computer, Electronics and Information, Guangxi University, Nanning, 530004 China
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, 3086 Australia
| | - Wei Lan
- School of Computer, Electronics and Information, Guangxi University, Nanning, 530004 China
| |
|
4
|
Meng Y, Wang Y, Xu J, Lu C, Tang X, Peng T, Zhang B, Tian G, Yang J. Drug repositioning based on weighted local information augmented graph neural network. Brief Bioinform 2023; 25:bbad431. [PMID: 38019732 PMCID: PMC10686358 DOI: 10.1093/bib/bbad431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2023] [Revised: 10/13/2023] [Accepted: 11/05/2023] [Indexed: 12/01/2023] Open
Abstract
Drug repositioning, the strategy of redirecting existing drugs to new therapeutic purposes, is pivotal in accelerating drug discovery. While many studies have engaged in modeling complex drug-disease associations, they often overlook the relevance between different node embeddings. Consequently, we propose a novel weighted local information augmented graph neural network model, termed DRAGNN, for drug repositioning. Specifically, DRAGNN firstly incorporates a graph attention mechanism to dynamically allocate attention coefficients to drug and disease heterogeneous nodes, enhancing the effectiveness of target node information collection. To prevent excessive embedding of information in a limited vector space, we omit self-node information aggregation, thereby emphasizing valuable heterogeneous and homogeneous information. Additionally, average pooling in neighbor information aggregation is introduced to enhance local information while maintaining simplicity. A multi-layer perceptron is then employed to generate the final association predictions. The model's effectiveness for drug repositioning is supported by a 10-times 10-fold cross-validation on three benchmark datasets. Further validation is provided through analysis of the predicted associations using multiple authoritative data sources, molecular docking experiments and drug-disease network analysis, laying a solid foundation for future drug discovery.
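The neighbor aggregation step described in this abstract (average pooling over neighbors with the node's own features left out) can be illustrated with a minimal sketch. This is not the DRAGNN implementation: the attention coefficients are omitted, so every neighbor is weighted equally, and the function name, node names and feature values are hypothetical.

```python
def aggregate_neighbors(features, neighbors):
    """Average the feature vectors of each node's neighbors.

    The node's own vector is deliberately excluded (no self-loop),
    mirroring the "omit self-node information" idea in the abstract.
    Attention weights are NOT modeled here; this is a plain mean.
    """
    aggregated = {}
    for node, vec in features.items():
        # Drop the node itself if it appears in its own neighbor list.
        nbrs = [n for n in neighbors.get(node, []) if n != node]
        if not nbrs:
            # Isolated node: fall back to a zero vector (an assumption).
            aggregated[node] = [0.0] * len(vec)
            continue
        aggregated[node] = [
            sum(features[n][i] for n in nbrs) / len(nbrs)
            for i in range(len(vec))
        ]
    return aggregated


# Hypothetical toy graph: one drug linked to two diseases.
features = {
    "drug_a": [1.0, 0.0],
    "disease_x": [0.0, 2.0],
    "disease_y": [4.0, 2.0],
}
neighbors = {
    "drug_a": ["disease_x", "disease_y"],
    "disease_x": ["drug_a"],
    "disease_y": ["drug_a"],
}
result = aggregate_neighbors(features, neighbors)
print(result["drug_a"])
```

In this toy graph the drug's new representation depends only on the two disease vectors, so its own features never enter the average.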
Affiliation(s)
- Yajie Meng
- Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
| | - Yi Wang
- Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
| | - Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Lushan Road (S), Yuelu District, Changsha, Hunan Province 410082, China
| | - Changcheng Lu
- College of Computer Science and Electronic Engineering, Hunan University, Lushan Road (S), Yuelu District, Changsha, Hunan Province 410082, China
| | - Xianfang Tang
- Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
| | - Tao Peng
- Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
| | - Bengong Zhang
- Center of Applied Mathematics & Interdisciplinary Science, School of Mathematical & Physical Sciences, Wuhan Textile University, No. 1, Yangguang Avenue, Jiangxia District, Wuhan City, Hubei Province 430200, China
| | - Geng Tian
- Geneis Beijing Co., Ltd, No. 31, New North Road, Laiguanying, Chaoyang District, Beijing 100102, China
| | - Jialiang Yang
- Geneis Beijing Co., Ltd, No. 31, New North Road, Laiguanying, Chaoyang District, Beijing 100102, China
| |
|
5
|
Chen L, Chen K, Zhou B. Inferring drug-disease associations by a deep analysis on drug and disease networks. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:14136-14157. [PMID: 37679129 DOI: 10.3934/mbe.2023632] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 09/09/2023]
Abstract
Drugs, which treat various diseases, are essential for human health. However, developing new drugs is quite laborious, time-consuming, and expensive. Although investments into drug development have greatly increased over the years, the number of drug approvals each year remains quite low. Drug repositioning is deemed an effective means to accelerate the procedures of drug development because it can discover novel effects of existing drugs. Numerous computational methods have been proposed in drug repositioning, some of which were designed as binary classifiers that can predict drug-disease associations (DDAs). Negative sample selection is a common weakness of such methods. In this study, a novel reliable negative sample selection scheme, named RNSS, is presented, which can screen out reliable pairs of drugs and diseases with low probabilities of being actual DDAs. This scheme considered information from k-neighbors of one drug in a drug network, including their associations to diseases and the drug. Then, a scoring system was set up to evaluate pairs of drugs and diseases. To test the utility of the RNSS, three classic classification algorithms (random forest, Bayes network and nearest neighbor algorithm) were employed to build classifiers using negative samples selected by the RNSS. The cross-validation results suggested that such classifiers provided a nearly perfect performance and were significantly superior to those using some traditional and previous negative sample selection schemes.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Kaiyu Chen
- College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
| | - Bo Zhou
- Shanghai University of Medicine & Health Sciences, Shanghai 201318, China
| |
|
6
|
Identification of Potential Parkinson's Disease Drugs Based on Multi-Source Data Fusion and Convolutional Neural Network. MOLECULES (BASEL, SWITZERLAND) 2022; 27:molecules27154780. [PMID: 35897954 PMCID: PMC9369596 DOI: 10.3390/molecules27154780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2022] [Revised: 07/20/2022] [Accepted: 07/22/2022] [Indexed: 11/20/2022]
Abstract
Parkinson’s disease (PD) is a serious neurodegenerative disease. Most current treatments can only alleviate symptoms, not stop the progression of the disease. Therefore, it is crucial to find medicines to completely cure PD. Finding new indications of existing drugs through drug repositioning can not only reduce risk and cost, but also improve research and development efficiency. A drug repurposing method was proposed to identify potential Parkinson’s disease-related drugs based on multi-source data integration and a convolutional neural network. Multi-source data were used to construct similarity networks, and topology information was utilized to characterize drugs and PD-associated proteins. Then, the diffusion component analysis method was employed to reduce the feature dimension. Finally, a convolutional neural network model was constructed to identify potential associations between existing drugs and LProts (PD-associated proteins). Based on 10-fold cross-validation, the developed method achieved an accuracy of 91.57%, specificity of 87.24%, sensitivity of 95.27%, Matthews correlation coefficient of 0.8304, area under the receiver operating characteristic curve of 0.9731 and area under the precision–recall curve of 0.9727. Compared with the state-of-the-art approaches, the current method demonstrates superiority in some aspects, such as sensitivity, accuracy and robustness. In addition, molecular docking of some of the predicted potential PD therapeutics further indicated that they can exert their efficacy by acting on the known targets of PD, and that they may be potential PD therapeutic drugs for further experimental research. It is anticipated that the current method may be considered as a powerful tool for drug repurposing and pathological mechanism studies.
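For readers unfamiliar with the evaluation measures quoted in this abstract, the sketch below shows how accuracy, sensitivity, specificity and the Matthews correlation coefficient follow from the four cells of a binary confusion matrix. The formulas are the standard definitions; the counts are hypothetical and are not taken from the paper, and the function name is illustrative.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    Assumes at least one actual positive and one actual negative,
    so the sensitivity and specificity denominators are non-zero.
    """
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = binary_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, the MCC uses all four cells symmetrically, which is one reason it is often reported alongside sensitivity and specificity on imbalanced association data.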
|
7
|
A Discovery Strategy for Active Compounds of Chinese Medicine Based on the Prediction Model of Compound-Disease Relationship. JOURNAL OF ONCOLOGY 2022; 2022:8704784. [PMID: 35847368 PMCID: PMC9286898 DOI: 10.1155/2022/8704784] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Accepted: 06/16/2022] [Indexed: 11/17/2022]
Abstract
An accurate characterization of diseases and compounds is the key to predicting the compound-disease relationship (CDR). However, due to the difficulty of a comprehensive description of CDR, the accuracy of traditional drug development models for large-scale CDR prediction is usually unsatisfactory. In order to solve this problem, we propose a new method that integrates the molecular descriptors of compounds and the symptom descriptors of diseases to build a CDR two-dimensional matrix to predict candidate active compounds. The Matlab software draws grayscale images of CDRs, which are used as a benchmark dataset for training convolutional neural network (CNN) models. The trained model is used to predict candidate antitumor active compounds. Among the AlexNet and GoogLeNet models, we selected the GoogLeNet model for the prediction of active compounds in Chinese medicine, and its Acc, Sen, Pre, F-measure, MCC, and AUC are 0.960, 0.956, 0.965, 0.960, 0.920, and 0.964, respectively. In the prediction results of compounds, 1624 candidate CDRs were found in 124 Chinese medicines. Among them, we obtained 31 features of candidate antitumor active compounds. This method provides new insights for the discovery of candidate active compounds in Chinese medicine.
|
8
|
Zhang Y, Lei X, Pan Y, Wu FX. Drug Repositioning with GraphSAGE and Clustering Constraints Based on Drug and Disease Networks. Front Pharmacol 2022; 13:872785. [PMID: 35620297 PMCID: PMC9127467 DOI: 10.3389/fphar.2022.872785] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/10/2022] [Accepted: 04/11/2022] [Indexed: 11/29/2022] Open
Abstract
The understanding of therapeutic properties is important in drug repositioning and drug discovery. However, chemical or clinical trials are an expensive and inefficient way to characterize the therapeutic properties of drugs. Recently, artificial intelligence (AI)-assisted algorithms have received extensive attention for discovering the potential therapeutic properties of drugs and speeding up drug development. In this study, we propose a new method based on GraphSAGE and clustering constraints (DRGCC) to investigate the potential therapeutic properties of drugs for drug repositioning. First, the drug structure features and disease symptom features are extracted. Second, the drug–drug interaction network and disease similarity network are constructed according to the drug–gene and disease–gene relationships. Matrix factorization is adopted to extract the clustering features of networks. Then, all the features are fed to the GraphSAGE to predict new associations between existing drugs and diseases. Benchmark comparisons on two different datasets show that our method has reliable predictive performance and outperforms six competing methods. We have also conducted case studies on existing drugs and diseases and aimed to predict drugs that may be effective for the novel coronavirus disease 2019 (COVID-19). Among the predicted anti-COVID-19 drug candidates, some drugs are being clinically studied by pharmacologists, and their binding sites to COVID-19-related protein receptors have been found via the molecular docking technology.
Affiliation(s)
- Yuchen Zhang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Yi Pan
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK, Canada
| |
|
9
|
Xuan P, Meng X, Gao L, Zhang T, Nakaguchi T. Heterogeneous multi-scale neighbor topologies enhanced drug-disease association prediction. Brief Bioinform 2022; 23:6565159. [PMID: 35393616 DOI: 10.1093/bib/bbac123] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/01/2022] [Revised: 02/20/2022] [Accepted: 03/15/2022] [Indexed: 12/20/2022] Open
Abstract
MOTIVATION: Identifying new uses of approved drugs is an effective way to reduce the time and cost of drug development. Recent computational approaches for predicting drug-disease associations have integrated multi-sourced data on drugs and diseases. However, neighboring topologies of various scales in multiple heterogeneous drug-disease networks have yet to be exploited and fully integrated. RESULTS: We propose a novel method for drug-disease association prediction, called MGPred, used to encode and learn multi-scale neighboring topologies of drug and disease nodes and pairwise attributes from heterogeneous networks. First, we constructed three heterogeneous networks based on multiple kinds of drug similarities. Each network comprises drug and disease nodes and edges created based on node-wise similarities and associations that reflect specific topological structures. We also propose an embedding mechanism to formulate topologies that cover different ranges of neighbors. To encode the embeddings and derive multi-scale neighboring topology representations of drug and disease nodes, we propose a module based on graph convolutional autoencoders with shared parameters for each heterogeneous network. We also propose scale-level attention to obtain an adaptive fusion of informative topological representations at different scales. Finally, a learning module based on a convolutional neural network with various receptive fields is proposed to learn multi-view attribute representations of a pair of drug and disease nodes. Comprehensive experiment results demonstrate that MGPred outperforms other state-of-the-art methods in drug-related disease prediction, and the recall rates for the top-ranked candidates and case studies on five drugs further demonstrate the ability of MGPred to retrieve potential drug-disease associations.
Affiliation(s)
- Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; School of Computer Science, Shaanxi Normal University, Xi'an 710062, China
| | - Xiangfeng Meng
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Ling Gao
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Tiangang Zhang
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Toshiya Nakaguchi
- Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
| |
|
10
|
Qin L, Wang J, Wu Z, Li W, Liu G, Tang Y. Drug Repurposing for Newly Emerged Diseases via Network-Based Inference on A Gene-Disease-Drug Network. Mol Inform 2022; 41:e2200001. [PMID: 35338586 DOI: 10.1002/minf.202200001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/05/2022] [Accepted: 03/25/2022] [Indexed: 11/06/2022]
Abstract
Identification of disease-drug associations is an effective strategy for drug repurposing, especially in searching old drugs for newly emerged diseases like COVID-19. In this study, we put forward a network-based method named NEDNBI to predict disease-drug associations based on a gene-disease-drug tripartite network, which could be applied in drug repurposing. The novelty of our method lies in the fact that no negative data are required, and new disease could be added into the disease-drug network with gene as the bridge. The comprehensive evaluation results showed that the proposed method had good performance, with AUC value 0.948 ± 0.009 for 10-fold cross validation. In a case study, 8 of the 20 predicted old drugs have been tested clinically for the treatment of COVID-19, which illustrated the usefulness of our method in drug repurposing. The source code and data of the method are available at https://github.com/Qli97/NEDNBI.
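The idea of reaching drugs for a new disease through genes as the bridge can be illustrated with a generic resource-spreading sketch over a gene-disease-drug tripartite network. This is an assumption-laden illustration of network-based inference in general, not the NEDNBI algorithm itself (see the authors' repository for that); the function name, the toy edges and the equal-split rule are all hypothetical choices.

```python
from collections import defaultdict


def spread(resource, edges):
    """Each source node splits its resource equally among its linked targets."""
    out = defaultdict(float)
    for node, amount in resource.items():
        targets = edges.get(node, [])
        for target in targets:
            out[target] += amount / len(targets)
    return dict(out)


# Hypothetical toy network: genes link to known diseases,
# known diseases link to drugs used against them.
gene_to_disease = {"g1": ["d1", "d2"], "g2": ["d2"]}
disease_to_drug = {"d1": ["drugA"], "d2": ["drugA", "drugB"]}

# A new disease is known only through its associated genes.
new_disease_genes = ["g1", "g2"]

# Step 1: place one unit of resource on each gene of the new disease.
initial = {gene: 1.0 for gene in new_disease_genes}
# Step 2: genes pass resource to the known diseases they share.
disease_scores = spread(initial, gene_to_disease)
# Step 3: those diseases pass resource on to their drugs.
drug_scores = spread(disease_scores, disease_to_drug)

# Drugs with more accumulated resource rank higher as candidates.
ranking = sorted(drug_scores, key=drug_scores.get, reverse=True)
print(ranking, drug_scores)
```

Only known positive links are used here, which is consistent with the abstract's point that no negative data are required.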
Affiliation(s)
- Li Qin
- East China University of Science and Technology School of Pharmacy, China
| | - Jiye Wang
- East China University of Science and Technology School of Pharmacy, China
| | - Zengrui Wu
- East China University of Science and Technology, China
| | | | - Guixia Liu
- East China University of Science and Technology, China
| | - Yun Tang
- East China University of Science and Technology, China
| |
|
11
|
Wang L, Tan Y, Yang X, Kuang L, Ping P. Review on predicting pairwise relationships between human microbes, drugs and diseases: from biological data to computational models. Brief Bioinform 2022; 23:6553604. [PMID: 35325024 DOI: 10.1093/bib/bbac080] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 12/14/2021] [Revised: 02/14/2022] [Accepted: 02/15/2022] [Indexed: 12/11/2022] Open
Abstract
In recent years, with the rapid development of techniques in bioinformatics and life science, a considerable quantity of biomedical data has been accumulated, based on which researchers have developed various computational approaches to discover potential associations between human microbes, drugs and diseases. This paper provides a comprehensive overview of recent advances in the prediction of potential correlations between microbes, drugs and diseases, from biological data to computational models. First, we introduce in detail the widely used datasets relevant to the identification of potential relationships between microbes, drugs and diseases. We then divide a series of representative computational models into five major categories (network, matrix factorization, matrix completion, regularization and artificial neural network) for in-depth discussion and comparison. Finally, we analyse possible challenges and opportunities in this research area and outline some suggestions for further improvement of predictive performance.
Affiliation(s)
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Yaqin Tan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Xiaoyu Yang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Linai Kuang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, Hunan, China
| | - Pengyao Ping
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, 410022, Hunan, China
| |
|
12
|
Zhao BW, Hu L, You ZH, Wang L, Su XR. HINGRL: predicting drug-disease associations with graph representation learning on heterogeneous information networks. Brief Bioinform 2021; 23:6456295. [PMID: 34891172 DOI: 10.1093/bib/bbab515] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 09/09/2021] [Revised: 11/08/2021] [Accepted: 11/09/2021] [Indexed: 12/20/2022] Open
Abstract
Identifying new indications for drugs plays an essential role at many phases of drug research and development. Computational methods are regarded as an effective way to associate drugs with new indications. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering the biological knowledge of drugs and diseases, which are believed to be useful for improving the accuracy of drug repositioning. To this end, a novel heterogeneous information network (HIN) based model, namely HINGRL, is proposed to precisely identify new indications for drugs based on graph representation learning techniques. More specifically, HINGRL first constructs a HIN by integrating drug-disease, drug-protein and protein-disease biological networks with the biological knowledge of drugs and diseases. Then, different representation strategies are applied to learn the features of nodes in the HIN from the topological and biological perspectives. Finally, HINGRL adopts a Random Forest classifier to predict unknown drug-disease associations based on the integrated features of drugs and diseases obtained in the previous step. Experimental results demonstrate that HINGRL achieves the best performance on two real datasets when compared with state-of-the-art models. Besides, our case studies indicate that the simultaneous consideration of network topology and biological knowledge of drugs and diseases allows HINGRL to precisely predict drug-disease associations from a more comprehensive perspective. The promising performance of HINGRL also reveals that the utilization of rich heterogeneous information provides an alternative view for HINGRL to identify novel drug-disease associations especially for new diseases.
Affiliation(s)
- Bo-Wei Zhao
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
| | - Lun Hu
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
| | - Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
| | - Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 530007, China
| | - Xiao-Rui Su
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China; Xinjiang Laboratory of Minority Speech and Language Information Processing, Urumqi 830011, China
| |
|
13
|
Wang F, Lei X, Liao B, Wu FX. Predicting drug-drug interactions by graph convolutional network with multi-kernel. Brief Bioinform 2021; 23:6447677. [PMID: 34864856 DOI: 10.1093/bib/bbab511] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/14/2021] [Revised: 10/28/2021] [Accepted: 11/07/2021] [Indexed: 11/14/2022] Open
Abstract
Drug repositioning is proposed to find novel usages for existing drugs. Among many types of drug repositioning approaches, predicting drug-drug interactions (DDIs) helps explore the pharmacological functions of drugs and identify potential drugs for novel treatments. A number of models have been applied to predict DDIs. The DDI network, which is constructed from the known DDIs, is a common part in many of the existing methods. However, the functions of DDIs are different, and thus integrating them in a single DDI graph may overlook some useful information. We propose a graph convolutional network with multi-kernel (GCNMK) to predict potential DDIs. GCNMK adopts two DDI graph kernels for the graph convolutional layers, namely, an increased DDI graph consisting of 'increase'-related DDIs and a decreased DDI graph consisting of 'decrease'-related DDIs. The learned drug features are fed into a block with three fully connected layers for the DDI prediction. We compare various types of drug features and find that the target feature of drugs outperforms all other types of features and their concatenated features. In comparison with three different DDI prediction methods, our proposed GCNMK achieves the best performance in terms of area under receiver operating characteristic curve and area under precision-recall curve. In case studies, we identify the top 20 potential DDIs from all unknown DDIs, and the top 10 potential DDIs from the unknown DDIs among breast, colorectal and lung neoplasms-related drugs. Most of them have evidence to support the existence of their interactions.
Affiliation(s)
- Fei Wang
- Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, 620 West Chang'an Avenue, 710119, Shaanxi, China
| | - Bo Liao
- School of Mathematics and Statistics, Hainan Normal University, 99 Longkun South Road, 571158, Hainan, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada
- Department of Mechanical Engineering and Department of Computer Science, University of Saskatchewan, 57 Campus Drive, S7N 5A9, Saskatchewan, Canada
|
14
|
Gao L, Cui H, Zhang T, Sheng N, Xuan P. Prediction of drug-disease associations by integrating common topologies of heterogeneous networks and specific topologies of subnets. Brief Bioinform 2021; 23:6446271. [PMID: 34850815 DOI: 10.1093/bib/bbab467] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/16/2021] [Revised: 09/23/2021] [Accepted: 10/13/2021] [Indexed: 12/13/2022]
Abstract
MOTIVATION: The development process of a new drug is time-consuming and costly. Thus, identifying new uses for approved drugs, known as drug repositioning, helps speed up the drug development process and reduce development costs. Existing drug-related disease prediction methods mainly focus on single or multiple drug-disease heterogeneous networks. However, the heterogeneous networks, and the drug subnets and disease subnet they contain, cover the common topology information between drug and disease nodes, the specific information between drug nodes and the specific information between disease nodes, respectively. RESULTS: We design a novel model, CTST, to extract and integrate common and specific topologies in multiple heterogeneous networks and subnets. Multiple heterogeneous networks composed of drug and disease nodes are established to integrate multiple kinds of similarities and associations among drug and disease nodes. These heterogeneous networks contain multiple drug subnets and a disease subnet. For the multiple heterogeneous networks and subnets, we then define the common and specific representations of drug and disease nodes. The common representations of drug and disease nodes are encoded by a graph convolutional autoencoder with shared parameters, and they integrate the topological relationships of all nodes in the heterogeneous networks. The specific representations of nodes are learned by separate graph convolutional autoencoders, and they fuse the topology and attributes of the nodes in each subnet. We then propose attention mechanisms at the common representation level and the specific representation level to learn more informative common and specific representations, respectively. Finally, an integration module with representation feature level attention is built to adaptively integrate these two representations for final association prediction. Extensive experimental results confirm the effectiveness of CTST. Comparisons with six recent methods and case studies on five drugs further verify that CTST can discover potential candidate diseases.
Affiliation(s)
- Ling Gao
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
| | - Hui Cui
- Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
| | - Tiangang Zhang
- School of Mathematical Science, Heilongjiang University, Harbin 150080, China
| | - Nan Sheng
- College of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Ping Xuan
- School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
|
15
|
Cai L, Lu C, Xu J, Meng Y, Wang P, Fu X, Zeng X, Su Y. Drug repositioning based on the heterogeneous information fusion graph convolutional network. Brief Bioinform 2021; 22:6347207. [PMID: 34378011 DOI: 10.1093/bib/bbab319] [Citation(s) in RCA: 55] [Impact Index Per Article: 18.3] [Received: 05/05/2021] [Revised: 06/30/2021] [Accepted: 07/21/2021] [Indexed: 11/13/2022]
Abstract
In silico reuse of old drugs (also known as drug repositioning) to treat common and rare diseases is becoming an increasingly attractive proposition because it involves the use of de-risked drugs, with potentially lower overall development costs and shorter development timelines. Therefore, there is a pressing need for computational drug repurposing methodologies to facilitate drug discovery. In this study, we propose a new method, called DRHGCN (Drug Repositioning based on the Heterogeneous information fusion Graph Convolutional Network), to discover potential drugs for a given disease. To make full use of the different topology information in different domains (i.e. drug-drug similarity, disease-disease similarity and drug-disease association networks), we first design inter- and intra-domain feature extraction modules that apply graph convolution operations to the networks to learn the embeddings of drugs and diseases, instead of simply integrating the three networks into a heterogeneous network. Afterwards, we fuse the inter- and intra-domain embeddings in parallel to obtain more representative embeddings of drugs and diseases. Lastly, we introduce a layer attention mechanism to combine embeddings from multiple graph convolution layers to further improve the prediction performance. We find that DRHGCN achieves high performance (the average AUROC is 0.934 and the average AUPR is 0.539) on four benchmark datasets, outperforming current approaches. Importantly, we conducted molecular docking experiments on DRHGCN-predicted candidate drugs, providing several novel approved drugs for Alzheimer's disease (e.g. benzatropine) and Parkinson's disease (e.g. trihexyphenidyl and haloperidol).
Affiliation(s)
- Lijun Cai
- Hunan University, Changsha, Hunan, 410082, China
| | | | - Junlin Xu
- Hunan University, Changsha, Hunan, 410082, China
| | - Yajie Meng
- Hunan University, Changsha, Hunan, 410082, China
| | - Peng Wang
- Hunan University, Changsha, Hunan, 410082, China
| | | | | | - Yansen Su
- Anhui University, Changsha, Hunan, 410082, China
|
16
|
Xuan P, Gao L, Sheng N, Zhang T, Nakaguchi T. Graph Convolutional Autoencoder and Fully-Connected Autoencoder with Attention Mechanism Based Method for Predicting Drug-Disease Associations. IEEE J Biomed Health Inform 2021; 25:1793-1804. [PMID: 33216722 DOI: 10.1109/jbhi.2020.3039502] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Indexed: 11/09/2022]
Abstract
Predicting novel uses for approved drugs helps reduce the costs of drug development and facilitates the development process. Most previous methods focused on the multi-source data related to drugs and diseases to predict the candidate associations between drugs and diseases. There are multiple kinds of similarities between drugs, and these similarities reflect how similar two drugs are from different views, yet most previous methods failed to deeply integrate these similarities. In addition, the topological structures of the multiple drug-disease heterogeneous networks constructed from the different kinds of drug similarities are not fully exploited. We therefore propose GFPred, a method based on a graph convolutional autoencoder and a fully-connected autoencoder with an attention mechanism, to predict drug-related diseases. GFPred integrates drug-disease associations, disease similarities, three kinds of drug similarities and attributes of the drug nodes. Three drug-disease heterogeneous networks are constructed based on the different kinds of drug similarities. We construct a graph convolutional autoencoder module, and integrate the attributes of the drug and disease nodes in each network to learn the topology representations of each drug node and disease node. As the different kinds of drug attributes contribute differently to the prediction of drug-disease associations, we construct an attribute-level attention mechanism. A fully-connected autoencoder module is established to learn the attribute representations of the drug and disease nodes. Finally, the original features of the drug-disease node pairs are also important auxiliary information for their association prediction. A combined strategy based on a convolutional neural network is proposed to fully integrate the topology representations, the attribute representations, and the original features of the drug-disease pairs. The ablation studies showed the contributions of the data related to the three types of drug attributes. Comparison with other methods confirmed that GFPred achieved better performance than several state-of-the-art prediction methods. In particular, case studies confirmed that GFPred is able to retrieve more actual drug-disease associations in the top k part of the prediction results, which helps biologists discover real associations through wet-lab experiments.
|
17
|
A New Framework for Discovering Protein Complex and Disease Association via Mining Multiple Databases. Interdiscip Sci 2021; 13:683-692. [PMID: 33905111 DOI: 10.1007/s12539-021-00432-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2021] [Revised: 03/31/2021] [Accepted: 04/09/2021] [Indexed: 10/21/2022]
Abstract
One important challenge in the post-genomic era is to explore disease mechanisms by efficiently integrating different types of biological data. In fact, a single disease is usually caused by multiple gene products, such as protein complexes, rather than a single gene. Therefore, it is meaningful to discover protein communities from the protein-protein interaction network and use them for inferring disease-disease associations. In this article, we propose a new framework including protein-protein networks, disease-gene associations and disease-complex pairs to cluster protein complexes and infer disease associations. Complexes discovered by our approach are superior in quality (Sn, PPV and ACC) and clustering quantity to those of four other popular methods on three PPI networks. A systematic analysis shows that disease pairs sharing more protein complexes (such as Glucose and Lipid Metabolic Disorders) are more similar, and that overlapping proteins may have different roles in different diseases. These findings can provide clinical scholars and medical practitioners with new ideas on disease identification and treatment.
|
18
|
Yu Z, Huang F, Zhao X, Xiao W, Zhang W. Predicting drug-disease associations through layer attention graph convolutional network. Brief Bioinform 2020; 22:5918381. [PMID: 33078832 DOI: 10.1093/bib/bbaa243] [Citation(s) in RCA: 130] [Impact Index Per Article: 32.5] [Received: 06/29/2020] [Revised: 08/16/2020] [Accepted: 08/31/2020] [Indexed: 12/23/2022]
Abstract
BACKGROUND: Determining drug-disease associations is an integral part of the drug development process. However, identifying drug-disease associations through wet experiments is costly and inefficient. Hence, the development of efficient and high-accuracy computational methods for predicting drug-disease associations is of great significance. RESULTS: In this paper, we propose a novel computational method named layer attention graph convolutional network (LAGCN) for drug-disease association prediction. Specifically, LAGCN first integrates the known drug-disease associations, drug-drug similarities and disease-disease similarities into a heterogeneous network, and applies the graph convolution operation to the network to learn the embeddings of drugs and diseases. Second, LAGCN combines the embeddings from multiple graph convolution layers using an attention mechanism. Third, the unobserved drug-disease associations are scored based on the integrated embeddings. Evaluated by 5-fold cross-validation, LAGCN achieves an area under the precision-recall curve of 0.3168 and an area under the receiver-operating characteristic curve of 0.8750, which are better than the results of existing state-of-the-art prediction methods and baseline methods. The case study shows that LAGCN can discover novel associations that are not curated in our dataset. CONCLUSION: LAGCN is a useful tool for predicting drug-disease associations. This study reveals that embeddings from different convolution layers can reflect proximities of different orders, and that combining the embeddings through the attention mechanism can improve prediction performance.
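The three steps in the abstract above (build a heterogeneous network, stack graph convolutions, combine layers by attention, then score pairs) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the published LAGCN code: the symmetric normalization, the scalar per-layer attention logits, the dot-product scoring, the fixed untrained weights and all names are illustrative choices.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def normalize_adjacency(adj):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2, a common GCN choice."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def gcn_layer(a_norm, h, w):
    """One graph convolution: ReLU(A_norm @ H @ W)."""
    z = matmul(matmul(a_norm, h), w)
    return [[max(0.0, v) for v in row] for row in z]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def layer_attention(layer_embeddings, layer_logits):
    """Combine per-layer embeddings with softmax-normalized scalar weights."""
    weights = softmax(layer_logits)
    rows, cols = len(layer_embeddings[0]), len(layer_embeddings[0][0])
    return [[sum(w * emb[i][j] for w, emb in zip(weights, layer_embeddings))
             for j in range(cols)] for i in range(rows)]

# Toy data: 2 drugs, 2 diseases (all values hypothetical).
drug_sim = [[0.0, 0.6], [0.6, 0.0]]
disease_sim = [[0.0, 0.4], [0.4, 0.0]]
assoc = [[1.0, 0.0], [0.0, 1.0]]  # known drug-disease associations

# Heterogeneous adjacency [[S_drug, A], [A^T, S_disease]].
adj = [drug_sim[i] + assoc[i] for i in range(2)] + [
    [assoc[r][c] for r in range(2)] + disease_sim[c] for c in range(2)]
a_norm = normalize_adjacency(adj)

h0 = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]  # one-hot node features
w1 = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.2], [0.3, 0.1]]  # untrained, fixed for the demo
w2 = [[0.6, -0.1], [0.2, 0.5]]
h1 = gcn_layer(a_norm, h0, w1)
h2 = gcn_layer(a_norm, h1, w2)

layer_logits = [0.0, 0.0]  # would be learned during training
final = layer_attention([h1, h2], layer_logits)

# Score every drug-disease pair by the dot product of their embeddings.
scores = [[sum(a * b for a, b in zip(final[d], final[2 + s])) for s in range(2)]
          for d in range(2)]
```

Because each additional convolution mixes in neighbours one hop further away, weighting the layers separately lets the model decide how much first-order versus higher-order proximity to use, which matches the conclusion stated in the abstract.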
Affiliation(s)
- Zhouxin Yu
- College of Informatics, Huazhong Agricultural University
| | - Feng Huang
- College of Informatics, Huazhong Agricultural University
| | - Xiaohan Zhao
- College of Informatics, Huazhong Agricultural University
| | | | - Wen Zhang
- College of Informatics, Huazhong Agricultural University
|