1
|
Romano JD, Truong V, Kumar R, Venkatesan M, Graham BE, Hao Y, Matsumoto N, Li X, Wang Z, Ritchie MD, Shen L, Moore JH. The Alzheimer's Knowledge Base: A Knowledge Graph for Alzheimer Disease Research. J Med Internet Res 2024; 26:e46777. [PMID: 38635981 PMCID: PMC11066745 DOI: 10.2196/46777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Revised: 06/23/2023] [Accepted: 11/07/2023] [Indexed: 04/20/2024] Open
Abstract
BACKGROUND As global populations age and become susceptible to neurodegenerative illnesses, new therapies for Alzheimer disease (AD) are urgently needed. Existing data resources for drug discovery and repurposing fail to capture relationships central to the disease's etiology and response to drugs. OBJECTIVE We designed the Alzheimer's Knowledge Base (AlzKB) to alleviate this need by providing a comprehensive knowledge representation of AD etiology and candidate therapeutics. METHODS We designed the AlzKB as a large, heterogeneous graph knowledge base assembled using 22 diverse external data sources describing biological and pharmaceutical entities at different levels of organization (eg, chemicals, genes, anatomy, and diseases). AlzKB uses a Web Ontology Language 2 ontology to enforce semantic consistency and allow for ontological inference. We provide a public version of AlzKB and allow users to run and modify local versions of the knowledge base. RESULTS AlzKB is freely available on the web and currently contains 118,902 entities with 1,309,527 relationships between those entities. To demonstrate its value, we used graph data science and machine learning to (1) propose new therapeutic targets based on similarities of AD to Parkinson disease and (2) repurpose existing drugs that may treat AD. For each use case, AlzKB recovers known therapeutic associations while proposing biologically plausible new ones. CONCLUSIONS AlzKB is a new, publicly available knowledge resource that enables researchers to discover complex translational associations for AD drug discovery. Through 2 use cases, we show that it is a valuable tool for proposing novel therapeutic hypotheses based on public biomedical knowledge.
Affiliation(s)
- Joseph D Romano
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Center of Excellence in Environmental Toxicology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Van Truong
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Rachit Kumar
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Medical Scientist Training Program, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Mythreye Venkatesan
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Britney E Graham
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Yun Hao
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Nick Matsumoto
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Xi Li
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Zhiping Wang
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| | - Marylyn D Ritchie
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Li Shen
- Institute for Biomedical Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
- Department of Biostatistics, Epidemiology and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Jason H Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, United States
| |
|
2
|
Yang Z, Lin Z, Yang Y, Li J. Dual-Path Graph Neural Network with Adaptive Auxiliary Module for Link Prediction. Big Data 2024. [PMID: 38527254 DOI: 10.1089/big.2023.0130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
Link prediction, which has important applications in many fields, estimates the likelihood of a link between two nodes in a graph. Link prediction based on graph neural networks (GNNs), which learn node representations and graph structure, has attracted growing attention recently. However, existing GNN-based link prediction approaches have some shortcomings. On the one hand, because a graph contains different types of nodes, aggregating information and learning node representations from neighbor nodes is very challenging. On the other hand, the attention mechanism has been an effective instrument for enhancing link prediction performance. However, the traditional attention mechanism is always monotonic for query nodes, which limits its influence on link prediction. To address these two problems, a Dual-Path Graph Neural Network (DPGNN) for link prediction is proposed in this study. First, we propose a novel Local Random Features Augmentation for Graph Convolution Network as a baseline of one path. Meanwhile, Graph Attention Network version 2, based on a dynamic attention mechanism, is adopted as a baseline of the other path. We then capture more meaningful node representations and more accurate link features by concatenating the information from these two paths. In addition, we propose an adaptive auxiliary module for better balancing the weight of auxiliary tasks, which brings more benefit to link prediction. Finally, extensive experiments verify the effectiveness and superiority of our proposed DPGNN for link prediction.
Affiliation(s)
- Zhenzhen Yang
- Key Laboratory of Ministry of Education in Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, China
| | - Zelong Lin
- Key Laboratory of Ministry of Education in Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, China
| | - Yongpeng Yang
- Key Laboratory of Ministry of Education in Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, China
- School of Network and Communication, Nanjing Vocational College of Information Technology, Nanjing, China
| | - Jiaqi Li
- Key Laboratory of Ministry of Education in Broadband Wireless Communication and Sensor Network Technology, Nanjing University of Posts and Telecommunications, Nanjing, China
| |
|
3
|
Hu H, Zhao H, Zhong T, Dong X, Wang L, Han P, Li Z. Adaptive deep propagation graph neural network for predicting miRNA-disease associations. Brief Funct Genomics 2023; 22:453-462. [PMID: 37078739 DOI: 10.1093/bfgp/elad010] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/25/2022] [Revised: 02/13/2023] [Accepted: 03/09/2023] [Indexed: 04/21/2023] Open
Abstract
BACKGROUND A large number of experiments show that the abnormal expression of miRNA is closely related to the occurrence, diagnosis and treatment of diseases. Identifying associations between miRNAs and diseases is important for clinical applications in complex human diseases. However, traditional biological experimental methods and calculation-based methods have many limitations, which has led to the development of more efficient and accurate deep learning methods for predicting miRNA-disease associations. RESULTS In this paper, we propose a novel model based on an adaptive deep propagation graph neural network to predict miRNA-disease associations (ADPMDA). We first construct the miRNA-disease heterogeneous graph based on known miRNA-disease pairs, miRNA integrated similarity information, miRNA sequence information and disease similarity information. Then, we project the features of miRNAs and diseases into a low-dimensional space. After that, an attention mechanism is used to aggregate the local features of central nodes. In particular, an adaptive deep propagation graph neural network is employed to learn the node embeddings, which can adaptively balance the local and global information of nodes. Finally, a multi-layer perceptron is used to score miRNA-disease pairs. CONCLUSION Experiments on the Human microRNA Disease Database v3.0 dataset show that ADPMDA achieves a mean AUC of 94.75% under 5-fold cross-validation. We further conduct case studies on esophageal neoplasms, lung neoplasms and lymphoma to confirm the effectiveness of our proposed model, and 49, 49 and 47 of the top 50 predicted miRNAs associated with these diseases are confirmed, respectively. These results demonstrate the effectiveness and superiority of our model in predicting miRNA-disease associations.
Affiliation(s)
- Hua Hu
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
| | - Huan Zhao
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
| | - Tangbo Zhong
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
| | - Xishang Dong
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
| | - Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
| | - Pengyong Han
- Central Lab, Changzhi Medical College, Changzhi 046012, China
| | - Zhengwei Li
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
- KUNPAND Communications (Kunshan) Co., Ltd., Suzhou 215300, China
| |
|
4
|
Ma Y, Zhang H, Jin C, Kang C. Predicting lncRNA-protein interactions with bipartite graph embedding and deep graph neural networks. Front Genet 2023; 14:1136672. [PMID: 36845380 PMCID: PMC9948011 DOI: 10.3389/fgene.2023.1136672] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/03/2023] [Accepted: 01/30/2023] [Indexed: 02/11/2023] Open
Abstract
Background: Long non-coding RNAs (lncRNAs) play crucial roles in numerous biological processes. Investigating lncRNA-protein interactions helps to uncover the undetected molecular functions of lncRNAs. In recent years, computational approaches have increasingly replaced the traditional time-consuming experiments used to uncover possible unknown associations. However, the heterogeneity involved in predicting associations between lncRNAs and proteins has not been adequately explored. It remains challenging to integrate the heterogeneity of lncRNA-protein interactions with graph neural network algorithms. Methods: In this paper, we constructed a deep architecture based on GNN called BiHo-GNN, which is the first to integrate the properties of homogeneous and heterogeneous networks through bipartite graph embedding. Unlike previous research, BiHo-GNN can capture the mechanism of molecular association through the data encoder of heterogeneous networks. Meanwhile, we design a process of mutual optimization between homogeneous and heterogeneous networks, which improves the robustness of BiHo-GNN. Results: We collected four datasets for predicting lncRNA-protein interactions and compared the performance of current prediction models on the benchmark datasets. In this comparison, BiHo-GNN outperforms existing bipartite graph-based methods. Conclusion: Our BiHo-GNN integrates the bipartite graph with homogeneous graph networks. Based on this model structure, lncRNA-protein interactions and potential associations can be predicted and discovered accurately.
Affiliation(s)
- Yuzhou Ma
- College of Artificial Intelligence, Nankai University, Tianjin, China
| | - Han Zhang
- College of Artificial Intelligence, Nankai University, Tianjin, China. *Correspondence: Han Zhang
| | - Chen Jin
- College of Computer Science, Nankai University, Tianjin, China
| | - Chuanze Kang
- College of Artificial Intelligence, Nankai University, Tianjin, China
| |
|
5
|
Li M, Cai X, Xu S, Ji H. Metapath-aggregated heterogeneous graph neural network for drug-target interaction prediction. Brief Bioinform 2023; 24:6966534. [PMID: 36592060 DOI: 10.1093/bib/bbac578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 11/03/2022] [Accepted: 11/26/2022] [Indexed: 01/03/2023] Open
Abstract
Drug-target interaction (DTI) prediction is an essential step in drug repositioning. A few graph neural network (GNN)-based methods have been proposed for DTI prediction using heterogeneous biological data. However, existing GNN-based methods only aggregate information from directly connected nodes within a drug-related or a target-related network and cannot capture high-order dependencies in the biological heterogeneous graph. In this paper, we propose a metapath-aggregated heterogeneous graph neural network (MHGNN) to capture complex structures and rich semantics in the biological heterogeneous graph for DTI prediction. Specifically, MHGNN enhances heterogeneous graph structure learning and high-order semantics learning by modeling high-order relations via metapaths. Additionally, MHGNN enriches high-order correlations between drug-target pairs (DTPs) by constructing a DTP correlation graph with DTPs as nodes. We conduct extensive experiments on three biological heterogeneous datasets. MHGNN outperforms 17 state-of-the-art methods on 6 evaluation metrics, which verifies its efficacy for DTI prediction. The code is available at https://github.com/Zora-LM/MHGNN-DTI.
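The idea of modeling high-order relations via metapaths can be illustrated with a minimal sketch. It is not taken from the MHGNN code base: the toy relation matrices, the `compose` helper, and the choice of a drug → disease → target metapath are all assumptions made for illustration. Composing two relation matrices counts how many metapath instances connect each drug to each target.

```python
def compose(a, b):
    """Multiply two relation matrices; result[i][j] counts two-hop paths i -> k -> j."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Hypothetical toy data: 2 drugs, 3 diseases, 2 targets.
drug_disease = [
    [1, 1, 0],  # drug 0 is linked to diseases 0 and 1
    [0, 1, 1],  # drug 1 is linked to diseases 1 and 2
]
disease_target = [
    [1, 0],  # disease 0 is linked to target 0
    [1, 1],  # disease 1 is linked to targets 0 and 1
    [0, 1],  # disease 2 is linked to target 1
]

# Number of drug -> disease -> target metapath instances per drug-target pair.
drug_target_paths = compose(drug_disease, disease_target)
print(drug_target_paths)
```

A nonzero entry marks a drug-target pair that is connected through a shared disease even when no direct edge exists, which is the kind of high-order dependency the abstract refers to.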
Affiliation(s)
- Mei Li
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
| | - Xiangrui Cai
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
| | - Sihan Xu
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Cyber Science, Nankai University, 300350, Tianjin, China
| | - Hua Ji
- Tianjin Key Laboratory of Network and Data Security Technology, China; College of Computer Science, Nankai University, 300350, Tianjin, China
| |
|
6
|
Liu Z, Huang L, Xu H, Yang W. Locally Differentially Private Heterogeneous Graph Aggregation with Utility Optimization. Entropy (Basel) 2023; 25:130. [PMID: 36673271 PMCID: PMC9858202 DOI: 10.3390/e25010130] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 12/24/2022] [Accepted: 01/04/2023] [Indexed: 06/17/2023]
Abstract
Graph data are widely collected and exploited by organizations, supporting services that range from policy formation and market decisions to medical care and social interactions. Yet recent exposures of private data abuses have caused huge financial and reputational costs to both organizations and their users, making the design of efficient privacy protection mechanisms a top priority. Local differential privacy (LDP) is an emerging privacy preservation standard and has been studied in various fields, including graph data aggregation. However, existing studies of graph aggregation with LDP mainly provide single-edge privacy for pure graphs, leaving heterogeneous graph data aggregation with stronger privacy as an open challenge. In this paper, we take a step toward simultaneously collecting mixed attributed graph data while retaining intrinsic associations, with stronger local differential privacy that protects more than a single edge. Specifically, we first propose a moderate-granularity attributewise local differential privacy (ALDP) and formulate the problem of aggregating mixed attributed graph data as collecting two statistics under ALDP. Then we provide mechanisms to privately collect these statistics. For the categorical-attributed graph, we devise a utility-improved PrivAG mechanism, which randomizes and aggregates subsets of attribute and degree vectors. For the heterogeneous graph, we present an adaptive binning scheme (ABS) to dynamically segment and simultaneously collect mixed attributed data, and extend the prior mechanism to a generalized PrivHG mechanism based on it. Finally, we practically optimize the utility of the mechanisms by reducing the computation costs and estimation errors. The effectiveness and efficiency of the mechanisms are validated through extensive experiments, and better performance is shown compared with the state-of-the-art mechanisms.
|
7
|
Kim J, Lee S, Kim Y, Ahn S, Cho S. Graph Learning-Based Blockchain Phishing Account Detection with a Heterogeneous Transaction Graph. Sensors (Basel) 2023; 23:463. [PMID: 36617060 PMCID: PMC9824179 DOI: 10.3390/s23010463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Revised: 12/07/2022] [Accepted: 12/19/2022] [Indexed: 06/17/2023]
Abstract
Recently, cybercrimes that exploit the anonymity of blockchain have been increasing. They steal blockchain users' assets, threaten the network's reliability, and destabilize the blockchain network. Therefore, it is necessary to detect blockchain cybercriminal accounts to protect users' assets and sustain the blockchain ecosystem. Many studies have been conducted to detect cybercriminal accounts in the blockchain network. They represented blockchain transaction records as homogeneous transaction graphs with multi-edges and adopted graph learning algorithms to analyze these graphs. However, most graph learning algorithms are not efficient on multi-edge graphs, and homogeneous graphs ignore the heterogeneity of the blockchain network. In this paper, we propose a novel heterogeneous graph structure called an account-transaction graph, ATGraph. ATGraph represents a multi-edge as single edges by treating transactions as nodes. It enables more efficient graph learning by eliminating multi-edges. Moreover, we compare the performance of ATGraph with homogeneous transaction graphs across various graph learning algorithms. The experimental results demonstrate that detection using ATGraph as input outperforms detection using homogeneous graphs as input by up to 0.2 AUROC.
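The conversion of a multi-edge transaction graph into an account-transaction graph can be sketched as follows. This is an illustrative assumption rather than the authors' implementation: the function name `to_account_transaction_graph`, the tuple layout of a transaction, and the toy records are all hypothetical. Each transaction becomes its own node, so two accounts that trade repeatedly are connected through distinct transaction nodes instead of parallel edges.

```python
def to_account_transaction_graph(transactions):
    """Turn (tx_id, sender, receiver, amount) records into a graph with two node types.

    Returns a dict of typed nodes and a list of directed edges
    account -> transaction -> account, with no parallel edges.
    """
    nodes = {}
    edges = []
    for tx_id, sender, receiver, amount in transactions:
        nodes[("account", sender)] = {}
        nodes[("account", receiver)] = {}
        # The transaction itself is a node and carries the edge attributes.
        nodes[("tx", tx_id)] = {"amount": amount}
        edges.append((("account", sender), ("tx", tx_id)))
        edges.append((("tx", tx_id), ("account", receiver)))
    return nodes, edges

# Hypothetical records: A pays B twice (a multi-edge in a homogeneous graph).
transactions = [
    ("t1", "A", "B", 1.5),
    ("t2", "A", "B", 0.2),
    ("t3", "B", "C", 3.0),
]
nodes, edges = to_account_transaction_graph(transactions)
print(len(nodes), len(edges))
```

Because transaction identifiers are unique, every edge in the result is unique, which is the property that lets standard graph learning algorithms run without special multi-edge handling.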
Affiliation(s)
- Jaehyeon Kim
- Department of Applied Artificial Intelligence, Major in Bio-Artificial Intelligence, Hanyang University, Ansan 15588, Republic of Korea
| | - Sejong Lee
- Department of Computer Science and Engineering, Major in Bio-Artificial Intelligence, Hanyang University, Ansan 15588, Republic of Korea
| | - Yushin Kim
- Department of Computer Science and Engineering, Major in Bio-Artificial Intelligence, Hanyang University, Ansan 15588, Republic of Korea
| | - Seyoung Ahn
- Department of Computer Science and Engineering, Major in Bio-Artificial Intelligence, Hanyang University, Ansan 15588, Republic of Korea
| | - Sunghyun Cho
- Department of Computer Science and Engineering, Hanyang University, Ansan 15588, Republic of Korea
| |
|
8
|
Wang H, Huang F, Xiong Z, Zhang W. A heterogeneous network-based method with attentive meta-path extraction for predicting drug-target interactions. Brief Bioinform 2022; 23:6596318. [PMID: 35641162 DOI: 10.1093/bib/bbac184] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Revised: 04/09/2022] [Accepted: 04/23/2022] [Indexed: 11/13/2022] Open
Abstract
Predicting drug-target interactions (DTIs) is crucial at many phases of drug discovery and repositioning. Many computational methods based on heterogeneous networks (HNs) have proved their potential to predict DTIs by capturing extensive biological knowledge and semantic information from meta-paths. However, existing methods manually customize meta-paths, which depends heavily on specific expertise. Such a strategy severely limits the scalability and flexibility of these models, and even affects their predictive performance. To alleviate this limitation, we propose a novel HN-based method with attentive meta-path extraction for DTI prediction, named HampDTI, which automatically extracts useful meta-paths through a learnable attention mechanism instead of predefining them from domain knowledge. Specifically, by scoring multi-hop connections across various relations in the HN, with each relation assigned an attention weight, HampDTI constructs a new trainable graph structure, called a meta-path graph. Such a meta-path graph implicitly measures the importance of every possible meta-path between drugs and targets. To enable HampDTI to extract more diverse meta-paths, we adopt a multi-channel mechanism to generate multiple meta-path graphs. Then, a graph neural network is deployed on the generated meta-path graphs to yield the multi-channel embeddings of drugs and targets. Finally, HampDTI fuses all embeddings from different channels for predicting DTIs. The meta-path graphs are optimized along with model training so that HampDTI can adaptively extract valuable meta-paths for DTI prediction. The experiments on benchmark datasets not only show the superiority of HampDTI in DTI prediction over several baseline methods, but also, more importantly, demonstrate the effectiveness of the model in discovering important meta-paths.
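A rough sketch of how attention-weighted relations can be combined into a meta-path graph is given below. It is an assumption-laden illustration, not HampDTI's code: the shared three-node index space, the fixed attention logits (which would be learned in practice), and the helper names `softmax`, `weighted_sum`, and `matmul` are all hypothetical. Each hop is a softmax-weighted mixture of relation adjacency matrices, and multiplying two hops scores every two-hop path.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(mats, weights):
    """Attention-weighted mixture of same-shaped adjacency matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [
        [sum(w * m[i][j] for w, m in zip(weights, mats)) for j in range(cols)]
        for i in range(rows)
    ]

def matmul(a, b):
    """Plain matrix product; entry [i][j] accumulates path scores i -> k -> j."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

# Two hypothetical relations over nodes 0..2 (all node types share one index space here).
r1 = [[0, 1, 0], [0, 0, 0], [0, 0, 0]]  # relation r1: 0 -> 1
r2 = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]  # relation r2: 1 -> 2

# Fixed logits stand in for learnable attention parameters.
hop1 = weighted_sum([r1, r2], softmax([0.0, 0.0]))
hop2 = weighted_sum([r1, r2], softmax([0.0, 0.0]))
meta_path_graph = matmul(hop1, hop2)
print(meta_path_graph[0][2])
```

The only nonzero entry corresponds to the path 0 → 1 → 2 along r1 then r2, weighted by the product of the two attention weights; raising the logit of a relation at a given hop would raise the score of every meta-path that uses it there.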
Affiliation(s)
- Hongzhun Wang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
| | - Feng Huang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
| | - Zhankun Xiong
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
| | - Wen Zhang
- College of Informatics, Huazhong Agricultural University, Wuhan, 430070, Wuhan, China
| |
|
9
|
Yang F, Zhang S, Pan W, Yao R, Zhang W, Zhang Y, Wang G, Zhang Q, Cheng Y, Dong J, Ruan C, Cui L, Wu H, Xue F. Signaling repurposable drug combinations against COVID-19 by developing the heterogeneous deep herb-graph method. Brief Bioinform 2022; 23:6580251. [PMID: 35514205 DOI: 10.1093/bib/bbac124] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/13/2021] [Revised: 02/07/2022] [Accepted: 03/15/2022] [Indexed: 11/14/2022] Open
Abstract
BACKGROUND Coronavirus disease 2019 (COVID-19) has spurred a boom in uncovering repurposable existing drugs. Drug repurposing is a strategy for identifying new uses for approved or investigational drugs that are outside the scope of the original medical indication. MOTIVATION Current work on drug repurposing for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is mostly limited to chemical medicines, to the analysis of a single drug targeting a single SARS-CoV-2 protein, and to a one-size-fits-all strategy that uses the same treatment (same drug) for different infection stages of SARS-CoV-2. To address these issues, we first focused the research on herbal medicines. We then proposed a heterogeneous graph embedding method to signal candidate repurposing herbs for each SARS-CoV-2 protein, and employed the variational graph convolutional network approach to recommend precision herb combinations as potential candidate treatments for each specific infection stage. METHOD We first employed the virtual screening method to construct the 'Herb-Compound' and 'Compound-Protein' docking graphs based on 480 herbal medicines, 12,735 associated chemical compounds and 24 SARS-CoV-2 proteins. Subsequently, the 'Herb-Compound-Protein' heterogeneous network was constructed by means of the metapath-based embedding approach. We then proposed the heterogeneous-information-network-based graph embedding method to generate candidate ranking lists of herbs that target structural, nonstructural and accessory SARS-CoV-2 proteins, individually. To obtain precise, effective combination treatments for the various COVID-19 infection stages, we employed the variational graph convolutional network method to generate candidate herb combinations as the recommended therapies.
RESULTS There were 24 ranking lists, each containing top-10 herbs, targeting 24 SARS-CoV-2 proteins correspondingly, and 20 herb combinations were generated as the candidate-specific treatment to target the four infected stages. The code and supplementary materials are freely available at https://github.com/fanyang-AI/TCM-COVID19.
Affiliation(s)
- Fan Yang
- The Department of Epidemiology and Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, China
| | - Shuaijie Zhang
- The Department of Epidemiology and Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, China
| | - Wei Pan
- The Department of Epidemiology and Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, China
| | - Ruiyuan Yao
- Shandong University of Traditional Chinese Medicine, Jinan, China
| | - Weiguo Zhang
- Shandong University of Traditional Chinese Medicine, Jinan, China
| | - Yanchun Zhang
- Institute for Sustainable Industries & Liveable Cities, Victoria University, Australia; The Department of New Networks, Peng Cheng Laboratory, Shenzhen, China
| | - Guoyin Wang
- Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Qianghua Zhang
- Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Yunlong Cheng
- Chongqing Key Laboratory of Computational Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, China
| | - Jihua Dong
- The School of Foreign Languages and Literature, Shandong University
| | - Chunyang Ruan
- Department of Data Science and Big Data Technology, Shanghai International Studies University, Shanghai, 200083, China
| | - Lizhen Cui
- School of Software, Shandong University, Jinan, China
| | - Hao Wu
- School of Software, Shandong University, Jinan, China
| | - Fuzhong Xue
- The Department of Epidemiology and Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, China
| |
|
10
|
Shao K, Zhang Y, Wen Y, Zhang Z, He S, Bo X. DTI-HETA: prediction of drug-target interactions based on GCN and GAT on heterogeneous graph. Brief Bioinform 2022; 23:6563180. [PMID: 35380622 DOI: 10.1093/bib/bbac109] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 12/27/2021] [Revised: 02/14/2022] [Accepted: 03/03/2022] [Indexed: 12/19/2022] Open
Abstract
Drug-target interaction (DTI) prediction plays an important role in drug repositioning, drug discovery and drug design. However, due to the large size of the chemical and genomic spaces and the complex interactions between drugs and targets, experimental identification of DTIs is costly and time-consuming. In recent years, the emerging graph neural network (GNN) has been applied to DTI prediction because DTIs can be represented effectively using graphs. However, some of these methods are only based on homogeneous graphs, and some consist of two decoupled steps that cannot be trained jointly. To further explore GNN-based DTI prediction by integrating heterogeneous graph information, this study regards DTI prediction as a link prediction problem and proposes an end-to-end model based on HETerogeneous graph with Attention mechanism (DTI-HETA). In this model, a heterogeneous graph is first constructed based on the drug-drug and target-target similarity matrices and the DTI matrix. Then, the graph convolutional neural network is utilized to obtain the embedded representation of the drugs and targets. To highlight the contribution of different neighborhood nodes to the central node in aggregating the graph convolution information, a graph attention mechanism is introduced into the node embedding process. Afterward, an inner product decoder is applied to predict DTIs. To evaluate the performance of DTI-HETA, experiments are conducted on two datasets. The experimental results show that our model is superior to the state-of-the-art methods. Also, the identification of novel DTIs indicates that DTI-HETA can serve as a powerful tool for integrating heterogeneous graph information to predict DTIs.
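The final step, an inner product decoder over learned embeddings, can be sketched in a few lines. The embeddings below are made-up values, the function name is hypothetical, and the sigmoid that maps the inner product to a probability-like score is a common convention assumed here rather than something the abstract states.

```python
import math

def inner_product_decoder(drug_emb, target_emb):
    """Score every drug-target pair as sigmoid(<z_drug, z_target>)."""
    return [
        [
            1.0 / (1.0 + math.exp(-sum(d * t for d, t in zip(dv, tv))))
            for tv in target_emb
        ]
        for dv in drug_emb
    ]

# Hypothetical 2-dimensional embeddings for 2 drugs and 2 targets.
drug_emb = [[1.0, 0.0], [1.0, 2.0]]
target_emb = [[1.0, 2.0], [0.0, 1.0]]

scores = inner_product_decoder(drug_emb, target_emb)
for row in scores:
    print([round(s, 3) for s in row])
```

Orthogonal embeddings give a score of exactly 0.5, while well-aligned embeddings push the score toward 1, so ranking pairs by score yields candidate DTIs; because the decoder has no parameters of its own, all learning happens in the encoder that produces the embeddings.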
Affiliation(s)
- Yuqi Wen
- Beijing Institute of Radiation Medicine, Beijing, China
- Song He
- Beijing Institute of Radiation Medicine, Beijing, China
- Xiaochen Bo
- Beijing Institute of Radiation Medicine, Beijing, China
11
Abstract
Biomedical knowledge graphs are crucial to support data-intensive applications in the life sciences and health care. These graphs can be extended by generating a heterogeneous graph that contains both ontology terms and biomedical entities. However, state-of-the-art approaches for Gene Ontology representation learning are constrained to homogeneous graphs that cannot represent different node types and relations. To address this limitation, we present GoVec, which produces representations for both ontologies and biological entities by utilizing meta-path-based representation learning in the heterogeneous graph. The resulting vectors can be used in many bioinformatics applications, particularly for calculating semantic similarity and extracting relations among biological entities. We verify the approach's usefulness by comparing the resulting semantic similarities with similarities produced manually by experts. Furthermore, the superiority of GoVec is shown by an extensive set of quantitative and qualitative evaluations. Two downstream tasks, protein-protein interaction and protein family similarity, are evaluated in comparison with many state-of-the-art approaches. Finally, as a qualitative visual representation, the separability of various protein families is examined and visually separable groups of proteins are generated, which shows the capability of GoVec representations to embed functional semantics into the vectors.
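Meta-path-based representation learning typically starts from random walks that are constrained to follow a fixed sequence of node types. The sketch below shows such a walk on a tiny protein/ontology-term graph. The abstract does not specify GoVec's walk procedure, so the function, the node names (`P1`, `GO:a`, and so on), and the cyclic `["protein", "term"]` meta-path are all hypothetical.

```python
import random

def metapath_walk(neighbors, node_type, start, metapath, walk_length, rng):
    """Random walk that only steps to neighbors whose type follows the meta-path.

    `metapath` is treated as cyclic, and `start` must have type metapath[0].
    """
    walk = [start]
    for step in range(1, walk_length):
        wanted = metapath[step % len(metapath)]
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: no neighbor of the required type
        walk.append(rng.choice(candidates))
    return walk

# Hypothetical toy graph: two proteins annotated with two ontology terms.
node_type = {"P1": "protein", "P2": "protein", "GO:a": "term", "GO:b": "term"}
neighbors = {
    "P1": ["GO:a"],
    "P2": ["GO:a", "GO:b"],
    "GO:a": ["P1", "P2", "GO:b"],
    "GO:b": ["P2", "GO:a"],
}
walk = metapath_walk(neighbors, node_type, "P1", ["protein", "term"], 6, random.Random(0))
```

Walks of this kind are commonly fed to a skip-gram model so that proteins and terms that co-occur on walks receive nearby vectors; that training step is not shown here.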
Affiliation(s)
- Esmaeil Nourani
- Department of Information Technology, Faculty of Computer Engineering and Information Technology, Azarbaijan Shahid Madani University, Tabriz, Iran; Novo Nordisk Foundation Center for Protein Research, The Faculty of Health Sciences, University of Copenhagen, Copenhagen, Denmark
12
Li Z, Li J, Nie R, You ZH, Bao W. A graph auto-encoder model for miRNA-disease associations prediction. Brief Bioinform 2020; 22:5929824. [PMID: 34293850 DOI: 10.1093/bib/bbaa240] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Received: 08/05/2020] [Revised: 08/26/2020] [Accepted: 08/27/2020] [Indexed: 02/06/2023]
Abstract
Emerging evidence indicates that the abnormal expression of miRNAs is involved in the evolution and progression of various complex human diseases. Identifying disease-related miRNAs as new biomarkers can promote the development of disease pathology and clinical medicine. However, designing biological experiments to validate disease-related miRNAs is usually time-consuming and expensive. Therefore, effective computational methods for predicting potential miRNA-disease associations are urgently needed. Inspired by the great progress of graph neural networks in link prediction, we propose a novel graph auto-encoder model, named GAEMDA, to identify potential miRNA-disease associations in an end-to-end manner. More specifically, the GAEMDA model applies a graph neural network-based encoder, which contains an aggregator function and a multi-layer perceptron for aggregating nodes' neighborhood information, to generate low-dimensional embeddings of miRNA and disease nodes and realize the effective fusion of heterogeneous information. Then, the embeddings of miRNA and disease nodes are fed into a bilinear decoder to identify potential links between miRNA and disease nodes. The experimental results indicate that GAEMDA achieves an average area under the curve of $93.56\pm 0.44\%$ under 5-fold cross-validation. In addition, we carried out case studies on colon neoplasms, esophageal neoplasms and kidney neoplasms. As a result, 48 of the top 50 predicted miRNAs associated with these diseases are confirmed by the database of differentially expressed miRNAs in human cancers and the microRNA deregulation in human disease database, respectively. The satisfactory prediction performance suggests that the GAEMDA model could serve as a reliable tool to guide subsequent research on the regulatory role of miRNAs. The source code is available at https://github.com/chimianbuhetang/GAEMDA.
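A bilinear decoder of the kind mentioned above scores a miRNA-disease pair by passing the two embeddings through a weight matrix before taking their product. The following NumPy sketch assumes the common form sigmoid(z_m^T W z_d). The function name, embedding sizes, and random values are illustrative assumptions and are not taken from the GAEMDA source code.

```python
import numpy as np

def bilinear_decoder(z_mirna, z_disease, weight):
    """Score all miRNA-disease pairs as sigmoid(z_m^T W z_d)."""
    logits = z_mirna @ weight @ z_disease.T
    return 1.0 / (1.0 + np.exp(-logits))

# Hypothetical embeddings: 4 miRNAs and 2 diseases in a 3-dimensional space.
rng = np.random.default_rng(0)
z_mirna = rng.normal(size=(4, 3))
z_disease = rng.normal(size=(2, 3))
weight = rng.normal(size=(3, 3))  # would be learned during training

scores = bilinear_decoder(z_mirna, z_disease, weight)
```

Setting `weight` to the identity matrix reduces this to a plain inner product decoder, so the learned matrix can be read as letting the model weight and mix embedding dimensions when comparing the two node types.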
Affiliation(s)
- Zhengwei Li
- Engineering Research Center of Mine Digitalization of Ministry of Education and School of Computer Science and Technology, China University of Mining and Technology
- Jiashu Li
- School of Computer Science and Technology, China University of Mining and Technology
- Ru Nie
- School of Computer Science and Technology, China University of Mining and Technology
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science
- Wenzheng Bao
- School of Information Engineering, Xuzhou University of Technology
13
Zhu R, Ji C, Wang Y, Cai Y, Wu H. Heterogeneous Graph Convolutional Networks and Matrix Completion for miRNA-Disease Association Prediction. Front Bioeng Biotechnol 2020; 8:901. [PMID: 32974293 PMCID: PMC7468400 DOI: 10.3389/fbioe.2020.00901] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/15/2020] [Accepted: 05/13/2020] [Indexed: 01/21/2023]
Abstract
Due to the cost and complexity of biological experiments, many computational methods have been proposed to predict potential miRNA-disease associations by utilizing known miRNA-disease associations and other related information. However, these computational methods face several challenges. First, the relationships between miRNAs and diseases are complex. The computational network should consider the local and global influence of neighborhoods in the network. Furthermore, predicting disease-related miRNAs without any known associations is also very important. This study presents a new computational method that constructs a heterogeneous network composed of a miRNA similarity network, a disease similarity network, and a known miRNA-disease association network. The miRNA similarity considers the miRNAs and their possible families and clusters. The information of each node in the heterogeneous network is obtained by aggregating neighborhood information with graph convolutional networks (GCNs), which can pass the information of a node to its intermediate and distant neighbors. Disease-related miRNAs with no known associations can be predicted with the reconstructed heterogeneous matrix. We apply 5-fold cross-validation, leave-one-disease-out cross-validation, and global and local leave-one-out cross-validation to evaluate our method. The corresponding areas under the curves (AUCs) are 0.9616, 0.9946, 0.9656, and 0.9532, confirming that our approach significantly outperforms the state-of-the-art methods. Case studies show that this approach can effectively predict miRNAs for new diseases without any known miRNA associations.
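For readers unfamiliar with the evaluation protocol mentioned above, the two building blocks (splitting samples into k folds and computing an AUC from scores) can be sketched in plain Python. This is a generic illustration, not the authors' evaluation code: the helper names are hypothetical, the folds are contiguous and unshuffled for simplicity, and the AUC is computed with the pairwise ranking definition.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous, nearly equal folds (no shuffling)."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

folds = k_fold_indices(7, 5)
perfect = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.8, 0.2])  # 1.0: every positive outranks every negative
chance = roc_auc([1, 0], [0.5, 0.5])                   # 0.5: a tie counts as half a win
```

In 5-fold cross-validation each fold in turn is held out as the test set, the model is fit on the remaining four, and the AUC is computed on the held-out scores. Leave-one-out variants follow the same pattern with a single held-out item (or all associations of one disease) per round.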
Affiliation(s)
- Rongxiang Zhu
- Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Shenzhen, China
- Chaojie Ji
- Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Yingying Wang
- Department of Neurology and Stroke Center, The First Affiliated Hospital of Jinan University, Guangzhou, China; Clinical Neuroscience Institute, The First Affiliated Hospital of Jinan University, Guangzhou, China
- Yunpeng Cai
- Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- Hongyan Wu
- Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China