1
Dey L, Chakraborty S. Supervised learning approaches for predicting Ebola-Human Protein-Protein interactions. Gene 2025; 942:149228. [PMID: 39828063 DOI: 10.1016/j.gene.2025.149228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2024] [Revised: 12/04/2024] [Accepted: 01/07/2025] [Indexed: 01/22/2025]
Abstract
The goal of this research work is to predict protein-protein interactions (PPIs) between the Ebola virus and the host at risk of infection. Since very few databases are available on the Ebola virus, we have prepared a comprehensive database of all the PPIs between the Ebola virus and human proteins (EbolaInt). Our work focuses on finding new protein-protein interactions between humans and the Ebola virus using state-of-the-art machine learning techniques. The task is essentially a two-class problem with a positive interacting dataset and a negative non-interacting dataset. These datasets contain various sequence-based human protein features, such as amino acid structure, conjoint triad, and domain-related features. In this research, we briefly discuss and use several well-known supervised learning approaches to predict PPIs between human proteins and Ebola virus proteins, including K-nearest neighbours (KNN), random forest (RF), support vector machine (SVM), and deep feed-forward multi-layer perceptron (DMLP). We have validated our prediction results using gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. Our goal with this prediction is to compare the models' accuracy, precision, recall, and F1-score for predicting these PPIs. In our results, DMLP gives the highest accuracy, along with the prediction of 2655 potential human target proteins.
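The conjoint triad feature mentioned in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited seven-class amino acid grouping and a simple sum-to-one normalisation; the abstract does not say which grouping or normalisation the authors used, and the function name `conjoint_triad` is hypothetical.

```python
# Seven-class amino acid grouping often used for conjoint triad encoding.
# ASSUMPTION: the abstract does not state the grouping used by the authors.
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_CLASS = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional vector of triad frequencies for a protein sequence."""
    counts = [0] * (7 ** 3)
    # Map residues to class indices; unknown symbols (e.g. 'X') are dropped,
    # so triads may span the position of a dropped residue.
    classes = [AA_TO_CLASS[aa] for aa in sequence.upper() if aa in AA_TO_CLASS]
    # Slide a window of three consecutive residues over the class sequence.
    for a, b, c in zip(classes, classes[1:], classes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    # Normalise to frequencies (a simplification chosen for this sketch).
    return [c / total for c in counts] if total else counts
```

For example, `conjoint_triad("AGVAG")` places all of its mass on index 0, because every residue falls in the first class. Feature vectors of this kind for a viral and a human protein could then be concatenated and passed to any of the classifiers listed in the abstract.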
Affiliation(s)
- Lopamudra Dey
- Department of Biomedical and Clinical Sciences, Linköping University, Sweden; Department of Computer Science & Engineering, Meghnad Saha Institute of Technology, Kolkata, India
- Sanjay Chakraborty
- Department of Computer and Information Science (IDA), REAL, AIICS, Linköping University, Sweden; Department of Computer Science & Engineering, Techno International New Town, Kolkata, India
2
Liu Z, Dai Q, Yu X, Duan X, Wang C. Predicting circRNA-Drug Resistance Associations Based on a Multimodal Graph Representation Learning Framework. IEEE J Biomed Health Inform 2025; 29:1838-1848. [PMID: 37498762 DOI: 10.1109/jbhi.2023.3299423] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 07/29/2023]
Abstract
Circular RNA (circRNA) is a class of noncoding RNA that is highly conserved and exhibits exceptional stability. Due to its function as a microRNA sponge, circRNA has gained significant attention as an essential biomarker and potential drug target in the pathogenesis of several cancers. Although many circRNAs have been identified to play a role in cancer resistance, traditional methods are time-consuming and expensive. In this context, computational methods offer a promising way to facilitate the discovery process. However, most existing prediction models focus on the association between circRNAs and drug resistance, without considering the corresponding disease-related information in the circRNA-drug resistance association. Incorporating disease-related information into the prediction of circRNA-drug resistance associations could potentially improve the efficiency and speed of discovering and developing circRNA-targeting drugs. We propose a computational framework, named GraphCDD, for predicting the association between circRNA and drug resistance. Our model utilizes data from three sources, namely circRNA, disease, and drug, to construct three similarity networks that represent the features of circRNA, disease, and drug, respectively. We utilize a multimodal graph neural network to acquire efficient representations of circRNAs, diseases, and drugs by integrating various types of information, and establish a predictive model. The experimental results have validated the effectiveness of our model and provided a promising method for predicting potential associations between circRNA and drug resistance.
3
Guo C, Wang X, Ren H. Databases and computational methods for the identification of piRNA-related molecules: A survey. Comput Struct Biotechnol J 2024; 23:813-833. [PMID: 38328006 PMCID: PMC10847878 DOI: 10.1016/j.csbj.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/31/2023] [Accepted: 01/15/2024] [Indexed: 02/09/2024] Open
Abstract
Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs (ncRNAs) that play important roles in many biological processes and in the diagnosis and treatment of major cancers, and have thus become a hot research topic. This study aims to provide an in-depth review of computational piRNA-related research, including databases and computational models. Herein, we perform literature analysis and use comparative evaluation methods to summarize and analyze three aspects of computational piRNA-related research: (i) computational models for piRNA-related molecular identification tasks, (ii) computational models for piRNA-disease association prediction tasks, and (iii) computational resources and evaluation metrics for these tasks. This study shows that computational piRNA-related research has progressed significantly and exhibited promising performance in recent years, although it still suffers from the emerging challenges of inconsistent naming systems and a lack of data. Unlike other reviews on piRNA-related identification tasks, which focus on the organization of datasets and computational methods, we pay more attention to the analysis of computational models, algorithms, and performance, aiming to provide valuable references for computational piRNA-related identification tasks. This study will benefit the theoretical development and practical application of piRNAs by improving the understanding of computational models and resources used to investigate the biological functions and clinical implications of piRNA.
Affiliation(s)
- Chang Guo
- Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Xiaoli Wang
- Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Han Ren
- Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou 510420, China
4
Shang J, Zhao L, He X, Meng X, Zhang L, Ge D, Li F, Liu JX. SGFCCDA: Scale Graph Convolutional Networks and Feature Convolution for circRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:7006-7014. [PMID: 39250355 DOI: 10.1109/jbhi.2024.3456478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/11/2024]
Abstract
Circular RNAs (circRNAs) have emerged as a novel class of non-coding RNAs with regulatory roles in disease pathogenesis. Computational models aimed at predicting circRNA-disease associations offer valuable insights into disease mechanisms, thereby enabling the development of innovative diagnostic and therapeutic approaches while reducing the reliance on costly wet experiments. In this study, SGFCCDA is proposed for predicting potential circRNA-disease associations based on scale graph convolutional networks and feature convolution. Specifically, SGFCCDA integrates multiple measures of circRNA and disease similarity and combines known association information to construct a heterogeneous network. This network is then explored by scale graph convolutional networks to capture both topological and attribute information. Additionally, convolutional neural networks are employed to further learn the features and obtain higher-order feature representations containing richer information about nodes. The Hadamard product is utilized to effectively combine circRNA features with disease features, and a multilayer perceptron is applied to predict the association between each pair of circRNA and disease. Five-fold cross validation experiments conducted on the CircR2Disease dataset demonstrate the accurate prediction capabilities of SGFCCDA in identifying potential circRNA-disease associations. Furthermore, case studies provide further confirmation of SGFCCDA's ability to identify disease-associated circRNAs.
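The final scoring step described above (Hadamard product of the two feature vectors, followed by a multilayer perceptron) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the single hidden layer, the ReLU and sigmoid activations, and all weights are hypothetical and are not taken from the SGFCCDA paper.

```python
import math


def hadamard(u, v):
    """Element-wise (Hadamard) product of two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return [a * b for a, b in zip(u, v)]


def mlp_score(x, w_hidden, b_hidden, w_out, b_out):
    """Hypothetical MLP: one hidden ReLU layer, then a sigmoid output in (0, 1)."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))


# Toy circRNA and disease embeddings (made-up numbers for illustration).
circ_features = [1.0, 2.0, 3.0]
disease_features = [4.0, 5.0, 6.0]
pair_features = hadamard(circ_features, disease_features)  # [4.0, 10.0, 18.0]
```

The Hadamard product keeps the pair representation at the same dimension as each embedding, unlike concatenation, which doubles it.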
5
Wang XF, Huang L, Wang Y, Guan RC, You ZH, Sheng N, Xie XP, Yang QX. A multichannel graph neural network based on multisimilarity modality hypergraph contrastive learning for predicting unknown types of cancer biomarkers. Brief Bioinform 2024; 25:bbae575. [PMID: 39523624 PMCID: PMC11551052 DOI: 10.1093/bib/bbae575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2024] [Revised: 10/19/2024] [Accepted: 11/01/2024] [Indexed: 11/16/2024] Open
Abstract
Identifying potential cancer biomarkers is a key task in biomedical research, providing a promising avenue for the diagnosis and treatment of human tumors and cancers. In recent years, several machine learning-based RNA-disease association prediction techniques have emerged. However, they primarily focus on modeling relationships of a single type, overlooking the importance of gaining insights into molecular behaviors from a complete regulatory network perspective and discovering biomarkers of unknown types. Furthermore, effectively handling local and global topological structural information of nodes in biological molecular regulatory graphs remains a challenge to improving biomarker prediction performance. To address these limitations, we propose a multichannel graph neural network based on multisimilarity modality hypergraph contrastive learning (MML-MGNN) for predicting unknown types of cancer biomarkers. MML-MGNN leverages multisimilarity modality hypergraph contrastive learning to delve into local associations in the regulatory network, learning diverse insights into the topological structures of multiple types of similarities, and then globally modeling the multisimilarity modalities through a multichannel graph autoencoder. By combining representations obtained from local-level associations and global-level regulatory graphs, MML-MGNN can acquire molecular feature descriptors benefiting from multitype association properties and the complete regulatory network. Experimental results on predicting three different types of cancer biomarkers demonstrate the outstanding performance of MML-MGNN. Furthermore, a case study on gastric cancer underscores the outstanding ability of MML-MGNN to gain deeper insights into molecular mechanisms in regulatory networks and prominent potential in cancer biomarker prediction.
Affiliation(s)
- Xin-Fei Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Lan Huang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Yan Wang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Ren-Chu Guan
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Youyi West Road, Xi'an 710072, China
- Nan Sheng
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Xu-Ping Xie
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
- Qi-Xing Yang
- Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, No. 2699, Qianjin Street, Changchun 130012, China
6
Lee Y. Three-Dimensional Dense Reconstruction: A Review of Algorithms and Datasets. Sensors (Basel) 2024; 24:5861. [PMID: 39338606 PMCID: PMC11435907 DOI: 10.3390/s24185861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2024] [Revised: 09/04/2024] [Accepted: 09/05/2024] [Indexed: 09/30/2024]
Abstract
Three-dimensional dense reconstruction involves extracting the full shape and texture details of three-dimensional objects from two-dimensional images. Although 3D reconstruction is a crucial and well-researched area, it remains an unsolved challenge in dynamic or complex environments. This work provides a comprehensive overview of classical 3D dense reconstruction techniques, including those based on geometric and optical models, as well as approaches leveraging deep learning. It also discusses the datasets used for deep learning and evaluates the performance and the strengths and limitations of deep learning methods on these datasets.
Affiliation(s)
- Yangming Lee
- RoCAL Lab, Rochester Institute of Technology, Rochester, NY 14623, USA
7
Salooja CM, Sanker A, Deepthi K, Jereesh AS. An ensemble approach for circular RNA-disease association prediction using variational autoencoder and genetic algorithm. J Bioinform Comput Biol 2024; 22:2450018. [PMID: 39215523 DOI: 10.1142/s0219720024500185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/04/2024]
Abstract
Circular RNAs (circRNAs) are endogenous non-coding RNAs with a covalently closed loop structure. They have many biological functions, mainly regulatory ones. They have been proven to modulate protein-coding genes in the human genome. CircRNAs are linked to various diseases like Alzheimer's disease, diabetes, atherosclerosis, Parkinson's disease and cancer. Identifying the associations between circular RNAs and diseases is essential for disease diagnosis, prevention, and treatment. The proposed model, based on the variational autoencoder and genetic algorithm circular RNA disease association (VAGA-CDA), predicts novel circRNA-disease associations. First, the experimentally verified circRNA-disease associations are augmented with the synthetic minority oversampling technique (SMOTE) and regenerated using a variational autoencoder, and feature selection is applied to these vectors by a genetic algorithm (GA). The variational autoencoder effectively extracts features from the augmented samples. The optimized feature selection of the genetic algorithm effectively carried out dimensionality reduction. The sophisticated feature vectors extracted are then given to a Random Forest classifier to predict new circRNA-disease associations. The proposed model yields an AUC value of 0.9644 and 0.9628 under 5-fold and 10-fold cross-validations, respectively. The results of the case studies indicate the robustness of the proposed model.
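The SMOTE augmentation step named in the abstract creates synthetic minority samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows that core idea only; the function name `smote`, its parameters, and the toy data are assumptions for illustration and do not reproduce the VAGA-CDA pipeline.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        # Pick a random point on the segment between base and neighbour.
        u = rng.random()
        synthetic.append([a + u * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class (made-up feature vectors).
minority_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_samples = smote(minority_samples, n_new=5)
```

Because every synthetic point lies on a segment between two real minority samples, the new samples stay inside the region already occupied by the minority class.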
Affiliation(s)
- C M Salooja
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- Arjun Sanker
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
- K Deepthi
- Department of Computer Science, Central University of Kerala (Central Govt. of India), Kerala-671316, India
- A S Jereesh
- Bioinformatics Lab, Department of Computer Science, Cochin University of Science and Technology, Kerala-682022, India
8
Wei H, Gao L, Wu S, Jiang Y, Liu B. DiSMVC: a multi-view graph collaborative learning framework for measuring disease similarity. Bioinformatics 2024; 40:btae306. [PMID: 38715444 PMCID: PMC11256965 DOI: 10.1093/bioinformatics/btae306] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 04/19/2024] [Accepted: 05/05/2024] [Indexed: 05/30/2024] Open
Abstract
MOTIVATION Exploring potential associations between diseases can help in understanding pathological mechanisms of diseases and facilitating the discovery of candidate biomarkers and drug targets, thereby promoting disease diagnosis and treatment. Some computational methods have been proposed for measuring disease similarity. However, these methods describe diseases without considering their latent multi-molecule regulation and valuable supervision signal, resulting in limited biological interpretability and efficiency to capture association patterns. RESULTS In this study, we propose a new computational method named DiSMVC. Different from existing predictors, DiSMVC designs a supervised graph collaborative framework to measure disease similarity. Multiple bio-entity associations related to genes and miRNAs are integrated via cross-view graph contrastive learning to extract informative disease representation, and then association pattern joint learning is implemented to compute disease similarity by incorporating phenotype-annotated disease associations. The experimental results show that DiSMVC can draw discriminative characteristics for disease pairs, and outperform other state-of-the-art methods. As a result, DiSMVC is a promising method for predicting disease associations with molecular interpretability. AVAILABILITY AND IMPLEMENTATION Datasets and source codes are available at https://github.com/Biohang/DiSMVC.
Affiliation(s)
- Hang Wei
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Lin Gao
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Shuai Wu
- School of Computer Science and Technology, Xidian University, Xi’an, Shaanxi 710126, China
- Yina Jiang
- Department of Basic Medicine, Shaanxi University of Chinese Medicine, Xianyang, Shaanxi 712046, China
- Bin Liu
- Faculty of Engineering, Shenzhen MSU-BIT University, Shenzhen, Guangdong 518172, China
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
9
Yang J, Lei X, Zhang F. Identification of circRNA-disease associations via multi-model fusion and ensemble learning. J Cell Mol Med 2024; 28:e18180. [PMID: 38506066 PMCID: PMC10951890 DOI: 10.1111/jcmm.18180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 01/21/2024] [Accepted: 02/05/2024] [Indexed: 03/21/2024] Open
Abstract
Circular RNA (circRNA) is a common non-coding RNA that plays an important role in the diagnosis and therapy of human diseases. Predicting circRNA-disease associations with computational methods can provide a new route to better clinical diagnosis. In this article, we propose a novel ensemble-learning method for circRNA-disease association prediction, named ELCDA. First, a heterogeneous association network is constructed by collecting multiple sources of information on circRNAs and diseases, using multiple similarity measures. Then, metapath, matrix factorization and GraphSAGE-based models are used to extract node features from different views, and the final comprehensive features of circRNAs and diseases are obtained via ensemble learning. Finally, a soft voting ensemble strategy is used to integrate the predicted results of all classifiers. The performance of ELCDA is evaluated by fivefold cross-validation and compared with other state-of-the-art methods; the experimental results show that ELCDA outperforms the others. Furthermore, three common diseases are used as case studies, which also demonstrate that ELCDA is an effective method for predicting circRNA-disease associations.
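Soft voting, the last step in the abstract, averages the class probabilities produced by several classifiers instead of counting their hard votes. A minimal sketch follows; the function name `soft_vote`, the optional weights, and the toy probabilities are assumptions for illustration, and the abstract does not say whether ELCDA weights its classifiers.

```python
def soft_vote(probabilities, weights=None):
    """Combine per-classifier positive-class probabilities by (weighted) averaging.

    probabilities: one list per classifier, each holding a probability per sample.
    weights: optional per-classifier weights (equal weights when omitted).
    """
    n_models = len(probabilities)
    if n_models == 0:
        raise ValueError("at least one classifier is required")
    n_samples = len(probabilities[0])
    if any(len(p) != n_samples for p in probabilities):
        raise ValueError("all classifiers must score the same samples")
    weights = weights if weights is not None else [1.0] * n_models
    total = sum(weights)
    return [
        sum(w * p[i] for w, p in zip(weights, probabilities)) / total
        for i in range(n_samples)
    ]


# Three hypothetical classifiers scoring two circRNA-disease pairs.
scores = soft_vote([[0.9, 0.2], [0.7, 0.4], [0.8, 0.6]])
labels = [int(s >= 0.5) for s in scores]
```

Here the averaged scores are approximately 0.8 and 0.4, so only the first pair is predicted as an association at a 0.5 threshold.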
Affiliation(s)
- Jing Yang
- School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, Shaanxi, China
- Fa Zhang
- School of Medical Technology, Beijing Institute of Technology, Beijing, China
10
Hu X, Zhang P, Liu D, Zhang J, Zhang Y, Dong Y, Fan Y, Deng L. IGCNSDA: unraveling disease-associated snoRNAs with an interpretable graph convolutional network. Brief Bioinform 2024; 25:bbae179. [PMID: 38647155 PMCID: PMC11033953 DOI: 10.1093/bib/bbae179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2023] [Revised: 12/15/2023] [Accepted: 03/27/2024] [Indexed: 04/25/2024] Open
Abstract
Accurately delineating the connection between small nucleolar RNA (snoRNA) and disease is crucial for advancing disease detection and treatment. While traditional biological experimental methods are effective, they are labor-intensive, costly and lack scalability. With the ongoing progress in computer technology, an increasing number of deep learning techniques are being employed to predict snoRNA-disease associations. Nevertheless, the majority of these methods are black-box models, lacking interpretability and the capability to elucidate the snoRNA-disease association mechanism. In this study, we introduce IGCNSDA, an innovative and interpretable graph convolutional network (GCN) approach tailored for the efficient inference of snoRNA-disease associations. IGCNSDA leverages the GCN framework to extract node feature representations of snoRNAs and diseases from the bipartite snoRNA-disease graph. SnoRNAs with high similarity are more likely to be linked to analogous diseases, and vice versa. To facilitate this process, we introduce a subgraph generation algorithm that effectively groups similar snoRNAs and their associated diseases into cohesive subgraphs. Subsequently, we aggregate information from neighboring nodes within these subgraphs, iteratively updating the embeddings of snoRNAs and diseases. The experimental results demonstrate that IGCNSDA outperforms the most recent, highly relevant methods. Additionally, our interpretability analysis provides compelling evidence that IGCNSDA adeptly captures the underlying similarity between snoRNAs and diseases, thus affording researchers enhanced insights into the snoRNA-disease association mechanism. Furthermore, we present illustrative case studies that demonstrate the utility of IGCNSDA as a valuable tool for efficiently predicting potential snoRNA-disease associations. The dataset and source code for IGCNSDA are openly accessible at: https://github.com/altriavin/IGCNSDA.
Affiliation(s)
- Xiaowen Hu
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Pan Zhang
- Hunan Provincial Key Laboratory of Clinical Epidemiology, Xiangya School of Public Health, Central South University, 410078, Changsha, China
- Dayun Liu
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Jiaxuan Zhang
- Department of Electrical and Computer Engineering, University of California, San Diego, 92093, CA, United States
- Yuanpeng Zhang
- School of Software, Xinjiang University, 830046, Urumqi, China
- Yihan Dong
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Yanhao Fan
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, 410075, Changsha, China
11
Turgut H, Turanli B, Boz B. DCDA: CircRNA-Disease Association Prediction with Feed-Forward Neural Network and Deep Autoencoder. Interdiscip Sci 2024; 16:91-103. [PMID: 37978116 DOI: 10.1007/s12539-023-00590-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Revised: 10/13/2023] [Accepted: 10/15/2023] [Indexed: 11/19/2023]
Abstract
Circular RNA is a single-stranded RNA with a closed-loop structure. In recent years, academic research has revealed that circular RNAs play critical roles in biological processes and are related to human diseases. The discovery of potential circRNAs as disease biomarkers and drug targets is crucial since it can help diagnose diseases in the early stages and be used to treat people. However, in conventional experimental methods, conducting experiments to detect associations between circular RNAs and diseases is time-consuming and costly. To overcome this problem, various computational methodologies are proposed to extract essential features for both circular RNAs and diseases and predict the associations. Studies showed that computational methods successfully predicted performance and made it possible to detect possible highly related circular RNAs for diseases. This study proposes a deep learning-based circRNA-disease association predictor methodology called DCDA, which uses multiple data sources to create circRNA and disease features and reveal hidden feature codings of a circular RNA-disease pair with a deep autoencoder, then predict the relation score of the pair by a deep neural network. Fivefold cross-validation results on the benchmark dataset showed that our model outperforms state-of-the-art prediction methods in the literature with the AUC score of 0.9794.
Affiliation(s)
- Hacer Turgut
- Computer Engineering Department, Marmara University, 34854, Istanbul, Türkiye
- Beste Turanli
- Bioengineering Department, Marmara University, 34854, Istanbul, Türkiye
- Betül Boz
- Computer Engineering Department, Marmara University, 34854, Istanbul, Türkiye
12
Wang W, Han P, Li Z, Nie R, Wang K, Wang L, Liao H. LMGATCDA: Graph Neural Network With Labeling Trick for Predicting circRNA-Disease Associations. IEEE/ACM Trans Comput Biol Bioinform 2024; 21:289-300. [PMID: 38231821 DOI: 10.1109/tcbb.2024.3355093] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/19/2024]
Abstract
Previous studies have proven that circular RNAs (circRNAs) are inextricably connected to the etiology and pathophysiology of complicated diseases. Since conventional biological research is frequently small-scale, expensive, and time-consuming, it is essential to establish an efficient and reasonable computation-based method to identify disease-related circRNAs. In this article, we propose a novel ensemble model for predicting probable circRNA-disease associations based on multi-source similarity information (LMGATCDA). In particular, LMGATCDA first incorporates information on circRNA functional similarity, disease semantic similarity, and the Gaussian interaction profile (GIP) kernel similarity as explicit features, along with node-labeling of the three-hop subgraphs extracted from each linked target node as graph structural features. After that, the fused features are used as input, and further implied features are extracted by graph sampling aggregation (GraphSAGE) and multi-hop attention graph neural network (MAGNA). Finally, the prediction scores are obtained through a fully connected layer. With five-fold cross-validation, LMGATCDA demonstrated excellent competitiveness against gold standard data, reaching 95.37% accuracy and 91.31% recall with an AUC of 94.25% on the circR2Disease benchmark dataset. Collectively, the noteworthy findings from these case studies support our conclusion that the LMGATCDA model can provide reliable circRNA-disease associations for clinical research while helping to mitigate experimental uncertainties in wet-lab investigations.
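The Gaussian interaction profile (GIP) kernel similarity used as an explicit feature can be written out as a short sketch. It follows the usual definition, where each row of a binary association matrix is an entity's interaction profile and the bandwidth is normalised by the mean squared profile norm. The function name `gip_kernel`, the `gamma_prime` default of 1.0, and the toy matrix are assumptions; the abstract does not give the authors' settings.

```python
import math


def gip_kernel(association, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary matrix.

    K[i][j] = exp(-gamma * ||row_i - row_j||^2), where
    gamma = gamma_prime / mean_i(||row_i||^2).
    """
    n = len(association)
    mean_sq_norm = sum(sum(x * x for x in row) for row in association) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known interactions")
    gamma = gamma_prime / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2
                                  for a, b in zip(association[i], association[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy circRNA-by-disease association matrix (rows: circRNAs, columns: diseases).
toy_associations = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
circ_similarity = gip_kernel(toy_associations)
```

Applying the same function to the transposed matrix would give the disease-side similarity. Rows with identical profiles, such as the first two above, receive a similarity of 1.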
13
Wang L, Li ZW, You ZH, Huang DS, Wong L. GSLCDA: An Unsupervised Deep Graph Structure Learning Method for Predicting CircRNA-Disease Association. IEEE J Biomed Health Inform 2024; 28:1742-1751. [PMID: 38127594 DOI: 10.1109/jbhi.2023.3344714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Growing studies reveal that circular RNAs (circRNAs) are broadly engaged in physiological processes such as cell proliferation, differentiation, aging, and apoptosis, and are closely associated with the pathogenesis of numerous diseases. Clarifying the correlation between diseases and circRNAs is of great clinical importance for providing new therapeutic strategies for complex diseases. However, previous circRNA-disease association (CDA) prediction methods rely excessively on the graph network, and model performance is dramatically reduced when noisy connections occur in the graph structure. To address this problem, this paper proposes an unsupervised deep graph structure learning method, GSLCDA, to predict potential CDAs. Concretely, we first integrate circRNA and disease multi-source data to constitute the CDA heterogeneous network. Then the network topology is learned using the graph structure, and the original graph is enhanced in an unsupervised manner by maximizing the inter information of the learned and original graphs to uncover their essential features. Finally, a graph space sensitive k-nearest neighbor (KNN) algorithm is employed to search for latent CDAs. On the benchmark dataset, GSLCDA obtained 92.67% accuracy with 0.9279 AUC. GSLCDA also exhibits exceptional performance on independent datasets. Furthermore, 14, 12 and 14 of the top 16 circRNAs with the highest GSLCDA prediction scores were confirmed in the relevant literature in the breast cancer, colorectal cancer and lung cancer case studies, respectively. These results demonstrate that GSLCDA can validly reveal underlying CDAs and offer new perspectives for the diagnosis and therapy of complex human diseases.
14
Wang L, Li ZW, You ZH, Huang DS, Wong L. MAGCDA: A Multi-Hop Attention Graph Neural Networks Method for CircRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:1752-1761. [PMID: 38145538 DOI: 10.1109/jbhi.2023.3346821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/27/2023]
Abstract
A growing body of evidence establishes that circular RNAs (circRNAs) are widespread in eukaryotic cells and contribute significantly to the occurrence and development of many complex human diseases. Disease-associated circRNAs can serve as clinical diagnostic biomarkers and therapeutic targets, providing novel ideas for biopharmaceutical research. However, available computational methods for predicting circRNA-disease associations (CDAs) do not sufficiently consider the contextual information of biological network nodes, which limits their performance. In this work, we propose a multi-hop attention graph neural network-based approach, MAGCDA, to infer potential CDAs. Specifically, we first construct a multi-source attribute heterogeneous network of circRNAs and diseases, then use a multi-hop strategy of graph nodes to deeply aggregate node context information through attention diffusion, thus enhancing topological structure information and mining hidden features of the data, and finally use random forest to accurately infer potential CDAs. On four gold standard data sets, MAGCDA achieved prediction accuracies of 92.58%, 91.42%, 83.46% and 91.12%, respectively. MAGCDA also performed well in ablation experiments and in comparisons with other models. Additionally, 18 and 17 of the top 20 circRNAs ranked by MAGCDA prediction score were confirmed in case studies of the complex diseases breast cancer and Alzheimer's disease, respectively. These results suggest that MAGCDA can be a practical tool to explore potential disease-associated circRNAs and provide a theoretical basis for disease diagnosis and treatment.
|
15
|
Chang L, Jin X, Rao Y, Zhang X. Predicting abiotic stress-responsive miRNA in plants based on multi-source features fusion and graph neural network. Plant Methods 2024; 20:33. [PMID: 38402152 PMCID: PMC10894500 DOI: 10.1186/s13007-024-01158-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Accepted: 02/14/2024] [Indexed: 02/26/2024]
Abstract
BACKGROUND: More and more studies show that miRNA plays a crucial role in plants' response to different abiotic stresses. However, traditional experimental methods are often expensive and inefficient, so it is important to develop efficient and economical computational methods. Although researchers have developed machine learning-based methods, the information on miRNAs and abiotic stresses has not been fully exploited. Therefore, we propose a novel approach based on graph neural networks for predicting potential miRNA-abiotic stress associations. RESULTS: In this study, we fully considered the multi-source feature information from miRNAs and abiotic stresses, and calculated and integrated the similarity networks of miRNAs and abiotic stresses from different feature perspectives using multiple similarity measures. Then, the above multi-source similarity networks and the association information between miRNAs and abiotic stresses are fused through heterogeneous networks. Subsequently, the Random Walk with Restart (RWR) algorithm is employed to extract global structural information from the heterogeneous networks, providing feature vectors for miRNAs and abiotic stresses. After that, we utilized a graph autoencoder based on GIN (Graph Isomorphism Networks) to learn and reconstruct a miRNA-abiotic stress association matrix to obtain potential miRNA-abiotic stress associations. The experimental results show that our model is superior to all known methods in predicting potential miRNA-abiotic stress associations, and the AUPR and AUC metrics of our model reach 98.24% and 97.43%, respectively, under five-fold cross-validation. CONCLUSIONS: The robustness and effectiveness of our proposed model position it as a valuable approach for advancing the field of miRNA-abiotic stress association prediction.
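The Random Walk with Restart step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the toy 4-node path graph, the restart probability of 0.5 and the convergence tolerance are all assumptions chosen for illustration. The walker repeatedly moves to a neighbour or jumps back to the seed node, and the steady-state visiting probabilities serve as a feature vector for that seed.

```python
def random_walk_with_restart(adjacency, seed, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walker that restarts at `seed`."""
    n = len(adjacency)
    # Column-normalise the adjacency matrix so each non-empty column sums to 1.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    restart = [1.0 if i == seed else 0.0 for i in range(n)]
    p = restart[:]
    for _ in range(max_iter):
        # One update: follow an edge with prob. (1 - r), jump back to the seed with prob. r.
        nxt = [
            (1.0 - restart_prob) * sum(transition[i][j] * p[j] for j in range(n))
            + restart_prob * restart[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical toy network: a path 0 - 1 - 2 - 3 (undirected, unweighted).
toy_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
rwr_scores = random_walk_with_restart(toy_graph, seed=0)
```

In this toy case the scores decay with distance from the seed, which is the sense in which RWR captures structure beyond direct neighbours.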
Affiliation(s)
- Liming Chang
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
- Xiu Jin
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei, 230036, China
- Yuan Rao
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei, 230036, China
- Xiaodan Zhang
  - College of Information and Artificial Intelligence, Anhui Agricultural University, Hefei, 230036, China.
  - Anhui Province Key Laboratory of Smart Agricultural Technology and Equipment, Anhui Agricultural University, Hefei, 230036, China.
|
16
|
Niu M, Wang C, Zhang Z, Zou Q. A computational model of circRNA-associated diseases based on a graph neural network: prediction and case studies for follow-up experimental validation. BMC Biol 2024; 22:24. [PMID: 38281919 PMCID: PMC10823650 DOI: 10.1186/s12915-024-01826-z] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 07/20/2023] [Accepted: 01/11/2024] [Indexed: 01/30/2024] Open
Abstract
BACKGROUND: Circular RNAs (circRNAs) have been confirmed to play a vital role in the occurrence and development of diseases. Exploring the relationship between circRNAs and diseases is of far-reaching significance for studying etiopathogenesis and treating diseases. To this end, based on the graph Markov neural network algorithm (GMNN) constructed in our previous work GMNN2CD, we further considered the multisource biological data that affect the association between circRNAs and diseases, developed an updated web server, CircDA, and used human hepatocellular carcinoma (HCC) tissue data to verify the prediction results of CircDA. RESULTS: CircDA is built on a Tumarkov-based deep learning framework. The algorithm regards biomolecules as nodes and the interactions between molecules as edges, reasonably abstracts multiomics data, and models them as a heterogeneous biomolecular association network, which can reflect the complex relationships between different biomolecules. Case studies using literature data from HCC, cervical, and gastric cancers demonstrate that the CircDA predictor can identify missing associations between known circRNAs and diseases. In addition, a quantitative real-time PCR (RT-qPCR) experiment on human HCC tissue samples found that five circRNAs were significantly differentially expressed, which showed that CircDA can predict diseases related to new circRNAs. CONCLUSIONS: This efficient computational prediction and case analysis with sufficient feedback allows us to identify circRNA-associated diseases and disease-associated circRNAs. Our work provides a method to predict circRNA-associated diseases and can provide guidance for the association of diseases with certain circRNAs. For ease of use, an online prediction server (http://server.malab.cn/CircDA) is provided, and the code is open-sourced (https://github.com/nmt315320/CircDA.git) for the convenience of algorithm improvement.
Affiliation(s)
- Mengting Niu
  - School of Electronic and Communication Engineering, Shenzhen Polytechnic University, Shenzhen, 518055, China
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, 150000, Heilongjiang, China
- Zhanguo Zhang
  - Hepatic Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 1095 Jiefang Avenue, Wuhan, 430030, China.
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 4 Block 2 North Jianshe Road, Chengdu, 610054, China.
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China.
|
17
|
Chen L, Zhao X. PCDA-HNMP: Predicting circRNA-disease association using heterogeneous network and meta-path. Mathematical Biosciences and Engineering: MBE 2023; 20:20553-20575. [PMID: 38124565 DOI: 10.3934/mbe.2023909] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/23/2023]
Abstract
An increasing number of experimental studies have shown that circular RNAs (circRNAs) play important regulatory roles in human diseases through interactions with related microRNAs (miRNAs). CircRNAs have become new potential disease biomarkers and therapeutic targets. Predicting circRNA-disease associations (CDAs) is of great significance for exploring the pathogenesis of complex diseases, which can improve disease diagnosis and promote targeted therapy. However, determining CDAs through traditional clinical trials is usually time-consuming and expensive. Computational methods are now alternative ways to predict CDAs. In this study, a new computational method, named PCDA-HNMP, was designed. To obtain informative features of circRNAs and diseases, a heterogeneous network was first constructed, which defined circRNAs, mRNAs, miRNAs and diseases as nodes and associations between them as edges. Then, a deep analysis was conducted on the heterogeneous network by extracting meta-paths connecting to circRNAs (diseases), thereby mining hidden associations between various circRNAs (diseases). These associations constituted the meta-path-induced networks for circRNAs and diseases. The features of circRNAs and diseases were derived from the aforementioned networks via mashup. In addition, miRNA-disease associations (mDAs) were employed to improve the model's performance. miRNA features were obtained from the meta-path-induced networks on miRNAs and circRNAs, which were constructed from the meta-paths connecting miRNAs and circRNAs in the heterogeneous network. A concatenation operation was adopted to build the features of CDAs and mDAs. These representations of CDAs and mDAs were fed into XGBoost to set up the model. Five-fold cross-validation yielded an area under the curve (AUC) of 0.9846, which was better than those of some existing state-of-the-art methods. Employing mDAs enhanced the model's performance, and the importance analysis on meta-path-induced networks showed that networks produced by the meta-paths containing validated CDAs provided the most important contributions.
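The idea of a meta-path-induced network can be made concrete with a small sketch. This is an illustration under stated assumptions rather than the PCDA-HNMP code: the helper names and the toy circRNA-miRNA matrix are hypothetical. Counting instances of the meta-path circRNA - miRNA - circRNA amounts to multiplying the circRNA-miRNA association matrix by its transpose, and each non-zero off-diagonal entry marks a pair of circRNAs linked through at least one shared miRNA.

```python
def transpose(matrix):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(column) for column in zip(*matrix)]


def metapath_counts(first_hop, second_hop):
    """Count two-hop meta-path instances via a plain matrix product."""
    inner = len(second_hop)
    cols = len(second_hop[0])
    return [
        [sum(row[k] * second_hop[k][j] for k in range(inner)) for j in range(cols)]
        for row in first_hop
    ]


# Hypothetical toy data: 3 circRNAs (rows) x 3 miRNAs (columns), 1 = known association.
circ_mirna = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]

# circRNA - miRNA - circRNA: entry (i, j) counts the miRNAs shared by circRNA i and circRNA j.
circ_circ = metapath_counts(circ_mirna, transpose(circ_mirna))
# circ_circ == [[2, 1, 0], [1, 1, 0], [0, 0, 1]]
```

Longer meta-paths (for example through diseases or mRNAs) would chain further products in the same way; how the resulting networks are turned into features is left to the method described above.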
Affiliation(s)
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
- Xiaoyu Zhao
  - College of Information Engineering, Shanghai Maritime University, Shanghai 201306, China
|
18
|
Zhang J, Lang M, Zhou Y, Zhang Y. Predicting RNA structures and functions by artificial intelligence. Trends Genet 2023; 40:S0168-9525(23)00229-9. [PMID: 39492264 DOI: 10.1016/j.tig.2023.10.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/25/2023] [Revised: 08/22/2023] [Accepted: 10/03/2023] [Indexed: 11/05/2024]
Abstract
RNA functions by interacting with its intended targets structurally. However, due to the dynamic nature of RNA molecules, RNA structures are difficult to determine experimentally or predict computationally. Artificial intelligence (AI) has revolutionized many biomedical fields and has been progressively utilized to deduce RNA structures, target binding, and associated functionality. Integrating structural and target binding information could also help improve the robustness of AI-based RNA function prediction and RNA design. Given the rapid development of deep learning (DL) algorithms, AI will provide an unprecedented opportunity to elucidate the sequence-structure-function relation of RNAs.
Affiliation(s)
- Jun Zhang
  - National Engineering Laboratory for Big Data System Computing Technology, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, Guangdong, 518060, China
- Mei Lang
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China
- Yaoqi Zhou
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong, 518106, China.
- Yang Zhang
  - School of Science, Harbin Institute of Technology, Shenzhen, Guangdong, 518055, China.
|
19
|
Hu X, Dong Y, Zhang J, Deng L. HGCLMDA: Predicting mRNA-Drug Sensitivity Associations via Hypergraph Contrastive Learning. J Chem Inf Model 2023; 63:5936-5946. [PMID: 37674276 DOI: 10.1021/acs.jcim.3c00957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
The identification of drug sensitivity to mRNA interactions is crucial for drug development and disease treatment, but traditional experimental methods for verifying mRNA-drug sensitivity associations are labor-intensive and time-consuming. In this study, we present a hypergraph contrastive learning approach, HGCLMDA, to predict potential mRNA-drug sensitivity associations. HGCLMDA integrates a graph convolutional network-based method with a hypergraph convolutional network to mine high-order relationships between mRNA-drug association pairs. The proposed cross-view contrastive learning architecture improves the model's learning ability, and the inner product is used to obtain the mRNA-drug sensitivity association score. Our experiments on three mRNA-drug sensitivity association data sets show that HGCLMDA outperforms traditional graph convolutional network-based methods, graph augmentation-based contrastive learning methods, and state-of-the-art association prediction methods. The visualization experiment demonstrates the strong discrimination ability of the mRNA and drug embeddings learned by HGCLMDA, and experiments on sparse data sets showcase the performance and robustness of the method. In-depth analysis of hypergraph structures reveals a crucial role that hypergraphs play in enhancing the performance of models. The case study highlights the potential of HGCLMDA as a valuable tool for predicting mRNA-drug sensitivity associations. The interpretive analysis reveals that HGCLMDA effectively models the similarity between mRNA-mRNA and drug-drug interactions.
Affiliation(s)
- Xiaowen Hu
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Yihan Dong
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Jiaxuan Zhang
  - Department of Electrical and Computer Engineering, University of California, San Diego, California 92092, United States
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha 410083, China
|
20
|
Liang S, Liu S, Song J, Lin Q, Zhao S, Li S, Li J, Liang S, Wang J. HMCDA: a novel method based on the heterogeneous graph neural network and metapath for circRNA-disease associations prediction. BMC Bioinformatics 2023; 24:335. [PMID: 37697297 PMCID: PMC10494331 DOI: 10.1186/s12859-023-05441-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/08/2023] [Indexed: 09/13/2023] Open
Abstract
Circular RNA (circRNA) is a type of non-coding RNA in which both ends are covalently linked. Researchers have demonstrated that many circRNAs can act as biomarkers of diseases. However, traditional experimental methods for identifying circRNA-disease associations are labor-intensive. In this work, we propose a novel method, termed HMCDA, based on the heterogeneous graph neural network and metapaths for circRNA-disease association prediction. First, a heterogeneous graph consisting of circRNA-disease associations, circRNA-miRNA associations, miRNA-disease associations and disease-disease associations is constructed. Then, six metapaths are defined and generated according to the biomedical pathways. Afterwards, entity content transformation, intra-metapath aggregation and inter-metapath aggregation are implemented to learn the embeddings of circRNA and disease entities. Finally, the learned embeddings are used to predict novel circRNA-disease associations. In particular, the results of extensive experiments demonstrate that HMCDA outperforms four state-of-the-art models in fivefold cross-validation. In addition, our case study indicates that HMCDA has the ability to identify novel circRNA-disease associations.
Affiliation(s)
- Shiyang Liang
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
  - Department of Internal Medicine, The No. 944 Hospital of Joint Logistic Support Force of PLA, Xiongguan Road, Jiuquan, China
- Siwei Liu
  - Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Junliang Song
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Qiang Lin
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shihong Zhao
  - Department of Respiratory Medicine, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shuaixin Li
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Jiahui Li
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China
- Shangsong Liang
  - Department of Machine Learning, Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates
- Jingjie Wang
  - Department of Gastroenterology, Tangdu Hospital, Air Force Medical University, Xinsi Road, Xi'an, China.
|
21
|
Wu Q, Deng Z, Zhang W, Pan X, Choi KS, Zuo Y, Shen HB, Yu DJ. MLNGCF: circRNA-disease associations prediction with multilayer attention neural graph-based collaborative filtering. Bioinformatics 2023; 39:btad499. [PMID: 37561093 PMCID: PMC10457666 DOI: 10.1093/bioinformatics/btad499] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/17/2023] [Accepted: 08/09/2023] [Indexed: 08/11/2023] Open
Abstract
MOTIVATION: CircRNAs play a critical regulatory role in physiological processes, and the abnormal expression of circRNAs can mediate the processes of diseases. Therefore, exploring circRNA-disease associations is gradually becoming an important area of research. Due to the high cost of validating circRNA-disease associations using traditional wet-lab experiments, novel computational methods based on machine learning are gaining more and more attention in this field. However, current computational methods suffer from insufficient consideration of latent features in circRNA-disease interactions. RESULTS: In this study, a multilayer attention neural graph-based collaborative filtering method (MLNGCF) is proposed. MLNGCF first enhances multiple sources of biological information with an autoencoder to form the initial features of circRNAs and diseases. Then, by constructing a central network of different diseases and circRNAs, multilayer cooperative attention-based message propagation is performed on the central network to obtain the high-order features of circRNAs and diseases. A neural network-based collaborative filtering model is constructed to predict the unknown circRNA-disease associations and update the model parameters. Experiments on the benchmark datasets demonstrate that MLNGCF outperforms state-of-the-art methods, and the prediction results are supported by the literature in the case studies. AVAILABILITY AND IMPLEMENTATION: The source codes and benchmark datasets of MLNGCF are available at https://github.com/ABard0/MLNGCF.
Affiliation(s)
- Qunzhuo Wu
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Zhaohong Deng
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Wei Zhang
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Xiaoyong Pan
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China
- Kup-Sze Choi
  - The Centre for Smart Health, The Hong Kong Polytechnic University, Hong Kong
- Yun Zuo
  - School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
- Hong-Bin Shen
  - Institute of Image Processing and Pattern Recognition, Shanghai Jiaotong University, Shanghai, China
- Dong-Jun Yu
  - School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, China
|
22
|
Yuan L, Zhao J, Shen Z, Zhang Q, Geng Y, Zheng CH, Huang DS. iCircDA-NEAE: Accelerated attribute network embedding and dynamic convolutional autoencoder for circRNA-disease associations prediction. PLoS Comput Biol 2023; 19:e1011344. [PMID: 37651321 PMCID: PMC10470932 DOI: 10.1371/journal.pcbi.1011344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Accepted: 07/10/2023] [Indexed: 09/02/2023] Open
Abstract
Accumulating evidence suggests that circRNAs play crucial roles in human diseases. CircRNA-disease association prediction is extremely helpful in understanding pathogenesis, diagnosis, and prevention, as well as in identifying relevant biomarkers. During the past few years, a large number of deep learning (DL) based methods have been proposed for predicting circRNA-disease associations and have achieved impressive prediction performance. However, these methods have two main drawbacks. First, they underutilize biometric information in the data. Second, the features they extract do not adequately represent the association characteristics between circRNAs and diseases. In this study, we developed a novel deep learning model, named iCircDA-NEAE, to predict circRNA-disease associations. In particular, we use disease semantic similarity, Gaussian interaction profile kernel, circRNA expression profile similarity, and Jaccard similarity simultaneously for the first time, and extract hidden features based on accelerated attribute network embedding (AANE) and dynamic convolutional autoencoder (DCAE). Experimental results on the circR2Disease dataset show that iCircDA-NEAE significantly outperforms other competing methods. In addition, 16 of the top 20 circRNA-disease pairs with the highest prediction scores were validated by relevant literature. Furthermore, we observe that iCircDA-NEAE can effectively predict new potential circRNA-disease associations.
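Two of the similarity measures named in this abstract, the Gaussian interaction profile (GIP) kernel and Jaccard similarity, can be sketched directly from a binary association matrix. The sketch below is illustrative only and is not taken from iCircDA-NEAE: the function names and toy profiles are hypothetical, and the bandwidth normalisation (dividing `gamma_prime` by the mean squared profile norm, with `gamma_prime = 1`) is the commonly used convention, assumed here rather than confirmed by the source.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a binary association matrix."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Assumed convention: bandwidth scaled by the average squared norm of the profiles.
    gamma = gamma_prime / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


def jaccard_similarity(profile_a, profile_b):
    """Shared associations divided by the union of associations (0.0 if both are empty)."""
    set_a = {k for k, v in enumerate(profile_a) if v}
    set_b = {k for k, v in enumerate(profile_b) if v}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


# Hypothetical toy data: 3 circRNAs (rows) x 3 diseases (columns).
toy_profiles = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
gip = gip_kernel(toy_profiles)
jac_01 = jaccard_similarity(toy_profiles[0], toy_profiles[1])  # one shared disease out of two
```

Both measures depend only on known associations, so they can be computed for circRNAs (rows) or, after transposing the matrix, for diseases.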
Affiliation(s)
- Lin Yuan
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Jiawang Zhao
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Zhen Shen
  - School of Computer and Software, Nanyang Institute of Technology, Nanyang, China
- Qinhu Zhang
  - Eastern Institute for Advanced Study, Eastern Institute of Technology, Ningbo, China
- Yushui Geng
  - Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Engineering Research Center of Big Data Applied Technology, Faculty of Computer Science and Technology, Qilu University of Technology (Shandong Academy of Sciences), Jinan, China
  - Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan, China
- Chun-Hou Zheng
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, School of Artificial Intelligence, Anhui University, Hefei, China
- De-Shuang Huang
  - Eastern Institute for Advanced Study, Eastern Institute of Technology, Ningbo, China
|
23
|
Hou J, Wei H, Liu B. iPiDA-SWGCN: Identification of piRNA-disease associations based on Supplementarily Weighted Graph Convolutional Network. PLoS Comput Biol 2023; 19:e1011242. [PMID: 37339125 DOI: 10.1371/journal.pcbi.1011242] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/23/2022] [Accepted: 06/05/2023] [Indexed: 06/22/2023] Open
Abstract
Accurately identifying potential piRNA-disease associations is of great importance in uncovering the pathogenesis of diseases. Recently, several machine-learning-based methods have been proposed for piRNA-disease association detection. However, they suffer from the high sparsity of the piRNA-disease association network and from a Boolean representation of piRNA-disease associations that ignores confidence coefficients. In this study, we propose a supplementarily weighted strategy to address these disadvantages. Combined with Graph Convolutional Networks (GCNs), a novel predictor called iPiDA-SWGCN is proposed for piRNA-disease association prediction. There are three main contributions of iPiDA-SWGCN: (i) Potential piRNA-disease associations are preliminarily supplemented in the sparse piRNA-disease network by integrating various basic predictors to enrich network structure information. (ii) The original Boolean piRNA-disease associations are assigned different relevance confidences so that node representations are learned from neighbour nodes to varying degrees. (iii) The experimental results show that iPiDA-SWGCN achieves the best performance compared with the other state-of-the-art methods, and can predict new piRNA-disease associations.
Affiliation(s)
- Jialu Hou
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
- Hang Wei
  - School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
|
24
|
Li F, Li PF, Hao XD. Circular RNAs in ferroptosis: regulation mechanism and potential clinical application in disease. Front Pharmacol 2023; 14:1173040. [PMID: 37332354 PMCID: PMC10272566 DOI: 10.3389/fphar.2023.1173040] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/24/2023] [Accepted: 05/25/2023] [Indexed: 06/20/2023] Open
Abstract
Ferroptosis, an iron-dependent non-apoptotic form of cell death, is reportedly involved in the pathogenesis of various diseases, particularly tumors, organ injury, and degenerative pathologies. Several signaling molecules and pathways have been found to be involved in the regulation of ferroptosis, including polyunsaturated fatty acid peroxidation, glutathione/glutathione peroxidase 4, the cysteine/glutamate antiporter system Xc-, ferroptosis suppressor protein 1/ubiquinone, and iron metabolism. An increasing amount of evidence suggests that circular RNAs (circRNAs), which have a stable circular structure, play important regulatory roles in the ferroptosis pathways that contribute to disease progression. Hence, ferroptosis-inhibiting and ferroptosis-stimulating circRNAs have potential as novel diagnostic markers or therapeutic targets for cancers, infarctions, organ injuries, and diabetes complications linked to ferroptosis. In this review, we summarize the roles that circRNAs play in the molecular mechanisms and regulatory networks of ferroptosis and their potential clinical applications in ferroptosis-related diseases. This review furthers our understanding of the roles of ferroptosis-related circRNAs and provides new perspectives on ferroptosis regulation and new directions for the diagnosis, treatment, and prognosis of ferroptosis-related diseases.
|
25
|
Lu C, Zhang L, Zeng M, Lan W, Duan G, Wang J. Inferring disease-associated circRNAs by multi-source aggregation based on heterogeneous graph neural network. Brief Bioinform 2023; 24:6960978. [PMID: 36572658 DOI: 10.1093/bib/bbac549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2022] [Revised: 11/03/2022] [Accepted: 11/11/2022] [Indexed: 12/28/2022] Open
Abstract
Emerging evidence has proved that circular RNAs (circRNAs) are implicated in pathogenic processes. They are regarded as promising biomarkers for diagnosis due to covalently closed loop structures. As opposed to traditional experiments, computational approaches can identify circRNA-disease associations at a lower cost. Aggregating multi-source pathogenesis data helps to alleviate data sparsity and infer potential associations at the system level. The majority of computational approaches construct a homologous network using multi-source data, but they lose the heterogeneity of the data. Effective methods that use the features of multi-source data are considered as a matter of urgency. In this paper, we propose a model (CDHGNN) based on edge-weighted graph attention and heterogeneous graph neural networks for potential circRNA-disease association prediction. The circRNA network, micro RNA network, disease network and heterogeneous network are constructed based on multi-source data. To reflect association probabilities between nodes, an edge-weighted graph attention network model is designed for node features. To assign attention weights to different types of edges and learn contextual meta-path, CDHGNN infers potential circRNA-disease association based on heterogeneous neural networks. CDHGNN outperforms state-of-the-art algorithms in terms of accuracy. Edge-weighted graph attention networks and heterogeneous graph networks have both improved performance significantly. Furthermore, case studies suggest that CDHGNN is capable of identifying specific molecular associations and investigating biomolecular regulatory relationships in pathogenesis. The code of CDHGNN is freely available at https://github.com/BioinformaticsCSU/CDHGNN.
Affiliation(s)
- Chengqian Lu
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China; School of Computer Science, Xiangtan University, Xiangtan, 411105, Hunan, China
- Lishen Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Min Zeng
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, 530004, Guangxi, China
- Guihua Duan
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
- Jianxin Wang
- School of Computer Science and Engineering, Central South University, Changsha, 410083, Hunan, China; Hunan Provincial Key Lab on Bioinformatics, Central South University, Changsha, 410083, Hunan, China
|
26
|
Lan W, Dong Y, Zhang H, Li C, Chen Q, Liu J, Wang J, Chen YPP. Benchmarking of computational methods for predicting circRNA-disease associations. Brief Bioinform 2023; 24:6972300. [PMID: 36611256 DOI: 10.1093/bib/bbac613] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/21/2022] [Revised: 10/29/2022] [Accepted: 12/11/2022] [Indexed: 01/09/2023]
Abstract
Accumulating evidence demonstrates that circular RNA (circRNA) plays an important role in human diseases. Identifying circRNA-disease associations can aid the diagnosis of human diseases, but the traditional approach based on biological experiments is time-consuming. To address this limitation, a series of computational methods have been proposed in recent years. However, few works have summarized these methods or compared their performance. In this paper, we divide the existing methods into three categories: information propagation, traditional machine learning and deep learning. The baseline methods in each category are then introduced in detail. Further, 5 different datasets are collected, and 14 representative methods of each category are selected and compared in 5-fold and 10-fold cross-validation and in the de novo experiment. To further evaluate the effectiveness of these methods, six common cancers are selected to compare the number of correctly identified circRNA-disease associations in the top-10, top-20, top-50, top-100 and top-200. In addition, based on the results, observations about the robustness and characteristics of these methods are drawn. Finally, future directions and challenges are discussed.
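The top-k comparison described in this abstract can be illustrated with a small sketch. It assumes that each candidate circRNA has a single predicted score for one disease and that the confirmed associations are available as a set; the function name `top_k_hits`, the identifiers and the scores are hypothetical and are not taken from the paper.

```python
def top_k_hits(scores, known_positives, ks=(10, 20, 50, 100, 200)):
    """Count confirmed associations among the k highest-scoring candidates."""
    # Rank candidates by descending score; ties are broken by id for determinism.
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return {k: sum(1 for c in ranked[:k] if c in known_positives) for k in ks}


# Toy data (hypothetical ids and scores, not from the paper).
scores = {"circA": 0.91, "circB": 0.15, "circC": 0.78, "circD": 0.42, "circE": 0.66}
known = {"circA", "circD", "circE"}
print(top_k_hits(scores, known, ks=(1, 3, 5)))  # {1: 1, 3: 2, 5: 3}
```

A k larger than the number of candidates simply counts every confirmed association in the list.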
Affiliation(s)
- Wei Lan
- School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Yi Dong
- School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Hongyu Zhang
- School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Chunling Li
- School of Computer, Electronic and Information and Guangxi Key Laboratory of Multimedia Communications and Network Technology, Guangxi University, Nanning, Guangxi 530004, China
- Qingfeng Chen
- School of Computer, Electronic and Information and State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangxi University, Nanning, Guangxi 530004, China
- Jin Liu
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Jianxin Wang
- Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, Changsha, Hunan 410083, China
- Yi-Ping Phoebe Chen
- Department of Computer Science and Information Technology, La Trobe University, Melbourne, Victoria 3086, Australia
|
27
|
Wang L, You ZH, Huang DS, Li JQ. MGRCDA: Metagraph Recommendation Method for Predicting CircRNA-Disease Association. IEEE Transactions on Cybernetics 2023; 53:67-75. [PMID: 34236991 DOI: 10.1109/tcyb.2021.3090756] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 06/13/2023]
Abstract
Clinical evidence has begun to accumulate suggesting that circRNAs can be novel therapeutic targets for various diseases and play a critical role in human health. However, because of the complex mechanism of circRNA, it is difficult to explore the relationship between disease and circRNA quickly and at large scale in wet-lab experiments. In this work, we design a new computational model, MGRCDA, based on metagraph recommendation theory to predict potential circRNA-disease associations. Specifically, we first treat the circRNA-disease association prediction problem as a recommendation-system problem and design a series of metagraphs according to the heterogeneous biological networks; we then extract the semantic information of the diseases and the Gaussian interaction profile kernel (GIPK) similarity of circRNAs and diseases as network attributes; finally, the iterative search of the metagraph recommendation algorithm is used to calculate the score of each circRNA-disease pair. On the gold standard dataset circR2Disease, MGRCDA achieved a prediction accuracy of 92.49% with an area under the ROC curve of 0.9298, which is significantly higher than other state-of-the-art models. Furthermore, among the top 30 disease-related circRNAs recommended by the model, 25 have been verified by recently published literature. The experimental results show that MGRCDA is feasible and efficient, and that it can recommend reliable candidates for further wet-lab experiments and narrow their scope.
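The Gaussian interaction profile kernel similarity mentioned in this abstract can be sketched as follows. This is a minimal illustration of the commonly used form of the kernel, exp(-gamma * ||IP(i) - IP(j)||^2), with the bandwidth normalised by the mean squared norm of the profiles; the function name `gip_kernel`, the parameter `gamma_prime` and the toy matrix are assumptions and are not taken from the MGRCDA paper.

```python
import math


def gip_kernel(adjacency, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a 0/1 association matrix."""
    n = len(adjacency)
    # Bandwidth is normalised by the mean squared norm of the interaction profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in adjacency) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(adjacency[i], adjacency[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim


# Rows are circRNAs, columns are diseases (toy values, not real data).
A = [[1, 0, 1],
     [1, 0, 1],
     [0, 1, 0]]
S = gip_kernel(A)
```

Applying the same function to the transposed matrix would give the disease-side similarity. Rows with identical profiles receive similarity 1, and similarity decays with the number of differing associations.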
|
28
|
Fu Y, Yang R, Zhang L. Association prediction of CircRNAs and diseases using multi-homogeneous graphs and variational graph auto-encoder. Comput Biol Med 2022; 151:106289. [PMID: 36401973 DOI: 10.1016/j.compbiomed.2022.106289] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/18/2022] [Revised: 10/19/2022] [Accepted: 11/06/2022] [Indexed: 11/12/2022]
Abstract
As a non-coding RNA molecule with a closed-loop structure, circular RNA (circRNA) is tissue-specific and cell-specific in its expression pattern. It regulates disease development by modulating the expression of disease-related genes. Therefore, exploring the circRNA-disease relationship can reveal the molecular mechanism of disease pathogenesis. Biological experiments for detecting circRNA-disease associations are time-consuming and laborious. Constrained by the sparsity of known circRNA-disease associations, existing algorithms cannot obtain sufficiently complete structural information to represent features accurately. To this end, this paper proposes a new predictor, VGAERF, combining Variational Graph Auto-Encoder (VGAE) and Random Forest (RF). First, a circRNA homogeneous graph structure and a disease homogeneous graph structure are constructed from Gaussian interaction profile (GIP) kernel similarity, semantic similarity, and known circRNA-disease associations. VGAEs with the same structure are employed to extract higher-order features by encoding and decoding the input graph structures. To further increase the completeness of the network structure information, the deep features acquired from the two VGAEs are summed, and an RF with sparse data processing capability is then trained to perform the prediction task. On the independent test set, the Area Under ROC Curve (AUC), accuracy, and Area Under PR Curve (AUPR) of the proposed method reach up to 0.9803, 0.9345, and 0.9894, respectively. On the same dataset, the AUC, accuracy, and AUPR of VGAERF are 2.09%, 5.93%, and 1.86% higher than those of the best-performing method (AEDNN). It is anticipated that VGAERF will provide significant information to decipher the molecular mechanisms of circRNA-disease associations, and promote the diagnosis of circRNA-related diseases.
Affiliation(s)
- Yao Fu
- The School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, 264209, China
- Runtao Yang
- The School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, 264209, China
- Lina Zhang
- The School of Mechanical, Electrical and Information Engineering, Shandong University, Weihai, 264209, China
|
29
|
Chen Y, Wang J, Wang C, Liu M, Zou Q. Deep learning models for disease-associated circRNA prediction: a review. Brief Bioinform 2022; 23:6696465. [PMID: 36130259 DOI: 10.1093/bib/bbac364] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 05/03/2022] [Revised: 07/30/2022] [Accepted: 08/03/2022] [Indexed: 12/14/2022]
Abstract
Emerging evidence indicates that circular RNAs (circRNAs) can provide new insights and potential therapeutic targets for disease diagnosis and treatment. However, traditional biological experiments are expensive and time-consuming. Recently, deep learning, with its more powerful capacity for representation learning, has become a promising technology for predicting disease-associated circRNAs. In this review, we introduce the most popular databases related to circRNA and summarize three types of deep learning-based circRNA-disease association prediction methods: feature-generation-based, type-discrimination and hybrid-based methods. We further evaluate seven representative models on a benchmark with ground truth for both balanced and imbalanced classification tasks. In addition, we discuss the advantages and limitations of each type of method and highlight suggested applications for future research.
Affiliation(s)
- Yaojia Chen
- College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China; Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Jiacheng Wang
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
- Chuyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
- Mingxin Liu
- College of Electronics and Information Engineering, Guangdong Ocean University, Zhanjiang, China
- Quan Zou
- University of Electronic Science and Technology of China, China
|
30
|
Li Y, Hu XG, Wang L, Li PP, You ZH. MNMDCDA: prediction of circRNA-disease associations by learning mixed neighborhood information from multiple distances. Brief Bioinform 2022; 23:6831006. [PMID: 36384071 DOI: 10.1093/bib/bbac479] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Revised: 09/25/2022] [Accepted: 10/10/2022] [Indexed: 11/18/2022]
Abstract
Emerging evidence suggests that circular RNA (circRNA) is an important regulator of a variety of pathological processes and serves as a promising biomarker for many complex human diseases. Nevertheless, there are relatively few known circRNA-disease associations, and uncovering new circRNA-disease associations by wet-lab methods is time consuming and costly. Considering the limitations of existing computational methods, we propose a novel approach named MNMDCDA, which combines high-order graph convolutional networks (high-order GCNs) and deep neural networks to infer associations between circRNAs and diseases. Firstly, we computed different biological attribute information of circRNA and disease separately and used them to construct multiple multi-source similarity networks. Then, we used the high-order GCN algorithm to learn feature embedding representations with high-order mixed neighborhood information of circRNA and disease from the constructed multi-source similarity networks, respectively. Finally, the deep neural network classifier was implemented to predict associations of circRNAs with diseases. The MNMDCDA model obtained AUC scores of 95.16%, 94.53%, 89.80% and 91.83% on four benchmark datasets, i.e., CircR2Disease, CircAtlas v2.0, Circ2Disease and CircRNADisease, respectively, using the 5-fold cross-validation approach. Furthermore, 25 of the top 30 circRNA-disease pairs with the best scores of MNMDCDA in the case study were validated by recent literature. Numerous experimental results indicate that MNMDCDA can be used as an effective computational tool to predict circRNA-disease associations and can provide the most promising candidates for biological experiments.
Affiliation(s)
- Yang Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Xue-Gang Hu
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China; College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
- Pei-Pei Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Zhu-Hong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China; School of Computer Science, Northwestern Polytechnical University, Xi'an Shaanxi 710129, China
|
31
|
Lan W, Dong Y, Chen Q, Liu J, Wang J, Chen YPP, Pan S. IGNSCDA: Predicting CircRNA-Disease Associations Based on Improved Graph Convolutional Network and Negative Sampling. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3530-3538. [PMID: 34506289 DOI: 10.1109/tcbb.2021.3111607] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/13/2023]
Abstract
Accumulating evidence has shown that circRNA plays an important role in human diseases. It can be used as a potential biomarker for the diagnosis and treatment of disease. Although some computational methods have been proposed to predict circRNA-disease associations, their performance still needs to be improved. In this paper, we propose a new computational model based on Improved Graph convolutional network and Negative Sampling to predict CircRNA-Disease Associations (IGNSCDA). Our method first constructs a heterogeneous network based on known circRNA-disease associations. Then, an improved graph convolutional network is designed to obtain the feature vectors of circRNAs and diseases. Further, a multi-layer perceptron is employed to predict circRNA-disease associations based on these feature vectors. In addition, a negative sampling method is employed to reduce the effect of noisy samples; it selects negative samples based on circRNA expression profile similarity and Gaussian Interaction Profile kernel similarity. 5-fold cross-validation is used to evaluate the performance of the method. The results show that IGNSCDA outperforms other state-of-the-art methods in prediction performance. Moreover, the case study shows that IGNSCDA is an effective tool for predicting potential circRNA-disease associations.
|
32
|
Bang D, Gu J, Park J, Jeong D, Koo B, Yi J, Shin J, Jung I, Kim S, Lee S. A Survey on Computational Methods for Investigation on ncRNA-Disease Association through the Mode of Action Perspective. Int J Mol Sci 2022; 23:ijms231911498. [PMID: 36232792 PMCID: PMC9570358 DOI: 10.3390/ijms231911498] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/26/2022] [Revised: 09/18/2022] [Accepted: 09/26/2022] [Indexed: 02/01/2023]
Abstract
Molecular and sequencing technologies have been successfully used in decoding biological mechanisms of various diseases. As revealed by many novel discoveries, the role of non-coding RNAs (ncRNAs) in understanding disease mechanisms is becoming increasingly important. Since ncRNAs primarily act as regulators of transcription, associating ncRNAs with diseases involves multiple inference steps. Leveraging the fast-accumulating high-throughput screening results, a number of computational models predicting ncRNA-disease associations have been developed. These tools suggest novel disease-related biomarkers or therapeutic targetable ncRNAs, contributing to the realization of precision medicine. In this survey, we first introduce the biological roles of different ncRNAs and summarize the databases containing ncRNA-disease associations. Then, we suggest a new trend in recent computational prediction of ncRNA-disease association, which is the mode of action (MoA) network perspective. This perspective includes integrating ncRNAs with mRNA, pathway and phenotype information. In the next section, we describe computational methodologies widely used in this research domain. Existing computational studies are then summarized in terms of their coverage of the MoA network. Lastly, we discuss the potential applications and future roles of the MoA network in terms of integrating biological mechanisms for ncRNA-disease associations.
Affiliation(s)
- Dongmin Bang
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Jeonghyeon Gu
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea
- Joonhyeong Park
- Department of Computer Science and Engineering, Seoul National University, Seoul 08826, Korea
- Dabin Jeong
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Bonil Koo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Jungseob Yi
- Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea
- Jihye Shin
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea
- Inuk Jung
- Department of Computer Science and Engineering, Kyungpook National University, Daegu 41566, Korea
- Sun Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul 08826, Korea; Interdisciplinary Program in Artificial Intelligence, Seoul National University, Seoul 08826, Korea; Department of Computer Science and Engineering, Seoul National University, Seoul 08826, Korea; MOGAM Institute for Biomedical Research, Yongin-si 16924, Korea
- Sunho Lee
- AIGENDRUG Co., Ltd., Seoul 08826, Korea
|
33
|
Chen Y, Hu Y, Hu X, Feng C, Chen M. CoGO: a contrastive learning framework to predict disease similarity based on gene network and ontology structure. Bioinformatics 2022; 38:4380-4386. [PMID: 35900147 DOI: 10.1093/bioinformatics/btac520] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2022] [Revised: 06/16/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION: Quantifying the similarity of human diseases provides guiding insights into the discovery of microscopic mechanisms from a macro scale. Previous work demonstrated that better performance can be gained by integrating multiview data sources or applying machine learning techniques. However, the design of an efficient framework that extracts and incorporates information from different biological data using deep learning models remains unexplored. RESULTS: We present CoGO, a Contrastive learning framework to predict disease similarity based on Gene network and Ontology structure, which incorporates the gene interaction network and gene ontology (GO) domain knowledge using graph deep learning models. First, graph deep learning models are applied to encode the features of genes and GO terms from separate graph structure data. Next, gene and GO features are projected to a common embedding space via a nonlinear projection. Then a cross-view contrastive loss is applied to maximize the agreement of corresponding gene-GO associations and obtain meaningful gene representations. Finally, CoGO infers the similarity between diseases by the cosine similarity of disease representation vectors derived from the related gene embeddings. In our experiments, CoGO outperforms the most competitive baseline method on both AUROC and AUPRC, with a particularly large improvement of 19.57% in AUPRC (0.7733). The prediction results are highly consistent with those of other disease similarity studies and thus credible. Furthermore, we conduct a detailed case study of the top-ranked similar disease pairs, which are supported by other studies. Empirical results show that CoGO achieves strong performance on the disease similarity problem. AVAILABILITY AND IMPLEMENTATION: https://github.com/yhchen1123/CoGO.
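The final step of this abstract, scoring disease pairs by the cosine similarity of vectors derived from gene embeddings, can be sketched as below. The abstract does not say how gene embeddings are pooled into a disease vector, so mean pooling is an assumption made here for illustration; the names `mean_vector`, `cosine`, the gene and disease identifiers and the two-dimensional embeddings are all hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equal-length vectors (assumed pooling, not confirmed by the paper)."""
    dim = len(vectors[0])
    return [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy gene embeddings and disease-gene sets (hypothetical values).
gene_embedding = {"G1": [1.0, 0.0], "G2": [0.0, 1.0], "G3": [1.0, 1.0]}
disease_genes = {"D1": ["G1", "G3"], "D2": ["G2", "G3"]}

disease_vec = {d: mean_vector([gene_embedding[g] for g in genes])
               for d, genes in disease_genes.items()}
sim_12 = cosine(disease_vec["D1"], disease_vec["D2"])  # approximately 0.8
```

In this toy case the two diseases share one of their two genes, which gives vectors [1.0, 0.5] and [0.5, 1.0] and a similarity of about 0.8.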
Affiliation(s)
- Yuhao Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Yanshi Hu
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Xiaotian Hu
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Cong Feng
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China
- Ming Chen
- Department of Bioinformatics, College of Life Sciences, Zhejiang University, Hangzhou, 310058, China; Biomedical Big Data Center, The First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, 310058, China; Institute of Hematology, Zhejiang University, Hangzhou, 310058, China
|
34
|
Su Q, Tan Q, Liu X, Wu L. Prioritizing potential circRNA biomarkers for bladder cancer and bladder urothelial cancer based on an ensemble model. Front Genet 2022; 13:1001608. [PMID: 36186429 PMCID: PMC9521272 DOI: 10.3389/fgene.2022.1001608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2022] [Accepted: 08/15/2022] [Indexed: 12/03/2022]
Abstract
Bladder cancer is the most common cancer of the urinary system. Bladder urothelial cancer accounts for 90% of bladder cancer. These two cancers have high morbidity and mortality rates worldwide. The identification of biomarkers for bladder cancer and bladder urothelial cancer helps in their diagnosis and treatment. circRNAs are considered oncogenes or tumor suppressors in cancers, and they play important roles in the occurrence and development of cancers. In this manuscript, we developed an Ensemble model, CDA-EnRWLRLS, to predict circRNA-Disease Associations (CDA) combining Random Walk with restart and Laplacian Regularized Least Squares, and further screen potential biomarkers for bladder cancer and bladder urothelial cancer. First, we compute disease similarity by combining the semantic similarity and association profile similarity of diseases and circRNA similarity by combining the functional similarity and association profile similarity of circRNAs. Second, we score each circRNA-disease pair by random walk with restart and Laplacian regularized least squares, respectively. Third, circRNA-disease association scores from these models are integrated to obtain the final CDAs by the soft voting approach. Finally, we use CDA-EnRWLRLS to screen potential circRNA biomarkers for bladder cancer and bladder urothelial cancer. CDA-EnRWLRLS is compared to three classical CDA prediction methods (CD-LNLP, DWNN-RLS, and KATZHCDA) and two individual models (CDA-RWR and CDA-LRLS), and obtains better AUC of 0.8654. We predict that circHIPK3 has the highest association with bladder cancer and may be its potential biomarker. In addition, circSMARCA5 has the highest association with bladder urothelial cancer and may be its possible biomarker.
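Two ingredients named in this abstract, random walk with restart and soft voting, can be sketched on a toy graph. The sketch assumes a single undirected network, a restart probability of 0.3, and soft voting implemented as a weighted average of per-model scores; none of these settings, nor the function names `random_walk_with_restart` and `soft_vote`, come from the paper, and the Laplacian regularized least squares model is replaced by a stand-in score list.

```python
def random_walk_with_restart(adj, seed, restart=0.3, tol=1e-12, max_iter=10000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until the update is below tol."""
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    # Column-normalise the adjacency matrix; isolated nodes keep an all-zero column.
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        converged = sum(abs(a - b) for a, b in zip(nxt, p)) < tol
        p = nxt
        if converged:
            break
    return p


def soft_vote(score_lists, weights=None):
    """Weighted average of per-model scores for the same candidates."""
    weights = weights or [1.0] * len(score_lists)
    total = sum(weights)
    return [sum(w * s[k] for w, s in zip(weights, score_lists)) / total
            for k in range(len(score_lists[0]))]


# Toy path graph 0 - 1 - 2 (hypothetical, not data from the paper).
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
rwr_scores = random_walk_with_restart(adj, seed=0)
other_scores = [0.2, 0.5, 0.3]  # stand-in for a second model's scores
combined = soft_vote([rwr_scores, other_scores])
```

The walk assigns the highest score to the seed node and lower scores to nodes farther away. In practice the scores of the two models would likely need to be brought to a common scale before averaging.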
|
35
|
Wang L, Wong L, Li Z, Huang Y, Su X, Zhao B, You Z. A machine learning framework based on multi-source feature fusion for circRNA-disease association prediction. Brief Bioinform 2022; 23:6693603. [PMID: 36070867 DOI: 10.1093/bib/bbac388] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/02/2022] [Revised: 07/26/2022] [Accepted: 08/11/2022] [Indexed: 11/14/2022]
Abstract
Circular RNAs (circRNAs) are involved in the regulatory mechanisms of multiple complex diseases, and the identification of their associations is critical to the diagnosis and treatment of diseases. In recent years, many computational methods have been designed to predict circRNA-disease associations. However, most of the existing methods rely on single correlation data. Here, we propose a machine learning framework for circRNA-disease association prediction, called MLCDA, which effectively fuses multiple sources of heterogeneous information including circRNA sequences and disease ontology. Comprehensive evaluation in the gold standard dataset showed that MLCDA can successfully capture the complex relationships between circRNAs and diseases and accurately predict their potential associations. In addition, the results of case studies on real data show that MLCDA significantly outperforms other existing methods. MLCDA can serve as a useful tool for circRNA-disease association prediction, providing mechanistic insights for disease research and thus facilitating the progress of disease treatment.
Affiliation(s)
- Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Leon Wong
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Zhengwei Li
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning, 530007, China
- Yuan Huang
- Department of Computing, Hong Kong Polytechnic University, Hong Kong 999077, China
- Xiaorui Su
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Bowei Zhao
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Zhuhong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China
|
36
|
Dai Q, Liu Z, Wang Z, Duan X, Guo M. GraphCDA: a hybrid graph representation learning framework based on GCN and GAT for predicting disease-associated circRNAs. Brief Bioinform 2022; 23:6692549. [PMID: 36070619 DOI: 10.1093/bib/bbac379] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/03/2022] [Revised: 07/18/2022] [Accepted: 08/09/2022] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Circular RNA (circRNA) is a class of noncoding RNA with high conservation and stability, and is considered an important disease biomarker and drug target. Accumulating evidence indicates that circRNA plays a crucial role in the pathogenesis and progression of many complex diseases. As biological experiments are time-consuming and labor-intensive, developing an accurate computational prediction method has become indispensable for identifying disease-related circRNAs. RESULTS: We present a hybrid graph representation learning framework, named GraphCDA, for predicting potential circRNA-disease associations. First, a circRNA-circRNA similarity network and a disease-disease similarity network are constructed to characterize the relationships among circRNAs and among diseases, respectively. Second, a hybrid graph embedding model combining Graph Convolutional Networks and Graph Attention Networks is introduced to learn the feature representations of circRNAs and diseases simultaneously. Finally, the learned representations are concatenated and used to build the prediction model for identifying circRNA-disease associations. A series of experimental results demonstrate that GraphCDA outperforms other state-of-the-art methods on several public databases. Moreover, GraphCDA achieves good performance when only a small number of known circRNA-disease associations are used as the training set. In addition, case studies conducted on several human diseases further confirm the capability of GraphCDA to predict potential disease-related circRNAs. In conclusion, extensive experimental results indicate that GraphCDA could serve as a reliable tool for exploring the regulatory role of circRNAs in complex diseases.
Affiliation(s)
- Qiguo Dai
- School of Computer Science and Engineering, Dalian Minzu University, 116600, Dalian, China; SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Ziqiang Liu
- School of Computer Science and Engineering, Dalian Minzu University, 116600, Dalian, China; SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Zhaowei Wang
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China; School of Computer Science and Technology, Dalian University of Technology, 116024, Dalian, China
- Xiaodong Duan
- SEAC Key Laboratory of Big Data Applied Technology, Dalian Minzu University, 116600, Dalian, China
- Maozu Guo
- School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, 100044, Beijing, China
|
37
|
Kouhsar M, Kashaninia E, Mardani B, Rabiee HR. CircWalk: a novel approach to predict CircRNA-disease association based on heterogeneous network representation learning. BMC Bioinformatics 2022; 23:331. [PMID: 35953785 PMCID: PMC9367077 DOI: 10.1186/s12859-022-04883-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/13/2022] [Accepted: 08/08/2022] [Indexed: 11/10/2022]
Abstract
Background: Several types of RNA in the cell are usually involved in biological processes with multiple functions. Coding RNAs code for proteins while non-coding RNAs regulate gene expression. Some single-strand RNAs can create a circular shape via the back splicing process and convert into a new type called circular RNA (circRNA). circRNAs are among the essential non-coding RNAs in the cell and are involved in multiple disorders. One of the critical functions of circRNAs is to regulate the expression of other genes through sponging micro RNAs (miRNAs) in diseases. This mechanism, known as the competing endogenous RNA (ceRNA) hypothesis, and additional information obtained from biological datasets can be used by computational approaches to predict novel associations between disease and circRNAs.
Results: We applied multiple classifiers to validate the extracted features from the heterogeneous network and selected the most appropriate one based on some evaluation criteria. Then, XGBoost is utilized in our pipeline to generate a novel approach, called CircWalk, to predict CircRNA-Disease associations. Our results demonstrate that CircWalk has reasonable accuracy and AUC compared with other state-of-the-art algorithms. We also use CircWalk to predict novel circRNAs associated with lung, gastric, and colorectal cancers as a case study. The results show that our approach can accurately detect novel circRNAs related to these diseases.
Conclusions: Considering the ceRNA hypothesis, we integrate multiple resources to construct a heterogeneous network from circRNAs, mRNAs, miRNAs, and diseases. Next, the DeepWalk algorithm is applied to the network to extract feature vectors for circRNAs and diseases. The extracted features are used to learn a classifier and generate a model to predict novel CircRNA-Disease associations. Our approach uses the concept of the ceRNA hypothesis and the miRNA sponge effect of circRNAs to predict their associations with diseases. Our results show that this outlook could help identify CircRNA-Disease associations more accurately.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04883-9.
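The walk-sampling step that DeepWalk relies on can be illustrated with a minimal sketch. This is not the CircWalk implementation: the node names, the toy network, and the function `random_walks` are hypothetical, and the later stages (skip-gram embedding of the walks and the XGBoost classifier) are not shown.

```python
import random


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Sample uniform random walks from every node of a graph.

    adjacency maps each node to a list of its neighbours. DeepWalk-style
    methods treat the resulting walks as "sentences" for a skip-gram model.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        nodes = list(adjacency)
        rng.shuffle(nodes)  # visit start nodes in a fresh random order each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy heterogeneous network with hypothetical node names (assumption).
toy_network = {
    "circRNA_1": ["miRNA_1"],
    "miRNA_1": ["circRNA_1", "mRNA_1", "disease_1"],
    "mRNA_1": ["miRNA_1", "disease_1"],
    "disease_1": ["miRNA_1", "mRNA_1"],
}
walks = random_walks(toy_network, walk_length=5, walks_per_node=2)
```

Each walk mixes circRNA, miRNA, mRNA, and disease nodes, which is how a single embedding space can cover all node types of the heterogeneous network.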
Affiliation(s)
- Morteza Kouhsar
- BCB Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Esra Kashaninia
- BCB Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
- Behnam Mardani
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran
- Hamid R Rabiee
- BCB Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
38
Wu Q, Deng Z, Pan X, Shen HB, Choi KS, Wang S, Wu J, Yu DJ. MDGF-MCEC: a multi-view dual attention embedding model with cooperative ensemble learning for CircRNA-disease association prediction. Brief Bioinform 2022; 23:6652197. [PMID: 35907779 DOI: 10.1093/bib/bbac289] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/03/2022] [Revised: 06/19/2022] [Accepted: 06/26/2022] [Indexed: 11/12/2022]
Abstract
Circular RNA (circRNA) is closely involved in physiological and pathological processes of many diseases. Discovering the associations between circRNAs and diseases is of great significance. Due to the high-cost to verify the circRNA-disease associations by wet-lab experiments, computational approaches for predicting the associations become a promising research direction. In this paper, we propose a method, MDGF-MCEC, based on multi-view dual attention graph convolution network (GCN) with cooperative ensemble learning to predict circRNA-disease associations. First, MDGF-MCEC constructs two disease relation graphs and two circRNA relation graphs based on different similarities. Then, the relation graphs are fed into a multi-view GCN for representation learning. In order to learn high discriminative features, a dual-attention mechanism is introduced to adjust the contribution weights, at both channel level and spatial level, of different features. Based on the learned embedding features of diseases and circRNAs, nine different feature combinations between diseases and circRNAs are treated as new multi-view data. Finally, we construct a multi-view cooperative ensemble classifier to predict the associations between circRNAs and diseases. Experiments conducted on the CircR2Disease database demonstrate that the proposed MDGF-MCEC model achieves a high area under curve of 0.9744 and outperforms the state-of-the-art methods. Promising results are also obtained from experiments on the circ2Disease and circRNADisease databases. Furthermore, the predicted associated circRNAs for hepatocellular carcinoma and gastric cancer are supported by the literature. The code and dataset of this study are available at https://github.com/ABard0/MDGF-MCEC.
Affiliation(s)
- Zhaohong Deng
- Jiangnan University, School of Artificial Intelligence and Computer Science, China
- Xiaoyong Pan
- Shanghai Jiao Tong University, Department of Automation, China
- Hong-Bin Shen
- Shanghai Jiao Tong University, Shanghai, China, Department of Automation, China
- Kup-Sze Choi
- Hong Kong Polytechnic University, School of Nursing, China
- Shitong Wang
- Jiangnan University, School of Artificial Intelligence and Computer Science, China
- Jing Wu
- Jiangnan University, State Key Laboratory of Food Science and Technology, China
- Dong-Jun Yu
- Nanjing University of Science and Technology, School of Computer Science and Engineering, China
39
Zheng J, Qian Y, He J, Kang Z, Deng L. Graph Neural Network with Self-Supervised Learning for Noncoding RNA-Drug Resistance Association Prediction. J Chem Inf Model 2022; 62:3676-3684. [PMID: 35838124 DOI: 10.1021/acs.jcim.2c00367] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Abstract
Noncoding RNA (ncRNA) is closely related to drug resistance. Identifying the association between ncRNA and drug resistance is of great significance for drug development. Methods based on biological experiments are often time-consuming and small-scale. Therefore, developing computational methods to identify the association between ncRNA and drug resistance is urgent. In this work, we develop a computational framework called GSLRDA to predict the association between ncRNA and drug resistance. First, the known ncRNA-drug resistance associations are modeled as a bipartite graph of ncRNA and drug. Then, GSLRDA uses the light graph convolutional network (lightGCN) to learn the vector representation of ncRNA and drug from the ncRNA-drug bipartite graph. In addition, GSLRDA uses different data augmentation methods to generate different views for ncRNA and drug nodes and performs self-supervised learning, further improving the quality of learned ncRNA and drug vector representations through contrastive learning between nodes. Finally, GSLRDA uses the inner product to predict the association between ncRNA and drug resistance. To the best of our knowledge, GSLRDA is the first to apply self-supervised learning in association prediction tasks in the field of bioinformatics. The experimental results show that GSLRDA achieves an AUC value of 0.9101, higher than the other eight state-of-the-art models. In addition, case studies including two drugs further illustrate the effectiveness of GSLRDA in predicting the association between ncRNA and drug resistance. The code and data sets of GSLRDA are available at https://github.com/JJZ-code/GSLRDA.
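The propagation and scoring steps named in this abstract can be sketched roughly as follows, assuming the standard LightGCN formulation (symmetrically normalized neighbour averaging with no weight matrices, and a uniform mean over layers). The function names and the toy embeddings are hypothetical, the self-supervised contrastive component is omitted, and this is not the GSLRDA code.

```python
import math


def lightgcn_propagate(edges, embeddings, num_layers):
    """Propagate embeddings over a bipartite ncRNA-drug graph, LightGCN style.

    edges is a list of (ncRNA, drug) pairs; embeddings maps node -> list of floats.
    Returns the mean of the layer-0..layer-K embeddings for every node.
    """
    neighbours = {node: [] for node in embeddings}
    for rna, drug in edges:
        neighbours[rna].append(drug)
        neighbours[drug].append(rna)

    layers = [embeddings]
    for _ in range(num_layers):
        previous = layers[-1]
        current = {}
        for node, nbrs in neighbours.items():
            aggregated = [0.0] * len(previous[node])
            for nbr in nbrs:
                # Symmetric normalization: 1 / sqrt(deg(node) * deg(neighbour)).
                weight = 1.0 / math.sqrt(len(nbrs) * len(neighbours[nbr]))
                for d, value in enumerate(previous[nbr]):
                    aggregated[d] += weight * value
            current[node] = aggregated
        layers.append(current)

    dim = len(next(iter(embeddings.values())))
    return {
        node: [sum(layer[node][d] for layer in layers) / len(layers) for d in range(dim)]
        for node in embeddings
    }


def association_score(final_embeddings, rna, drug):
    """Inner product between the final ncRNA and drug embeddings."""
    return sum(a * b for a, b in zip(final_embeddings[rna], final_embeddings[drug]))


# Toy example with hypothetical nodes and hand-picked initial embeddings.
toy_embeddings = {"ncRNA_1": [1.0, 0.0], "drug_1": [0.0, 1.0]}
final = lightgcn_propagate([("ncRNA_1", "drug_1")], toy_embeddings, num_layers=1)
toy_score = association_score(final, "ncRNA_1", "drug_1")
```

In a trained model the initial embeddings would be learned parameters; here they are fixed only to make the arithmetic easy to follow.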
Affiliation(s)
- Jingjing Zheng
- School of Software, Xinjiang University, Urumqi 830091, China
- Yurong Qian
- School of Software, Xinjiang University, Urumqi 830091, China
- Jie He
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Zerui Kang
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
- Lei Deng
- School of Software, Xinjiang University, Urumqi 830091, China; School of Computer Science and Engineering, Central South University, Changsha 410083, China
40
Cao R, He C, Wei P, Su Y, Xia J, Zheng C. Prediction of circRNA-Disease Associations Based on the Combination of Multi-Head Graph Attention Network and Graph Convolutional Network. Biomolecules 2022; 12:biom12070932. [PMID: 35883487 PMCID: PMC9313348 DOI: 10.3390/biom12070932] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/24/2022] [Revised: 06/22/2022] [Accepted: 06/30/2022] [Indexed: 11/16/2022]
Abstract
Circular RNAs (circRNAs) are covalently closed single-stranded RNA molecules, which have many biological functions. Previous experiments have shown that circRNAs are involved in numerous biological processes, especially regulatory functions. It has also been found that circRNAs are associated with complex diseases of human beings. Therefore, predicting the associations of circRNA with disease (called circRNA-disease associations) is useful for disease prevention, diagnosis and treatment. In this work, we propose a novel computational approach called GGCDA based on the Graph Attention Network (GAT) and Graph Convolutional Network (GCN) to predict circRNA-disease associations. Firstly, GGCDA combines circRNA sequence similarity, disease semantic similarity and corresponding Gaussian interaction profile kernel similarity, and then a random walk with restart algorithm (RWR) is used to obtain the preliminary features of circRNA and disease. Secondly, a heterogeneous graph is constructed from the known circRNA-disease association network and the calculated similarity of circRNAs and diseases. Thirdly, the multi-head Graph Attention Network (GAT) is adopted to obtain different weights of circRNA and disease features, and then GCN is employed to aggregate the features of adjacent nodes in the network and the features of the nodes themselves, so as to obtain multi-view circRNA and disease features. Finally, we combined a multi-layer fully connected neural network to predict the associations of circRNAs with diseases. In comparison with state-of-the-art methods, GGCDA can achieve AUC values of 0.9625 and 0.9485 under the results of fivefold cross-validation on two datasets, and AUC of 0.8227 on the independent test set. Case studies further demonstrate that our approach is promising for discovering potential circRNA-disease associations.
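Random walk with restart (RWR), used in this abstract to obtain preliminary features, is commonly written as the iteration p(t+1) = (1 - r) W p(t) + r p(0), where W is a column-normalized similarity matrix and r is the restart probability. The sketch below assumes that standard form; the function name, the restart value, and the toy matrix are illustrative assumptions, not the GGCDA settings.

```python
def random_walk_with_restart(similarity, seed_index, restart_prob=0.5,
                             tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W @ p + r * p0 until the L1 change is below tol.

    similarity is a square list-of-lists; W is its column-normalized version.
    """
    n = len(similarity)
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [similarity[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed_index else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart_prob) * sum(transition[i][j] * p[j] for j in range(n))
            + restart_prob * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy 3-node chain (0 - 1 - 2) standing in for a similarity network (assumption).
toy_similarity = [
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]
steady_state = random_walk_with_restart(toy_similarity, seed_index=0)
```

The steady-state vector for a seed node can serve as that node's feature vector: entries are larger for nodes that are closer to the seed in the network.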
Affiliation(s)
- Ruifen Cao
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Computer Science and Technology, Anhui University, Hefei 230601, China
- Correspondence: (R.C.); (C.Z.)
- Chuan He
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, School of Computer Science and Technology, Anhui University, Hefei 230601, China
- Pijing Wei
- Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China
- Yansen Su
- School of Artificial Intelligence, Anhui University, Hefei 230601, China
- Junfeng Xia
- Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601, China
- Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei 230601, China
- Correspondence: (R.C.); (C.Z.)
41
Li G, Lin Y, Luo J, Xiao Q, Liang C. GGAECDA: predicting circRNA-disease associations using graph autoencoder based on graph representation learning. Comput Biol Chem 2022; 99:107722. [DOI: 10.1016/j.compbiolchem.2022.107722] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2022] [Revised: 06/25/2022] [Accepted: 06/30/2022] [Indexed: 11/27/2022]
42
Wang Y, Wang LL, Wong L, Li Y, Wang L, You ZH. SIPGCN: A Novel Deep Learning Model for Predicting Self-Interacting Proteins from Sequence Information Using Graph Convolutional Networks. Biomedicines 2022; 10:biomedicines10071543. [PMID: 35884848 PMCID: PMC9313220 DOI: 10.3390/biomedicines10071543] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2022] [Revised: 06/24/2022] [Accepted: 06/24/2022] [Indexed: 11/16/2022]
Abstract
Protein is the basic organic substance that constitutes the cell and is the material condition for the life activity and the guarantee of the biological function activity. Elucidating the interactions and functions of proteins is a central task in exploring the mysteries of life. As an important protein interaction, self-interacting protein (SIP) has a critical role. The fast growth of high-throughput experimental techniques among biomolecules has led to a massive influx of available SIP data. How to conduct scientific research using the massive amount of SIP data has become a new challenge that is being faced in related research fields such as biology and medicine. In this work, we design an SIP prediction method SIPGCN using a deep learning graph convolutional network (GCN) based on protein sequences. First, protein sequences are characterized using a position-specific scoring matrix, which is able to describe the biological evolutionary message, then their hidden features are extracted by the deep learning method GCN, and, finally, the random forest is utilized to predict whether there are interrelationships between proteins. In the cross-validation experiment, SIPGCN achieved 93.65% accuracy and 99.64% specificity in the human data set. SIPGCN achieved 90.69% and 99.08% of these two indicators in the yeast data set, respectively. Compared with other feature models and previous methods, SIPGCN showed excellent results. These outcomes suggest that SIPGCN may be a suitable instrument for predicting SIP and may be a reliable candidate for future wet experiments.
Affiliation(s)
- Ying Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Lin-Lin Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Correspondence: (L.-L.W.); (L.W.)
- Leon Wong
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Yang Li
- School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
- Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277160, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Correspondence: (L.-L.W.); (L.W.)
- Zhu-Hong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China
43
Niu M, Zou Q, Wang C. GMNN2CD: identification of circRNA-disease associations based on variational inference and graph Markov neural networks. Bioinformatics 2022; 38:2246-2253. [PMID: 35157027 DOI: 10.1093/bioinformatics/btac079] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 09/23/2021] [Revised: 12/05/2021] [Accepted: 02/09/2022] [Indexed: 02/03/2023]
Abstract
Motivation: With the analysis of the characteristics and functions of circular RNAs (circRNAs), people have realized that they play a critical role in diseases. Exploring the relationship between circRNAs and diseases is of far-reaching significance for understanding the etiopathogenesis and treatment of diseases. Nevertheless, it is inefficient to learn new associations only through biotechnology.
Results: Consequently, we present a computational method, GMNN2CD, which employs a graph Markov neural network (GMNN) algorithm to predict unknown circRNA-disease associations. First, using verified associations, we calculate the semantic similarity and Gaussian interaction profile kernel similarity (GIPs) of the diseases and the GIPs of the circRNAs, and then merge them to form a unified descriptor. After that, GMNN2CD uses a fusion feature variational map autoencoder to learn deep features and uses a label propagation map autoencoder to propagate tags based on known associations. Based on variational inference, GMNN alternate training enhances the ability of GMNN2CD to obtain high-efficiency high-dimensional features from low-dimensional representations. Finally, 5-fold cross-validation of five benchmark datasets shows that GMNN2CD is superior to the state-of-the-art methods. Furthermore, case studies have shown that GMNN2CD can detect potential associations.
Availability and implementation: The source code and data are available at https://github.com/nmt315320/GMNN2CD.git.
Affiliation(s)
- Mengting Niu
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan 610000, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, Heilongjiang 150000, China
44
Yu W, Gu Q, Wu D, Zhang W, Li G, Lin L, Lowe JM, Hu S, Li TW, Zhou Z, Miao MZ, Gong Y, Zhao Y, Lu E. Identification of potentially functional circRNAs and prediction of circRNA-miRNA-mRNA regulatory network in periodontitis: Bridging the gap between bioinformatics and clinical needs. J Periodontal Res 2022; 57:594-614. [PMID: 35388494 PMCID: PMC9325354 DOI: 10.1111/jre.12989] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/22/2021] [Revised: 03/19/2022] [Accepted: 03/23/2022] [Indexed: 02/06/2023]
Abstract
Background and Objective: Periodontitis is a multifactorial chronic inflammatory disease that can lead to the irreversible destruction of dental support tissues. As an epigenetic factor, the expression of circRNA is tissue‐dependent and disease‐dependent. This study aimed to identify novel periodontitis‐associated circRNAs and predict relevant circRNA‐periodontitis regulatory network by using recently developed bioinformatic tools and integrating sequencing profiling with clinical information for getting a better and more thorough image of periodontitis pathogenesis, from gene to clinic.
Material and Methods: High‐throughput sequencing and RT‐qPCR were conducted to identify differentially expressed circRNAs in gingival tissues from periodontitis patients. The relationship between upregulated circRNAs expression and probing depth (PD) was performed using Spearman's correlation analysis. Bioinformatic analyses including GO analysis, circRNA‐disease association prediction, and circRNA‐miRNA‐mRNA network prediction were performed to clarify potential regulatory functions of identified circRNAs in periodontitis. A receiver‐operating characteristic (ROC) curve was established to assess the diagnostic significance of identified circRNAs.
Results: High‐throughput sequencing identified 70 differentially expressed circRNAs (68 upregulated and 2 downregulated circRNAs) in human periodontitis (fold change >2.0 and p < .05). The top five upregulated circRNAs were validated by RT‐qPCR that had strong associations with multiple human diseases, including periodontitis. The upregulation of circRNAs were positively correlated with PD (R = .40–.69, p < .05, moderate). A circRNA‐miRNA‐mRNA network with the top five upregulated circRNAs, differentially expressed mRNAs, and overlapped predicted miRNAs indicated potential roles of circRNAs in immune response, cell apoptosis, migration, adhesion, and reaction to oxidative stress. The ROC curve showed that circRNAs had potential value in periodontitis diagnosis (AUC = 0.7321–0.8667, p < .05).
Conclusion: CircRNA‐disease associations were predicted by online bioinformatic tools. Positive correlation between upregulated circRNAs, circPTP4A2, chr22:23101560‐23135351+, circARHGEF28, circBARD1 and circRASA2, and PD suggested function of circRNAs in periodontitis. Network prediction further focused on downstream targets regulated by circRNAs during periodontitis pathogenesis.
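The AUC values reported in this and neighbouring abstracts have a simple probabilistic reading: the AUC is the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. A minimal sketch of that computation follows; the labels and scores are made-up toy values, not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels holds 1 (case) or 0 (control); scores holds the measured value,
    for example a circRNA expression level (hypothetical use).
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))


# Toy data: two cases and two controls (assumption, for illustration only).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 3 of 4 case/control pairs are ordered correctly
```

The pairwise form is quadratic in the sample size, which is fine for small clinical cohorts; rank-based formulas give the same value more efficiently on large datasets.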
Affiliation(s)
- Weijun Yu
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China
- Qisheng Gu
- Key Laboratory of Molecular Virology and Immunology, Institut Pasteur of Shanghai, Chinese Academy of Sciences, Shanghai, China; Department of Immunology, Bio Sorbonne Paris Cité, University of Paris, Paris, France
- Di Wu
- Division of Oral and Craniofacial Biomedicine, University of North Carolina Adams School of Dentistry, Chapel Hill, North Carolina, USA; Department of Biostatistics, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Weiqi Zhang
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Gang Li
- Department of Genome Sciences, University of Washington, Seattle, Washington, USA
- Lu Lin
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Jared M Lowe
- Department of Chemistry, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Shucheng Hu
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Tia Wenjun Li
- Division of Oral and Craniofacial Biomedicine, University of North Carolina Adams School of Dentistry, Chapel Hill, North Carolina, USA; Gene Therapy Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Zhen Zhou
- Center for Biomedical Image Computing and Analytics, Department of Radiology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Michael Z Miao
- Division of Oral and Craniofacial Biomedicine, University of North Carolina Adams School of Dentistry, Chapel Hill, North Carolina, USA
- Yuhua Gong
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Yifei Zhao
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Eryi Lu
- Department of Stomatology, Renji Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China; College of Stomatology, Shanghai Jiao Tong University, Shanghai, China
45
Chen Y, Wang Y, Ding Y, Su X, Wang C. RGCNCDA: Relational graph convolutional network improves circRNA-disease association prediction by incorporating microRNAs. Comput Biol Med 2022; 143:105322. [PMID: 35217342 DOI: 10.1016/j.compbiomed.2022.105322] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/05/2022] [Revised: 02/11/2022] [Accepted: 02/13/2022] [Indexed: 12/21/2022]
Abstract
Recently, a large number of studies have indicated that circRNAs with covalently closed loops play important roles in biological processes and have potential as diagnostic biomarkers. Therefore, research on the circRNA-disease relationship is helpful in disease diagnosis and treatment. However, traditional biological verification methods require considerable labor and time costs. In this paper, we propose a new computational method (RGCNCDA) to predict circRNA-disease associations based on relational graph convolutional networks (R-GCNs). The method first integrates the circRNA similarity network, miRNA similarity network, disease similarity network and association networks among them to construct a global heterogeneous network. Then, it employs the random walk with restart (RWR) and principal component analysis (PCA) models to learn low-dimensional and high-order information from the global heterogeneous network as the topological features. Finally, a prediction model based on an R-GCN encoder and a DistMult decoder is built to predict the potential disease-associated circRNA. The predicted results demonstrate that RGCNCDA performs significantly better than the other six state-of-the-art methods in a 5-fold cross validation. Furthermore, the case study illustrates that RGCNCDA can effectively discover potential circRNA-disease associations.
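The DistMult decoder mentioned in this abstract scores a (subject, relation, object) triple with a three-way product, score = sum over k of e_s[k] * w_r[k] * e_o[k], where w_r is a relation-specific diagonal weight vector. The sketch below assumes that standard definition; the embeddings and relation weights are hand-picked toy values, and the sigmoid at the end is one common way to map the score to a probability, not necessarily the choice made in RGCNCDA.

```python
import math


def distmult_score(subject_emb, relation_weights, object_emb):
    """DistMult score: element-wise product of the three vectors, summed."""
    return sum(s * r * o for s, r, o in zip(subject_emb, relation_weights, object_emb))


def sigmoid(x):
    """Map a raw score to the (0, 1) range."""
    return 1.0 / (1.0 + math.exp(-x))


# Hypothetical encoder outputs for one circRNA and one disease (assumption).
circrna_embedding = [1.0, 2.0]
disease_embedding = [2.0, 1.0]
association_relation = [0.5, 1.0]  # diagonal weights for the "associated with" relation

raw_score = distmult_score(circrna_embedding, association_relation, disease_embedding)
probability = sigmoid(raw_score)
```

Because the product is symmetric in subject and object, DistMult gives the same score for (circRNA, relation, disease) and (disease, relation, circRNA), which suits undirected association prediction.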
Affiliation(s)
- Yaojia Chen
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yanpeng Wang
- Beidahuang Industry Group General Hospital, Harbin, China
- Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Xi Su
- Foshan Maternity & Child Healthcare Hospital, Southern Medical University, Foshan, China
- Chunyu Wang
- Faculty of Computing, Harbin Institute of Technology, Harbin, China
46
Zhang HY, Wang L, You ZH, Hu L, Zhao BW, Li ZW, Li YM. iGRLCDA: identifying circRNA-disease association based on graph representation learning. Brief Bioinform 2022; 23:6552271. [PMID: 35323894 DOI: 10.1093/bib/bbac083] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/19/2022] [Revised: 02/16/2022] [Accepted: 02/17/2022] [Indexed: 12/18/2022]
Abstract
While the technologies of ribonucleic acid-sequence (RNA-seq) and transcript assembly analysis have continued to improve, a novel topology of RNA transcript was uncovered in the last decade and is called circular RNA (circRNA). Recently, researchers have revealed that circRNAs compete with messenger RNA (mRNA) and long noncoding RNA for binding with microRNA in gene regulation. Therefore, circRNA was assumed to be associated with complex disease, and discovering the relationship between them would contribute to medical research. However, identifying the association between circRNA and disease in vitro takes a long time and usually lacks direction. In recent years, more and more associations have been verified by experiments. Hence, we proposed a computational method named identifying circRNA-disease association based on graph representation learning (iGRLCDA) for the prediction of the potential association of circRNA and disease, which utilized a deep learning model of graph convolution network (GCN) and graph factorization (GF). In detail, iGRLCDA first derived the hidden feature of known associations between circRNA and disease using the Gaussian interaction profile (GIP) kernel combined with disease semantic information to form a numeric descriptor. After that, it further used the deep learning model of GCN and GF to extract hidden features from the descriptor. Finally, the random forest classifier is introduced to identify the potential circRNA-disease association. The five-fold cross-validation of iGRLCDA shows strong competitiveness in comparison with other excellent prediction models on the gold standard data and achieved an average area under the receiver operating characteristic curve of 0.9289 and an area under the precision-recall curve of 0.9377. On reviewing the prediction results against the relevant literature, 22 of the top 30 predicted circRNA-disease associations were noted in recently published papers. These exceptional results make us believe that iGRLCDA can provide reliable circRNA-disease associations for medical research and reduce the blindness of wet-lab experiments.
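The Gaussian interaction profile (GIP) kernel used here and in several other entries is usually defined on the rows of a binary association matrix: K(i, j) = exp(-gamma * ||A_i - A_j||^2), with gamma set to a bandwidth constant divided by the mean squared norm of the profiles. The sketch below assumes that common definition with a bandwidth constant of 1; the function name and the toy matrix are hypothetical, and the exact settings of iGRLCDA may differ.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of an association matrix.

    profiles[i] is the 0/1 association vector of entity i (for example one
    circRNA against all diseases). gamma_prime is the bandwidth constant.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Toy association matrix: 3 circRNAs (rows) x 3 diseases (columns), made-up values.
toy_profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
similarity = gip_kernel(toy_profiles)
```

Entities with identical known associations get similarity 1, and the similarity decays as their association profiles diverge, which is why the kernel is a convenient feature source when no sequence or expression data is available.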
Affiliation(s)
- Han-Yuan Zhang
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Lei Wang
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China; College of Information Science and Engineering, Zaozhuang University, Shandong 277100, China
- Zhu-Hong You
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Bo-Wei Zhao
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Zheng-Wei Li
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Sciences, Nanning 530007, China
- Yang-Ming Li
- College of Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
47
Liu Q, Yu J, Cai Y, Zhang G, Dai X. SAAED: Embedding and Deep Learning Enhance Accurate Prediction of Association Between circRNA and Disease. Front Genet 2022; 13:832244. [PMID: 35273640 PMCID: PMC8902643 DOI: 10.3389/fgene.2022.832244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2021] [Accepted: 01/17/2022] [Indexed: 11/13/2022]
Abstract
Emerging evidence indicates that circRNA can regulate various diseases. However, the mechanisms of circRNA in these diseases have not been fully understood. Therefore, detecting potential circRNA–disease associations has far-reaching significance for pathological development and treatment of these diseases. In recent years, deep learning models are used in association analysis of circRNA–disease, but a lack of circRNA–disease association data limits further improvement. Therefore, there is an urgent need to mine more semantic information from data. In this paper, we propose a novel method called Semantic Association Analysis by Embedding and Deep learning (SAAED), which consists of two parts, a neural network embedding model called Entity Relation Network (ERN) and a Pseudo-Siamese network (PSN) for analysis. ERN can fuse multiple sources of data and express the information with low-dimensional embedding vectors. PSN can extract the feature between circRNA and disease for the association analysis. CircRNA–disease, circRNA–miRNA, disease–gene, disease–miRNA, disease–lncRNA, and disease–drug association information are used in this paper. More association data can be introduced for analysis without restriction. Based on the CircR2Disease benchmark dataset for evaluation, a fivefold cross-validation experiment showed an AUC of 98.92%, an accuracy of 95.39%, and a sensitivity of 93.06%. Compared with other state-of-the-art models, SAAED achieves the best overall performance. SAAED can expand the expression of the biological related information and is an efficient method for predicting potential circRNA–disease association.
Affiliation(s)
- Qingyu Liu
- School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, China
- Junjie Yu
- Macquarie Business School, Macquarie University, Sydney, NSW, Australia
- Yanning Cai
- College of Information Science and Technology, Jinan University, Guangzhou, China
- Guishan Zhang
- College of Engineering, Shantou University, Shantou, China
- Xianhua Dai
- School of Electronics and Information Technology, Sun Yat-Sen University, Guangzhou, China
48
Li G, Wang D, Zhang Y, Liang C, Xiao Q, Luo J. Using Graph Attention Network and Graph Convolutional Network to Explore Human CircRNA-Disease Associations Based on Multi-Source Data. Front Genet 2022; 13:829937. [PMID: 35198012 PMCID: PMC8859418 DOI: 10.3389/fgene.2022.829937] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022]
Abstract
Accumulating studies have verified that multiple circRNAs are closely associated with pathogenic mechanisms and cellular-level processes. Exploring human circRNA-disease relationships is important for deciphering pathogenic mechanisms and informing treatment plans. At present, several computational models have been designed to infer potential relationships between diseases and circRNAs. However, the majority of existing approaches cannot effectively use multisource data and perform poorly on sparse networks. In this study, we develop an advanced method, GATGCN, using a graph attention network (GAT) and a graph convolutional network (GCN) to detect potential circRNA-disease relationships. First, several sources of biomedical information are fused via the centered kernel alignment (CKA) model, which calculates the corresponding weight of each kernel. Second, we adopt the graph attention network to learn latent representations of diseases and circRNAs. Third, the graph convolutional network is deployed to extract features of associations by aggregating the feature vectors of neighbors. GATGCN achieves an AUC of 0.951 under leave-one-out cross-validation and an AUC of 0.932 under 5-fold cross-validation. Furthermore, case studies on lung cancer, diabetic retinopathy, and prostate cancer verify the reliability of GATGCN for detecting latent circRNA-disease pairs.
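The neighbor-aggregation step described in this abstract can be sketched as a single graph convolution. The sketch assumes the widely used symmetric-normalized propagation rule, ReLU(D^-1/2 (A + I) D^-1/2 H W); the abstract does not state which variant GATGCN uses, and the function name `gcn_layer` and the toy graph are illustrative only.

```python
import math


def gcn_layer(adj, feats, weight):
    """One graph-convolution step in pure Python (assumed standard form).

    adj    : n x n adjacency matrix (list of lists, 0/1 entries)
    feats  : n x d node feature matrix
    weight : d x o weight matrix (learned in practice, fixed here)
    """
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    # Symmetric normalisation: a_hat[i][j] / sqrt(deg_i * deg_j).
    norm = [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    d = len(feats[0])
    # Aggregate the feature vectors of each node's neighbours.
    agg = [[sum(norm[i][k] * feats[k][c] for k in range(n)) for c in range(d)]
           for i in range(n)]
    # Linear transform followed by ReLU.
    return [[max(0.0, sum(agg[i][c] * weight[c][o] for c in range(d)))
             for o in range(len(weight[0]))]
            for i in range(n)]


# Toy graph: two connected nodes with one-hot features, identity weights.
adj = [[0, 1], [1, 0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
print(gcn_layer(adj, feats, identity))
```

On this toy graph, each node ends up with the average of its own and its neighbor's features, which is the smoothing effect that lets information spread across a sparse association network.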
Affiliation(s)
- Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China
- Diancheng Wang
- School of Information Engineering, East China Jiaotong University, Nanchang, China
- Yuejin Zhang
- School of Information Engineering, East China Jiaotong University, Nanchang, China
- Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
- Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
49
|
Lan W, Dong Y, Chen Q, Zheng R, Liu J, Pan Y, Chen YPP. KGANCDA: predicting circRNA-disease associations based on knowledge graph attention network. Brief Bioinform 2021; 23:6447436. [PMID: 34864877 DOI: 10.1093/bib/bbab494] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 08/29/2021] [Revised: 10/12/2021] [Accepted: 10/26/2021] [Indexed: 12/31/2022]
Abstract
Increasing evidence has shown that circRNA plays a significant role in the development of many diseases. In addition, many studies have shown that circRNA can be considered a potential biomarker for the clinical diagnosis and treatment of disease. Some computational methods have been proposed to predict circRNA-disease associations. However, the performance of these methods is limited by the sparsity of low-order interaction information. In this paper, we propose a new computational method (KGANCDA) to predict circRNA-disease associations based on a knowledge graph attention network. The circRNA-disease knowledge graphs are constructed by collecting multiple relationship data among circRNA, disease, miRNA and lncRNA. Then, the knowledge graph attention network is designed to obtain embeddings of each entity by distinguishing the importance of information from neighbors. Besides the low-order neighbor information, it can also capture high-order neighbor information from multisource associations, which alleviates the problem of data sparsity. Finally, a multilayer perceptron is applied to predict the affinity score of circRNA-disease associations based on the embeddings of circRNA and disease. The experimental results show that KGANCDA outperforms other state-of-the-art methods in 5-fold cross-validation. Furthermore, the case study demonstrates that KGANCDA is an effective tool for predicting potential circRNA-disease associations.
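The idea of "distinguishing the importance of information from neighbors" can be sketched as softmax attention over a node's neighbors. This is a simplified illustration: the dot-product scoring function is an assumption (the abstract does not give KGANCDA's scoring function, and knowledge graph attention networks typically also use relation-specific terms), and `attention_aggregate` is a hypothetical name.

```python
import math


def attention_aggregate(center, neighbors):
    """Weight each neighbour embedding by softmax attention and sum them.

    Scores are plain dot products between the centre entity and each
    neighbour (an assumed, simplified scoring function).
    """
    scores = [sum(c * x for c, x in zip(center, nb)) for nb in neighbors]
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(center)
    aggregated = [sum(w * nb[d] for w, nb in zip(weights, neighbors))
                  for d in range(dim)]
    return weights, aggregated


# Hypothetical embeddings: one circRNA and two neighbouring entities.
circ = [1.0, 0.0]
neighbours = [[1.0, 0.0], [0.0, 1.0]]
weights, summary = attention_aggregate(circ, neighbours)
print(weights, summary)
```

The neighbor whose embedding is more similar to the center entity receives the larger weight, so its information dominates the aggregated representation.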
Affiliation(s)
- Wei Lan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Yi Dong
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Qingfeng Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Ruiqing Zheng
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Jin Liu
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Yi Pan
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
- Yi-Ping Phoebe Chen
- School of Computer, Electronic and Information, Guangxi University, Nanning, China
|
50
|
Ma Z, Kuang Z, Deng L. CRPGCN: predicting circRNA-disease associations using graph convolutional network based on heterogeneous network. BMC Bioinformatics 2021; 22:551. [PMID: 34772332 PMCID: PMC8588735 DOI: 10.1186/s12859-021-04467-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 09/02/2021] [Accepted: 11/01/2021] [Indexed: 12/16/2022]
Abstract
BACKGROUND Existing studies show that circRNAs can be used as biomarkers of diseases and play a prominent role in the treatment and diagnosis of diseases. However, the relationships between the vast majority of circRNAs and diseases are still unclear, and more experiments are needed to study the mechanisms of circRNAs. Some scholars now use the attributes of circRNAs and diseases to study and predict their associations. Nonetheless, most of the existing experimental methods use little information about the attributes of circRNAs, which affects the accuracy of the final prediction results. On the other hand, some scholars also apply experimental methods to predict the associations between circRNAs and diseases, but such methods are usually expensive and time-consuming. Given these shortcomings, follow-up research is needed to propose a more efficient computational method to predict the associations between circRNAs and diseases. RESULTS In this study, a novel algorithm, CRPGCN, is proposed to predict the associations between circRNAs and diseases. It is based on a Graph Convolutional Network (GCN) constructed with Random Walk with Restart (RWR) and Principal Component Analysis (PCA). In the construction of CRPGCN, the RWR algorithm is used to improve the similarity associations of the computed nodes with their neighbours. After that, PCA is used to reduce dimensionality and extract features, which brings circRNAs with higher similarity closer to the related diseases. Finally, the GCN algorithm is used to learn the features between circRNAs and diseases and to calculate the final similarity scores. The learning data are constructed from the adjacency matrix, similarity matrix and feature matrix as a heterogeneous adjacency matrix and a heterogeneous feature matrix.
CONCLUSIONS After 2-fold, 5-fold and 10-fold cross-validation, the area under the ROC curve of CRPGCN is 0.9490, 0.9720 and 0.9722, respectively. The CRPGCN method is valuable for predicting the associations between circRNAs and diseases.
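The Random Walk with Restart step named in this abstract can be illustrated with a minimal sketch of the common iteration p(t+1) = (1 - r) W p(t) + r p(0), where W is the column-normalized adjacency matrix. The restart probability `restart=0.5`, the tolerance, and the toy path graph are assumptions chosen for illustration, not values from the CRPGCN paper.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk from `seed`.

    adj     : n x n adjacency (or similarity) matrix as a list of lists
    seed    : index of the node where the walk starts and restarts
    restart : probability of jumping back to the seed (assumed value)
    """
    n = len(adj)
    # Column-normalise so each non-empty column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2, walking from node 0.
path = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
print(random_walk_with_restart(path, seed=0))
```

Nodes closer to the seed receive higher scores, which is why RWR is often used to sharpen similarity between a node and its neighbours before feature extraction.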
Affiliation(s)
- Zhihao Ma
- School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China
- Zhufang Kuang
- School of Computer and Information Engineering, Central South University of Forestry and Technology, Changsha, China
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, China
|