1
Ma Y, Shi Y, Chen X, Zhang B, Wu H, Gao J. NFMCLDA: Predicting miRNA-based lncRNA-disease associations by network fusion and matrix completion. Comput Biol Med 2024; 174:108403. [PMID: 38582002 DOI: 10.1016/j.compbiomed.2024.108403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/28/2024] [Accepted: 04/01/2024] [Indexed: 04/08/2024]
Abstract
In recent years, emerging evidence has revealed a strong association between the dysregulation of long non-coding RNAs (lncRNAs) and complex human diseases. Biological experiments can identify such associations, but they are costly and time-consuming. Developing high-quality computational methods is therefore a challenging and urgent task in bioinformatics. This paper proposes a new lncRNA-disease association inference approach, NFMCLDA (Network Fusion and Matrix Completion lncRNA-Disease Association), which can effectively integrate multi-source association data. In this approach, miRNA information is used as the transition path, and an unbalanced random walk on a three-layer heterogeneous network is adopted in the preprocessing step. As a result, more effective information between networks can be mined and the sparsity problem of the association matrix can be addressed. Finally, a matrix completion method predicts the associations. The results show that NFMCLDA provides more accurate lncRNA-disease association predictions than state-of-the-art methods. The areas under the receiver operating characteristic curve are 0.9648 and 0.9713 under 5-fold and 10-fold cross-validation, respectively. Data from published case studies on four diseases (lung cancer, osteosarcoma, cervical cancer, and colon cancer) confirmed the reliable predictive potential of the NFMCLDA model.
Affiliation(s)
- Yibing Ma
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Yongle Shi
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Xiang Chen
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Bai Zhang
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Hanwen Wu
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China
- Jie Gao
  - School of Science, Jiangnan University, Wuxi, Jiangsu, 214122, China.
2
Liu Y, Zhang R, Dong X, Yang H, Li J, Cao H, Tian J, Zhang Y. DAE-CFR: detecting microRNA-disease associations using deep autoencoder and combined feature representation. BMC Bioinformatics 2024; 25:139. [PMID: 38553698 PMCID: PMC10981315 DOI: 10.1186/s12859-024-05757-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 03/20/2024] [Indexed: 04/01/2024] Open
Abstract
BACKGROUND MicroRNA (miRNA) has been shown to play a key role in the occurrence and progression of diseases, making the discovery of miRNA-disease associations vital for disease prevention and therapy. However, traditional laboratory methods for detecting these associations are slow, strenuous, expensive, and uncertain. Although numerous advanced algorithms have emerged, it is still a challenge to develop more effective methods to explore underlying miRNA-disease associations. RESULTS In this study, we designed a novel approach based on a deep autoencoder and combined feature representation (DAE-CFR) to predict possible miRNA-disease associations. We began by creating integrated similarity matrices of miRNAs and diseases, performing a logistic function transformation, balancing positive and negative samples with k-means clustering, and constructing training samples. Then, a deep autoencoder was used to extract low-dimensional features from two kinds of feature representations for miRNAs and diseases, namely, original association information-based and similarity information-based. Next, we combined the resulting features for each miRNA-disease pair and used a logistic regression (LR) classifier to infer all unknown miRNA-disease interactions. Under five- and tenfold cross-validation (CV) frameworks, DAE-CFR not only outperformed six popular algorithms and nine classifiers, but also demonstrated superior performance on an additional dataset. Furthermore, case studies on three diseases (myocardial infarction, hypertension and stroke) confirmed the validity of DAE-CFR in practice. CONCLUSIONS DAE-CFR achieved outstanding performance in predicting miRNA-disease associations and can provide evidence to inform biological experiments and clinical therapy.
Affiliation(s)
- Yanling Liu
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
  - Department of Mathematics, Changzhi Medical College, Changzhi, China
- Ruiyan Zhang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Xiaojing Dong
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hong Yang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Li
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hongyan Cao
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jing Tian
  - Department of Cardiology, First Hospital of Shanxi Medical University, Taiyuan, China.
- Yanbo Zhang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China.
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China.
  - School of Health and Service Management, Shanxi University of Chinese Medicine, Jinzhong, China.
3
Du XX, Liu Y, Wang B, Zhang JF. lncRNA-disease association prediction method based on the nearest neighbor matrix completion model. Sci Rep 2022; 12:21653. [PMID: 36522410 PMCID: PMC9755128 DOI: 10.1038/s41598-022-25730-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Accepted: 12/05/2022] [Indexed: 12/23/2022] Open
Abstract
State-of-the-art medical studies proved that long noncoding ribonucleic acids (lncRNAs) are closely related to various diseases. However, their large-scale detection in biological experiments is problematic and expensive. To aid screening and improve the efficiency of biological experiments, this study introduced a prediction model based on the nearest neighbor concept for lncRNA-disease association prediction. We used a new similarity algorithm in the model that fused potential associations. The experimental validation of the proposed algorithm proved its superiority over the available Cosine, Pearson, and Jaccard similarity algorithms. Satisfactory results in the comparative leave-one-out cross-validation test (with AUC = 0.96) confirmed its excellent predictive performance. Finally, the proposed model's reliability was confirmed by performing predictions using a new dataset, yielding AUC = 0.92.
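For reference, the three baseline measures named in this abstract (Cosine, Pearson, and Jaccard) can be sketched on binary association profiles as follows. This is a minimal illustration and not the paper's new similarity algorithm; the function names and the toy profiles `lnc_a` and `lnc_b` are assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def pearson(u, v):
    """Pearson correlation, computed as the cosine of mean-centred vectors."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    return cosine([a - mean_u for a in u], [b - mean_v for b in v])


def jaccard(u, v):
    """Jaccard index of two binary profiles (shared / total associations)."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0


# Hypothetical profiles: 1 means the lncRNA is linked to that disease.
lnc_a = [1, 1, 0, 0, 1]
lnc_b = [1, 0, 0, 1, 1]

print(cosine(lnc_a, lnc_b), pearson(lnc_a, lnc_b), jaccard(lnc_a, lnc_b))
```

All three measures only see the known associations, which is one reason sparse association matrices tend to yield many zero similarities.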
Affiliation(s)
- Xiao-xin Du
  - College of Computer and Control, Qiqihar University, Qiqihar, 161006 China (grid.412616.6, 0000 0001 0002 2355)
- Yan Liu
  - College of Computer and Control, Qiqihar University, Qiqihar, 161006 China (grid.412616.6, 0000 0001 0002 2355)
- Bo Wang
  - College of Computer and Control, Qiqihar University, Qiqihar, 161006 China (grid.412616.6, 0000 0001 0002 2355)
- Jian-fei Zhang
  - College of Computer and Control, Qiqihar University, Qiqihar, 161006 China (grid.412616.6, 0000 0001 0002 2355)
4
Shi H, Zhang X, Tang L, Liu L. Heterogeneous graph neural network for lncRNA-disease association prediction. Sci Rep 2022; 12:17519. [PMID: 36266433 PMCID: PMC9585029 DOI: 10.1038/s41598-022-22447-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/20/2022] [Accepted: 10/14/2022] [Indexed: 01/12/2023] Open
Abstract
Identifying lncRNA-disease associations is conducive to the diagnosis, treatment and prevention of diseases. Because verification by biological experiments is expensive and time-consuming, prediction methods based on computational models have gradually become an important means of discovering lncRNA-disease associations. However, existing methods still struggle to make full use of network topology information to identify potential associations between lncRNAs and diseases in multi-source data. In this study, we propose a novel method called HGNNLDA for lncRNA-disease association prediction. First, HGNNLDA constructs a heterogeneous network composed of a lncRNA similarity network, a lncRNA-disease association network and a lncRNA-miRNA association network. Then, on this heterogeneous network, a fixed-size set of strongly correlated neighbors of each type is sampled for each node by random walk with restart. Next, the embedding information of the lncRNA and the disease in each lncRNA-disease association pair is obtained by type-based neighbor aggregation and combination across all types through a heterogeneous graph neural network, in which an attention mechanism is introduced because different types of neighbors contribute differently to the prediction of lncRNA-disease associations. As a result, the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall curve (AUPR) under fivefold cross-validation (5FCV) are 0.9786 and 0.8891, respectively. Compared with five state-of-the-art prediction models, HGNNLDA has better prediction performance. In addition, two types of case studies further verify that our method can effectively predict potential lncRNA-disease associations and is able to make predictions for new diseases without any known lncRNAs.
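As a reminder of what the AUC figures quoted throughout this listing measure, the sketch below computes the area under the ROC curve as the probability that a randomly chosen known association is scored above a randomly chosen unknown pair (ties count one half). It is a generic illustration, not code from any of the cited papers; `roc_auc` and the toy labels and scores are assumptions for this example.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical held-out fold: 1 = known association, 0 = unlabeled pair.
fold_labels = [1, 0, 1, 0, 0]
fold_scores = [0.9, 0.8, 0.7, 0.3, 0.1]

auc = roc_auc(fold_labels, fold_scores)
print(round(auc, 4))
```

In k-fold cross-validation this quantity is typically computed once per held-out fold and then averaged, although the papers listed here may differ in the details.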
Affiliation(s)
- Hong Shi
  - School of Information, Yunnan Normal University, Kunming, 650092 China
- Xiaomeng Zhang
  - School of Information, Yunnan Normal University, Kunming, 650092 China
- Lin Tang
  - Key Laboratory of Educational Informatization for Nationalities Ministry of Education, Yunnan Normal University, Kunming, 650092 China (grid.410739.8, 0000 0001 0723 6903)
- Lin Liu
  - School of Information, Yunnan Normal University, Kunming, 650092 China
5
Silva ABOV, Spinosa EJ. Graph Convolutional Auto-Encoders for Predicting Novel lncRNA-Disease Associations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2264-2271. [PMID: 33819159 DOI: 10.1109/tcbb.2021.3070910] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 06/12/2023]
Abstract
LncRNAs are intermediate molecules that participate in the most diverse biological processes in humans, such as gene expression control and X-chromosome inactivation. Numerous studies have associated lncRNAs with a wide range of diseases, such as breast cancer, leukemia, and many other conditions. In this work, we propose a graph-based method named PANDA. This method treats the prediction of new associations between lncRNAs and diseases as a link prediction problem in a graph. We start by building a heterogeneous graph that contains the known associations between lncRNAs and diseases and additional information such as gene expression levels and symptoms of diseases. We then use a Graph Auto-encoder to learn the representation of the nodes' features and edges, finally applying a Neural Network to predict potentially interesting novel edges. The experimental results indicate that PANDA achieved a 0.976 AUC-ROC, surpassing state-of-the-art methods for the same problem, showing that PANDA could be a promising approach to generate embeddings to predict potentially novel lncRNA-disease associations.
6
Liu Y, Yang H, Zheng C, Wang K, Yan J, Cao H, Zhang Y. NCP-BiRW: A Hybrid Approach for Predicting Long Noncoding RNA-Disease Associations by Network Consistency Projection and Bi-Random Walk. Front Genet 2022; 13:862272. [PMID: 35495166 PMCID: PMC9043107 DOI: 10.3389/fgene.2022.862272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2022] [Accepted: 03/21/2022] [Indexed: 12/06/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) play significant roles in the disease process. Understanding the pathological mechanisms of lncRNAs during the course of various diseases will help clinicians prevent and treat diseases. With the emergence of high-throughput techniques, many biological experiments have been developed to study lncRNA-disease associations. Because experimental methods are costly, slow, and laborious, a growing number of computational models have emerged. Here, we present a new approach using network consistency projection and bi-random walk (NCP-BiRW) to infer hidden lncRNA-disease associations. First, integrated similarity networks for lncRNAs and diseases were constructed by merging similarity information. Subsequently, network consistency projection was applied to calculate space projection scores for lncRNAs and diseases, which were then introduced into a bi-random walk method for association prediction. To test model performance, we employed 5- and 10-fold cross-validation, with the area under the receiver operating characteristic curve as the evaluation indicator. The computational results showed that our method outperformed the other five advanced algorithms. In addition, the novel method was applied to another dataset in the Mammalian ncRNA-Disease Repository (MNDR) database and showed excellent performance. Finally, case studies were carried out on atherosclerosis and leukemia to confirm the effectiveness of our method in practice. In conclusion, we could infer lncRNA-disease associations using the NCP-BiRW model, which may benefit biomedical studies in the future.
Affiliation(s)
- Yanling Liu
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
  - Department of Mathematics, Changzhi Medical College, Changzhi, China
- Hong Yang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Chu Zheng
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Ke Wang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Jingjing Yan
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Hongyan Cao
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
- Yanbo Zhang
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, Taiyuan, China
  - School of Health and Service Management, Shanxi University of Chinese Medicine, Taiyuan, China
  - *Correspondence: Yanbo Zhang,
7
Xie G, Jiang J, Sun Y. LDA-LNSUBRW: lncRNA-Disease Association Prediction Based on Linear Neighborhood Similarity and Unbalanced bi-Random Walk. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:989-997. [PMID: 32870798 DOI: 10.1109/tcbb.2020.3020595] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 06/11/2023]
Abstract
An increasing number of experiments show that lncRNAs are involved in many biological processes, and that their mutations and disorders are associated with many diseases. However, verifying the relationships between lncRNAs and diseases is time-consuming and laborious. Searching for effective computational methods will contribute to our understanding of the underlying mechanisms of disease and to identifying disease biomarkers. Therefore, we proposed a method called lncRNA-disease association prediction based on linear neighborhood similarity and unbalanced bi-random walk (LDA-LNSUBRW). Given that known lncRNA-disease associations are rare, a pretreatment step should be performed to obtain the interaction possibility of unknown cases, so as to help predict potential associations. In the framework of leave-one-out cross-validation (LOOCV) and fivefold cross-validation (5-fold CV), LDA-LNSUBRW achieved effective performance, with AUCs of 0.8874 and 0.8632 ± 0.0051, respectively. The experimental results in this paper show that the proposed method is superior to five other state-of-the-art methods. In addition, case studies of three diseases (lung cancer, breast cancer, and osteosarcoma) were carried out to illustrate that LDA-LNSUBRW could predict the relevant lncRNAs.
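The abstract does not spell out the walk itself, so the following is only a sketch of one common bi-random walk formulation, under stated assumptions: a left walk on the lncRNA similarity network and a right walk on the disease similarity network, each blended with the known associations by a decay factor `alpha`, run for different numbers of steps ("unbalanced") and averaged while both are active. The function names, parameter values, and toy matrices are all hypothetical, and the similarity matrices are assumed to be normalized already.

```python
def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def blend(walked, prior, alpha):
    """alpha * walked + (1 - alpha) * prior, element-wise."""
    return [[alpha * w + (1 - alpha) * p for w, p in zip(rw, rp)]
            for rw, rp in zip(walked, prior)]


def bi_random_walk(lnc_sim, dis_sim, assoc, alpha=0.8,
                   left_steps=3, right_steps=2):
    """Unbalanced bi-random walk; returns an lncRNA x disease score matrix."""
    scores = [row[:] for row in assoc]
    for step in range(1, max(left_steps, right_steps) + 1):
        parts = []
        if step <= left_steps:    # walk over the lncRNA similarity network
            parts.append(blend(matmul(lnc_sim, scores), assoc, alpha))
        if step <= right_steps:   # walk over the disease similarity network
            parts.append(blend(matmul(scores, dis_sim), assoc, alpha))
        scores = [[sum(p[i][j] for p in parts) / len(parts)
                   for j in range(len(assoc[0]))]
                  for i in range(len(assoc))]
    return scores


# Toy data (assumed): 3 lncRNAs, 2 diseases, one known association.
lnc_sim = [[0.6, 0.4, 0.0], [0.4, 0.6, 0.0], [0.0, 0.0, 1.0]]
dis_sim = [[0.7, 0.3], [0.3, 0.7]]
assoc = [[1, 0], [0, 0], [0, 0]]

predicted = bi_random_walk(lnc_sim, dis_sim, assoc)
for row in predicted:
    print([round(x, 4) for x in row])
```

In this toy case the second lncRNA, which is similar to the first, picks up a nonzero score for the first disease, while the third lncRNA, which has no similar neighbors, stays at zero.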
8
Wang L, Shang M, Dai Q, He PA. Prediction of lncRNA-disease association based on a Laplace normalized random walk with restart algorithm on heterogeneous networks. BMC Bioinformatics 2022; 23:5. [PMID: 34983367 PMCID: PMC8729064 DOI: 10.1186/s12859-021-04538-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/08/2021] [Accepted: 12/15/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Increasing evidence shows that long non-coding RNAs (lncRNAs) play important roles in the development and progression of complex human diseases. Therefore, predicting human lncRNA-disease associations is a challenging and urgent task in bioinformatics for research on complex human diseases. RESULTS In this work, a global network-based computational framework called LRWRHLDA was proposed, which is a universal network-based method. First, four isomorphic networks were constructed: a lncRNA similarity network, a disease similarity network, a gene similarity network and a miRNA similarity network. Then, six heterogeneous networks (the known lncRNA-disease, lncRNA-gene, lncRNA-miRNA, disease-gene, disease-miRNA, and gene-miRNA association networks) were used to build a multi-layer network. Finally, a Laplace normalized random walk with restart algorithm on this global network is proposed to predict the relationships between lncRNAs and diseases. CONCLUSIONS Ten-fold cross-validation is used to evaluate the performance of LRWRHLDA. LRWRHLDA achieves an AUC of 0.98402, which is higher than that of the compared methods. Furthermore, LRWRHLDA can predict isolated disease-related lncRNAs (and isolated lncRNA-related diseases). The results for colorectal cancer, lung adenocarcinoma, stomach cancer and breast cancer have been verified by other studies. The case studies indicate that our method is effective.
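To make the core idea concrete, here is a minimal sketch of a random walk with restart on a single small network, assuming "Laplace normalized" means the symmetric normalization W = D^(-1/2) A D^(-1/2) of the adjacency matrix A. It is not the LRWRHLDA implementation, which operates on a multi-layer network; the function names, the restart probability, and the toy graph are assumptions for illustration.

```python
import math


def laplace_normalize(adj):
    """Symmetric normalization: W[i][j] = A[i][j] / sqrt(deg_i * deg_j)."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j])
             if deg[i] and deg[j] else 0.0
             for j in range(n)] for i in range(n)]


def rwr(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W p + restart * p0 until it stabilizes."""
    w = laplace_normalize(adj)
    n = len(adj)
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy path graph 0 - 1 - 2 - 3; node 0 is the seed (e.g. a query disease).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

scores = rwr(path_graph, seeds={0})
print([round(s, 4) for s in scores])
```

Nodes closer to the seed receive higher steady-state scores, and ranking candidate nodes by these scores is how walk-based methods of this kind prioritize associations.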
Affiliation(s)
- Liugen Wang
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, China
- Min Shang
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, China
- Qi Dai
  - College of Life Science, Zhejiang Sci-Tech University, Hangzhou, 310018, China
- Ping-An He
  - School of Science, Zhejiang Sci-Tech University, Hangzhou, 310018, China.
9
Yan C, Zhang Z, Bao S, Hou P, Zhou M, Xu C, Sun J. Computational Methods and Applications for Identifying Disease-Associated lncRNAs as Potential Biomarkers and Therapeutic Targets. Molecular Therapy. Nucleic Acids 2020; 21:156-171. [PMID: 32585624 PMCID: PMC7321789 DOI: 10.1016/j.omtn.2020.05.018] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/13/2020] [Revised: 04/06/2020] [Accepted: 05/18/2020] [Indexed: 12/12/2022]
Abstract
Long non-coding RNAs (lncRNAs) have been recognized as critical components of a broad genomic regulatory network and play pivotal roles in physiological and pathological processes. Identification of disease-associated lncRNAs is becoming increasingly crucial for fundamentally improving our understanding of molecular mechanisms of disease and developing novel biomarkers and therapeutic targets. Considering lower efficiency and higher time and labor cost of biological experiments, computer-aided inference of disease-associated RNAs has become a promising avenue for facilitating the study of lncRNA functions and provides complementary value for experimental studies. In this study, we first summarize data and knowledge resources publicly available for the study of lncRNA-disease associations. Then, we present an updated systematic overview of dozens of computational methods and models for inferring lncRNA-disease associations proposed in recent years. Finally, we explore the perspectives and challenges for further studies. Our study provides a guide for biologists and medical scientists to look for dedicated resources and more competent tools for accelerating the unraveling of disease-associated lncRNAs.
Affiliation(s)
- Congcong Yan
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Zicheng Zhang
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Siqi Bao
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Ping Hou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Meng Zhou
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China
- Chongyong Xu
  - Department of Radiology, The Second Affiliated Hospital of Wenzhou Medical University, Wenzhou 325027, P.R. China.
- Jie Sun
  - School of Biomedical Engineering, School of Ophthalmology & Optometry and Eye Hospital, Wenzhou Medical University, Wenzhou 325027, P.R. China.
10
Yu L, Shen X, Zhong D, Yang J. Three-Layer Heterogeneous Network Combined With Unbalanced Random Walk for miRNA-Disease Association Prediction. Front Genet 2020; 10:1316. [PMID: 31998371 PMCID: PMC6967737 DOI: 10.3389/fgene.2019.01316] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 09/28/2019] [Accepted: 12/02/2019] [Indexed: 12/19/2022] Open
Abstract
miRNAs play an important role in many biological processes, and increasing evidence shows that miRNAs are closely related to human diseases. Most existing miRNA-disease association prediction methods are based only on data related to miRNAs and diseases and fail to effectively use other existing biological data. However, experimentally verified miRNA-disease associations are limited, and there are complex correlations between biological data. Therefore, we propose a novel algorithm, Three-layer heterogeneous network Combined with unbalanced Random Walk for MiRNA-Disease Association prediction (TCRWMDA), which can effectively integrate multi-source association data. TCRWMDA is based not only on the known miRNA-disease associations but also adds new prior information (lncRNA-miRNA and lncRNA-disease associations) to build a three-layer heterogeneous network, in which lncRNA is added as an intermediate transition path to mine more effective information between networks. The AUC value obtained by TCRWMDA under 5-fold cross-validation is 0.9209; compared with other models based on the same similarity calculation method, TCRWMDA obtained better results. TCRWMDA was applied to the analysis of four types of cancer, and the results showed that TCRWMDA is an effective tool for predicting potential miRNA-disease associations. The source code and dataset of TCRWMDA are available at: https://github.com/ylm0505/TCRWMDA.
Affiliation(s)
- Limin Yu
  - School of Computer, Central China Normal University, Wuhan, China
  - Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Xianjun Shen
  - School of Computer, Central China Normal University, Wuhan, China
  - Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Duo Zhong
  - School of Computer, Central China Normal University, Wuhan, China
  - Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Jincai Yang
  - School of Computer, Central China Normal University, Wuhan, China