1
Zhang C, Li Y, Dong Y, Chen W, Yu C. Prediction of miRNA-disease associations based on PCA and cascade forest. BMC Bioinformatics 2024; 25:386. [PMID: 39701957 DOI: 10.1186/s12859-024-05999-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Accepted: 11/26/2024] [Indexed: 12/21/2024] Open
Abstract
BACKGROUND As a key non-coding RNA molecule, miRNA profoundly affects the regulation of gene expression and is linked to the pathological processes of many human diseases. However, conventional experimental methods for validating miRNA-disease associations are laborious. Consequently, the development of efficient and reliable computational prediction models is crucial for the identification and validation of these associations. RESULTS In this research, we developed the PCACFMDA method to predict potential associations between miRNAs and diseases. To construct a multidimensional feature matrix, we consider the fused similarities of miRNAs and diseases together with miRNA-disease pairs. We then use principal component analysis (PCA) to reduce data complexity and extract low-dimensional features. Subsequently, a tuned cascade forest mines these features in depth and outputs prediction scores. The results of 5-fold cross-validation on the HMDD v2.0 database indicate that PCACFMDA achieved an AUC of 98.56%. Additionally, we performed case studies on breast, esophageal and lung neoplasms. The findings showed that the top 50 miRNAs most strongly linked to each disease have all been validated. CONCLUSIONS Based on PCA and optimized cascade forests, we propose the PCACFMDA model for predicting undiscovered miRNA-disease associations. The experimental results demonstrate superior prediction performance and commendable stability. Consequently, PCACFMDA is a potent instrument for in-depth exploration of miRNA-disease associations.
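As an illustration only, and not the authors' implementation, the PCA step mentioned in this abstract can be sketched in plain Python. The sketch extracts just the first principal component by power iteration and projects the samples onto it. The toy `features` matrix and all function names are assumptions made for this example, and the cascade forest stage is not shown.

```python
import math

def center(rows):
    """Subtract the column mean from every feature."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def first_component(cov, iters=200):
    """Power iteration: approaches the eigenvector with the largest eigenvalue."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # fixed start vector keeps the sketch deterministic
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v

def project(centered, component):
    """One-dimensional coordinates of each sample along the component."""
    return [sum(a * b for a, b in zip(r, component)) for r in centered]

# Hypothetical toy feature matrix: 4 samples, 3 features (second is roughly 2x the first).
features = [[1.0, 2.0, 0.1], [2.0, 4.1, 0.0], [3.0, 6.0, 0.2], [4.0, 7.9, 0.1]]
centered = center(features)
pc1 = first_component(covariance(centered))
scores = project(centered, pc1)
```

In practice a library routine that returns several components at once would replace this hand-written loop; the point here is only that the low-dimensional features are projections onto directions of maximal variance.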
Affiliation(s)
- Chuanlei Zhang
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Yubo Li
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Yinglun Dong
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Wei Chen
  - Computer Science, China University of Mining and Technology, Xuzhou, 221116, China
- Changqing Yu
  - Electronic Information, Xijing University, Xi'an, 710123, China
2
Long S, Tang X, Si X, Kong T, Zhu Y, Wang C, Qi C, Mu Z, Liu J. TriFusion enables accurate prediction of miRNA-disease association by a tri-channel fusion neural network. Commun Biol 2024; 7:1067. [PMID: 39215090 PMCID: PMC11364641 DOI: 10.1038/s42003-024-06734-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 08/13/2024] [Indexed: 09/04/2024] Open
Abstract
The identification of miRNA-disease associations is crucial for early disease prevention and treatment. However, it is still a computational challenge to accurately predict such associations due to improper information encoding. Previous methods characterize miRNA-disease associations only from single levels, causing the loss of multi-level association information. In this study, we propose TriFusion, a powerful and interpretable deep learning framework for miRNA-disease association prediction. It develops a tri-channel architecture to encode the association features of miRNAs and diseases from different levels and designs a feature fusion encoder to smoothly fuse these features. After training and testing, TriFusion outperforms other leading methods and offers strong interpretability through its learned representations. Furthermore, TriFusion is applied to three high-risk sexually associated cancers (ovarian, breast, and prostate cancers) and exhibits remarkable ability in the identification of miRNAs associated with the three diseases.
Affiliation(s)
- Sheng Long
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Xiaoran Tang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Xinyi Si
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Tongxin Kong
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Yanhao Zhu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Chuanzhi Wang
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Chenqing Qi
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Zengchao Mu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
- Juntao Liu
  - School of Mathematics and Statistics, Shandong University, Weihai, China
3
Sheng N, Xie X, Wang Y, Huang L, Zhang S, Gao L, Wang H. A Survey of Deep Learning for Detecting miRNA-Disease Associations: Databases, Computational Methods, Challenges, and Future Directions. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:328-347. [PMID: 38194377 DOI: 10.1109/tcbb.2024.3351752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
MicroRNAs (miRNAs) are an important class of non-coding RNAs that play an essential role in the occurrence and development of various diseases. Identifying the potential miRNA-disease associations (MDAs) can be beneficial in understanding disease pathogenesis. Traditional laboratory experiments are expensive and time-consuming. Computational models have enabled systematic large-scale prediction of potential MDAs, greatly improving the research efficiency. With recent advances in deep learning, it has become an attractive and powerful technique for uncovering novel MDAs. Consequently, numerous MDA prediction methods based on deep learning have emerged. In this review, we first summarize publicly available databases related to miRNAs and diseases for MDA prediction. Next, we outline commonly used miRNA and disease similarity calculation and integration methods. Then, we comprehensively review the 48 existing deep learning-based MDA computation methods, categorizing them into classical deep learning and graph neural network-based techniques. Subsequently, we investigate the evaluation methods and metrics that are frequently used to assess MDA prediction performance. Finally, we discuss the performance trends of different computational methods, point out some problems in current research, and propose 9 potential future research directions. Data resources and recent advances in MDA prediction methods are summarized in the GitHub repository https://github.com/sheng-n/DL-miRNA-disease-association-methods.
4
Jin Z, Wang M, Tang C, Zheng X, Zhang W, Sha X, An S. Predicting miRNA-disease association via graph attention learning and multiplex adaptive modality fusion. Comput Biol Med 2024; 169:107904. [PMID: 38181611 DOI: 10.1016/j.compbiomed.2023.107904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2023] [Revised: 12/12/2023] [Accepted: 12/23/2023] [Indexed: 01/07/2024]
Abstract
miRNAs are a class of small non-coding RNA molecules that play important roles in gene regulation. They are crucial for maintaining normal cellular functions, and dysregulation or dysfunction of miRNAs is linked to the onset and advancement of multiple human diseases. Research on miRNAs has unveiled novel avenues in the diagnosis, treatment, and prevention of human diseases. However, clinical trials pose challenges and drawbacks, such as complexity and time-consuming processes, which create obstacles for many researchers. Graph Attention Network (GAT) has shown excellent performance in handling graph-structured data for tasks such as link prediction. Some studies have successfully applied GAT to miRNA-disease association prediction. However, existing methods have several drawbacks. Firstly, most previous models rely solely on concatenation operations to merge features of miRNAs and diseases, which results in the loss of significant modality-specific information and even the inclusion of redundant information. Secondly, as the number of layers in GAT increases, the feature extraction process may suffer from excessive smoothing, which significantly affects the prediction accuracy. To address these issues and effectively complete miRNA-disease prediction tasks, we propose an innovative model called Multiplex Adaptive Modality Fusion Graph Attention Network (MAMFGAT). MAMFGAT utilizes GAT as the main structure for feature aggregation and incorporates a multi-modal adaptive fusion module to extract features from three interconnected networks: the miRNA-disease association network, the miRNA similarity network, and the disease similarity network. It employs adaptive learning and cross-modality contrastive learning to fuse more effective miRNA and disease feature embeddings, and incorporates multi-modal residual feature fusion to tackle the problem of excessive feature smoothing in GATs. Finally, we employ a Multi-Layer Perceptron (MLP) model that takes the embeddings of miRNA and disease features as input to predict the presence of potential miRNA-disease associations. Extensive experimental results provide evidence of the superior performance of MAMFGAT in comparison to other state-of-the-art methods. To validate the significance of various modalities and assess the efficacy of the designed modules, we performed an ablation analysis. Furthermore, MAMFGAT shows outstanding performance in three cancer case studies, indicating that it is a reliable method for studying the association between miRNA and diseases. The implementation of MAMFGAT can be accessed at the following GitHub repository: https://github.com/zixiaojin66/MAMFGAT-master.
Affiliation(s)
- Zixiao Jin
  - School of Computer, China University of Geosciences, Wuhan, 430074, China
- Minhui Wang
  - Department of Pharmacy, Lianshui People's Hospital of Kangda College Affiliated to Nanjing Medical University, Huai'an 223300, China
- Chang Tang
  - School of Computer, China University of Geosciences, Wuhan, 430074, China
- Xiao Zheng
  - School of Computer, National University of Defense Technology, Changsha, 410073, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, 430070, China
- Xiaofeng Sha
  - Department of Oncology, Huai'an Hongze District People's Hospital, Huai'an, 223100, China
- Shan An
  - JD Health International Inc., China
5
Hu H, Zhao H, Zhong T, Dong X, Wang L, Han P, Li Z. Adaptive deep propagation graph neural network for predicting miRNA-disease associations. Brief Funct Genomics 2023; 22:453-462. [PMID: 37078739 DOI: 10.1093/bfgp/elad010] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 11/25/2022] [Revised: 02/13/2023] [Accepted: 03/09/2023] [Indexed: 04/21/2023] Open
Abstract
BACKGROUND A large number of experiments show that the abnormal expression of miRNA is closely related to the occurrence, diagnosis and treatment of diseases. Identifying associations between miRNAs and diseases is important for clinical applications of complex human diseases. However, traditional biological experimental methods and calculation-based methods have many limitations, which has led to the development of more efficient and accurate deep learning methods for predicting miRNA-disease associations. RESULTS In this paper, we propose a novel model based on an adaptive deep propagation graph neural network to predict miRNA-disease associations (ADPMDA). We first construct the miRNA-disease heterogeneous graph based on known miRNA-disease pairs, miRNA integrated similarity information, miRNA sequence information and disease similarity information. Then, we project the features of miRNAs and diseases into a low-dimensional space. After that, an attention mechanism is utilized to aggregate the local features of central nodes. In particular, an adaptive deep propagation graph neural network is employed to learn the embedding of nodes, which can adaptively adjust the local and global information of nodes. Finally, a multi-layer perceptron is leveraged to score miRNA-disease pairs. CONCLUSION Experiments on the Human microRNA Disease Database v3.0 dataset show that ADPMDA achieves a mean AUC of 94.75% under 5-fold cross-validation. We further conduct case studies on esophageal neoplasms, lung neoplasms and lymphoma to confirm the effectiveness of our proposed model, and 49, 49 and 47 of the top 50 predicted miRNAs associated with these diseases are confirmed, respectively. These results demonstrate the effectiveness and superiority of our model in predicting miRNA-disease associations.
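The mean AUC under 5-fold cross-validation reported above can be illustrated with a small, hedged sketch. It computes AUC as the probability that a positive pair outscores a negative pair and averages it over five folds. The labels and scores are hypothetical toy values; in a real evaluation each fold's scores would come from a model trained on the other four folds, which is not shown here.

```python
def auc(labels, scores):
    """Rank-based AUC: chance that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def kfold_indices(n, k=5):
    """Split range(n) into k interleaved folds (shuffle the data first in practice)."""
    return [list(range(i, n, k)) for i in range(k)]

# Hypothetical toy data: 1 = known association, 0 = unlabeled pair.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.1, 0.8, 0.3, 0.7, 0.6, 0.4, 0.5, 0.95, 0.2]

fold_aucs = []
for fold in kfold_indices(len(labels), k=5):
    # Assumption: these scores stand in for predictions made on the held-out fold.
    fold_aucs.append(auc([labels[i] for i in fold], [scores[i] for i in fold]))
mean_auc = sum(fold_aucs) / len(fold_aucs)
```

Averaging per-fold AUCs is one common convention; pooling all held-out scores before computing a single AUC is another, and the two can give different numbers.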
Affiliation(s)
- Hua Hu
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Huan Zhao
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
- Tangbo Zhong
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
- Xishang Dong
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Lei Wang
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
- Pengyong Han
  - Central Lab, Changzhi Medical College, Changzhi 046012, China
- Zhengwei Li
  - College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
  - Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
  - KUNPAND Communications (Kunshan) Co., Ltd., Suzhou 215300, China
6
Zhang W, Liu B. iSnoDi-MDRF: Identifying snoRNA-Disease Associations Based on Multiple Biological Data by Ranking Framework. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:3013-3019. [PMID: 37030816 DOI: 10.1109/tcbb.2023.3258448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Accumulating evidence indicates that the dysregulation of small nucleolar RNAs (snoRNAs) is associated with diseases. Identifying snoRNA-disease associations by computational methods is desirable for biologists, because it can save considerable cost and time compared with biological experiments. However, it still faces the following challenges: (i) Many snoRNAs have been detected in recent years, but only a few have been proved to be associated with diseases; (ii) Computational predictors trained with only a few known snoRNA-disease associations fail to accurately identify snoRNA-disease associations. In this study, we propose a ranking framework, called iSnoDi-MDRF, to identify potential snoRNA-disease associations based on multiple biological data, which has the following highlights: (i) iSnoDi-MDRF integrates a ranking framework, which is able to identify not only potential associations between known snoRNAs and diseases but also diseases associated with new snoRNAs. (ii) Known gene-disease associations are employed to help train a mature model for predicting snoRNA-disease associations. Experimental results illustrate that iSnoDi-MDRF is very suitable for identifying potential snoRNA-disease associations. The web server of the iSnoDi-MDRF predictor is freely available at http://bliulab.net/iSnoDi-MDRF/.
7
Qu Q, Chen X, Ning B, Zhang X, Nie H, Zeng L, Chen H, Fu X. Prediction of miRNA-disease associations by neural network-based deep matrix factorization. Methods 2023; 212:1-9. [PMID: 36813017 DOI: 10.1016/j.ymeth.2023.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/06/2022] [Revised: 01/17/2023] [Accepted: 02/10/2023] [Indexed: 02/23/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of short non-coding RNAs with a length of about 22 nucleotides that participate in various biological processes of cells. A number of studies have shown that miRNAs are closely related to the occurrence of cancer and various human diseases. Therefore, studying miRNA-disease associations is helpful to understand the pathogenesis of diseases as well as the prevention, diagnosis, treatment and prognosis of diseases. Traditional biological experimental methods for studying miRNA-disease associations have disadvantages: they require expensive equipment and are time-consuming and labor-intensive. With the rapid development of bioinformatics, more and more researchers are committed to developing effective computational methods to predict miRNA-disease associations in order to reduce the time and money cost of experiments. In this study, we proposed a neural network-based deep matrix factorization method named NNDMF to predict miRNA-disease associations. To address the problem that traditional matrix factorization methods can only extract linear features, NNDMF used a neural network to perform deep matrix factorization and extract nonlinear features, which makes up for the shortcomings of traditional matrix factorization methods. We compared NNDMF with four previous classical prediction models (IMCMDA, GRMDA, SACMDA and ICFMDA) in global LOOCV and local LOOCV, respectively. The AUCs achieved by NNDMF in the two cross-validation methods were 0.9340 and 0.8763, respectively. Furthermore, we conducted case studies on three important human diseases (lymphoma, colorectal cancer and lung cancer) to validate the effectiveness of NNDMF. In conclusion, NNDMF could effectively predict potential miRNA-disease associations.
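For context, the "traditional" linear matrix factorization that this abstract contrasts NNDMF against can be sketched as follows. This is a generic baseline under stated assumptions, not NNDMF itself: each miRNA and disease gets a latent vector, a pair is scored by their dot product, and the vectors are fitted by stochastic gradient descent with L2 regularization. The toy association matrix, hyperparameters and function names are all hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def mse(A, U, V):
    """Mean squared reconstruction error of A by the products U[i] . V[j]."""
    n, m = len(A), len(A[0])
    return sum((A[i][j] - dot(U[i], V[j])) ** 2
               for i in range(n) for j in range(m)) / (n * m)

def factorize(A, rank=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Fit A ~ U V^T by SGD; returns factors plus the error before and after."""
    rng = random.Random(seed)
    n, m = len(A), len(A[0])
    U = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n)]
    V = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(m)]
    start = mse(A, U, V)
    for _ in range(epochs):
        for i in range(n):
            for j in range(m):
                err = A[i][j] - dot(U[i], V[j])
                for f in range(rank):
                    u, v = U[i][f], V[j][f]
                    U[i][f] += lr * (err * v - reg * u)
                    V[j][f] += lr * (err * u - reg * v)
    return U, V, start, mse(A, U, V)

# Hypothetical miRNA x disease association matrix (1 = known association).
A = [[1, 0, 1], [1, 0, 1], [0, 1, 0], [0, 1, 1]]
U, V, error_before, error_after = factorize(A)
predicted = [[dot(u, v) for v in V] for u in U]
```

Because the score is a plain dot product, this model can only capture linear interactions between latent factors, which is the limitation the abstract says a neural scoring function is meant to address.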
Affiliation(s)
- Qiang Qu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xia Chen
  - School of Basic Education, Changsha Aeronautical Vocational and Technical College, Changsha, China
- Bin Ning
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiang Zhang
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Hao Nie
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Li Zeng
  - College of Life and Environmental Science, Hunan University of Art and Science, Changde, China
- Haowen Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiangzheng Fu
  - Research Institute of Hunan University in Chongqing, Chongqing, China
8
S S, E R V, Krishnakumar U. Improving miRNA Disease Association Prediction Accuracy Using Integrated Similarity Information and Deep Autoencoders. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1125-1136. [PMID: 35914051 DOI: 10.1109/tcbb.2022.3195514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
MicroRNAs (miRNAs) are short endogenous non-coding RNA molecules (22 nt) that have a vital role in many biological and molecular processes inside the human body. Abnormal and dysregulated expression of miRNAs is correlated with many complex disorders. Wet-lab biological experiments are time-consuming, costly and labour-intensive, so feasible and efficient computational approaches are needed for predicting promising miRNAs associated with diseases. Here a two-stage feature pruning approach based on miRNA feature similarity fusion that uses a deep attention autoencoder and recursive feature elimination with cross-validation (RFECV) is proposed for predicting unknown miRNA-disease associations. In the first stage, an attention autoencoder captures highly influential features from the fused feature vector. For further pruning of features, RFECV is applied. The resultant features are given to a Random Forest classifier for association prediction. The highest AUC of 94.41% is attained when all miRNA similarity measures are merged with disease similarities. Case studies were done on two diseases, lymphoma and leukaemia, to examine the reliability of the approach. Comparative analysis shows that the proposed approach outperforms recent methodologies for predicting miRNA-disease associations.
9
Zhang H, Fang J, Sun Y, Xie G, Lin Z, Gu G. Predicting miRNA-Disease Associations via Node-Level Attention Graph Auto-Encoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2023; 20:1308-1318. [PMID: 35503834 DOI: 10.1109/tcbb.2022.3170843] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 05/04/2023]
Abstract
Previous studies have confirmed that microRNA (miRNA), a small single-stranded non-coding RNA, participates in various biological processes and plays vital roles in many complex human diseases. Therefore, developing an efficient method to infer potential miRNA-disease associations could greatly help understand operational mechanisms for diseases at the molecular level. However, during these early stages of miRNA-disease prediction, traditional biological experiments are laborious and expensive. Therefore, this study proposes a novel method called AGAEMD (node-level Attention Graph Auto-Encoder to predict potential MiRNA Disease associations). We first create a heterogeneous matrix incorporating miRNA similarity, disease similarity, and known miRNA-disease associations. This matrix is then input into a node-level attention encoder-decoder network, which utilizes low-dimensional dense embeddings to represent nodes and calculate association scores. To verify the effectiveness of the proposed method, we conduct a series of experiments on two benchmark datasets (the Human MicroRNA Disease Database v2.0 and v3.2) and report the averages over 10 runs in comparison with several state-of-the-art methods. Experimental results have demonstrated the excellent performance of AGAEMD in comparison with other methods. Three important diseases (Colon Neoplasms, Lung Neoplasms, Lupus Vulgaris) were examined in case studies. The results confirm the reliable predictive performance of AGAEMD.
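A heterogeneous matrix of the kind described above is often assembled as a block matrix with miRNA similarity and disease similarity on the diagonal and the association matrix (and its transpose) off the diagonal. The abstract does not state the exact layout used by AGAEMD, so the arrangement below is an assumption, and the toy matrices and function name are hypothetical.

```python
def heterogeneous_matrix(mirna_sim, disease_sim, assoc):
    """Assemble the block matrix [[SM, A], [A^T, SD]] as a list of rows."""
    n_m, n_d = len(assoc), len(assoc[0])
    # Top block: each miRNA row is its similarity row followed by its associations.
    top = [list(mirna_sim[i]) + list(assoc[i]) for i in range(n_m)]
    # Bottom block: each disease row is its association column followed by its similarity row.
    bottom = [[assoc[i][j] for i in range(n_m)] + list(disease_sim[j])
              for j in range(n_d)]
    return top + bottom

# Hypothetical toy inputs: 3 miRNAs, 2 diseases.
mirna_sim = [[1.0, 0.4, 0.1],
             [0.4, 1.0, 0.3],
             [0.1, 0.3, 1.0]]
disease_sim = [[1.0, 0.2],
               [0.2, 1.0]]
assoc = [[1, 0],
         [0, 1],
         [1, 1]]

H = heterogeneous_matrix(mirna_sim, disease_sim, assoc)
```

With symmetric similarity matrices the result is a symmetric adjacency-like matrix over all miRNA and disease nodes, which is the form a graph auto-encoder typically consumes.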
10
Zhang W, Liu B. iSnoDi-LSGT: identifying snoRNA-disease associations based on local similarity constraints and global topological constraints. RNA (New York, N.Y.) 2022; 28:1558-1567. [PMID: 36192132 PMCID: PMC9670808 DOI: 10.1261/rna.079325.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/22/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
Growing evidence proves that small nucleolar RNAs (snoRNAs) have important functions in various biological processes, the malfunction of which leads to the emergence and development of complex diseases. However, identifying snoRNA-disease associations is an ongoing challenging task due to the considerable time- and money-consuming biological experiments. Therefore, it is urgent to design efficient and economical methods for the identification of snoRNA-disease associations. In this regard, we propose a computational method named iSnoDi-LSGT, which utilizes snoRNA sequence similarity and disease similarity as local similarity constraints. The iSnoDi-LSGT predictor further employs network embedding technology to extract topological features of snoRNAs and diseases, based on which snoRNA topological similarity and disease topological similarity are calculated as global topological constraints. To the best of our knowledge, the iSnoDi-LSGT is the first computational method for snoRNA-disease association identification. The experimental results indicate that the iSnoDi-LSGT predictor can effectively predict unknown snoRNA-disease associations. The web server of the iSnoDi-LSGT predictor is freely available at http://bliulab.net/iSnoDi-LSGT.
Affiliation(s)
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  - Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
11
Yang Y, Shang J, Sun Y, Li F, Zhang Y, Kong XZ, Li S, Liu JX. TLNPMD: Prediction of miRNA-Disease Associations Based on miRNA-Drug-Disease Three-Layer Heterogeneous Network. Molecules 2022; 27:4371. [PMID: 35889243 PMCID: PMC9324587 DOI: 10.3390/molecules27144371] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 07/06/2022] [Indexed: 12/10/2022] Open
Abstract
Many microRNAs (miRNAs) have been confirmed to be associated with the development of human diseases. Capturing miRNA-disease associations (M-DAs) provides an effective way to understand the etiology of diseases. Many models for predicting M-DAs have been constructed; nevertheless, there are still several limitations, such as generally considering direct information between miRNAs and diseases while ignoring potential knowledge hidden in isolated miRNAs or diseases. To overcome these limitations, in this study a novel method for predicting M-DAs named TLNPMD was developed, the highlights of which are the introduction of drug heuristic information and a bipartite network reconstruction strategy. Specifically, three bipartite networks, including drug-miRNA, drug-disease, and miRNA-disease, were reconstructed as weighted ones using this reconstruction strategy. Based on these weighted bipartite networks, as well as three corresponding similarity networks of drugs, miRNAs and diseases, the miRNA-drug-disease three-layer heterogeneous network was constructed. Then, this heterogeneous network was converted into three two-layer heterogeneous networks, for each of which the network path computational model was employed to predict association scores. Finally, both direct and indirect miRNA-disease paths were used to predict M-DAs. Comparative experiments between TLNPMD and four other models were performed and evaluated by five-fold and global leave-one-out cross validations, the results of which show that TLNPMD has the highest AUC values among the compared methods. In addition, case studies of two common diseases were carried out to validate the effectiveness of TLNPMD. These experiments demonstrate that TLNPMD may serve as a promising alternative to existing methods for predicting M-DAs.
Affiliation(s)
- Yi Yang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Junliang Shang
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Yan Sun
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Feng Li
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Yuanyuan Zhang
  - School of Information and Control Engineering, Qingdao University of Technology, Qingdao 266520, China
- Xiang-Zhen Kong
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Shengjun Li
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
- Jin-Xing Liu
  - School of Computer Science, Qufu Normal University, Rizhao 276826, China
12
Ji C, Wang Y, Gao Z, Li L, Ni J, Zheng C. A Semi-Supervised Learning Method for MiRNA-Disease Association Prediction Based on Variational Autoencoder. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:2049-2059. [PMID: 33735084 DOI: 10.1109/tcbb.2021.3067338] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 06/12/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in many biological processes, such as cell growth, development, differentiation and aging. A growing number of studies have revealed that miRNAs are closely involved in many human diseases. Therefore, the prediction of miRNA-disease associations is of great significance to the study of the pathogenesis, diagnosis and intervention of human disease. However, biological experimental methods are usually expensive in time and money, while computational methods can provide an efficient way to infer the underlying disease-related miRNAs. In this study, we propose a novel method to predict potential miRNA-disease associations, called SVAEMDA. Our method treats miRNA-disease association prediction as a semi-supervised learning problem. SVAEMDA integrates disease semantic similarity, miRNA functional similarity and the respective Gaussian interaction profile (GIP) similarities. The integrated similarities are used to learn the representations of diseases and miRNAs. SVAEMDA trains a variational autoencoder-based predictor by using known miRNA-disease associations, in the form of concatenated dense vectors. The reconstruction probability of the predictor is used to measure the correlation of miRNA-disease pairs. Experimental results show that SVAEMDA outperforms other state-of-the-art methods. The AUC values of SVAEMDA under global leave-one-out cross validation (LOOCV) and 5-fold cross validation (5-fold CV) are 0.9464 and 0.9428, respectively. In addition, case studies of three common human diseases indicate that 100 percent of the top 50 candidates predicted by SVAEMDA are found in the benchmark databases. Therefore, SVAEMDA can efficiently and accurately predict the potential associations between diseases and miRNAs.
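The Gaussian interaction profile (GIP) similarity mentioned above is commonly defined as K(i, j) = exp(-gamma * ||p_i - p_j||^2), where p_i is the binary association profile of entity i and gamma is a bandwidth normalized by the mean squared profile norm. The sketch below follows that common definition; whether SVAEMDA uses exactly this normalization is an assumption, and the toy profiles are hypothetical.

```python
import math

def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over rows of an association matrix."""
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # bandwidth scaled by the average profile norm

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    return [[math.exp(-gamma * sq_dist(profiles[i], profiles[j]))
             for j in range(n)] for i in range(n)]

# Hypothetical miRNA profiles: one row per miRNA, one column per disease.
mirna_profiles = [[1, 0, 1],
                  [1, 0, 1],
                  [0, 1, 0]]
K = gip_kernel(mirna_profiles)
```

Applying the same function to the transposed association matrix would give the disease-side GIP similarity.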
13
Peng L, Wang F, Wang Z, Tan J, Huang L, Tian X, Liu G, Zhou L. Cell-cell communication inference and analysis in the tumour microenvironments from single-cell transcriptomics: data resources and computational strategies. Brief Bioinform 2022; 23:6618236. [PMID: 35753695 DOI: 10.1093/bib/bbac234] [Citation(s) in RCA: 52] [Impact Index Per Article: 17.3] [Received: 01/14/2022] [Revised: 05/15/2022] [Accepted: 05/19/2022] [Indexed: 12/12/2022] Open
Abstract
Carcinomas are complex ecosystems composed of cancer, stromal and immune cells. Communication between these cells and their microenvironments induces cancer progression and causes therapy resistance. In order to improve the treatment of cancers, it is essential to quantify crosstalk between and within various cell types in a tumour microenvironment. Focusing on the coordinated expression patterns of ligands and cognate receptors, cell-cell communication can be inferred through ligand-receptor interactions (LRIs). In this manuscript, we carry out the following work: (i) introduce a pipeline for ligand-receptor-mediated intercellular communication estimation from single-cell transcriptomics and list a few available LRI-related databases and visualization tools; (ii) demonstrate seven classical intercellular communication scoring strategies and highlight four types of representative intercellular communication inference methods, including network-based approaches, machine learning-based approaches, spatial information-based approaches and other approaches; (iii) summarize the evaluation and validation avenues for intercellular communication inference and analyze the advantages and limitations of the above four types of cell-cell communication methods; (iv) comment on several major challenges and provide further research directions for intercellular communication analysis in tumour microenvironments. We anticipate that this work will help to better understand intercellular crosstalk and to further develop powerful cell-cell communication estimation tools for tumour-targeted therapy.
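To make the idea of a ligand-receptor scoring strategy concrete, the sketch below uses one simple and widely used form: the product of the mean ligand expression in the sender cell type and the mean receptor expression in the receiver cell type. It is offered as an illustration and is not claimed to be any specific one of the seven strategies the review covers. The gene names `LIG` and `REC`, the cell identifiers and the expression values are hypothetical.

```python
def mean_expression(expr, cells, gene):
    """Average expression of one gene over a group of cells."""
    return sum(expr[c][gene] for c in cells) / len(cells)

def lr_product_score(expr, sender_cells, receiver_cells, ligand, receptor):
    """Expression-product score for one ligand-receptor pair between two cell groups."""
    return (mean_expression(expr, sender_cells, ligand)
            * mean_expression(expr, receiver_cells, receptor))

# Hypothetical single-cell expression table: cell id -> {gene: expression}.
expr = {
    "tumour_1": {"LIG": 2.0, "REC": 0.0},
    "tumour_2": {"LIG": 4.0, "REC": 0.5},
    "tcell_1": {"LIG": 0.0, "REC": 1.0},
    "tcell_2": {"LIG": 0.1, "REC": 3.0},
}
senders = ["tumour_1", "tumour_2"]
receivers = ["tcell_1", "tcell_2"]

score = lr_product_score(expr, senders, receivers, "LIG", "REC")
```

A product-style score is zero whenever either partner is unexpressed, which is the property that makes it a natural first filter before any significance testing.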
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- College of Life Sciences and Chemistry, Hunan University of Technology, 412007, Hunan, China
- Feixiang Wang
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Zhao Wang
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Jingwei Tan
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Li Huang
- Academy of Arts and Design, Tsinghua University, 10084, Beijing, China
- The Future Laboratory, Tsinghua University, 10084, Beijing, China
- Xiongfei Tian
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Guangyi Liu
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, 412007, Hunan, China
|
14
|
Ni J, Li L, Wang Y, Ji C, Zheng C. MDSCMF: Matrix Decomposition and Similarity-Constrained Matrix Factorization for miRNA-Disease Association Prediction. Genes (Basel) 2022; 13:1021. [PMID: 35741782 PMCID: PMC9223216 DOI: 10.3390/genes13061021] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/01/2022] [Revised: 06/01/2022] [Accepted: 06/02/2022] [Indexed: 11/16/2022] Open
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that are related to a number of complicated biological processes, and numerous studies have demonstrated that miRNAs are closely associated with many human diseases. In this study, we present a matrix decomposition and similarity-constrained matrix factorization (MDSCMF) method to predict potential miRNA-disease associations. First, we utilized a matrix decomposition (MD) algorithm to remove outliers from the miRNA-disease association matrix. Then, miRNA similarity was determined by utilizing similarity kernel fusion (SKF) to integrate miRNA functional similarity and Gaussian interaction profile (GIP) kernel similarity, and disease similarity was determined by utilizing SKF to integrate disease semantic similarity and GIP kernel similarity. Furthermore, we added L2 regularization terms and similarity constraint terms to non-negative matrix factorization to form a similarity-constrained matrix factorization (SCMF) algorithm, which was applied to make predictions. MDSCMF achieved AUC values of 0.9488, 0.9540, and 0.8672 based on fivefold cross-validation (5-CV), global leave-one-out cross-validation (global LOOCV), and local leave-one-out cross-validation (local LOOCV), respectively. Case studies on three common human diseases were also implemented to demonstrate the prediction ability of MDSCMF. All experimental results confirmed that MDSCMF was effective in predicting underlying associations between miRNAs and diseases.
Affiliation(s)
- Jiancheng Ni
- Network Information Center, Qufu Normal University, Qufu 273165, China
- Lei Li
- School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Yutian Wang
- School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Cunmei Ji
- School of Cyber Science and Engineering, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei 230601, China
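The Gaussian interaction profile (GIP) kernel similarity named in the MDSCMF abstract above can be illustrated with a minimal sketch. It assumes the commonly used formulation in which the similarity of two miRNAs is `exp(-gamma * ||a_i - a_j||^2)` over their rows of the binary association matrix, with `gamma` normalized by the mean squared row norm. The function name `gip_kernel`, the parameter `gamma_prime`, and the toy matrix are illustrative assumptions, not details taken from the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Return the GIP kernel matrix for the rows of a binary association matrix.

    profiles: list of equal-length 0/1 lists, one interaction profile per entity.
    gamma_prime: bandwidth constant (assumed default of 1.0).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    mean_sq = sum(sq_norms) / n
    if mean_sq == 0:
        raise ValueError("all interaction profiles are empty")
    # Normalize the bandwidth by the average squared profile norm.
    gamma = gamma_prime / mean_sq
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = kernel[j][i] = math.exp(-gamma * dist_sq)
    return kernel


# Toy association matrix: 3 miRNAs (rows) x 4 diseases (columns).
associations = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
mirna_similarity = gip_kernel(associations)
```

Applying the same function to the transposed matrix would give a disease-disease kernel; how the result is fused with other similarity measures is specific to each method and is not sketched here.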
|
15
|
Lou Z, Cheng Z, Li H, Teng Z, Liu Y, Tian Z. Predicting miRNA-disease associations via learning multimodal networks and fusing mixed neighborhood information. Brief Bioinform 2022; 23:6582005. [PMID: 35524503 DOI: 10.1093/bib/bbac159] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 02/10/2022] [Revised: 03/29/2022] [Accepted: 04/10/2022] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION In recent years, a large number of biological experiments have strongly shown that miRNAs play an important role in understanding disease pathogenesis. The discovery of miRNA-disease associations is beneficial for disease diagnosis and treatment. Since inferring these associations through biological experiments is time-consuming and expensive, researchers have sought to identify the associations utilizing computational approaches. Graph Convolutional Networks (GCNs), which exhibit excellent performance in link prediction problems, have been successfully used in miRNA-disease association prediction. However, GCNs only consider 1st-order neighborhood information at one layer but fail to capture information from high-order neighbors to learn miRNA and disease representations through information propagation. Therefore, how to aggregate information from high-order neighborhoods effectively in an explicit way is still challenging. RESULTS To address this challenge, we propose a novel method called mixed neighborhood information for miRNA-disease association (MINIMDA), which fuses mixed high-order neighborhood information of miRNAs and diseases in multimodal networks. First, MINIMDA constructs the integrated miRNA similarity network and the integrated disease similarity network from their respective multisource information. Then, the embedding representations of miRNAs and diseases are obtained by fusing mixed high-order neighborhood information from the multimodal networks, namely the integrated miRNA similarity network, the integrated disease similarity network and the miRNA-disease association network. Finally, we concatenate the multimodal embedding representations of miRNAs and diseases and feed them into a multilayer perceptron (MLP) to predict their underlying associations. Extensive experimental results show that MINIMDA is superior to other state-of-the-art methods overall. Moreover, the outstanding performance on case studies for esophageal cancer, colon tumor and lung cancer further demonstrates the effectiveness of MINIMDA. AVAILABILITY AND IMPLEMENTATION https://github.com/chengxu123/MINIMDA and http://120.79.173.96/.
Affiliation(s)
- Zhengzheng Lou
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhaoxu Cheng
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Hui Li
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhixia Teng
- College of Information and Computer Engineering, Northeast Forestry University, Harbin 150040, China
- Yang Liu
- Departments of Cerebrovascular Diseases, The Second Affiliated Hospital of Zhengzhou University, Zhengzhou 450000, China
- Zhen Tian
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
|
16
|
Zhou Y, Cui Q, Zhou Y. Screening and Comprehensive Analysis of Cancer-Associated tRNA-Derived Fragments. Front Genet 2022; 12:747931. [PMID: 35095997 PMCID: PMC8795687 DOI: 10.3389/fgene.2021.747931] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2021] [Accepted: 12/29/2021] [Indexed: 12/12/2022] Open
Abstract
tRNA-derived fragments (tRFs) constitute a novel class of small non-coding RNAs cleaved from tRNAs. In recent years, studies have shown the regulatory roles of a few tRFs in cancers, illuminating a new direction for tRF-centric cancer research. Nonetheless, more specific screening of tRFs related to oncogenesis pathways, cancer progression stages and cancer prognosis is still needed to reveal the landscape of cancer-associated tRFs. In this work, by combining the clinical information recorded in The Cancer Genome Atlas (TCGA) and the tRF expression profiles curated by MINTbase v2.0, we systematically screened 1,516 cancer-associated tRFs (ca-tRFs) across seven cancer types. The ca-tRF set combines the tRFs differentially expressed between cancer samples and control samples, the tRFs significantly correlated with tumor stage and the tRFs significantly correlated with patient survival. By incorporating our previous tRF-target dataset, we found that the ca-tRFs tend to target cancer-associated genes and onco-pathways such as ATF6-mediated unfolded protein response, angiogenesis, cell cycle process regulation, focal adhesion, the PI3K-Akt signaling pathway, cellular senescence and the FoxO signaling pathway across multiple cancer types. In addition, cell composition analysis implies that the expression of ca-tRFs is more likely to be correlated with T-cell infiltration. We also found that the ca-tRF expression pattern is informative for prognosis, suggesting plausible tRF-based cancer subtypes. Together, our systematic analysis demonstrates the potentially extensive involvement of tRFs in cancers, and provides a reasonable list of cancer-associated tRFs for further investigation.
Affiliation(s)
- Yiran Zhou
- MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- MOE Key Lab of Cardiovascular Sciences, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- Qinghua Cui
- MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- MOE Key Lab of Cardiovascular Sciences, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- Yuan Zhou
- MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- MOE Key Lab of Cardiovascular Sciences, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- *Correspondence: Yuan Zhou
|
17
|
Hu P, Huang YA, Mei J, Leung H, Chen ZH, Kuang ZM, You ZH, Hu L. Learning from low-rank multimodal representations for predicting disease-drug associations. BMC Med Inform Decis Mak 2021; 21:308. [PMID: 34736437 PMCID: PMC8567544 DOI: 10.1186/s12911-021-01648-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/20/2021] [Accepted: 10/06/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Disease-drug associations provide essential information for drug discovery and disease treatment. Many disease-drug associations remain unobserved or unknown, and trials to confirm these associations are time-consuming and expensive. To better understand and explore these valuable associations, it would be useful to develop computational methods for predicting unobserved disease-drug associations. With the advent of various datasets describing diseases and drugs, it has become more feasible to build a model describing the potential correlation between diseases and drugs. RESULTS In this work, we propose a new prediction method, called LMFDA, which works in several stages. First, it studies the drug chemical structure, disease MeSH descriptors, disease-related phenotypic terms, and drug-drug interactions. On this basis, similarity networks from different sources are constructed to enrich the representation of drugs and diseases. Based on the fused disease similarity network and drug similarity network, LMFDA calculates the association score of each disease-drug pair in the database. The method achieves good performance on Fdataset and Cdataset, with AUROCs of 91.6% and 92.1%, respectively, higher than those of many existing computational models. CONCLUSIONS The novelty of LMFDA lies in the introduction of multimodal fusion using low-rank tensors to fuse multiple similarity networks, combined with matrix completion to predict potential associations. We have demonstrated that LMFDA displays excellent network integration ability for accurate disease-drug association inference and achieves substantial improvement over the advanced approach. Overall, experimental results on two real-world network datasets demonstrate that LMFDA is able to deliver excellent detection performance. The results also suggest that refining similarity networks with as much domain knowledge as possible is a promising direction for drug repositioning.
Affiliation(s)
- Pengwei Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Yu-An Huang
- The Hong Kong Polytechnic University, Hong Kong SAR, China
- Henry Leung
- Electrical and Computer Engineering, University of Calgary, Calgary, Canada
- Zhan-Heng Chen
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China
- Ze-Min Kuang
- Beijing Anzhen Hospital of Capital Medical University, Beijing, China
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China.
- Lun Hu
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, China.
|
18
|
Wu Y, Zhu D, Wang X, Zhang S. An ensemble learning framework for potential miRNA-disease association prediction with positive-unlabeled data. Comput Biol Chem 2021; 95:107566. [PMID: 34534906 DOI: 10.1016/j.compbiolchem.2021.107566] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/08/2021] [Revised: 08/13/2021] [Accepted: 08/18/2021] [Indexed: 11/17/2022]
Abstract
To explore the pathogenic mechanisms of microRNAs (miRNAs) in diverse diseases, many researchers have concentrated on discovering the potential associations between miRNAs and diseases using machine learning methods. However, the prediction accuracy of supervised machine learning methods is limited by the lack of experimentally validated unrelated miRNA-disease pairs. Without these negative samples, training a highly accurate model is much more difficult. Unlike traditional miRNA-disease prediction models, which use randomly selected unknown samples as negative training samples, we propose an ensemble learning framework to solve this positive-unlabeled (PU) learning problem. The framework incorporates two steps: a novel semi-supervised K-means (SS-Kmeans) to extract reliable negative samples from unknown miRNA-disease pairs, and a subagging method to generate diverse training sample sets that make full use of those reliable negative samples for ensemble learning. Combined with an effective random vector functional link (RVFL) network as the prediction model, the proposed framework showed superior prediction accuracy compared with other popular approaches. A case study on lung and gastric neoplasms further confirms the framework's efficacy at identifying miRNA-disease associations.
Affiliation(s)
- Yao Wu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Donghua Zhu
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
- Xuefeng Wang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China.
- Shuo Zhang
- School of Management and Economics, Beijing Institute of Technology, Beijing 100081, China
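The subagging step described in the abstract above (drawing diverse training sets from a pool of reliable negatives) might be sketched as follows. This is only an illustration of subsample aggregating in general: each training set keeps every positive pair and adds a random subsample of negatives drawn without replacement. The balanced subsample size, the function name `subagging_sets`, and the toy pairs are assumptions, and the SS-Kmeans negative selection and RVFL base learners are not reproduced.

```python
import random


def subagging_sets(positives, reliable_negatives, n_sets, seed=0):
    """Build n_sets labeled training sets for an ensemble.

    Each set contains all positive pairs (label 1) plus a subsample of the
    reliable negatives (label 0) drawn without replacement. The subsample
    size equals the number of positives when enough negatives exist
    (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    k = min(len(positives), len(reliable_negatives))
    training_sets = []
    for _ in range(n_sets):
        negatives = rng.sample(reliable_negatives, k)  # no replacement within a set
        samples = [(pair, 1) for pair in positives] + [(pair, 0) for pair in negatives]
        rng.shuffle(samples)
        training_sets.append(samples)
    return training_sets


# Hypothetical (miRNA index, disease index) pairs.
known_pairs = [(0, 0), (1, 2), (3, 1)]
reliable_negative_pairs = [(0, 1), (0, 2), (1, 0), (2, 0), (2, 1), (3, 2)]
ensemble_inputs = subagging_sets(known_pairs, reliable_negative_pairs, n_sets=4)
```

One base model would then be trained per set and the predictions averaged; because the negative subsamples differ between sets, the ensemble sees more of the reliable-negative pool than any single model does.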
|
19
|
Ji C, Wang Y, Ni J, Zheng C, Su Y. Predicting miRNA-Disease Associations Based on Heterogeneous Graph Attention Networks. Front Genet 2021; 12:727744. [PMID: 34512733 PMCID: PMC8424198 DOI: 10.3389/fgene.2021.727744] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/19/2021] [Accepted: 08/02/2021] [Indexed: 11/23/2022] Open
Abstract
In recent years, more and more evidence has shown that microRNAs (miRNAs) play an important role in the regulation of post-transcriptional gene expression and are closely related to human diseases. Many studies have also revealed that miRNAs can serve as promising biomarkers for the diagnosis and treatment of human diseases. The interactions between miRNAs and human diseases have rarely been demonstrated, and the underlying mechanism of miRNAs is not clear. Therefore, computational approaches, which can not only save time and money but also improve the efficiency and accuracy of biological experiments, have attracted the attention of researchers. In this work, we propose a method based on heterogeneous graph attention networks (GAT) for miRNA-disease association prediction, named HGATMDA. We constructed a heterogeneous graph for miRNAs and diseases, and introduced weighted DeepWalk and GAT methods to extract features of miRNAs and diseases from the graph. Moreover, a fully connected neural network is used to predict correlation scores for miRNA-disease pairs. Experimental results under five-fold cross-validation (five-fold CV) showed that HGATMDA achieved better prediction performance than other state-of-the-art methods. In addition, we performed three case studies on breast neoplasms, lung neoplasms and kidney neoplasms. The results showed that for each of these three diseases, 50 out of the top 50 candidates were confirmed by the validation datasets. Therefore, HGATMDA is suitable as an effective tool to identify potential disease-related miRNAs.
Affiliation(s)
- Cunmei Ji
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Yutian Wang
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Jiancheng Ni
- School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Chunhou Zheng
- School of Artificial Intelligence, Anhui University, Hefei, China
- Yansen Su
- School of Artificial Intelligence, Anhui University, Hefei, China
|
20
|
Nie R, Li Z, You ZH, Bao W, Li J. Efficient framework for predicting MiRNA-disease associations based on improved hybrid collaborative filtering. BMC Med Inform Decis Mak 2021; 21:254. [PMID: 34461870 PMCID: PMC8406577 DOI: 10.1186/s12911-021-01616-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/08/2021] [Accepted: 08/23/2021] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Accumulating studies indicate that microRNAs (miRNAs) play vital roles in the development and progression of many complex human diseases. However, traditional biochemical experimental methods for identifying disease-related miRNAs cost a large amount of time, manpower, and material and financial resources. METHODS In this study, we developed a framework named hybrid collaborative filtering for miRNA-disease association prediction (HCFMDA) by integrating heterogeneous data, e.g., miRNA functional similarity, disease semantic similarity, known miRNA-disease association networks, and Gaussian kernel similarity of miRNAs and diseases. To capture the intrinsic interaction patterns embedded in the sparse association matrix, we prioritized the predictive score by fusing three types of information: similar disease associations, similar miRNA associations, and similar disease-miRNA associations. Meanwhile, singular value decomposition was adopted to reduce the impact of noise and accelerate prediction. RESULTS We then validated HCFMDA with leave-one-out cross-validation (LOOCV) and two types of case studies. In the LOOCV, we achieved an AUC (area under the curve) of 0.8379. To evaluate the performance of HCFMDA on real diseases, we further implemented the first type of case validation on three important human diseases: Colon Neoplasms, Esophageal Neoplasms and Prostate Neoplasms. As a result, 44, 46 and 44 out of the top 50 predicted disease-related miRNAs were confirmed by experimental evidence. Moreover, the second type of case validation on Breast Neoplasms indicates that HCFMDA could also be applied to predict potential miRNAs for diseases without any known associated miRNA. CONCLUSIONS The satisfactory prediction performance demonstrates that our model could serve as a reliable tool to guide subsequent research on identifying candidate miRNAs associated with human diseases.
Affiliation(s)
- Ru Nie
- Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
- Zhengwei Li
- Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China.
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China.
- Institute of Machine Learning and Systems Biology, College of Electronics and Information Engineering, Tongji University, Shanghai, 201804, China.
- KUNPAND Communications (Kunshan) Co., Ltd., Suzhou, 215300, China.
- Zhu-Hong You
- School of Computer Science, Northwestern Polytechnical University, Xi'an, 710072, China.
- Wenzheng Bao
- School of Information Engineering, Xuzhou University of Technology, Xuzhou, 221018, China
- Jiashu Li
- Engineering Research Center of Mine Digitalization of Ministry of Education, China University of Mining and Technology, Xuzhou, 221116, China
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
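The AUC values reported for cross-validation in the abstract above can be read as the probability that a held-out known association is scored higher than a randomly chosen unconfirmed pair. The sketch below computes AUC through that pairwise-ranking interpretation, counting ties as one half. The function name `auc_score` and the toy labels and scores are illustrative assumptions and do not reproduce any result from the listed papers.

```python
def auc_score(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted association scores, same length as labels.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: two held-out known associations among five candidate pairs.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)
```

The quadratic loop is fine for a toy case; for full association matrices a rank-based computation or an established library routine would be the more practical choice.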
|
21
|
SCMFMDA: Predicting microRNA-disease associations based on similarity constrained matrix factorization. PLoS Comput Biol 2021; 17:e1009165. [PMID: 34252084 PMCID: PMC8345837 DOI: 10.1371/journal.pcbi.1009165] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 04/05/2021] [Revised: 08/06/2021] [Accepted: 06/08/2021] [Indexed: 11/21/2022] Open
Abstract
miRNAs are small non-coding RNAs that are related to a number of complicated biological processes. Considerable studies have suggested that miRNAs are closely associated with many human diseases. In this study, we proposed a computational model based on Similarity Constrained Matrix Factorization for miRNA-Disease Association Prediction (SCMFMDA). In order to effectively combine different disease and miRNA similarity data, we applied the similarity network fusion algorithm to obtain integrated disease similarity (composed of disease functional similarity, disease semantic similarity and disease Gaussian interaction profile kernel similarity) and integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity and miRNA Gaussian interaction profile kernel similarity). In addition, L2 regularization terms and similarity constraint terms were added to the traditional Nonnegative Matrix Factorization algorithm to predict disease-related miRNAs. SCMFMDA achieved AUCs of 0.9675 and 0.9447 based on global leave-one-out cross-validation and five-fold cross-validation, respectively. Furthermore, case studies on two common human diseases were also implemented to demonstrate the prediction accuracy of SCMFMDA. Most of the top 50 predicted miRNAs were confirmed by experimental reports, indicating that SCMFMDA was effective for predicting relationships between miRNAs and diseases.
Considerable studies have suggested that miRNAs are closely associated with many human diseases, so predicting potential associations between miRNAs and diseases can contribute to the diagnosis and treatment of diseases. Several models for discovering unknown miRNA-disease associations make the prediction more productive and effective. We proposed SCMFMDA to obtain more accurate prediction results by applying similarity network fusion to fuse multi-source disease and miRNA information and utilizing similarity constrained matrix factorization to make predictions based on biological information. Global leave-one-out cross-validation and five-fold cross-validation were applied to evaluate our model. SCMFMDA achieved AUCs of 0.9675 and 0.9447, which were clearly higher than those of previous computational models. Furthermore, we implemented case studies on significant human diseases including colon neoplasms and lung neoplasms; 47 and 46 of the top 50 predictions were confirmed by experimental reports. All results indicated that SCMFMDA could be regarded as an effective way to discover unverified miRNA-disease connections.
|
22
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods for the prediction of MDAs. Based on graph theory, MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. This method models both the potential connections of the feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, for the first time, we considered six tasks simultaneously on the MDA prediction problem, which ensures that under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method achieves satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
Affiliation(s)
- Yanyi Chu
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Xuhong Wang
- School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
- Qiuying Dai
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Yanjing Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Qiankun Wang
- School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Shaoliang Peng
- College of Computer Science and Electronic Engineering, Hunan University, China
- Dennis Russell Salahub
- Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Dong-Qing Wei
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
|
23
|
Ji C, Gao Z, Ma X, Wu Q, Ni J, Zheng C. AEMDA: inferring miRNA-disease associations based on deep autoencoder. Bioinformatics 2021; 37:66-72. [PMID: 32726399 DOI: 10.1093/bioinformatics/btaa670] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 03/22/2020] [Revised: 05/27/2020] [Accepted: 07/20/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION MicroRNAs (miRNAs) are a class of non-coding RNAs that play critical roles in various biological processes. Many studies have shown that miRNAs are closely related to the occurrence, development and diagnosis of human diseases. Traditional biological experiments are costly and time consuming. As a result, effective computational models have become increasingly popular for predicting associations between miRNAs and diseases, which could effectively boost human disease diagnosis and prevention. RESULTS We propose a novel computational framework, called AEMDA, to identify associations between miRNAs and diseases. AEMDA applies a learning-based method to extract dense and high-dimensional representations of diseases and miRNAs from integrated disease semantic similarity, miRNA functional similarity and heterogeneous related interaction data. In addition, AEMDA adopts a deep autoencoder that does not need negative samples to retrieve the underlying associations between miRNAs and diseases. Furthermore, the reconstruction error is used as a measurement to predict disease-associated miRNAs. Our experimental results indicate that AEMDA can effectively predict disease-related miRNAs and outperforms state-of-the-art methods. AVAILABILITY AND IMPLEMENTATION The source code and data are available at https://github.com/CunmeiJi/AEMDA. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunmei Ji
- School of Software, Qufu Normal University, Qufu 273165, China
- Zhen Gao
- School of Software, Qufu Normal University, Qufu 273165, China
- Xu Ma
- School of Software, Qufu Normal University, Qufu 273165, China
- Qingwen Wu
- School of Software, Qufu Normal University, Qufu 273165, China
- Jiancheng Ni
- School of Software, Qufu Normal University, Qufu 273165, China
- Chunhou Zheng
- School of Software, Qufu Normal University, Qufu 273165, China
- School of Computer Science and Technology, Anhui University, Hefei 230601, China
|
24
|
Li HY, You ZH, Wang L, Yan X, Li ZW. DF-MDA: An effective diffusion-based computational model for predicting miRNA-disease association. Mol Ther 2021; 29:1501-1511. [PMID: 33429082 DOI: 10.1016/j.ymthe.2021.01.003] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 09/07/2020] [Revised: 12/21/2020] [Accepted: 01/01/2021] [Indexed: 12/28/2022] Open
Abstract
It is reported that microRNAs (miRNAs) play an important role in various human diseases. However, the mechanisms of miRNA in these diseases have not been fully understood. Therefore, detecting potential miRNA-disease associations has far-reaching significance for pathological development and the diagnosis and treatment of complex diseases. In this study, we propose a novel diffusion-based computational method, DF-MDA, for predicting miRNA-disease association based on the assumption that molecules are related to each other in human physiological processes. Specifically, we first construct a heterogeneous network by integrating various known associations among miRNAs, diseases, proteins, long non-coding RNAs (lncRNAs), and drugs. Then, more representative features are extracted through a diffusion-based machine-learning method. Finally, the Random Forest classifier is adopted to classify miRNA-disease associations. In the 5-fold cross-validation experiment, the proposed model obtained the average area under the curve (AUC) of 0.9321 on the HMDD v3.0 dataset. To further verify the prediction performance of the proposed model, DF-MDA was applied in three significant human diseases, including lymphoma, lung neoplasms, and colon neoplasms. As a result, 47, 46, and 47 out of top 50 predictions were validated by independent databases. These experimental results demonstrated that DF-MDA is a reliable and efficient method for predicting potential miRNA-disease associations.
Affiliation(s)
- Hao-Yuan Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
| | - Xin Yan
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China; School of Foreign Languages, Zaozhuang University, Zaozhuang, Shandong 277100, China.
| | - Zheng-Wei Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| |
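The area under the ROC curve (AUC) reported in the DF-MDA abstract above, and in several other entries of this list, can be read as the probability that a randomly chosen positive pair is scored higher than a randomly chosen negative pair. The following is a minimal sketch of that pairwise definition, not the evaluation code of any cited paper; the labels and scores are hypothetical values chosen only for illustration.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of prediction scores (higher means more likely positive).
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical miRNA-disease pairs: 1 = known association, 0 = unlabeled.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.65]
example_auc = roc_auc(example_labels, example_scores)
print(f"AUC = {example_auc:.4f}")
```

In a k-fold cross-validation setting, a value like this would typically be computed on each held-out fold and then averaged; the exact protocol of each cited paper is not given in its abstract.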
|
25
|
Li L, Gao Z, Zheng CH, Wang Y, Wang YT, Ni JC. SNFIMCMDA: Similarity Network Fusion and Inductive Matrix Completion for miRNA-Disease Association Prediction. Front Cell Dev Biol 2021; 9:617569. [PMID: 33634120 PMCID: PMC7900415 DOI: 10.3389/fcell.2021.617569] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2020] [Accepted: 01/05/2021] [Indexed: 02/05/2023] Open
Abstract
MicroRNAs (miRNAs), a class of non-coding RNAs, have been verified to be closely associated with several complicated biological processes and human diseases. In this study, we proposed a novel model, Similarity Network Fusion and Inductive Matrix Completion for miRNA-Disease Association Prediction (SNFIMCMDA). We applied the inductive matrix completion (IMC) method to identify possible associations between miRNAs and diseases and to obtain the corresponding correlation scores. IMC was performed based on the verified miRNA-disease connections, miRNA similarity, and disease similarity. In addition, miRNA similarity and disease similarity were calculated by similarity network fusion, which can integrate multiple data types. We integrated miRNA functional similarity and Gaussian interaction profile kernel similarity by similarity network fusion to obtain miRNA similarity. Disease similarity was integrated in the same way. To demonstrate the utility and effectiveness of SNFIMCMDA, we applied both global leave-one-out cross-validation and five-fold cross-validation to validate our model. Furthermore, case studies on three significant human diseases were carried out to further assess the effectiveness of SNFIMCMDA. The results demonstrated that SNFIMCMDA was effective for predicting possible miRNA-disease associations.
Affiliation(s)
- Lei Li
- School of Software, Qufu Normal University, Qufu, China
| | - Zhen Gao
- School of Software, Qufu Normal University, Qufu, China
| | - Chun-Hou Zheng
- School of Software, Qufu Normal University, Qufu, China
- School of Computer Science and Technology, Anhui University, Hefei, China
| | - Yu Wang
- School of Software, Qufu Normal University, Qufu, China
| | - Yu-Tian Wang
- School of Software, Qufu Normal University, Qufu, China
| | - Jian-Cheng Ni
- School of Software, Qufu Normal University, Qufu, China
| |
|
26
|
Ding Y, Lei X, Liao B, Wu FX. Machine learning approaches for predicting biomolecule-disease associations. Brief Funct Genomics 2021; 20:273-287. [PMID: 33554238 DOI: 10.1093/bfgp/elab002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022] Open
Abstract
Biomolecules, such as microRNAs, circRNAs, lncRNAs and genes, are functionally interdependent in human cells, and all play critical roles in diverse fundamental and vital biological processes. The dysregulations of such biomolecules can cause diseases. Identifying the associations between biomolecules and diseases can uncover the mechanisms of complex diseases, which is conducive to their diagnosis, treatment, prognosis and prevention. Due to the time consumption and cost of biologically experimental methods, many computational association prediction methods have been proposed in the past few years. In this study, we provide a comprehensive review of machine learning-based approaches for predicting disease-biomolecule associations with multi-view data sources. Firstly, we introduce some databases and general strategies for integrating multi-view data sources in the prediction models. Then we discuss several feature representation methods for machine learning-based prediction models. Thirdly, we comprehensively review machine learning-based prediction approaches in three categories: basic machine learning methods, matrix completion-based methods and deep learning-based methods, while discussing their advantages and disadvantages. Finally, we provide some perspectives for further improving biomolecule-disease prediction methods.
Affiliation(s)
- Yulian Ding
- Division of Biomedical Engineering at the University of Saskatchewan
| | - Xiujuan Lei
- School of Computer Science at Shaanxi Normal University
| | - Bo Liao
- School of Mathematics and Statistics at Hainan Normal University, Haikou, China
| | - Fang-Xiang Wu
- College of Engineering and the Department of Computer Science at University of Saskatchewan
| |
|
27
|
Liu L, Zhang J, Liu Y. MicroRNA-1323 serves as a biomarker in gestational diabetes mellitus and aggravates high glucose-induced inhibition of trophoblast cell viability by suppressing TP53INP1. Exp Ther Med 2021; 21:230. [PMID: 33603839 PMCID: PMC7851622 DOI: 10.3892/etm.2021.9661] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 06/24/2020] [Accepted: 12/14/2020] [Indexed: 12/21/2022] Open
Abstract
Gestational diabetes mellitus (GDM) leads to poor pregnancy outcomes, and microRNAs (miRNAs/miRs) have been suggested to be associated with GDM, but the pathological mechanisms remain unclear. The present study aimed to investigate the diagnostic value of miR-1323 in GDM patients and its effects on trophoblast cell viability. Additionally, the present study investigated the correlation between miR-1323 and TP53INP1 to understand the pathological mechanism of GDM progression. Reverse transcription-quantitative polymerase chain reaction was used to detect miR-1323 expression and TP53INP1 mRNA expression. The diagnostic value of serum miR-1323 was evaluated by receiver operating characteristic analysis. HTR-8/SVneo and BeWo cells were treated with high glucose (HG) to construct cell models of GDM, and trophoblast cell viability was assessed using an MTT assay. The protein expression of TP53INP1 was detected by western blot analysis. The correlation between miR-1323 and TP53INP1 was investigated by luciferase reporter assay. miR-1323 expression was increased in patients with GDM, showed relatively high diagnostic accuracy for GDM screening, and was positively correlated with fasting blood glucose in patients with GDM. HG upregulated miR-1323 expression and inhibited trophoblast cell viability. Overexpression of miR-1323 significantly inhibited the viability of HG-induced trophoblast cells. TP53INP1, a target gene of miR-1323, was negatively correlated with miR-1323. TP53INP1 overexpression reversed the inhibitory effect of miR-1323 overexpression on the viability of HG-treated trophoblast cells. Increased levels of serum miR-1323 may be a diagnostic biomarker for GDM. Additionally, miR-1323 may inhibit trophoblast cell viability by inhibiting TP53INP1, suggesting that it may be a potential therapeutic target for GDM.
Affiliation(s)
- Lijun Liu
- Department of Gynecology, Weifang Maternal and Child Health Hospital, Weifang, Shandong 261011, P.R. China
| | - Jun Zhang
- Department of Pharmacy, Weifang Maternal and Child Health Hospital, Weifang, Shandong 261011, P.R. China
| | - Yujuan Liu
- Department of Central Supply Room, Weifang Maternal and Child Health Hospital, Weifang, Shandong 261011, P.R. China
| |
|
28
|
Dong Y, Sun Y, Qin C, Zhu W. EPMDA: Edge Perturbation Based Method for miRNA-Disease Association Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2170-2175. [PMID: 31514148 DOI: 10.1109/tcbb.2019.2940182] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 06/10/2023]
Abstract
In recent years, a large body of research has shown that microRNA (miRNA) is likely to be involved in the formation of many human diseases. Effectively predicting potential associations between miRNAs and diseases therefore helps to understand the development and treatment of diseases. In this study, an edge perturbation based method is proposed for predicting potential miRNA-disease association (EPMDA). Unlike previous studies, we design a feature vector to describe each edge of a graph by structural Hamiltonian information. Moreover, the extracted features are used to train a multi-layer perceptron model to predict the candidate disease-miRNA associations. The experimental results on the HMDD dataset show that EPMDA achieves an AUC value of 0.9818 through 5-fold cross-validation, which improves the AUC values by approximately 3.5 percent compared to the latest method DeepMDA. For the leave-one-disease-out cross-validation, EPMDA achieves an AUC value of 0.9371, which improves the AUC values by approximately 7.4 percent compared to DeepMDA. In the case study, we verify the prediction performance of EPMDA on three human diseases. Of the top 50 predicted miRNAs for these three diseases, 42, 46, and 41, respectively, are confirmed by published experimental discoveries.
|
29
|
Li J, Chen X, Huang Q, Wang Y, Xie Y, Dai Z, Zou X, Li Z. Seq-SymRF: a random forest model predicts potential miRNA-disease associations based on information of sequences and clinical symptoms. Sci Rep 2020; 10:17901. [PMID: 33087810 PMCID: PMC7578641 DOI: 10.1038/s41598-020-75005-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/30/2020] [Accepted: 10/09/2020] [Indexed: 12/24/2022] Open
Abstract
Increasing evidence indicates that miRNAs play a vital role in biological processes and are closely related to various human diseases. Research on miRNA-disease associations is helpful not only for disease prevention, diagnosis and treatment, but also for new drug identification and lead compound discovery. A novel sequence- and symptom-based random forest algorithm model (Seq-SymRF) was developed to identify potential associations between miRNA and disease. Features derived from sequence information and clinical symptoms were utilized to characterize miRNA and disease, respectively. Moreover, the clustering method by calculating the Euclidean distance was adopted to construct reliable negative samples. Based on the fivefold cross-validation, Seq-SymRF achieved the accuracy of 98.00%, specificity of 99.43%, sensitivity of 96.58%, precision of 99.40% and Matthews correlation coefficient of 0.9604, respectively. The areas under the receiver operating characteristic curve and precision recall curve were 0.9967 and 0.9975, respectively. Additionally, case studies were implemented with leukemia, breast neoplasms and hsa-mir-21. Most of the top-25 predicted disease-related miRNAs (19/25 for leukemia; 20/25 for breast neoplasms) and 15 of top-25 predicted miRNA-related diseases were verified by literature and dbDEMC database. It is anticipated that Seq-SymRF could be regarded as a powerful high-throughput virtual screening tool for drug research and development. All source codes can be downloaded from https://github.com/LeeKamlong/Seq-SymRF.
Affiliation(s)
- Jinlong Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
| | - Xingyu Chen
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
| | - Qixing Huang
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
| | - Yang Wang
- School of Chemistry, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China
| | - Yun Xie
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China
| | - Zong Dai
- School of Chemistry, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China
| | - Xiaoyong Zou
- School of Chemistry, Sun Yat-Sen University, Guangzhou, 510275, People's Republic of China.
| | - Zhanchao Li
- School of Chemistry and Chemical Engineering, Guangdong Pharmaceutical University, Guangzhou, 510006, People's Republic of China. .,Key Laboratory of Digital Quality Evaluation of Chinese Materia Medica of State Administration of Traditional Chinese Medicine, Guangzhou, 510006, People's Republic of China.
| |
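The Seq-SymRF abstract above reports accuracy, specificity, sensitivity, precision and the Matthews correlation coefficient (MCC). As a reminder of how these quantities relate to one another, here is a small sketch that derives all five from the four cells of a binary confusion matrix. The function name `binary_metrics` and the counts are hypothetical and are not taken from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall, true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "mcc": mcc,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells symmetrically, which is one reason it is often reported alongside accuracy when the negative set is constructed rather than observed.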
|
30
|
Abstract
Systematics is described for annotation of variations in RNA molecules. The conceptual framework is part of Variation Ontology (VariO) and facilitates depiction of types of variations, their functional and structural effects and other consequences in any RNA molecule in any organism. There are more than 150 RNA related VariO terms in seven levels, which can be further combined to generate even more complicated and detailed annotations. The terms are described together with examples, usually for variations and effects in human and in diseases. RNA variation type has two subcategories: variation classification and origin with subterms. Altogether six terms are available for function description. Several terms are available for affected RNA properties. The ontology contains also terms for structural description for affected RNA type, post-transcriptional RNA modifications, secondary and tertiary structure effects and RNA sugar variations. Together with the DNA and protein concepts and annotations, RNA terms allow comprehensive description of variations of genetic and non-genetic origin at all possible levels. The VariO annotations are readable both for humans and computer programs for advanced data integration and mining.
Affiliation(s)
- Mauno Vihinen
- Department of Experimental Medical Science, Lund University, Lund, Sweden
| |
|
31
|
FCGCNMDA: predicting miRNA-disease associations by applying fully connected graph convolutional networks. Mol Genet Genomics 2020; 295:1197-1209. [DOI: 10.1007/s00438-020-01693-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/02/2023]
|
32
|
Chen H, Guo R, Li G, Zhang W, Zhang Z. Comparative analysis of similarity measurements in miRNAs with applications to miRNA-disease association predictions. BMC Bioinformatics 2020; 21:176. [PMID: 32366225 PMCID: PMC7199309 DOI: 10.1186/s12859-020-3515-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/12/2019] [Accepted: 04/23/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND As regulators of gene expression, microRNAs (miRNAs) are increasingly recognized as critical biomarkers of human diseases. Till now, a series of computational methods have been proposed to predict new miRNA-disease associations based on similarity measurements. Different categories of features in miRNAs are applied in these methods for miRNA-miRNA similarity calculation. Benchmarking tests on these miRNA similarity measures are warranted to assess their effectiveness and robustness. RESULTS In this study, 5 categories of features, i.e. miRNA sequences, miRNA expression profiles in cell-lines, miRNA expression profiles in tissues, gene ontology (GO) annotations of miRNA target genes and Medical Subject Heading (MeSH) terms of miRNA-associated diseases, are collected and similarity values between miRNAs are quantified based on these feature spaces, respectively. We systematically compare the 5 similarities from multi-statistical views. Furthermore, we adopt a rule-based inference method to test their performance on miRNA-disease association predictions with the similarity measurements. Comprehensive comparison is made based on leave-one-out cross-validations and a case study. Experimental results demonstrate that the similarity measurement using MeSH terms performs best among the 5 measurements. It should be noted that the other 4 measurements can also achieve reliable prediction performance. The best-performed similarity measurement is used for new miRNA-disease association predictions and the inferred results are released for further biomedical screening. CONCLUSIONS Our study suggests that all the 5 features, even though some are restricted by data availability, are useful information for inferring novel miRNA-disease associations. However, biased prediction results might be produced in GO- and MeSH-based similarity measurements due to incomplete feature spaces. Similarity fusion may help produce more reliable prediction results. 
We expect that future studies will provide more detailed information into the 5 feature spaces and widen our understanding about disease pathogenesis.
Affiliation(s)
- Hailin Chen
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| | - Ruiyu Guo
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| | - Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, 330013 China
| | - Wei Zhang
- School of Science, East China Jiaotong University, Nanchang, 330013 China
| | - Zuping Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083 China
| |
|
33
|
Zhang Y, Chen M, Cheng X, Wei H. MSFSP: A Novel miRNA-Disease Association Prediction Model by Federating Multiple-Similarities Fusion and Space Projection. Front Genet 2020; 11:389. [PMID: 32425980 PMCID: PMC7204399 DOI: 10.3389/fgene.2020.00389] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/28/2020] [Accepted: 03/27/2020] [Indexed: 12/11/2022] Open
Abstract
Growing evidence indicates that microRNAs (miRNAs) play a significant role in many important bioprocesses; their mutations and disorders can cause various complex diseases. Predicting miRNAs associated with underlying diseases via computational approaches helps to identify biomarkers and discover specific medicines, which can greatly reduce the cost of diagnosis, cure, prognosis, and prevention of human diseases. However, achieving a more reliable prediction of potential miRNA-disease associations through effective integration of different biological data remains a challenge for researchers. In this study, we proposed a computational model that uses a federated method of combined multiple-similarities fusion and space projection (MSFSP). MSFSP first fused the integrated disease similarity (composed of disease semantic similarity, disease functional similarity, and disease Hamming similarity) with the integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity, and miRNA Hamming similarity). Second, it constructed the weighted network of miRNA-disease associations from the experimentally verified Boolean network of miRNA-disease associations by using similarity networks. Finally, it calculated the prediction results by weighting the miRNA space projection scores and the disease space projection scores. Leave-one-out cross-validation demonstrated that MSFSP has distinguished predictive accuracy, with an area under the receiver operating characteristic curve (AUC) of 0.9613, better than that of five other existing models. In case studies, the predictive ability of MSFSP was further confirmed: 96 and 98% of the top 50 predictions for prostatic neoplasms and lung neoplasms, respectively, were successfully validated by experimental evidence, and supporting experimental evidence was also found for 100% of the top 50 predictions for isolated diseases.
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Xiaohui Cheng
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Hanyan Wei
- School of Pharmacy, Guilin Medical University, Guilin, China
| |
|
34
|
Wu Q, Wang Y, Gao Z, Ni J, Zheng C. MSCHLMDA: Multi-Similarity Based Combinative Hypergraph Learning for Predicting MiRNA-Disease Association. Front Genet 2020; 11:354. [PMID: 32351545 PMCID: PMC7174776 DOI: 10.3389/fgene.2020.00354] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 11/22/2019] [Accepted: 03/23/2020] [Indexed: 12/17/2022] Open
Abstract
Accumulating biological and clinical evidence has confirmed the important associations between microRNAs (miRNAs) and a variety of human diseases. Predicting disease-related miRNAs is beneficial for understanding the molecular mechanisms of pathological conditions at the miRNA level, and for facilitating the discovery of new biomarkers for prevention, diagnosis and treatment of complex human diseases. However, the challenge for researchers is to establish methods that can effectively combine different datasets and make reliable predictions. In this work, we propose the method of Multi-Similarity based Combinative Hypergraph Learning for Predicting MiRNA-disease Association (MSCHLMDA). In this method, complex features were extracted by two measures for each miRNA-disease pair. Then, the K-nearest neighbor (KNN) and K-means algorithms were used to construct two different hypergraphs. Finally, results from combinative hypergraph learning were used for predicting miRNA-disease associations. To evaluate the prediction performance of our method, leave-one-out cross validation and 5-fold cross validation were implemented, showing that our method had significantly improved prediction performance compared to previously used methods. Moreover, three case studies on different complex human diseases were performed, which further demonstrated the predictive performance of MSCHLMDA. It is anticipated that MSCHLMDA would become an excellent complement to the biomedical research field in the future.
Affiliation(s)
- Qingwen Wu
- School of Software, Qufu Normal University, Qufu, China
| | - Yutian Wang
- School of Software, Qufu Normal University, Qufu, China
| | - Zhen Gao
- School of Software, Qufu Normal University, Qufu, China
| | - Jiancheng Ni
- School of Software, Qufu Normal University, Qufu, China
| | - Chunhou Zheng
- School of Software, Qufu Normal University, Qufu, China.,School of Computer Science and Technology, Anhui University, Hefei, China
| |
|
35
|
Fan Y, Cui J, Zhu Q. Heterogeneous graph inference based on similarity network fusion for predicting lncRNA-miRNA interaction. RSC Adv 2020; 10:11634-11642. [PMID: 35496629 PMCID: PMC9050493 DOI: 10.1039/c9ra11043g] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/31/2019] [Accepted: 03/14/2020] [Indexed: 12/28/2022] Open
Abstract
LncRNA and miRNA are two non-coding RNA types that are popular in current research. LncRNA interacts with miRNA to regulate gene transcription, further affecting human health and disease. Accurate identification of lncRNA-miRNA interactions contributes to the in-depth study of the biological functions and mechanisms of non-coding RNA. However, relying on biological experiments to obtain interaction information is time-consuming and expensive. Considering the rapid accumulation of gene information and the small number of available computational methods, effective computational models for predicting lncRNA-miRNA interactions are urgently needed. In this work, we propose a heterogeneous graph inference method based on similarity network fusion (SNFHGILMI) to predict potential lncRNA-miRNA interactions. First, we calculated multiple similarity data, including lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian kernel similarity, and miRNA Gaussian kernel similarity. Second, the similarity network fusion method was employed to integrate the data and obtain the similarity networks of lncRNA and miRNA. Then, we constructed a bipartite network by combining the known interaction network and the similarity networks of lncRNA and miRNA. Finally, the heterogeneous graph inference method was introduced to construct a prediction model. On the real dataset, the model SNFHGILMI achieved AUCs of 0.9501 and 0.9426 ± 0.0035 based on LOOCV and 5-fold cross validation, respectively. Furthermore, case studies also demonstrate that SNFHGILMI is a high-performance prediction method that can accurately predict new lncRNA-miRNA interactions. The Matlab code and readme file of SNFHGILMI can be downloaded from https://github.com/cj-DaSE/SNFHGILMI.
Affiliation(s)
- Yongxian Fan
- School of Computer and Information Security, Guilin University of Electronic Technology Guilin 541004 China
| | - Juan Cui
- School of Computer and Information Security, Guilin University of Electronic Technology Guilin 541004 China
| | - QingQi Zhu
- School of Computer and Information Security, Guilin University of Electronic Technology Guilin 541004 China
| |
|
36
|
Association extraction from biomedical literature based on representation and transfer learning. J Theor Biol 2020; 488:110112. [DOI: 10.1016/j.jtbi.2019.110112] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 07/11/2019] [Accepted: 12/08/2019] [Indexed: 12/17/2022]
|
37
|
Gao Z, Wang YT, Wu QW, Ni JC, Zheng CH. Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction. BMC Bioinformatics 2020; 21:61. [PMID: 32070280 PMCID: PMC7029547 DOI: 10.1186/s12859-020-3409-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/28/2019] [Accepted: 02/11/2020] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND The aberrant expression of microRNAs is closely connected to the occurrence and development of a great many human diseases. To study human diseases, numerous effective computational models that are valuable and meaningful have been presented by researchers. RESULTS Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring possible human disease-connected miRNAs. First, manually validated disease-connected microRNAs were integrated, and microRNA functional similarity information along with two kinds of disease semantic similarities were calculated. Next, we measured Gaussian interaction profile (GIP) kernel similarities for both diseases and microRNAs. Then, we adopted a preprocessing step, namely, weighted K nearest known neighbours (WKNKN), to decrease the sparsity of the miRNA-disease association matrix network. Finally, the GRL2,1-NMF framework was used to predict links between microRNAs and diseases. CONCLUSIONS The new method (GRL2,1-NMF) achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that GRL2,1-NMF can powerfully discover potential disease-related miRNAs, even if there is no known associated disease.
Affiliation(s)
- Zhen Gao
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Yu-Tian Wang
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Qing-Wen Wu
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Jian-Cheng Ni
- School of Software, Qufu Normal University, Qufu, 273165, China.
| | - Chun-Hou Zheng
- School of Software, Qufu Normal University, Qufu, 273165, China.
| |
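The Gaussian interaction profile (GIP) kernel similarity named in the abstract above (Gao et al.), and in several other entries of this list, is commonly defined on the rows or columns of the binary association matrix as K(i, j) = exp(-γ‖IP(i) - IP(j)‖²), where IP(i) is the interaction profile of entity i. The sketch below uses the widely used bandwidth normalization γ = γ′ / mean‖IP(i)‖²; that normalization, the parameter name `gamma_prime`, and the toy matrix are assumptions for illustration and are not taken from the cited paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between binary interaction profiles.

    profiles: list of equal-length 0/1 lists, one per entity (e.g. one row of
    the miRNA-by-disease association matrix per miRNA).
    gamma_prime: bandwidth parameter (assumed default of 1.0).
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical association matrix: rows are miRNAs, columns are diseases.
association = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
mirna_similarity = gip_kernel(association)
# Disease similarity uses the columns (the transposed matrix) as profiles.
disease_similarity = gip_kernel([list(col) for col in zip(*association)])
```

Because the kernel depends only on known associations, an entity with an all-zero profile receives no informative similarity, which is one motivation for sparsity-reducing preprocessing steps such as the WKNKN step mentioned in the abstract.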
|
38
|
Wu M, Yang Y, Wang H, Ding J, Zhu H, Xu Y. IMPMD: An Integrated Method for Predicting Potential Associations Between miRNAs and Diseases. Curr Genomics 2020; 20:581-591. [PMID: 32581646 PMCID: PMC7290057 DOI: 10.2174/1389202920666191023090215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 10/16/2019] [Indexed: 01/06/2023] Open
Abstract
Background With the rapid development of biological research, microRNAs (miRNAs) have increasingly attracted worldwide attention. A growing number of biological studies and scientific experiments have shown that miRNAs are related to the occurrence and development of a large number of key biological processes which cause complex human diseases. Thus, identifying the associations between miRNAs and diseases is helpful for diagnosing diseases. Although some studies have found a considerable number of associations between miRNAs and diseases, many associations still need to be identified. Experimental methods to uncover miRNA-disease associations are time-consuming and expensive. Therefore, effective computational methods are urgently needed to predict new associations. Methodology In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity and Jaccard similarity. For diseases, it is obtained by combining semantic similarity, Gaussian similarity and Jaccard similarity. Then, we use these two enhanced similarities to construct the features and calculate a cumulative score to choose robust features. Finally, general linear regression is applied to assign weights to the Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms. Results IMPMD obtains an AUC of 0.9386 in 10-fold cross-validation, which is better than most of the previous models. To further evaluate our model, we implement IMPMD on two types of case studies for lung cancer and breast cancer. 49 (Lung Cancer) and 50 (Breast Cancer) out of the top 50 related miRNAs are validated by experimental discoveries. Conclusion We built a software named IMPMD which can be freely downloaded from https://github.com/Sunmile/IMPMD.
Affiliation(s)
- Meiqi Wu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Yingxi Yang
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Hui Wang
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Jun Ding
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Huan Zhu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Yan Xu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| |
|
39
|
Guo ZH, You ZH, Yi HC. Integrative Construction and Analysis of Molecular Association Network in Human Cells by Fusing Node Attribute and Behavior Information. MOLECULAR THERAPY-NUCLEIC ACIDS 2019; 19:498-506. [PMID: 31923739 PMCID: PMC6951835 DOI: 10.1016/j.omtn.2019.10.046] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/28/2019] [Revised: 10/07/2019] [Accepted: 10/21/2019] [Indexed: 11/27/2022]
Abstract
Detecting whether a pair of biomolecules associate is of great significance in the study of molecular biology. Hence, computational methods are urgently needed as guidance for practice. However, most previous prediction models, influenced by reductionism, focused on isolated research objects and therefore have inherent defects. Inspired by holism, a machine-learning-based framework called MAN-node2vec is proposed to predict multi-type relationships in the molecular association network (MAN). Specifically, we constructed a large-scale MAN composed of 1,023 miRNAs, 1,649 proteins, 769 long non-coding RNAs (lncRNAs), 1,025 drugs, and 2,062 diseases. Each biomolecule in the MAN can then be represented as a vector by its attributes, learned by k-mer and similar encodings, and its behavior, learned by node2vec. Finally, a random forest classifier is applied to carry out the relationship prediction task. The proposed model achieved reliable performance, with an area under the curve (AUC) of 0.9677 and an area under the precision-recall curve (AUPR) of 0.9562 under 5-fold cross-validation. Additional experiments showed that the proposed global model is more competitive than the traditional local method. Together, these results provide systematic insight into the synergistic interactions between various molecules and diseases. It is anticipated that this work can inspire and advance related systems biology and biomedical research.
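The k-mer attribute encoding mentioned in this abstract can be illustrated with a short sketch. This is a generic k-mer frequency vector, not the MAN-node2vec code: the function name, the choice of k = 3 and the RNA alphabet "ACGU" are assumptions made for the example.

```python
from itertools import product

def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return normalised k-mer frequencies for a nucleotide sequence."""
    # One entry per possible k-mer, in lexicographic order of the alphabet.
    counts = {"".join(p): 0 for p in product(alphabet, repeat=k)}
    sequence = sequence.upper()
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
    total = sum(counts.values())
    return {kmer: (n / total if total else 0.0) for kmer, n in counts.items()}

# Toy sequence, used only to show the shape of the output.
features = kmer_frequencies("AUGAUG", k=3)
```

The resulting dictionary has 4^k entries, so its values can be used directly as a fixed-length attribute vector for a classifier.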
Affiliation(s)
- Zhen-Hao Guo
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China.
| | - Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
| |
|
40
|
Pan X, Shen HB. Inferring Disease-Associated MicroRNAs Using Semi-supervised Multi-Label Graph Convolutional Networks. iScience 2019; 20:265-277. [PMID: 31605942 PMCID: PMC6817654 DOI: 10.1016/j.isci.2019.09.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/11/2019] [Revised: 09/05/2019] [Accepted: 09/11/2019] [Indexed: 01/22/2023] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in biological processes involved in diseases. The associations between diseases and protein-coding genes (PCGs) have been well investigated, and miRNAs interact with PCGs to trigger them to be functional. We present a computational method, DimiG, to infer miRNA-associated diseases using a semi-supervised Graph Convolutional Network model (GCN). DimiG uses a multi-label framework to integrate PCG-PCG interactions, PCG-miRNA interactions, PCG-disease associations, and tissue expression profiles. DimiG is trained on disease-PCG associations and an interaction network using a GCN, which is further used to score associations between diseases and miRNAs. We evaluate DimiG on a benchmark set from verified disease-miRNA associations. Our results demonstrate that DimiG outperforms the best unsupervised method and is comparable to two supervised methods. Three case studies of prostate cancer, lung cancer, and inflammatory bowel disease further demonstrate the efficacy of DimiG, where top miRNAs predicted by DimiG are supported by literature.
Affiliation(s)
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China; Department of Medical informatics, Erasmus Medical Center, 3015 CE Rotterdam, the Netherlands.
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China.
| |
|
41
|
Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol 2019; 20:202. [PMID: 31594544 PMCID: PMC6781296 DOI: 10.1186/s13059-019-1811-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/17/2019] [Accepted: 09/03/2019] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. RESULTS Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. CONCLUSION Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.
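The precision-recall evaluation described in this abstract can be sketched with average precision, a common step-wise estimate of the area under the precision-recall curve (AUPRC). This is an illustrative assumption about the metric, not the benchmark's own evaluation code; the function name and the toy scores are hypothetical, and tied scores are ranked in input order.

```python
def average_precision(scores, labels):
    """Average precision of a ranking: mean precision at each positive hit."""
    total_pos = sum(1 for label in labels if label)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    # Rank candidate associations from highest to lowest predicted score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos

# Toy example: four candidate miRNA-disease pairs, two of them true.
ap = average_precision([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Unlike AUC, this score drops quickly when true associations are ranked below many false candidates, which is why it suits heavily imbalanced association data.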
Affiliation(s)
- Zhou Huang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Leibo Liu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
| | - Yuanxu Gao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China.
| | - Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China.
| |
|
42
|
Yang Y, Fu X, Qu W, Xiao Y, Shen HB. MiRGOFS: a GO-based functional similarity measurement for miRNAs, with applications to the prediction of miRNA subcellular localization and miRNA-disease association. Bioinformatics 2019; 34:3547-3556. [PMID: 29718114 DOI: 10.1093/bioinformatics/bty343] [Citation(s) in RCA: 45] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2017] [Accepted: 04/26/2018] [Indexed: 01/22/2023] Open
Abstract
Motivation Benefiting from high-throughput experimental technologies, whole-genome analysis of microRNAs (miRNAs) has become increasingly common for uncovering important regulatory roles of miRNAs and identifying miRNA biomarkers for disease diagnosis. As complementary information to high-throughput experimental data, domain knowledge such as the Gene Ontology and KEGG pathways is usually used to guide gene function analysis. However, functional annotation for miRNAs is scarce in the public databases. So far, only a few methods have been proposed for measuring the functional similarity between miRNAs based on public annotation data, and these methods cover a very limited number of miRNAs, so they are not applicable to large-scale miRNA analysis. Results In this paper, we propose a new method to measure the functional similarity of miRNAs, called miRGOFS, which has two notable features: (i) it adopts a new GO semantic similarity metric which considers both common ancestors and descendants of GO terms; (ii) it computes similarity between GO sets in an asymmetric manner, and weights each GO term by its statistical significance. The miRGOFS-based predictor achieves an F1 of 61.2% on a benchmark dataset of miRNA localization, and AUC values of 87.7% and 81.1% on two benchmark sets of miRNA-disease association, respectively. Compared with the existing functional similarity measurements of miRNAs, miRGOFS has the advantages of higher accuracy and larger coverage of human miRNAs (over 1000 miRNAs). Availability and implementation http://www.csbio.sjtu.edu.cn/bioinf/MiRGOFS/. Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yang Yang
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai, China
| | - Xiaofeng Fu
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Wenhao Qu
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Yiqun Xiao
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, China
| |
|
43
|
Drug-Drug Interaction Predicting by Neural Network Using Integrated Similarity. Sci Rep 2019; 9:13645. [PMID: 31541145 PMCID: PMC6754439 DOI: 10.1038/s41598-019-50121-3] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/09/2019] [Accepted: 09/06/2019] [Indexed: 01/04/2023] Open
Abstract
Drug-Drug Interaction (DDI) prediction is one of the most critical issues in drug development and health. Proposing appropriate computational methods for predicting unknown DDIs with high precision is challenging. We proposed "NDD: Neural network-based method for drug-drug interaction prediction" for predicting unknown DDIs using various information about drugs. Multiple drug similarities based on drug substructure, target, side effect, off-label side effect, pathway, transporter, and indication data are calculated. NDD first uses a heuristic similarity selection process and then integrates the selected similarities with a nonlinear similarity fusion method to obtain high-level features. Afterward, it uses a neural network for interaction prediction. The similarity selection and similarity integration parts of NDD have been proposed in previous studies of other problems. Our novelty is to combine these parts with a new neural network architecture and apply these approaches in the context of DDI prediction. We compared NDD with six machine learning classifiers and six state-of-the-art graph-based methods on three benchmark datasets. NDD achieved superior performance in cross-validation, with AUPR ranging from 0.830 to 0.947, AUC from 0.954 to 0.994 and F-measure from 0.772 to 0.902. Moreover, cumulative evidence in case studies on numerous drug pairs further confirms the ability of NDD to predict unknown DDIs. The evaluations corroborate that NDD is an efficient method for predicting unknown DDIs. The data and implementation of NDD are available at https://github.com/nrohani/NDD.
|
44
|
Chen H, Zhang Z, Feng D. Prediction and interpretation of miRNA-disease associations based on miRNA target genes using canonical correlation analysis. BMC Bioinformatics 2019; 20:404. [PMID: 31345171 PMCID: PMC6657378 DOI: 10.1186/s12859-019-2998-8] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2019] [Accepted: 07/16/2019] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND It has been shown that the deregulation of miRNAs is associated with the development and progression of many human diseases. To reduce time and cost of biological experiments, a number of algorithms have been proposed for predicting miRNA-disease associations. However, the existing methods rarely investigated the cause-and-effect mechanism behind these associations, which hindered further biomedical follow-ups. RESULTS In this study, we presented a CCA-based model in which the possible molecular causes of miRNA-disease associations were comprehensively revealed by extracting correlated sets of genes and diseases based on the co-occurrence of miRNAs in target gene profiles and disease profiles. Our method directly suggested the underlying genes involved, which could be used for experimental tests and confirmation. The inference of associated diseases of a new miRNA was made by taking into account the weight vectors of the extracted sets. We extracted 60 pairs of correlated sets from 404 miRNAs with two profiles for 2796 target genes and 362 diseases. The extracted diseases could be considered as possible outcomes of miRNAs regulating the target genes which appeared in the same set, some of which were supported by independent source of information. Furthermore, we tested our method on the 404 miRNAs under the condition of 5-fold cross validations and received an AUC value of 0.84606. Finally, we extensively inferred miRNA-disease associations for 100 new miRNAs and some interesting prediction results were validated by established databases. CONCLUSIONS The encouraging results demonstrated that our method could provide a biologically relevant prediction and interpretation of associations between miRNAs and diseases, which were of great usefulness when guiding biological experiments for scientific research.
Affiliation(s)
- Hailin Chen
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| | - Zuping Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083 China
| | - Dayi Feng
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| |
|
45
|
Li Z, Nie R, You Z, Zhao Y, Ge X, Wang Y. LRMDA: Using Logistic Regression and Random Walk with Restart for MiRNA-Disease Association Prediction. ACTA ACUST UNITED AC 2019. [DOI: 10.1007/978-3-030-26969-2_27] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/23/2023]
|
46
|
Gao YL, Cui Z, Liu JX, Wang J, Zheng CH. NPCMF: Nearest Profile-based Collaborative Matrix Factorization method for predicting miRNA-disease associations. BMC Bioinformatics 2019; 20:353. [PMID: 31234797 PMCID: PMC6591872 DOI: 10.1186/s12859-019-2956-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/24/2019] [Accepted: 06/17/2019] [Indexed: 01/01/2023] Open
Abstract
BACKGROUND Predicting meaningful miRNA-disease associations (MDAs) is costly. Therefore, an increasing number of researchers are focusing on methods to predict potential MDAs, and prediction methods with improved accuracy are under development. An efficient computational method is crucial for predicting novel MDAs. For improved experimental productivity, researchers use large biological datasets. Although there are many effective and feasible methods to predict potential MDAs, the possibility remains that these methods are flawed. RESULTS A simple and effective method, known as Nearest Profile-based Collaborative Matrix Factorization (NPCMF), is proposed to identify novel MDAs. The nearest profile is introduced into our method, which achieves the highest AUC value compared with other advanced methods. For some miRNAs and diseases without any association, we use the nearest neighbour information to complete the prediction. CONCLUSIONS To evaluate the performance of our method, five-fold cross-validation is used to calculate the AUC value. In addition, three disease cases, gastric neoplasms, rectal neoplasms and colonic neoplasms, are used to predict novel MDAs on a gold-standard dataset. We predict the vast majority of known MDAs and some novel MDAs. Finally, the prediction accuracy of our method is determined to be better than that of other existing methods. Thus, the proposed prediction model can obtain reliable experimental results.
Affiliation(s)
- Ying-Lian Gao
- Library of Qufu Normal University, Qufu Normal University, Rizhao, China
| | - Zhen Cui
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Jin-Xing Liu
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China; Co-Innovation Center for Information Supply and Assurance Technology, Anhui University, Hefei, China.
| | - Juan Wang
- School of Information Science and Engineering, Qufu Normal University, Rizhao, China
| | - Chun-Hou Zheng
- Co-Innovation Center for Information Supply and Assurance Technology, Anhui University, Hefei, China
| |
|
47
|
Wei H, Liu B. iCircDA-MF: identification of circRNA-disease associations based on matrix factorization. Brief Bioinform 2019; 21:1356-1367. [DOI: 10.1093/bib/bbz057] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2019] [Revised: 03/13/2019] [Accepted: 04/17/2019] [Indexed: 12/19/2022] Open
Abstract
Circular RNAs (circRNAs) are a group of novel discovered non-coding RNAs with closed-loop structure, which play critical roles in various biological processes. Identifying associations between circRNAs and diseases is critical for exploring the complex disease mechanism and facilitating disease-targeted therapy. Although several computational predictors have been proposed, their performance is still limited. In this study, a novel computational method called iCircDA-MF is proposed. Because the circRNA-disease associations with experimental validation are very limited, the potential circRNA-disease associations are calculated based on the circRNA similarity and disease similarity extracted from the disease semantic information and the known associations of circRNA-gene, gene-disease and circRNA-disease. The circRNA-disease interaction profiles are then updated by the neighbour interaction profiles so as to correct the false negative associations. Finally, the matrix factorization is performed on the updated circRNA-disease interaction profiles to predict the circRNA-disease associations. The experimental results on a widely used benchmark dataset showed that iCircDA-MF outperforms other state-of-the-art predictors and can identify new circRNA-disease associations effectively.
Affiliation(s)
- Hang Wei
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
| | - Bin Liu
- School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, Guangdong, China
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing, China
| |
|
48
|
Kim J, Kim JJ, Lee H. DigChem: Identification of disease-gene-chemical relationships from Medline abstracts. PLoS Comput Biol 2019; 15:e1007022. [PMID: 31091224 PMCID: PMC6519793 DOI: 10.1371/journal.pcbi.1007022] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2018] [Accepted: 04/10/2019] [Indexed: 11/18/2022] Open
Abstract
Chemicals interact with genes in the process of disease development and treatment. Although much biomedical research has been performed to understand relationships among genes, chemicals, and diseases, which have been reported in biomedical articles in Medline, there are few studies that extract disease-gene-chemical relationships from biomedical literature at a PubMed scale. In this study, we propose a deep learning model based on bidirectional long short-term memory to identify the evidence sentences of relationships among genes, chemicals, and diseases from Medline abstracts. Then, we develop the search engine DigChem to enable disease-gene-chemical relationship searches for 35,124 genes, 56,382 chemicals, and 5,675 diseases. We show that the identified relationships are reliable by comparing them with manual curation and existing databases. DigChem is available at http://gcancer.org/digchem.
Affiliation(s)
- Jeongkyun Kim
- Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, Korea
| | - Jung-jae Kim
- Institute for Infocomm Research, A-STAR, 138632, Singapore
| | - Hyunju Lee
- Gwangju Institute of Science and Technology, School of Electrical Engineering and Computer Science, Gwangju, Korea
| |
|
49
|
Chen M, Zhang Y, Li A, Li Z, Liu W, Chen Z. Bipartite Heterogeneous Network Method Based on Co-neighbor for MiRNA-Disease Association Prediction. Front Genet 2019; 10:385. [PMID: 31080459 PMCID: PMC6497741 DOI: 10.3389/fgene.2019.00385] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2018] [Accepted: 04/10/2019] [Indexed: 12/22/2022] Open
Abstract
In recent years, miRNA variation and dysregulation have been found to be closely related to human tumors. Identifying miRNA-disease associations helps in understanding the mechanisms of disease or tumor development and is highly significant for the prognosis, diagnosis, and treatment of human diseases. This article proposes a Bipartite Heterogeneous network link prediction method based on Co-Neighbors (BHCN) to predict miRNA-disease associations. According to the structural characteristics of the bipartite network, the concept of bipartite network co-neighbors is proposed, and the co-neighbors are used to represent the probability of association between a disease and a miRNA. To make predictions for isolated diseases and new miRNAs, for which co-neighbors are unavailable, we use the similarity between disease nodes and the similarity between miRNA nodes in heterogeneous networks to represent the association probability between disease and miRNA. The model's predictive performance was evaluated by leave-one-out cross validation (LOOCV) on different datasets. The AUC value of BHCN on the gold benchmark dataset was 0.7973, and the AUC obtained on the prediction dataset was 0.9349, which was better than that of the classic global algorithm. In the case study, we conducted predictive studies on breast neoplasms and colon neoplasms. Most of the top 50 predicted results were confirmed by three databases, namely HMDD, miR2disease, and dbDEMC, with accuracy rates of 96% and 82%, respectively. In addition, BHCN can be used for predicting isolated diseases (without any known associated miRNAs) and new miRNAs (without any known associated diseases). In the isolated disease case study, the top 50 miRNAs predicted for breast neoplasms and colon neoplasms were confirmed with accuracies of 100% and 96%, respectively, thereby demonstrating the favorable predictive power of BHCN for potentially relevant miRNAs.
Affiliation(s)
- Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Ang Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zejun Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Wenhua Liu
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zheng Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| |
|
50
|
Yu SP, Liang C, Xiao Q, Li GH, Ding PJ, Luo JW. MCLPMDA: A novel method for miRNA-disease association prediction based on matrix completion and label propagation. J Cell Mol Med 2018; 23:1427-1438. [PMID: 30499204 PMCID: PMC6349206 DOI: 10.1111/jcmm.14048] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2018] [Accepted: 11/02/2018] [Indexed: 12/20/2022] Open
Abstract
MiRNAs are a class of small non‐coding RNAs that are involved in the development and progression of various complex diseases. Great efforts have recently been made to discover potential associations between miRNAs and diseases. As experimental methods are in general expensive and time‐consuming, a large number of computational models have been developed to effectively predict reliable disease‐related miRNAs. However, the inherent noise and incompleteness in the existing biological datasets have inevitably limited the prediction accuracy of current computational models. To solve this issue, in this paper, we propose a novel method for miRNA‐disease association prediction based on matrix completion and label propagation. Specifically, our method first reconstructs a new miRNA/disease similarity matrix with a matrix completion algorithm based on known experimentally verified miRNA‐disease associations and then uses the label propagation algorithm to reliably predict disease‐related miRNAs. As a result, MCLPMDA achieved comparable performance under different evaluation metrics and was capable of discovering a greater number of true miRNA‐disease associations. Moreover, a case study conducted on Breast Neoplasms further confirmed the prediction reliability of the proposed method. Taken together, the experimental results clearly demonstrated that MCLPMDA can serve as an effective and reliable tool for miRNA‐disease association prediction.
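The label propagation step named in this abstract can be sketched generically. The code below is a textbook-style iteration, F ← αWF + (1 − α)Y with a row-normalised similarity matrix W, and is not the MCLPMDA implementation: the function names, the value α = 0.5, the stopping rule and the toy matrices are assumptions, and the matrix completion step is not shown.

```python
def row_normalize(sim):
    """Scale each row of a similarity matrix so that it sums to 1."""
    normalized = []
    for row in sim:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized

def label_propagation(sim, labels, alpha=0.5, iterations=100, tol=1e-9):
    """Spread known association labels along similarity edges."""
    w = row_normalize(sim)
    n, m = len(labels), len(labels[0])
    scores = [row[:] for row in labels]
    for _ in range(iterations):
        updated = [
            [alpha * sum(w[i][k] * scores[k][j] for k in range(n))
             + (1 - alpha) * labels[i][j]      # keep pulling towards known labels
             for j in range(m)]
            for i in range(n)
        ]
        delta = max(abs(updated[i][j] - scores[i][j])
                    for i in range(n) for j in range(m))
        scores = updated
        if delta < tol:  # stop once the scores have stabilised
            break
    return scores

# Toy data: 3 miRNAs (rows) and 2 diseases (columns); miRNA 1 has no labels.
mirna_similarity = [[1.0, 0.8, 0.1], [0.8, 1.0, 0.1], [0.1, 0.1, 1.0]]
known = [[1, 0], [0, 0], [0, 1]]
predicted = label_propagation(mirna_similarity, known)
```

In this toy example the unlabeled miRNA 1 receives a higher score for disease 0 than for disease 1, because it is most similar to miRNA 0, which is known to be associated with disease 0.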
Affiliation(s)
- Sheng-Peng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha, China
| | - Guang-Hui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, China
| | - Ping-Jian Ding
- College of Information Science and Engineering, Hunan University, Changsha, China
| | - Jia-Wei Luo
- College of Information Science and Engineering, Hunan University, Changsha, China
| |
|