1
|
Chen M, Deng Y, Li Z, Ye Y, He Z. KATZNCP: a miRNA-disease association prediction model integrating KATZ algorithm and network consistency projection. BMC Bioinformatics 2023; 24:229. [PMID: 37268893 DOI: 10.1186/s12859-023-05365-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2022] [Accepted: 05/26/2023] [Indexed: 06/04/2023] Open
Abstract
BACKGROUND Clinical studies have shown that miRNAs are closely related to human health. The study of potential associations between miRNAs and diseases will contribute to a profound understanding of the mechanism of disease development, as well as human disease prevention and treatment. MiRNA-disease associations predicted by computational methods are the best complement to biological experiments. RESULTS In this research, a federated computational model KATZNCP was proposed on the basis of the KATZ algorithm and network consistency projection to infer the potential miRNA-disease associations. In KATZNCP, a heterogeneous network was initially constructed by integrating the known miRNA-disease association, integrated miRNA similarities, and integrated disease similarities; then, the KATZ algorithm was implemented in the heterogeneous network to obtain the estimated miRNA-disease prediction scores. Finally, the precise scores were obtained by the network consistency projection method as the final prediction results. KATZNCP achieved the reliable predictive performance in leave-one-out cross-validation (LOOCV) with an AUC value of 0.9325, which was better than the state-of-the-art comparable algorithms. Furthermore, case studies of lung neoplasms and esophageal neoplasms demonstrated the excellent predictive performance of KATZNCP. CONCLUSION A new computational model KATZNCP was proposed for predicting potential miRNA-drug associations based on KATZ and network consistency projections, which can effectively predict the potential miRNA-disease interactions. Therefore, KATZNCP can be used to provide guidance for future experiments.
Collapse
Affiliation(s)
- Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Yingwei Deng
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China.
| | - Zejun Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Yifan Ye
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| | - Ziyi He
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
| |
Collapse
|
2
|
Luo Y, Peng L, Shan W, Sun M, Luo L, Liang W. Machine learning in the development of targeting microRNAs in human disease. Front Genet 2023; 13:1088189. [PMID: 36685965 PMCID: PMC9845262 DOI: 10.3389/fgene.2022.1088189] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/03/2022] [Accepted: 12/12/2022] [Indexed: 01/05/2023] Open
Abstract
A microRNA is a small, single-stranded, non-coding ribonucleic acid that plays a crucial role in RNA silencing and can regulate gene expression. With the in-depth study of miRNA in development and disease, miRNA has become an attractive target for novel therapeutic strategies. Exploring miRNA targeting therapy only through experiments is expensive and laborious, so it is essential to develop novel and efficient computational methods to narrow down the search. Recent advances in machine learning applied in biomedical informatics provide opportunities to explore miRNA-targeting drugs, thus promoting miRNA therapeutics. This review provides an overview of recent advancements in miRNA targeting therapeutic using machine learning. First, we mainly describe the basics of predicting miRNA targeting drugs, including pharmacogenomic data resources and data preprocessing. Then we present primary machine learning algorithms and elaborate their application in discovering relationships among miRNAs, drugs, and diseases. Along with the progress of miRNA targeting therapeutics, we finally analyze and discuss the current challenges and opportunities that machine learning confronts.
Collapse
Affiliation(s)
- Yuxun Luo
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China,Hunan Key Laboratory for Service computing and Novel Software Technology, Xiangtan, China
| | - Li Peng
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China,Hunan Key Laboratory for Service computing and Novel Software Technology, Xiangtan, China
| | - Wenyu Shan
- School of Computer Science, University of South China, Hengyang, China
| | - Mengyue Sun
- School of Polymer Science and Polymer Engineering, The University of Akron, Akron, OH, United States
| | - Lingyun Luo
- School of Computer Science, University of South China, Hengyang, China
| | - Wei Liang
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China,Hunan Key Laboratory for Service computing and Novel Software Technology, Xiangtan, China,*Correspondence: Wei Liang,
| |
Collapse
|
3
|
Li S, Wang B, Chang M, Hou R, Tian G, Tong L. A Novel Algorithm for Detecting Microsatellite Instability Based on Next-Generation Sequencing Data. Front Oncol 2022; 12:916379. [PMID: 35847873 PMCID: PMC9280483 DOI: 10.3389/fonc.2022.916379] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/09/2022] [Accepted: 05/27/2022] [Indexed: 11/25/2022] Open
Abstract
Objectives Microsatellite instability (MSI) is the condition of genetic hypermutability caused by spontaneous acquisition or loss of nucleotides during the DNA replication. MSI has been discovered to be a useful immunotherapy biomarker clinically. The main DNA-based method for MSI detection is polymerase chain reaction (PCR) amplification and fragment length analysis, which are costly and laborious. Thus, we developed a novel method to detect MSI based on next-generation sequencing (NGS) data. Methods We chose six markers of MSI. After alignment and reads counting, a histogram was plotted showing the counts of different lengths for each marker. We then designed an algorithm to discover peaks in the generated histograms so that the peak numbers discovered in NGS data resembled that in PCR-based method. Results We selected nine samples as the training dataset, 101 samples for validation, and 68 samples as the test dataset from Chifeng Municipal Hospital, Inner Mongolia, China. The NGS-based method achieved 100% accuracy for the validation dataset and 98.53% accuracy for the test dataset, in which only one false positive was detected. Conclusions Accurate MSI judgments were achieved using NGS data, which could provide comparable MSI detection with the gold standard, PCR-based methods.
Collapse
Affiliation(s)
- Shijun Li
- Pathology Department, Chifeng Municipal Hospital, Chifeng, China
| | - Bo Wang
- Science Department, Geneis Beijing Co., Ltd., Beijing, China
| | - Miaomiao Chang
- Pathology Department, Chifeng Municipal Hospital, Chifeng, China
| | - Rui Hou
- Science Department, Geneis Beijing Co., Ltd., Beijing, China
| | - Geng Tian
- Science Department, Geneis Beijing Co., Ltd., Beijing, China
- *Correspondence: Geng Tian, ; Ling Tong,
| | - Ling Tong
- Pathology Department, Chifeng Municipal Hospital, Chifeng, China
- *Correspondence: Geng Tian, ; Ling Tong,
| |
Collapse
|
4
|
Peng L, Yang C, Huang L, Chen X, Fu X, Liu W. RNMFLP: Predicting circRNA-disease associations based on robust nonnegative matrix factorization and label propagation. Brief Bioinform 2022; 23:6582881. [PMID: 35534179 DOI: 10.1093/bib/bbac155] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Revised: 03/09/2022] [Accepted: 04/06/2022] [Indexed: 12/22/2022] Open
Abstract
Circular RNAs (circRNAs) are a class of structurally stable endogenous noncoding RNA molecules. Increasing studies indicate that circRNAs play vital roles in human diseases. However, validating disease-related circRNAs in vivo is costly and time-consuming. A reliable and effective computational method to identify circRNA-disease associations deserves further studies. In this study, we propose a computational method called RNMFLP that combines robust nonnegative matrix factorization (RNMF) and label propagation algorithm (LP) to predict circRNA-disease associations. First, to reduce the impact of false negative data, the original circRNA-disease adjacency matrix is updated by matrix multiplication using the integrated circRNA similarity and the disease similarity information. Subsequently, the RNMF algorithm is used to obtain the restricted latent space to capture potential circRNA-disease pairs from the association matrix. Finally, the LP algorithm is utilized to predict more accurate circRNA-disease associations from the integrated circRNA similarity network and integrated disease similarity network, respectively. Fivefold cross-validation of four datasets shows that RNMFLP is superior to the state-of-the-art methods. In addition, case studies on lung cancer, hepatocellular carcinoma and colorectal cancer further demonstrate the reliability of our method to discover disease-related circRNAs.
Collapse
Affiliation(s)
- Li Peng
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China.,Hunan Key Laboratory for Service computing and Novel Software Technology
| | - Cheng Yang
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
| | - Li Huang
- Academy of Arts and Design, Tsinghua University, 10084, Beijing, China.,The Future Laboratory, Tsinghua University, 10084, Beijing, China
| | - Xiang Chen
- School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
| | - Xiangzheng Fu
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
| | - Wei Liu
- College of Information Engineering, Xiangtan University, Xiangtan, 411105, Hunan, China
| |
Collapse
|
5
|
Cai L, Gao M, Ren X, Fu X, Xu J, Wang P, Chen Y. MILNP: Plant lncRNA-miRNA Interaction Prediction Based on Improved Linear Neighborhood Similarity and Label Propagation. FRONTIERS IN PLANT SCIENCE 2022; 13:861886. [PMID: 35401586 PMCID: PMC8990282 DOI: 10.3389/fpls.2022.861886] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/25/2022] [Accepted: 02/21/2022] [Indexed: 06/14/2023]
Abstract
Knowledge of the interactions between long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) is the basis of understanding various biological activities and designing new drugs. Previous computational methods for predicting lncRNA-miRNA interactions lacked for plants, and they suffer from various limitations that affect the prediction accuracy and their applicability. Research on plant lncRNA-miRNA interactions is still in its infancy. In this paper, we propose an accurate predictor, MILNP, for predicting plant lncRNA-miRNA interactions based on improved linear neighborhood similarity measurement and linear neighborhood propagation algorithm. Specifically, we propose a novel similarity measure based on linear neighborhood similarity from multiple similarity profiles of lncRNAs and miRNAs and derive more precise neighborhood ranges so as to escape the limits of the existing methods. We then simultaneously update the lncRNA-miRNA interactions predicted from both similarity matrices based on label propagation. We comprehensively evaluate MILNP on the latest plant lncRNA-miRNA interaction benchmark datasets. The results demonstrate the superior performance of MILNP than the most up-to-date methods. What's more, MILNP can be leveraged for isolated plant lncRNAs (or miRNAs). Case studies suggest that MILNP can identify novel plant lncRNA-miRNA interactions, which are confirmed by classical tools. The implementation is available on https://github.com/HerSwain/gra/tree/MILNP.
Collapse
Affiliation(s)
| | | | | | - Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | | | - Peng Wang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
| | | |
Collapse
|
6
|
Chen M, Zhang Y, Li A, Li Z, Liu W, Chen Z. Bipartite Heterogeneous Network Method Based on Co-neighbor for MiRNA-Disease Association Prediction. Front Genet 2019; 10:385. [PMID: 31080459 PMCID: PMC6497741 DOI: 10.3389/fgene.2019.00385] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2018] [Accepted: 04/10/2019] [Indexed: 12/22/2022] Open
Abstract
In recent years, miRNA variation and dysregulation have been found to be closely related to human tumors, and identifying miRNA-disease associations is helpful for understanding the mechanisms of disease or tumor development and is greatly significant for the prognosis, diagnosis, and treatment of human diseases. This article proposes a Bipartite Heterogeneous network link prediction method based on co-neighbor to predict miRNA-disease association (BHCN). According to the structural characteristics of the bipartite network, the concept of bipartite network co-neighbors is proposed, and the co-neighbors were used to represent the probability of association between disease and miRNA. To predict the isolated diseases and the new miRNA based on the association probability expressed by co-neighbors, we utilized the similarity between disease nodes and the similarity between miRNA nodes in heterogeneous networks to represent the association probability between disease and miRNA. The model's predictive performance was evaluated by the leave-one-out cross validation (LOOCV) on different datasets. The AUC value of BHCN on the gold benchmark dataset was 0.7973, and the AUC obtained on the prediction dataset was 0.9349, which was better than that of the classic global algorithm. In this case study, we conducted predictive studies on breast neoplasms and colon neoplasms. Most of the top 50 predicted results were confirmed by three databases, namely, HMDD, miR2disease, and dbDEMC, with accuracy rates of 96 and 82%. In addition, BHCN can be used for predicting isolated diseases (without any known associated diseases) and new miRNAs (without any known associated miRNAs). In the isolated disease case study, the top 50 of breast neoplasm and colon neoplasm potentials associated with miRNAs predicted an accuracy of 100 and 96%, respectively, thereby demonstrating the favorable predictive power of BHCN for potentially relevant miRNAs.
Collapse
Affiliation(s)
- Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Ang Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zejun Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Wenhua Liu
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zheng Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| |
Collapse
|
7
|
Zhang Y, Chen M, Cheng X, Chen Z. LSGSP: a novel miRNA–disease association prediction model using a Laplacian score of the graphs and space projection federated method. RSC Adv 2019; 9:29747-29759. [PMID: 35531537 PMCID: PMC9071959 DOI: 10.1039/c9ra05554a] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Lots of research findings have indicated that miRNAs (microRNAs) are involved in many important biological processes; their mutations and disorders are closely related to diseases, therefore, determining the associations between human diseases and miRNAs is key to understand pathogenic mechanisms. Existing biological experimental methods for identifying miRNA–disease associations are usually expensive and time consuming. Therefore, the development of efficient and reliable computational methods for identifying disease-related miRNAs has become an important topic in the field of biological research in recent years. In this study, we developed a novel miRNA–disease association prediction model using a Laplacian score of the graphs and space projection federated method (LSGSP). This integrates experimentally validated miRNA–disease associations, disease semantic similarity scores, miRNA functional scores, and miRNA family information to build a new disease similarity network and miRNA similarity network, and then obtains the global similarities of these networks through calculating the Laplacian score of the graphs, based on which the miRNA–disease weighted network can be constructed through combination with the miRNA–disease Boolean network. Finally, the miRNA–disease score was obtained via projecting the miRNA space and disease space onto the miRNA–disease weighted network. Compared with several other state-of-the-art methods, using leave-one-out cross validation (LOOCV) to evaluate the accuracy of LSGSP with respect to a benchmark dataset, prediction dataset and compare dataset, LSGSP showed excellent predictive performance with high AUC values of 0.9221, 0.9745 and 0.9194, respectively. In addition, for prostate neoplasms and lung neoplasms, the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the updated HMDD, miR2Disease, and dbDEMC databases) reached 96% and 100%, respectively. Similarly, for isolated diseases (diseases not associated with any miRNAs), the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the above-mentioned three databases) reached 98% and 100%, respectively. These results further indicate that LSGSP can effectively predict potential associations between miRNAs and diseases. Lots of research findings have indicated that the mutations and disorders of miRNAs (microRNAs) are closely related to diseases. Therefore, determining the associations between human diseases and miRNAs is key to understand the pathogenic mechanisms.![]()
Collapse
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering
- Guilin University of Technology
- 541004 Guilin
- China
| | - Min Chen
- School of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Xiaohui Cheng
- School of Information Science and Engineering
- Guilin University of Technology
- 541004 Guilin
- China
| | - Zheng Chen
- School of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| |
Collapse
|
8
|
Chen M, Peng Y, Li A, Li Z, Deng Y, Liu W, Liao B, Dai C. A novel information diffusion method based on network consistency for identifying disease related microRNAs. RSC Adv 2018; 8:36675-36690. [PMID: 35558942 PMCID: PMC9088870 DOI: 10.1039/c8ra07519k] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/09/2018] [Accepted: 10/17/2018] [Indexed: 12/27/2022] Open
Abstract
The abnormal expression of miRNAs is directly related to the development of human diseases. Predicting the potential candidate miRNAs associated with diseases can contribute to the detection, diagnosis, treatment and prevention of human complex diseases. The effective inference of the calculation method of the relationship between miRNAs and diseases is an effective supplement to biological experiments. It is of great help in the prevention, treatment and prognosis of complex diseases. This paper proposes a novel information diffusion method based on network consistency (IDNC) for identifying disease related microRNAs. The model first synthesizes the miRNA family information and the miRNA function similarity to reconstruct the miRNA network, and reconstruct the disease network by using the known disease and miRNA-related information and the semantic score between diseases. Then the global similarity of the two networks is obtained by using the Laplacian score of graphs. The global similarity score is a measure of the similarity between diseases and miRNAs. The disease–miRNA relation network was reconstructed by integrating the global similarity relation. The network consistency diffusion seed is then obtained by combining the global similarity network with the reconstructed disease–miRNA association network. Thereafter, the stable diffusion spectrum is generated as the prediction score by using the restarted random walk algorithm. The AUC value obtained by performing the LOOCV in the gold benchmark dataset is 0.8814. The AUC value obtained by performing the LOOCV in the predictive dataset is 0.9512. Compared with other frontier methods, our method has higher accuracy, which is further illustrated by case studies of breast neoplasms and colon neoplasms to prove that IDNC is valuable. The abnormal expression of miRNAs is directly related to the development of human diseases.![]()
Collapse
Affiliation(s)
- Min Chen
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
- College of Information Science and Engineering
| | - Yan Peng
- College of International Communication
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Ang Li
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Zejun Li
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
- College of Information Science and Engineering
| | - Yingwei Deng
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Wenhua Liu
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| | - Bo Liao
- College of Information Science and Engineering
- Hunan University
- Changsha 410082
- China
| | - Chengqiu Dai
- College of Computer Science and Technology
- Hunan Institute of Technology
- 421002 Hengyang
- China
| |
Collapse
|