1
Liang J, Sun Y, Ling J. GRL-PUL: predicting microbe-drug association based on graph representation learning and positive unlabeled learning. Mol Omics 2025; 21:38-50. [PMID: 39540771 DOI: 10.1039/d4mo00117f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Extensive research has confirmed the widespread presence of microorganisms in the human body and their crucial impact on human health, with drugs being an effective method of regulation. Hence, it is essential to identify potential microbe-drug associations (MDAs). Owing to the limitations of wet experiments, such as high costs and long durations, computational methods for binary classification tasks have become valuable alternatives to traditional experimental approaches. Since validated negative MDAs are absent in existing datasets, most methods randomly sample negatives from unlabeled data, which evidently leads to false-negative issues. In this manuscript, we propose a novel model based on graph representation learning and positive-unlabeled learning (GRL-PUL) to infer potential MDAs. Firstly, we screen reliable negative samples by applying weighted matrix factorization and the PU-bagging strategy on the known microbe-drug bipartite network. Then, we combine multi-modal attributes and construct a microbe-drug heterogeneous network. After that, a graph attention auto-encoder module, an encoder combining graph convolutional networks and graph attention networks, is introduced to extract informative embeddings based on the microbe-drug heterogeneous network. Lastly, we adopt a modified random forest as the final classifier. Comparison experiments with five baseline models on three benchmark datasets show that our model surpasses other methods in terms of the AUC, AUPR, ACC, F1-score and MCC. Moreover, several case studies show that GRL-PUL could capably predict latent MDAs. Notably, we further verify the effectiveness of the reliable negative sample selection module by migrating it to other state-of-the-art models, and the experimental results demonstrate its ability to substantially improve their prediction performance.
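The PU-bagging idea named in this abstract can be sketched as follows. This is a minimal illustration only, not the authors' implementation: it assumes plain feature vectors, swaps in a nearest-centroid vote for the base classifier, and omits the weighted matrix factorization step entirely. The names `pu_bagging_scores`, `centroid` and `sq_dist`, and the toy data, are hypothetical.

```python
import random


def centroid(rows):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, rounds=200, seed=0):
    """Fraction of out-of-bag rounds in which each unlabeled sample looks positive."""
    rng = random.Random(seed)
    votes = [0] * len(unlabeled)
    seen = [0] * len(unlabeled)
    pos_c = centroid(positives)
    for _ in range(rounds):
        # Bootstrap a bag of pseudo-negatives, as many as there are positives.
        bag = [rng.randrange(len(unlabeled)) for _ in positives]
        in_bag = set(bag)
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in in_bag:
                continue  # score only samples left out of this bag
            seen[i] += 1
            # Assumed base learner: vote "positive" if closer to the positive centroid.
            if sq_dist(x, pos_c) < sq_dist(x, neg_c):
                votes[i] += 1
    return [v / s if s else None for v, s in zip(votes, seen)]


# Toy data (hypothetical): one hidden positive among four true negatives.
positives = [[1.0, 1.0], [0.9, 1.1], [1.1, 0.9]]
unlabeled = [[1.0, 0.9], [-1.0, -1.0], [-0.9, -1.1], [-1.1, -0.9], [-1.0, -0.8]]
scores = pu_bagging_scores(positives, unlabeled)
# Unlabeled samples that rarely look positive are kept as reliable negatives.
reliable_negatives = [i for i, s in enumerate(scores) if s is not None and s < 0.5]
```

Averaging over many bags is what makes the selection robust: a hidden positive is occasionally drawn into a bag as a pseudo-negative, but it is scored only in the rounds where it was left out.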
Affiliation(s)
- Jinqing Liang
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China.
- Yuping Sun
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China.
- Jie Ling
  - School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China.
2
Zhang C, Li Y, Dong Y, Chen W, Yu C. Prediction of miRNA-disease associations based on PCA and cascade forest. BMC Bioinformatics 2024; 25:386. [PMID: 39701957 DOI: 10.1186/s12859-024-05999-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Accepted: 11/26/2024] [Indexed: 12/21/2024] Open
Abstract
BACKGROUND: As a key non-coding RNA molecule, miRNA profoundly affects gene expression regulation and is connected to the pathological processes of several kinds of human diseases. However, conventional experimental methods for validating miRNA-disease associations are laborious. Consequently, the development of efficient and reliable computational prediction models is crucial for the identification and validation of these associations. RESULTS: In this research, we developed the PCACFMDA method to predict the potential associations between miRNAs and diseases. To construct a multidimensional feature matrix, we consider the fusion similarities of miRNAs, diseases, and miRNA-disease pairs. We then use principal component analysis (PCA) to reduce data complexity and extract low-dimensional features. Subsequently, a tuned cascade forest is used to deeply mine the features and output prediction scores. The results of the 5-fold cross-validation using the HMDD v2.0 database indicate that the PCACFMDA algorithm achieved an AUC of 98.56%. Additionally, we perform case studies on breast, esophageal and lung neoplasms. The findings revealed that the top 50 miRNAs most strongly linked to each disease have been validated. CONCLUSIONS: Based on PCA and optimized cascade forests, we propose the PCACFMDA model for predicting undiscovered miRNA-disease associations. The experimental results demonstrate superior prediction performance and commendable stability. Consequently, PCACFMDA is a potent instrument for in-depth exploration of miRNA-disease associations.
Affiliation(s)
- Chuanlei Zhang
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Yubo Li
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Yinglun Dong
  - Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, 300457, China
- Wei Chen
  - Computer Science, China University of Mining and Technology, Xuzhou, 221116, China
- Changqing Yu
  - Electronic Information, Xijing University, Xi'an, 710123, China.
3
Zhao BW, Su XR, Yang Y, Li DX, Li GD, Hu PW, Luo X, Hu L. A heterogeneous information network learning model with neighborhood-level structural representation for predicting lncRNA-miRNA interactions. Comput Struct Biotechnol J 2024; 23:2924-2933. [PMID: 39963422 PMCID: PMC11832017 DOI: 10.1016/j.csbj.2024.06.032] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 01/31/2024] [Revised: 06/13/2024] [Accepted: 06/23/2024] [Indexed: 02/20/2025] Open
Abstract
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) are closely related to the treatment of human diseases. Traditional biological experiments are often time-consuming and labor-intensive in their search for mechanisms of disease. Computational methods are regarded as an effective way to predict unknown lncRNA-miRNA interactions (LMIs). However, most of them complete their tasks by mainly focusing on a single lncRNA-miRNA network without considering the complex mechanisms among biomolecules in life activities, which are believed to be useful for improving the accuracy of LMI prediction. To address this, we propose a heterogeneous information network (HIN) learning model with neighborhood-level structural representation, called HINLMI, to precisely identify LMIs. In particular, HINLMI first constructs a HIN by integrating nine interactions of five biomolecules. After that, different representation learning strategies are applied to learn the biological and network representations of lncRNAs and miRNAs in the HIN from different perspectives. Finally, HINLMI incorporates the XGBoost classifier to predict unknown LMIs using the final embeddings of lncRNAs and miRNAs. Experimental results show that HINLMI yields the best performance on the real dataset when compared with state-of-the-art computational models. Moreover, several analysis experiments indicate that the simultaneous consideration of biological knowledge and network topology of lncRNAs and miRNAs allows HINLMI to accurately predict LMIs from a more comprehensive perspective. The promising performance of HINLMI also reveals that the utilization of rich heterogeneous information can provide an alternative insight for HINLMI to identify novel interactions between lncRNAs and miRNAs.
Affiliation(s)
- Bo-Wei Zhao
  - College of Computer and Information Science, School of Software, Southwest University, Chongqing 400715, China
- Xiao-Rui Su
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Yue Yang
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Dong-Xu Li
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Guo-Dong Li
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Peng-Wei Hu
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Xin Luo
  - College of Computer and Information Science, School of Software, Southwest University, Chongqing 400715, China
- Lun Hu
  - The Xinjiang Technical Institute of Physics & Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
4
Guo C, Wang X, Ren H. Databases and computational methods for the identification of piRNA-related molecules: A survey. Comput Struct Biotechnol J 2024; 23:813-833. [PMID: 38328006 PMCID: PMC10847878 DOI: 10.1016/j.csbj.2024.01.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 12/31/2023] [Accepted: 01/15/2024] [Indexed: 02/09/2024] Open
Abstract
Piwi-interacting RNAs (piRNAs) are a class of small non-coding RNAs (ncRNAs) that play important roles in many biological processes and in the diagnosis and treatment of major cancers, and have thus become a hot research topic. This study aims to provide an in-depth review of computational piRNA-related research, including databases and computational models. Herein, we perform literature analysis and use comparative evaluation methods to summarize and analyze three aspects of computational piRNA-related research: (i) computational models for piRNA-related molecular identification tasks, (ii) computational models for piRNA-disease association prediction tasks, and (iii) computational resources and evaluation metrics for these tasks. This study shows that computational piRNA-related research has progressed significantly and exhibited promising performance in recent years, although it also suffers from the emerging challenges of inconsistent naming systems and a lack of data. Unlike other reviews on piRNA-related identification tasks that focus on the organization of datasets and computational methods, we pay more attention to the analysis of computational models, algorithms, and performances, aiming to provide valuable references for computational piRNA-related identification tasks. This study will benefit the theoretical development and practical application of piRNAs by better understanding computational models and resources to investigate the biological functions and clinical implications of piRNA.
Affiliation(s)
- Chang Guo
  - Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
- Xiaoli Wang
  - Institute of Reproductive Health, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China
- Han Ren
  - Laboratory of Language Engineering and Computing, Guangdong University of Foreign Studies, Guangzhou 510420, China
  - Laboratory of Language and Artificial Intelligence, Guangdong University of Foreign Studies, Guangzhou 510420, China
5
Wei Y, Zhang Q, Liu L. The improved de Bruijn graph for multitask learning: predicting functions, subcellular localization, and interactions of noncoding RNAs. Brief Bioinform 2024; 26:bbae627. [PMID: 39592154 PMCID: PMC11596098 DOI: 10.1093/bib/bbae627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 11/13/2024] [Accepted: 11/15/2024] [Indexed: 11/28/2024] Open
Abstract
Noncoding RNA refers to RNA that does not encode proteins. The lncRNA and miRNA it contains play crucial regulatory roles in organisms, and their aberrant expression is closely related to various diseases. Traditional experimental methods for validating the interactions of these RNAs have limitations, and existing prediction models exhibit relatively limited functionality, relying on isolated feature extraction and performing poorly in handling various types of small sample tasks. This paper proposes an improved de Bruijn graph that can inject RNA structural information into the graph while preserving sequence information. Furthermore, the improved de Bruijn graph enables graph neural networks to learn broader dependencies and correlations among data by introducing richer edge relationships. Meanwhile, the multitask learning model, DVMnet, proposed in this paper can handle multiple related tasks, and we optimize model parameters by integrating the total loss of three tasks. This enables multitask prediction of RNA interactions, disease associations, and subcellular localization. Compared with the best existing models in this field, DVMnet has achieved the best performance with a 3% improvement in the area under the curve value and demonstrates robust results in predicting diseases and subcellular localization. The improved de Bruijn graph is also applicable to various scenarios and can unify the sequence and structural information of various nucleic acids into a single graph.
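For readers unfamiliar with the underlying data structure, a plain (unimproved) de Bruijn graph over an RNA sequence can be sketched as below. This shows only the textbook construction, in which nodes are (k-1)-mers and each k-mer contributes one edge; the structural information and richer edge types that the abstract describes are not reproduced here. The function name `de_bruijn_edges` is hypothetical.

```python
from collections import defaultdict


def de_bruijn_edges(seq, k):
    """Map each (prefix, suffix) pair of (k-1)-mers to the number of k-mers joining them."""
    edges = defaultdict(int)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        # A k-mer links its first k-1 bases to its last k-1 bases.
        edges[(kmer[:-1], kmer[1:])] += 1
    return dict(edges)


# The 3-mers of "AUGAUG" are AUG, UGA, GAU, AUG.
edges = de_bruijn_edges("AUGAUG", 3)
# edges == {("AU", "UG"): 2, ("UG", "GA"): 1, ("GA", "AU"): 1}
```

Edge counts record how often a k-mer repeats, so sequence order and multiplicity are both preserved in the graph.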
Affiliation(s)
- Yuxiao Wei
  - College of Software, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Qi Zhang
  - College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Liwei Liu
  - College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
6
Huang J, Sun C, Li M, Tang R, Xie B, Wang S, Wei JM. Structure-inclusive similarity based directed GNN: a method that can control information flow to predict drug-target binding affinity. Bioinformatics 2024; 40:btae563. [PMID: 39292540 PMCID: PMC11474107 DOI: 10.1093/bioinformatics/btae563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/19/2024] [Revised: 05/21/2024] [Accepted: 09/17/2024] [Indexed: 09/20/2024] Open
Abstract
MOTIVATION: Exploring the association between drugs and targets is essential for drug discovery and repurposing. Compared with traditional methods that regard the exploration as a binary classification task, predicting the drug-target binding affinity can provide more specific information. Many studies work based on the assumption that similar drugs may interact with the same target. These methods construct a symmetric graph according to the undirected drug similarity or target similarity. Although these similarities can measure the difference between two molecules, they are unable to capture the inclusion relationship of their substructures. For example, if drug A contains all the substructures of drug B, then in the message-passing mechanism of the graph neural network, drug A should acquire all the properties of drug B, while drug B should only obtain some of the properties of A. RESULTS: To this end, we proposed a structure-inclusive similarity (SIS) which measures the similarity of two drugs by considering the inclusion relationship of their substructures. Based on SIS, we constructed a drug graph and a target graph, respectively, and predicted the binding affinities between drugs and targets by a graph convolutional network-based model. Experimental results show that considering the inclusion relationship of the substructures of two molecules can effectively improve the accuracy of the prediction model. Our SIS-based prediction method outperforms several state-of-the-art methods for drug-target binding affinity prediction. The case studies demonstrate that our model is a practical tool to predict the binding affinity between drugs and targets. AVAILABILITY AND IMPLEMENTATION: Source codes and data are available at https://github.com/HuangStomach/SISDTA.
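The drug A / drug B example in this abstract suggests an asymmetric, inclusion-based edge weight. The sketch below is one possible reading under stated assumptions, not the SIS formula from the paper: each drug is represented as a set of substructure labels, and the weight of a message from a source drug to a destination drug is the fraction of the source's substructures that the destination also contains. The name `flow_weight` and the substructure labels are hypothetical.

```python
def flow_weight(src, dst):
    """Fraction of the source's substructures that also occur in the destination."""
    src, dst = set(src), set(dst)
    if not src:
        return 0.0
    return len(src & dst) / len(src)


# Hypothetical substructure sets: drug A contains everything in drug B.
drug_a = {"benzene", "hydroxyl", "amine"}
drug_b = {"benzene", "hydroxyl"}

b_to_a = flow_weight(drug_b, drug_a)  # 1.0: A can take on all of B's properties
a_to_b = flow_weight(drug_a, drug_b)  # 2/3: B receives only part of A's properties
```

Because the two directions differ, the resulting drug graph is directed, which is what lets a message-passing model control how much information flows each way.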
Affiliation(s)
- Jipeng Huang
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Chang Sun
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Minglei Li
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Rong Tang
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
  - Tianjin Key Laboratory of Network and Data Security, Tianjin 300350, China
- Bin Xie
  - College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China
- Shuqin Wang
  - College of Computer and Information Engineering, Tianjin Normal University, Tianjin, Xi Qing District 300387, China
- Jin-Mao Wei
  - Centre for Bioinformatics and Intelligent Medicine, Nankai University, Tianjin 300071, China
  - College of Computer Science, Nankai University, Tianjin 300071, China
7
Ji C, Yu N, Wang Y, Ni J, Zheng C. SGLMDA: A Subgraph Learning-Based Method for miRNA-Disease Association Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1191-1201. [PMID: 38446654 DOI: 10.1109/tcbb.2024.3373772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2024]
Abstract
MicroRNAs (miRNAs) are endogenous non-coding RNAs, typically around 23 nucleotides in length. Many miRNAs have been found to play crucial roles in gene regulation through post-transcriptional repression in animals. Existing studies suggest that the dysregulation of miRNAs is closely associated with many human diseases. Discovering novel associations between miRNAs and diseases is essential for advancing our understanding of disease pathogenesis at the molecular level. However, experimental validation is time-consuming and expensive. To address this challenge, numerous computational methods have been proposed for predicting miRNA-disease associations. Unfortunately, most existing methods face difficulties when applied to large-scale miRNA-disease complex networks. In this paper, we present a novel subgraph learning method named SGLMDA for predicting miRNA-disease associations. For miRNA-disease pairs, SGLMDA samples K-hop subgraphs from the global heterogeneous miRNA-disease graph. It then introduces a novel subgraph representation algorithm based on Graph Neural Network (GNN) for feature extraction and prediction. Extensive experiments conducted on benchmark datasets demonstrate that SGLMDA can effectively and robustly predict potential miRNA-disease associations. Compared to other state-of-the-art methods, SGLMDA achieves superior prediction performance in terms of Area Under the Curve (AUC) and Average Precision (AP) values during 5-fold Cross-Validation (5CV) on benchmark datasets such as HMDD v2.0 and HMDD v3.2. Additionally, case studies on Colon Neoplasms and Triple-Negative Breast Cancer (TNBC) further underscore the predictive power of SGLMDA.
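The K-hop subgraph sampling step mentioned in the SGLMDA abstract can be illustrated with a breadth-first search seeded at both endpoints of a miRNA-disease pair. This is a generic sketch on a toy adjacency dictionary, assuming an undirected graph; it is not the authors' sampling code, and the names `k_hop_nodes`, `induced_subgraph` and the node labels are hypothetical.

```python
from collections import deque


def k_hop_nodes(adj, seeds, k):
    """Nodes reachable within k hops of any seed node."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        if dist[u] == k:
            continue  # do not expand beyond k hops
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return set(dist)


def induced_subgraph(adj, nodes):
    """Adjacency restricted to the given node set."""
    return {u: sorted(v for v in adj.get(u, ()) if v in nodes) for u in sorted(nodes)}


# Toy bipartite graph: miRNAs m1..m3, diseases d1..d3.
adj = {
    "m1": ["d1", "d2"], "m2": ["d2", "d3"], "m3": ["d3"],
    "d1": ["m1"], "d2": ["m1", "m2"], "d3": ["m2", "m3"],
}
one_hop = k_hop_nodes(adj, ["m1", "d1"], 1)
two_hop = k_hop_nodes(adj, ["m1", "d1"], 2)
sub = induced_subgraph(adj, one_hop)
```

Working on a small enclosing subgraph per candidate pair, rather than the full graph, is the usual motivation for this kind of sampling on large networks.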
8
Meng Z, Liu S, Liang S, Jani B, Meng Z. Heterogeneous biomedical entity representation learning for gene-disease association prediction. Brief Bioinform 2024; 25:bbae380. [PMID: 39154194 PMCID: PMC11330343 DOI: 10.1093/bib/bbae380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 05/29/2024] [Accepted: 07/22/2024] [Indexed: 08/19/2024] Open
Abstract
Understanding the genetic basis of disease is a fundamental aspect of medical research, as genes are the classic units of heredity and play a crucial role in biological function. Identifying associations between genes and diseases is critical for diagnosis, prevention, prognosis, and drug development. Genes that encode proteins with similar sequences are often implicated in related diseases, as proteins causing identical or similar diseases tend to show limited variation in their sequences. Predicting gene-disease association (GDA) requires time-consuming and expensive experiments on a large number of potential candidate genes. Although methods have been proposed to predict associations between genes and diseases using traditional machine learning algorithms and graph neural networks, these approaches struggle to capture the deep semantic information within the genes and diseases and are dependent on training data. To alleviate this issue, we propose a novel GDA prediction model named FusionGDA, which utilizes a pre-training phase with a fusion module to enrich the gene and disease semantic representations encoded by pre-trained language models. Multi-modal representations are generated by the fusion module, which includes rich semantic information about two heterogeneous biomedical entities: protein sequences and disease descriptions. Subsequently, the pooling aggregation strategy is adopted to compress the dimensions of the multi-modal representation. In addition, FusionGDA employs a pre-training phase leveraging a contrastive learning loss to extract potential gene and disease features by training on a large public GDA dataset. To rigorously evaluate the effectiveness of the FusionGDA model, we conduct comprehensive experiments on five datasets and compare our proposed model with five competitive baseline models on the DisGeNet-Eval dataset. Notably, our case study further demonstrates the ability of FusionGDA to discover hidden associations effectively. 
The complete code and datasets of our experiments are available at https://github.com/ZhaohanM/FusionGDA.
Affiliation(s)
- Zhaohan Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Siwei Liu
  - School of Natural and Computing Science, University of Aberdeen King’s College, Aberdeen, AB24 3FX, UK
- Shangsong Liang
  - Machine Learning Department, Mohamed bin Zayed University of Artificial Intelligence, Building 1B, Masdar City, Abu Dhabi 000000, UAE
- Bhautesh Jani
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
- Zaiqiao Meng
  - School of Computing Science, University of Glasgow, 18 Lilybank Gardens, Glasgow G12 8RZ, UK
9
Peng H, Xu J, Liu K, Liu F, Zhang A, Zhang X. EIEPCF: accurate inference of functional gene regulatory networks by eliminating indirect effects from confounding factors. Brief Funct Genomics 2024; 23:373-383. [PMID: 37642217 DOI: 10.1093/bfgp/elad040] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/05/2023] [Revised: 07/07/2023] [Accepted: 08/14/2023] [Indexed: 08/31/2023] Open
Abstract
Reconstructing functional gene regulatory networks (GRNs) is a primary prerequisite for understanding pathogenic mechanisms and curing diseases in animals, and it also provides an important foundation for cultivating vegetable and fruit varieties that are resistant to diseases and corrosion in plants. Many computational methods have been developed to infer GRNs, but most of the regulatory relationships between genes obtained by these methods are biased. Eliminating indirect effects in GRNs remains a significant challenge for researchers. In this work, we propose a novel approach for inferring functional GRNs, named EIEPCF (eliminating indirect effects produced by confounding factors), which eliminates indirect effects caused by confounding factors. This method eliminates the influence of confounding factors on regulatory factors and target genes by measuring the similarity between their residuals. The validation results of the EIEPCF method on simulation studies, the gold-standard networks provided by the DREAM3 Challenge and the real gene networks of Escherichia coli demonstrate that it achieves significantly higher accuracy compared to other popular computational methods for inferring GRNs. As a case study, we utilized the EIEPCF method to reconstruct the cold-resistance-specific GRN from cold-resistance gene expression data in Arabidopsis thaliana. The source code and data are available at https://github.com/zhanglab-wbgcas/EIEPCF.
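The residual-based idea in this abstract can be illustrated with a small worked example. The sketch assumes a single confounder, ordinary least-squares regression, and Pearson correlation as the similarity measure; the actual EIEPCF regression model and similarity measure may differ. The helper names and the expression values are hypothetical.

```python
import math


def mean(v):
    return sum(v) / len(v)


def residuals(y, z):
    """Residuals of y after least-squares regression on a single confounder z."""
    my, mz = mean(y), mean(z)
    slope = sum((a - mz) * (b - my) for a, b in zip(z, y)) / sum((a - mz) ** 2 for a in z)
    return [(b - my) - slope * (a - mz) for a, b in zip(z, y)]


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = math.sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))
    return num / den


# Hypothetical expression profiles: confounder z drives both x and y.
z = [1, 2, 3, 4, 5, 6]
x = [3, 3, 6, 8, 9, 13]      # roughly 2 * z plus noise
y = [0, -1, -5, -6, -4, -5]  # roughly -z plus noise

raw = pearson(x, y)                                   # strong apparent association
adjusted = pearson(residuals(x, z), residuals(y, z))  # association after removing z
```

In this toy data the raw correlation is about -0.75, while the correlation between residuals is zero, so the apparent x-y link is entirely an indirect effect of the shared confounder.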
Affiliation(s)
- Huixiang Peng
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Jing Xu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Kangchen Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - University of Chinese Academy of Sciences, Beijing 100049 China
- Fang Liu
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Aidi Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
- Xiujun Zhang
  - Key Laboratory of Plant Germplasm Enhancement and Specialty Agriculture, Wuhan Botanical Garden, Chinese Academy of Sciences, Wuhan 430074 China
  - Center of Economic Botany, Core Botanical Gardens, Chinese Academy of Sciences, Wuhan 430074, China
10
Sun SL, Zhou BW, Liu SZ, Xiu YH, Bilal A, Long HX. Prediction of miRNAs and diseases association based on sparse autoencoder and MLP. Front Genet 2024; 15:1369811. [PMID: 38873111 PMCID: PMC11169787 DOI: 10.3389/fgene.2024.1369811] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Accepted: 05/07/2024] [Indexed: 06/15/2024] Open
Abstract
Introduction: MicroRNAs (miRNAs) are small, non-coding RNA molecules that have multiple important regulatory roles within cells. As research on miRNAs deepens, a growing body of work shows that the abnormal expression of miRNAs is closely related to various diseases. The relationship between miRNAs and diseases is crucial for discovering the pathogenesis of diseases and exploring new treatment methods. Methods: Therefore, we propose a new sparse autoencoder and MLP method (SPALP) to predict the association between miRNAs and diseases. In this study, we adopt advanced deep learning technologies, including sparse autoencoder and multi-layer perceptron (MLP), to improve the accuracy of predicting miRNA-disease associations. Firstly, the SPALP model uses a sparse autoencoder to perform feature learning and extract the initial features of miRNAs and diseases separately, obtaining the latent features of miRNAs and diseases. Then, the latent features are combined with miRNA functional similarity data and disease semantic similarity data to construct comprehensive miRNA-disease datasets. Subsequently, the MLP model predicts the unknown associations between miRNAs and diseases. Results: To verify the performance of our model, we set up several comparative experiments. The experimental results show that, compared with traditional methods and other deep learning prediction methods, our method has significantly improved the accuracy of predicting miRNA-disease associations, with 94.61% accuracy and 0.9859 AUC value. Finally, we conducted a case study of the SPALP model. We predicted the top 30 miRNAs that might be related to five diseases of the elderly (Lupus Erythematosus, Acute Myeloid Leukemia, Cardiovascular, Stroke, and Diabetes Mellitus) and validated that 27, 29, 29, 30, and 30 of the top 30 are indeed associated.
Discussion: The SPALP approach introduced in this study is adept at forecasting the links between miRNAs and diseases, addressing the complexities of analyzing extensive bioinformatics datasets and enriching the understanding of how miRNAs contribute to disease progression.
Affiliation(s)
- Si-Lin Sun
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Bing-Wei Zhou
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Sheng-Zheng Liu
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Yu-Han Xiu
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
- Anas Bilal
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
  - Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou, China
- Hai-Xia Long
  - Department of Information Science Technology, Hainan Normal University, Haikou, Hainan, China
  - Key Laboratory of Data Science and Smart Education, Ministry of Education, Hainan Normal University, Haikou, China
11
Jia C, Wang F, Xing B, Li S, Zhao Y, Li Y, Wang Q. DGAMDA: Predicting miRNA-disease association based on dynamic graph attention network. International Journal for Numerical Methods in Biomedical Engineering 2024; 40:e3809. [PMID: 38472636 DOI: 10.1002/cnm.3809] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 01/22/2024] [Accepted: 01/27/2024] [Indexed: 03/14/2024]
Abstract
MiRNA (microRNA)-disease association prediction has essential applications for early disease screening. The process of traditional biological experimental validation is both time-consuming and expensive. However, as artificial intelligence technology continues to advance, computational methods have become efficient tools for predicting miRNA-disease associations. These methods often rely on the combination of multiple sources of association data and require improved feature mining. This study proposes a dynamic graph attention-based association prediction model, DGAMDA, which combines feature mapping and dynamic graph attention mechanisms through feature mining on a single miRNA-disease association network. DGAMDA effectively solves the problems of feature heterogeneity and inadequate feature mining by previous static graph attention mechanisms and achieves high-precision feature mining and association scoring prediction. We conducted a five-fold cross-validation experiment and obtained the mean values of Accuracy, Precision, Recall, and F1-score, which were .8986, .8869, .9115, and .8984, respectively. Our proposed model outperforms other advanced models in terms of experimental results, demonstrating its effectiveness in feature mining and association prediction based on a single association network. In addition, our model can also be used to predict miRNAs associated with unknown diseases.
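For reference, the four evaluation metrics reported in this abstract follow from the confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions on a hypothetical set of labels; it does not reproduce the paper's cross-validation procedure, and `binary_metrics` is an illustrative name.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1-score for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 3 true positives, 3 true negatives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In a five-fold setting, such metrics would typically be computed per fold and then averaged, which is presumably how the mean values quoted above were obtained.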
Affiliation(s)
- ChangXin Jia
- Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- FuYu Wang
- College of Computer Science and Technology, China University of Petroleum, Qingdao, People's Republic of China
- Baoxiang Xing
- Department of Obstetrics, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- ShaoNa Li
- Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Yang Zhao
- Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Yu Li
- Department of Anesthesiology, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
- Qing Wang
- Department of Endocrine and Metabolic, the Affiliated Hospital of Qingdao University, Qingdao, People's Republic of China
12
Sheng N, Xie X, Wang Y, Huang L, Zhang S, Gao L, Wang H. A Survey of Deep Learning for Detecting miRNA-Disease Associations: Databases, Computational Methods, Challenges, and Future Directions. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:328-347. [PMID: 38194377 DOI: 10.1109/tcbb.2024.3351752] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/11/2024]
Abstract
MicroRNAs (miRNAs) are an important class of non-coding RNAs that play an essential role in the occurrence and development of various diseases. Identifying the potential miRNA-disease associations (MDAs) can be beneficial in understanding disease pathogenesis. Traditional laboratory experiments are expensive and time-consuming. Computational models have enabled systematic large-scale prediction of potential MDAs, greatly improving the research efficiency. With recent advances in deep learning, it has become an attractive and powerful technique for uncovering novel MDAs. Consequently, numerous MDA prediction methods based on deep learning have emerged. In this review, we first summarize publicly available databases related to miRNAs and diseases for MDA prediction. Next, we outline commonly used miRNA and disease similarity calculation and integration methods. Then, we comprehensively review the 48 existing deep learning-based MDA computation methods, categorizing them into classical deep learning and graph neural network-based techniques. Subsequently, we investigate the evaluation methods and metrics that are frequently used to assess MDA prediction performance. Finally, we discuss the performance trends of different computational methods, point out some problems in current research, and propose 9 potential future research directions. Data resources and recent advances in MDA prediction methods are summarized in the GitHub repository https://github.com/sheng-n/DL-miRNA-disease-association-methods.
13
Daniel Thomas S, Vijayakumar K, John L, Krishnan D, Rehman N, Revikumar A, Kandel Codi JA, Prasad TSK, S S V, Raju R. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2024; 28:213-233. [PMID: 38752932 DOI: 10.1089/omi.2024.0047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/23/2024]
Abstract
MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article offers the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, and because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases.
Affiliation(s)
- Sonet Daniel Thomas
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Krithika Vijayakumar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Levin John
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Deepak Krishnan
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Niyas Rehman
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Amjesh Revikumar
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Kerala Genome Data Centre, Kerala Development and Innovation Strategic Council, Thiruvananthapuram, Kerala, India
- Jalaluddin Akbar Kandel Codi
- Department of Surgical Oncology, Yenepoya Medical College, Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Vinodchandra S S
- Department of Computer Science, University of Kerala, Thiruvananthapuram, Kerala, India
- Rajesh Raju
- Centre for Integrative Omics Data Science (CIODS), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
- Centre for Systems Biology and Molecular Medicine (CSBMM), Yenepoya (Deemed to Be University), Manglore, Karnataka, India
14
He J, Li M, Qiu J, Pu X, Guo Y. HOPEXGB: A Consensual Model for Predicting miRNA/lncRNA-Disease Associations Using a Heterogeneous Disease-miRNA-lncRNA Information Network. J Chem Inf Model 2024; 64:2863-2877. [PMID: 37604142 DOI: 10.1021/acs.jcim.3c00856] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 08/23/2023]
Abstract
Predicting disease-related microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is crucial to find new biomarkers for the prevention, diagnosis, and treatment of complex human diseases. Computational predictions for miRNA/lncRNA-disease associations are of great practical significance, since traditional experimental detection is expensive and time-consuming. In this paper, we proposed a consensual machine-learning technique-based prediction approach to identify disease-related miRNAs and lncRNAs by high-order proximity preserved embedding (HOPE) and eXtreme Gradient Boosting (XGB), named HOPEXGB. By connecting lncRNA, miRNA, and disease nodes based on their correlations and relationships, we first created a heterogeneous disease-miRNA-lncRNA (DML) information network to achieve an effective fusion of information on similarities, correlations, and interactions among miRNAs, lncRNAs, and diseases. In addition, a more rational negative data set was generated based on the similarities of unknown associations with the known ones, so as to effectively reduce the false negative rate in the data set for model construction. By 10-fold cross-validation, HOPE shows better performance than other graph embedding methods. The final consensual HOPEXGB model yields robust performance with a mean prediction accuracy of 0.9569 and also demonstrates high sensitivity and specificity advantages compared to lncRNA/miRNA-specific predictions. Moreover, it is superior to other existing methods and gives promising performance on the external testing data, indicating that integrating the information on lncRNA-miRNA interactions and the similarities of lncRNAs/miRNAs is beneficial for improving the prediction performance of the model. Finally, case studies on lung, stomach, and breast cancers indicate that HOPEXGB could be a powerful tool for preclinical biomarker detection and bioexperiment preliminary screening for the diagnosis and prognosis of cancers. 
HOPEXGB is publicly available at https://github.com/airpamper/HOPEXGB.
Affiliation(s)
- Jian He
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Menglong Li
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Jiangguo Qiu
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
15
Xie G, Xie W, Gu G, Lin Z, Chen R, Liu S, Yu J. A vector projection similarity-based method for miRNA-disease association prediction. Anal Biochem 2024; 687:115431. [PMID: 38123111 DOI: 10.1016/j.ab.2023.115431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2023] [Revised: 12/06/2023] [Accepted: 12/15/2023] [Indexed: 12/23/2023]
Abstract
Many miRNA-disease association prediction models incorporate Gaussian interaction profile kernel similarity (GIPS). However, GIPS fails to consider the specificity of the miRNA-disease association matrix, where matrix elements with a value of 0 represent miRNA and disease relationships that have not been discovered yet. To address this issue and better account for the impact of known and unknown miRNA-disease associations on similarity, we propose a method called vector projection similarity-based method for miRNA-disease association prediction (VPSMDA). In VPSMDA, we introduce three projection rules, combine them with logistic functions for the miRNA-disease association matrix, and propose a vector projection similarity measure for miRNAs and diseases. By integrating the vector projection similarity matrix with the original one, we obtain the improved miRNA and disease similarity matrix. Additionally, we construct a weight matrix using different numbers of neighbors to reduce the noise in the similarity matrix. In performance evaluation, both LOOCV and 5-fold CV experiments demonstrate that VPSMDA outperforms seven other state-of-the-art methods in AUC. Furthermore, in a case study, VPSMDA successfully predicted 10, 9, and 10 out of the top 10 associations for three important human diseases, respectively, and these predictions were confirmed by recent biomedical resources.
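For orientation, the baseline GIPS that this abstract critiques can be sketched as follows. This is the commonly used formulation, in which the bandwidth is normalized by the mean squared norm of the interaction profiles; it is not the VPSMDA projection measure, whose rules the abstract does not specify. The function name `gip_kernel_similarity` and the parameter `gamma_prime` are assumptions made for illustration.

```python
import numpy as np


def gip_kernel_similarity(assoc, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity between rows of assoc.

    assoc is a binary association matrix (rows: miRNAs, columns: diseases).
    Each row is one entity's interaction profile.
    """
    assoc = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(assoc ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        # No known associations at all: every profile is identical.
        return np.ones((assoc.shape[0], assoc.shape[0]))
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (assoc @ assoc.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off
    return np.exp(-gamma * sq_dists)
```

Under this formulation, two miRNAs with no known associations have identical all-zero profiles and therefore receive a similarity of 1, even though their zeros only mean "not yet discovered". That is the behavior the abstract identifies as a weakness.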
Affiliation(s)
- Guobo Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Weijie Xie
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Guosheng Gu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Zhiyi Lin
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Ruibin Chen
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Shigang Liu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
- Junrui Yu
- School of Computer, Guangdong University of Technology, Guangzhou, 510000, China
16
Tian Z, Han C, Xu L, Teng Z, Song W. MGCNSS: miRNA-disease association prediction with multi-layer graph convolution and distance-based negative sample selection strategy. Brief Bioinform 2024; 25:bbae168. [PMID: 38622356 PMCID: PMC11018511 DOI: 10.1093/bib/bbae168] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Revised: 03/14/2024] [Accepted: 03/31/2024] [Indexed: 04/17/2024] Open
Abstract
Identifying disease-associated microRNAs (miRNAs) could help understand the deep mechanism of diseases, which promotes the development of new medicine. Recently, network-based approaches have been widely proposed for inferring the potential associations between miRNAs and diseases. However, these approaches ignore the importance of different relations in meta-paths when learning the embeddings of miRNAs and diseases. Besides, they pay little attention to screening out reliable negative samples which is crucial for improving the prediction accuracy. In this study, we propose a novel approach named MGCNSS with the multi-layer graph convolution and high-quality negative sample selection strategy. Specifically, MGCNSS first constructs a comprehensive heterogeneous network by integrating miRNA and disease similarity networks coupled with their known association relationships. Then, we employ the multi-layer graph convolution to automatically capture the meta-path relations with different lengths in the heterogeneous network and learn the discriminative representations of miRNAs and diseases. After that, MGCNSS establishes a highly reliable negative sample set from the unlabeled sample set with the negative distance-based sample selection strategy. Finally, we train MGCNSS under an unsupervised learning manner and predict the potential associations between miRNAs and diseases. The experimental results fully demonstrate that MGCNSS outperforms all baseline methods on both balanced and imbalanced datasets. More importantly, we conduct case studies on colon neoplasms and esophageal neoplasms, further confirming the ability of MGCNSS to detect potential candidate miRNAs. The source code is publicly available on GitHub https://github.com/15136943622/MGCNSS/tree/master.
Affiliation(s)
- Zhen Tian
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Chenguang Han
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Lewen Xu
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
- Zhixia Teng
- College of Computer and Control Engineering, Northeast Forestry University, Harbin 150040, China
- Wei Song
- School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450000, China
17
Zou H, Ji B, Zhang M, Liu F, Xie X, Peng S. MHGTMDA: Molecular heterogeneous graph transformer based on biological entity graph for miRNA-disease associations prediction. MOLECULAR THERAPY. NUCLEIC ACIDS 2024; 35:102139. [PMID: 38384447 PMCID: PMC10879798 DOI: 10.1016/j.omtn.2024.102139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Accepted: 01/31/2024] [Indexed: 02/23/2024]
Abstract
MicroRNAs (miRNAs) play a crucial role in the prevention, prognosis, diagnosis, and treatment of complex diseases. Existing computational methods primarily focus on biologically relevant molecules directly associated with miRNA or disease, overlooking the fact that the human body is a highly complex system where miRNA or disease may indirectly correlate with various types of biomolecules. To address this, we propose a novel prediction model named MHGTMDA (miRNA and disease association prediction using heterogeneous graph transformer based on molecular heterogeneous graph). MHGTMDA integrates biological entity relationships of eight biomolecules, constructing a relatively comprehensive heterogeneous biological entity graph. MHGTMDA serves as a powerful molecular heterogeneity map transformer, capturing structural elements and properties of miRNAs and diseases, revealing potential associations. In a 5-fold cross-validation study, MHGTMDA achieved an area under the receiver operating characteristic curve of 0.9569, surpassing state-of-the-art methods by at least 3%. Feature ablation experiments suggest that considering features among multiple biomolecules is more effective in uncovering miRNA-disease correlations. Furthermore, we conducted differential expression analyses on breast cancer and lung cancer, using MHGTMDA to further validate differentially expressed miRNAs. The results demonstrate MHGTMDA's capability to identify novel MDAs.
Affiliation(s)
- Haitao Zou
- Guilin University of Technology, College of Information Science and Engineering, Guilin 541006, China
- Hunan University, College of Computer Science and Electronic Engineering, Changsha 410082, China
- Boya Ji
- Hunan University, College of Computer Science and Electronic Engineering, Changsha 410082, China
- Meng Zhang
- Xiangya Hospital, The Department of Thoracic Surgery, Changsha 410082, China
- Fen Liu
- Hunan Provincial People’s Hospital, Institute of Cardiovascular Epidemiology, Changsha 410082, China
- Xiaolan Xie
- Guilin University of Technology, College of Information Science and Engineering, Guilin 541006, China
- Shaoliang Peng
- Hunan University, College of Computer Science and Electronic Engineering, Changsha 410082, China
18
Yao HB, Hou ZJ, Zhang WG, Li H, Chen Y. Prediction of MicroRNA-Disease Potential Association Based on Sparse Learning and Multilayer Random Walks. J Comput Biol 2024; 31:241-256. [PMID: 38377572 DOI: 10.1089/cmb.2023.0266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024] Open
Abstract
More and more studies have shown that microRNAs (miRNAs) play an indispensable role in the study of complex diseases in humans. Traditional biological experiments to detect miRNA-disease associations are expensive and time-consuming. Therefore, it is necessary to propose efficient and meaningful computational models to predict miRNA-disease associations. In this study, we aim to propose a miRNA-disease association prediction model based on sparse learning and multilayer random walks (SLMRWMDA). The miRNA-disease association matrix is decomposed and reconstructed by the sparse learning method to obtain richer association information, and at the same time, the initial probability matrix for the random walk with restart algorithm is obtained. The disease similarity network, miRNA similarity network, and miRNA-disease association network are used to construct heterogeneous networks, and the stable probability is obtained based on the topological structure features of diseases and miRNAs through a multilayer random walk algorithm to predict miRNA-disease potential association. The experimental results show that the prediction accuracy of this model is significantly improved compared with the previous related models. We evaluated the model using global leave-one-out cross-validation (global LOOCV) and fivefold cross-validation (5-fold CV). The area under the curve (AUC) value for the LOOCV is 0.9368. The mean AUC value for 5-fold CV is 0.9335 and the variance is 0.0004. In the case study, the results show that SLMRWMDA is effective in inferring the potential association of miRNA-disease.
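The random walk with restart step mentioned in this abstract follows, in its standard single-network form, the iteration p(t+1) = (1 - r) W p(t) + r p(0), where W is a column-normalized transition matrix and r is the restart probability. The sketch below illustrates only that textbook form; it is not the multilayer walk of SLMRWMDA, and the function name, the default `restart=0.5`, and the convergence tolerance are assumptions.

```python
import numpy as np


def random_walk_with_restart(adjacency, p0, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - restart) * W @ p + restart * p0 until convergence.

    adjacency: square non-negative matrix of a (similarity) network.
    p0: initial probability vector, e.g. seed nodes for one disease.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    col_sums = adjacency.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # leave isolated nodes as zero columns
    transition = adjacency / col_sums      # column-normalized transition matrix W
    p0 = np.asarray(p0, dtype=float)
    p0 = p0 / p0.sum()                     # normalize seeds to a distribution
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1.0 - restart) * (transition @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p
```

The returned stable probabilities rank nodes by proximity to the seeds; in an association-prediction setting, higher values would indicate stronger candidate associations.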
Affiliation(s)
- Hai-Bin Yao
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Zhen-Jie Hou
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Wen-Guang Zhang
- Life Sciences, Inner Mongolia Agricultural University, Hohhot, China
- Han Li
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
- Yan Chen
- Computer Science and Artificial Intelligence and Aliyun School of Big Data, Changzhou University, Changzhou, China
19
Xie GB, Yu JR, Lin ZY, Gu GS, Chen RB, Xu HJ, Liu ZG. Prediction of miRNA-disease associations based on strengthened hypergraph convolutional autoencoder. Comput Biol Chem 2024; 108:107992. [PMID: 38056378 DOI: 10.1016/j.compbiolchem.2023.107992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 11/04/2023] [Accepted: 11/24/2023] [Indexed: 12/08/2023]
Abstract
Most existing graph neural network-based methods for predicting miRNA-disease associations rely on initial association matrices to pass messages, but the sparsity of these matrices greatly limits performance. To address this issue and predict potential associations between miRNAs and diseases, we propose a method called strengthened hypergraph convolutional autoencoder (SHGAE). SHGAE leverages multiple layers of strengthened hypergraph neural networks (SHGNN) to obtain robust node embeddings. Within SHGNN, we design a strengthened hypergraph convolutional network module (SHGCN) that enhances original graph associations and reduces matrix sparsity. Additionally, SHGCN expands node receptive fields by utilizing hyperedge features as intermediaries to obtain high-order neighbor embeddings. To improve performance, we also incorporate attention-based fusion of self-embeddings and SHGCN embeddings. SHGAE predicts potential miRNA-disease associations using a multilayer perceptron as the decoder. Across multiple metrics, SHGAE outperforms other state-of-the-art methods in five-fold cross-validation. Furthermore, we evaluate SHGAE on colon and lung neoplasms cases to demonstrate its ability to predict potential associations. Notably, SHGAE also performs well in the analysis of gastric neoplasms without miRNA associations.
Affiliation(s)
- Guo-Bo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Jun-Rui Yu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Zhi-Yi Lin
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Guo-Sheng Gu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Rui-Bin Chen
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Hao-Jie Xu
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510000, China
- Zhen-Guo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou 510080, China
20
Jiao CN, Zhou F, Liu BM, Zheng CH, Liu JX, Gao YL. Multi-Kernel Graph Attention Deep Autoencoder for MiRNA-Disease Association Prediction. IEEE J Biomed Health Inform 2024; 28:1110-1121. [PMID: 38055359 DOI: 10.1109/jbhi.2023.3336247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
Accumulating evidence indicates that microRNAs (miRNAs) can control and coordinate various biological processes. Consequently, abnormal expressions of miRNAs have been linked to various complex diseases. Identification of miRNA-disease associations (MDAs) will contribute to the diagnosis and treatment of human diseases. Nevertheless, traditional experimental verification of MDAs is laborious and limited to a small scale. Therefore, it is necessary to develop reliable and effective computational methods to predict novel MDAs. In this work, a multi-kernel graph attention deep autoencoder (MGADAE) method is proposed to predict potential MDAs. In detail, MGADAE first employs the multiple kernel learning (MKL) algorithm to construct an integrated miRNA similarity and disease similarity, providing more biological information for further feature learning. Second, MGADAE combines the known MDAs, disease similarity, and miRNA similarity into a heterogeneous network, then learns the representations of miRNAs and diseases through graph convolution operation. After that, an attention mechanism is introduced into MGADAE to integrate the representations from multiple graph convolutional network (GCN) layers. Lastly, the integrated representations of miRNAs and diseases are input into the bilinear decoder to obtain the final predicted association scores. Corresponding experiments prove that the proposed method outperforms existing advanced approaches in MDA prediction. Furthermore, case studies related to two human cancers provide further confirmation of the reliability of MGADAE in practice.
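A bilinear decoder of the kind named in this abstract typically scores every miRNA-disease pair as z_m^T W z_d, where z_m and z_d are the learned embeddings and W is a trainable matrix. The following is a minimal sketch of that generic scoring step only; the sigmoid output, the function name `bilinear_decoder`, and the array shapes are assumptions, since the abstract gives no further detail.

```python
import numpy as np


def bilinear_decoder(z_mirna, z_disease, weight):
    """Score all miRNA-disease pairs with a bilinear form.

    z_mirna:   (n_mirna, d) embedding matrix.
    z_disease: (n_disease, d) embedding matrix.
    weight:    (d, d) matrix (trainable in a real model; fixed here).
    Returns an (n_mirna, n_disease) matrix of scores in (0, 1).
    """
    z_mirna = np.asarray(z_mirna, dtype=float)
    z_disease = np.asarray(z_disease, dtype=float)
    weight = np.asarray(weight, dtype=float)
    logits = z_mirna @ weight @ z_disease.T
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid: an assumed output activation
```

With `weight` fixed to the identity matrix, this reduces to the inner-product decoder used in plain graph autoencoders; the extra matrix lets the model weight interactions between embedding dimensions.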
21
Han Y, Zhou Q, Liu L, Li J, Zhou Y. DNI-MDCAP: improvement of causal MiRNA-disease association prediction based on deep network imputation. BMC Bioinformatics 2024; 25:22. [PMID: 38216907 PMCID: PMC10785389 DOI: 10.1186/s12859-024-05644-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Accepted: 01/08/2024] [Indexed: 01/14/2024] Open
Abstract
BACKGROUND: MiRNAs are involved in the occurrence and development of many diseases. Extensive literature studies have demonstrated that miRNA-disease associations are stratified and encompass ~20% causal associations. Computational models that predict causal miRNA-disease associations provide effective guidance in identifying novel interpretations of disease mechanisms and potential therapeutic targets. Although several predictive models for miRNA-disease associations exist, it is still challenging to discriminate causal miRNA-disease associations from non-causal ones. Hence, there is a pressing need to develop an efficient prediction model for causal miRNA-disease association prediction. RESULTS: We developed DNI-MDCAP, an improved computational model that incorporated additional miRNA similarity metrics, deep graph embedding learning-based network imputation and semi-supervised learning framework. Through extensive predictive performance evaluation, including tenfold cross-validation and independent test, DNI-MDCAP showed excellent performance in identifying causal miRNA-disease associations, achieving an area under the receiver operating characteristic curve (AUROC) of 0.896 and 0.889, respectively. Regarding the challenge of discriminating causal miRNA-disease associations from non-causal ones, DNI-MDCAP exhibited superior predictive performance compared to existing models MDCAP and LE-MDCAP, reaching an AUROC of 0.870. Wilcoxon test also indicated significantly higher prediction scores for causal associations than for non-causal ones. Finally, the potential causal miRNA-disease associations predicted by DNI-MDCAP, exemplified by diabetic nephropathies and hsa-miR-193a, have been validated by recently published literature, further supporting the reliability of the prediction model. CONCLUSIONS: DNI-MDCAP is a dedicated tool to specifically distinguish causal miRNA-disease associations with substantially improved accuracy.
DNI-MDCAP is freely accessible at http://www.rnanut.net/DNIMDCAP/.
Affiliation(s)
- Yu Han
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Qiong Zhou
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Leibo Liu
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yuan Zhou
- Department of Biomedical Informatics, School of Basic Medical Sciences, Peking University, Beijing, China
- State Key Laboratory of Vascular Homeostasis and Remodeling, Peking University, Beijing, China
22
Xu L, Fu X, Zhuo L, Zhou Z, Liao X, Tian S, Kang R, Chen Y. SGAE-MDA: Exploring the MiRNA-disease associations in herbal medicines based on semi-supervised graph autoencoder. Methods 2024; 221:73-81. [PMID: 38123109 DOI: 10.1016/j.ymeth.2023.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/28/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023] Open
Abstract
Research indicates that miRNAs present in herbal medicines are crucial for identifying disease markers, advancing gene therapy, facilitating drug delivery, and so on. These miRNAs maintain stability in the extracellular environment, making them viable tools for disease diagnosis. They can withstand the digestive processes in the gastrointestinal tract, positioning them as potential carriers for specific oral drug delivery. By engineering plants to generate effective, non-toxic miRNA interference sequences, it's possible to broaden their applicability, including the treatment of diseases such as hepatitis C. Consequently, delving into the miRNA-disease associations (MDAs) within herbal medicines holds immense promise for diagnosing and addressing miRNA-related diseases. In our research, we propose the SGAE-MDA model, which harnesses the strengths of a graph autoencoder (GAE) combined with a semi-supervised approach to uncover potential MDAs in herbal medicines more effectively. Leveraging the GAE framework, the SGAE-MDA model precisely integrates the inherent feature vectors of miRNAs and disease nodes with the regulatory data in the miRNA-disease network. Additionally, the proposed semi-supervised learning approach randomly hides part of the structure of the miRNA-disease network and subsequently reconstructs it within the GAE framework. This technique effectively minimizes network noise interference. Through comparison against other leading deep learning models, the results consistently highlighted the superior performance of the proposed SGAE-MDA model. Our code and dataset are available at: https://github.com/22n9n23/SGAE-MDA.
Affiliation(s)
- Lei Xu
- Wenzhou University of Technology, Wenzhou, China
- Xiangzheng Fu
- Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China; College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Linlin Zhuo
- Wenzhou University of Technology, Wenzhou, China
- Xuefeng Liao
- Wenzhou University of Technology, Wenzhou, China
- Sha Tian
- Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
- Ruofei Kang
- Xuhui Excellent Health Information Technology Co., Ltd., China
- Yifan Chen
- College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
23
Yang C, Wang Z, Zhang S, Li X, Wang X, Liu J, Li R, Zeng S. MVNMDA: A Multi-View Network Combing Semantic and Global Features for Predicting miRNA-Disease Association. Molecules 2023; 29:230. [PMID: 38202814 PMCID: PMC10780172 DOI: 10.3390/molecules29010230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2023] [Revised: 12/23/2023] [Accepted: 12/28/2023] [Indexed: 01/12/2024] Open
Abstract
A growing body of experimental evidence suggests that microRNAs (miRNAs) are closely associated with specific human diseases and play critical roles in their development and progression. Therefore, identifying miRNAs related to specific diseases is of great significance for disease screening and treatment. In the early stages, the identification of associations between miRNAs and diseases demanded laborious and time-consuming biological experiments that often carried a substantial risk of failure. With the exponential growth in the number of potential miRNA-disease association combinations, traditional biological experimental methods face difficulties in processing massive amounts of data. Hence, developing more efficient computational methods to predict possible miRNA-disease associations and prioritize them is particularly necessary. In recent years, numerous deep learning-based computational methods have been developed and have demonstrated excellent performance. However, most of these methods rely on external databases or tools to compute various auxiliary information. Unfortunately, these external databases or tools often cover only a limited portion of miRNAs and diseases, so many miRNAs and diseases cannot be handled by these computational methods, which limits their practical application. To overcome these limitations, this study proposes a multi-view computational model called MVNMDA, which predicts potential miRNA-disease associations by integrating features of miRNAs and diseases from local, global, and semantic views. Specifically, MVNMDA uses known association information to construct initial node features. Then, multiple networks are constructed from the known associations to extract low-dimensional feature embeddings of all nodes. Finally, a cascaded attention classifier is proposed to fuse features from coarse to fine, suppressing noise within the features and making precise predictions. To validate the effectiveness of the proposed method, extensive experiments were conducted on the HMDD v2.0 and HMDD v3.2 datasets. The experimental results demonstrate that MVNMDA achieves better performance than other computational methods. Additionally, the case study results further demonstrate the reliable predictive performance of MVNMDA.
Affiliation(s)
- Zhen Wang
- School of Electronic Information, Xijing University, Xi’an 710123, China; (C.Y.); (S.Z.); (X.L.); (X.W.); (J.L.); (R.L.); (S.Z.)
|
24
|
Liao Q, Fu X, Zhuo L, Chen H. An efficient model for predicting human diseases through miRNA based on multiple-types of contrastive learning. Front Microbiol 2023; 14:1325001. [PMID: 38163075 PMCID: PMC10755968 DOI: 10.3389/fmicb.2023.1325001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 11/16/2023] [Indexed: 01/03/2024] Open
Abstract
Multiple studies have demonstrated that microRNAs (miRNAs) can be deeply involved in the regulatory mechanisms of the human microbiota, thereby inducing disease. Developing effective methods to infer potential associations between miRNAs and diseases can aid early diagnosis and treatment. Recent methods use machine learning or deep learning to predict miRNA-disease associations (MDAs), achieving state-of-the-art performance. However, the problem of sparse node neighborhoods caused by a lack of data has not been well solved. To this end, we propose a new model named MTCL-MDA, which integrates multiple types of contrastive learning strategies into a graph collaborative filtering model to predict potential MDAs. The model adopts a topology-based contrastive learning strategy, which alleviates the damage to model performance caused by sparse neighborhoods. In addition, the model adopts a semantic-based contrastive learning strategy, which not only reduces the impact of noise introduced by topology-based contrastive learning but also enhances the semantic information of nodes. Experimental results show that our model outperforms existing models on all evaluation metrics. Case analysis shows that our model can more accurately identify potential MDAs, which is of great significance for the screening and diagnosis of real-life diseases. Our data and code are publicly available at: https://github.com/Lqingquan/MTCL-MDA.
Affiliation(s)
- Qingquan Liao
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiangzheng Fu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Linlin Zhuo
- School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
- Hao Chen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
25
|
Peng L, Tan J, Xiong W, Zhang L, Wang Z, Yuan R, Li Z, Chen X. Deciphering ligand-receptor-mediated intercellular communication based on ensemble deep learning and the joint scoring strategy from single-cell transcriptomic data. Comput Biol Med 2023; 163:107137. [PMID: 37364528 DOI: 10.1016/j.compbiomed.2023.107137] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 02/26/2023] [Revised: 05/18/2023] [Accepted: 06/04/2023] [Indexed: 06/28/2023]
Abstract
BACKGROUND: Cell-cell communication in a tumor microenvironment is vital to tumorigenesis, tumor progression and therapy. Intercellular communication inference helps understand molecular mechanisms of tumor growth, progression and metastasis. METHODS: Focusing on ligand-receptor co-expression, in this study we developed an ensemble deep learning framework, CellComNet, to decipher ligand-receptor-mediated cell-cell communication from single-cell transcriptomic data. First, credible ligand-receptor interactions (LRIs) are captured by integrating data arrangement, feature extraction, dimension reduction, and LRI classification based on an ensemble of a heterogeneous Newton boosting machine and a deep neural network. Next, known and identified LRIs are screened based on single-cell RNA sequencing (scRNA-seq) data in certain tissues. Finally, cell-cell communication is inferred by incorporating scRNA-seq data, the screened LRIs, and a joint scoring strategy that combines expression thresholding and the expression product of ligands and receptors. RESULTS: The proposed CellComNet framework was compared with four competing protein-protein interaction prediction models (PIPR, XGBoost, DNNXGB, and OR-RCNN) and obtained the best AUCs and AUPRs on four LRI datasets, demonstrating the best LRI classification ability. CellComNet was further applied to analyze intercellular communication in human melanoma and head and neck squamous cell carcinoma (HNSCC) tissues. The results demonstrate that cancer-associated fibroblasts communicate strongly with melanoma cells and endothelial cells communicate strongly with HNSCC cells. CONCLUSIONS: The proposed CellComNet framework efficiently identified credible LRIs and significantly improved cell-cell communication inference performance. We anticipate that CellComNet can contribute to anticancer drug design and tumor-targeted therapy.
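The joint scoring idea (expression thresholding plus the expression product of ligand and receptor) can be sketched as follows. The abstract does not say how the two signals are combined, so this sketch assumes the threshold acts as a gate on the product. The function name `joint_score`, the threshold value, and the expression values are all hypothetical.

```python
def joint_score(ligand_expr, receptor_expr, threshold=0.5):
    """Score one ligand-receptor pair between a sender and a receiver cell type.

    ligand_expr:   ligand expression values across sender cells
    receptor_expr: receptor expression values across receiver cells
    Assumption: the pair contributes its mean-expression product only
    when both means exceed the threshold; otherwise it contributes 0.
    """
    mean_ligand = sum(ligand_expr) / len(ligand_expr)
    mean_receptor = sum(receptor_expr) / len(receptor_expr)
    passes = mean_ligand > threshold and mean_receptor > threshold
    return mean_ligand * mean_receptor if passes else 0.0

# Made-up expression values for two ligand-receptor pairs.
strong_pair = joint_score([1.0, 3.0], [2.0, 2.0])   # both means are 2.0
weak_pair = joint_score([0.1, 0.1], [2.0, 2.0])     # ligand mean below threshold
communication_strength = strong_pair + weak_pair     # sum over screened pairs
```

Summing the per-pair scores over all screened LRIs gives one number per sender-receiver cell-type pair, which is one simple way to rank communicating cell types.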
Affiliation(s)
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China; College of Life Sciences and Chemistry, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
- Jingwei Tan
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
- Wei Xiong
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, Jiangsu, China
- Zhao Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
- Ruya Yuan
- School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China
- Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, 421002, Hunan, China
- Xing Chen
- School of Science, Jiangnan University, Wuxi, 214122, Jiangsu, China
|
26
|
Azani A, Omran SP, Ghasrsaz H, Idani A, Eliaderani MK, Peirovi N, Dokhani N, Lotfalizadeh MH, Rezaei MM, Ghahfarokhi MS, KarkonShayan S, Hanjani PN, Kardaan Z, Navashenagh JG, Yousefi M, Abdolahi M, Salmaninejad A. MicroRNAs as biomarkers for early diagnosis, targeting and prognosis of prostate cancer. Pathol Res Pract 2023; 248:154618. [PMID: 37331185 DOI: 10.1016/j.prp.2023.154618] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/17/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023] [Indexed: 06/20/2023]
Abstract
Prostate cancer (PC) is a leading cause of cancer-related mortality in men worldwide. Despite significant advances in the treatment and management of this disease, cure rates for PC remain low, largely due to late detection. PC detection relies mostly on prostate-specific antigen (PSA) testing and digital rectal examination (DRE); however, due to the low positive predictive value of current diagnostics, there is an urgent need to identify new, accurate biomarkers. Recent studies support the biological role of microRNAs (miRNAs) in the initiation and progression of PC, as well as their potential as novel biomarkers for patients' diagnosis, prognosis, and disease relapse. In the advanced stages, cancer-cell-derived small extracellular vesicles (SEVs) may constitute a significant part of circulating vesicles and cause detectable changes in the plasma vesicular miRNA profile. Recent computational models for the identification of miRNA biomarkers are also discussed. In addition, accumulating evidence indicates that miRNAs can be utilized to target PC cells. In this article, the current understanding of the role of miRNAs and exosomes in the pathogenesis of PC, and their significance in PC prognosis, early diagnosis, chemoresistance, and treatment, is reviewed.
Affiliation(s)
- Alireza Azani
- Department of Medical Genetics, Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Drug Applied Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Sima Parvizi Omran
- Department of Medical Genetics, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Haniyeh Ghasrsaz
- Faculty of Medicine, Mazandaran University of Medical Sciences, Mazandaran, Iran
- Asra Idani
- Department of Medical Genetics, Faculty of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Niloufar Peirovi
- Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Negar Dokhani
- Student Research Committee, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Sepideh KarkonShayan
- Social Development and Health Promotion Research Center, Gonabad University of Medical Sciences, Gonabad, Iran
- Parisa Najari Hanjani
- Department of Genetics, Faculty of Advanced Technologies in Medicine, Golestan University of Medical Science, Gorgan, Iran
- Zahra Kardaan
- Department of Cellular Molecular Biology, Faculty of Life Science and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Meysam Yousefi
- Department of Medical Genetics, Faculty of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran
- Mitra Abdolahi
- Department of Pathology, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Arash Salmaninejad
- Department of Medical Genetics, Faculty of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Drug Applied Research Center, Tabriz University of Medical Sciences, Tabriz, Iran; Regenerative Medicine, Organ Procurement and Transplantation Multi-Disciplinary Center, Razi Hospital, School of Medicine, Guilan University of Medical Sciences, Rasht, Iran
|
27
|
Zhou L, Wang Y, Peng L, Li Z, Luo X. Identifying potential drug-target interactions based on ensemble deep learning. Front Aging Neurosci 2023; 15:1176400. [PMID: 37396659 PMCID: PMC10309650 DOI: 10.3389/fnagi.2023.1176400] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/28/2023] [Accepted: 05/10/2023] [Indexed: 07/04/2023] Open
Abstract
Introduction: Drug-target interaction (DTI) prediction is an important step in drug research and development. Experimental methods are time-consuming and laborious. Methods: In this study, we developed a novel DTI prediction method called EnGDD by combining initial feature acquisition, dimensional reduction, and DTI classification based on a gradient boosting neural network, a deep neural network, and Deep Forest. Results: EnGDD was compared with seven state-of-the-art DTI prediction methods (BLM-NII, NRLMF, WNNGIP, NEDTP, DTi2Vec, RoFDT, and MolTrans) on the nuclear receptor, GPCR, ion channel, and enzyme datasets under cross-validation on drugs, targets, and drug-target pairs, respectively. EnGDD achieved the best recall, accuracy, F1-score, AUC, and AUPR under the majority of conditions, demonstrating its powerful DTI identification performance. EnGDD predicted that D00182 and hsa2099, D07871 and hsa1813, DB00599 and hsa2562, and D00002 and hsa10935 have higher interaction probabilities among unknown drug-target pairs and may be potential DTIs on the four datasets, respectively. In particular, D00002 (Nadide) was identified to interact with hsa10935 (mitochondrial peroxiredoxin 3), whose up-regulation might be used to treat neurodegenerative diseases. Finally, EnGDD was used to find possible drug targets for Parkinson's disease and Alzheimer's disease after confirming its DTI identification performance. The results show that D01277, D04641, and D08969 may be applied to the treatment of Parkinson's disease through targeting hsa1813 (dopamine receptor D2), and that D02173, D02558, and D03822 may provide treatment clues for patients with Alzheimer's disease through targeting hsa5743 (prostaglandin-endoperoxide synthase 2). These prediction results need further biomedical validation. Discussion: We anticipate that our proposed EnGDD model can help discover potential therapeutic clues for various diseases, including neurodegenerative diseases.
Affiliation(s)
- Liqian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Yuzhuang Wang
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Lihong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Zejun Li
- School of Computer Science, Hunan Institute of Technology, Hengyang, China
- Xueming Luo
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
|
28
|
Fan C, Ding M. Inferring pseudogene-MiRNA associations based on an ensemble learning framework with similarity kernel fusion. Sci Rep 2023; 13:8833. [PMID: 37258695 DOI: 10.1038/s41598-023-36054-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/29/2023] [Accepted: 05/28/2023] [Indexed: 06/02/2023] Open
Abstract
Accumulating evidence shows that pseudogenes can function as microRNA (miRNA) sponges and regulate gene expression. Mining potential interactions between pseudogenes and miRNAs will facilitate the clinical diagnosis and treatment of complex diseases. However, identifying their interactions through biological experiments is time-consuming and labor-intensive. In this study, an ensemble learning framework with similarity kernel fusion, named ELPMA, is proposed to predict pseudogene-miRNA associations. First, four pseudogene similarity profiles and five miRNA similarity profiles are measured based on biological and topological properties. Subsequently, the similarity kernel fusion method is used to integrate the similarity profiles. Then, the feature representation for pseudogenes and miRNAs is obtained by combining the pseudogene-pseudogene similarities and the miRNA-miRNA similarities. Lastly, individual learners are trained on each training subset, and soft voting is used to yield the final decision based on the prediction results of the individual learners. k-fold cross-validation is used to evaluate the prediction performance of ELPMA. In addition, case studies are conducted on three pseudogenes to validate the predictive performance of ELPMA for pseudogene-miRNA interactions. Overall, the experimental results show that ELPMA is a feasible and effective tool for predicting interactions between pseudogenes and miRNAs.
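Soft voting, as mentioned in this abstract, averages the class probabilities produced by the individual learners and then applies a decision threshold. The sketch below shows only that aggregation step, with made-up probabilities. The function name `soft_vote` and the 0.5 threshold are assumptions, and the learners themselves are not modelled.

```python
def soft_vote(prob_lists, threshold=0.5):
    """Average per-sample positive-class probabilities across learners.

    prob_lists: one list of probabilities per learner, all of equal length.
    Returns the averaged probabilities and the thresholded 0/1 labels.
    """
    n_learners = len(prob_lists)
    n_samples = len(prob_lists[0])
    averaged = [sum(probs[i] for probs in prob_lists) / n_learners
                for i in range(n_samples)]
    labels = [1 if p >= threshold else 0 for p in averaged]
    return averaged, labels

# Three hypothetical learners scoring two candidate pseudogene-miRNA pairs.
learner_probs = [[0.9, 0.2],
                 [0.7, 0.4],
                 [0.8, 0.6]]
averaged, labels = soft_vote(learner_probs)
```

Unlike hard (majority) voting, this keeps each learner's confidence, so one very confident learner can outweigh two lukewarm ones.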
Affiliation(s)
- Chunyan Fan
- School of Computer Science and Engineering, Xi'an Technological University, Xi'an, 710021, China
- Mingchao Ding
- School of Computer Science, Hubei University of Technology, Wuhan, 430068, China
|
29
|
Gu C, Li X. Prediction of disease-related miRNAs by voting with multiple classifiers. BMC Bioinformatics 2023; 24:177. [PMID: 37122001 PMCID: PMC10150488 DOI: 10.1186/s12859-023-05308-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/31/2022] [Accepted: 04/26/2023] [Indexed: 05/02/2023] Open
Abstract
There is strong evidence to support that mutations and dysregulation of miRNAs are associated with a variety of diseases, including cancer. However, the experimental methods used to identify disease-related miRNAs are expensive and time-consuming. Effective computational approaches to identify disease-related miRNAs are in high demand and would aid in the detection of lncRNA biomarkers for disease diagnosis, treatment, and prevention. In this study, we develop an ensemble learning framework to reveal the potential associations between miRNAs and diseases (ELMDA). The ELMDA framework does not rely on the known associations when calculating miRNA and disease similarities and uses multi-classifiers voting to predict disease-related miRNAs. As a result, the average AUC of the ELMDA framework was 0.9229 for the HMDD v2.0 database in a fivefold cross-validation. All potential associations in the HMDD V2.0 database were predicted, and 90% of the top 50 results were verified with the updated HMDD V3.2 database. The ELMDA framework was implemented to investigate gastric neoplasms, prostate neoplasms and colon neoplasms, and 100%, 94%, and 90%, respectively, of the top 50 potential miRNAs were validated by the HMDD V3.2 database. Moreover, the ELMDA framework can predict isolated disease-related miRNAs. In conclusion, ELMDA appears to be a reliable method to uncover disease-associated miRNAs.
Affiliation(s)
- Changlong Gu
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiaoying Li
- College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
|
30
|
Qu Q, Chen X, Ning B, Zhang X, Nie H, Zeng L, Chen H, Fu X. Prediction of miRNA-disease associations by neural network-based deep matrix factorization. Methods 2023; 212:1-9. [PMID: 36813017 DOI: 10.1016/j.ymeth.2023.02.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/06/2022] [Revised: 01/17/2023] [Accepted: 02/10/2023] [Indexed: 02/23/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of short non-coding RNAs, about 22 nucleotides in length, that participate in various biological processes of cells. A number of studies have shown that miRNAs are closely related to the occurrence of cancer and various human diseases. Therefore, studying miRNA-disease associations helps in understanding the pathogenesis of diseases as well as their prevention, diagnosis, treatment and prognosis. Traditional biological experimental methods for studying miRNA-disease associations have disadvantages: they require expensive equipment and are time-consuming and labor-intensive. With the rapid development of bioinformatics, more and more researchers are committed to developing effective computational methods to predict miRNA-disease associations in order to reduce the time and cost of experiments. In this study, we proposed a neural network-based deep matrix factorization method named NNDMF to predict miRNA-disease associations. To address the problem that traditional matrix factorization methods can only extract linear features, NNDMF uses a neural network to perform deep matrix factorization and extract nonlinear features, which makes up for the shortcomings of traditional matrix factorization methods. We compared NNDMF with four classical prediction models (IMCMDA, GRMDA, SACMDA and ICFMDA) in global LOOCV and local LOOCV. The AUCs achieved by NNDMF in the two cross-validation settings were 0.9340 and 0.8763, respectively. Furthermore, we conducted case studies on three important human diseases (lymphoma, colorectal cancer and lung cancer) to validate the effectiveness of NNDMF. In conclusion, NNDMF could effectively predict potential miRNA-disease associations.
Affiliation(s)
- Qiang Qu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xia Chen
- School of Basic Education, Changsha Aeronautical Vocational and Technical College, Changsha, China
- Bin Ning
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiang Zhang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Hao Nie
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Li Zeng
- College of Life and Environmental Science, Hunan University of Art and Science, Changde, China
- Haowen Chen
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Xiangzheng Fu
- Research Institute of Hunan University in Chongqing, Chongqing, China
|
31
|
S S, E R V, Krishnakumar U. Improving miRNA Disease Association Prediction Accuracy Using Integrated Similarity Information and Deep Autoencoders. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1125-1136. [PMID: 35914051 DOI: 10.1109/tcbb.2022.3195514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
MicroRNAs (miRNAs) are short endogenous non-coding RNA molecules (~22 nt) that have a vital role in many biological and molecular processes inside the human body. Abnormal and dysregulated expression of miRNAs is correlated with many complex disorders. Wet-lab biological experiments are time-consuming, costly and labour-intensive, so feasible and efficient computational approaches are needed for predicting promising miRNAs associated with diseases. Here, a two-stage feature pruning approach based on miRNA feature similarity fusion, which uses a deep attention autoencoder and recursive feature elimination with cross-validation (RFECV), is proposed for predicting unknown miRNA-disease associations. In the first stage, an attention autoencoder captures highly influential features from the fused feature vector. RFECV is then applied for further pruning of features. The resulting features are given to a Random Forest classifier for association prediction. The highest AUC of 94.41% is attained when all miRNA similarity measures are merged with disease similarities. Case studies were done on two diseases, lymphoma and leukaemia, to examine the reliability of the approach. Comparative analysis shows that the proposed approach outperforms recent methodologies for predicting miRNA-disease associations.
|
32
|
Ha J, Park S. NCMD: Node2vec-Based Neural Collaborative Filtering for Predicting MiRNA-Disease Association. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1257-1268. [PMID: 35849666 DOI: 10.1109/tcbb.2022.3191972] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Indexed: 05/04/2023]
Abstract
Numerous studies have reported that micro RNAs (miRNAs) play pivotal roles in disease pathogenesis based on the deregulation of the expressions of target messenger RNAs. Therefore, the identification of disease-related miRNAs is of great significance in understanding human complex diseases, which can also provide insight into the design of novel prognostic markers and disease therapies. Considering the time and cost involved in wet experiments, most recent works have focused on the effective and feasible modeling of computational frameworks to uncover miRNA-disease associations. In this study, we propose a novel framework called node2vec-based neural collaborative filtering for predicting miRNA-disease association (NCMD) based on deep neural networks. Initially, NCMD exploits Node2vec to learn low-dimensional vector representations of miRNAs and diseases. Next, it utilizes a deep learning framework that combines the linear ability of generalized matrix factorization and nonlinear ability of a multilayer perceptron. Experimental results clearly demonstrate the comparable performance of NCMD relative to the state-of-the-art methods according to statistical measures. In addition, case studies on breast cancer, lung cancer and pancreatic cancer validate the effectiveness of NCMD. Extensive experiments demonstrate the benefits of modeling a neural collaborative-filtering-based approach for discovering novel miRNA-disease associations.
|
33
|
Zhang H, Fang J, Sun Y, Xie G, Lin Z, Gu G. Predicting miRNA-Disease Associations via Node-Level Attention Graph Auto-Encoder. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:1308-1318. [PMID: 35503834 DOI: 10.1109/tcbb.2022.3170843] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Indexed: 05/04/2023]
Abstract
Previous studies have confirmed that microRNAs (miRNAs), small single-stranded non-coding RNAs, participate in various biological processes and play vital roles in many complex human diseases. Therefore, developing an efficient method to infer potential miRNA-disease associations could greatly help in understanding the mechanisms of diseases at the molecular level. However, at these early stages of miRNA-disease prediction, traditional biological experiments are laborious and expensive. Therefore, this study proposes a novel method called AGAEMD (node-level Attention Graph Auto-Encoder to predict potential MiRNA Disease associations). We first create a heterogeneous matrix incorporating miRNA similarity, disease similarity, and known miRNA-disease associations. This matrix is then input into a node-level attention encoder-decoder network, which uses low-dimensional dense embeddings to represent nodes and calculate association scores. To verify the effectiveness of the proposed method, we conduct a series of experiments on two benchmark datasets (the Human MicroRNA Disease Database v2.0 and v3.2) and report the averages over 10 runs in comparison with several state-of-the-art methods. Experimental results demonstrate the excellent performance of AGAEMD in comparison with other methods. Three important diseases (Colon Neoplasms, Lung Neoplasms, Lupus Vulgaris) were used in case studies. The results confirm the reliable predictive performance of AGAEMD.
|
34
|
Wang T, Sun J, Zhao Q. Investigating cardiotoxicity related with hERG channel blockers using molecular fingerprints and graph attention mechanism. Comput Biol Med 2023; 153:106464. [PMID: 36584603 DOI: 10.1016/j.compbiomed.2022.106464] [Citation(s) in RCA: 129] [Impact Index Per Article: 64.5] [Received: 11/23/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Human ether-a-go-go-related gene (hERG) channel blockade by small molecules is a big concern during drug development in the pharmaceutical industry. Failure or inhibition of hERG channel activity caused by drug molecules can lead to prolonging QT interval, which will result in serious cardiotoxicity. Thus, evaluating the hERG blocking activity of all these small molecular compounds is technically challenging, and the relevant procedures are expensive and time-consuming. In this study, we develop a novel deep learning predictive model named DMFGAM for predicting hERG blockers. In order to characterize the molecule more comprehensively, we first consider the fusion of multiple molecular fingerprint features to characterize its final molecular fingerprint features. Then, we use the multi-head attention mechanism to extract the molecular graph features. Both molecular fingerprint features and molecular graph features are fused as the final features of the compounds to make the feature expression of compounds more comprehensive. Finally, the molecules are classified into hERG blockers or hERG non-blockers through the fully connected neural network. We conduct 5-fold cross-validation experiment to evaluate the performance of DMFGAM, and verify the robustness of DMFGAM on external validation datasets. We believe DMFGAM can serve as a powerful tool to predict hERG channel blockers in the early stages of drug discovery and development.
Affiliation(s)
- Tianyi Wang
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianqiang Sun
- School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao
- School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
|
35
|
Feng H, Jin D, Li J, Li Y, Zou Q, Liu T. Matrix reconstruction with reliable neighbors for predicting potential MiRNA-disease associations. Brief Bioinform 2023; 24:6960615. [PMID: 36567252 DOI: 10.1093/bib/bbac571] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 08/12/2022] [Revised: 10/16/2022] [Accepted: 11/23/2022] [Indexed: 12/27/2022] Open
Abstract
Numerous experimental studies have indicated that alteration and dysregulation in microRNAs (miRNAs) are associated with serious diseases. Identifying disease-related miRNAs is therefore an essential and challenging task in bioinformatics research. Computational methods are an efficient and economical alternative to conventional biomedical studies and can reveal underlying miRNA-disease associations for subsequent experimental confirmation with reasonable confidence. Despite the success of existing computational approaches, most of them only rely on the known miRNA-disease associations to predict associations without adding other data to increase the prediction accuracy, and they are affected by issues of data sparsity. In this paper, we present MRRN, a model that combines matrix reconstruction with node reliability to predict probable miRNA-disease associations. In MRRN, the most reliable neighbors of miRNA and disease are used to update the original miRNA-disease association matrix, which significantly reduces data sparsity. Unknown miRNA-disease associations are reconstructed by aggregating the most reliable first-order neighbors to increase prediction accuracy by representing the local and global structure of the heterogeneous network. Five-fold cross-validation of MRRN produced an area under the curve (AUC) of 0.9355 and area under the precision-recall curve (AUPR) of 0.2646, values that were greater than those produced by comparable models. Two different types of case studies using three diseases were conducted to demonstrate the accuracy of MRRN, and all top 30 predicted miRNAs were verified.
Affiliation(s)
- Hailin Feng
  - School of mathematics and computer science, Zhejiang A&F University, No.666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Dongdong Jin
  - School of mathematics and computer science, Zhejiang A&F University, No.666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Jian Li
  - School of mathematics and computer science, Zhejiang A&F University, No.666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Yane Li
  - School of mathematics and computer science, Zhejiang A&F University, No.666 Wusu Street, Lin'an District, 311300, Hangzhou, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, No. 2006, Xiyuan Avenue, West District, high tech Zone, 611731, Chengdu, China
- Tongcun Liu
  - School of mathematics and computer science, Zhejiang A&F University, No.666 Wusu Street, Lin'an District, 311300, Hangzhou, China
36
Zhao J, Sun J, Shuai SC, Zhao Q, Shuai J. Predicting potential interactions between lncRNAs and proteins via combined graph auto-encoder methods. Brief Bioinform 2023; 24:6896030. [PMID: 36515153 DOI: 10.1093/bib/bbac527] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 09/02/2022] [Revised: 10/23/2022] [Accepted: 11/06/2022] [Indexed: 12/15/2022] Open
Abstract
Long noncoding RNA (lncRNA) is a kind of noncoding RNA with a length of more than 200 nucleotide units. Numerous research studies have proven that although lncRNAs cannot be directly translated into proteins, lncRNAs still play an important role in human growth processes by interacting with proteins. Since traditional biological experiments often require a lot of time and material costs to explore potential lncRNA-protein interactions (LPI), several computational models have been proposed for this task. In this study, we introduce a novel deep learning method known as combined graph auto-encoders (LPICGAE) to predict potential human LPIs. First, we apply a variational graph auto-encoder to learn the low dimensional representations from the high-dimensional features of lncRNAs and proteins. Then the graph auto-encoder is used to reconstruct the adjacency matrix for inferring potential interactions between lncRNAs and proteins. Finally, we minimize the loss of the two processes alternately to gain the final predicted interaction matrix. The result in 5-fold cross-validation experiments illustrates that our method achieves an average area under receiver operating characteristic curve of 0.974 and an average accuracy of 0.985, which is better than those of existing six state-of-the-art computational methods. We believe that LPICGAE can help researchers to gain more potential relationships between lncRNAs and proteins effectively.
Affiliation(s)
- Jingxuan Zhao
  - University of Science and Technology Liaoning, 66459, Anshan, China
- Stella C Shuai
  - Northwestern University, 3270, Evanston, Illinois, United States
- Qi Zhao
  - University of Science and Technology Liaoning, 66459, Anshan, China
- Jianwei Shuai
  - Department of Physics, Xiamen University, Xiamen, China
37
Wang W, Chen H. Predicting miRNA-disease associations based on lncRNA-miRNA interactions and graph convolution networks. Brief Bioinform 2023; 24:6918743. [PMID: 36526276 DOI: 10.1093/bib/bbac495] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 08/08/2022] [Revised: 10/17/2022] [Accepted: 10/18/2022] [Indexed: 12/23/2022] Open
Abstract
Increasing studies have proved that microRNAs (miRNAs) are critical biomarkers in the development of human complex diseases. Identifying disease-related miRNAs is beneficial to disease prevention, diagnosis and remedy. Based on the assumption that similar miRNAs tend to associate with similar diseases, various computational methods have been developed to predict novel miRNA-disease associations (MDAs). However, selecting proper features for similarity calculation is a challenging task because of data deficiencies in biomedical science. In this study, we propose a deep learning-based computational method named MAGCN to predict potential MDAs without using any similarity measurements. Our method predicts novel MDAs based on known lncRNA-miRNA interactions via graph convolution networks with multichannel attention mechanism and convolutional neural network combiner. Extensive experiments show that the average area under the receiver operating characteristic values obtained by our method under 2-fold, 5-fold and 10-fold cross-validations are 0.8994, 0.9032 and 0.9044, respectively. When compared with five state-of-the-art methods, MAGCN shows improvement in terms of prediction accuracy. In addition, we conduct case studies on three diseases to discover their related miRNAs, and find that all the top 50 predictions for all the three diseases have been supported by established databases. The comprehensive results demonstrate that our method is a reliable tool in detecting new disease-related miRNAs.
38
Liang Q, Zhang W, Wu H, Liu B. LncRNA-disease association identification using graph auto-encoder and learning to rank. Brief Bioinform 2023; 24:6955271. [PMID: 36545805 DOI: 10.1093/bib/bbac539] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 08/20/2022] [Revised: 10/18/2022] [Accepted: 11/08/2022] [Indexed: 12/24/2022] Open
Abstract
Discovering the relationships between long non-coding RNAs (lncRNAs) and diseases is significant in the treatment, diagnosis and prevention of diseases. However, current identified lncRNA-disease associations are not enough because of the expensive and heavy workload of wet laboratory experiments. Therefore, it is greatly important to develop an efficient computational method for predicting potential lncRNA-disease associations. Previous methods showed that combining the prediction results of the lncRNA-disease associations predicted by different classification methods via Learning to Rank (LTR) algorithm can be effective for predicting potential lncRNA-disease associations. However, when the classification results are incorrect, the ranking results will inevitably be affected. We propose the GraLTR-LDA predictor based on biological knowledge graphs and ranking framework for predicting potential lncRNA-disease associations. Firstly, homogeneous graph and heterogeneous graph are constructed by integrating multi-source biological information. Then, GraLTR-LDA integrates graph auto-encoder and attention mechanism to extract embedded features from the constructed graphs. Finally, GraLTR-LDA incorporates the embedded features into the LTR via feature crossing statistical strategies to predict priority order of diseases associated with query lncRNAs. Experimental results demonstrate that GraLTR-LDA outperforms the other state-of-the-art predictors and can effectively detect potential lncRNA-disease associations. Availability and implementation: Datasets and source codes are available at http://bliulab.net/GraLTR-LDA.
Affiliation(s)
- Qi Liang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Wenxiang Zhang
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Hao Wu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
  - School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China; Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing, China
39
Lin L, Chen R, Zhu Y, Xie W, Jing H, Chen L, Zou M. SCCPMD: Probability matrix decomposition method subject to corrected similarity constraints for inferring long non-coding RNA-disease associations. Front Microbiol 2023; 13:1093615. [PMID: 36713213 PMCID: PMC9874942 DOI: 10.3389/fmicb.2022.1093615] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/09/2022] [Accepted: 11/30/2022] [Indexed: 01/13/2023] Open
Abstract
Accumulating evidence has demonstrated various associations of long non-coding RNAs (lncRNAs) with human diseases, such as abnormal expression due to microbial influences that cause disease. Gaining a deeper understanding of lncRNA-disease associations is essential for disease diagnosis, treatment, and prevention. In recent years, many matrix decomposition methods have also been used to predict potential lncRNA-disease associations. However, these methods do not consider the use of microbe-disease association information to enrich disease similarity, and also do not make more use of similarity information in the decomposition process. To address these issues, we here propose a correction-based similarity-constrained probability matrix decomposition method (SCCPMD) to predict lncRNA-disease associations. The microbe-disease associations are first used to enrich the disease semantic similarity matrix, and then the logistic function is used to correct the lncRNA and disease similarity matrix, and then these two corrected similarity matrices are added to the probability matrix decomposition as constraints to finally predict the potential lncRNA-disease associations. The experimental results show that SCCPMD outperforms the five advanced comparison algorithms. In addition, SCCPMD demonstrated excellent prediction performance in a case study for breast cancer, lung cancer, and renal cell carcinoma, with prediction accuracy reaching 80, 100, and 100%, respectively. Therefore, SCCPMD shows excellent predictive performance in identifying unknown lncRNA-disease associations.
Affiliation(s)
- Lieqing Lin
  - Center of Campus Network & Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Ruibin Chen
  - School of Computer, Guangdong University of Technology, Guangzhou, China
- Yinting Zhu
  - School of Computer, Guangdong University of Technology, Guangzhou, China
- Weijie Xie
  - School of Computer, Guangdong University of Technology, Guangzhou, China
- Huaiguo Jing
  - Sports Department, Guangdong University of Technology, Guangzhou, China
- Langcheng Chen
  - Center of Campus Network & Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Minqing Zou
  - Department of Experiment Teaching, Guangdong University of Technology, Guangzhou, China
40
Kim N, Choung H, Kim YJ, Woo SE, Yang MK, Khwarg SI, Lee MJ. Serum microRNA as a potential biomarker for the activity of thyroid eye disease. Sci Rep 2023; 13:234. [PMID: 36604580 PMCID: PMC9816116 DOI: 10.1038/s41598-023-27483-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 01/03/2023] [Indexed: 01/06/2023] Open
Abstract
The aim of this study is to characterize the microRNA (miRNA) expression signatures in patients with thyroid eye disease (TED) and identify miRNA biomarkers of disease activity. Total RNA was isolated from the sera of patients with TED (n = 10) and healthy controls (HCs, n = 5) using the miRNeasy Serum/Plasma Kit. The NanoString assay was used for the comprehensive analysis of 798 miRNA expression profiles. Analysis of specific miRNA signatures, mRNA target pathway analysis, and network analysis were performed. Patients with TED were divided into two groups according to disease activity: active and inactive TED groups. Differentially expressed circulating miRNAs were identified and tested using quantitative reverse transcription-polymerase chain reaction (qRT-PCR) tests in the validation cohort. Among the 798 miRNAs analyzed, 173 differentially downregulated miRNAs were identified in TED patients compared to those in the HCs. Ten circulating miRNAs were differentially expressed between the active and inactive TED groups and regarded as candidate biomarkers for TED activity (one upregulated miRNA: miR-29c-3p; nine downregulated miRNAs: miR-4286, miR-941, miR-571, miR-129-2-3p, miR-484, miR-192-5p, miR-502-3p, miR-597-5p, and miR-296-3p). In the validation cohort, miR-484 and miR-192-5p showed significantly lower expression in the active TED group than in the inactive TED group. In conclusion, the expression levels of miR-484 and miR-192-5p differed significantly between the active and inactive TED groups, suggesting that these miRNAs could serve as circulating biomarkers of TED activity, however, these findings need to be validated in further studies.
Affiliation(s)
- Namju Kim
  - Department of Ophthalmology, Seoul National University Bundang Hospital, Seongnam, Korea
- Hokyung Choung
  - Department of Ophthalmology, Seoul Metropolitan Government-Seoul National University, Boramae Medical Center, Seoul, Korea; Department of Ophthalmology, Seoul National University College of Medicine, Seoul, Korea
- Yu Jeong Kim
  - Department of Ophthalmology, Seoul National University Hospital, Seoul, Korea
- Sang Earn Woo
  - Department of Ophthalmology, Seoul Metropolitan Government-Seoul National University, Boramae Medical Center, Seoul, Korea
- Min Kyu Yang
  - Department of Ophthalmology, Asan Medical Center, Seoul, Korea
- Sang In Khwarg
  - Department of Ophthalmology, Seoul National University College of Medicine, Seoul, Korea; Department of Ophthalmology, Seoul National University Hospital, Seoul, Korea
- Min Joung Lee
  - Department of Ophthalmology, Hallym University College of Medicine, Hallym University Sacred Heart Hospital, 22, Gwanpyeong-Ro 170 Beon-Gil, Dongan-Gu, Anyang-Si, Gyeonggi-Do, 14068, Republic of Korea.
41
Ha J. SMAP: Similarity-based matrix factorization framework for inferring miRNA-disease association. Knowl Based Syst 2023. [DOI: 10.1016/j.knosys.2023.110295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
42
Liao Q, Ye Y, Li Z, Chen H, Zhuo L. Prediction of miRNA-disease associations in microbes based on graph convolutional networks and autoencoders. Front Microbiol 2023; 14:1170559. [PMID: 37187536 PMCID: PMC10175670 DOI: 10.3389/fmicb.2023.1170559] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/21/2023] [Accepted: 03/21/2023] [Indexed: 05/17/2023] Open
Abstract
MicroRNAs (miRNAs) are short RNA molecular fragments that regulate gene expression by targeting and inhibiting the expression of specific RNAs. Due to the fact that microRNAs affect many diseases in microbial ecology, it is necessary to predict microRNAs' association with diseases at the microbial level. To this end, we propose a novel model, termed as GCNA-MDA, where dual-autoencoder and graph convolutional network (GCN) are integrated to predict miRNA-disease association. The proposed method leverages autoencoders to extract robust representations of miRNAs and diseases and meantime exploits GCN to capture the topological information of miRNA-disease networks. To alleviate the impact of insufficient information for the original data, the association similarity and feature similarity data are combined to calculate a more complete initial basic vector of nodes. The experimental results on the benchmark datasets demonstrate that compared with the existing representative methods, the proposed method has achieved the superior performance and its precision reaches up to 0.8982. These results demonstrate that the proposed method can serve as a tool for exploring miRNA-disease associations in microbial environments.
Affiliation(s)
- Qingquan Liao
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
- Yuxiang Ye
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
- Zihang Li
  - School of Computing and Data Science, Xiamen University Malaysia, Sepang, Selangor, Malaysia
- Hao Chen
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
  - *Correspondence: Hao Chen
- Linlin Zhuo
  - School of Data Science and Artificial Intelligence, Wenzhou University of Technology, Wenzhou, China
43
Li P, Tiwari P, Xu J, Qian Y, Ai C, Ding Y, Guo F. Sparse regularized joint projection model for identifying associations of non-coding RNAs and human diseases. Knowl Based Syst 2022. [DOI: 10.1016/j.knosys.2022.110044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
44
Peng L, Yang J, Wang M, Zhou L. Editorial: Machine learning-based methods for RNA data analysis—Volume II. Front Genet 2022; 13:1010089. [DOI: 10.3389/fgene.2022.1010089] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 09/20/2022] [Indexed: 12/02/2022] Open
45
Peng L, Tu Y, Huang L, Li Y, Fu X, Chen X. DAESTB: inferring associations of small molecule-miRNA via a scalable tree boosting model based on deep autoencoder. Brief Bioinform 2022; 23:6827720. [PMID: 36377749 DOI: 10.1093/bib/bbac478] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 07/16/2022] [Revised: 09/28/2022] [Accepted: 10/08/2022] [Indexed: 11/16/2022] Open
Abstract
MicroRNAs (miRNAs) are closely related to a variety of human diseases, not only regulating gene expression, but also having an important role in human life activities and being viable targets of small molecule drugs for disease treatment. Current computational techniques to predict the potential associations between small molecule and miRNA are not that accurate. Here, we proposed a new computational method based on a deep autoencoder and a scalable tree boosting model (DAESTB), to predict associations between small molecule and miRNA. First, we constructed a high-dimensional feature matrix by integrating small molecule-small molecule similarity, miRNA-miRNA similarity and known small molecule-miRNA associations. Second, we reduced feature dimensionality on the integrated matrix using a deep autoencoder to obtain the potential feature representation of each small molecule-miRNA pair. Finally, a scalable tree boosting model is used to predict small molecule and miRNA potential associations. The experiments on two datasets demonstrated the superiority of DAESTB over various state-of-the-art methods. DAESTB achieved the best AUC value. Furthermore, in three case studies, a large number of predicted associations by DAESTB are confirmed with the public accessed literature. We envision that DAESTB could serve as a useful biological model for predicting potential small molecule-miRNA associations.
Affiliation(s)
- Li Peng
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China; Hunan Key Laboratory for Service computing and Novel Software Technology
- Yuan Tu
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Yang Li
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan, 411105, China
- Xiangzheng Fu
  - College of Information Science and Engineering, Hunan University, Changsha, 410082, Hunan, China
- Xiang Chen
  - College of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, 411201, Hunan, China
46
Wang W, Zhang L, Sun J, Zhao Q, Shuai J. Predicting the potential human lncRNA-miRNA interactions based on graph convolution network with conditional random field. Brief Bioinform 2022; 23:6775599. [PMID: 36305458 DOI: 10.1093/bib/bbac463] [Citation(s) in RCA: 149] [Impact Index Per Article: 49.7] [Received: 07/10/2022] [Revised: 09/10/2022] [Accepted: 09/27/2022] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNA (lncRNA) and microRNA (miRNA) are two typical types of non-coding RNAs (ncRNAs), their interaction plays an important regulatory role in many biological processes. Exploring the interactions between unknown lncRNA and miRNA can help us better understand the functional expression between lncRNA and miRNA. At present, the interactions between lncRNA and miRNA are mainly obtained through biological experiments, but such experiments are often time-consuming and labor-intensive, it is necessary to design a computational method that can predict the interactions between lncRNA and miRNA. In this paper, we propose a method based on graph convolutional neural (GCN) network and conditional random field (CRF) for predicting human lncRNA-miRNA interactions, named GCNCRF. First, we construct a heterogeneous network using the known interactions of lncRNA and miRNA in the LncRNASNP2 database, the lncRNA/miRNA integration similarity network, and the lncRNA/miRNA feature matrix. Second, the initial embedding of nodes is obtained using a GCN network. A CRF set in the GCN hidden layer can update the obtained preliminary embeddings so that similar nodes have similar embeddings. At the same time, an attention mechanism is added to the CRF layer to reassign weights to nodes to better grasp the feature information of important nodes and ignore some nodes with less influence. Finally, the final embedding is decoded and scored through the decoding layer. Through a 5-fold cross-validation experiment, GCNCRF has an area under the receiver operating characteristic curve value of 0.947 on the main dataset, which has higher prediction accuracy than the other six state-of-the-art methods.
Affiliation(s)
- Wenya Wang
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Jianqiang Sun
  - School of Automation and Electrical Engineering, Linyi University, Linyi, 276000, China
- Qi Zhao
  - School of Computer Science and Software Engineering, University of Science and Technology Liaoning, Anshan, 114051, China
- Jianwei Shuai
  - Oujiang Laboratory (Zhejiang Lab for Regenerative Medicine, Vision and Brain Health), and Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang, 325001, China; Department of Physics, and Fujian Provincial Key Laboratory for Soft Functional Materials Research, Xiamen University, Xiamen, 361005, China; National Institute for Data Science in Health and Medicine, and State Key Laboratory of Cellular Stress Biology, Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, 361005, China
47
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open
Abstract
Currently, there exist no generally accepted strategies of evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross validations and case studies seem to be must-have procedures, the value of K, the evaluation metrics, and the choice of query diseases as well as the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined on a case-by-case basis and depending on the researchers' choices. In the current review, we include a comprehensive analysis on how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.
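As a rough illustration of the two evaluation ingredients this abstract names (K-fold cross-validation and a ranking metric such as AUC), the following minimal Python sketch may help. It is not taken from the review; the function names `roc_auc` and `k_fold_indices` are hypothetical, and the AUC is computed by direct pairwise comparison, which is only practical for small samples.

```python
import random
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 and deal them into k disjoint test folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]
```

In a typical workflow each fold would serve once as the held-out test set, the model would be retrained on the remaining folds, and `roc_auc` would be averaged over the K runs.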
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China; The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China; Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
48
Chen L, Lin D, Xu H, Li J, Lin L. WLLP: A weighted reconstruction-based linear label propagation algorithm for predicting potential therapeutic agents for COVID-19. Front Microbiol 2022; 13:1040252. [PMID: 36466666 PMCID: PMC9713947 DOI: 10.3389/fmicb.2022.1040252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Accepted: 10/06/2022] [Indexed: 11/18/2022] Open
Abstract
The global coronavirus disease 2019 (COVID-19) pandemic caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) has led to huge health and economic crises. However, the research required to develop new drugs and vaccines is very expensive in terms of labor, money, and time. Owing to recent advances in data science, drug-repositioning technologies have become one of the most promising strategies available for developing effective treatment options. Using the previously reported human drug virus database (HDVD), we proposed a model to predict possible drug regimens based on a weighted reconstruction-based linear label propagation algorithm (WLLP). For the drug–virus association matrix, we used the weighted K-nearest known neighbors method for preprocessing and label propagation of the network based on the linear neighborhood similarity of drugs and viruses to obtain the final prediction results. In the framework of 10 times 10-fold cross-validated area under the receiver operating characteristic (ROC) curve (AUC), WLLP exhibited excellent performance with an AUC of 0.8828 ± 0.0037 and an area under the precision-recall curve of 0.5277 ± 0.0053, outperforming the other four models used for comparison. We also predicted effective drug regimens against SARS-CoV-2, and this case study showed that WLLP can be used to suggest potential drugs for the treatment of COVID-19.
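The label propagation step mentioned in this abstract can be pictured with a generic sketch. The code below is not the WLLP algorithm itself: it assumes a plain row-normalised similarity matrix rather than the paper's linear neighborhood similarity, omits the weighted K-nearest known neighbors preprocessing, and uses a hypothetical function name `propagate_labels` with an assumed damping parameter `alpha`.

```python
from typing import List

Matrix = List[List[float]]


def propagate_labels(sim: Matrix, Y: Matrix, alpha: float = 0.5,
                     n_iter: int = 100) -> Matrix:
    """Iterate F <- alpha * W @ F + (1 - alpha) * Y.

    sim: n x n similarity between entities (for example drugs).
    Y:   n x m known association matrix (for example drug x virus).
    W is sim with each row scaled to sum to 1 (all-zero rows stay zero).
    """
    n, m = len(sim), len(Y[0])
    W = []
    for row in sim:
        total = sum(row)
        W.append([v / total if total else 0.0 for v in row])
    F = [row[:] for row in Y]
    for _ in range(n_iter):
        F = [
            [
                alpha * sum(W[i][k] * F[k][j] for k in range(n))
                + (1 - alpha) * Y[i][j]
                for j in range(m)
            ]
            for i in range(n)
        ]
    return F
```

With `alpha` below 1 the iteration is a contraction, so the scores settle to a fixed point in which entities similar to a known positive receive part of its label while isolated entities keep a score of zero.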
Affiliation(s)
- Langcheng Chen
  - Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
- Dongying Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Haojie Xu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Lieqing Lin
  - Center of Campus Network and Modern Educational Technology, Guangdong University of Technology, Guangzhou, China
  - *Correspondence: Lieqing Lin
49
Zhang X, Zhang D, Bu X, Zhang X, Cui L. Identification of a novel miRNA-based recurrence and prognosis prediction biomarker for hepatocellular carcinoma. BMC Bioinformatics 2022; 23:479. [PMID: 36376850 PMCID: PMC9664787 DOI: 10.1186/s12859-022-05040-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/18/2022] [Accepted: 11/07/2022] [Indexed: 11/16/2022] Open
Abstract
Background A high recurrence rate has always been a serious problem for treatment of hepatocellular carcinoma (HCC). Exploring predictors of postoperative and posttransplantation recurrence in patients with HCC can guide treatment strategies for clinicians. Results In this study, logistic regression and multivariate Cox regression models were constructed with microRNA expression profile data from The Cancer Genome Atlas (TCGA) and gene expression omnibus (GEO). The accuracy of predictions was assessed using receiver operating characteristic curve (ROC) and Kaplan‒Meier survival curve analyses. The results showed that the combination of 10 miRNAs (including hsa-miR-509-3p, hsa-miR-769-3p, hsa-miR-671-3p, hsa-miR-296-5p, hsa-miR-767-5p, hsa-miR-421, hsa-miR-193a-3p, hsa-miR-139-3p, hsa-miR-342-3p, and hsa-miR-193a-5p) accurately predicted postoperative and posttransplantation malignancy recurrence in HCC patients and was also valuable for prognostic evaluation of HCC patients. The 10-miRNA prediction model might assist doctors in making prognoses for HCC patients who have a high probability of relapse following surgery and in offering additional, individualized treatment to lessen that risk.
50
Li W, Wang S, Xu J, Xiang J. Inferring Latent MicroRNA-Disease Associations on a Gene-Mediated Tripartite Heterogeneous Multiplexing Network. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3190-3201. [PMID: 35041612 DOI: 10.1109/tcbb.2022.3143770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
MicroRNA (miRNA) is a class of non-coding single-stranded RNA molecules encoded by endogenous genes with a length of about 22 nucleotides. MiRNAs have been successfully identified as differentially expressed in various cancers. There is evidence that disorders of miRNAs are associated with a variety of complex diseases. Therefore, inferring potential miRNA-disease associations (MDAs) is very important for understanding the aetiology and pathogenesis of many diseases and is useful to disease diagnosis, prognosis and treatment. First, we creatively fused multiple similarity subnetworks from multi-sources for miRNAs, genes and diseases by multiplexing technology, respectively. Then, three multiplexed biological subnetworks are connected through the extended binary association to form a tripartite complete heterogeneous multiplexed network (Tri-HM). Finally, because the constructed Tri-HM network can retain subnetworks' original topology and biological functions and expands the binary association and dependence between the three biological entities, rich neighbourhood information is obtained iteratively from neighbours by a non-equilibrium random walk. Through cross-validation, our tri-HM-RWR model obtained an AUC value of 0.8657, and an AUPR value of 0.2139 in the global 5-fold cross-validation, which shows that our model can more fully speculate disease-related miRNAs.
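For readers unfamiliar with random walk with restart (RWR), the family of methods the model name tri-HM-RWR points to, a minimal sketch on a single small graph follows. It is a textbook RWR, not the paper's non-equilibrium walk on the Tri-HM network; the function name `random_walk_with_restart` and the default `restart` probability are assumptions made for illustration.

```python
from typing import List


def random_walk_with_restart(adj: List[List[float]], seed: int,
                             restart: float = 0.5, tol: float = 1e-10,
                             max_iter: int = 1000) -> List[float]:
    """Iterate p <- (1 - restart) * T @ p + restart * p0 until convergence.

    adj:  n x n non-negative adjacency (or similarity) matrix.
    seed: index of the node the walker restarts from.
    T is adj with each column scaled to sum to 1 (all-zero columns stay zero).
    """
    n = len(adj)
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        # Stop once the L1 change between successive iterates is negligible.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p
```

The stationary vector ranks nodes by proximity to the seed; in association prediction the seed would usually be a query disease (or its known miRNAs) and the highest-scoring unlinked nodes become candidate associations.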