1
Han S, Liu L. GP-HTNLoc: A graph prototype head-tail network-based model for multi-label subcellular localization prediction of ncRNAs. Comput Struct Biotechnol J 2024; 23:2034-2048. [PMID: 38765609 PMCID: PMC11101938 DOI: 10.1016/j.csbj.2024.04.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/17/2024] [Accepted: 04/18/2024] [Indexed: 05/22/2024] Open
Abstract
Numerous studies have demonstrated that understanding the subcellular localization of non-coding RNAs (ncRNAs) is pivotal in elucidating their roles and regulatory mechanisms in cells. Despite the existence of over ten computational models dedicated to predicting the subcellular localization of ncRNAs, a majority of these models are designed solely for single-label prediction. In reality, ncRNAs often exhibit localization across multiple subcellular compartments. Furthermore, the existing multi-label localization prediction models are insufficient in addressing the challenges posed by the scarcity of training samples and class imbalance in ncRNA datasets. To address these limitations, this study proposes a novel multi-label localization prediction model for ncRNAs, named GP-HTNLoc. To mitigate class imbalance, GP-HTNLoc adopts separate training approaches for head and tail location labels. Additionally, GP-HTNLoc introduces a pioneering graph prototype module to enhance its performance in small-sample, multi-label scenarios. The experimental results based on 10-fold cross-validation on benchmark datasets demonstrate that GP-HTNLoc achieves competitive predictive performance. The average results from 10 rounds of testing on an independent dataset show that GP-HTNLoc outperforms the best existing models on the human lncRNA, human snoRNA, and human miRNA subsets, with average precision improvements of 31.5%, 14.2%, and 5.6%, respectively, reaching 0.685, 0.632, and 0.704. A user-friendly online GP-HTNLoc server is accessible at https://56s8y85390.goho.co.
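The abstract does not say how head and tail labels are separated, so the following is only a minimal sketch under an assumed rule: a label counts as "head" when it annotates at least `threshold` training samples, and as "tail" otherwise. The function name, the threshold, and the toy label sets are all hypothetical.

```python
from collections import Counter


def split_head_tail(label_sets, threshold):
    """Partition labels into head (frequent) and tail (rare) by sample count."""
    counts = Counter(label for labels in label_sets for label in labels)
    head = sorted(label for label, c in counts.items() if c >= threshold)
    tail = sorted(label for label, c in counts.items() if c < threshold)
    return head, tail


# Toy multi-label annotations: one set of compartments per ncRNA (illustrative only).
samples = [
    {"nucleus", "cytoplasm"},
    {"nucleus"},
    {"cytoplasm", "exosome"},
    {"nucleus", "cytoplasm"},
    {"ribosome"},
]
head, tail = split_head_tail(samples, threshold=3)
print(head)  # ['cytoplasm', 'nucleus']
print(tail)  # ['exosome', 'ribosome']
```

With such a partition, a model could in principle be trained on the head labels first and handle the tail labels separately, which is the general idea the abstract describes.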
Affiliation(s)
- Shuangkai Han
- School of Information, Yunnan Normal University, Kunming, China
- Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, China
- Lin Liu
- School of Information, Yunnan Normal University, Kunming, China
- Engineering Research Center of Computer Vision and Intelligent Control Technology, Department of Education of Yunnan Province, China
2
Wei Y, Zhang Q, Liu L. The improved de Bruijn graph for multitask learning: predicting functions, subcellular localization, and interactions of noncoding RNAs. Brief Bioinform 2024; 26:bbae627. [PMID: 39592154 PMCID: PMC11596098 DOI: 10.1093/bib/bbae627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 11/13/2024] [Accepted: 11/15/2024] [Indexed: 11/28/2024] Open
Abstract
Noncoding RNA refers to RNA that does not encode proteins. The lncRNA and miRNA it contains play crucial regulatory roles in organisms, and their aberrant expression is closely related to various diseases. Traditional experimental methods for validating the interactions of these RNAs have limitations, and existing prediction models exhibit relatively limited functionality, relying on isolated feature extraction and performing poorly in handling various types of small sample tasks. This paper proposes an improved de Bruijn graph that can inject RNA structural information into the graph while preserving sequence information. Furthermore, the improved de Bruijn graph enables graph neural networks to learn broader dependencies and correlations among data by introducing richer edge relationships. Meanwhile, the multitask learning model, DVMnet, proposed in this paper can handle multiple related tasks, and we optimize model parameters by integrating the total loss of three tasks. This enables multitask prediction of RNA interactions, disease associations, and subcellular localization. Compared with the best existing models in this field, DVMnet has achieved the best performance with a 3% improvement in the area under the curve value and demonstrates robust results in predicting diseases and subcellular localization. The improved de Bruijn graph is also applicable to various scenarios and can unify the sequence and structural information of various nucleic acids into a single graph.
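For readers unfamiliar with the underlying structure, here is a minimal sketch of a plain de Bruijn graph built from an RNA sequence. It is not the improved graph of the paper (the abstract gives no construction details, and no structural information is injected here); the function name and the toy sequence are assumptions.

```python
from collections import defaultdict


def de_bruijn_edges(sequence, k):
    """Return {prefix: {suffix: count}} with one edge per k-mer occurrence.

    Nodes are (k-1)-mers; each k-mer links its prefix to its suffix.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        graph[kmer[:-1]][kmer[1:]] += 1
    return {u: dict(vs) for u, vs in graph.items()}


# Toy RNA sequence (illustrative only).
graph = de_bruijn_edges("AUGGAUGG", k=3)
for node, successors in sorted(graph.items()):
    print(node, successors)
```

Edge counts record how often a k-mer repeats, so the sequence information is preserved in the graph; any structural annotation would have to be added as extra node or edge attributes.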
Affiliation(s)
- Yuxiao Wei
- College of Software, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Qi Zhang
- College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
- Liwei Liu
- College of Science, Dalian Jiaotong University, 794 Huanghe Road, Dalian 116028, China
3
Liang Y, You X, Zhang Z, Qiu S, Li S, Fu L. MGFmiRNAloc: Predicting miRNA Subcellular Localization Using Molecular Graph Feature and Convolutional Block Attention Module. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024; 21:1348-1357. [PMID: 38557611 DOI: 10.1109/tcbb.2024.3383438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
MiRNA has distinct physiological functions at various cellular locations. However, few effective computational methods for predicting the subcellular location of miRNA exist, leaving considerable room for improvement. Accordingly, our study proposes MGFmiRNAloc, a new approach that uses the simplified molecular input line entry system (SMILES) format to predict the subcellular localization of miRNA. Additionally, a graph convolutional network (GCN) was employed to extract the atomic nodes and topological structure of a single base, thereby constructing molecular graph features of the RNA sequence. Subsequently, the convolutional block attention module (CBAM), which combines channel attention and spatial attention mechanisms, was used to mine deeper and more informative features. Finally, the prediction module was used to predict the subcellular localization of miRNA. The 10-fold cross-validation and independent test set experiments demonstrate that MGFmiRNAloc outperforms the most sophisticated methods. The results indicate that the new atomic-level feature representation proposed in this study could overcome the limitations of small samples and short miRNA sequences, accurately predict the subcellular localization of miRNAs, and be extended to the subcellular localization of other sequences.
4
Chen L, Gu J, Zhou B. PMiSLocMF: predicting miRNA subcellular localizations by incorporating multi-source features of miRNAs. Brief Bioinform 2024; 25:bbae386. [PMID: 39154195 PMCID: PMC11330342 DOI: 10.1093/bib/bbae386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2024] [Revised: 07/04/2024] [Accepted: 07/23/2024] [Indexed: 08/19/2024] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in several biological processes. Detecting their subcellular localizations is essential for deeper insight into their functions and mechanisms. The traditional methods for determining miRNA subcellular localizations are expensive, and computational methods offer an alternative way to predict them quickly. Although several computational methods have been proposed in this regard, their incomplete representations of miRNAs leave room for improvement. In this study, a novel computational method for predicting miRNA subcellular localizations, named PMiSLocMF, was developed. As many miRNAs have multiple subcellular localizations, the method was designed as a multi-label classifier. Several properties of miRNA, such as miRNA sequences, miRNA functional similarity, miRNA-disease, miRNA-drug, and miRNA-mRNA associations, were adopted for generating informative miRNA features. To this end, powerful algorithms [node2vec and graph attention auto-encoder (GATE)] and one newly designed scheme were adopted to process the above properties, producing five feature types. All features were fed into self-attention and fully connected layers to make predictions. The cross-validation results indicated the high performance of PMiSLocMF, with accuracy higher than 0.83 and average area under the receiver operating characteristic curve (AUC) and area under the precision-recall curve (AUPR) exceeding 0.90 and 0.77, respectively. Such performance was better than all previous methods based on the same dataset. Further tests showed that using all feature types improves the performance of PMiSLocMF, and that GATE and the self-attention layer help enhance the performance. Finally, we analyzed in depth the influence of miRNA associations with diseases, drugs, and mRNAs on PMiSLocMF. The dataset and codes are available at https://github.com/Gu20201017/PMiSLocMF.
Affiliation(s)
- Lei Chen
- College of Information Engineering, Shanghai Maritime University, 1550 Haigang Avenue, Pudong New District, Shanghai 201306, China
- Jiahui Gu
- College of Information Engineering, Shanghai Maritime University, 1550 Haigang Avenue, Pudong New District, Shanghai 201306, China
- Bo Zhou
- School of Basic Medical Sciences, Shanghai University of Medicine and Health Sciences, 279 Zhouzhu Road, Pudong New District, Shanghai 201318, China
5
Li J, Ma X, Lin H, Zhao S, Li B, Huang Y. MHIF-MSEA: a novel model of miRNA set enrichment analysis based on multi-source heterogeneous information fusion. Front Genet 2024; 15:1375148. [PMID: 38586586 PMCID: PMC10995286 DOI: 10.3389/fgene.2024.1375148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 03/11/2024] [Indexed: 04/09/2024] Open
Abstract
Introduction: MicroRNAs (miRNAs) are a class of non-coding RNA molecules that play a crucial role in the regulation of diverse biological processes across various organisms. Despite not encoding proteins, miRNAs have been found to have significant implications in the onset and progression of complex human diseases. Methods: Conventional methods for miRNA functional enrichment analysis have certain limitations, and we proposed a novel method called MiRNA Set Enrichment Analysis based on Multi-source Heterogeneous Information Fusion (MHIF-MSEA). Three miRNA similarity networks (miRSN-DA, miRSN-GOA, and miRSN-PPI) were constructed in MHIF-MSEA. These networks were built based on miRNA-disease association, gene ontology (GO) annotation of target genes, and protein-protein interaction of target genes, respectively. These miRNA similarity networks were fused into a single similarity network with the averaging method. This fused network served as the input for the random walk with restart algorithm, which expanded the original miRNA list. Finally, MHIF-MSEA performed enrichment analysis on the expanded list. Results and Discussion: To determine the optimal network fusion approach, three case studies were introduced: colon cancer, breast cancer, and hepatocellular carcinoma. The experimental results revealed that the miRNA-miRNA association network constructed using miRSN-DA and miRSN-GOA exhibited superior performance as the input network. Furthermore, the MHIF-MSEA model performed enrichment analysis on differentially expressed miRNAs in breast cancer and hepatocellular carcinoma. The achieved p-values were 2.17e(-75) and 1.50e(-77), and the hit rates improved by 39.01% and 44.68% compared to traditional enrichment analysis methods, respectively. 
These results confirm that the MHIF-MSEA method enhances the identification of enriched miRNA sets by leveraging multiple sources of heterogeneous information, leading to improved insights into the functional implications of miRNAs in complex diseases.
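The fusion and expansion steps described in this abstract (averaging several similarity networks, then running a random walk with restart from a seed list) can be sketched as follows. This is an illustrative pure-Python version, not the authors' implementation: the restart probability, tolerance, column normalization, and the toy matrices are assumptions.

```python
def fuse_by_average(networks):
    """Element-wise mean of equally sized square similarity matrices."""
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


def column_normalize(matrix):
    """Scale each column to sum to 1 so it acts as a transition distribution."""
    n = len(matrix)
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [
        [matrix[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]


def random_walk_with_restart(matrix, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p = (1 - r) * W p + r * p0 until the L1 change drops below tol."""
    n = len(matrix)
    w = column_normalize(matrix)
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [
            (1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy similarity matrices over three hypothetical miRNAs (indices 0, 1, 2).
net_a = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
net_b = [[0.0, 0.5, 0.5], [0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
fused = fuse_by_average([net_a, net_b])
scores = random_walk_with_restart(fused, seeds={0})
ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)
```

Expanding the original miRNA list then amounts to taking the top-ranked non-seed nodes; how many to keep is a separate choice that the abstract does not specify.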
Affiliation(s)
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Xuxu Ma
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Hongxin Lin
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Shisheng Zhao
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Bing Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yan Huang
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education, Beijing), Department of Anesthesiology, Peking University Cancer Hospital and Institute, Beijing, China
6
Zhang Y, Chu Y, Lin S, Xiong Y, Wei DQ. ReHoGCNES-MDA: prediction of miRNA-disease associations using homogenous graph convolutional networks based on regular graph with random edge sampler. Brief Bioinform 2024; 25:bbae103. [PMID: 38517693 PMCID: PMC10959163 DOI: 10.1093/bib/bbae103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 02/04/2024] [Accepted: 02/23/2024] [Indexed: 03/24/2024] Open
Abstract
Numerous investigations increasingly indicate the significance of microRNA (miRNA) in human diseases. Hence, unearthing associations between miRNA and diseases can contribute to precise diagnosis and efficacious remediation of medical conditions. The detection of miRNA-disease linkages via computational techniques utilizing biological information has emerged as a cost-effective and highly efficient approach. Here, we introduced a computational framework named ReHoGCNES, designed for prospective miRNA-disease association prediction (ReHoGCNES-MDA). This method constructs a homogenous graph convolutional network with a regular graph structure (ReHoGCN) encompassing a disease similarity network, a miRNA similarity network and the known MDA network, and was tested on four experimental tasks. A random edge sampler strategy was utilized to expedite processes and diminish training complexity. Experimental results demonstrate that the proposed ReHoGCNES-MDA method outperforms both the homogenous graph convolutional network and the heterogeneous graph convolutional network with non-regular graph structure in all four tasks, which implicitly reveals that a steady degree distribution of a graph plays an important role in enhancing model performance. Besides, ReHoGCNES-MDA is superior to several machine learning algorithms and state-of-the-art methods on MDA prediction. Furthermore, three case studies were conducted to further demonstrate the predictive ability of ReHoGCNES. Consequently, 93.3% (breast neoplasms), 90% (prostate neoplasms) and 93.3% (prostate neoplasms) of the top 30 forecasted miRNAs were validated by public databases. Hence, ReHoGCNES-MDA might serve as a dependable and beneficial model for predicting possible MDAs.
Affiliation(s)
- Yufang Zhang
- School of Mathematical Sciences and SJTU-Yale Joint Center for Biostatistics and Data Science, Shanghai Jiao Tong University, Shanghai 200240, China
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
- Yanyi Chu
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, 94305, USA
- Shenggeng Lin
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
- Yi Xiong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
- Shanghai Artificial Intelligence Laboratory, Shanghai, 200232, China
- Dong-Qing Wei
- Peng Cheng Laboratory, Shenzhen, Guangdong 518055, China
- Zhongjing Research and Industrialization Institute of Chinese Medicine, Zhongguancun Scientific Park, Meixi, Nanyang, Henan, 473006, China
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, and Joint Laboratory of International Cooperation in Metabolic and Developmental Sciences, Ministry of Education, Shanghai Jiao Tong University, Shanghai 200240, China
7
Wang J, Horlacher M, Cheng L, Winther O. RNA trafficking and subcellular localization-a review of mechanisms, experimental and predictive methodologies. Brief Bioinform 2023; 24:bbad249. [PMID: 37466130 PMCID: PMC10516376 DOI: 10.1093/bib/bbad249] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/17/2023] [Revised: 05/30/2023] [Accepted: 06/16/2023] [Indexed: 07/20/2023] Open
Abstract
RNA localization is essential for regulating spatial translation, where RNAs are trafficked to their target locations via various biological mechanisms. In this review, we discuss RNA localization in the context of molecular mechanisms, experimental techniques and machine learning-based prediction tools. Three main types of molecular mechanisms that control the localization of RNA to distinct cellular compartments are reviewed, including directed transport, protection from mRNA degradation, as well as diffusion and local entrapment. Advances in experimental methods, both image and sequence based, provide substantial data resources, which allow for the design of powerful machine learning models to predict RNA localizations. We review the publicly available predictive tools to serve as a guide for users and inspire developers to build more effective prediction models. Finally, we provide an overview of multimodal learning, which may provide a new avenue for the prediction of RNA localization.
Affiliation(s)
- Jun Wang
- Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
- Marc Horlacher
- Computational Health Center, Helmholtz Center, Munich, Germany
- Lixin Cheng
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China
- Ole Winther
- Bioinformatics Centre, Department of Biology, University of Copenhagen, København Ø 2100, Denmark
- Center for Genomic Medicine, Rigshospitalet (Copenhagen University Hospital), Copenhagen 2100, Denmark
- Section for Cognitive Systems, Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby 2800, Denmark
8
Wang J, Guo C, Yang L, Sun P, Jing X. Peripheral blood microR-146a and microR-29c expression in children with Mycoplasma pneumoniae pneumonia and its clinical value. Ital J Pediatr 2023; 49:119. [PMID: 37705091 PMCID: PMC10500935 DOI: 10.1186/s13052-023-01500-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/06/2023] [Accepted: 07/20/2023] [Indexed: 09/15/2023] Open
Abstract
BACKGROUND We investigated changes in microR-29c and microR-146a expression in the serum of children with Mycoplasma pneumoniae pneumonia, analysed their relationship with inflammatory factors and disease severity, and evaluated their diagnostic significance. METHODS Fifty-six children with Mycoplasma pneumoniae pneumonia were enrolled as the Mycoplasma pneumoniae pneumonia group; 37 healthy children were enrolled as the control group. The microR-29c or microR-146a serum expression levels were determined using real-time quantitative reverse transcription polymerase chain reaction. Interleukin-17, tumour necrosis factor-alpha, and interleukin-1 beta levels were detected using enzyme-linked immunosorbent assay. The correlation between serum microR-29c or microR-146a expression and inflammatory factors was analysed using the Pearson's method. Receiver operating characteristic curves were used to evaluate the diagnostic value of serum microR-29c, microR-146a, and their combined detection in Mycoplasma pneumoniae pneumonia. RESULTS Compared with that in healthy children, the microR-29c and microR-146a serum levels were significantly downregulated in children with Mycoplasma pneumoniae pneumonia; the decrease was more obvious in children with severe cases than that in those with mild cases. In addition, microR-29c and microR-146a were negatively correlated with increased expression of interleukin-17, tumour necrosis factor-alpha, and interleukin-1 beta. Receiver operating characteristic curves showed that a combination of microR-29c and microR-146a was highly suitable for diagnosing Mycoplasma pneumoniae pneumonia. CONCLUSION Serum microR-29c and microR-146a were underexpressed in children with Mycoplasma pneumoniae pneumonia, and diagnostic accuracy was significantly improved with combined microR-29c and microR-146a detection. Therefore, both microR-29c and microR-146a levels can be used as biomarkers for the diagnosis of Mycoplasma pneumoniae pneumonia.
Affiliation(s)
- Jingcai Wang
- Department of Pediatric Medicine, Affiliated Hospital of Chengde Medical College, Chengde, 067000, China
- Chunyan Guo
- Department of Pediatric Medicine, Affiliated Hospital of Chengde Medical College, Chengde, 067000, China
- Lixin Yang
- Department of Pediatric Medicine, Affiliated Hospital of Chengde Medical College, Chengde, 067000, China
- Peng Sun
- Department of Pediatric Medicine, Affiliated Hospital of Chengde Medical College, Chengde, 067000, China
- Xiaoqing Jing
- Department of Pediatric Medicine, Affiliated Hospital of Chengde Medical College, Chengde, 067000, China
9
Bai T, Yan K, Liu B. DAmiRLocGNet: miRNA subcellular localization prediction by combining miRNA-disease associations and graph convolutional networks. Brief Bioinform 2023:bbad212. [PMID: 37332057 DOI: 10.1093/bib/bbad212] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 05/17/2023] [Accepted: 05/18/2023] [Indexed: 06/20/2023] Open
Abstract
MicroRNAs (miRNAs) are post-transcriptional regulators in humans that are involved in various physiological processes by regulating gene expression. The subcellular localization of miRNAs plays a crucial role in the discovery of their biological functions. Although several computational methods based on miRNA functional similarity networks have been presented to identify the subcellular localization of miRNAs, it remains difficult for these approaches to effectively extract well-referenced miRNA functional representations due to insufficient miRNA-disease association representation and disease semantic representation. Currently, there has been a significant amount of research on miRNA-disease associations, making it possible to address the issue of insufficient miRNA functional representation. In this work, a novel model, named DAmiRLocGNet, is established based on a graph convolutional network (GCN) and an autoencoder (AE) for identifying the subcellular localizations of miRNA. DAmiRLocGNet constructs its features from miRNA sequence information, miRNA-disease association information and disease semantic information. The GCN is utilized to gather the information of neighboring nodes and capture the implicit information of network structures from miRNA-disease association information and disease semantic information. The AE is employed to capture sequence semantics from sequence similarity networks. The evaluation demonstrates that the performance of DAmiRLocGNet is superior to other competing computational approaches, benefiting from implicit features captured by using GCNs. DAmiRLocGNet has the potential to be applied to the identification of subcellular localization of other non-coding RNAs. Moreover, it can facilitate further investigation into the functional mechanisms underlying miRNA localization. The source code and datasets can be accessed at http://bliulab.net/DAmiRLocGNet.
Affiliation(s)
- Tao Bai
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- School of Mathematics & Computer Science, Yan'an University, Shaanxi 716000, China
- Ke Yan
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Bin Liu
- School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
- Advanced Research Institute of Multidisciplinary Science, Beijing Institute of Technology, Beijing 100081, China
10
Li J, Lin H, Wang Y, Li Z, Wu B. Prediction of potential small molecule-miRNA associations based on heterogeneous network representation learning. Front Genet 2022; 13:1079053. [PMID: 36531225 PMCID: PMC9755196 DOI: 10.3389/fgene.2022.1079053] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/25/2022] [Accepted: 11/21/2022] [Indexed: 11/25/2023] Open
Abstract
MicroRNAs (miRNAs) are closely associated with the occurrence and development of many complex human diseases. A growing number of studies have shown that miRNAs are emerging as new therapeutic targets of small molecule (SM) drugs. Since traditional experimental methods are expensive and time consuming, it is particularly crucial to find efficient computational approaches to predict potential small molecule-miRNA (SM-miRNA) associations. Considering that integrating multi-source heterogeneous information related to SM-miRNA association prediction would provide comprehensive insight into the features of both SMs and miRNAs, we proposed a novel model of Small Molecule-MiRNA Association prediction based on Heterogeneous Network Representation Learning (SMMA-HNRL) for more precisely predicting the potential SM-miRNA associations. In SMMA-HNRL, a novel heterogeneous information network was constructed with SM nodes, miRNA nodes and disease nodes. To access and utilize the topological information of the heterogeneous information network, feature vectors of SM and miRNA nodes were obtained by two different heterogeneous network representation learning algorithms (HeGAN and HIN2Vec), respectively, and merged by concatenation. Finally, LightGBM was chosen as the classifier of SMMA-HNRL for predicting potential SM-miRNA associations. 10-fold cross-validation was conducted to evaluate the prediction performance of SMMA-HNRL; it achieved an area under the ROC curve of 0.9875, which was superior to three other state-of-the-art models. With two independent validation datasets, the test results revealed the robustness of our model. Moreover, three case studies were performed. As a result, 35, 37, and 22 miRNAs among the top 50 predicted miRNAs associated with 5-FU, cisplatin, and imatinib, respectively, were validated by the experimental literature, which confirmed the effectiveness of SMMA-HNRL. 
The source code and experimental data of SMMA-HNRL are available at https://github.com/SMMA-HNRL/SMMA-HNRL.
Affiliation(s)
- Jianwei Li
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Hebei Province Key Laboratory of Big Data Calculation, Hebei University of Technology, Tianjin, China
- Hongxin Lin
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Yinfei Wang
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Zhiguang Li
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
- Baoqin Wu
- School of Artificial Intelligence, Institute of Computational Medicine, Hebei University of Technology, Tianjin, China
11
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: towards systematic evaluation of computational models. Brief Bioinform 2022; 23:6712303. [PMID: 36151749 DOI: 10.1093/bib/bbac407] [Citation(s) in RCA: 58] [Impact Index Per Article: 19.3] [Received: 06/18/2022] [Revised: 08/11/2022] [Accepted: 08/20/2022] [Indexed: 12/14/2022] Open
Abstract
Currently, there exist no generally accepted strategies of evaluating computational models for microRNA-disease associations (MDAs). Though K-fold cross validations and case studies seem to be must-have procedures, the value of K, the evaluation metrics, and the choice of query diseases as well as the inclusion of other procedures (such as parameter sensitivity tests, ablation studies and computational cost reports) are all determined on a case-by-case basis and depending on the researchers' choices. In the current review, we include a comprehensive analysis on how 29 state-of-the-art models for predicting MDAs were evaluated. Based on the analytical results, we recommend a feasible evaluation workflow that would suit any future model to facilitate fair and systematic assessment of predictive performance.
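To make the role of K concrete, the following minimal sketch generates K-fold train/test index splits using only the standard library. It illustrates the general procedure, not the workflow recommended in the review; the function name, the fixed shuffle seed, and the sample count are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists; every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Ten hypothetical miRNA-disease pairs split into five folds.
splits = list(k_fold_indices(10, k=5))
for train, test in splits:
    print(sorted(test))
```

Changing K changes both the training-set size per round and the number of rounds, which is one reason results obtained with different K values are hard to compare directly.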
Affiliation(s)
- Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
- The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China
12
Zhou H, Wang H, Tang J, Ding Y, Guo F. Identify ncRNA Subcellular Localization via Graph Regularized k-Local Hyperplane Distance Nearest Neighbor Model on Multi-Kernel Learning. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3517-3529. [PMID: 34432632 DOI: 10.1109/tcbb.2021.3107621] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/13/2023]
Abstract
Non-coding RNAs (ncRNAs) are a type of RNA that does not encode protein sequences. Emerging evidence shows that many ncRNAs may participate in many biological processes and must be widely involved in many types of cancers. Therefore, understanding their functionality is of great importance. Similar to proteins, the various functions of ncRNAs rely on their subcellular localizations. Traditional high-throughput wet-lab methods for identifying subcellular localization are time-consuming and costly. In this paper, we propose a novel computational method based on multi-kernel learning to identify multi-label ncRNA subcellular localizations, via a graph regularized k-local hyperplane distance nearest neighbor algorithm. First, we construct six types of sequence-based feature descriptors and select important feature vectors. Then, we build a multi-kernel learning model with the Hilbert-Schmidt independence criterion (HSIC) to obtain optimal weights for various features. Furthermore, we propose the graph regularized k-local hyperplane distance nearest neighbor algorithm (GHKNN) as a binary classification model for detecting one kind of non-coding RNA subcellular localization. Finally, we apply the One-vs-Rest strategy to decompose the multi-label problem of non-coding RNA subcellular localizations. Our method achieves excellent performance on three ncRNA datasets and three human ncRNA datasets, and outperforms other outstanding machine learning methods. Compared with existing methods, our model also performs well, especially on small datasets. We expect that this model will be useful for the prediction of subcellular localization and the study of important functional mechanisms of ncRNAs. Furthermore, we establish a user-friendly web server (http://ncrna.lbci.net/) with the implementation of our method, which can be easily used by most experimental scientists.
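The One-vs-Rest decomposition mentioned in this abstract turns one multi-label problem into one binary problem per location. A minimal sketch of the label transformation alone (GHKNN itself is not reproduced here) might look like the following; the function name and the toy annotations are hypothetical.

```python
def one_vs_rest_targets(label_sets, labels):
    """Map each label to a 0/1 target vector over the samples."""
    return {
        label: [1 if label in sample else 0 for sample in label_sets]
        for label in labels
    }


# Toy annotations: one set of compartments per ncRNA (illustrative only).
samples = [{"nucleus", "cytoplasm"}, {"cytoplasm"}, {"exosome"}]
labels = ["cytoplasm", "exosome", "nucleus"]
targets = one_vs_rest_targets(samples, labels)
for label in labels:
    print(label, targets[label])
```

Each target vector can then be paired with the shared feature matrix to train an independent binary classifier, and the per-label predictions are combined into the final multi-label output.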
|
13
|
Arshinchi Bonab R, Asfa S, Kontou P, Karakülah G, Pavlopoulou A. Identification of neoplasm-specific signatures of miRNA interactions by employing a systems biology approach. PeerJ 2022; 10:e14149. [PMID: 36213495 PMCID: PMC9536303 DOI: 10.7717/peerj.14149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2022] [Accepted: 09/07/2022] [Indexed: 01/21/2023] Open
Abstract
MicroRNAs represent major regulatory components of the disease epigenome and they constitute powerful biomarkers for the accurate diagnosis and prognosis of various diseases, including cancers. The advent of high-throughput technologies facilitated the generation of a vast amount of miRNA-cancer association data. Computational approaches have been utilized widely to effectively analyze and interpret these data towards the identification of miRNA signatures for diverse types of cancers. Herein, a novel computational workflow was applied to discover core sets of miRNA interactions for the major groups of neoplastic diseases by employing network-based methods. To this end, miRNA-cancer association data from four comprehensive publicly available resources were utilized for constructing miRNA-centered networks for each major group of neoplasms. The corresponding miRNA-miRNA interactions were inferred based on shared functionally related target genes. The topological attributes of the generated networks were investigated in order to detect clusters of highly interconnected miRNAs that form core modules in each network. Those modules that exhibited the highest degree of mutual exclusivity were selected from each graph. In this way, neoplasm-specific miRNA modules were identified that could represent potential signatures for the corresponding diseases.
Affiliation(s)
- Reza Arshinchi Bonab
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
  - Izmir Biomedicine and Genome Center, Izmir, Turkey
- Seyedehsadaf Asfa
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
  - Izmir Biomedicine and Genome Center, Izmir, Turkey
- Panagiota Kontou
  - Department of Mathematics, University of Thessaly, Lamia, Greece
- Gökhan Karakülah
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
  - Izmir Biomedicine and Genome Center, Izmir, Turkey
- Athanasia Pavlopoulou
  - Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
  - Izmir Biomedicine and Genome Center, Izmir, Turkey
|
14
|
Zhao C, Wang H, Qi W, Liu S. Toward drug-miRNA resistance association prediction by positional encoding graph neural network and multi-channel neural network. Methods 2022; 207:81-89. [PMID: 36167292 DOI: 10.1016/j.ymeth.2022.09.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Revised: 09/01/2022] [Accepted: 09/18/2022] [Indexed: 10/31/2022] Open
Abstract
Drug discovery is a costly and time-consuming process, and most drugs exert therapeutic efficacy by targeting specific proteins. However, there are a large number of proteins that are not targeted by any drug. Recently, miRNA-based therapeutics are becoming increasingly important, since miRNA can regulate the expressions of specific genes and affect a variety of human diseases. Therefore, it is of great significance to study the associations between miRNAs and drugs to enable drug discovery and disease treatment. In this work, we propose a novel method named DMR-PEG, which facilitates drug-miRNA resistance association (DMRA) prediction by leveraging positional encoding graph neural network with layer attention (LAPEG) and multi-channel neural network (MNN). LAPEG considers both the potential information in the miRNA-drug resistance heterogeneous network and the specific characteristics of entities (i.e., drugs and miRNAs) to learn favorable representations of drugs and miRNAs. And MNN models various sophisticated relations and synthesizes the predictions from different perspectives effectively. In the comprehensive experiments, DMR-PEG achieves the area under the precision-recall curve (AUPR) score of 0.2793 and the area under the receiver-operating characteristic curve (AUC) score of 0.9475, which outperforms the most state-of-the-art methods. Further experimental results show that our proposed method has good robustness and stability. The ablation study demonstrates each component in DMR-PEG is essential for drug-miRNA drug resistance association prediction. And real-world case study presents that DMR-PEG is promising for DMRA inference.
Affiliation(s)
- Chengshuai Zhao
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Haorui Wang
  - School of Computer Science, Wuhan University, Wuhan 430072, China
- Weiwei Qi
  - Hubei Bailianhe Pumped-storage Power Station, Wuhan 430074, China
- Shichao Liu
  - College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
|
15
|
Zhang ZY, Ning L, Ye X, Yang YH, Futamura Y, Sakurai T, Lin H. iLoc-miRNA: extracellular/intracellular miRNA prediction using deep BiLSTM with attention mechanism. Brief Bioinform 2022; 23:6693601. [PMID: 36070864 DOI: 10.1093/bib/bbac395] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 06/29/2022] [Revised: 07/27/2022] [Accepted: 08/13/2022] [Indexed: 11/13/2022] Open
Abstract
The location of microRNAs (miRNAs) in cells determines their function in regulatory activity. Studies have shown that miRNAs are stable in the extracellular environment that mediates cell-to-cell communication and are located in the intracellular region that responds to cellular stress and environmental stimuli. Although in situ detection techniques for miRNAs have contributed greatly to the study of miRNA localization and distribution, research on miRNA subcellular localization and its role is still in progress. Recently, some machine learning-based algorithms have been designed for miRNA subcellular location prediction, but their performance is still far from satisfactory. Here, we present a new data partitioning strategy that categorizes functionally similar locations for the precise and instructive prediction of miRNA subcellular location in Homo sapiens. To characterize the localization signals, we adopted one-hot encoding with post padding to represent the whole miRNA sequences, and proposed a deep bidirectional long short-term memory network with multi-head self-attention to model them. The algorithm showed high selectivity in distinguishing extracellular miRNAs from intracellular miRNAs. Moreover, a series of motif analyses were performed to explore the mechanism of miRNA subcellular localization. To improve the convenience of the model, a user-friendly web server named iLoc-miRNA was established (http://iLoc-miRNA.lin-group.cn/).
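One-hot encoding with post padding, as named in this abstract, can be sketched as follows. The nucleotide order `ACGU`, the handling of unknown symbols, and the function name `one_hot_post_pad` are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

ALPHABET = "ACGU"  # assumed nucleotide order; not stated in the abstract

def one_hot_post_pad(seq, max_len):
    """Encode an RNA sequence as a (max_len, 4) matrix.

    Rows after the end of the sequence stay all-zero (post padding);
    sequences longer than max_len are truncated.
    """
    mat = np.zeros((max_len, len(ALPHABET)), dtype=np.float32)
    for i, base in enumerate(seq.upper().replace("T", "U")[:max_len]):
        j = ALPHABET.find(base)
        if j >= 0:  # unknown symbols (e.g. N) are left as all-zero rows
            mat[i, j] = 1.0
    return mat

enc = one_hot_post_pad("UGAGG", max_len=8)
```

Padding at the end rather than the start keeps every sequence aligned at its first nucleotide, so all inputs share one fixed shape for a recurrent model.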
Affiliation(s)
- Zhao-Yue Zhang
  - Tsukuba Life Science Innovation Program, University of Tsukuba, Tsukuba 3058577, Japan
- Lin Ning
  - School of Healthcare Technology, Chengdu Neusoft University, 611844, Chengdu, China
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Yu-He Yang
  - Center for Information Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yasunori Futamura
  - Tsukuba Life Science Innovation Program, University of Tsukuba, Tsukuba 3058577, Japan
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Tetsuya Sakurai
  - Tsukuba Life Science Innovation Program, University of Tsukuba, Tsukuba 3058577, Japan
  - Department of Computer Science, University of Tsukuba, Tsukuba 3058577, Japan
- Hao Lin
  - Center for Information Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
|
16
|
Qian Q, Ma Q, Wang B, Qian Q, Zhao C, Feng F, Dong X. Downregulated miR-129-5p expression inhibits rat pulmonary fibrosis by upregulating STAT1 gene expression in macrophages. Int Immunopharmacol 2022; 109:108880. [PMID: 35689956 DOI: 10.1016/j.intimp.2022.108880] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
OBJECTIVE This study investigated the mechanism by which microRNA-129-5p (miR-129-5p) in macrophages affects pulmonary fibrosis in rats by regulating the expression of the signal transducer and activator of transcription 1 (STAT1) gene. METHODS After the establishment of a pulmonary fibrosis rat model, quantitative real-time polymerase chain reaction (qRT-PCR) was employed to detect the expression of miR-129-5p in the sham group and model group. The binding sites between miR-129-5p and STAT1 were predicted online and verified by using a dual luciferase reporter system. qRT-PCR and Western blot analyses were used to test the effect of miR-129-5p on STAT1 gene expression. M2 macrophages were isolated and induced, and exosomes were extracted. Cell proliferation was detected by EdU. Furthermore, qRT-PCR was performed to detect the expression of STAT1, collagen type I A2 (COL1A2), collagen type III A1 (COL3A1), fibronectin, and α-SMA in cells and tissues followed by the detection of CD9, CD63, CD81, CD31 and STAT1 protein expression using a Western blot analysis. The pulmonary fibrosis area was detected by Masson staining followed by the immunohistochemical detection of α-smooth muscle actin (α-SMA) and type I collagen (COL-I) expression in pulmonary fibroblasts. RESULTS Compared with the sham group, the expression level of miR-129-5p in the model group was significantly increased (P < 0.05). miR-129-5p was observed to negatively regulate the expression of STAT1 (P < 0.05). The in vitro cell transfection experiments showed that after inhibiting the expression of miR-129-5p, the expression of STAT1 was increased, and the proliferation of fibroblasts and pulmonary fibrosis were inhibited (all P < 0.05). 
Furthermore, compared with the fibroblasts without coculture, the proliferation of the fibroblasts cocultured with M2 macrophage-secreted exosomes was clearly increased, and the expression levels of COL1A2, COL3A1, fibronectin and α-SMA were significantly increased (all P < 0.05). Compared with the mimic NC-exo group, the miR-129-5p-exo group had significantly increased proliferation of fibroblasts, decreased expression of STAT1, and significantly increased expression of COL1A2, COL3A1, fibronectin and α-SMA, and M2 macrophage-secreted exosomes could carry miR-129-5p to fibroblasts. Furthermore, the in vivo experiment confirmed that the exosomes of M2 macrophages could carry miR-129-5p, which could regulate M2 macrophages with pulmonary fibrosis in vivo. CONCLUSION M2 macrophages can carry miR-129-5p to pulmonary interstitial fibroblasts and inhibit STAT1 gene expression, which may lead to the proliferation of fibroblasts and promote pulmonary fibrosis. The downregulation of miR-129-5p can significantly promote STAT1 gene expression in macrophages to inhibit pulmonary fibrosis in rats.
Affiliation(s)
- Qingzeng Qian
  - School of Public Health, North China University of Science and Technology, Tangshan 063210, Hebei, China
- Qinghua Ma
  - Department of Preventive Health, The Third People's Hospital Of Xiangcheng District In Suzhou, Suzhou 215134, Jiangsu, China
- Bin Wang
  - Department of Pediatrics, North China University of Science and Technology Affiliated Hospital, Tangshan 063210, Hebei, China
- Qingqiang Qian
  - Department of Neurology, Tangshan Gongren Hospital, Tangshan, Hebei, China
- Changsong Zhao
  - Department of Emergency, Tangshan Hospital of Traditional Chinese Medicine, Tangshan, Hebei, China
- Fumin Feng
  - School of Public Health, North China University of Science and Technology, Tangshan 063210, Hebei, China
- Xiaona Dong
  - Department of Respiratory Medicine, Tangshan People's Hospital, Tangshan 063001, Hebei, China
|
17
|
Pati SK, Gupta MK, Shai R, Banerjee A, Ghosh A. Missing value estimation of microarray data using Sim-GAN. Knowl Inf Syst 2022. [DOI: 10.1007/s10115-022-01718-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2022]
|
18
|
Asim MN, Ibrahim MA, Malik MI, Zehe C, Cloarec O, Trygg J, Dengel A, Ahmed S. EL-RMLocNet: An explainable LSTM network for RNA-associated multi-compartment localization prediction. Comput Struct Biotechnol J 2022; 20:3986-4002. [PMID: 35983235 PMCID: PMC9356161 DOI: 10.1016/j.csbj.2022.07.031] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/02/2022] [Revised: 07/16/2022] [Accepted: 07/16/2022] [Indexed: 11/23/2022] Open
Abstract
Subcellular localization of ribonucleic acid (RNA) molecules provides significant insight into the functionality of RNAs and helps to explore their association with various diseases. The predominantly developed single-compartment localization predictors (SCLPs) fail to explain RNA association with diverse biochemical and pathological processes, which mainly occur through RNA co-localization in multiple compartments. The few multi-compartment localization predictors (MCLPs) achieve decent performance only for a target RNA class of a particular sub-type. Further, existing computational approaches have limited practical significance and limited potential to optimize therapeutics because of poor model explainability. This paper presents an explainable Long Short-Term Memory (LSTM) network, "EL-RMLocNet", whose predictive performance and interpretability are optimized using a novel GeneticSeq2Vec statistical representation learning scheme and an attention mechanism, for accurate multi-compartment localization prediction of different RNAs using raw RNA sequences alone. GeneticSeq2Vec generates optimized statistical vectors of raw RNA sequences by capturing short- and long-range relations of nucleotide k-mers. Using the sequence vectors generated by GeneticSeq2Vec, LSTM layers extract the most informative features, which an attention layer weights according to their discriminative potential for multi-compartment localization prediction. Through reverse engineering, weights of the statistical feature space are mapped to nucleotide k-mer patterns to make the prediction decisions transparent and explainable for different RNA classes and species. Empirical evaluation indicates that EL-RMLocNet outperforms the state-of-the-art predictor for subcellular localization prediction of 4 different RNA classes by an average accuracy of 8% for Homo sapiens and 6% for Mus musculus. EL-RMLocNet is freely available as a web server at https://sds_genetic_analysis.opendfki.de/subcellular_loc/.
Affiliation(s)
- Muhammad Nabeel Asim
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern 67663, Germany
- Muhammad Ali Ibrahim
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern 67663, Germany
- Muhammad Imran Malik
  - School of Computer Science & Electrical Engineering, National University of Sciences and Technology, 44000, Islamabad, Pakistan
- Christoph Zehe
  - Sartorius Corporate Research, Sartorius Stedim Cellca GmbH, 89081 Ulm, Germany
- Olivier Cloarec
  - Sartorius Corporate Research, Sartorius Stedim Cellca GmbH, 89081 Ulm, Germany
- Johan Trygg
  - Computational Life Science Cluster (CLiC), Umeå University, 90187 Umea, Sweden
  - Sartorius Corporate Research, Sartorius Stedim Data Analytics, 90333 Umea, Sweden
- Andreas Dengel
  - Department of Computer Science, Technical University of Kaiserslautern, Kaiserslautern 67663, Germany
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern 67663, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, Kaiserslautern 67663, Germany
|
19
|
Chen Y, Wang Y, Ding Y, Su X, Wang C. RGCNCDA: Relational graph convolutional network improves circRNA-disease association prediction by incorporating microRNAs. Comput Biol Med 2022; 143:105322. [PMID: 35217342 DOI: 10.1016/j.compbiomed.2022.105322] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 02/05/2022] [Revised: 02/11/2022] [Accepted: 02/13/2022] [Indexed: 12/21/2022]
Abstract
Recently, a large number of studies have indicated that circRNAs with covalently closed loops play important roles in biological processes and have potential as diagnostic biomarkers. Therefore, research on the circRNA-disease relationship is helpful in disease diagnosis and treatment. However, traditional biological verification methods require considerable labor and time costs. In this paper, we propose a new computational method (RGCNCDA) to predict circRNA-disease associations based on relational graph convolutional networks (R-GCNs). The method first integrates the circRNA similarity network, miRNA similarity network, disease similarity network and association networks among them to construct a global heterogeneous network. Then, it employs the random walk with restart (RWR) and principal component analysis (PCA) models to learn low-dimensional and high-order information from the global heterogeneous network as the topological features. Finally, a prediction model based on an R-GCN encoder and a DistMult decoder is built to predict the potential disease-associated circRNA. The predicted results demonstrate that RGCNCDA performs significantly better than the other six state-of-the-art methods in a 5-fold cross validation. Furthermore, the case study illustrates that RGCNCDA can effectively discover potential circRNA-disease associations.
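The random walk with restart (RWR) step named in this abstract can be sketched on a small graph. This is a generic textbook formulation, not the authors' pipeline: the restart probability of 0.5, the column normalization, the toy adjacency matrix, and the function name are all assumptions.

```python
import numpy as np

def random_walk_with_restart(A, seed, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * W p + r * p0 until the update is below tol.

    A is a symmetric adjacency matrix with no isolated nodes (assumed, so
    every column sum is non-zero); W is its column-normalized form.
    """
    W = A / A.sum(axis=0, keepdims=True)
    p0 = np.zeros(A.shape[0])
    p0[seed] = 1.0                      # the walk always restarts at the seed node
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * (W @ p) + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy path graph 0 - 1 - 2, with the walk seeded at node 0.
A = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
p = random_walk_with_restart(A, seed=0)
```

The stationary vector `p` scores each node by its proximity to the seed; stacking such vectors for every node gives the kind of topological feature matrix that a dimensionality reduction such as PCA can then compress.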
Affiliation(s)
- Yaojia Chen
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Yanpeng Wang
  - Beidahuang Industry Group General Hospital, Harbin, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Xi Su
  - Foshan Maternity & Child Healthcare Hospital, Southern Medical University, Foshan, China
- Chunyu Wang
  - Faculty of Computing, Harbin Institute of Technology, Harbin, China
|
20
|
Xu M, Chen Y, Xu Z, Zhang L, Jiang H, Pian C. MiRLoc: predicting miRNA subcellular localization by incorporating miRNA-mRNA interactions and mRNA subcellular localization. Brief Bioinform 2022; 23:6532537. [PMID: 35183063 DOI: 10.1093/bib/bbac044] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/03/2021] [Revised: 01/17/2022] [Accepted: 01/28/2022] [Indexed: 12/19/2022] Open
Abstract
Subcellular localization of microRNAs (miRNAs) is an important reflection of their biological functions. Considering the spatio-temporal specificity of miRNA subcellular localization, experimental detection techniques are expensive and time-consuming, which strongly motivates an efficient and economical computational method to predict miRNA subcellular localization. In this paper, we describe a computational framework, MiRLoc, to predict the subcellular localization of miRNAs. In contrast to existing methods, MiRLoc uses the functional similarity between miRNAs instead of sequence features and incorporates information about the subcellular localization of the corresponding target mRNAs. The results show that miRNA functional similarity data can be effectively used to predict miRNA subcellular localization, and that inclusion of subcellular localization information of target mRNAs greatly improves prediction performance.
Affiliation(s)
- Mingmin Xu
  - College of Agriculture, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Yuanyuan Chen
  - College of Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Zhihui Xu
  - Simcere Diagnostics Co., Ltd., Nanjing, Jiangsu, China
- Liangyun Zhang
  - College of Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, China
- Hangjin Jiang
  - Center for Data Science, Zhejiang University, Hangzhou, Zhejiang, China
- Cong Pian
  - College of Sciences, Nanjing Agricultural University, Nanjing, Jiangsu, China
  - Simcere Diagnostics Co., Ltd., Nanjing, Jiangsu, China
|
21
|
Niu Y, Song C, Gong Y, Zhang W. MiRNA-Drug Resistance Association Prediction Through the Attentive Multimodal Graph Convolutional Network. Front Pharmacol 2022; 12:799108. [PMID: 35095506 PMCID: PMC8790023 DOI: 10.3389/fphar.2021.799108] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/21/2021] [Accepted: 12/20/2021] [Indexed: 11/13/2022] Open
Abstract
MiRNAs can regulate genes encoding specific proteins which are related to the efficacy of drugs, and predicting miRNA-drug resistance associations is of great importance. In this work, we propose an attentive multimodal graph convolution network method (AMMGC) to predict miRNA-drug resistance associations. AMMGC learns the latent representations of drugs and miRNAs from four graph convolution sub-networks with distinctive combinations of features. Then, an attention neural network is employed to obtain attentive representations of drugs and miRNAs, and miRNA-drug resistance associations are predicted by the inner product of learned attentive representations. The computational experiments show that AMMGC outperforms other state-of-the-art methods and baseline methods, achieving the AUPR score of 0.2399 and the AUC score of 0.9467. The analysis demonstrates that leveraging multiple features of drugs and miRNAs can make a contribution to the miRNA-drug resistance association prediction. The usefulness of AMMGC is further validated by case studies.
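The final scoring step described in this abstract, predicting associations from the inner product of learned representations, can be sketched as below. The embeddings here are hand-written toy values rather than learned ones, and the sigmoid that maps scores into (0, 1) is an assumption; the abstract states only that the inner product is used.

```python
import numpy as np

def inner_product_scores(drug_emb, mirna_emb):
    """Score every (drug, miRNA) pair by the inner product of their embeddings."""
    logits = drug_emb @ mirna_emb.T        # shape: (n_drugs, n_mirnas)
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid is an assumption, not from the abstract

# Toy embeddings (hypothetical values): 2 drugs and 3 miRNAs in a 2-d latent space.
drugs = np.array([[1.0, 0.0], [0.0, 2.0]])
mirnas = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
scores = inner_product_scores(drugs, mirnas)
```

Pairs whose embeddings point in similar directions receive scores above 0.5, while orthogonal pairs sit at exactly 0.5 under this assumed sigmoid.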
Affiliation(s)
- Yanqing Niu
  - School of Mathematics and Statistics, South-Central University for Nationalities, Wuhan, China
- Congzhi Song
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
- Yuchong Gong
  - School of Computer Science, Wuhan University, Wuhan, China
- Wen Zhang
  - College of Informatics, Huazhong Agricultural University, Wuhan, China
|
22
|
Pesaranghader A, Matwin S, Sokolova M, Grenier JC, Beiko RG, Hussin J. OUP accepted manuscript. Bioinformatics 2022; 38:3051-3061. [PMID: 35536192 PMCID: PMC9154256 DOI: 10.1093/bioinformatics/btac304] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/21/2021] [Revised: 02/12/2022] [Indexed: 11/24/2022] Open
Abstract
Motivation There is a plethora of measures to evaluate functional similarity (FS) of genes based on their co-expression, protein–protein interactions and sequence similarity. These measures are typically derived from hand-engineered and application-specific metrics to quantify the degree of shared information between two genes using their Gene Ontology (GO) annotations. Results We introduce deepSimDEF, a deep learning method to automatically learn FS estimation of gene pairs given a set of genes and their GO annotations. deepSimDEF’s key novelty is its ability to learn low-dimensional embedding vector representations of GO terms and gene products and then calculate FS using these learned vectors. We show that deepSimDEF can predict the FS of new genes using their annotations: it outperformed all other FS measures by >5–10% on yeast and human reference datasets on protein–protein interactions, gene co-expression and sequence homology tasks. Thus, deepSimDEF offers a powerful and adaptable deep neural architecture that can benefit a wide range of problems in genomics and proteomics, and its architecture is flexible enough to support its extension to any organism. Availability and implementation Source code and data are available at https://github.com/ahmadpgh/deepSimDEF Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Stan Matwin
  - Faculty of Computer Science, Dalhousie University, Halifax B3H 4R2, Canada
  - Institute for Big Data Analytics, Dalhousie University, Halifax B3H 4R2, Canada
  - Institute of Computer Science, Polish Academy of Sciences, Warsaw, Poland
- Marina Sokolova
  - Institute for Big Data Analytics, Dalhousie University, Halifax B3H 4R2, Canada
  - Faculty of Medicine and Faculty of Engineering, University of Ottawa, Ottawa K1H 8M5, Canada
- Robert G Beiko
  - Faculty of Computer Science, Dalhousie University, Halifax B3H 4R2, Canada
  - Institute for Big Data Analytics, Dalhousie University, Halifax B3H 4R2, Canada
|
23
|
Savulescu AF, Bouilhol E, Beaume N, Nikolski M. Prediction of RNA subcellular localization: Learning from heterogeneous data sources. iScience 2021; 24:103298. [PMID: 34765919 PMCID: PMC8571491 DOI: 10.1016/j.isci.2021.103298] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/18/2022] Open
Abstract
RNA subcellular localization has recently emerged as a widespread phenomenon, which may apply to the majority of RNAs. The two main sources of data for characterization of RNA localization are sequence features and microscopy images, such as obtained from single-molecule fluorescent in situ hybridization-based techniques. Although such imaging data are ideal for characterization of RNA distribution, these techniques remain costly, time-consuming, and technically challenging. Given these limitations, imaging data exist only for a limited number of RNAs. We argue that the field of RNA localization would greatly benefit from complementary techniques able to characterize location of RNA. Here we discuss the importance of RNA localization and the current methodology in the field, followed by an introduction on prediction of location of molecules. We then suggest a machine learning approach based on the integration between imaging localization data and sequence-based data to assist in characterization of RNA localization on a transcriptome level.
Affiliation(s)
- Anca Flavia Savulescu
  - Division of Chemical, Systems & Synthetic Biology, Institute for Infectious Disease & Molecular Medicine, Faculty of Health Sciences, University of Cape Town, 7925 Cape Town, South Africa
- Emmanuel Bouilhol
  - Université de Bordeaux, Bordeaux Bioinformatics Center, Bordeaux, France
  - Université de Bordeaux, CNRS, IBGC, UMR 5095, Bordeaux, France
- Nicolas Beaume
  - Division of Medical Virology, Faculty of Health Sciences, University of Cape Town, 7925 Cape Town, South Africa
- Macha Nikolski
  - Université de Bordeaux, Bordeaux Bioinformatics Center, Bordeaux, France
  - Université de Bordeaux, CNRS, IBGC, UMR 5095, Bordeaux, France
|
24
|
Human microRNA similarity in breast cancer. Biosci Rep 2021; 41:229885. [PMID: 34612484 PMCID: PMC8529337 DOI: 10.1042/bsr20211123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2021] [Revised: 09/28/2021] [Accepted: 10/04/2021] [Indexed: 11/25/2022] Open
Abstract
MicroRNAs (miRNAs) play important roles in a variety of human diseases, including breast cancer. A number of miRNAs are up- and down-regulated in breast cancer. However, little is known about miRNA similarity and similarity network in breast cancer. Here, a collection of 272 breast cancer-associated miRNA precursors (pre-miRNAs) were utilized to calculate similarities of sequences, target genes, pathways and functions and construct a combined similarity network. Well-characterized miRNAs and their similarity network were highlighted. Interestingly, miRNA sequence-dependent similarity networks were not identified in spite of sequence–target gene association. Similarity networks with minimum and maximum number of miRNAs originate from pathway and mature sequence, respectively. The breast cancer-associated miRNAs were divided into seven functional classes (classes I–VII) followed by disease enrichment analysis and novel miRNA-based disease similarities were found. The finding would provide insight into miRNA similarity, similarity network and disease heterogeneity in breast cancer.
|
25
|
Zhang ZW, Gao Z, Zheng CH, Li L, Qi SM, Wang YT. WVMDA: Predicting miRNA-Disease Association Based on Weighted Voting. Front Genet 2021; 12:742992. [PMID: 34659363 PMCID: PMC8511643 DOI: 10.3389/fgene.2021.742992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2021] [Accepted: 09/09/2021] [Indexed: 11/15/2022] Open
Abstract
An increasing number of experiments have verified that miRNA expression is related to human diseases. The miRNA expression profile may be an indicator for clinical diagnosis and provides a new direction for the prevention and treatment of complex diseases. In this work, we present a weighted voting-based model for predicting miRNA–disease association (WVMDA). To build a reasonable similarity network, we established a credibility similarity based on the reliability of known associations and used it to improve the original incomplete similarity. To eliminate noise interference as much as possible while retaining more reliable similarity information, we developed a filter. More importantly, to ensure the fairness and efficiency of weighted voting, we focused on the design of the weighting. Finally, cross-validation experiments and case studies were undertaken to verify the efficacy of the proposed model. The results showed that WVMDA could efficiently identify miRNAs associated with the disease.
Affiliation(s)
- Zhen-Wei Zhang
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Zhen Gao
  - School of Computer Science and Technology, Anhui University, Hefei, China
- Chun-Hou Zheng
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
  - School of Computer Science and Technology, Anhui University, Hefei, China
- Lei Li
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Su-Min Qi
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
- Yu-Tian Wang
  - School of Cyberspace Security, Qufu Normal University, Qufu, China
|
26
|
Asim MN, Ibrahim MA, Imran Malik M, Dengel A, Ahmed S. Advances in Computational Methodologies for Classification and Sub-Cellular Locality Prediction of Non-Coding RNAs. Int J Mol Sci 2021; 22:8719. [PMID: 34445436 PMCID: PMC8395733 DOI: 10.3390/ijms22168719] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 07/17/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 02/06/2023] Open
Abstract
Apart from protein-coding Ribonucleic acids (RNAs), there exists a variety of non-coding RNAs (ncRNAs) which regulate complex cellular and molecular processes. High-throughput sequencing technologies and bioinformatics approaches have largely promoted the exploration of ncRNAs which revealed their crucial roles in gene regulation, miRNA binding, protein interactions, and splicing. Furthermore, ncRNAs are involved in the development of complicated diseases like cancer. Categorization of ncRNAs is essential to understand the mechanisms of diseases and to develop effective treatments. Sub-cellular localization information of ncRNAs demystifies diverse functionalities of ncRNAs. To date, several computational methodologies have been proposed to precisely identify the class as well as sub-cellular localization patterns of RNAs. This paper discusses different types of ncRNAs, reviews computational approaches proposed in the last 10 years to distinguish coding-RNA from ncRNA, to identify sub-types of ncRNAs such as piwi-associated RNA, micro RNA, long ncRNA, and circular RNA, and to determine sub-cellular localization of distinct ncRNAs and RNAs. Furthermore, it summarizes diverse ncRNA classification and sub-cellular localization determination datasets along with benchmark performance to aid the development and evaluation of novel computational methodologies. It identifies research gaps, heterogeneity, and challenges in the development of computational approaches for RNA sequence analysis. We consider that our expert analysis will assist Artificial Intelligence researchers with knowing state-of-the-art performance, model selection for various tasks on one platform, dominantly used sequence descriptors, neural architectures, and interpreting inter-species and intra-species performance deviation.
Affiliation(s)
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
  - Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Ali Ibrahim
  - German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
  - Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Muhammad Imran Malik
  - National Center for Artificial Intelligence (NCAI), National University of Sciences and Technology, Islamabad 44000, Pakistan
  - School of Electrical Engineering & Computer Science, National University of Sciences and Technology, Islamabad 44000, Pakistan
- Andreas Dengel
  - German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
  - Department of Computer Science, Technical University of Kaiserslautern, 67663 Kaiserslautern, Germany
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence (DFKI), 67663 Kaiserslautern, Germany
  - DeepReader GmbH, Trippstadter Str. 122, 67663 Kaiserslautern, Germany
|
27
|
Shu L, Zhou C, Yuan X, Zhang J, Deng L. MSCFS: inferring circRNA functional similarity based on multiple data sources. BMC Bioinformatics 2021; 22:371. [PMID: 34271851 PMCID: PMC8285884 DOI: 10.1186/s12859-021-04287-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2021] [Accepted: 07/06/2021] [Indexed: 12/13/2022] Open
Abstract
Background More and more evidence shows that circRNA plays an important role in various biological processes and human health. Therefore, inferring the circRNA’s potential functions and obtaining circRNA functional similarity has become more and more significant. However, there is no effective approach to explore the functional similarity of circRNAs. Methods In this paper, we propose a new approach, called MSCFS, to calculate the functional similarity of circRNA by integrating multiple data sources. We combine circRNA-disease association, circRNA-gene-Gene Ontology association, and circRNA sequence information to explore the functional similarity of circRNA. Firstly, we employ different learning representation methods from three data sources to establish three circRNA functional similarity networks. Then we integrate the three networks to obtain the final circRNA functional similarity. Results We utilize circRNA–miRNA association similarity and circRNA co-expression similarity to evaluate the performance of MSCFS. The results show a positive correlation with miRNA association (R = 0.213) and circRNA co-expression similarity (R = 0.8991). Finally, we construct a circRNA functional similarity network and perform case analysis. The result shows our method can be applied to infer new potential functions of circRNA and other associations. Conclusions MSCFS combines multiple data sources related to circRNA functions. Correlation analysis and case analyses prove that MSCFS is a useful method to explore circRNA functional similarity.
Affiliation(s)
- Liang Shu
  - School of Computer Science and Engineering, Central South University, Lushangnan Road, Changsha, China
- Cheng Zhou
  - School of Computer Science and Engineering, Central South University, Lushangnan Road, Changsha, China
- Xinxu Yuan
  - Department of Chemical and Life Science Engineering, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Jingpu Zhang
  - School of Computer and Data Science, Henan University of Urban Construction, Longxiang Road, Pingdingshan, 467000, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Lushangnan Road, Changsha, China
|
28
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of the miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, the experimental methods are costly and time-consuming. Thus, it is urgent to develop computational methods towards the prediction of MDAs. Based on the graph theory, the MDA prediction is regarded as a node classification task in the present study. To solve this task, we propose a novel method MDA-GCNFTG, which predicts MDAs based on Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve the training efficiency and accuracy. This method models both the potential connections of feature space and the structural relationships of MDA data. The nodes of the graphs are represented by the disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, we considered six tasks simultaneously on the MDA prediction problem for the first time, which ensures that under both balanced and unbalanced sample distribution, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that the MDA-GCNFTG method has achieved satisfactory performance on all six tasks and is significantly superior to the classic machine learning methods and the state-of-the-art MDA prediction methods. Moreover, the effectiveness of GCNs via the graph sampling strategy and the feature and topology graph in MDA-GCNFTG has also been demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
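For readers unfamiliar with the Gaussian interaction profile (GIP) kernel similarity named in this abstract, the following is a minimal sketch of its commonly used definition, K(i, j) = exp(-γ‖IP(i) − IP(j)‖²) with γ = γ′ divided by the mean squared norm of the interaction profiles. It is not taken from the MDA-GCNFTG code; the function name `gip_kernel`, the parameter `gamma_prime`, and the toy association matrix are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def gip_kernel(adjacency: Sequence[Sequence[float]],
               gamma_prime: float = 1.0) -> List[List[float]]:
    """Gaussian interaction profile kernel between the rows of `adjacency`.

    Each row is the interaction profile of one entity (e.g. one miRNA
    against all diseases). `gamma_prime` is a hypothetical name for the
    bandwidth parameter, often set to 1.
    """
    n = len(adjacency)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in adjacency) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(adjacency[i], adjacency[j]))
             for j in range(n)]
            for i in range(n)]


# Toy association matrix (illustrative only): 3 miRNAs x 3 diseases.
A = [[1, 0, 1],
     [1, 0, 1],
     [0, 1, 0]]
K = gip_kernel(A)
```

Entities with identical profiles (rows 0 and 1 here) receive similarity 1, and the similarity decays as profiles diverge.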
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Xuhong Wang
  - School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
- Qiuying Dai
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Yanjing Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Qiankun Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Shaoliang Peng
  - College of Computer Science and Electronic Engineering, Hunan University, China
- Dennis Russell Salahub
  - Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
|
29
|
Chen Y, Wu T, Zhu Z, Huang H, Zhang L, Goel A, Yang M, Wang X. An integrated workflow for biomarker development using microRNAs in extracellular vesicles for cancer precision medicine. Semin Cancer Biol 2021; 74:134-155. [PMID: 33766650 DOI: 10.1016/j.semcancer.2021.03.011] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 12/07/2020] [Revised: 03/13/2021] [Accepted: 03/16/2021] [Indexed: 02/06/2023]
Abstract
EV-miRNAs are microRNA (miRNA) molecules encapsulated in extracellular vesicles (EVs), which play crucial roles in tumor pathogenesis, progression, and metastasis. Recent studies about EV-miRNAs have gained novel insights into cancer biology and have demonstrated a great potential to develop novel liquid biopsy assays for various applications. Notably, compared to conventional liquid biomarkers, EV-miRNAs are more advantageous in representing host-cell molecular architecture and exhibiting higher stability and specificity. Despite various available techniques for EV-miRNA separation, concentration, profiling, and data analysis, a standardized approach for EV-miRNA biomarker development is yet lacking. In this review, we performed a substantial literature review and distilled an integrated workflow encompassing important steps for EV-miRNA biomarker development, including sample collection and EV isolation, EV-miRNA extraction and quantification, high-throughput data preprocessing, biomarker prioritization and model construction, functional analysis, as well as validation. With the rapid growth of "big data", we highlight the importance of efficient mining of high-throughput data for the discovery of EV-miRNA biomarkers and integrating multiple independent datasets for in silico and experimental validations to increase the robustness and reproducibility. Furthermore, as an efficient strategy in systems biology, network inference provides insights into the regulatory mechanisms and can be used to select functionally important EV-miRNAs to refine the biomarker candidates. Despite the encouraging development in the field, a number of challenges still hinder the clinical translation. We finally summarize several common challenges in various biomarker studies and discuss potential opportunities emerging in the related fields.
Affiliation(s)
- Yu Chen
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Tan Wu
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Zhongxu Zhu
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Hao Huang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
- Liang Zhang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
  - Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong
  - Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
- Ajay Goel
  - Department of Molecular Diagnostics and Experimental Therapeutics, Beckman Research Institute of City of Hope Comprehensive Cancer Center, Duarte, CA, USA
- Mengsu Yang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
  - Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong
  - Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
- Xin Wang
  - Department of Biomedical Sciences, City University of Hong Kong, 31 To Yuen Street, Kowloon Tong, Hong Kong
  - Tung Biomedical Sciences Centre, City University of Hong Kong, Hong Kong
  - Key Laboratory of Biochip Technology, Biotech and Health Centre, Shenzhen Research Institute, City University of Hong Kong, Shenzhen, Guangdong Province, China
|
30
|
Wang H, Ding Y, Tang J, Zou Q, Guo F. Identify RNA-associated subcellular localizations based on multi-label learning using Chou's 5-steps rule. BMC Genomics 2021; 22:56. [PMID: 33451286 PMCID: PMC7811227 DOI: 10.1186/s12864-020-07347-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 08/05/2020] [Accepted: 12/22/2020] [Indexed: 12/04/2022] Open
Abstract
BACKGROUND Biological functions of biomolecules rely on the cellular compartments where they are located in cells. Importantly, RNAs are assigned in specific locations of a cell, enabling the cell to implement diverse biochemical processes in the way of concurrency. However, lots of existing RNA subcellular localization classifiers only solve the problem of single-label classification. It is of great practical significance to expand RNA subcellular localization into a multi-label classification problem. RESULTS In this study, we extract multi-label classification datasets about RNA-associated subcellular localizations on various types of RNAs, and then construct subcellular localization datasets on four RNA categories. In order to study Homo sapiens, we further establish human RNA subcellular localization datasets. Furthermore, we utilize different nucleotide property composition models to extract effective features to adequately represent the important information of nucleotide sequences. In the most critical part, we address a major challenge, namely fusing the multivariate information through multiple kernel learning based on Hilbert-Schmidt independence criterion. The optimal combined kernel can be put into an integration support vector machine model for identifying multi-label RNA subcellular localizations. Our method obtained excellent results of 0.703, 0.757, 0.787, and 0.800, respectively on four RNA data sets on average precision. CONCLUSION To be specific, our novel method outperforms other prediction tools on novel benchmark datasets. Moreover, we establish a user-friendly web server with the implementation of our method.
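The average precision figures quoted in this abstract are a multi-label ranking metric. As a rough illustration only, the sketch below computes the example-based average precision commonly used in multi-label learning: for each sample, the mean over its true labels of the fraction of true labels ranked at or above that label. Whether the cited paper uses exactly this variant is an assumption, and the function name, scores, and labels are hypothetical.

```python
from typing import Sequence


def average_precision(scores: Sequence[Sequence[float]],
                      labels: Sequence[Sequence[int]]) -> float:
    """Example-based average precision for multi-label predictions."""
    total, counted = 0.0, 0
    for row_scores, row_labels in zip(scores, labels):
        relevant = [j for j, y in enumerate(row_labels) if y == 1]
        if not relevant:
            continue  # samples without any true label are skipped
        # Rank labels by descending score (rank 1 = highest score).
        order = sorted(range(len(row_scores)), key=lambda j: -row_scores[j])
        rank = {j: r for r, j in enumerate(order, start=1)}
        precision_sum = 0.0
        for j in relevant:
            # Count true labels ranked at or above label j.
            hits = sum(1 for k in relevant if rank[k] <= rank[j])
            precision_sum += hits / rank[j]
        total += precision_sum / len(relevant)
        counted += 1
    return total / counted if counted else 0.0


# Hypothetical scores for two RNAs over four subcellular locations.
scores = [[0.9, 0.2, 0.6, 0.1],
          [0.3, 0.8, 0.1, 0.7]]
labels = [[1, 0, 1, 0],
          [1, 0, 0, 1]]
ap = average_precision(scores, labels)
```

In this toy case the first sample ranks both true locations on top (precision 1), while the second ranks a false location first, which lowers the overall value.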
Affiliation(s)
- Hao Wang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yijie Ding
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - School of Computational Science and Engineering, University of South Carolina, Columbia, 29208, SC, US
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, Sichuan, China
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
31
|
Li M, Wu M, Qin Y, Liu H, Tu C, Shen B, Xu X, Chen H. Differentially expressed serum proteins in children with or without asthma as determined using isobaric tags for relative and absolute quantitation proteomics. PeerJ 2020; 8:e9971. [PMID: 33194371 PMCID: PMC7646293 DOI: 10.7717/peerj.9971] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/03/2020] [Accepted: 08/26/2020] [Indexed: 12/31/2022] Open
Abstract
Background Although asthma is one of the most common chronic, noncommunicable diseases worldwide, the pathogenesis of childhood asthma is not yet clear. Genetic factors and environmental factors may lead to airway immune-inflammation responses and an imbalance of airway nerve regulation. The aim of the present study was to determine which serum proteins are differentially expressed between children with or without asthma and to ascertain the potential roles that these differentially expressed proteins (DEPs) may play in the pathogenesis of childhood asthma. Methods Serum samples derived from four children with asthma and four children without asthma were collected. The DEPs were identified by using isobaric tags for relative and absolute quantitation (iTRAQ) combined with liquid chromatography tandem mass spectrometry (LC-MS/MS) analyses. Using biological information technology, including Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Cluster of Orthologous Groups of Proteins (COG) databases and analyses, we determined the biological processes associated with these DEPs. Key protein glucose-6-phosphate dehydrogenase (G6PD) was verified by enzyme linked immunosorbent assay (ELISA). Results We found 46 DEPs in serum samples of children with asthma vs. children without asthma. Among these DEPs, 12 proteins were significantly (>1.5 fold change) upregulated and 34 proteins were downregulated. The results of GO analyses showed that the DEPs were mainly involved in binding, the immune system, or responding to stimuli or were part of a cellular anatomical entity. In the KEGG signaling pathway analysis, most of the downregulated DEPs were associated with cardiomyopathy, phagosomes, viral infections, and regulation of the actin cytoskeleton. The results of a COG analysis showed that the DEPs were primarily involved in signal transduction mechanisms and posttranslational modifications. 
These DEPs were associated with and may play important roles in the immune response, the inflammatory response, extracellular matrix degradation, and the nervous system. The downregulation of G6PD in the asthma group was confirmed by ELISA. Conclusion After bioinformatics analyses, we found numerous DEPs that may play important roles in the pathogenesis of childhood asthma. Those proteins may be novel biomarkers of childhood asthma and may provide new clues for the early clinical diagnosis and treatment of childhood asthma.
Affiliation(s)
- Ming Li
  - Department of Neonatology, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Mingzhu Wu
  - Department of Obstetrics and Gynecology, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Ying Qin
  - School of Basic Medicine, Anhui Medical University, Hefei, Anhui, China
- Huaqing Liu
  - Department of Neonatology, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Chengcheng Tu
  - Department of Obstetrics and Gynecology, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Bing Shen
  - School of Basic Medicine, Anhui Medical University, Hefei, Anhui, China
- Xiaohong Xu
  - Department of Clinical Laboratory, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Hongbo Chen
  - Department of Obstetrics and Gynecology, Maternal and Child Health Hospital, the Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
|
32
|
Liu M, Yang J, Wang J, Deng L. Predicting miRNA-disease associations using a hybrid feature representation in the heterogeneous network. BMC Med Genomics 2020; 13:153. [PMID: 33087118 PMCID: PMC7579981 DOI: 10.1186/s12920-020-00783-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022] Open
Abstract
BACKGROUND Studies have found that miRNAs play an important role in many biological activities involved in human diseases. Revealing the associations between miRNA and disease by biological experiments is time-consuming and expensive. The computational approaches provide a new alternative. However, because of the limited knowledge of the associations between miRNAs and diseases, it is difficult to support the prediction model effectively. METHODS In this work, we propose a model to predict miRNA-disease associations, MDAPCOM, in which protein information associated with miRNAs and diseases is introduced to build a global miRNA-protein-disease network. Subsequently, diffusion features and HeteSim features, extracted from the global network, are combined to train the prediction model by eXtreme Gradient Boosting (XGBoost). RESULTS The MDAPCOM model achieves an AUC of 0.991 based on 10-fold cross-validation, which is significantly better than that of two other state-of-the-art methods, RWRMDA and PRINCE. Furthermore, the model performs well on three unbalanced data sets. CONCLUSIONS The results suggest that the information behind proteins associated with miRNAs and diseases is crucial to the prediction of the associations between miRNAs and diseases, and the hybrid feature representation in the heterogeneous network is very effective for improving predictive performance.
Affiliation(s)
- Minghui Liu
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Jingyi Yang
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Jiacheng Wang
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
- Lei Deng
  - School of Computer Science and Engineering, Central South University, Changsha, 410075, China
  - School of Software, Xinjiang University, Urumqi, 830008, China
|
33
|
Huang YA, Hu P, Chan KCC, You ZH. Graph convolution for predicting associations between miRNA and drug resistance. Bioinformatics 2020; 36:851-858. [PMID: 31397851 DOI: 10.1093/bioinformatics/btz621] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 08/10/2018] [Revised: 07/17/2019] [Accepted: 08/08/2019] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION MicroRNA (miRNA) therapeutics is becoming increasingly important. However, aberrant expression of miRNAs is known to cause drug resistance and can become an obstacle for miRNA-based therapeutics. At present, little is known about associations between miRNA and drug resistance and there is no computational tool available for predicting such association relationship. Since it is known that miRNAs can regulate genes that encode specific proteins that are keys for drug efficacy, we propose here a computational approach, called GCMDR, for finding a three-layer latent factor model that can be used to predict miRNA-drug resistance associations. RESULTS In this paper, we discuss how the problem of predicting such associations can be formulated as a link prediction problem involving a bipartite attributed graph. GCMDR makes use of the technique of graph convolution to build a latent factor model, which can effectively utilize information of high-dimensional attributes of miRNA/drug in an end-to-end learning scheme. In addition, GCMDR also learns graph embedding features for miRNAs and drugs. We leveraged the data from multiple databases storing miRNA expression profile, drug substructure fingerprints, gene ontology and disease ontology. The test for performance shows that the GCMDR prediction model can achieve AUCs of 0.9301 ± 0.0005, 0.9359 ± 0.0006 and 0.9369 ± 0.0003 based on 2-fold, 5-fold and 10-fold cross validation, respectively. Using this model, we show that the associations between miRNA and drug resistance can be reliably predicted by properly introducing useful side information like miRNA expression profile and drug structure fingerprints. AVAILABILITY AND IMPLEMENTATION Python codes and dataset are available at https://github.com/yahuang1991polyu/GCMDR/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yu-An Huang
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR, 999077, China
- Pengwei Hu
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR, 999077, China
- Keith C C Chan
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR, 999077, China
- Zhu-Hong You
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR, 999077, China
  - Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science, Urumqi 830011, China
|
34
|
miRNALoc: predicting miRNA subcellular localizations based on principal component scores of physico-chemical properties and pseudo compositions of di-nucleotides. Sci Rep 2020; 10:14557. [PMID: 32884018 PMCID: PMC7471944 DOI: 10.1038/s41598-020-71381-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 01/13/2020] [Accepted: 07/07/2020] [Indexed: 12/20/2022] Open
Abstract
MicroRNAs (miRNAs), one kind of non-coding RNA, play a vital role in regulating several physiological and developmental processes. Subcellular localization of miRNAs and their abundance in the native cell are central for maintaining physiological homeostasis. Besides, the RNA silencing activity of miRNAs is also influenced by their localization and stability. Thus, the development of a computational method for subcellular localization prediction of miRNAs is desired. In this work, we have proposed a computational method for predicting subcellular localizations of miRNAs based on principal component scores of thermodynamic, structural properties and pseudo compositions of di-nucleotides. Prediction accuracy was analyzed following fivefold cross validation, where ~63–71% of AUC-ROC and ~69–76% of AUC-PR were observed. When evaluated with an independent test set, >50% of localizations were found to be correctly predicted. Besides, the developed computational model achieved higher accuracy than the existing methods. A user-friendly prediction server “miRNALoc” is freely accessible at https://cabgrid.res.in:8080/mirnaloc/, by which the user can predict localizations of miRNAs.
|
35
|
Zheng K, You ZH, Wang L, Guo ZH. iMDA-BN: Identification of miRNA-disease associations based on the biological network and graph embedding algorithm. Comput Struct Biotechnol J 2020; 18:2391-2400. [PMID: 33005302 PMCID: PMC7508695 DOI: 10.1016/j.csbj.2020.08.023] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/04/2020] [Revised: 08/24/2020] [Accepted: 08/26/2020] [Indexed: 11/30/2022] Open
Abstract
Benefiting from advances in high-throughput experimental techniques, important regulatory roles of miRNAs, lncRNAs, and proteins, as well as biological property information, are gradually being complemented. As the key data support to promote biomedical research, domain knowledge such as intermolecular relationships that are increasingly revealed by molecular genome-wide analysis is often used to guide the discovery of potential associations. However, the method of performing network representation learning from the perspective of the global biological network is scarce. These methods cover a very limited type of molecular associations and are therefore not suitable for more comprehensive analysis of molecular network representation information. In this study, we propose a computational model based on the Biological network for predicting potential associations between miRNAs and diseases called iMDA-BN. The iMDA-BN has three significant advantages: I) It uses a new method to describe disease and miRNA characteristics which analyzes node representation information for disease and miRNA from the perspective of biological networks. II) It can predict unproven associations even if miRNAs and diseases do not appear in the biological network. III) Accurate description of miRNA characteristics from biological properties based on high-throughput sequence information. The iMDA-BN predictor achieves an AUC of 0.9145 and an accuracy of 84.49% on the miRNA-disease association baseline dataset, and it can also achieve an AUC of 0.8765 and an accuracy of 80.96% when predicting unknown diseases and miRNAs in the biological network. Compared to existing miRNA-disease association prediction methods, iMDA-BN has higher accuracy and the advantage of predicting unknown associations. In addition, 45, 49, and 49 of the top 50 miRNA-disease associations with the highest predicted scores were confirmed in the case studies, respectively.
Collapse
Affiliation(s)
- Kai Zheng
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China
| | - Zhen-Hao Guo
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| |
|
36
|
Huang YA, Chan KCC, You ZH, Hu P, Wang L, Huang ZA. Predicting microRNA-disease associations from lncRNA-microRNA interactions via Multiview Multitask Learning. Brief Bioinform 2020; 22:5868072. [PMID: 32633319 DOI: 10.1093/bib/bbaa133] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/15/2020] [Revised: 05/26/2020] [Accepted: 06/01/2020] [Indexed: 01/16/2023] Open
Abstract
MOTIVATION Identifying microRNAs that are associated with different diseases as biomarkers is a problem of great medical significance. Existing computational methods for uncovering such microRNA-disease associations (MDAs) are mostly developed under the assumption that similar microRNAs tend to associate with similar diseases. Since this assumption is not always valid, these methods may not be applicable to all kinds of MDAs. Considering that the relationships between long noncoding RNAs (lncRNAs) and different diseases, and the co-regulation relationships between the biological functions of lncRNAs and microRNAs, have been established, we propose here a multiview multitask method that makes use of known lncRNA-microRNA interactions to predict MDAs on a large scale. The investigation is performed without complete information on microRNAs and without any similarity measurement for them and, to the best of our knowledge, the work represents the first attempt to discover MDAs based on lncRNA-microRNA interactions. RESULTS In this paper, we develop a deep learning model called MVMTMDA that can create a multiview representation of microRNAs. The model is trained with an end-to-end multitask learning approach so that missing data in the side information can be determined automatically. Experimental results show that the proposed model yields an average area under the ROC curve of 0.8410 ± 0.018, 0.8512 ± 0.012 and 0.8521 ± 0.008 when k is set to 2, 5 and 10, respectively. In addition, we also propose a statistical approach to predicting lncRNA-disease associations based on these associations and the MDAs discovered using MVMTMDA. AVAILABILITY Python code and the datasets used in our studies are available at https://github.com/yahuang1991polyu/MVMTMDA/.
Affiliation(s)
- Yu-An Huang
- Department of Computing at the Hong Kong Polytechnic University
| | - Keith C C Chan
- Systems Design Engineering from the University of Waterloo, Canada
| | | | - Pengwei Hu
- Department of Computing, The Hong Kong Polytechnic University, Hong Kong
| | - Lei Wang
- China University of Mining and Technology
| | | |
|
37
|
FCGCNMDA: predicting miRNA-disease associations by applying fully connected graph convolutional networks. Mol Genet Genomics 2020; 295:1197-1209. [DOI: 10.1007/s00438-020-01693-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/02/2023]
|
38
|
Ding Y, Chen B, Lei X, Liao B, Wu FX. Predicting novel CircRNA-disease associations based on random walk and logistic regression model. Comput Biol Chem 2020; 87:107287. [PMID: 32446243 DOI: 10.1016/j.compbiolchem.2020.107287] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 02/04/2020] [Accepted: 05/09/2020] [Indexed: 12/24/2022]
Abstract
Circular RNAs (circRNAs), a large group of small endogenous noncoding RNA molecules, have been shown to modulate protein-coding genes in the human genome. In recent years, many experimental studies have demonstrated that circRNAs are dysregulated in a number of diseases and that they can serve as biomarkers for disease diagnosis and prognosis. However, identifying circRNA-disease associations by biological experiments is expensive and time-consuming, and few computational models have been proposed for predicting novel circRNA-disease associations. In this study, we develop a computational model based on random walk and logistic regression (RWLR) to predict circRNA-disease associations. First, a circRNA-circRNA similarity network is constructed by calculating the functional similarity of circRNAs based on circRNA-related gene ontology. Then, a random walk with restart is run on the circRNA similarity network, and the features of each circRNA-disease pair are extracted from the results of the random walk and the circRNA-disease association matrix. Finally, a logistic regression model is used to predict novel circRNA-disease associations. Leave-one-out cross validation (LOOCV), five-fold cross validation (5CV) and ten-fold cross validation (10CV) are adopted to evaluate the prediction performance of RWLR against the two latest methods, PWCDA and DWNN-RLS. The experimental results show that RWLR achieves higher AUC values under LOOCV, 5CV and 10CV than both of these methods, which indicates that RWLR performs better than the compared computational methods. Case studies further illustrate the reliability and effectiveness of RWLR for circRNA-disease association prediction.
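The random walk with restart step mentioned in this abstract can be illustrated with a minimal pure-Python sketch. It is not the RWLR implementation: the restart probability of 0.7, the convergence tolerance, the function name `random_walk_with_restart` and the three-node toy similarity network are all assumptions chosen for illustration.

```python
def random_walk_with_restart(similarity, seed, restart=0.7, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk restarted at `seed`.

    `similarity` is a square list-of-lists of non-negative weights.
    The restart probability and tolerance are illustrative assumptions.
    """
    n = len(similarity)
    # Column-normalise the weights so each column holds transition probabilities.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    W = [[similarity[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        # One update: p <- (1 - r) * W p + r * p0
        nxt = [(1 - restart) * sum(W[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Hypothetical 3-node similarity network; node 0 is the seed.
toy = [[0.0, 0.8, 0.1],
       [0.8, 0.0, 0.3],
       [0.1, 0.3, 0.0]]
scores = random_walk_with_restart(toy, seed=0)
```

In a pipeline like the one described, steady-state scores of this kind would be combined with the association matrix to build features for the logistic regression step; how RWLR does that exactly is not stated in the abstract.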
Affiliation(s)
- Yulian Ding
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 1L5, Canada
| | - Bolin Chen
- School of Computer Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
| | - Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
| | - Bo Liao
- School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China
| | - Fang-Xiang Wu
- Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 1L5, Canada; Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada; Department of Computer Science, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada.
| |
|
39
|
Li J, Zhang S, Wan Y, Zhao Y, Shi J, Zhou Y, Cui Q. MISIM v2.0: a web server for inferring microRNA functional similarity based on microRNA-disease associations. Nucleic Acids Res 2019; 47:W536-W541. [PMID: 31069374 PMCID: PMC6602518 DOI: 10.1093/nar/gkz328] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Received: 01/27/2019] [Revised: 04/14/2019] [Accepted: 04/25/2019] [Indexed: 01/11/2023] Open
Abstract
MicroRNAs (miRNAs) are a class of important small non-coding RNA molecules that play critical roles in health and disease. It is therefore important to evaluate the functional relationships among miRNAs and then predict novel miRNA-disease associations. For this purpose, we developed the updated web server MISIM (miRNA similarity) v2.0. Besides a 3-fold increase in data content compared with MISIM v1.0, MISIM v2.0 improves the original MISIM algorithm by using both positive and negative miRNA-disease associations. That is, MISIM v2.0 scores can be positive or negative, whereas MISIM v1.0 only produced positive scores. Moreover, MISIM v2.0 implements an algorithm for novel miRNA-disease prediction based on MISIM v2.0 scores. Finally, MISIM v2.0 provides network visualization and functional enrichment analysis for functionally paired miRNAs. The MISIM v2.0 web server is freely accessible at http://www.lirmed.com/misim/.
Affiliation(s)
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China; Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Shan Zhang
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Yanping Wan
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Yingshu Zhao
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin 300401, China
| | - Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing 100191, China; Sanbo Brain Institute, Sanbo Brain Hospital, Capital Medical University, Beijing 100093, China
| |
|
40
|
Chen H, Guo R, Li G, Zhang W, Zhang Z. Comparative analysis of similarity measurements in miRNAs with applications to miRNA-disease association predictions. BMC Bioinformatics 2020; 21:176. [PMID: 32366225 PMCID: PMC7199309 DOI: 10.1186/s12859-020-3515-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/12/2019] [Accepted: 04/23/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND As regulators of gene expression, microRNAs (miRNAs) are increasingly recognized as critical biomarkers of human diseases. To date, a series of computational methods have been proposed to predict new miRNA-disease associations based on similarity measurements. These methods use different categories of miRNA features to calculate miRNA-miRNA similarity. Benchmarking tests on these miRNA similarity measures are warranted to assess their effectiveness and robustness. RESULTS In this study, 5 categories of features, i.e. miRNA sequences, miRNA expression profiles in cell lines, miRNA expression profiles in tissues, gene ontology (GO) annotations of miRNA target genes and Medical Subject Heading (MeSH) terms of miRNA-associated diseases, are collected, and similarity values between miRNAs are quantified in each of these feature spaces. We systematically compare the 5 similarities from multiple statistical views. Furthermore, we adopt a rule-based inference method to test the performance of the similarity measurements on miRNA-disease association prediction. A comprehensive comparison is made based on leave-one-out cross-validation and a case study. Experimental results demonstrate that the similarity measurement using MeSH terms performs best among the 5 measurements. It should be noted that the other 4 measurements can also achieve reliable prediction performance. The best-performing similarity measurement is used for new miRNA-disease association predictions, and the inferred results are released for further biomedical screening. CONCLUSIONS Our study suggests that all 5 features, even though some are restricted by data availability, are useful for inferring novel miRNA-disease associations. However, GO- and MeSH-based similarity measurements might produce biased prediction results due to incomplete feature spaces. Similarity fusion may help produce more reliable prediction results. We expect that future studies will provide more detailed information on the 5 feature spaces and widen our understanding of disease pathogenesis.
Affiliation(s)
- Hailin Chen
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| | - Ruiyu Guo
- School of Software, East China Jiaotong University, Nanchang, 330013 China
| | - Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang, 330013 China
| | - Wei Zhang
- School of Science, East China Jiaotong University, Nanchang, 330013 China
| | - Zuping Zhang
- School of Computer Science and Engineering, Central South University, Changsha, 410083 China
| |
|
41
|
Geng L, Tang X, Wang S, Sun Y, Wang D, Tsao BP, Feng X, Sun L. Reduced Let-7f in Bone Marrow-Derived Mesenchymal Stem Cells Triggers Treg/Th17 Imbalance in Patients With Systemic Lupus Erythematosus. Front Immunol 2020; 11:233. [PMID: 32133007 PMCID: PMC7040072 DOI: 10.3389/fimmu.2020.00233] [Citation(s) in RCA: 37] [Impact Index Per Article: 7.4] [Received: 06/10/2019] [Accepted: 01/28/2020] [Indexed: 12/30/2022] Open
Abstract
Systemic lupus erythematosus (SLE) patients exhibit an imbalance between regulatory T (Treg) cells and T helper 17 (Th17) cells, to which defective immune regulation by bone marrow-derived mesenchymal stem cells (BM-MSCs) from SLE patients might contribute. Our microRNA array analysis showed markedly down-regulated expression levels of microRNA let-7f in BM-MSCs from SLE patients compared to those from normal controls (NOR). To explore the role of let-7f in the disease pathogenesis, we showed that expression levels of let-7f in SLE BM-MSCs were negatively associated with SLE disease activity, and that expression of interleukin-6 (IL-6), a predicted target gene of the let-7 family, was significantly higher in BM-MSCs from SLE patients than in those from NOR. Transient transfection of BM-MSCs with let-7f mimics or inhibitors showed that reduced levels of let-7f impaired the proliferation rate of BM-MSCs and the BM-MSC-mediated downregulation of Th17 cells and upregulation of Treg cells, and increased the apoptosis rate of BM-MSCs through targeting IL-6 and activating the signal transducer and activator of transcription 3 (STAT3) pathway, but had no significant effect on the differentiation of Th1 and Th2 cells. Our findings show a key role of let-7f in the Treg/Th17 imbalance mediated by SLE BM-MSCs, suggesting the potential of manipulating let-7f expression in BM-MSCs for treating SLE patients.
Affiliation(s)
- Linyu Geng
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China; Division of Rheumatology & Immunology, Medical University of South Carolina, Charleston, SC, United States
| | - Xiaojun Tang
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| | - Shiying Wang
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| | - Yue Sun
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| | - Dandan Wang
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| | - Betty P Tsao
- Division of Rheumatology & Immunology, Medical University of South Carolina, Charleston, SC, United States
| | - Xuebing Feng
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| | - Lingyun Sun
- Department of Rheumatology and Immunology, The Affiliated Drum Tower Hospital of Nanjing University Medical School, Nanjing, China
| |
|
42
|
Peng LH, Zhou LQ, Chen X, Piao X. A Computational Study of Potential miRNA-Disease Association Inference Based on Ensemble Learning and Kernel Ridge Regression. Front Bioeng Biotechnol 2020; 8:40. [PMID: 32117922 PMCID: PMC7015868 DOI: 10.3389/fbioe.2020.00040] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 11/20/2019] [Accepted: 01/17/2020] [Indexed: 12/11/2022] Open
Abstract
As a growing number of experimental studies have shown that microRNAs (miRNAs) are closely related to multiple biological processes and to the prevention, diagnosis and treatment of human diseases, more and more researchers are focusing on identifying associations between miRNAs and diseases. Identifying such associations purely via experiments is costly and demanding, which prompts researchers to develop computational methods to complement the experiments. In this paper, a novel prediction model named Ensemble of Kernel Ridge Regression based MiRNA-Disease Association prediction (EKRRMDA) was developed. EKRRMDA obtained features of miRNAs and diseases by integrating the disease semantic similarity, the miRNA functional similarity and the Gaussian interaction profile kernel similarity for diseases and miRNAs. Under a computational framework that used ensemble learning and feature dimensionality reduction, multiple base classifiers, each combining two Kernel Ridge Regression classifiers (one from the miRNA side and one from the disease side), were obtained based on random selection of features. The outputs of these base classifiers were then averaged to obtain the final association scores of miRNA-disease pairs. In global and local leave-one-out cross validation, EKRRMDA attained AUCs of 0.9314 and 0.8618, respectively. Moreover, the model's average AUC with standard deviation in 5-fold cross validation was 0.9275 ± 0.0008. In addition, we implemented three different types of case studies on predicting miRNAs associated with five important diseases. As a result, 90% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 86% (Lymphoma), 98% (Lung Neoplasms), and 96% (Breast Neoplasms) of the top 50 predicted miRNAs were verified to have associations with these diseases.
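The Gaussian interaction profile (GIP) kernel similarity named in this abstract is commonly defined as K(i, j) = exp(-γ ‖IP(i) - IP(j)‖²), where IP(i) is row i of the binary association matrix and γ is a bandwidth parameter γ' divided by the mean squared norm of the profiles. The sketch below follows that common definition; whether EKRRMDA uses exactly this form, the choice γ' = 1, the function name `gip_kernel` and the toy association matrix are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    `profiles` is a binary association matrix (list of lists); each row is the
    interaction profile of one entity. gamma_prime = 1.0 is an assumed default.
    """
    n = len(profiles)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(x * x for x in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm else 0.0
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Hypothetical association matrix: rows are miRNAs, columns are diseases.
assoc = [[1, 0, 1],
         [1, 0, 0],
         [0, 1, 0]]
K = gip_kernel(assoc)
```

Applying the same function to the transposed matrix would give the disease-side kernel under the same assumptions.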
Affiliation(s)
- Li-Hong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Li-Qian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Xue Piao
- School of Medical Informatics, Xuzhou Medical University, Xuzhou, China
| |
|
43
|
Zheng K, You ZH, Wang L, Zhou Y, Li LP, Li ZW. DBMDA: A Unified Embedding for Sequence-Based miRNA Similarity Measure with Applications to Predict and Validate miRNA-Disease Associations. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 19:602-611. [PMID: 31931344 PMCID: PMC6957846 DOI: 10.1016/j.omtn.2019.12.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 07/08/2019] [Revised: 10/09/2019] [Accepted: 12/10/2019] [Indexed: 11/24/2022]
Abstract
MicroRNAs (miRNAs) play a critical role in human diseases. Determining the associations between miRNAs and diseases contributes to elucidating the pathogenesis of liver diseases and to finding effective treatments. Despite great recent advances in the study of miRNA-disease associations, verifying and recognizing associations efficiently at scale presents serious challenges to biological experimental approaches. Thus, computational methods for predicting miRNA-disease associations have become a research hotspot. In this paper, we present a new computational method, named distance-based sequence similarity for miRNA-disease association prediction (DBMDA), that directly learns a mapping from miRNA sequences to a Euclidean space. The notable feature of our approach is that it infers global similarity from region distances, which can be computed with the chaos game representation algorithm from the miRNA sequences. In the 5-fold cross-validation experiment, the area under the curve (AUC) obtained by DBMDA in predicting potential miRNA-disease associations reached 0.9129. To assess the effectiveness of DBMDA more thoroughly, we compared it with different classifiers and earlier prediction models. In addition, we conducted two case studies, for prostate neoplasms and colon neoplasms. Results show that 39 and 39 of the top 40 predicted miRNAs were confirmed by other databases, respectively. DBMDA makes a new attempt at sequence similarity and achieves excellent results, while also providing a new perspective for predicting the relationship between diseases and miRNAs. The source code and datasets explored in this work are available online from the University of Chinese Academy of Sciences (http://220.171.34.3:81/).
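The chaos game representation (CGR) mentioned in this abstract maps a nucleotide sequence to points in the unit square: starting from the centre, each base moves the current point halfway toward the corner assigned to that base. The sketch below shows that mapping and a simple count of points per grid region. The corner assignment, the 2 x 2 grid, the function names and the example sequence are assumptions, and the abstract does not specify how DBMDA turns regions into distances.

```python
# Assumed corner assignment; other conventions exist.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def chaos_game_points(seq):
    """Return the CGR point reached after each base of an RNA sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq.upper().replace("T", "U"):
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def region_counts(points, grid=2):
    """Count CGR points falling in each cell of a grid x grid partition."""
    counts = [[0] * grid for _ in range(grid)]
    for x, y in points:
        col = min(int(x * grid), grid - 1)
        row = min(int(y * grid), grid - 1)
        counts[row][col] += 1
    return counts


pts = chaos_game_points("ACGU")   # hypothetical 4-base example
cells = region_counts(pts)
```

With a 2 x 2 grid each point lands in the quadrant of its own base, so finer grids are needed before the regions carry information about longer subsequences.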
Affiliation(s)
- Kai Zheng
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
| | - Yong Zhou
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| | - Li-Ping Li
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Zheng-Wei Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| |
|
44
|
Pan X, Shen HB. Inferring Disease-Associated MicroRNAs Using Semi-supervised Multi-Label Graph Convolutional Networks. iScience 2019; 20:265-277. [PMID: 31605942 PMCID: PMC6817654 DOI: 10.1016/j.isci.2019.09.013] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/11/2019] [Revised: 09/05/2019] [Accepted: 09/11/2019] [Indexed: 01/22/2023] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in biological processes involved in diseases. The associations between diseases and protein-coding genes (PCGs) have been well investigated, and miRNAs interact with PCGs to trigger them to be functional. We present a computational method, DimiG, to infer miRNA-associated diseases using a semi-supervised Graph Convolutional Network model (GCN). DimiG uses a multi-label framework to integrate PCG-PCG interactions, PCG-miRNA interactions, PCG-disease associations, and tissue expression profiles. DimiG is trained on disease-PCG associations and an interaction network using a GCN, which is further used to score associations between diseases and miRNAs. We evaluate DimiG on a benchmark set from verified disease-miRNA associations. Our results demonstrate that DimiG outperforms the best unsupervised method and is comparable to two supervised methods. Three case studies of prostate cancer, lung cancer, and inflammatory bowel disease further demonstrate the efficacy of DimiG, where top miRNAs predicted by DimiG are supported by literature.
Affiliation(s)
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China; Department of Medical informatics, Erasmus Medical Center, 3015 CE Rotterdam, the Netherlands.
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China.
| |
|
45
|
Zheng K, You ZH, Wang L, Zhou Y, Li LP, Li ZW. MLMDA: a machine learning approach to predict and validate MicroRNA-disease associations by integrating of heterogenous information sources. J Transl Med 2019; 17:260. [PMID: 31395072 PMCID: PMC6688360 DOI: 10.1186/s12967-019-2009-x] [Citation(s) in RCA: 54] [Impact Index Per Article: 9.0] [Received: 05/14/2019] [Accepted: 07/31/2019] [Indexed: 02/01/2023] Open
Abstract
BACKGROUND Emerging evidence shows that microRNA (miRNA) plays an important role in many complex human diseases. However, because traditional in vitro experiments are inherently time-consuming and expensive, more and more attention has been paid to developing efficient and feasible computational methods for predicting potential associations between miRNAs and diseases. METHODS In this work, we present a machine learning-based model called MLMDA for predicting miRNA-disease associations. More specifically, we first use the k-mer sparse matrix to extract miRNA sequence information and combine it with miRNA functional similarity, disease semantic similarity and Gaussian interaction profile kernel similarity information. Then, more representative features are extracted from them with a deep auto-encoder neural network (AE). Finally, a random forest classifier is used to predict potential miRNA-disease associations. RESULTS The experimental results show that the MLMDA model achieves promising performance under fivefold cross validation, with an AUC of 0.9172, which is higher than that of the methods using different classifiers or different feature combinations considered in this paper. In addition, to further evaluate the prediction performance of the MLMDA model, case studies are carried out on three complex human diseases: Lymphoma, Lung Neoplasm, and Esophageal Neoplasms. As a result, 39, 37 and 36 of the top 40 predicted miRNAs are confirmed by other miRNA-disease association databases. CONCLUSIONS These experimental results suggest that the MLMDA model could serve as a useful tool for guiding future experimental validation of promising miRNA biomarker candidates. The source code and datasets explored in this work are available at http://220.171.34.3:81/.
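As a rough illustration of k-mer based sequence features, the sketch below counts every overlapping k-mer of an RNA sequence into a fixed-length vector. It is a simplification: the abstract's "k-mer sparse matrix" may also encode the position of each k-mer, and the choice k = 3, the function name `kmer_vector` and the example sequence are assumptions.

```python
from itertools import product


def kmer_vector(seq, k=3, alphabet="ACGU"):
    """Count overlapping k-mers of `seq` into a vector of length len(alphabet)**k.

    Returns the count vector and the k-mer -> position index used to build it.
    """
    seq = seq.upper().replace("T", "U")
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    vec = [0] * len(index)
    for start in range(len(seq) - k + 1):
        kmer = seq[start:start + k]
        if kmer in index:  # skip k-mers containing ambiguous bases
            vec[index[kmer]] += 1
    return vec, index


# Hypothetical 12-base example sequence.
vec, index = kmer_vector("UGAGGUAGUAGG", k=3)
```

Most of the 64 entries stay zero for a short miRNA, which is why such representations are usually described as sparse and are often compressed, for example by an auto-encoder, before classification.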
Affiliation(s)
- Kai Zheng
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China.
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China.
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, China.
| | - Yong Zhou
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
| | - Li-Ping Li
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China
| | - Zheng-Wei Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
| |
|