151
|
Chen H, Guo R, Li G, Zhang W, Zhang Z. Comparative analysis of similarity measurements in miRNAs with applications to miRNA-disease association predictions. BMC Bioinformatics 2020; 21:176. [PMID: 32366225 PMCID: PMC7199309 DOI: 10.1186/s12859-020-3515-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/12/2019] [Accepted: 04/23/2020] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND As regulators of gene expression, microRNAs (miRNAs) are increasingly recognized as critical biomarkers of human diseases. To date, a series of computational methods have been proposed to predict new miRNA-disease associations based on similarity measurements. These methods use different categories of miRNA features to calculate miRNA-miRNA similarity. Benchmarking tests on these miRNA similarity measures are warranted to assess their effectiveness and robustness. RESULTS In this study, 5 categories of features, i.e. miRNA sequences, miRNA expression profiles in cell lines, miRNA expression profiles in tissues, gene ontology (GO) annotations of miRNA target genes and Medical Subject Heading (MeSH) terms of miRNA-associated diseases, are collected, and similarity values between miRNAs are quantified in each of these feature spaces. We systematically compare the 5 similarities from multiple statistical views. Furthermore, we adopt a rule-based inference method to test the performance of each similarity measurement on miRNA-disease association prediction. A comprehensive comparison is made based on leave-one-out cross-validation and a case study. Experimental results demonstrate that the similarity measurement using MeSH terms performs best among the 5 measurements. It should be noted that the other 4 measurements can also achieve reliable prediction performance. The best-performing similarity measurement is used for new miRNA-disease association predictions and the inferred results are released for further biomedical screening. CONCLUSIONS Our study suggests that all 5 features, even though some are restricted by data availability, provide useful information for inferring novel miRNA-disease associations. However, GO- and MeSH-based similarity measurements might produce biased prediction results due to incomplete feature spaces. Similarity fusion may help produce more reliable prediction results. We expect that future studies will provide more detailed information on the 5 feature spaces and widen our understanding of disease pathogenesis.
Affiliation(s)
- Hailin Chen
  - School of Software, East China Jiaotong University, Nanchang, 330013 China
- Ruiyu Guo
  - School of Software, East China Jiaotong University, Nanchang, 330013 China
- Guanghui Li
  - School of Information Engineering, East China Jiaotong University, Nanchang, 330013 China
- Wei Zhang
  - School of Science, East China Jiaotong University, Nanchang, 330013 China
- Zuping Zhang
  - School of Computer Science and Engineering, Central South University, Changsha, 410083 China
|
152
|
Zhang Y, Chen M, Cheng X, Wei H. MSFSP: A Novel miRNA-Disease Association Prediction Model by Federating Multiple-Similarities Fusion and Space Projection. Front Genet 2020; 11:389. [PMID: 32425980 PMCID: PMC7204399 DOI: 10.3389/fgene.2020.00389] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/28/2020] [Accepted: 03/27/2020] [Indexed: 12/11/2022] Open
Abstract
Growing evidence indicates that microRNAs (miRNAs) play a significant role in many important biological processes; their mutations and disorders can cause various complex diseases. Predicting disease-associated miRNAs via computational approaches helps identify biomarkers and discover specific medicines, which can greatly reduce the cost of diagnosis, cure, prognosis, and prevention of human diseases. However, achieving more reliable prediction of potential miRNA-disease associations through effective integration of different biological data remains a challenge for researchers. In this study, we proposed a computational model that combines multiple-similarities fusion and space projection (MSFSP). First, MSFSP fused the integrated disease similarity (composed of disease semantic similarity, disease functional similarity, and disease Hamming similarity) with the integrated miRNA similarity (composed of miRNA functional similarity, miRNA sequence similarity, and miRNA Hamming similarity). Second, it constructed the weighted network of miRNA-disease associations from the experimentally verified Boolean network of miRNA-disease associations by using similarity networks. Finally, it calculated the prediction results by weighting the miRNA space projection scores and the disease space projection scores. Leave-one-out cross-validation demonstrated that MSFSP achieved an area under the receiver operating characteristic curve (AUC) of 0.9613, better than that of five other existing models. In case studies, the predictive ability of MSFSP was further confirmed: 96 and 98% of the top 50 predictions for prostatic neoplasms and lung neoplasms, respectively, were validated by experimental evidence, and supporting experimental evidence was also found for 100% of the top 50 predictions for isolated diseases.
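As a rough illustration of the Hamming similarity named in this abstract, the sketch below assumes it is one minus the normalized Hamming distance between two binary association profiles (rows of the Boolean miRNA-disease matrix). The function name and the toy matrix are hypothetical, and the paper's exact definition may differ.

```python
def hamming_similarity(profile_a, profile_b):
    """Fraction of positions at which two equal-length binary profiles agree."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    if not profile_a:
        raise ValueError("profiles must be non-empty")
    mismatches = sum(1 for a, b in zip(profile_a, profile_b) if a != b)
    return 1.0 - mismatches / len(profile_a)


# Hypothetical Boolean association matrix: rows are miRNAs, columns are diseases.
associations = [
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 0, 1],
]

# Pairwise miRNA Hamming similarity computed from the association profiles.
mirna_similarity = [
    [hamming_similarity(row_i, row_j) for row_j in associations]
    for row_i in associations
]
```

Applying the same function to the columns of the matrix would give the corresponding disease-side similarity under the same assumption.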
Affiliation(s)
- Yi Zhang
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Min Chen
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
- Xiaohui Cheng
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Hanyan Wei
  - School of Pharmacy, Guilin Medical University, Guilin, China
|
153
|
Jiang H, Yang M, Chen X, Li M, Li Y, Wang J. miRTMC: A miRNA Target Prediction Method Based on Matrix Completion Algorithm. IEEE J Biomed Health Inform 2020; 24:3630-3641. [PMID: 32287029 DOI: 10.1109/jbhi.2020.2987034] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/06/2022]
Abstract
microRNAs (miRNAs) are small non-coding RNAs which modulate the stability of gene targets and their rates of translation into proteins at transcriptional level and post-transcriptional level. miRNA dysfunctions can lead to human diseases because of dysregulation of their targets. Correct miRNA target prediction will lead to better understanding of the mechanisms of human diseases and provide hints on curing them. In recent years, computational miRNA target prediction methods have been proposed according to the interaction rules between miRNAs and targets. However, these methods suffer from high false positive rates due to the complicated relationship between miRNAs and their targets. The rapidly growing number of experimentally validated miRNA targets enables predicting miRNA targets with high precision via accurate data analysis. Taking advantage of these known miRNA targets, a novel recommendation system model (miRTMC) for miRNA target prediction is established using a new matrix completion algorithm. In miRTMC, a heterogeneous network is constructed by integrating the miRNA similarity network, the gene similarity network, and the miRNA-gene interaction network. Our assumption is that the latent factors determining whether a gene is the target of miRNA or not are highly correlated, i.e., the adjacency matrix of the heterogeneous network is low-rank, which is then completed by using a nuclear norm regularized linear least squares model under non-negative constraints. Alternating direction method of multipliers (ADMM) is adopted to numerically solve the matrix completion problem. Our results show that miRTMC outperforms the competing methods in terms of various evaluation metrics. Our software package is available at https://github.com/hjiangcsu/miRTMC.
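For orientation, a nuclear norm regularized least squares problem with a non-negativity constraint is commonly written in the generic form below. This is a sketch of the general technique and not necessarily the exact objective used in miRTMC; the symbols M (adjacency matrix of the heterogeneous network), Ω (index set of known entries), P_Ω (projection onto those entries) and α (trade-off weight) are assumptions introduced here for illustration.

```latex
\min_{X}\; \|X\|_{*} \;+\; \frac{\alpha}{2}\,\bigl\| P_{\Omega}(X) - P_{\Omega}(M) \bigr\|_{F}^{2}
\quad \text{subject to} \quad X \ge 0
```

In an ADMM treatment of a problem of this shape, one typically introduces an auxiliary copy of X so that the nuclear norm term (handled by singular value thresholding) and the constrained least squares term can be updated in alternation.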
|
154
|
Ha J, Park C, Park C, Park S. Improved Prediction of miRNA-Disease Associations Based on Matrix Completion with Network Regularization. Cells 2020; 9:cells9040881. [PMID: 32260218 PMCID: PMC7226829 DOI: 10.3390/cells9040881] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 11/28/2019] [Revised: 12/30/2019] [Accepted: 04/01/2020] [Indexed: 12/12/2022] Open
Abstract
The identification of potential microRNA (miRNA)-disease associations enables the elucidation of the pathogenesis of complex human diseases, owing to the crucial role of miRNAs in various biological processes, and it yields insights into novel prognostic markers. Considering the time and costs involved in wet-lab experiments, computational models for finding novel miRNA-disease associations are a valuable alternative. However, computational models to date are biased towards known miRNA-disease associations; this makes them unsuitable for rare miRNAs (i.e., miRNAs with few known disease associations) and uncommon diseases (i.e., diseases with few known miRNA associations), and leads to poor prediction accuracy. The most straightforward way of improving performance is to increase the number of known miRNA-disease associations. However, because such information is lacking, increasing attention has been paid to developing computational models that can handle insufficient data via a technical approach. In this paper, we present a general framework, improved prediction of miRNA-disease associations (IMDN), based on matrix completion with network regularization to discover potential disease-related miRNAs. The success of adopting matrix factorization is demonstrated by its excellent performance in recommender systems. This approach considers a miRNA network as additional implicit feedback and makes predictions for disease associations relevant to a given miRNA based on its direct neighbors. Our experimental results demonstrate that IMDN achieved excellent performance, with area under the receiver operating characteristic (ROC) curve (AUC) values of 0.9162 and 0.8965 in global and local leave-one-out cross-validation (LOOCV), respectively. Further, case studies demonstrated that our method can not only validate true miRNA-disease associations but also suggest novel disease-related miRNA candidates.
Affiliation(s)
- Jihwan Ha
  - Department of Computer Science, Yonsei University, Seoul 03722, Korea; (J.H.); (C.P.)
- Chihyun Park
  - Department of Computer Science, Yonsei University, Seoul 03722, Korea; (J.H.); (C.P.)
- Chanyoung Park
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA;
- Sanghyun Park
  - Department of Computer Science, Yonsei University, Seoul 03722, Korea; (J.H.); (C.P.)
  - Correspondence: ; Tel.: +82-2-2123-5714
|
155
|
Predicting potential miRNA-disease associations by combining gradient boosting decision tree with logistic regression. Comput Biol Chem 2020; 85:107200. [DOI: 10.1016/j.compbiolchem.2020.107200] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 09/12/2019] [Revised: 01/04/2020] [Accepted: 01/05/2020] [Indexed: 12/19/2022]
|
156
|
A random forest based computational model for predicting novel lncRNA-disease associations. BMC Bioinformatics 2020; 21:126. [PMID: 32216744 PMCID: PMC7099795 DOI: 10.1186/s12859-020-3458-1] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 12/19/2019] [Accepted: 03/18/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Accumulated evidence shows that the abnormal regulation of long non-coding RNA (lncRNA) is associated with various human diseases. Accurately identifying disease-associated lncRNAs helps to study the mechanisms of lncRNAs in diseases and to explore new therapies. Many lncRNA-disease association (LDA) prediction models have been implemented by integrating multiple kinds of data resources. However, most of the existing models ignore the interference of noisy and redundant information among these data resources. RESULTS To improve the ability of LDA prediction models, we implemented a random forest and feature selection based LDA prediction model (RFLDA for short). First, RFLDA integrates the experiment-supported miRNA-disease associations (MDAs) and LDAs, the disease semantic similarity (DSS), the lncRNA functional similarity (LFS) and the lncRNA-miRNA interactions (LMI) as input features. Then, RFLDA chooses the most useful features for training the prediction model by feature selection based on the random forest variable importance score, which takes into account not only the effect of each individual feature on the prediction results but also the joint effects of multiple features. Finally, a random forest regression model is trained to score potential lncRNA-disease associations. With an area under the receiver operating characteristic curve (AUC) of 0.976 and an area under the precision-recall curve (AUPR) of 0.779 under 5-fold cross-validation, RFLDA performs better than several state-of-the-art LDA prediction models. Moreover, case studies on three cancers demonstrate that 43 of the 45 lncRNAs predicted by RFLDA are validated by experimental data, and the other two predicted lncRNAs are supported by other LDA prediction models. CONCLUSIONS Cross-validation and case studies indicate that RFLDA has an excellent ability to identify potential disease-associated lncRNAs.
|
157
|
Fan Y, Cui J, Zhu Q. Heterogeneous graph inference based on similarity network fusion for predicting lncRNA-miRNA interaction. RSC Adv 2020; 10:11634-11642. [PMID: 35496629 PMCID: PMC9050493 DOI: 10.1039/c9ra11043g] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/31/2019] [Accepted: 03/14/2020] [Indexed: 12/28/2022] Open
Abstract
LncRNA and miRNA are two non-coding RNA types that are popular in current research. LncRNA interacts with miRNA to regulate gene transcription, further affecting human health and disease. Accurate identification of lncRNA-miRNA interactions contributes to the in-depth study of the biological functions and mechanisms of non-coding RNA. However, relying on biological experiments to obtain interaction information is time-consuming and expensive. Considering the rapid accumulation of gene information and the few computational methods, it is urgent to supplement the effective computational models to predict lncRNA-miRNA interactions. In this work, we propose a heterogeneous graph inference method based on similarity network fusion (SNFHGILMI) to predict potential lncRNA-miRNA interactions. First, we calculated multiple similarity data, including lncRNA sequence similarity, miRNA sequence similarity, lncRNA Gaussian nuclear similarity, and miRNA Gaussian nuclear similarity. Second, the similarity network fusion method was employed to integrate the data and get the similarity network of lncRNA and miRNA. Then, we constructed a bipartite network by combining the known interaction network and similarity network of lncRNA and miRNA. Finally, the heterogeneous graph inference method was introduced to construct a prediction model. On the real dataset, the model SNFHGILMI achieved AUC of 0.9501 and 0.9426 ± 0.0035 based on LOOCV and 5-fold cross validation, respectively. Furthermore, case studies also demonstrate that SNFHGILMI is a high-performance prediction method that can accurately predict new lncRNA-miRNA interactions. The Matlab code and readme file of SNFHGILMI can be downloaded from https://github.com/cj-DaSE/SNFHGILMI.
Affiliation(s)
- Yongxian Fan
  - School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Juan Cui
  - School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- QingQi Zhu
  - School of Computer and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
|
158
|
Xiao Q, Zhang N, Luo J, Dai J, Tang X. Adaptive multi-source multi-view latent feature learning for inferring potential disease-associated miRNAs. Brief Bioinform 2020; 22:2043-2057. [PMID: 32186712 DOI: 10.1093/bib/bbaa028] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 12/13/2019] [Revised: 02/16/2020] [Accepted: 01/14/2020] [Indexed: 12/13/2022] Open
Abstract
Accumulating evidence has shown that microRNAs (miRNAs) play crucial roles in different biological processes, and their mutations and dysregulation have been shown to contribute to tumorigenesis. In silico identification of disease-associated miRNAs is a cost-effective strategy for discovering the most promising biomarkers for disease diagnosis and treatment. The increasingly available omics data sources provide unprecedented opportunities to decipher the underlying relationships between miRNAs and diseases with computational models. However, most existing methods are biased towards a single representation of miRNAs or diseases and cannot discover unobserved associations for new miRNAs or diseases that lack association information. In this study, we present a novel computational method with adaptive multi-source multi-view latent feature learning (M2LFL) to infer potential disease-associated miRNAs. First, we adopt multiple data sources to obtain similarity profiles and capture different latent features according to the geometric characteristics of the miRNA and disease spaces. Then, the multi-modal latent features are projected to a common subspace to discover unobserved miRNA-disease associations in both the miRNA and disease views, and an adaptive joint graph regularization term is developed to preserve the intrinsic manifold structures of the multiple similarity profiles. Meanwhile, Lp,q-norms are imposed on the projection matrices to ensure sparsity and improve interpretability. The experimental results confirm the superior performance of our proposed method in screening reliable candidate disease miRNAs, which suggests that M2LFL could be an efficient tool for discovering diagnostic biomarkers to guide laborious clinical trials.
|
159
|
Yan C, Wu FX, Wang J, Duan G. PESM: predicting the essentiality of miRNAs based on gradient boosting machines and sequences. BMC Bioinformatics 2020; 21:111. [PMID: 32183740 PMCID: PMC7079416 DOI: 10.1186/s12859-020-3426-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/28/2019] [Accepted: 02/21/2020] [Indexed: 11/16/2022] Open
Abstract
Background MicroRNAs (miRNAs) are small noncoding RNA molecules that act as direct posttranscriptional regulators of mRNA targets. Studies have indicated that miRNAs play key roles in complex diseases by taking part in many biological processes, such as cell growth and cell death. Therefore, to improve the effectiveness of disease diagnosis and treatment, it is appealing to develop advanced computational methods for predicting the essentiality of miRNAs. Result In this study, we propose a method (PESM) to predict miRNA essentiality based on gradient boosting machines and miRNA sequences. First, PESM extracts the sequence and structural features of miRNAs. Then it uses gradient boosting machines to predict the essentiality of miRNAs. We conduct 5-fold cross-validation to assess the prediction performance of our method. The area under the receiver operating characteristic curve (AUC), F-measure and accuracy (ACC) are used as the metrics to evaluate the prediction performance. We also compare PESM with three competing methods: miES, Gaussian Naive Bayes and Support Vector Machine. Conclusion The experimental results show that PESM achieves better prediction performance (AUC: 0.9117, F-measure: 0.8572, ACC: 0.8516) than the three competing methods. In addition, the relative importance of the features further shows that the newly added features help improve prediction performance.
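The evaluation protocol described here (5-fold cross-validation scored by AUC, F-measure and accuracy) can be sketched in plain Python as follows. This is an illustrative sketch only and not the PESM code: the function names, the 0.5 decision threshold, the interleaved fold assignment and the toy labels and scores are all assumptions.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (simple pairwise version)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def accuracy_and_f_measure(labels, scores, threshold=0.5):
    """Threshold the scores, then compute accuracy and the F-measure (F1)."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)
    accuracy = (tp + tn) / len(labels)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f_measure


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k interleaved folds."""
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, folds[i]


# Hypothetical labels (1 = essential) and classifier scores for six miRNAs.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
auc = roc_auc(labels, scores)
acc, f1 = accuracy_and_f_measure(labels, scores)
```

In a full cross-validation run, a model would be fitted on each training split from `k_fold_indices` and the three metrics averaged over the held-out folds.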
Affiliation(s)
- Cheng Yan
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, 932 South Lushan Rd, ChangSha, 410083, China; School of Computer and Information, Qiannan Normal University for Nationalities, Longshan Road, DuYun, 558000, China
- Fang-Xiang Wu
  - Biomedical Engineering and Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
- Jianxin Wang
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, 932 South Lushan Rd, ChangSha, 410083, China
- Guihua Duan
  - Hunan Provincial Key Lab on Bioinformatics, School of Computer Science and Engineering, Central South University, 932 South Lushan Rd, ChangSha, 410083, China
|
160
|
Xie W, Luo J, Pan C, Liu Y. SG-LSTM-FRAME: a computational frame using sequence and geometrical information via LSTM to predict miRNA-gene associations. Brief Bioinform 2020; 22:2032-2042. [PMID: 32181478 DOI: 10.1093/bib/bbaa022] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/29/2019] [Revised: 02/10/2020] [Accepted: 02/11/2020] [Indexed: 12/19/2022] Open
Abstract
MOTIVATION MicroRNAs (miRNAs) regulate target genes and are implicated in lethal diseases such as cancers. Accurately identifying miRNA-gene pairs could help decipher the mechanisms by which miRNAs affect and regulate the development of cancers. Embedding methods and deep learning methods have shown excellent performance on traditional classification tasks in many scenarios, but few attempts have been made to adapt and merge these two approaches for miRNA-gene relationship prediction. Hence, we proposed a novel computational framework. We first generated representational features for miRNAs and genes using both sequence and geometrical information and then leveraged a deep learning method to predict the associations. RESULTS We used long short-term memory (LSTM) to predict potential relationships and showed that our method outperformed other state-of-the-art methods. Results showed that our framework SG-LSTM achieved an area under the curve of 0.94 and was superior to other methods. In the case study, we predicted the top 10 miRNA-gene relationships and recommended the top 10 potential genes for hsa-miR-335-5p for SG-LSTM-core. We also tested our model using a larger dataset, from which 14 668 698 miRNA-gene pairs were predicted. The top 10 unknown pairs were also listed. AVAILABILITY Our work can be downloaded from https://github.com/Xshelton/SG_LSTM. CONTACT luojiawei@hnu.edu.cn. SUPPLEMENTARY INFORMATION Supplementary data are available at Briefings in Bioinformatics online.
Affiliation(s)
- Weidun Xie
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, Hunan, China
- Jiawei Luo
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, Hunan, China
- Chu Pan
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, Hunan, China
- Ying Liu
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, Hunan, China
|
161
|
Zhu J, Xu Y, Liu S, Qiao L, Sun J, Zhao Q. MicroRNAs Associated With Colon Cancer: New Potential Prognostic Markers and Targets for Therapy. Front Bioeng Biotechnol 2020; 8:176. [PMID: 32211396 PMCID: PMC7075808 DOI: 10.3389/fbioe.2020.00176] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 12/31/2019] [Accepted: 02/20/2020] [Indexed: 12/24/2022] Open
Abstract
MicroRNAs (miRNAs) are a kind of non-coding RNA (ncRNA) that regulate the expression of target genes and play a role in the occurrence and development of cancers. Colon cancer (COAD) is the second most common cause of cancer-related mortality. However, the prognostic value of miRNAs in COAD remains unclear. In this study, we obtain miRNA and messenger RNA (mRNA) expression profiles of COAD from The Cancer Genome Atlas (TCGA) database. After preliminary data screening and preprocessing, we acquire the expression data of 894 miRNAs and 17,019 mRNAs. Then, compared with the normal samples, 39 upregulated miRNAs and 54 downregulated miRNAs are identified by differential expression analysis. Furthermore, we obtain 1,487 upregulated mRNAs and 2,847 downregulated mRNAs. We confirm nine key miRNAs related to the survival rate of COAD patients. Moreover, by using bioinformatics methods, we get 461 genes common to both the target genes of these nine key miRNAs and the differentially expressed mRNAs. By analyzing the protein-protein interaction (PPI) network of these 461 common genes and performing survival analysis, we confirm five hub genes as promising biomarkers for COAD prognosis. It is worth mentioning that no previous reports have found that PGR and KCNB1 are related to COAD. We expect these key miRNAs and hub genes will provide a new avenue for the study of COAD.
Affiliation(s)
- Junfeng Zhu
  - Department of Clinical Laboratory, Affiliated Hospital of Guilin Medical University, Guilin, China
- Ying Xu
  - Office of Drug Clinical Trials, Affiliated Hospital of Guilin Medical University, Guilin, China
- Shanshan Liu
  - Department of Clinical Laboratory, Affiliated Hospital of Guilin Medical University, Guilin, China
- Li Qiao
  - Department of Clinical Laboratory, General Hospital of Northern Theater Command, Shenyang, China
- Jianqiang Sun
  - School of Automation and Electrical Engineering, Linyi University, Linyi, China
- Qi Zhao
  - Department of Clinical Laboratory, Affiliated Hospital of Guilin Medical University, Guilin, China; College of Computer Science, Shenyang Aerospace University, Shenyang, China
|
162
|
Zhang ZC, Zhang XF, Wu M, Ou-Yang L, Zhao XM, Li XL. A graph regularized generalized matrix factorization model for predicting links in biomedical bipartite networks. Bioinformatics 2020; 36:3474-3481. [DOI: 10.1093/bioinformatics/btaa157] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 08/05/2019] [Revised: 02/05/2020] [Accepted: 03/03/2020] [Indexed: 12/13/2022] Open
Abstract
Motivation
Predicting potential links in biomedical bipartite networks can provide useful insights into the diagnosis and treatment of complex diseases and the discovery of novel drug targets. Computational methods have been proposed recently to predict potential links for various biomedical bipartite networks. However, existing methods usually rely on the coverage of known links and may encounter difficulties when dealing with new nodes that have no known link information.
Results
In this study, we propose a new link prediction method, named graph regularized generalized matrix factorization (GRGMF), to identify potential links in biomedical bipartite networks. First, we formulate a generalized matrix factorization model to exploit the latent patterns behind observed links. In particular, it takes into account the neighborhood information of each node when learning that node's latent representation, and this neighborhood information can be learned adaptively. Second, we introduce two graph regularization terms that draw on affinity information for each node, derived from external databases, to enhance the learning of latent representations. We conduct extensive experiments on six real datasets. Experimental results show that GRGMF achieves competitive performance on all these datasets, which demonstrates the effectiveness of GRGMF in predicting potential links in biomedical bipartite networks.
Availability and implementation
The package is available at https://github.com/happyalfred2016/GRGMF.
Contact
leouyang@szu.edu.cn
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zi-Chao Zhang
  - Guangdong Key Laboratory of Intelligent Information Processing, Key Laboratory of Media Security, Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, China
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
- Xiao-Fei Zhang
  - School of Mathematics and Statistics, Central China Normal University, Wuhan 430079, China
- Min Wu
  - Institute for Infocomm Research (I2R), A*STAR, 138632, Singapore
- Le Ou-Yang
  - Guangdong Key Laboratory of Intelligent Information Processing, Key Laboratory of Media Security, Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen University, Shenzhen 518060, China
- Xing-Ming Zhao
  - Institute of Science and Technology for Brain-Inspired Intelligence, Fudan University, Shanghai 200433, China
  - Key Laboratory of Computational Neuroscience and Brain-Inspired Intelligence, Ministry of Education, 200433 China
- Xiao-Li Li
  - Institute for Infocomm Research (I2R), A*STAR, 138632, Singapore
|
163
|
Liu H, Ren G, Chen H, Liu Q, Yang Y, Zhao Q. Predicting lncRNA–miRNA interactions based on logistic matrix factorization with neighborhood regularized. Knowl Based Syst 2020. [DOI: 10.1016/j.knosys.2019.105261] [Citation(s) in RCA: 73] [Impact Index Per Article: 18.3] [Indexed: 01/15/2023]
|
164
|
Computational Models in Non-Coding RNA and Human Disease. Int J Mol Sci 2020; 21:ijms21051557. [PMID: 32106478 PMCID: PMC7084754 DOI: 10.3390/ijms21051557] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/10/2020] [Accepted: 02/24/2020] [Indexed: 01/01/2023] Open
|
165
|
Jiao CN, Gao YL, Yu N, Liu JX, Qi LY. Hyper-Graph Regularized Constrained NMF for Selecting Differentially Expressed Genes and Tumor Classification. IEEE J Biomed Health Inform 2020; 24:3002-3011. [PMID: 32086224 DOI: 10.1109/jbhi.2020.2975199] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 11/07/2022]
Abstract
Non-negative Matrix Factorization (NMF) is a dimensionality reduction approach for learning a parts-based, linear representation of non-negative data, and it has attracted considerable attention for this reason. In practice, however, NMF not only neglects the manifold structure of data samples but also overlooks the prior label information of different classes. In this paper, a novel matrix decomposition method called Hyper-graph regularized Constrained Non-negative Matrix Factorization (HCNMF) is proposed for selecting differentially expressed genes and classifying tumor samples. The advantage of hyper-graph learning is that it captures local spatial information in high-dimensional data. The method incorporates a hyper-graph regularization constraint to account for higher-order relationships among data samples. The application of hyper-graph theory can effectively find pathogenic genes in cancer datasets. In addition, the label information is incorporated into the objective function to improve the discriminative ability of the decomposition matrix. Supervised learning with label information greatly improves the classification effect. We also provide the iterative update rules and convergence proofs for the optimization problems of HCNMF. Experiments on The Cancer Genome Atlas (TCGA) datasets confirm the superiority of the HCNMF algorithm over other representative algorithms through a set of evaluations.
|
166
|
Gao Z, Wang YT, Wu QW, Ni JC, Zheng CH. Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction. BMC Bioinformatics 2020; 21:61. [PMID: 32070280 PMCID: PMC7029547 DOI: 10.1186/s12859-020-3409-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/28/2019] [Accepted: 02/11/2020] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND The aberrant expression of microRNAs is closely connected to the occurrence and development of a large number of human diseases. To study human diseases, researchers have presented numerous effective computational models. RESULTS Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring possible human disease-connected miRNAs. First, manually validated disease-connected microRNAs were integrated, and microRNA functional similarity information along with two kinds of disease semantic similarities were calculated. Next, we measured Gaussian interaction profile (GIP) kernel similarities for both diseases and microRNAs. Then, we adopted a preprocessing step, namely, weighted K nearest known neighbours (WKNKN), to decrease the sparsity of the miRNA-disease association matrix. Finally, the GRL2,1-NMF framework was used to predict links between microRNAs and diseases. CONCLUSIONS The new method (GRL2,1-NMF) achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that GRL2,1-NMF can powerfully discover potential disease-related miRNAs, even when no associated disease is known.
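The Gaussian interaction profile (GIP) kernel mentioned in the abstract above can be illustrated with a short sketch. It assumes the commonly used definition, in which the similarity of two entities is exp(-gamma * squared distance between their binary association profiles) and gamma is a bandwidth (here 1) divided by the mean squared profile norm. The abstract does not state the exact formula or parameters the authors used, so the function name `gip_kernel`, the `bandwidth` parameter and the toy matrix are illustrative assumptions.

```python
import math


def gip_kernel(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel between the rows of an association matrix.

    `profiles` is a list of equal-length 0/1 lists (one row per miRNA or disease).
    `bandwidth` is an assumed default; the abstract does not give a value.
    """
    n = len(profiles)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = bandwidth / mean_sq_norm

    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy data: 3 miRNAs (rows) x 3 diseases (columns).
toy_associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
toy_kernel = gip_kernel(toy_associations)
```

In this toy matrix the first two rows share a disease, so their kernel value is larger than that of the first and third rows, which share none.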
Affiliation(s)
- Zhen Gao
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Yu-Tian Wang
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Qing-Wen Wu
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Jian-Cheng Ni
- School of Software, Qufu Normal University, Qufu, 273165, China.
| | - Chun-Hou Zheng
- School of Software, Qufu Normal University, Qufu, 273165, China.
|
167
|
Sadegh Shesh Poli M, Khajeniazi S, Behnampour N, Kalani MR, Moradi A, Marjani A. MicroRNA-146a as a Prognostic Biomarker for Esophageal Squamous Cell Carcinoma. Cancer Manag Res 2020; 12:973-980. [PMID: 32104079 PMCID: PMC7023856 DOI: 10.2147/cmar.s229397] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/31/2019] [Accepted: 01/22/2020] [Indexed: 12/13/2022] Open
Abstract
Background and Aims: MicroRNAs, including miR-146a, regulate gene expression by binding to the 3'-UTR region of their target genes. Cyclooxygenase-2 (COX-2) is involved in carcinogenesis as an inflammatory marker, and microRNA-146a (miR-146a) acts as a negative regulatory factor. We aimed to evaluate miR-146a expression as a prognostic or diagnostic biomarker for esophageal squamous cell carcinoma (ESCC), as well as the association between miR-146a and COX-2 expression. Materials and Methods: We quantified the level of miR-146a and COX-2 expression in cancerous and adjacent normal tissue samples obtained from 34 patients with ESCC, using real-time PCR. Statistical analyses were conducted using a one-sample t-test. Receiver-operating characteristic (ROC) curve and Kaplan-Meier analyses were applied to assess miR-146a as a diagnostic and prognostic marker, respectively, during the 4 years of the study. Furthermore, a Cox regression model was used to estimate the hazard ratio (HR). The association between miR-146a and COX-2 expression levels in ESCC patients was evaluated by nonparametric Spearman's rho analysis. Results: miR-146a expression was reduced in 50% of cancerous tissues compared with adjacent normal regions (P-value=0.127). COX-2 expression in 80% of ESCC patients was higher than in the controls (P-value=0.001). Overall, in 60% of cases, a direct association was seen between miR-146a and COX-2 expression levels (correlation coefficient=0.438, P-value=0.011). COX-2 can be considered a diagnostic biomarker (AUC=0.834, sensitivity=72%, specificity=83%, P-value<0.0001), but miR-146a cannot (AUC=0.553, sensitivity=88%, specificity=28%, P-value=0.453).
Survival analysis by the Kaplan-Meier method showed that miR-146a and COX-2 expression can probably be considered prognostic biomarkers for ESCC, because patients with high expression of miR-146a had a 7-month shorter life span and patients with low expression of COX-2 had an 8-month shorter life span. Conclusion: COX-2 expression is a diagnostic biomarker. miR-146a and COX-2 expression can probably be considered prognostic biomarkers for survival in ESCC.
Affiliation(s)
| | - Safoura Khajeniazi
- Medical Cellular and Molecular Research Center, Golestan University of Medical Sciences, Gorgan, Iran
| | - Nasser Behnampour
- Health Management and Social Development Research Center, Gorgan, Iran
| | - Mohammad Reza Kalani
- Molecular Medicine Research Center, Golestan University of Medical Sciences, Gorgan, Iran
| | - Abdolvahab Moradi
- Gastroenterology and Hepatology Research Center, Golestan University of Medical Sciences, Gorgan, Iran
| | - Abdoljalal Marjani
- Abdoljalal Marjani Metabolic Disorders Research Center, Department of Biochemistry and Biophysics, Faculty of Medicine, Golestan University of Medical Sciences, Gorgan, Iran
|
168
|
Peng LH, Zhou LQ, Chen X, Piao X. A Computational Study of Potential miRNA-Disease Association Inference Based on Ensemble Learning and Kernel Ridge Regression. Front Bioeng Biotechnol 2020; 8:40. [PMID: 32117922 PMCID: PMC7015868 DOI: 10.3389/fbioe.2020.00040] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/20/2019] [Accepted: 01/17/2020] [Indexed: 12/11/2022] Open
Abstract
As a growing number of experimental studies have shown that microRNAs (miRNAs) are closely related to multiple biological processes and to the prevention, diagnosis and treatment of human diseases, more researchers are focusing on identifying associations between miRNAs and diseases. Identifying such associations purely via experiments is costly and demanding, which prompts researchers to develop computational methods to complement the experiments. In this paper, a novel prediction model named Ensemble of Kernel Ridge Regression based MiRNA-Disease Association prediction (EKRRMDA) was developed. EKRRMDA obtained features of miRNAs and diseases by integrating the disease semantic similarity, the miRNA functional similarity and the Gaussian interaction profile kernel similarity for diseases and miRNAs. Under a computational framework that used ensemble learning and feature dimensionality reduction, multiple base classifiers, each combining two Kernel Ridge Regression classifiers from the miRNA side and the disease side, respectively, were obtained based on random selection of features. An averaging strategy over these base classifiers was then adopted to obtain the final association scores of miRNA-disease pairs. In the global and local leave-one-out cross validation, EKRRMDA attained AUCs of 0.9314 and 0.8618, respectively. Moreover, the model’s average AUC with standard deviation in 5-fold cross validation was 0.9275 ± 0.0008. In addition, we implemented three different types of case studies on predicting miRNAs associated with five important diseases. As a result, 90% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 86% (Lymphoma), 98% (Lung Neoplasms), and 96% (Breast Neoplasms) of the top 50 predicted miRNAs were verified to have associations with these diseases.
Affiliation(s)
- Li-Hong Peng
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Li-Qian Zhou
- School of Computer Science, Hunan University of Technology, Zhuzhou, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Xue Piao
- School of Medical Informatics, Xuzhou Medical University, Xuzhou, China
|
169
|
Ha J, Park C, Park C, Park S. IMIPMF: Inferring miRNA-disease interactions using probabilistic matrix factorization. J Biomed Inform 2020; 102:103358. [DOI: 10.1016/j.jbi.2019.103358] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 06/14/2019] [Revised: 11/11/2019] [Accepted: 12/12/2019] [Indexed: 12/09/2022]
|
170
|
Wu M, Yang Y, Wang H, Ding J, Zhu H, Xu Y. IMPMD: An Integrated Method for Predicting Potential Associations Between miRNAs and Diseases. Curr Genomics 2020; 20:581-591. [PMID: 32581646 PMCID: PMC7290057 DOI: 10.2174/1389202920666191023090215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 10/16/2019] [Indexed: 01/06/2023] Open
Abstract
Background: With the rapid development of biological research, microRNAs (miRNAs) have increasingly attracted worldwide attention. A growing number of biological studies and experiments have shown that miRNAs are related to the occurrence and development of many key biological processes that cause complex human diseases. Thus, identifying associations between miRNAs and diseases helps in diagnosing those diseases. Although some studies have found a considerable number of associations between miRNAs and diseases, many associations remain to be identified. Experimental methods to uncover miRNA-disease associations are time-consuming and expensive. Therefore, effective computational methods are urgently needed to predict new associations. Methodology: In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity and Jaccard similarity. For diseases, it is obtained by combining semantic similarity, Gaussian similarity and Jaccard similarity. Then, we use these two enhanced similarities to construct the features and calculate a cumulative score to choose robust features. Finally, general linear regression is applied to assign weights to the Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms. Results: IMPMD obtains an AUC of 0.9386 in 10-fold cross-validation, which is better than most previous models. To further evaluate our model, we apply IMPMD to two types of case studies for lung cancer and breast cancer. 49 (Lung Cancer) and 50 (Breast Cancer) of the top 50 related miRNAs are validated by experimental discoveries. Conclusion: We built a software tool named IMPMD, which can be freely downloaded from https://github.com/Sunmile/IMPMD.
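As a small illustration of the Jaccard similarity named in the abstract above, the following sketch compares two miRNAs by the sets of diseases they are known to be associated with. The abstract does not say which sets IMPMD compares, so this choice, the helper name `jaccard_similarity` and the toy associations are assumptions made for illustration only.

```python
def jaccard_similarity(set_a, set_b):
    """Size of the intersection divided by the size of the union (0.0 if both are empty)."""
    set_a, set_b = set(set_a), set(set_b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical known associations: miRNA name -> set of disease names.
toy_mirna_to_diseases = {
    "miR-X": {"lung cancer", "breast cancer"},
    "miR-Y": {"breast cancer", "lymphoma"},
    "miR-Z": set(),
}

similarity_xy = jaccard_similarity(
    toy_mirna_to_diseases["miR-X"], toy_mirna_to_diseases["miR-Y"]
)
```

Here the two miRNAs share one of three distinct diseases, so `similarity_xy` is one third. Returning 0.0 for two empty sets is a design choice of this sketch, not something the abstract specifies.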
Affiliation(s)
- Meiqi Wu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Yingxi Yang
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Hui Wang
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Jun Ding
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Huan Zhu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
| | - Yan Xu
- 1 Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; 2 Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; 3 Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
|
171
|
Chen X, Sun LG, Zhao Y. NCMCMDA: miRNA-disease association prediction through neighborhood constraint matrix completion. Brief Bioinform 2020; 22:485-496. [PMID: 31927572 DOI: 10.1093/bib/bbz159] [Citation(s) in RCA: 125] [Impact Index Per Article: 31.3] [Received: 08/29/2019] [Revised: 11/01/2019] [Accepted: 11/11/2019] [Indexed: 12/13/2022] Open
Abstract
Emerging evidence shows that microRNAs (miRNAs) play a critical role in diverse fundamental biological processes associated with human diseases. Inferring potential disease-related miRNAs and employing them as biomarkers or drug targets could contribute to the prevention, diagnosis and treatment of complex human diseases. Given that traditional biological experiments are costly in time and resources, computational models can serve as a complementary means of uncovering potential miRNA-disease associations. In this study, we proposed a new computational model named Neighborhood Constraint Matrix Completion for MiRNA-Disease Association prediction (NCMCMDA) to predict potential miRNA-disease associations. The main task of NCMCMDA was to recover the missing miRNA-disease associations based on the known miRNA-disease associations and integrated disease (miRNA) similarity. In this model, we integrated a neighborhood constraint with matrix completion, which provided a novel way of using similarity information to assist the prediction. After the recovery task was transformed into an optimization problem, we solved it with a fast iterative shrinkage-thresholding algorithm. As a result, the AUCs of NCMCMDA in global and local leave-one-out cross validation were 0.9086 and 0.8453, respectively. In 5-fold cross validation, NCMCMDA achieved an average AUC of 0.8942 with a standard deviation of 0.0015, demonstrating NCMCMDA's superior performance over many previous computational methods. Furthermore, NCMCMDA was applied to three different types of case studies to further evaluate its prediction reliability and accuracy. As a result, 84% (colon neoplasms), 98% (esophageal neoplasms) and 98% (breast neoplasms) of the top 50 predicted miRNAs were verified by recent literature.
Affiliation(s)
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology
| | - Lian-Gang Sun
- School of Information and Control Engineering, China University of Mining and Technology
| | - Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology
|
172
|
Yan P, Tang L, Liu L, Tu G. Identification of candidate RNA signatures in triple-negative breast cancer by the construction of a competing endogenous RNA network with integrative analyses of Gene Expression Omnibus and The Cancer Genome Atlas data. Oncol Lett 2020; 19:1915-1927. [PMID: 32194687 PMCID: PMC7039180 DOI: 10.3892/ol.2020.11292] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2019] [Accepted: 11/21/2019] [Indexed: 12/16/2022] Open
Abstract
Triple-negative breast cancer (TNBC) is a subtype of breast cancer that is characterized by aggressive and metastatic clinical characteristics and generally leads to earlier distant recurrence and poorer prognosis than other molecular subtypes. Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) serve a crucial role in a wide variety of biological processes by interacting with microRNAs (miRNAs) as competing endogenous RNAs (ceRNAs) and, thus, affect the expression of target genes in multiple types of cancer. Seven datasets from the Gene Expression Omnibus (GEO) database, including 444 tumor and 88 healthy tissue samples, were utilized to investigate the underlying mechanisms of TNBC and identify prognostic biomarkers. Differentially expressed genes (DEGs) were further validated in The Cancer Genome Atlas database and the associations between their expression levels and clinical information were analyzed to identify prognostic values. A potential lncRNA-miRNA-mRNA ceRNA network was also constructed. Finally, 69 mRNAs from the integrated Gene Expression Omnibus datasets were identified as DEGs using the robust rank aggregation method with |log2FC|>1 and adjusted P<0.01 set as the significance cut-off levels. In addition, 29 lncRNAs, 21 miRNAs and 27 mRNAs were included in the construction of the ceRNA network. The present study elucidated the mechanisms underlying the progression of TNBC and identified novel prognostic biomarkers for TNBC.
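The significance cut-off quoted in the abstract above (|log2FC| > 1 and adjusted P < 0.01) amounts to a simple filter over per-gene statistics. The sketch below shows only that thresholding step; it does not implement the robust rank aggregation method, and the record layout, the function name `select_degs` and the gene names are hypothetical.

```python
def select_degs(records, lfc_cutoff=1.0, padj_cutoff=0.01):
    """Keep genes whose absolute log2 fold change and adjusted P-value pass both cut-offs."""
    return [
        rec["gene"]
        for rec in records
        if abs(rec["log2fc"]) > lfc_cutoff and rec["padj"] < padj_cutoff
    ]


# Hypothetical per-gene statistics, for illustration only.
toy_records = [
    {"gene": "GENE_A", "log2fc": 2.3, "padj": 0.001},    # up-regulated, significant
    {"gene": "GENE_B", "log2fc": -1.7, "padj": 0.004},   # down-regulated, significant
    {"gene": "GENE_C", "log2fc": 0.6, "padj": 0.0001},   # fold change too small
    {"gene": "GENE_D", "log2fc": 3.1, "padj": 0.2},      # not significant
]
toy_degs = select_degs(toy_records)
```

Using the absolute value of the fold change keeps both up- and down-regulated genes, which is why `GENE_B` passes despite its negative sign.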
Affiliation(s)
- Ping Yan
- Department of Endocrine and Breast Surgery, The First Affiliated Hospital, Chongqing Medical University, Chongqing 400016, P.R. China
| | - Lingfeng Tang
- Department of Endocrine and Breast Surgery, The First Affiliated Hospital, Chongqing Medical University, Chongqing 400016, P.R. China
| | - Li Liu
- Department of Endocrine and Breast Surgery, The First Affiliated Hospital, Chongqing Medical University, Chongqing 400016, P.R. China
| | - Gang Tu
- Department of Endocrine and Breast Surgery, The First Affiliated Hospital, Chongqing Medical University, Chongqing 400016, P.R. China
|
173
|
Li J, Zhang S, Liu T, Ning C, Zhang Z, Zhou W. Neural inductive matrix completion with graph convolutional networks for miRNA-disease association prediction. Bioinformatics 2020; 36:2538-2546. [DOI: 10.1093/bioinformatics/btz965] [Citation(s) in RCA: 100] [Impact Index Per Article: 25.0] [Received: 09/13/2019] [Revised: 12/17/2019] [Accepted: 12/31/2019] [Indexed: 12/26/2022] Open
Abstract
Motivation: Predicting the association between microRNAs (miRNAs) and diseases plays an important role in identifying human disease-related miRNAs. As identification of miRNA-disease associations via biological experiments is time-consuming and expensive, computational methods are currently used as effective complements to determine the potential associations between disease and miRNA. Results: We present a novel method of neural inductive matrix completion with graph convolutional network (NIMCGCN) for predicting miRNA-disease association. NIMCGCN first uses graph convolutional networks to learn miRNA and disease latent feature representations from the miRNA and disease similarity networks. The learned features are then input into a novel neural inductive matrix completion (NIMC) model to generate a completed association matrix. The parameters of NIMCGCN were learned from the known miRNA-disease association data in a supervised end-to-end way. We compared the proposed method with other state-of-the-art methods. The area under the receiver operating characteristic curve results showed that our method is significantly superior to existing methods. Furthermore, 50, 47 and 48 of the top 50 predicted miRNAs for three high-risk human diseases, namely, colon cancer, lymphoma and kidney cancer, were verified using experimental literature. Finally, 100% prediction accuracy was achieved when breast cancer was used as a case study to evaluate the ability of NIMCGCN to predict a new disease without any known related miRNAs. Availability and implementation: https://github.com/ljatynu/NIMCGCN/ Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jin Li
- School of Software, Yunnan University, Kunming 650091, China
| | - Sai Zhang
- School of Software, Yunnan University, Kunming 650091, China
| | - Tao Liu
- School of Software, Yunnan University, Kunming 650091, China
| | - Chenxi Ning
- School of Software, Yunnan University, Kunming 650091, China
| | - Zhuoxuan Zhang
- School of Software, Yunnan University, Kunming 650091, China
| | - Wei Zhou
- School of Software, Yunnan University, Kunming 650091, China
|
174
|
Potential miRNA-disease association prediction based on kernelized Bayesian matrix factorization. Genomics 2020; 112:809-819. [DOI: 10.1016/j.ygeno.2019.05.021] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 04/09/2019] [Revised: 05/09/2019] [Accepted: 05/24/2019] [Indexed: 12/19/2022]
|
175
|
Qu J, Zhao Y, Zhang L, Cai SB, Ming Z, Wang CC. Computational Models for Self-Interacting Proteins Prediction. Protein Pept Lett 2019; 27:392-399. [PMID: 31880240 DOI: 10.2174/0929866527666191227141713] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/30/2019] [Revised: 11/19/2019] [Accepted: 11/21/2019] [Indexed: 11/22/2022]
Abstract
Self-Interacting Proteins (SIPs), in which two or more copies of the protein can interact with each other, have significant roles in cellular functions and in the evolution of Protein Interaction Networks (PINs). Knowing whether a protein can act on itself is important for understanding its functions. Previous studies on SIPs have focused on their structures and functions, while their overall properties have received less emphasis. Identifying SIPs is therefore an important task in biomedical research, which will help in understanding the function and mechanism of proteins. High-throughput methods can be used for SIP prediction, but they can be costly, time-consuming and challenging. Therefore, it is urgent to design computational models for the identification of SIPs. In this review, the concept and function of SIPs are introduced in detail. We further introduce SIP data and some excellent computational models that have been designed for SIP prediction. In particular, most existing approaches were developed based on machine learning using different feature extraction methods. Finally, we discuss several difficult problems in developing computational models for SIP prediction.
Affiliation(s)
- Jia Qu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
| | - Shu-Bin Cai
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
| | - Zhong Ming
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
| | - Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
|
176
|
Associating lncRNAs with small molecules via bilevel optimization reveals cancer-related lncRNAs. PLoS Comput Biol 2019; 15:e1007540. [PMID: 31877126 PMCID: PMC6948815 DOI: 10.1371/journal.pcbi.1007540] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/05/2018] [Revised: 01/08/2020] [Accepted: 11/12/2019] [Indexed: 12/28/2022] Open
Abstract
Long noncoding RNA (lncRNA) transcripts have emerging impacts in cancer studies, which suggests their potential as novel therapeutic agents. However, the molecular mechanism behind their treatment effects is still unclear. Here, we designed a computational model to Associate LncRNAs with Anti-Cancer Drugs (ALACD) based on a bilevel optimization model, which optimized the gene signature overlap in the upper level and imputed the missing lncRNA-gene associations in the lower level. ALACD predicts genes coexpressed with lncRNAs while matching the drug’s gene signatures. This model allows us to borrow the target gene information of small molecules to understand the mechanisms of action of lncRNAs and their roles in cancer. The ALACD model was systematically applied to the 10 cancer types in The Cancer Genome Atlas (TCGA) that had matched lncRNA and mRNA expression data. Cancer type-specific lncRNAs and associated drugs were identified. These lncRNAs show significantly different expression levels in cancer patients. Follow-up functional and molecular pathway analyses suggest that the gene signatures bridging drugs and lncRNAs are closely related to cancer development. Importantly, patient survival information and evidence from the literature suggest that the lncRNAs and drug-lncRNA associations identified by the ALACD model can provide an alternative choice for targeted cancer treatment and potential cancer prognostic biomarkers. The ALACD model is freely available at https://github.com/wangyc82/ALACD-v1. LncRNAs are RNA transcripts that are longer than 200 bp and do not encode proteins. Recent experimental studies have indicated the crucial role of lncRNAs in cancer. We proposed a computational model, ALACD, to understand an lncRNA’s molecular mechanism by associating it with a drug through the drug’s target genes. ALACD reveals lncRNAs, the associated anti-cancer drugs, and the induced gene signatures that are involved in the regulation of cancer.
Furthermore, these cancer-related lncRNAs are differentially expressed in cancer patients and closely associated with patient survival.
|
177
|
Zheng K, You ZH, Wang L, Zhou Y, Li LP, Li ZW. DBMDA: A Unified Embedding for Sequence-Based miRNA Similarity Measure with Applications to Predict and Validate miRNA-Disease Associations. Mol Ther Nucleic Acids 2019; 19:602-611. [PMID: 31931344 PMCID: PMC6957846 DOI: 10.1016/j.omtn.2019.12.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 07/08/2019] [Revised: 10/09/2019] [Accepted: 12/10/2019] [Indexed: 11/24/2022]
Abstract
MicroRNAs (miRNAs) play a critical role in human diseases. Determining the association between miRNAs and disease contributes to elucidating the pathogenesis of liver diseases and to seeking effective treatments. Despite great recent advances in the study of associations between miRNAs and diseases, verifying and recognizing associations efficiently at scale presents serious challenges to biological experimental approaches. Thus, computational methods for predicting miRNA-disease associations have become a research hotspot. In this paper, we present a new computational method, named distance-based sequence similarity for miRNA-disease association prediction (DBMDA), that directly learns a mapping from miRNA sequence to a Euclidean space. The notable feature of our approach is that it infers global similarity from region distances, which can be computed by the chaos game representation algorithm based on the miRNA sequences. In the 5-fold cross-validation experiment, the area under the curve (AUC) obtained by DBMDA in predicting potential miRNA-disease associations reached 0.9129. To assess DBMDA more thoroughly, we compared it with different classifiers and former prediction models. In addition, we constructed two case studies for prostate neoplasms and colon neoplasms. Results show that 39 and 39 out of the top 40 predicted miRNAs were confirmed by other databases, respectively. DBMDA has made new attempts in sequence similarity and achieved excellent results, while at the same time providing a new perspective for predicting the relationship between diseases and miRNAs. The source code and datasets explored in this work are available online from the University of Chinese Academy of Sciences (http://220.171.34.3:81/).
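The chaos game representation (CGR) referred to in the abstract above maps a nucleotide sequence to points in the unit square: starting from the centre, each base moves the current point halfway towards the corner assigned to that base. The sketch below shows only this generic mapping. The corner assignment is one common convention and is an assumption here, the sequence is an arbitrary toy example, and the region distances and learned embedding that DBMDA derives from the points are not reproduced.

```python
# Assumed corner convention; other assignments are also used in the literature.
CGR_CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def chaos_game_points(sequence):
    """Return the list of CGR points for an RNA (or DNA) sequence."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in sequence.upper().replace("T", "U"):
        if base not in CGR_CORNERS:
            raise ValueError(f"unsupported nucleotide: {base!r}")
        corner_x, corner_y = CGR_CORNERS[base]
        # Move halfway from the current point towards the base's corner.
        x, y = (x + corner_x) / 2.0, (y + corner_y) / 2.0
        points.append((x, y))
    return points


toy_points = chaos_game_points("UGAGGUAG")  # arbitrary toy sequence
```

Because each step halves the distance to a corner, the final point encodes the whole sequence history, with the most recent bases determining the coarsest position.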
Affiliation(s)
- Kai Zheng
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
| | - Lei Wang
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
| | - Yong Zhou
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
| | - Li-Ping Li
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
| | - Zheng-Wei Li
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
|
178
|
An improved random forest-based computational model for predicting novel miRNA-disease associations. BMC Bioinformatics 2019; 20:624. [PMID: 31795954 PMCID: PMC6889672 DOI: 10.1186/s12859-019-3290-7] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 09/18/2019] [Accepted: 11/21/2019] [Indexed: 01/29/2023] Open
Abstract
Background A large body of evidence shows that miRNAs regulate the expression of their target genes at the post-transcriptional level and that the dysregulation of miRNAs is related to many complex human diseases. Accurately discovering disease-related miRNAs is conducive to exploring the pathogenesis and treatment of diseases. However, because experimental methods are time-consuming and expensive, predicting miRNA-disease associations with computational models has become a more economical and effective means. Results Inspired by the work of predecessors, we proposed an improved computational model based on random forest (RF) for identifying miRNA-disease associations (IRFMDA). First, the integrated similarity of diseases was calculated by combining the semantic similarity and Gaussian interaction profile kernel (GIPK) similarity of diseases, and the integrated similarity of miRNAs was calculated by combining the functional similarity and GIPK similarity of miRNAs. Then, the two integrated similarities were combined to represent each miRNA-disease relationship pair. Next, the miRNA-disease relationship pairs contained in the HMDD (v2.0) database were considered positive samples, and randomly constructed miRNA-disease relationship pairs not included in HMDD (v2.0) were considered negative samples. Feature selection based on the variable importance score of RF was then performed to choose more useful features for representing samples, improving the model's ability to infer miRNA-disease associations. Finally, an RF regression model was trained on the reduced sample space to score the unknown miRNA-disease associations. The AUCs of IRFMDA under local leave-one-out cross-validation (LOOCV), global LOOCV and 5-fold cross-validation were 0.8728, 0.9398 and 0.9363, respectively, which were better than those of several excellent models for predicting miRNA-disease associations.
Moreover, case studies on oesophageal cancer, lymphoma and lung cancer showed that 94 (oesophageal cancer), 98 (lymphoma) and 100 (lung cancer) of the top 100 disease-associated miRNAs predicted by IRFMDA were supported by the experimental data in the dbDEMC (v2.0) database. Conclusions Cross-validation and case studies demonstrated that IRFMDA is an excellent miRNA-disease association prediction model, and can provide guidance and help for experimental studies on the regulatory mechanism of miRNAs in complex human diseases in the future.
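The Gaussian interaction profile kernel (GIPK) similarity mentioned in the abstract is commonly defined as K(i, j) = exp(-γ ||IP(i) - IP(j)||²), with γ normalised by the mean squared norm of the interaction profiles. The sketch below follows that common definition; it is not taken from the IRFMDA code, and the function name `gip_kernel` and the default `gamma_prime=1.0` are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """GIP kernel matrix for a list of binary interaction profiles.

    profiles[i] is the row of the association matrix for entity i
    (for example, one miRNA against all diseases).
    """
    n = len(profiles)
    sq_norms = [sum(v * v for v in row) for row in profiles]
    # Bandwidth normalised by the average squared profile norm
    # (assumes at least one profile contains a known association).
    gamma = gamma_prime / (sum(sq_norms) / n)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel
```

Entities with identical interaction profiles get a similarity of 1, and the similarity decays towards 0 as the profiles diverge.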
|
179
|
Prediction of potential miRNA-disease associations using matrix decomposition and label propagation. Knowl Based Syst 2019. [DOI: 10.1016/j.knosys.2019.104963] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 01/26/2023]
|
180
|
Taxonomy dimension reduction for colorectal cancer prediction. Comput Biol Chem 2019; 83:107160. [DOI: 10.1016/j.compbiolchem.2019.107160] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/24/2019] [Revised: 11/02/2019] [Accepted: 11/04/2019] [Indexed: 02/01/2023]
|
181
|
Zhao Y, Chen X, Yin J, Qu J. SNMFSMMA: using symmetric nonnegative matrix factorization and Kronecker regularized least squares to predict potential small molecule-microRNA association. RNA Biol 2019; 17:281-291. [PMID: 31739716 DOI: 10.1080/15476286.2019.1694732] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 01/07/2023] Open
Abstract
Accumulating studies have shown that microRNAs (miRNAs) could be used as targets of small-molecule (SM) drugs to treat diseases. In recent years, researchers have proposed many computational models to reveal miRNA-SM associations because of the huge cost of experimental methods. Considering the shortcomings of previous models, some of which have low prediction accuracy and some of which cannot be applied to new SMs (miRNAs), we developed a novel model named Symmetric Nonnegative Matrix Factorization for Small Molecule-MiRNA Association prediction (SNMFSMMA). Unlike models that directly apply the integrated similarities, SNMFSMMA first performed matrix decomposition on the integrated similarity matrices and then calculated the Kronecker product of the new integrated similarity matrices to obtain the SM-miRNA pair similarity. Further, we applied regularized least squares to obtain the function mapping SM-miRNA pairs to association probabilities by minimizing the objective function. On the basis of Datasets 1 and 2, extracted from the SM2miR v1.0 database, we implemented global leave-one-out cross-validation (LOOCV), miRNA-fixed local LOOCV, SM-fixed local LOOCV and 5-fold cross-validation to evaluate the prediction performance. The AUC values obtained by SNMFSMMA in these validations reached 0.9711 (0.8895), 0.9698 (0.8884), 0.8329 (0.7651) and 0.9644 ± 0.0035 (0.8814 ± 0.0033) based on Dataset 1 (Dataset 2), respectively. In the first case study, 5 of the top 10 predicted associations were confirmed. In the second, 7 and 8 of the top 10 predicted miRNAs related to 5-FU and 5-Aza-2'-deoxycytidine, respectively, were confirmed. These results demonstrated the reliable predictive power of SNMFSMMA.
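The step that turns two similarity matrices into a pair-level similarity can be illustrated with a plain Kronecker product. The sketch below shows only that step, under the assumption that pairs are indexed SM-major; the names `kronecker` and `pair_index` and the toy matrices in the usage note are hypothetical, and the matrix factorization and least-squares steps of SNMFSMMA are not reproduced.

```python
def kronecker(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    out = [[0.0] * (cols_a * cols_b) for _ in range(rows_a * rows_b)]
    for i in range(rows_a):
        for j in range(cols_a):
            for k in range(rows_b):
                for m in range(cols_b):
                    out[i * rows_b + k][j * cols_b + m] = a[i][j] * b[k][m]
    return out


def pair_index(sm, mirna, n_mirna):
    """Row or column of the (sm, mirna) pair in kronecker(sm_sim, mirna_sim)."""
    return sm * n_mirna + mirna
```

With `sm_sim` of size 2 x 2 and `mirna_sim` of size 3 x 3, `kronecker(sm_sim, mirna_sim)` is 6 x 6, and the entry at `[pair_index(0, 1, 3)][pair_index(1, 2, 3)]` equals `sm_sim[0][1] * mirna_sim[1][2]`, that is, the similarity between the pairs (SM 0, miRNA 1) and (SM 1, miRNA 2).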
Affiliation(s)
- Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Jia Qu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| |
|
182
|
Wang CC, Chen X. A Unified Framework for the Prediction of Small Molecule–MicroRNA Association Based on Cross-Layer Dependency Inference on Multilayered Networks. J Chem Inf Model 2019; 59:5281-5293. [DOI: 10.1021/acs.jcim.9b00667] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 12/12/2022]
Affiliation(s)
- Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| |
|
183
|
Wan H, Li JM, Ding H, Lin SX, Tu SQ, Tian XH, Hu JP, Chang S. An Overview of Computational Tools of Nucleic Acid Binding Site Prediction for Site-specific Proteins and Nucleases. Protein Pept Lett 2019; 27:370-384. [PMID: 31746287 DOI: 10.2174/0929866526666191028162302] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 03/31/2019] [Revised: 05/24/2019] [Accepted: 09/24/2019] [Indexed: 12/26/2022]
Abstract
Understanding the interaction mechanism of proteins and nucleic acids is one of the most fundamental problems for genome editing with engineered nucleases. Due to some limitations of experimental investigations, computational methods have played an important role in obtaining the knowledge of protein-nucleic acid interaction. Over the past few years, dozens of computational tools have been used for identification of nucleic acid binding site for site-specific proteins and design of site-specific nucleases because of their significant advantages in genome editing. Here, we review existing widely-used computational tools for target prediction of site-specific proteins as well as off-target prediction of site-specific nucleases. This article provides a list of on-line prediction tools according to their features followed by the description of computational methods used by these tools, which range from various sequence mapping algorithms (like Bowtie, FetchGWI and BLAST) to different machine learning methods (such as Support Vector Machine, hidden Markov models, Random Forest, elastic network and deep neural networks). We also make suggestions on the further development in improving the accuracy of prediction methods. This survey will provide a reference guide for computational biologists working in the field of genome editing.
Affiliation(s)
- Hua Wan
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Jian-Ming Li
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Huang Ding
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Shuo-Xin Lin
- Department of Electrical and Computer Engineering, James Clark School of Engineering, University of Maryland, College Park, MD 20742, United States
| | - Shu-Qin Tu
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Xu-Hong Tian
- College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China
| | - Jian-Ping Hu
- College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, Antibiotics Research and Re-Evaluation Key Laboratory of Sichuan Province, Chengdu University, Chengdu 610106, China
| | - Shan Chang
- Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
| |
|
184
|
Guan NN, Wang CC, Zhang L, Huang L, Li JQ, Piao X. In silico prediction of potential miRNA-disease association using an integrative bioinformatics approach based on kernel fusion. J Cell Mol Med 2019; 24:573-587. [PMID: 31747722 PMCID: PMC6933403 DOI: 10.1111/jcmm.14765] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 03/05/2019] [Revised: 08/13/2019] [Accepted: 09/20/2019] [Indexed: 12/18/2022] Open
Abstract
Accumulating experimental evidence has demonstrated that microRNAs (miRNAs) have a huge impact on numerous critical biological processes and are associated with different complex human diseases. Nevertheless, the task of predicting potential miRNAs related to diseases remains difficult. In this paper, we developed a Kernel Fusion-based Regularized Least Squares for MiRNA-Disease Association prediction model (KFRLSMDA), which applied a kernel fusion technique to fuse similarity matrices and then utilized regularized least squares to predict potential miRNA-disease associations. To prove the effectiveness of KFRLSMDA, we adopted leave-one-out cross-validation (LOOCV) and 5-fold cross-validation and then compared KFRLSMDA with 10 previous computational models (MaxFlow, MiRAI, MIDP, RKNNMDA, MCMDA, HGIMDA, RLSMDA, HDMP, WBSMDA and RWRMDA). Outperforming the other models, KFRLSMDA achieved AUCs of 0.9246 in global LOOCV and 0.8243 in local LOOCV, and an average AUC of 0.9175 ± 0.0008 in 5-fold cross-validation. In addition, 96%, 100% and 90% of the top 50 potential miRNAs for breast neoplasms, colon neoplasms and oesophageal neoplasms, respectively, were confirmed by experimental discoveries. We also predicted potential miRNAs related to hepatocellular cancer after removing all known related miRNAs of this cancer, and 98% of the top 50 potential miRNAs were verified. Furthermore, we predicted potential miRNAs related to lymphoma using the data set in the old version of the HMDD database, and 80% of the top 50 potential miRNAs were confirmed. Therefore, it can be concluded that KFRLSMDA has reliable prediction performance.
Affiliation(s)
- Na-Na Guan
- College of Big Data Statistics, Guizhou University of Finance and Economics, Guiyang, China; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
| | - Chun-Chun Wang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Li Huang
- Academy of Arts and Design, Tsinghua University, Beijing, China; The Future Laboratory, Tsinghua University, Beijing, China
| | - Jian-Qiang Li
- College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
| | - Xue Piao
- School of Medical Informatics, Xuzhou Medical University, Xuzhou, China
| |
|
185
|
Tao L, Yang L, Huang X, Hua F, Yang X. Reconstruction and Analysis of the lncRNA-miRNA-mRNA Network Based on Competitive Endogenous RNA Reveal Functional lncRNAs in Dilated Cardiomyopathy. Front Genet 2019; 10:1149. [PMID: 31803236 PMCID: PMC6873784 DOI: 10.3389/fgene.2019.01149] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 03/15/2019] [Accepted: 10/21/2019] [Indexed: 12/15/2022] Open
Abstract
Dilated cardiomyopathy (DCM) is an important cause of sudden death and heart failure with an unknown etiology. Recent studies have suggested that long non-coding RNA (lncRNA) can interact with microRNA (miRNA) and indirectly interact with mRNA through competitive endogenous RNA (ceRNA) activities. However, the mechanism of ceRNA in DCM remains unclear. In this study, a miRNA array was first performed using heart samples from DCM patients and healthy controls. For further validation, we conducted real-time quantitative reverse transcription (RT)-PCR using samples from DCM patients and a doxorubicin-induced rodent model of cardiomyopathy, revealing that miR-144-3p and miR-451a were down-regulated and miR-21-5p was up-regulated. Based on the ceRNA theory, we constructed a global triple network using data from the National Center for Biotechnology Information Gene Expression Omnibus (NCBI-GEO) and our miRNA array. The lncRNA-miRNA-mRNA network comprised 22 lncRNA nodes, 32 mRNA nodes, and 11 miRNA nodes. Hub nodes and the number of relationship pairs were then analyzed, and the results showed that two lncRNAs (NONHSAT001691 and NONHSAT006358) targeting miR-144/451 were highly related to DCM. Cluster module and random walk with restart analyses of the ceRNA network then identified four lncRNAs (NONHSAT026953/NONHSAT006250/NONHSAT133928/NONHSAT041662) targeting miR-21 that were significantly related to DCM. This study provides a new strategy for research on DCM or other diseases. Furthermore, lncRNA-miRNA pairs may be regarded as candidate diagnostic biomarkers or potential therapeutic targets of DCM.
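Random walk with restart, one of the network analyses named in the abstract, ranks nodes by their proximity to a set of seed nodes. The following is a generic sketch of the technique on an undirected adjacency matrix; the restart probability of 0.3, the convergence tolerance and the function name are assumptions, and nothing here reflects the authors' actual parameters or network.

```python
def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Steady-state visiting probabilities of a walker that restarts at the seeds."""
    n = len(adj)
    # Column-normalise the adjacency matrix into a transition matrix.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    trans = [
        [adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    p0 = [0.0] * n
    for s in seeds:
        p0[s] = 1.0 / len(seeds)
    p = p0[:]
    for _ in range(max_iter):
        # p_next = (1 - r) * W * p + r * p0
        nxt = [
            (1.0 - restart) * sum(trans[i][j] * p[j] for j in range(n))
            + restart * p0[i]
            for i in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p
```

Nodes with higher scores are closer, in the network sense, to the seeds; in a ceRNA setting the seeds might be known disease-related miRNAs and the ranked nodes candidate lncRNAs.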
Affiliation(s)
- Lichan Tao
- Department of Cardiology, The Third Affiliated Hospital of Soochow University, Changzhou, China
| | - Ling Yang
- Department of Cardiology, The Third Affiliated Hospital of Soochow University, Changzhou, China
| | - Xiaoli Huang
- Department of Endocrinology, The Third Affiliated Hospital of Soochow University, Changzhou, China
| | - Fei Hua
- Department of Endocrinology, The Third Affiliated Hospital of Soochow University, Changzhou, China
| | - Xiaoyu Yang
- Department of Cardiology, The Third Affiliated Hospital of Soochow University, Changzhou, China
| |
|
186
|
Hong J, Luo Y, Mou M, Fu J, Zhang Y, Xue W, Xie T, Tao L, Lou Y, Zhu F. Convolutional neural network-based annotation of bacterial type IV secretion system effectors with enhanced accuracy and reduced false discovery. Brief Bioinform 2019; 21:1825-1836. [PMID: 31860715 DOI: 10.1093/bib/bbz120] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 06/23/2019] [Revised: 08/12/2019] [Accepted: 08/21/2019] [Indexed: 12/20/2022] Open
Abstract
The type IV bacterial secretion system (SS) is reported to be one of the most ubiquitous SSs in nature and can induce serious conditions by secreting type IV SS effectors (T4SEs) into host cells. Recent studies mainly focus on annotating new T4SEs from the huge amount of sequencing data, and various computational tools have therefore been developed to accelerate T4SE annotation. However, these tools are reported to depend heavily on the selected methods, and their annotation performance needs to be further enhanced. Herein, a convolutional neural network (CNN) technique was used to annotate T4SEs by integrating multiple protein encoding strategies. First, the annotation accuracies of nine encoding strategies integrated with CNN were assessed and compared with those of popular T4SE annotation tools on an independent benchmark. Second, the false discovery rates of various models were systematically evaluated by (1) scanning the genome of Legionella pneumophila subsp. ATCC 33152 and (2) predicting real-world non-T4SEs validated by published experiments. Based on the above analyses, three encoding strategies, (a) position-specific scoring matrix (PSSM), (b) protein secondary structure & solvent accessibility (PSSSA) and (c) one-hot encoding scheme (Onehot), were identified as performing well when integrated with CNN. Finally, a novel strategy that collectively considers the three well-performing models (CNN-PSSM, CNN-PSSSA and CNN-Onehot) was proposed, and a new tool (CNN-T4SE, https://idrblab.org/cnnt4se/) was constructed to facilitate T4SE annotation. All in all, this study conducted a comprehensive analysis of the performance of a collection of encoding strategies when integrated with CNN, which could facilitate the suppression of T4SS in infection and limit the spread of antimicrobial resistance.
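Of the three encodings named above, one-hot encoding is the simplest to illustrate: each residue becomes a 20-dimensional indicator vector, and sequences are padded or truncated to a fixed length before being fed to a CNN. The sketch below is a generic version; the alphabet order, the zero-vector handling of unknown residues and the fixed `length` argument are assumptions, not details of CNN-T4SE.

```python
# The 20 standard amino acids in an arbitrary but fixed order (an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(seq, length):
    """Encode a protein sequence as a length x 20 matrix of 0/1 values.

    Sequences longer than `length` are truncated; shorter ones are padded
    with all-zero rows. Non-standard residues also map to all-zero rows.
    """
    matrix = []
    for pos in range(length):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(seq):
            idx = INDEX.get(seq[pos].upper())
            if idx is not None:
                row[idx] = 1
        matrix.append(row)
    return matrix
```

The resulting matrix has one row per sequence position, which lets a one-dimensional convolution slide along the sequence axis.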
Affiliation(s)
- Jiajun Hong
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yongchao Luo
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Minjie Mou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Jianbo Fu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| | - Yang Zhang
- School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Weiwei Xue
- School of Pharmaceutical Sciences, Chongqing University, Chongqing 401331, China
| | - Tian Xie
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China
| | - Lin Tao
- Key Laboratory of Elemene Class Anti-Cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou 310036, China
| | - Yan Lou
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou 310000, Zhejiang, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
| |
|
187
|
Long Y, Luo J. WMGHMDA: a novel weighted meta-graph-based model for predicting human microbe-disease association on heterogeneous information network. BMC Bioinformatics 2019; 20:541. [PMID: 31675979 PMCID: PMC6824056 DOI: 10.1186/s12859-019-3066-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/04/2019] [Accepted: 09/02/2019] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND A growing body of biological and clinical evidence indicates that microorganisms are significantly involved in the pathological mechanisms of a wide variety of complex human diseases. Inferring potentially related microbes for diseases can not only promote disease prevention, diagnosis and treatment, but also provide valuable information for drug development. Considering that experimental methods are expensive and time-consuming, developing computational methods is an alternative choice. However, most existing methods are biased towards well-characterized diseases and microbes. Furthermore, existing computational methods are limited in predicting potential microbes for new diseases. RESULTS Here, we developed a novel computational model to predict potential human microbe-disease associations (MDAs) based on Weighted Meta-Graph (WMGHMDA). We first constructed a heterogeneous information network (HIN) by combining the integrated microbe similarity network, the integrated disease similarity network and the known microbe-disease bipartite network. We then iteratively applied a pre-designed Weighted Meta-Graph search algorithm on the HIN to uncover possible microbe-disease pairs, accumulating the contribution values of weighted meta-graphs to the pairs as their probability scores. Depending on contribution potential, we described the contribution degree of different types of meta-graphs to a microbe-disease pair with a bias rating. A meta-graph with a higher bias rating is assigned a greater weight when calculating probability scores. CONCLUSIONS The experimental results showed that WMGHMDA outperformed some state-of-the-art methods, with average AUCs of 0.9288 and 0.9068 ± 0.0031 in global leave-one-out cross-validation (LOOCV) and 5-fold cross-validation (5-fold CV), respectively.
In the case studies, 9, 19 and 37 of the top 10, 20 and 50 candidate microbes for asthma, and 10, 20 and 45 of the top 10, 20 and 50 candidate microbes for inflammatory bowel disease (IBD), were manually verified by previous reports. Furthermore, three common human diseases (Crohn's disease, liver cirrhosis and type 1 diabetes) were adopted to demonstrate that WMGHMDA could be efficiently applied to make predictions for new diseases. In summary, WMGHMDA has a high potential in predicting microbe-disease associations.
Affiliation(s)
- Yahui Long
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, China.
| |
|
188
|
Zhong L, Ming Z, Xie G, Fan C, Piao X. Recent Advances on the Semi-Supervised Learning for Long Non-Coding RNA-Protein Interactions Prediction: A Review. Protein Pept Lett 2019; 27:385-391. [PMID: 31654509 DOI: 10.2174/0929866526666191025104043] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/08/2019] [Revised: 05/30/2019] [Accepted: 09/24/2019] [Indexed: 12/24/2022]
Abstract
In recent years, more and more evidence has indicated that long non-coding RNA (lncRNA) plays a significant role in complex biological processes, especially RNA processing, chromatin modification and cell differentiation, among many others. Notably, lncRNA is closely linked to human diseases such as cancer. Therefore, only by knowing more about the function of lncRNA can we better solve the problems of human diseases. However, lncRNAs need to bind to proteins to perform their biomedical functions, so lncRNA function can be revealed by studying the relationship between lncRNAs and proteins. Because of the limitations of traditional experiments, researchers often use computational models to predict lncRNA-protein interactions. In this review, we summarize several computational models for lncRNA-protein interaction prediction based on semi-supervised learning from the past two years and briefly introduce their advantages and shortcomings. Finally, future research directions for lncRNA-protein interaction prediction are pointed out.
Affiliation(s)
- Lin Zhong
- School of Mathematics, Liaoning University, Shenyang, 110036, China
| | - Zhong Ming
- National Engineering Laboratory for Big Data System Computing Technology, Shenzhen University, Shenzhen, 518060, China; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, 518060, China
| | - Guobo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, 510006, China
| | - Chunlong Fan
- College of Computer Science, Shenyang Aerospace University, Shenyang, 110136, China
| | - Xue Piao
- School of Medical Informatics, Xuzhou Medical University, Xuzhou, 221004, China
| |
|
189
|
Pan X, Shen HB. Inferring Disease-Associated MicroRNAs Using Semi-supervised Multi-Label Graph Convolutional Networks. iScience 2019; 20:265-277. [PMID: 31605942 PMCID: PMC6817654 DOI: 10.1016/j.isci.2019.09.013] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 06/11/2019] [Revised: 09/05/2019] [Accepted: 09/11/2019] [Indexed: 01/22/2023] Open
Abstract
MicroRNAs (miRNAs) play crucial roles in biological processes involved in diseases. The associations between diseases and protein-coding genes (PCGs) have been well investigated, and miRNAs interact with PCGs to trigger them to be functional. We present a computational method, DimiG, to infer miRNA-associated diseases using a semi-supervised Graph Convolutional Network model (GCN). DimiG uses a multi-label framework to integrate PCG-PCG interactions, PCG-miRNA interactions, PCG-disease associations, and tissue expression profiles. DimiG is trained on disease-PCG associations and an interaction network using a GCN, which is further used to score associations between diseases and miRNAs. We evaluate DimiG on a benchmark set from verified disease-miRNA associations. Our results demonstrate that DimiG outperforms the best unsupervised method and is comparable to two supervised methods. Three case studies of prostate cancer, lung cancer, and inflammatory bowel disease further demonstrate the efficacy of DimiG, where top miRNAs predicted by DimiG are supported by literature.
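A single graph convolution layer, the building block of a GCN, can be sketched with the widely used propagation rule H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). The code below assumes that standard rule and uses plain lists for clarity; it is not DimiG's architecture, and the helper names `normalize_adjacency`, `matmul` and `gcn_layer` are hypothetical.

```python
import math


def normalize_adjacency(adj):
    """Symmetrically normalise A + I: D^(-1/2) (A + I) D^(-1/2)."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]  # always >= 1 because of the self-loops
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adj, features, weights):
    """One propagation step: aggregate neighbour features, project, apply ReLU."""
    z = matmul(matmul(normalize_adjacency(adj), features), weights)
    return [[max(0.0, v) for v in row] for row in z]
```

In a multi-label setting such as the one described in the abstract, the final layer would output one score per disease for every node; the training loop and loss are omitted here.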
Affiliation(s)
- Xiaoyong Pan
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China; Department of Medical informatics, Erasmus Medical Center, 3015 CE Rotterdam, the Netherlands.
| | - Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, and Key Laboratory of System Control and Information Processing, Ministry of Education of China, 200240 Shanghai, China.
| |
|
190
|
Identification of key protein-coding genes and lncRNAs in spontaneous neutrophil apoptosis. Sci Rep 2019; 9:15106. [PMID: 31641174 PMCID: PMC6805912 DOI: 10.1038/s41598-019-51597-9] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/31/2019] [Accepted: 10/04/2019] [Indexed: 01/17/2023] Open
Abstract
Polymorphonuclear leukocytes (PMNs) are the most abundant cells of the innate immune system in humans, and spontaneous PMN apoptosis plays crucial roles in maintaining neutrophil homeostasis and resolving inflammation. However, the detailed mechanisms of spontaneous PMN apoptosis remain to be elucidated. By analysis of the public microarray dataset GSE37416, we identified a total of 3050 mRNAs and 220 long non-coding RNAs (lncRNAs) specifically expressed during PMN apoptosis in a time-dependent manner. By short time-series expression miner (STEM) analysis, Gene Ontology analysis, and lncRNA-mRNA co-expression network analyses, we identified some key molecules specifically related to PMN apoptosis. STEM analysis identified 12 gene profiles with statistical significance, including 2 associated with apoptosis. Protein-protein interaction (PPI) network analysis of the genes from these 2 profiles and lncRNA-mRNA co-expression network analysis identified a 12-gene hub (including NFκB1 and BIRC3) associated with apoptosis, as well as 2 highly correlated lncRNAs (THAP9-AS1 and AL021707.6). We experimentally examined the expression profiles of two mRNAs (NFκB1 and BIRC3) and two lncRNAs (THAP9-AS1 and AL021707.6) by quantitative real-time polymerase chain reaction to confirm their time-dependent expression. Together, these data demonstrated that these genes are involved in the regulation of spontaneous neutrophil apoptosis and that the corresponding gene products could also serve as potential key regulatory molecules for PMN apoptosis and/or therapeutic targets for the over-reactive inflammatory response caused by abnormal PMN apoptosis.
|
191
|
Huang Z, Liu L, Gao Y, Shi J, Cui Q, Li J, Zhou Y. Benchmark of computational methods for predicting microRNA-disease associations. Genome Biol 2019; 20:202. [PMID: 31594544 PMCID: PMC6781296 DOI: 10.1186/s13059-019-1811-3] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 05/17/2019] [Accepted: 09/03/2019] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. RESULTS Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform systematic comparison among 36 readily available prediction methods. Their overall performances are evaluated with rigorous precision-recall curve analysis, where 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential of performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in up to 16% and 46% of AUPRC augmentations compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods, which is that the prediction results are severely biased toward well-annotated diseases with many associated miRNAs known and cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. CONCLUSION Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest the future directions for the development of more robust miRNA-disease association predictors.
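The area under the precision-recall curve (AUPRC) used in this benchmark is often estimated with average precision, the mean of the precision values at the ranks where positives occur. The sketch below implements that step-wise estimate; whether the benchmark used this estimator or an interpolated one is not stated in the abstract, so treat the choice, and the function name, as assumptions.

```python
def average_precision(labels, scores):
    """Average precision of a ranking; labels are 0/1, higher score ranks first.

    Ties in score are kept in input order (Python's sort is stable).
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            ap += hits / rank  # precision at this recall step
    return ap / total_pos
```

Unlike AUC, this measure is sensitive to the ratio of positives to negatives, which is one reason precision-recall analysis suits sparse association data where unknown pairs vastly outnumber known ones.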
Affiliation(s)
- Zhou Huang
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Leibo Liu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China
| | - Yuanxu Gao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, 610054, China
| | - Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, 300401, China.
| | - Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, 38 Xueyuan Rd, Beijing, 100191, China.
|
192
|
Ping J, Oyebamiji O, Yu H, Ness S, Chien J, Ye F, Kang H, Samuels D, Ivanov S, Chen D, Zhao YY, Guo Y. MutEx: a multifaceted gateway for exploring integrative pan-cancer genomic data. Brief Bioinform 2019; 21:1479-1486. [PMID: 31588509 DOI: 10.1093/bib/bbz084] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2019] [Revised: 06/03/2019] [Accepted: 06/17/2019] [Indexed: 12/11/2022] Open
Abstract
Somatic mutation and gene expression dysregulation are considered two major tumorigenesis factors. While independent investigations of either factor pervade, studies of associations between somatic mutations and gene expression changes have been sporadic and nonsystematic. Utilizing genomic data collected from 11,315 subjects of 33 distinct cancer types, we constructed MutEx, a pan-cancer integrative genomic database. This database records the relationships among gene expression, somatic mutation and survival data for cancer patients. MutEx can be used to swiftly explore the relationship between these genomic/clinical features within and across cancer types and, more importantly, to search for corroborating evidence for hypothesis inception. Our database also incorporates Gene Ontology and several pathway databases to enhance functional annotation, and elastic net and a gene expression composite score to aid in survival analysis. To demonstrate the usability of MutEx, we provide several application examples, including top somatic mutations associated with the most extensive expression dysregulation in breast cancer, differential mutational burden downstream of DNA mismatch repair gene mutations and composite gene expression score-based survival difference in breast cancer. MutEx can be accessed at http://www.innovebioinfo.com/Databases/Mutationdb_About.php.
Affiliation(s)
- Jie Ping
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, USA, 37232
| | | | - Hui Yu
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, USA, 87109
| | - Scott Ness
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, USA, 87109
| | - Jeremy Chien
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, USA, 87109
| | - Fei Ye
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, USA, 37232
| | - Huining Kang
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, USA, 87109
| | - David Samuels
- Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, USA, 37232
| | - Sergey Ivanov
- Department of Internal Medicine, Vanderbilt University, Nashville, USA, 37232
| | - Danqian Chen
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi'an, Shaanxi 710069, China
| | - Ying-Yong Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi'an, Shaanxi 710069, China
| | - Yan Guo
- Comprehensive Cancer Center, University of New Mexico, Albuquerque, NM, USA, 87109
|
193
|
Gao Y, Jia K, Shi J, Zhou Y, Cui Q. A Computational Model to Predict the Causal miRNAs for Diseases. Front Genet 2019; 10:935. [PMID: 31632446 PMCID: PMC6786093 DOI: 10.3389/fgene.2019.00935] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/11/2019] [Accepted: 09/05/2019] [Indexed: 01/30/2023] Open
Abstract
MicroRNAs (miRNAs) are an important class of noncoding RNA molecules, and their dysfunction is associated with a number of diseases. A series of databases and algorithms have been developed for dissecting human miRNA-disease associations. However, these tools only present the associations between miRNAs and diseases and do not address whether the associations are causal, a key biomedical issue that is critical for understanding the roles of candidate miRNAs in the mechanisms of specific diseases. Here we first manually curated causal miRNA-disease association information and updated the human miRNA disease database (HMDD) accordingly. Then we built a computational model, MDCAP (MiRNA-Disease Causal Association Predictor), to predict novel causal miRNA-disease associations. As a result, we collected 6,667 causal miRNA-disease associations between 616 miRNAs and 440 diseases, which account for ∼20% of the total data in HMDD. The MDCAP model achieved an area under the receiver operating characteristic (ROC) curve of 0.928 in an independent test and 0.925 in 10-fold cross-validation. Finally, case studies conducted on myocardial infarction and hsa-mir-498 further suggested the biomedical significance of the predictions.
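The ROC figures reported in this abstract can be read through a minimal sketch of how an area under the ROC curve is computed. The version below uses the rank interpretation: the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as one half. The function name `roc_auc` and the toy data are hypothetical and do not come from MDCAP.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for a positive (e.g. causal) association, 0 for a negative.
    scores: predicted scores, higher means more likely positive.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("both positive and negative samples are required")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Toy example with made-up scores: three of four positive/negative pairs are ordered correctly.
toy_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

The quadratic pairwise loop is chosen for readability; for large test sets a sort-based rank sum gives the same value more efficiently.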
Affiliation(s)
- Yuanxu Gao
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Kaiwen Jia
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Jiangcheng Shi
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Yuan Zhou
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
| | - Qinghua Cui
- Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Center for Noncoding RNA Medicine, MOE Key Lab of Cardiovascular Sciences, School of Basic Medical Sciences, Peking University, Beijing, China
- Center of Bioinformatics, Key Laboratory for Neuro-Information of Ministry of Education, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
|
194
|
Wu HY, Wei Y, Pan SL. Down-regulation and clinical significance of miR-7-2-3p in papillary thyroid carcinoma with multiple detecting methods. IET Syst Biol 2019; 13:225-233. [PMID: 31538956 PMCID: PMC8687168 DOI: 10.1049/iet-syb.2019.0025] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2019] [Revised: 05/30/2019] [Accepted: 06/10/2019] [Indexed: 04/05/2024] Open
Abstract
Altered miRNA expression participates in the biological progression of thyroid carcinoma and can serve as a diagnostic marker or therapeutic agent. However, the role of miR-7-2-3p is currently unclear. The authors' study was the first investigation of miR-7-2-3p expression level and diagnostic ability in several public databases. Potential target genes were obtained from DIANA Tools, and functional enrichment analysis was then performed. Furthermore, the authors examined expression levels of potential targets in the Human Protein Atlas (HPA) and The Cancer Genome Atlas (TCGA). Finally, the potential transcription factors (TFs) were predicted by JASPAR. TCGA, GSE62054, GSE73182, GSE40807, and GSE55780 revealed that miR-7-2-3p expression in papillary thyroid carcinoma (PTC) tissues was notably lower than in non-tumour tissues, while its expression in E-MTAB-736 showed no remarkable difference. Functional enrichment analysis showed that 698 genes were enriched in pathways including pathways in cancer and glioma. CCND1, GSK3B, and ITGAV, from the pathways in cancer, were inversely correlated with miR-7-2-3p at both the post-transcriptional and protein levels. According to the TF prediction, the prospective upstream TFs of miR-7-2-3p were ISX, SPI1, PRRX1, and BARX1. MiR-7-2-3p was significantly down-regulated and may act on PTC progression through crucial pathways. However, the mechanisms of miR-7-2-3p need further investigation.
Affiliation(s)
- Hua-Yu Wu
- Department of Cell Biology and Genetics, School of Pre-clinical Medicine, Guangxi Medical University, Nanning, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China
| | - Yi Wei
- Department of Pathophysiology, School of Pre-clinical Medicine, Guangxi Medical University, Nanning, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China
| | - Shang-Ling Pan
- Department of Pathophysiology, School of Pre-clinical Medicine, Guangxi Medical University, Nanning, 530021, Guangxi Zhuang Autonomous Region, People's Republic of China.
|
195
|
Cheng L, Zhao H, Wang P, Zhou W, Luo M, Li T, Han J, Liu S, Jiang Q. Computational Methods for Identifying Similar Diseases. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 18:590-604. [PMID: 31678735 PMCID: PMC6838934 DOI: 10.1016/j.omtn.2019.09.019] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2019] [Revised: 09/11/2019] [Accepted: 09/12/2019] [Indexed: 02/01/2023]
Abstract
Although our knowledge of human diseases has increased dramatically, the molecular basis, phenotypic traits, and therapeutic targets of most diseases still remain unclear. An increasing number of studies have observed that similar diseases often are caused by similar molecules, can be diagnosed by similar markers or phenotypes, or can be cured by similar drugs. Thus, the identification of diseases similar to known ones has attracted considerable attention worldwide. To this end, the associations between diseases at the molecular, phenotypic, and taxonomic levels were used to measure the pairwise similarity in diseases. The corresponding performance assessment strategies for these methods involving the terms “category-based,” “simulated-patient-based,” and “benchmark-data-based” were thus further emphasized. Then, frequently used methods were evaluated using a benchmark-data-based strategy. To facilitate the assessment of disease similarity scores, researchers have designed dozens of tools that implement these methods for calculating disease similarity. Currently, disease similarity has been advantageous in predicting noncoding RNA (ncRNA) function and therapeutic drugs for diseases. In this article, we review disease similarity methods, evaluation strategies, tools, and their applications in the biomedical community. We further evaluate the performance of these methods and discuss the current limitations and future trends for calculating disease similarity.
Affiliation(s)
- Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Hengqiang Zhao
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Pingping Wang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Wenyang Zhou
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Meng Luo
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Tianxin Li
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China
| | - Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China.
| | - Shulin Liu
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, Heilongjiang, China; Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada.
| | - Qinghua Jiang
- School of Life Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China.
|
196
|
Shi C, Chen J, Kang X, Zhao G, Lao X, Zheng H. Deep Learning in the Study of Protein-Related Interactions. Protein Pept Lett 2019; 27:359-369. [PMID: 31538879 DOI: 10.2174/0929866526666190723114142] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2019] [Revised: 03/13/2019] [Accepted: 04/05/2019] [Indexed: 11/22/2022]
Abstract
Protein-related interaction prediction is critical to understanding life processes, biological functions, and mechanisms of drug action. Experimental methods used to determine protein-related interactions have always been costly and inefficient. In recent years, advances in biological and medical technology have produced an explosion of biological and physiological data, and deep learning-based algorithms have shown great promise in extracting features and learning patterns from complex data. Deep learning has now emerged in protein research. In this review, we provide an introductory overview of deep neural network theory and its unique properties. We focus mainly on applications of this technology to protein-related interaction prediction over the past five years, including protein-protein, protein-RNA/DNA, and protein-drug interaction prediction, among others. Finally, we discuss some of the challenges that deep learning currently faces.
Affiliation(s)
- Cheng Shi
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Jiaxing Chen
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Xinyue Kang
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Guiling Zhao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Xingzhen Lao
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
| | - Heng Zheng
- School of Life Science and Technology, China Pharmaceutical University, Nanjing 210009, China
|
197
|
Zhang Y, Chen M, Cheng X, Chen Z. LSGSP: a novel miRNA-disease association prediction model using a Laplacian score of the graphs and space projection federated method. RSC Adv 2019; 9:29747-29759. [PMID: 35531537 PMCID: PMC9071959 DOI: 10.1039/c9ra05554a] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2019] [Accepted: 09/09/2019] [Indexed: 12/31/2022] Open
Abstract
Many research findings indicate that miRNAs (microRNAs) are involved in important biological processes and that their mutations and disorders are closely related to diseases; determining the associations between human diseases and miRNAs is therefore key to understanding pathogenic mechanisms. Existing biological experimental methods for identifying miRNA-disease associations are usually expensive and time consuming. Therefore, the development of efficient and reliable computational methods for identifying disease-related miRNAs has become an important topic in the field of biological research in recent years. In this study, we developed a novel miRNA-disease association prediction model using a Laplacian score of the graphs and space projection federated method (LSGSP). This integrates experimentally validated miRNA-disease associations, disease semantic similarity scores, miRNA functional scores, and miRNA family information to build a new disease similarity network and miRNA similarity network, and then obtains the global similarities of these networks by calculating the Laplacian score of the graphs, based on which the miRNA-disease weighted network can be constructed through combination with the miRNA-disease Boolean network. Finally, the miRNA-disease score was obtained by projecting the miRNA space and disease space onto the miRNA-disease weighted network. Evaluated with leave-one-out cross validation (LOOCV) on a benchmark dataset, prediction dataset and compare dataset, and compared with several other state-of-the-art methods, LSGSP showed excellent predictive performance with high AUC values of 0.9221, 0.9745 and 0.9194, respectively. In addition, for prostate neoplasms and lung neoplasms, the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the updated HMDD, miR2Disease, and dbDEMC databases) reached 96% and 100%, respectively. Similarly, for isolated diseases (diseases not associated with any miRNAs), the consistencies between the top 50 predicted miRNAs (obtained from LSGSP) and the results (confirmed from the above-mentioned three databases) reached 98% and 100%, respectively. These results further indicate that LSGSP can effectively predict potential associations between miRNAs and diseases.
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology 541004 Guilin China
| | - Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology 421002 Hengyang China
| | - Xiaohui Cheng
- School of Information Science and Engineering, Guilin University of Technology 541004 Guilin China
| | - Zheng Chen
- School of Computer Science and Technology, Hunan Institute of Technology 421002 Hengyang China
|
198
|
Zhang L, Chen X, Yin J. Prediction of Potential miRNA-Disease Associations Through a Novel Unsupervised Deep Learning Framework with Variational Autoencoder. Cells 2019; 8:cells8091040. [PMID: 31489920 PMCID: PMC6770222 DOI: 10.3390/cells8091040] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/28/2019] [Revised: 08/31/2019] [Accepted: 09/02/2019] [Indexed: 12/22/2022] Open
Abstract
The important role of microRNAs (miRNAs) in the formation, development, diagnosis, and treatment of diseases has attracted much attention among researchers recently. In this study, we present an unsupervised deep learning model of the variational autoencoder for miRNA–disease association prediction (VAEMDA). By combining the integrated miRNA similarity and the integrated disease similarity, respectively, with known miRNA–disease associations, we constructed two spliced matrices. Each matrix was used to train a separate variational autoencoder (VAE). The final predicted association scores between miRNAs and diseases were obtained by integrating the scores from the two trained VAE models. Unlike previous models, VAEMDA can avoid noise introduced by the random selection of negative samples and reveal associations between miRNAs and diseases from the perspective of data distribution. Compared with previous methods, VAEMDA obtained higher areas under the receiver operating characteristic curve (AUCs) of 0.9118, 0.8652, and 0.9091 ± 0.0065 in global leave-one-out cross validation (LOOCV), local LOOCV, and five-fold cross validation, respectively. Further, the AUCs of VAEMDA were 0.8250 and 0.8237 in global leave-one-disease-out cross validation (LODOCV) and local LODOCV, respectively. In three different types of case studies on three important diseases, the results showed that most of the top 50 potentially associated miRNAs were verified by databases and the literature.
Affiliation(s)
- Li Zhang
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
| | - Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
| | - Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China.
|
199
|
Predicting miRNA-Disease Associations by Incorporating Projections in Low-Dimensional Space and Local Topological Information. Genes (Basel) 2019; 10:genes10090685. [PMID: 31500152 PMCID: PMC6770973 DOI: 10.3390/genes10090685] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/15/2019] [Revised: 08/31/2019] [Accepted: 09/03/2019] [Indexed: 12/14/2022] Open
Abstract
Predicting the potential microRNA (miRNA) candidates associated with a disease helps in exploring the mechanisms of disease development. Most recent approaches have utilized heterogeneous information about miRNAs and diseases, including miRNA similarities, disease similarities, and miRNA-disease associations. However, these methods do not utilize the projections of miRNAs and diseases in a low-dimensional space. Thus, it is necessary to develop a method that can utilize the effective information in the low-dimensional space to predict potential disease-related miRNA candidates. We proposed a method based on non-negative matrix factorization, named DMAPred, to predict potential miRNA-disease associations. DMAPred exploits the similarities and associations of diseases and miRNAs, and it integrates local topological information of the miRNA network. The likelihood that a miRNA is associated with a disease also depends on their projections in low-dimensional space. Therefore, we project miRNAs and diseases into low-dimensional feature space to yield their low-dimensional and dense feature representations. Moreover, the sparse characteristic of miRNA-disease associations was introduced to make our predictive model more credible. DMAPred achieved superior performance for 15 well-characterized diseases with AUCs (area under the receiver operating characteristic curve) ranging from 0.860 to 0.973 and AUPRs (area under the precision-recall curve) ranging from 0.118 to 0.761. In addition, case studies on breast, prostatic, and lung neoplasms demonstrated the ability of DMAPred to discover potential disease-related miRNAs.
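The idea of projecting miRNAs and diseases into a shared low-dimensional space by non-negative matrix factorization can be sketched as follows. This is plain NMF with the standard multiplicative update rules, not the DMAPred model itself: it omits the similarity, local-topology, and sparsity terms the abstract describes. The function name `nmf`, the toy association matrix, and all parameter values are assumptions made for illustration.

```python
import numpy as np


def nmf(A, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix A (miRNAs x diseases) as W @ H.

    W holds one low-dimensional row vector per miRNA, H one column vector
    per disease. Uses multiplicative updates that minimise ||A - W H||_F.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = A.shape
    W = rng.random((n_rows, rank)) + eps
    H = rng.random((rank, n_cols)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry of W and H non-negative.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy binary association matrix (made-up data): 3 miRNAs x 4 diseases.
A_toy = np.array([[1, 1, 0, 0],
                  [1, 1, 0, 0],
                  [0, 0, 1, 1]], dtype=float)
W_toy, H_toy = nmf(A_toy, rank=2)
scores_toy = W_toy @ H_toy  # reconstructed scores; unobserved pairs can be ranked by these
```

In a prediction setting, entries of `scores_toy` at positions where `A_toy` is zero would serve as candidate association scores, which is the general way factorization-based predictors rank unobserved pairs.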
|
200
|
Wiczling P, Daghir-Wojtkowiak E, Kaliszan R, Markuszewski MJ, Limon J, Koczkowska M, Stukan M, Kuźniacka A, Ratajska M. Bayesian multilevel model of micro RNA levels in ovarian-cancer and healthy subjects. PLoS One 2019; 14:e0221764. [PMID: 31465488 PMCID: PMC6715278 DOI: 10.1371/journal.pone.0221764] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2018] [Accepted: 08/14/2019] [Indexed: 12/31/2022] Open
Abstract
In transcriptomics, microRNAs (miRNAs) have gained much interest, especially as potential disease indicators. However, despite their great promise for clinical application, many inconsistent results have been published. Our aim was to compare miRNA expression levels in ovarian cancer patients and healthy subjects using a Bayesian multilevel model and to assess their potential usefulness in diagnosis. We analyzed case-control observational data on expression profiling of 49 preselected miRNA-based ovarian cancer indicators in 119 controls and 59 patients. A Bayesian multilevel model was used to characterize the effect of disease on miRNA levels, controlling for differences in age and body weight. The difference in miRNA level by health status, expressed on the scale of the data variability, was discussed in the context of its potential usefulness in diagnosis. Additionally, the cross-validated area under the ROC curve (AUC) was used to assess the expected out-of-sample discrimination index of different sets of miRNAs. The proposed model allowed us to describe the set of miRNA levels in patients and controls. Three highly correlated miRNAs, miR-101-3p, miR-142-5p, and miR-148a-3p, rank the highest, with almost identical effect sizes that range from 0.45 to 1.0. For those miRNAs the credible interval for AUC ranged from 0.63 to 0.67, indicating their limited discrimination potential. Little benefit from adding information from other miRNAs was observed. There were several miRNAs in the dataset (miR-604, hsa-miR-221-5p) for which inferences were uncertain. For those miRNAs, more experimental effort is needed to fully assess their effect in the context of new hit discovery and usefulness as disease indicators. The proposed multilevel Bayesian model can be used to characterize a panel of miRNA profiles and to assess the difference in expression levels between healthy and cancer individuals.
Affiliation(s)
- Paweł Wiczling
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Emilia Daghir-Wojtkowiak
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Roman Kaliszan
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Michał Jan Markuszewski
- Department of Biopharmaceutics and Pharmacodynamics, Medical University of Gdańsk, Gen. J. Hallera, Gdańsk, Poland
| | - Janusz Limon
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Magdalena Koczkowska
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Maciej Stukan
- Department of Gynecological Oncology, Gdynia Oncology Centre, Powstania Styczniowego, Gdynia, Poland
| | - Alina Kuźniacka
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
| | - Magdalena Ratajska
- Department of Biology and Genetics, Medical University of Gdańsk, Dębinki, Gdańsk, Poland
|