51
Wu QW, Xia JF, Ni JC, Zheng CH. GAERF: predicting lncRNA-disease associations by graph auto-encoder and random forest. Brief Bioinform 2021; 22:6067881. [PMID: 33415333 DOI: 10.1093/bib/bbaa391] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 09/23/2020] [Revised: 11/26/2020] [Accepted: 11/30/2020] [Indexed: 12/11/2022] Open
Abstract
Predicting disease-related long non-coding RNAs (lncRNAs) helps identify new biomarkers for the prevention, diagnosis and treatment of complex human diseases. In this paper, we propose a machine learning-based classification approach, named GAERF, that identifies disease-related lncRNAs using a graph auto-encoder (GAE) and a random forest (RF). First, we combined the relationships among lncRNAs, miRNAs and diseases into a heterogeneous network. Then, low-dimensional representation vectors of the nodes were learned from the network by the GAE, which reduces the dimension and heterogeneity of the biological data. Taking these feature vectors as input, we trained an RF classifier to predict new lncRNA-disease associations (LDAs). Experimental results show that the proposed representation characterizes lncRNA-disease pairs accurately. GAERF achieves superior performance owing to the ensemble learning method, significantly outperforming other methods. Moreover, case studies further demonstrated that GAERF is an effective method for predicting LDAs.
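The first step described in this abstract, combining lncRNA, miRNA and disease relationships into one heterogeneous network, can be pictured as a single symmetric adjacency matrix over all node types. The following is a minimal sketch under stated assumptions: the node names and associations are made up, and the helper `build_heterogeneous_adjacency` is hypothetical, not the authors' implementation.

```python
def build_heterogeneous_adjacency(lncrnas, mirnas, diseases, edges):
    """Return (nodes, matrix) for an undirected heterogeneous network.

    nodes  -- all node names, ordered lncRNAs, then miRNAs, then diseases
    matrix -- symmetric 0/1 adjacency matrix as a list of lists
    edges  -- iterable of (name_a, name_b) association pairs of any type
    """
    nodes = list(lncrnas) + list(mirnas) + list(diseases)
    index = {name: i for i, name in enumerate(nodes)}
    n = len(nodes)
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        i, j = index[a], index[b]
        matrix[i][j] = 1
        matrix[j][i] = 1  # associations are treated as undirected
    return nodes, matrix


# Toy data (hypothetical identifiers, not taken from the paper).
lncrnas = ["lnc1", "lnc2"]
mirnas = ["mir1"]
diseases = ["dis1", "dis2"]
edges = [
    ("lnc1", "dis1"),  # lncRNA-disease association
    ("lnc1", "mir1"),  # lncRNA-miRNA interaction
    ("mir1", "dis2"),  # miRNA-disease association
    ("lnc2", "dis2"),  # lncRNA-disease association
]

nodes, adjacency = build_heterogeneous_adjacency(lncrnas, mirnas, diseases, edges)
for name, row in zip(nodes, adjacency):
    print(name, row)
```

In a pipeline of the kind the abstract outlines, a matrix like this would be the input from which an auto-encoder learns low-dimensional node vectors; that learning step is not shown here.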
Affiliation(s)
- Qing-Wen Wu
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
- Jun-Feng Xia
  - Institute of Physical Science and Information Technology, Anhui University, Hefei, China
- Jian-Cheng Ni
  - School of Cyber Science and Engineering, Qufu Normal University, Qufu, China
- Chun-Hou Zheng
  - Key Lab of Intelligent Computing and Signal Processing of Ministry of Education, College of Computer Science and Technology, Anhui University, Hefei, China
52
Yang F, Zhang J, Li B, Zhao Z, Liu Y, Zhao Z, Jing S, Wang G. Identification of Potential lncRNAs and miRNAs as Diagnostic Biomarkers for Papillary Thyroid Carcinoma Based on Machine Learning. Int J Endocrinol 2021; 2021:3984463. [PMID: 34335744 PMCID: PMC8318749 DOI: 10.1155/2021/3984463] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/07/2021] [Revised: 07/06/2021] [Accepted: 07/12/2021] [Indexed: 12/28/2022] Open
Abstract
BACKGROUND Papillary thyroid carcinoma (PTC) accounts for the majority of thyroid cancer (TC) cases. The objective of this study was to identify diagnostic, differentially expressed long noncoding RNAs (lncRNAs) and microRNAs (miRNAs), contributing to the understanding of the epigenetic mechanisms of PTC. METHODS lncRNA, miRNA, and mRNA data were downloaded from The Cancer Genome Atlas (TCGA) dataset, followed by functional analysis of differentially expressed mRNAs. Optimal diagnostic lncRNA and miRNA biomarkers were identified via random forest. The regulatory networks between optimal diagnostic lncRNAs and mRNAs and between optimal diagnostic miRNAs and mRNAs were identified, followed by the construction of a lncRNA-mRNA-miRNA ceRNA network. Expression validation and diagnostic analysis of lncRNAs, miRNAs, and mRNAs were performed. ADD3-AS1 was overexpressed in PTC-UC3 cell lines, and cell proliferation and invasion assays were used to investigate the role of ADD3-AS1 in PTC. RESULTS A total of 107 differentially expressed lncRNAs, 81 differentially expressed miRNAs, and 515 differentially expressed mRNAs were identified. Eleven lncRNAs and 6 miRNAs were regarded as the optimal diagnostic biomarkers for PTC. The epigenetic modifications via the above diagnostic lncRNAs and miRNAs were identified, including MIR181A2HG-FOXP2-hsa-miR-146b-3p, BLACAT1/ST7-AS1-RPS6KA5-hsa-miR-34a-5p, LBX2-AS1/MIR100HG-CDHR3-hsa-miR-34a-5p, ADD3-AS1-PTPRE-hsa-miR-9-5p, ADD3-AS1-TGFBR1-hsa-miR-214-3p, LINC00506-MMRN1-hsa-miR-4709-3p, and LOC339059-STK32A-hsa-miR-199b-5p. In the functional analysis, MMRN1 and TGFBR1 were involved in cell adhesion and endothelial cell migration, respectively. Overexpression of ADD3-AS1 inhibited cell growth and invasion in PTC cell lines. CONCLUSION The identified lncRNAs, miRNAs, and mRNAs were differentially expressed between normal and cancerous tissues. In addition, the identified altered lncRNAs and miRNAs may be potential diagnostic biomarkers for PTC. Additionally, epigenetic modifications via the above lncRNAs and miRNAs may be involved in tumorigenesis of PTC.
Affiliation(s)
- Fei Yang
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Jie Zhang
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Baokun Li
  - General Surgical Department, The Fourth Hospital of Hebei Medical University, Hebei, China
- Zhijun Zhao
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Yan Liu
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Zhen Zhao
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Shanghua Jing
  - Department of Otolaryngology-Head and Neck Surgery, The Fourth Hospital of Hebei Medical University, Hebei, China
- Guiying Wang
  - General Surgical Department, The Fourth Hospital of Hebei Medical University, Hebei, China
  - General Surgical Department, The Third Hospital of Hebei Medical University, Hebei, China
53
Xiao Y, Xiao Z, Feng X, Chen Z, Kuang L, Wang L. A novel computational model for predicting potential LncRNA-disease associations based on both direct and indirect features of LncRNA-disease pairs. BMC Bioinformatics 2020; 21:555. [PMID: 33267800 PMCID: PMC7709313 DOI: 10.1186/s12859-020-03906-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/22/2019] [Accepted: 11/25/2020] [Indexed: 12/25/2022] Open
Abstract
Background Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and knowing the relationships between lncRNAs and diseases is useful for the diagnosis and treatment of diseases. Because traditional biological experiments are costly and time-consuming, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods still have various limitations. Results In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which guarantees that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which guarantees that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV, and LOOCV, respectively. Furthermore, in case studies of gastric cancer, leukemia and lung cancer, 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion The simulation results show that FVTLDA has good prediction performance, which is a good supplement to future bioinformatics research.
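The AUC figures quoted in this abstract are computed over cross-validation folds. As a rough illustration of what those numbers measure, the sketch below computes AUC as the fraction of (positive, negative) pairs that a scorer ranks correctly, and splits sample indices into k folds. The labels and scores are made up, and the helper names `auc_score` and `k_fold_indices` are hypothetical; this is not the evaluation code used in the paper.

```python
def auc_score(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                correct += 1.0
            elif p == q:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous, near-equal test folds."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


# Toy predictions (hypothetical values).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.6, 0.7, 0.2, 0.8]
print(auc_score(labels, scores))
print(k_fold_indices(10, 5))
```

Leave-one-out cross validation corresponds to `k_fold_indices(n, n)`, where every fold holds a single sample. Practical pipelines usually shuffle the samples before splitting, which this sketch omits.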
Affiliation(s)
- Yubin Xiao
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China
- Zheng Xiao
  - Hunan Province Key Laboratory of Tumor Cellular and Molecular Pathology, Cancer Research Institute, University of South China, Hengyang, 421001, Hunan, People's Republic of China
- Xiang Feng
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
- Zhiping Chen
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
- Linai Kuang
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China
- Lei Wang
  - College of Computer Engineering and Applied Mathematics, Changsha University, Changsha, 410001, People's Republic of China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, 411105, People's Republic of China
54
Kashyap MP, Sinha R, Mukhtar MS, Athar M. Epigenetic regulation in the pathogenesis of non-melanoma skin cancer. Semin Cancer Biol 2020; 83:36-56. [PMID: 33242578 DOI: 10.1016/j.semcancer.2020.11.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 10/09/2020] [Revised: 11/17/2020] [Accepted: 11/18/2020] [Indexed: 02/07/2023]
Abstract
Understanding of cancer with the help of ever-expanding cutting edge technological tools and bioinformatics is revolutionizing modern cancer research by broadening the space of discovery window of various genomic and epigenomic processes. Genomics data integrated with multi-omics layering have advanced cancer research. Uncovering such layers of genetic mutations/modifications, epigenetic regulation and their role in the complex pathophysiology of cancer progression could lead to novel therapeutic interventions. Although a plethora of literature is available in public domain defining the role of various tumor driver gene mutations, understanding of epigenetic regulation of cancer is still emerging. This review focuses on epigenetic regulation association with the pathogenesis of non-melanoma skin cancer (NMSC). NMSC has higher prevalence in Caucasian populations compared to other races. Due to lack of proper reporting to cancer registries, the incidence rates for NMSC worldwide cannot be accurately estimated. However, this is the most common neoplasm in humans, and millions of new cases per year are reported in the United States alone. In organ transplant recipients, the incidence of NMSC particularly of squamous cell carcinoma (SCC) is very high and these SCCs frequently become metastatic and lethal. Understanding of solar ultraviolet (UV) light-induced damage and impaired DNA repair process leading to DNA mutations and nuclear instability provide an insight into the pathogenesis of metastatic neoplasm. This review discusses the recent advances in the field of epigenetics of NMSCs. Particularly, the role of DNA methylation, histone hyperacetylation and non-coding RNA such as long-chain noncoding (lnc) RNAs, circular RNAs and miRNA in the disease progression are summarized.
Affiliation(s)
- Mahendra Pratap Kashyap
  - UAB Research Center of Excellence in Arsenicals, Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Rajesh Sinha
  - UAB Research Center of Excellence in Arsenicals, Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- M Shahid Mukhtar
  - Department of Biology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
- Mohammad Athar
  - UAB Research Center of Excellence in Arsenicals, Department of Dermatology, University of Alabama at Birmingham, Birmingham, AL 35294, USA
55
Ji BY, You ZH, Chen ZH, Wong L, Yi HC. NEMPD: a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. BMC Bioinformatics 2020; 21:401. [PMID: 32912137 PMCID: PMC7646193 DOI: 10.1186/s12859-020-03716-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/10/2020] [Accepted: 08/19/2020] [Indexed: 12/25/2022] Open
Abstract
Background As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of Human diseases. Hence, identification of potential miRNA-disease associations can make great contributions to the research and treatment of Human diseases. However, to our knowledge, many existing computational methods only utilize the single type of known association information between miRNAs and diseases to predict their potential associations, without focusing on their interactions or associations with other types of molecules. Results In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. Firstly, a heterogeneous network is constructed by integrating known associations among miRNA, protein and disease, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is implemented to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with the attribute information of them to represent miRNA-disease association pairs. Finally, the prediction model is established based on the Random Forest algorithm. Under the five-fold cross validation, the proposed NEMPD model obtained average 85.41% prediction accuracy with 80.96% sensitivity at the AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by the case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms), 47 (lung neoplasms) were confirmed by two other databases. Conclusions The proposed NEMPD model has a good performance in predicting the potential associations between miRNAs and diseases, and has great potency in the field of miRNA-disease association prediction in the future.
Affiliation(s)
- Bo-Ya Ji
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhan-Heng Chen
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Leon Wong
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
- Hai-Cheng Yi
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
  - University of Chinese Academy of Sciences, Beijing, 100049, China
56
Zhang Y, Ye F, Xiong D, Gao X. LDNFSGB: prediction of long non-coding rna and disease association using network feature similarity and gradient boosting. BMC Bioinformatics 2020; 21:377. [PMID: 32883200 PMCID: PMC7469344 DOI: 10.1186/s12859-020-03721-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 05/25/2020] [Accepted: 08/21/2020] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND A large number of experimental studies show that the mutation and regulation of long non-coding RNAs (lncRNAs) are associated with various human diseases. Accurate prediction of lncRNA-disease associations can provide a new perspective for the diagnosis and treatment of diseases. The main function of many lncRNAs is still unclear and using traditional experiments to detect lncRNA-disease associations is time-consuming. RESULTS In this paper, we develop a novel and effective method for the prediction of lncRNA-disease associations using network feature similarity and gradient boosting (LDNFSGB). In LDNFSGB, we first construct a comprehensive feature vector to effectively extract the global and local information of lncRNAs and diseases through considering the disease semantic similarity (DISSS), the lncRNA function similarity (LNCFS), the lncRNA Gaussian interaction profile kernel similarity (LNCGS), the disease Gaussian interaction profile kernel similarity (DISGS), and the lncRNA-disease interaction (LNCDIS). Particularly, two methods are used to calculate the DISSS (LNCFS) for considering the local and global information of disease semantics (lncRNA functions) respectively. An autoencoder is then used to reduce the dimensionality of the feature vector to obtain the optimal feature parameter from the original feature set. Furthermore, we employ the gradient boosting algorithm to obtain the lncRNA-disease association prediction. CONCLUSIONS In this study, hold-out, leave-one-out cross-validation, and ten-fold cross-validation methods are implemented on three publicly available datasets to evaluate the performance of LDNFSGB. Extensive experiments show that LDNFSGB dramatically outperforms other state-of-the-art methods. The case studies on six diseases, including cancers and non-cancers, further demonstrate the effectiveness of our method in real-world applications.
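The abstract names Gaussian interaction profile kernel similarity (LNCGS, DISGS) without defining it. A commonly used definition, assumed here and not confirmed by the abstract, is $K(i,j) = \exp(-\gamma \lVert IP_i - IP_j \rVert^2)$ with $\gamma = \gamma' / \big(\tfrac{1}{n}\sum_k \lVert IP_k \rVert^2\big)$, where $IP_i$ is row $i$ of the binary association matrix. The sketch below applies that definition to a made-up matrix; the function name `gip_kernel` and the default `gamma_prime=1.0` are illustrative choices.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel similarity between rows.

    profiles    -- list of 0/1 interaction profiles (one row per entity)
    gamma_prime -- bandwidth factor; 1.0 is an assumed default
    """
    n = len(profiles)
    # Normalise the bandwidth by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy lncRNA-disease association matrix (rows: lncRNAs, columns: diseases).
associations = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
lnc_similarity = gip_kernel(associations)
for row in lnc_similarity:
    print([round(value, 4) for value in row])
```

Passing the transposed matrix to the same function would give the corresponding disease-disease similarity.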
Affiliation(s)
- Yuan Zhang
  - School of Mathematics and Computational Science, Xiangtan University, Xiangtan 411105, China
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- Fei Ye
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
- Dapeng Xiong
  - Department of Computational Biology, Ithaca, New York 14853, USA
  - Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, New York 14853, USA
- Xieping Gao
  - Key Laboratory of Intelligent Computing and Information Processing of Ministry of Education, Xiangtan University, Xiangtan 411105, China
  - College of Medical Imaging and Inspection, Xiangnan University, Chenzhou 423000, China
57
Zeng M, Lu C, Zhang F, Li Y, Wu FX, Li Y, Li M. SDLDA: lncRNA-disease association prediction based on singular value decomposition and deep learning. Methods 2020; 179:73-80. [DOI: 10.1016/j.ymeth.2020.05.002] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 02/10/2020] [Revised: 04/24/2020] [Accepted: 05/02/2020] [Indexed: 12/20/2022] Open
58
Shi C, Cao J, Shi T, Liang M, Ding C, Lv Y, Zhang W, Li C, Gao W, Wu G, Man J. BRAF V600E mutation, BRAF-activated long non-coding RNA and miR-9 expression in papillary thyroid carcinoma, and their association with clinicopathological features. World J Surg Oncol 2020; 18:145. [PMID: 32593310 PMCID: PMC7321545 DOI: 10.1186/s12957-020-01923-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/30/2020] [Accepted: 06/18/2020] [Indexed: 12/15/2022] Open
Abstract
Background The incidence of thyroid cancer is increasing worldwide. This study investigated the association of B-type RAF kinase (BRAF)V600E mutation status, the expression of BRAF-activated long non-coding RNA (BANCR) and microRNA miR-9, and the clinicopathological features of papillary thyroid carcinoma (PTC). Methods Clinicopathological data for PTC patients (n = 51) diagnosed and treated between 2018 and 2019 were collected. Carcinoma and adjacent normal tissue samples were analyzed for the presence of the BRAFV600E mutation and/or expression of BANCR and miR-9. Results Larger tumor, higher rate of bilateral tumors and multifocality, extracapsular invasion, and lateral lymph node metastasis (LNM) were observed in PTC patients with BRAF V600E mutation. Patients with higher BANCR expression had a higher rate of extracapsular invasion and lateral LNM in carcinoma tissue and a lower frequency of bilateral tumors and multifocality in normal adjacent tissue. Patients with higher miR-9 expression had a lower rate of central and lateral LNM in carcinoma tissue and higher rates of bilateral tumor location and multifocality in normal adjacent tissue. Patients with BRAFV600E mutation have a higher rate of BANCR overexpression and tended to have a lower rate of miR-9 overexpression (P = 0.057), and a negative association was observed between BANCR and miR-9 expression in carcinoma tissue. Conclusions BRAFV600E mutation and the BANCR and miR-9 expression were closely associated with the tumor size, bilateral tumor location, multifocality, extracapsular invasion, and lateral LNM. PTC patients with these clinicopathological characteristics, BRAFV600E mutation, and high BANCR expression and low miR-9 expression needed earlier surgical treatment and are recommended for total thyroidectomy in primary surgery for reducing the risk of recurrence. These findings provide new insight into the molecular basis for PTC and can inform strategies for the management of PTC.
Affiliation(s)
- Chenlei Shi
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Jia Cao
  - The Department of Head and Neck Surgery, General Hospital of Heilongjiang Province Land Reclamation Bureau, Harbin, 150088, Heilongjiang Province, China
- Tiefeng Shi
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Meihua Liang
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Chao Ding
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Yichen Lv
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Weifeng Zhang
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Chuanle Li
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Wenchao Gao
  - The Fourth Department of General Surgery, Harbin First Hospital Affiliated to Harbin Institute of Technology, Harbin, 150010, Heilongjiang Province, China
- Gang Wu
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
- Jianting Man
  - The Fourth Department of General Surgery, the Second Affiliated Hospital, Harbin Medical University, 246 Xuefu Road, Harbin, 150001, Heilongjiang Province, China
59
Yi HC, You ZH, Huang DS, Guo ZH, Chan KCC, Li Y. Learning Representations to Predict Intermolecular Interactions on Large-Scale Heterogeneous Molecular Association Network. iScience 2020; 23:101261. [PMID: 32580123 PMCID: PMC7317230 DOI: 10.1016/j.isci.2020.101261] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 12/03/2019] [Revised: 04/29/2020] [Accepted: 06/08/2020] [Indexed: 02/07/2023] Open
Abstract
Molecular components that are functionally interdependent in human cells constitute molecular association networks. Disease can be caused by disturbance of multiple molecular interactions. New biomolecular regulatory mechanisms can be revealed by discovering new biomolecular interactions. To this end, a heterogeneous molecular association network is formed by systematically integrating comprehensive associations between miRNAs, lncRNAs, circRNAs, mRNAs, proteins, drugs, microbes, and complex diseases. We propose a machine learning method for predicting intermolecular interactions, named MMI-Pred. More specifically, a network embedding model is developed to fully exploit the network behavior of biomolecules, and attribute features are also calculated. Then, these discriminative features are combined to train a random forest classifier to predict intermolecular interactions. MMI-Pred achieves an outstanding performance of 93.50% accuracy in hybrid associations prediction under 5-fold cross-validation. This work provides systematic landscape and machine learning method to model and infer complex associations between various biological components.
Affiliation(s)
- Hai-Cheng Yi
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Zhu-Hong You
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- De-Shuang Huang
  - Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
- Zhen-Hao Guo
  - Xinjiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Keith C C Chan
  - Department of Computing, Hong Kong Polytechnic University, Hong Kong SAR 999077, China
- Yangming Li
  - College of Engineering Technology, Rochester Institute of Technology, Rochester, NY 14623, USA
60
Li J, Ma X, Li X, Gu J. PPAI: a web server for predicting protein-aptamer interactions. BMC Bioinformatics 2020; 21:236. [PMID: 32517696 PMCID: PMC7285591 DOI: 10.1186/s12859-020-03574-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/03/2019] [Accepted: 05/28/2020] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND The interactions between proteins and aptamers are prevalent in organisms and play an important role in various life activities. Thanks to the rapid accumulation of protein-aptamer interaction data, it is both necessary and feasible to construct an accurate and effective computational model that predicts aptamers binding to proteins of interest and predicts protein-aptamer interactions, which is beneficial for understanding the mechanisms of protein-aptamer interactions and improving aptamer-based therapies. RESULTS In this study, a novel web server named PPAI is developed to predict aptamers and protein-aptamer interactions using key sequence features of proteins and aptamers and a machine learning framework that integrates AdaBoost and random forest. A new method for extracting several key sequence features of both proteins and aptamers is presented, where the features for proteins are extracted from amino acid composition, pseudo-amino acid composition, grouped amino acid composition, C/T/D composition and sequence-order-coupling number, while the features for aptamers are extracted from nucleotide composition, pseudo-nucleotide composition (PseKNC) and normalized Moreau-Broto autocorrelation coefficient. Using these feature sets, and after balancing the samples with the SMOTE algorithm, we validate the performance of PPAI on an independent test set. The results demonstrate that the Area Under Curve (AUC) is 0.907 for the prediction of aptamers, while the AUC reaches 0.871 for the prediction of protein-aptamer interactions. CONCLUSION These results indicate that PPAI can query aptamers and proteins, predict aptamers and predict protein-aptamer interactions in batch mode precisely and efficiently, making it a novel bioinformatics tool for the research of protein-aptamer interactions. The PPAI web server is freely available at http://39.96.85.9/PPAI.
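Of the sequence features listed in this abstract, amino acid composition is the simplest: the relative frequency of each of the 20 standard residues in a protein. The sketch below shows that one feature only, on a made-up sequence; the function name `amino_acid_composition` and the choice to ignore non-standard characters are assumptions for illustration, not the PPAI implementation.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(sequence):
    """Return the 20 amino acid frequencies of a protein sequence.

    Characters outside the 20 standard residues are ignored.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for residue in sequence.upper():
        if residue in counts:
            counts[residue] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard residues")
    # Frequencies follow the fixed order of AMINO_ACIDS.
    return [counts[aa] / total for aa in AMINO_ACIDS]


features = amino_acid_composition("MKKLAVA")  # made-up sequence
for aa, freq in zip(AMINO_ACIDS, features):
    if freq > 0:
        print(aa, round(freq, 3))
```

A fixed-length vector like this can be concatenated with other descriptors before a classifier is trained, regardless of the original sequence length.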
Affiliation(s)
- Jianwei Li
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
  - Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin, China
- Xiaoyu Ma
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Xichuan Li
  - Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
- Junhua Gu
  - Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
61
Guo ZH, You ZH, Wang YB, Huang DS, Yi HC, Chen ZH. Bioentity2vec: Attribute- and behavior-driven representation for predicting multi-type relationships between bioentities. Gigascience 2020; 9:giaa032. [PMID: 32533701 PMCID: PMC7293023 DOI: 10.1093/gigascience/giaa032] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 11/07/2019] [Revised: 01/06/2020] [Accepted: 03/13/2020] [Indexed: 01/14/2023] Open
Abstract
BACKGROUND The explosive growth of genomic, chemical, and pathological data provides new opportunities and challenges for humans to thoroughly understand life activities in cells. However, there exist few computational models that aggregate various bioentities to comprehensively reveal the physical and functional landscape of biological systems. RESULTS We constructed a molecular association network, which contains 18 edges (relationships) between 8 nodes (bioentities). Based on this, we propose Bioentity2vec, a new method for representing bioentities, which integrates information about the attributes and behaviors of a bioentity. Applying the random forest classifier, we achieved promising performance on 18 relationships, with an area under the curve of 0.9608 and an area under the precision-recall curve of 0.9572. CONCLUSIONS Our study shows that constructing a network with rich topological and biological information is important for systematic understanding of the biological landscape at the molecular level. Our results show that Bioentity2vec can effectively represent biological entities and provides easily distinguishable information about classification tasks. Our method is also able to simultaneously predict relationships between single types and multiple types, which will accelerate progress in biological experimental research and industrial product development.
Affiliation(s)
- Zhen-Hao Guo
- XinJiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, No. 40-1, Beijing South Road, Urumqi, Xinjiang, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhu-Hong You
- XinJiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, No. 40-1, Beijing South Road, Urumqi, Xinjiang, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yan-Bin Wang
- School of Cyber Science and Technology, Zhejiang University, Hangzhou 310000, Zhejiang, China
- De-Shuang Huang
- Computer Science Department, Tongji University, Shanghai 200000, China
- Hai-Cheng Yi
- XinJiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, No. 40-1, Beijing South Road, Urumqi, Xinjiang, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Zhan-Heng Chen
- XinJiang Laboratory of Minority Speech and Language Information Processing, Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, No. 40-1, Beijing South Road, Urumqi, Xinjiang, China
- University of Chinese Academy of Sciences, Beijing 100049, China
|
62
|
Wang CC, Zhao Y, Chen X. Drug-pathway association prediction: from experimental results to computational models. Brief Bioinform 2020; 22:5835554. [PMID: 32393976 DOI: 10.1093/bib/bbaa061] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 02/13/2020] [Revised: 03/16/2020] [Accepted: 03/26/2020] [Indexed: 12/14/2022] Open
Abstract
Effective drugs are urgently needed to overcome complex human diseases. However, the research and development of a novel drug takes a long time and costs a great deal of money. Traditional drug discovery follows the rule of one drug-one target, while some studies have demonstrated that drugs generally perform their task by affecting related pathways rather than a single target. Thus, a new strategy of drug discovery, namely pathway-based drug discovery, has been proposed. Identifying associations between drugs and pathways plays a key role in the development of pathway-based drug discovery. Revealing drug-pathway associations by experimental methods is time-consuming and costly. Therefore, some computational models have been established to predict potential drug-pathway associations. In this review, we first introduce the background of drugs and the concept of drug-pathway associations. Then, we list some publicly accessible databases and web servers about drug-pathway associations. Next, we summarize some state-of-the-art computational methods from recent years for inferring drug-pathway associations and divide these methods into three classes, namely Bayesian sparse factor-based, matrix decomposition-based and other machine learning methods. In addition, we introduce several evaluation strategies to estimate the predictive performance of various computational models. Finally, we discuss the advantages and limitations of existing computational methods and provide some suggestions about future directions for data collection and the development of calculation models.
|
63
|
A random forest based computational model for predicting novel lncRNA-disease associations. BMC Bioinformatics 2020; 21:126. [PMID: 32216744 PMCID: PMC7099795 DOI: 10.1186/s12859-020-3458-1] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 12/19/2019] [Accepted: 03/18/2020] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Accumulated evidence shows that the abnormal regulation of long non-coding RNA (lncRNA) is associated with various human diseases. Accurately identifying disease-associated lncRNAs helps to study the mechanisms of lncRNAs in diseases and to explore new therapies. Many lncRNA-disease association (LDA) prediction models have been implemented by integrating multiple kinds of data resources. However, most of the existing models ignore the interference of noisy and redundant information among these data resources. RESULTS To improve the ability of LDA prediction models, we implemented a random forest and feature selection based LDA prediction model (RFLDA for short). First, the RFLDA integrates the experiment-supported miRNA-disease associations (MDAs) and LDAs, the disease semantic similarity (DSS), the lncRNA functional similarity (LFS) and the lncRNA-miRNA interactions (LMI) as input features. Then, the RFLDA chooses the most useful features to train the prediction model by feature selection based on the random forest variable importance score, which takes into account not only the effect of each individual feature on prediction results but also the joint effects of multiple features. Finally, a random forest regression model is trained to score potential lncRNA-disease associations. With an area under the receiver operating characteristic curve (AUC) of 0.976 and an area under the precision-recall curve (AUPR) of 0.779 under 5-fold cross-validation, the RFLDA performs better than several state-of-the-art LDA prediction models. Moreover, case studies on three cancers demonstrate that 43 of the 45 lncRNAs predicted by the RFLDA are validated by experimental data, and the other two predicted lncRNAs are supported by other LDA prediction models. CONCLUSIONS Cross-validation and case studies indicate that the RFLDA has an excellent ability to identify potential disease-associated lncRNAs.
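The AUC values quoted in this abstract can be read as a ranking statistic: the probability that a randomly chosen known association receives a higher score than a randomly chosen non-association. The following is a minimal sketch of that computation in plain Python. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not taken from the RFLDA paper or its code.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    Hypothetical helper for illustration; not the RFLDA implementation.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: two known associations (label 1) and two unknown pairs (label 0).
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 0.75
```

Under k-fold cross-validation, a statistic like this would typically be computed on each held-out fold and then averaged; how the paper aggregates folds is not stated in the abstract.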
|
64
|
Zhang Y, Chen M, Li A, Cheng X, Jin H, Liu Y. LDAI-ISPS: LncRNA-Disease Associations Inference Based on Integrated Space Projection Scores. Int J Mol Sci 2020; 21:E1508. [PMID: 32098405 PMCID: PMC7073162 DOI: 10.3390/ijms21041508] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/31/2019] [Revised: 02/18/2020] [Accepted: 02/19/2020] [Indexed: 12/14/2022] Open
Abstract
Long non-coding RNAs (long ncRNAs, lncRNAs) of all kinds have been implicated in a range of cell developmental processes and diseases, although they are not translated into proteins. Inferring disease-associated lncRNAs by computational methods can help to understand the pathogenesis of diseases, but current computational methods have not yet achieved remarkable predictive performance, owing to problems such as the inaccurate construction of similarity networks and inadequate numbers of known lncRNA-disease associations. In this research, we proposed a method for lncRNA-disease association inference based on integrated space projection scores (LDAI-ISPS), composed of the following key steps: converting the Boolean network of known lncRNA-disease associations into weighted networks by combining all the global information (e.g., disease semantic similarities, lncRNA functional similarities, and known lncRNA-disease associations); and obtaining the space projection scores via vector projections of the weighted networks to form the final prediction scores without biases. The leave-one-out cross validation (LOOCV) results showed that, compared with other methods, LDAI-ISPS had a higher accuracy, with an area-under-the-curve (AUC) value of 0.9154 for inferring diseases, 0.8865 for inferring new lncRNAs (whose associations with diseases are unknown), and 0.7518 for inferring isolated diseases (whose associations with lncRNAs are unknown). A case study also confirmed the predictive performance of LDAI-ISPS as an aid to traditional biological experiments in inferring potential lncRNA-disease associations and isolated diseases.
Affiliation(s)
- Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Min Chen
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Ang Li
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang 421002, China
- Xiaohui Cheng
- School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Hong Jin
- School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
- Yarong Liu
- School of Information Science and Engineering, Guilin University of Technology, Guilin 541004, China
|
65
|
Li J, Zhao Y, Zhou S, Zhou Y, Lang L. Inferring lncRNA Functional Similarity Based on Integrating Heterogeneous Network Data. Front Bioeng Biotechnol 2020; 8:27. [PMID: 32117916 PMCID: PMC7015864 DOI: 10.3389/fbioe.2020.00027] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/16/2019] [Accepted: 01/13/2020] [Indexed: 01/26/2023] Open
Abstract
Although lncRNAs lack the potential to be translated into proteins directly, their complicated and diverse functions make them a window into the mechanisms of human physiological activities. Accumulating experimental studies have identified associations between lncRNA dysfunction and many important complex diseases. However, experimentally confirmed lncRNA functions are still very limited. There is an urgent need to build effective computational models for rapidly predicting unknown lncRNA functions on a large scale. To this end, a valid similarity measure between known and unknown lncRNAs plays a vital role. In this paper, an original model was developed to calculate functional similarities between lncRNAs by integrating heterogeneous network data. In this model, a novel integrated network was constructed from four single lncRNA functional similarity networks (a miRNA-based similarity network, a disease-based similarity network, a GTEx expression-based network and a NONCODE expression-based network). Using the lncRNA pairs that share target mRNAs as the benchmark, the results show that this integrated network is more effective than any single network, with an AUC of 0.736 in cross validation, while the AUCs of the four single networks were 0.703, 0.733, 0.611, and 0.602. To implement our model, a web server named IHNLncSim was constructed for inferring lncRNA functional similarity based on integrating heterogeneous network data. Moreover, modules for network visualization and disease-based lncRNA function enrichment analysis were added to IHNLncSim. It is anticipated that IHNLncSim could be an effective bioinformatics tool for studies of lncRNA regulatory function. IHNLncSim is freely available at http://www.lirmed.com/ihnlncsim.
Affiliation(s)
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yingshu Zhao
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Siyuan Zhou
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
- Yuan Zhou
- MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Center for Noncoding RNA Medicine, School of Basic Medical Sciences, Peking University, Beijing, China
- Liying Lang
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
|
66
|
Wu M, Yang Y, Wang H, Ding J, Zhu H, Xu Y. IMPMD: An Integrated Method for Predicting Potential Associations Between miRNAs and Diseases. Curr Genomics 2020; 20:581-591. [PMID: 32581646 PMCID: PMC7290057 DOI: 10.2174/1389202920666191023090215] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 06/25/2019] [Revised: 08/07/2019] [Accepted: 10/16/2019] [Indexed: 01/06/2023] Open
Abstract
Background: With the rapid development of biological research, microRNAs (miRNAs) have increasingly attracted worldwide attention. A growing number of biological studies and experiments have shown that miRNAs are related to the occurrence and development of a large number of key biological processes that cause complex human diseases. Thus, identifying the associations between miRNAs and diseases is helpful for diagnosing diseases. Although some studies have found a considerable number of associations between miRNAs and diseases, many associations still need to be identified. Experimental methods to uncover miRNA-disease associations are time-consuming and expensive. Therefore, effective computational methods are urgently needed to predict new associations. Methodology: In this work, we propose an integrated method for predicting potential associations between miRNAs and diseases (IMPMD). The enhanced similarity for miRNAs is obtained by combining functional similarity, Gaussian similarity and Jaccard similarity. For diseases, it is obtained by combining semantic similarity, Gaussian similarity and Jaccard similarity. Then, we use these two enhanced similarities to construct the features and calculate a cumulative score to choose robust features. Finally, general linear regression is applied to assign weights to the Support Vector Machine, K-Nearest Neighbor and Logistic Regression algorithms. Results: IMPMD obtains an AUC of 0.9386 in 10-fold cross-validation, which is better than most of the previous models. To further evaluate our model, we apply IMPMD to two case studies, on lung cancer and breast cancer. 49 (lung cancer) and 50 (breast cancer) of the top 50 related miRNAs are validated by experimental discoveries. Conclusion: We built a software package named IMPMD, which can be freely downloaded from https://github.com/Sunmile/IMPMD.
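Two of the similarity measures named in this abstract, Jaccard similarity and Gaussian similarity, can be illustrated on association profiles. The sketch below is written under stated assumptions: each miRNA is described by a 0/1 vector over diseases, the Gaussian kernel uses a fixed bandwidth parameter `gamma`, and the two scores are combined by a plain average. The function names, the toy profiles and the averaging rule are all hypothetical; the abstract does not specify how IMPMD normalises or combines these terms.

```python
import math


def jaccard_similarity(profile_a, profile_b):
    """Jaccard index between two 0/1 association profiles."""
    set_a = {i for i, x in enumerate(profile_a) if x}
    set_b = {i for i, x in enumerate(profile_b) if x}
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def gaussian_similarity(profile_a, profile_b, gamma=1.0):
    """Gaussian kernel exp(-gamma * ||a - b||^2) on association profiles.

    gamma is a free bandwidth parameter here (assumption for illustration).
    """
    sq_dist = sum((x - y) ** 2 for x, y in zip(profile_a, profile_b))
    return math.exp(-gamma * sq_dist)


def combined_similarity(profile_a, profile_b, gamma=1.0):
    """Plain average of the two measures (an assumed combination rule)."""
    return 0.5 * (jaccard_similarity(profile_a, profile_b)
                  + gaussian_similarity(profile_a, profile_b, gamma))


# Toy profiles over four diseases; names are placeholders.
mirna_x = [1, 1, 0, 0]
mirna_y = [1, 0, 0, 0]
print(jaccard_similarity(mirna_x, mirna_y))   # 0.5
print(gaussian_similarity(mirna_x, mirna_y))
print(combined_similarity(mirna_x, mirna_y))
```

The same functions apply to diseases by transposing the association matrix, so that each disease is described by its 0/1 profile over miRNAs.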
Affiliation(s)
- Meiqi Wu
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
- Yingxi Yang
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
- Hui Wang
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
- Jun Ding
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
- Huan Zhu
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
- Yan Xu
- (1) Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, China; (2) Department of Chemical and Biological Engineering, Hong Kong University of Science and Technology, Hong Kong, China; (3) Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China
|
67
|
Yu L, Shen X, Zhong D, Yang J. Three-Layer Heterogeneous Network Combined With Unbalanced Random Walk for miRNA-Disease Association Prediction. Front Genet 2020; 10:1316. [PMID: 31998371 PMCID: PMC6967737 DOI: 10.3389/fgene.2019.01316] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 09/28/2019] [Accepted: 12/02/2019] [Indexed: 12/19/2022] Open
Abstract
miRNAs play an important role in many biological processes, and increasing evidence shows that miRNAs are closely related to human diseases. Most existing miRNA-disease association prediction methods are based only on data related to miRNAs and diseases and fail to make effective use of other existing biological data. However, experimentally verified miRNA-disease associations are limited, and there are complex correlations between biological data. Therefore, we propose a novel Three-layer heterogeneous network Combined with unbalanced Random Walk for MiRNA-Disease Association prediction algorithm (TCRWMDA), which can effectively integrate multi-source association data. TCRWMDA is based not only on the known miRNA-disease associations but also adds new prior information (lncRNA-miRNA and lncRNA-disease associations) to build a three-layer heterogeneous network, in which lncRNA serves as an intermediate transition point to mine more effective information between networks. The AUC value obtained by the TCRWMDA algorithm in 5-fold cross validation is 0.9209; compared with other models based on the same similarity calculation method, TCRWMDA obtained better results. TCRWMDA was applied to the analysis of four types of cancer, and the results showed that TCRWMDA is an effective tool for predicting potential miRNA-disease associations. The source code and dataset of TCRWMDA are available at: https://github.com/ylm0505/TCRWMDA.
Affiliation(s)
- Limin Yu
- School of Computer, Central China Normal University, Wuhan, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Xianjun Shen
- School of Computer, Central China Normal University, Wuhan, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Duo Zhong
- School of Computer, Central China Normal University, Wuhan, China
- Hubei Provincial Key Laboratory of Artificial Intelligence and Smart Learning, Central China Normal University, Wuhan, China
- Jincai Yang
- School of Computer, Central China Normal University, Wuhan, China
|
68
|
Wang Q, Yan G. IDLDA: An Improved Diffusion Model for Predicting LncRNA-Disease Associations. Front Genet 2019; 10:1259. [PMID: 31867043 PMCID: PMC6909379 DOI: 10.3389/fgene.2019.01259] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/26/2019] [Accepted: 11/14/2019] [Indexed: 11/13/2022] Open
Abstract
It has been demonstrated that long non-coding RNAs (lncRNAs) play important roles in a variety of biological processes associated with human diseases. However, the identification of lncRNA–disease associations by experimental methods is time-consuming and labor-intensive. Computational methods provide an effective strategy for predicting more potential lncRNA–disease associations. Based on the hypothesis that phenotypically similar diseases are often associated with functionally similar lncRNAs and vice versa, we developed an improved diffusion model to predict potential lncRNA–disease associations (IDLDA). Our model performed well in both global and local cross-validations, which indicated that IDLDA performs well in predicting novel associations. Case studies of colon cancer, breast cancer, and gastric cancer were also carried out, and all lncRNAs ranked in the top 10 in both databases were verified by databases and the related literature. The results showed that IDLDA might play a key role in biomedical research.
Affiliation(s)
- Qi Wang
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
- Guiying Yan
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
|
69
|
Li J, Li X, Feng X, Wang B, Zhao B, Wang L. A novel target convergence set based random walk with restart for prediction of potential LncRNA-disease associations. BMC Bioinformatics 2019; 20:626. [PMID: 31795943 PMCID: PMC6889579 DOI: 10.1186/s12859-019-3216-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 02/08/2019] [Accepted: 11/12/2019] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND In recent years, lncRNAs (long non-coding RNAs) have been shown to be closely related to the occurrence and development of many serious diseases that are harmful to human health. However, most lncRNA-disease associations have not yet been found because of the high cost and time requirements of traditional bio-experiments. Hence, it is urgent and necessary to establish efficient and reasonable computational models to predict potential associations between lncRNAs and diseases. RESULTS In this manuscript, a novel prediction model called TCSRWRLD is proposed to predict potential lncRNA-disease associations based on an improved random walk with restart. In TCSRWRLD, a heterogeneous lncRNA-disease network is first constructed by combining the integrated similarity of lncRNAs and the integrated similarity of diseases. Then, for each lncRNA/disease node in the newly constructed heterogeneous lncRNA-disease network, a node set called the TCS (Target Convergence Set) is established, consisting of the top 100 disease/lncRNA nodes with minimum average network distances to those disease/lncRNA nodes that have known associations with the node itself. Finally, an improved random walk with restart is run on the heterogeneous lncRNA-disease network to infer potential lncRNA-disease associations. The major contribution of this manuscript lies in the introduction of the concept of the TCS, which speeds up the convergence of TCSRWRLD effectively, since the walker can stop its random walk once the walking probability vectors it obtains at the nodes in the TCS, rather than at all nodes in the whole network, have reached a stable state. Simulation results show that TCSRWRLD can achieve a reliable AUC of 0.8712 in Leave-One-Out Cross Validation (LOOCV), which clearly outperforms previous state-of-the-art results. Moreover, case studies of lung cancer and leukemia demonstrate the satisfactory prediction performance of TCSRWRLD as well.
CONCLUSIONS Both comparative results and case studies have demonstrated that TCSRWRLD can achieve excellent performance in the prediction of potential lncRNA-disease associations, which also implies that TCSRWRLD may be a good addition to bioinformatics research in the future.
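The early-stopping idea described in this abstract can be sketched with a generic random walk with restart. The code below is a minimal illustration under stated assumptions, not the TCSRWRLD implementation: it uses the standard update p(t+1) = (1 - r) * W * p(t) + r * p(0) on a column-normalised adjacency matrix, and the `watch` argument (a hypothetical name) restricts the convergence check to a chosen subset of nodes, which plays the role the abstract assigns to the TCS. The restart probability, tolerance and toy graph are arbitrary choices.

```python
def random_walk_with_restart(adj, seed, restart=0.5, tol=1e-8,
                             watch=None, max_iter=1000):
    """Generic RWR on a dense adjacency matrix given as a list of lists.

    watch: optional collection of node indices; convergence is tested only
    on these nodes (stand-in for a target convergence set). Returns the
    probability vector and the number of iterations performed.
    """
    n = len(adj)
    # Column-normalise so that each non-empty column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0
          for j in range(n)] for i in range(n)]
    p0 = [1.0 if i == seed else 0.0 for i in range(n)]
    p = p0[:]
    watched = list(range(n)) if watch is None else list(watch)
    steps = 0
    for steps in range(1, max_iter + 1):
        nxt = [(1.0 - restart) * sum(w[i][j] * p[j] for j in range(n))
               + restart * p0[i] for i in range(n)]
        # L1 change measured on the watched nodes only.
        delta = sum(abs(nxt[i] - p[i]) for i in watched)
        p = nxt
        if delta < tol:
            break
    return p, steps


# Toy path graph 0 - 1 - 2, walking from node 0.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
full_p, full_steps = random_walk_with_restart(path, seed=0)
sub_p, sub_steps = random_walk_with_restart(path, seed=0, watch=[2])
print(full_p, full_steps, sub_steps)
```

Because the change measured on a subset of nodes can never exceed the change measured on all nodes, the restricted check stops no later than the full check, which is the source of the speed-up the abstract describes.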
Affiliation(s)
- Jiechen Li
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
- Xueyong Li
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China
- Xiang Feng
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
- Bing Wang
- School of Electrical and Information Engineering, Anhui University of Technology, Anhui, 243002, Maanshan, People's Republic of China
- Bihai Zhao
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
- Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People's Republic of China; Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan, People's Republic of China
|
70
|
Guo ZH, You ZH, Yi HC. Integrative Construction and Analysis of Molecular Association Network in Human Cells by Fusing Node Attribute and Behavior Information. MOLECULAR THERAPY-NUCLEIC ACIDS 2019; 19:498-506. [PMID: 31923739 PMCID: PMC6951835 DOI: 10.1016/j.omtn.2019.10.046] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 08/28/2019] [Revised: 10/07/2019] [Accepted: 10/21/2019] [Indexed: 11/27/2022]
Abstract
Detecting whether a pair of biomolecules associate is of great significance in the study of molecular biology. Hence, computational methods are urgently needed as guidance for practice. However, most previous prediction models, influenced by reductionism, focused on isolated research objects and therefore have inherent limitations. Inspired by holism, a machine-learning-based framework called MAN-node2vec is proposed to predict multi-type relationships in the molecular associations network (MAN). Specifically, we constructed a large-scale MAN composed of 1,023 miRNAs, 1,649 proteins, 769 long non-coding RNAs (lncRNAs), 1,025 drugs, and 2,062 diseases. Then, each biomolecule in the MAN is represented as a vector by its attributes, learned by k-mer and similar encodings, and its behavior, learned by node2vec. Finally, a random forest classifier is applied to carry out the relationship prediction task. The proposed model achieved a reliable performance, with an area under the curve (AUC) of 0.9677 and an area under the precision-recall curve (AUPR) of 0.9562 under 5-fold cross-validation. Additional experiments also showed that the proposed global model is more competitive than the traditional local method. Together, these results provide systematic insight into the synergistic interactions between various molecules and diseases. It is anticipated that this work can bring beneficial inspiration and advances to related systems biology and biomedical research.
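The attribute representation mentioned in this abstract relies on k-mer encoding of sequences. The following is a minimal sketch of one common form of k-mer encoding, a normalised frequency vector over all k-mers of an alphabet. The function name `kmer_frequency_vector`, the choice k = 3, the RNA alphabet and the toy sequence are assumptions for illustration; the abstract does not state which k or normalisation MAN-node2vec uses.

```python
from itertools import product


def kmer_frequency_vector(sequence, k=3, alphabet="ACGU"):
    """Return normalised k-mer frequencies of an RNA sequence.

    The vector has len(alphabet) ** k entries, ordered as produced by
    itertools.product. Windows containing symbols outside the alphabet
    are skipped. Hypothetical helper for illustration only.
    """
    kmers = ["".join(chars) for chars in product(alphabet, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    sequence = sequence.upper()
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        if window in index:
            counts[index[window]] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in counts]


# Toy RNA fragment; a real attribute vector would use the full sequence.
vector = kmer_frequency_vector("AUGGCUAUG", k=3)
print(len(vector))   # 64
print(sum(vector))
```

In a pipeline like the one the abstract outlines, a vector of this kind would be concatenated with a node embedding learned from the network before being passed to the classifier; that concatenation step is described here only in general terms.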
Affiliation(s)
- Zhen-Hao Guo
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; University of Chinese Academy of Sciences, Beijing 100049, China
|
71
|
Yi HC, You ZH, Guo ZH. Construction and Analysis of Molecular Association Network by Combining Behavior Representation and Node Attributes. Front Genet 2019; 10:1106. [PMID: 31788002 PMCID: PMC6854842 DOI: 10.3389/fgene.2019.01106] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/08/2019] [Accepted: 10/15/2019] [Indexed: 11/13/2022] Open
Abstract
A key aim of post-genomic biomedical research is to understand and model complex biomolecular activities from a systematic perspective. Biomolecular interactions are widespread and interrelated: multiple biomolecules coordinate to sustain life activities, and any disturbance of these complex connections can lead to abnormal life activities or complex diseases. However, many existing studies focus only on individual intermolecular interactions. In this work, we constructed and analyzed a large-scale molecular association network of multiple biomolecules in humans by integrating associations among lncRNAs, miRNAs, proteins, drugs, and diseases, in which various associations are interconnected and any type of association can be predicted. We propose Molecular Association Network (MAN)–High-Order Proximity preserved Embedding (HOPE), a novel network representation learning based method that exploits latent features of biomolecules to accurately predict associations between molecules. More specifically, the network representation learning algorithm HOPE was applied to learn behavior features of nodes in the association network. Attribute features of nodes were also adopted. Then, a CatBoost machine learning model was trained to predict potential associations between any pair of nodes. The performance of our method was evaluated under five-fold cross validation. A case study predicting miRNA-disease associations was also conducted to verify the prediction capability. MAN-HOPE achieves a high accuracy of 93.3% and an area under the receiver operating characteristic curve of 0.9793. The experimental results demonstrate the novelty of our systematic understanding of intermolecular associations and enable systematic exploration of the landscape of molecular interactions that shape specialized cellular functions.
Affiliation(s)
- Hai-Cheng Yi
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China; University of Chinese Academy of Sciences, Beijing, China
- Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
- Zhen-Hao Guo
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, China
|
72
|
Dopaminergic neuron injury in Parkinson's disease is mitigated by interfering lncRNA SNHG14 expression to regulate the miR-133b/ α-synuclein pathway. Aging (Albany NY) 2019; 11:9264-9279. [PMID: 31683259 PMCID: PMC6874444 DOI: 10.18632/aging.102330] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 06/27/2019] [Accepted: 09/22/2019] [Indexed: 02/06/2023]
Abstract
This study explored the influence of long non-coding RNA (lncRNA) SNHG14 on α-synuclein (α-syn) expression and Parkinson’s disease (PD) pathogenesis. First, we found that the expression level of SNHG14 was elevated in brain tissues of PD mice. In MN9D cells, rotenone treatment (1 μmol/L) enhanced the binding between the transcription factor SP-1 and the SNHG14 promoter, thus promoting SNHG14 expression. Interference with SNHG14 ameliorated the DA neuron injury induced by rotenone. Next, we found an interaction between SNHG14 and miR-133b. Further study showed that miR-133b down-regulated α-syn expression by targeting the 3’-UTR of its mRNA, and that SNHG14 could reverse the negative effect of miR-133b on α-syn expression. Interference with SNHG14 reduced rotenone-induced DA neuron damage through miR-133b in MN9D cells, and α-syn was responsible for the protective effect of miR-133b. Similarly, interference with SNHG14 mitigated neuron injury in a PD mouse model. Overall, silencing SNHG14 mitigates dopaminergic neuron injury by down-regulating α-syn via miR-133b, which contributes to improving PD.
|
73
|
Chen X, Xie D, Zhao Q, You ZH. MicroRNAs and complex diseases: from experimental results to computational models. Brief Bioinform 2019; 20:515-539. [PMID: 29045685 DOI: 10.1093/bib/bbx130] [Citation(s) in RCA: 397] [Impact Index Per Article: 79.4] [Received: 07/06/2017] [Revised: 08/13/2017] [Indexed: 12/22/2022] Open
Abstract
Large numbers of microRNAs (miRNAs) have been discovered at a rapid pace in plants, green algae, viruses and animals. As one of the most important components in the cell, miRNAs play an increasingly important role in various essential biological processes. Over the past few decades, numerous experimental methods and computational models have been designed and implemented to identify novel miRNA-disease associations. In this review, the functions of miRNAs, miRNA-target interactions, miRNA-disease associations and some important publicly available miRNA-related databases were discussed in detail. Specifically, considering that an increasing number of miRNA-disease associations have been experimentally confirmed, we selected five important miRNA-related human diseases and five crucial disease-related miRNAs and provided corresponding introductions. Identifying disease-related miRNAs has become an important goal of biomedical research, which will accelerate the understanding of disease pathogenesis at the molecular level and the design of molecular tools for disease diagnosis, treatment and prevention. Computational models have become an important means for novel miRNA-disease association identification, which could select the most promising miRNA-disease pairs for experimental validation and significantly reduce the time and cost of the biological experiments. Here, we reviewed 20 state-of-the-art computational models for predicting miRNA-disease associations from different perspectives. Finally, we summarized four important factors behind the difficulties of predicting potential disease-related miRNAs, the framework for constructing powerful computational models to predict potential miRNA-disease associations, including five feasible and important research schemas, and future directions for further development of computational models.
Affiliation(s)
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
| | - Di Xie
- School of Mathematics, Liaoning University
| | - Qi Zhao
- School of Mathematics, Liaoning University
| | - Zhu-Hong You
- Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Science
|
74
|
Cui Z, Liu JX, Gao YL, Zhu R, Yuan SS. LncRNA-Disease Associations Prediction Using Bipartite Local Model With Nearest Profile-Based Association Inferring. IEEE J Biomed Health Inform 2019; 24:1519-1527. [PMID: 31478878 DOI: 10.1109/jbhi.2019.2937827] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 11/11/2022]
Abstract
There is much evidence that long non-coding RNA (lncRNA) is associated with many diseases. However, it is time-consuming and expensive to identify meaningful lncRNA-disease associations (LDAs) through medical or biological experiments. Therefore, investigating how to identify more meaningful LDAs is necessary, and it is also conducive to the prevention, diagnosis and treatment of complex diseases. Considering the limitations of some current prediction models, a novel model based on a bipartite local model with nearest profile-based association inferring, BLM-NPAI, is developed for predicting LDAs. This model predicts novel LDAs from the lncRNA side and the disease side, respectively. More importantly, for lncRNAs and diseases without any known association, the model can still make predictions from their nearest neighbors. Leave-one-out cross validation (LOOCV) and 5-fold cross validation are implemented to evaluate the performance of BLM-NPAI. Our model is superior to current advanced methods in most cases. In addition, to verify the validity and reliability of BLM-NPAI, three disease cases and three lncRNA cases are analyzed. Finally, these predicted novel LDAs are confirmed by using the LncRNA-disease database.
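The nearest-neighbor fallback described in this abstract can be illustrated with a minimal sketch. It assumes a precomputed similarity row and a binary association matrix; the function name `nearest_profile` and the toy data are hypothetical and are not taken from the BLM-NPAI paper, whose exact weighting scheme the abstract does not give.

```python
def nearest_profile(sim_row, assoc, target):
    """Return a similarity-weighted copy of the nearest neighbour's association row.

    sim_row[j] is the similarity between the target lncRNA and lncRNA j;
    assoc[j] is the binary disease-association row of lncRNA j.
    Only neighbours that have at least one known association are considered.
    """
    candidates = [j for j in range(len(assoc)) if j != target and any(assoc[j])]
    if not candidates:
        return [0.0] * len(assoc[target])
    nearest = max(candidates, key=lambda j: sim_row[j])
    return [sim_row[nearest] * a for a in assoc[nearest]]


# Toy data (hypothetical): three lncRNAs, three diseases.
assoc = [
    [1, 0, 1],  # lncRNA 0
    [0, 1, 0],  # lncRNA 1
    [0, 0, 0],  # lncRNA 2 has no known association
]
sim_to_2 = [0.8, 0.3, 1.0]  # similarity of lncRNA 2 to lncRNAs 0, 1, 2
inferred = nearest_profile(sim_to_2, assoc, target=2)
print(inferred)
```

The inferred row can then stand in for the missing association profile when a local model is trained for the lncRNA in question.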
|
75
|
Xie G, Meng T, Luo Y, Liu Z. SKF-LDA: Similarity Kernel Fusion for Predicting lncRNA-Disease Association. Mol Ther Nucleic Acids 2019; 18:45-55. [PMID: 31514111 PMCID: PMC6742806 DOI: 10.1016/j.omtn.2019.07.022] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 01/22/2019] [Revised: 07/13/2019] [Accepted: 07/24/2019] [Indexed: 01/24/2023]
Abstract
Recently, the prediction of lncRNA-disease associations has attracted increasing attention. Various computational models have been proposed; however, there is still room to improve the prediction accuracy. In this paper, we propose a kernel fusion method with different types of similarities for the lncRNAs and diseases. The expression similarity and cosine similarity are used for lncRNAs, and the semantic similarity and cosine similarity are used for the diseases. To eliminate the effect of noise, a neighbor constraint is enforced to refine all the similarity matrices before fusion. Experimental results show that the proposed similarity kernel fusion (SKF)-LDA method achieves superior performance in terms of AUC values and other measurements. Under LOOCV and 5-fold CV, SKF-LDA achieves AUC values of 0.9049 and 0.8743±0.0050, respectively. In addition, case studies of three diseases (hepatocellular carcinoma, lung cancer, and prostate cancer) show that SKF-LDA can predict related lncRNAs accurately.
Affiliation(s)
- Guobo Xie
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| | - Tengfei Meng
- School of Computer Science, Guangdong University of Technology, Guangzhou, China
| | - Yu Luo
- School of Computer Science, Guangdong University of Technology, Guangzhou, China.
| | - Zhenguo Liu
- Department of Thoracic Surgery, The First Affiliated Hospital of Sun Yat-sen University, Guangzhou, China.
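Two ingredients named in the SKF-LDA abstract, cosine similarity and a neighbor constraint, can be sketched as follows. This is only an illustration under stated assumptions: similarity is computed on association profiles, and a simple top-k filter stands in for the neighbor constraint, whose exact form the abstract does not specify. The helper names `cosine` and `keep_top_k` and the toy profiles are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two association profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def keep_top_k(row, k, self_index):
    """Keep the k largest off-diagonal entries of a similarity row; zero everything else."""
    others = [j for j in range(len(row)) if j != self_index]
    ranked = sorted(others, key=lambda j: row[j], reverse=True)
    keep = set(ranked[:k])
    return [row[j] if j in keep else 0.0 for j in range(len(row))]


# Toy lncRNA-disease profiles (hypothetical): rows are lncRNAs, columns are diseases.
profiles = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
sim = [[cosine(p, q) for q in profiles] for p in profiles]
refined = [keep_top_k(row, k=1, self_index=i) for i, row in enumerate(sim)]
for row in refined:
    print([round(x, 4) for x in row])
```

Refining each similarity matrix this way before fusion limits the influence of weak, possibly noisy similarities.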
|
76
|
Guo ZH, Yi HC, You ZH. Construction and Comprehensive Analysis of a Molecular Association Network via lncRNA-miRNA-Disease-Drug-Protein Graph. Cells 2019; 8:E866. [PMID: 31405040 PMCID: PMC6721720 DOI: 10.3390/cells8080866] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 06/05/2019] [Revised: 07/20/2019] [Accepted: 07/31/2019] [Indexed: 12/14/2022] Open
Abstract
One key issue in the post-genomic era is how to systematically describe the associations between small molecule transcripts or translations inside cells. With the rapid development of high-throughput "omics" technologies, the achieved ability to detect and characterize molecules with other molecule targets opens the possibility of investigating the relationships between different molecules from a global perspective. In this article, a molecular association network (MAN) is constructed and comprehensively analyzed by integrating the associations among miRNA, lncRNA, protein, drug, and disease, in which any kind of potential associations can be predicted. More specifically, each node in MAN can be represented as a vector by combining two kinds of information including the attribute of the node itself (e.g., sequences of ncRNAs and proteins, semantics of diseases and molecular fingerprints of drugs) and the behavior of the node in the complex network (associations with other nodes). A random forest classifier is trained to classify and predict new interactions or associations between biomolecules. In the experiment, the proposed method achieved a superb performance with an area under curve (AUC) of 0.9735 under a five-fold cross-validation, which showed that the proposed method could provide new insight for exploration of the molecular mechanisms of disease and valuable clues for disease treatment.
Affiliation(s)
- Zhen-Hao Guo
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Hai-Cheng Yi
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhu-Hong You
- The Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
- University of Chinese Academy of Sciences, Beijing 100049, China.
|
77
|
Yu J, Xuan Z, Feng X, Zou Q, Wang L. A novel collaborative filtering model for LncRNA-disease association prediction based on the Naïve Bayesian classifier. BMC Bioinformatics 2019; 20:396. [PMID: 31315558 PMCID: PMC6637631 DOI: 10.1186/s12859-019-2985-0] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/18/2018] [Accepted: 07/03/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Since the number of known lncRNA-disease associations verified by biological experiments is quite limited, it has been a challenging task to uncover human disease-related lncRNAs in recent years. Moreover, considering the fact that biological experiments are very expensive and time-consuming, it is important to develop efficient computational models to discover potential lncRNA-disease associations. RESULTS In this manuscript, a novel Collaborative Filtering model called CFNBC for inferring potential lncRNA-disease associations is proposed based on Naïve Bayesian Classifier. In CFNBC, an original lncRNA-miRNA-disease tripartite network is constructed first by integrating known miRNA-lncRNA associations, miRNA-disease associations and lncRNA-disease associations, and then, an updated lncRNA-miRNA-disease tripartite network is further constructed through applying the item-based collaborative filtering algorithm on the original tripartite network. Finally, based on the updated tripartite network, a novel approach based on the Naïve Bayesian Classifier is proposed to predict potential associations between lncRNAs and diseases. The novelty of CFNBC lies in the construction of the updated lncRNA-miRNA-disease tripartite network and the introduction of the item-based collaborative filtering algorithm and Naïve Bayesian Classifier, which guarantee that CFNBC can be applied to predict potential lncRNA-disease associations efficiently without entirely relying on known miRNA-disease associations. Simulation results show that CFNBC can achieve a reliable AUC of 0.8576 in the Leave-One-Out Cross Validation (LOOCV), which is considerably better than previous state-of-the-art results. Moreover, case studies of glioma, colorectal cancer and gastric cancer demonstrate the excellent prediction performance of CFNBC as well. 
CONCLUSIONS According to the simulation results, CFNBC achieves satisfactory prediction performance and may be an excellent addition to biomedical research in the future.
Affiliation(s)
- Jingwen Yu
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People’s Republic of China
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, People’s Republic of China
| | - Zhanwei Xuan
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People’s Republic of China
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, People’s Republic of China
| | - Xiang Feng
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People’s Republic of China
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, People’s Republic of China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, People’s Republic of China
- School of Computer Science and Technology, Tianjin University, Tianjin, People’s Republic of China
| | - Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, Hunan, People’s Republic of China
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, People’s Republic of China
|
78
|
Fu G, Wang J, Domeniconi C, Yu G. Matrix factorization-based data fusion for the prediction of lncRNA-disease associations. Bioinformatics 2019; 34:1529-1537. [PMID: 29228285 DOI: 10.1093/bioinformatics/btx794] [Citation(s) in RCA: 126] [Impact Index Per Article: 25.2] [Received: 09/02/2017] [Accepted: 12/05/2017] [Indexed: 12/21/2022] Open
Abstract
Motivation: Long non-coding RNAs (lncRNAs) play crucial roles in complex disease diagnosis, prognosis, prevention and treatment, but only a small portion of lncRNA-disease associations have been experimentally verified. Various computational models have been proposed to identify lncRNA-disease associations by integrating heterogeneous data sources. However, existing models generally ignore the intrinsic structure of data sources or treat them as equally relevant, while they may not be. Results: To accurately identify lncRNA-disease associations, we propose a Matrix Factorization based LncRNA-Disease Association prediction model (MFLDA in short). MFLDA decomposes data matrices of heterogeneous data sources into low-rank matrices via matrix tri-factorization to explore and exploit their intrinsic and shared structure. MFLDA can select and integrate the data sources by assigning different weights to them. An iterative solution is further introduced to simultaneously optimize the weights and low-rank matrices. Next, MFLDA uses the optimized low-rank matrices to reconstruct the lncRNA-disease association matrix and thus to identify potential associations. In 5-fold cross validation experiments to identify verified lncRNA-disease associations, MFLDA achieves an area under the receiver operating characteristic curve (AUC) of 0.7408, at least 3% higher than those given by state-of-the-art data fusion based computational models. An empirical study on identifying masked lncRNA-disease associations again shows that MFLDA can identify potential associations more accurately than competing models. A case study on identifying lncRNAs associated with breast, lung and stomach cancers shows that 38 out of 45 (84%) associations predicted by MFLDA are supported by recent biomedical literature, which further demonstrates the capability of MFLDA in identifying novel lncRNA-disease associations. MFLDA is a general data fusion framework, and as such it can be adopted to predict associations between other biological entities. Availability and implementation: The source code for MFLDA is available at http://mlda.swu.edu.cn/codes.php?name=MFLDA. Contact: gxyu@swu.edu.cn. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Guangyuan Fu
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
| | - Jun Wang
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
| | - Carlotta Domeniconi
- Department of Computer Science, George Mason University, Fairfax, VA 22030, USA
| | - Guoxian Yu
- College of Computer and Information Science, Southwest University, Chongqing 400715, China
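The weighted matrix tri-factorization described in the MFLDA abstract can be written schematically as below. This is a generic sketch rather than the paper's exact objective: the symbols are assumptions introduced here, with R_{ij} the relational data matrix between object types i and j, G_i the low-rank factor of type i, S_{ij} a small core matrix, and w_{ij} the weight assigned to each data source. Any regularization terms and constraints on the weights are omitted because the abstract does not state them.

```latex
\min_{\{G_i\},\,\{S_{ij}\},\,\{w_{ij}\}} \;\sum_{(i,j)} w_{ij}\,\bigl\lVert R_{ij} - G_i S_{ij} G_j^{\top} \bigr\rVert_F^2,
\qquad
\hat{R}_{ld} = G_l S_{ld} G_d^{\top}
```

Here the second expression is the reconstructed lncRNA-disease matrix (l for lncRNAs, d for diseases), whose large entries in unobserved positions are read as candidate associations.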
|
79
|
LncRNAs Regulatory Networks in Cellular Senescence. Int J Mol Sci 2019; 20:ijms20112615. [PMID: 31141943 PMCID: PMC6600251 DOI: 10.3390/ijms20112615] [Citation(s) in RCA: 63] [Impact Index Per Article: 12.6] [Received: 03/21/2019] [Revised: 04/19/2019] [Accepted: 05/06/2019] [Indexed: 02/07/2023] Open
Abstract
Long noncoding RNAs (lncRNAs) are a class of transcripts longer than 200 nucleotides with no open reading frame. They play a key role in the regulation of cellular processes such as genome integrity, chromatin organization, gene expression, translation regulation, and signal transduction. Recent studies indicated that lncRNAs are not only dysregulated in different types of diseases but also function as direct effectors or mediators for many pathological symptoms. This review focuses on the current findings of the lncRNAs and their dysregulated signaling pathways in senescence. Different functional mechanisms of lncRNAs and their downstream signaling pathways are integrated to provide a bird’s-eye view of lncRNA networks in senescence. This review not only highlights the role of lncRNAs in cell fate decision but also discusses how several feedback loops are interconnected to execute persistent senescence response. Finally, the significance of lncRNAs in senescence-associated diseases and their therapeutic and diagnostic potentials are highlighted.
|
80
|
Kuang L, Zhao H, Wang L, Xuan Z, Pei T. A Novel Approach Based on Point Cut Set to Predict Associations of Diseases and LncRNAs. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181026122045] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/22/2022]
Abstract
Background: In recent years, growing evidence has indicated that long non-coding RNAs (lncRNAs) play vital roles in a wide range of human diseases and can serve as potential biomarkers and drug targets. Compared with the vast number of lncRNAs that have been found, the relationships between lncRNAs and diseases remain largely unknown.
Objective: The prediction of novel and potential associations between lncRNAs and diseases would help to dissect the complex mechanisms of disease pathogenesis.
Method: In this paper, a new computational method based on Point Cut Set is proposed to predict LncRNA-Disease Associations (PCSLDA) based on known lncRNA-disease associations. Compared with the existing state-of-the-art methods, the major novelty of PCSLDA lies in the incorporation of a distance difference matrix and point cut set to set the distance correlation coefficient of nodes in the lncRNA-disease interaction network. Hence, PCSLDA can be applied to forecast potential lncRNA-disease associations while requiring only known disease-lncRNA associations.
Results: Simulation results show that PCSLDA can significantly outperform previous state-of-the-art methods, with a reliable AUC of 0.8902 in leave-one-out cross-validation and AUCs of 0.7634 and 0.8317 in 5-fold and 10-fold cross-validation, respectively. Additionally, 70% of the top 10 predicted cancer-lncRNA associations can be confirmed.
Conclusion: It is anticipated that our proposed model can be a great addition to the biomedical research field.
Affiliation(s)
- Linai Kuang
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, China
| | - Haochen Zhao
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, China
| | - Lei Wang
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, China
| | - Zhanwei Xuan
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, China
| | - Tingrui Pei
- Key Laboratory of Intelligent Computing & Information Processing, Xiangtan University, Xiangtan, China
|
81
|
Liu Y, Feng X, Zhao H, Xuan Z, Wang L. A Novel Network-Based Computational Model for Prediction of Potential LncRNA-Disease Association. Int J Mol Sci 2019; 20:ijms20071549. [PMID: 30925672 PMCID: PMC6480945 DOI: 10.3390/ijms20071549] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 02/18/2019] [Revised: 03/22/2019] [Accepted: 03/25/2019] [Indexed: 12/12/2022] Open
Abstract
Accumulating studies have shown that long non-coding RNAs (lncRNAs) are involved in many biological processes and play important roles in a variety of complex human diseases. Developing effective computational models to identify potential relationships between lncRNAs and diseases can not only help us understand disease mechanisms at the lncRNA molecular level, but also promote the diagnosis, treatment, prognosis, and prevention of human diseases. In this paper, a network-based model called NBLDA was proposed to discover potential lncRNA-disease associations, in which two novel lncRNA-disease weighted networks were constructed. They were first based on known lncRNA-disease associations and topological similarity of the lncRNA-disease association network, and then an lncRNA-lncRNA weighted matrix and a disease-disease weighted matrix were obtained based on a resource allocation strategy of unequal allocation and unbiased consistence. Finally, a label propagation algorithm was applied to predict associated lncRNAs for the investigated diseases. Moreover, in order to estimate the prediction performance of NBLDA, the framework of leave-one-out cross validation (LOOCV) was implemented on NBLDA, and simulation results showed that NBLDA can achieve reliable areas under the ROC curve (AUCs) of 0.8846, 0.8273, and 0.8075 in three known lncRNA-disease association datasets downloaded from the lncRNADisease database, respectively. Furthermore, in case studies of lung cancer, leukemia, and colorectal cancer, simulation results demonstrated that NBLDA can be a powerful tool for identifying potential lncRNA-disease associations as well.
Affiliation(s)
- Yang Liu
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410000, China.
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411100, China.
| | - Xiang Feng
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410000, China.
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411100, China.
| | - Haochen Zhao
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411100, China.
| | - Zhanwei Xuan
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411100, China.
| | - Lei Wang
- College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410000, China.
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411100, China.
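The label propagation step mentioned in the NBLDA abstract can be sketched in its common iterative form, F <- alpha * W * F + (1 - alpha) * Y. This is a minimal illustration, not the paper's implementation: the row normalisation, the value of `alpha`, the function name `label_propagation`, and the toy network are all assumptions.

```python
def label_propagation(weights, labels, alpha=0.5, iterations=100):
    """Iterate F <- alpha * W_norm @ F + (1 - alpha) * Y for a fixed number of steps.

    weights: square lncRNA-lncRNA weight matrix (list of lists)
    labels:  initial association labels of each lncRNA for one disease
    """
    n = len(weights)
    # Row-normalise the weight matrix so each row sums to 1 (rows of zeros stay zero).
    norm = []
    for row in weights:
        total = sum(row)
        norm.append([w / total if total else 0.0 for w in row])
    scores = [float(y) for y in labels]
    for _ in range(iterations):
        scores = [
            alpha * sum(norm[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * labels[i]
            for i in range(n)
        ]
    return scores


# Toy chain network (hypothetical): lncRNA 0 - lncRNA 1 - lncRNA 2,
# where only lncRNA 0 is known to be associated with the disease.
weights = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = label_propagation(weights, labels=[1, 0, 0])
print([round(s, 4) for s in scores])
```

Unlabeled lncRNAs closer to the known one in the network receive higher scores, which is the ranking used to propose candidate associations.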
|
82
|
Long Noncoding RNA and Protein Interactions: From Experimental Results to Computational Models Based on Network Methods. Int J Mol Sci 2019; 20:ijms20061284. [PMID: 30875752 PMCID: PMC6471543 DOI: 10.3390/ijms20061284] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 01/27/2019] [Revised: 03/09/2019] [Accepted: 03/11/2019] [Indexed: 01/13/2023] Open
Abstract
Non-coding RNAs with a length of more than 200 nucleotides are long non-coding RNAs (lncRNAs), which have gained tremendous attention in recent decades. Many studies have confirmed that lncRNAs have an important influence on post-transcriptional gene regulation; for example, lncRNAs affect the stability and translation of splicing factor proteins. The mutations and malfunctions of lncRNAs are closely related to human disorders. As lncRNAs interact with a variety of proteins, predicting the interactions between lncRNAs and proteins is an important way to explore the functions of lncRNAs in depth and to enrich their annotations. Experimental approaches for identifying lncRNA–protein interactions are expensive and time-consuming. Computational approaches to predict lncRNA–protein interactions can be grouped into two broad categories. The first category is based on sequence, structural information and physicochemical properties. The second category is based on network methods that fuse heterogeneous data to construct lncRNA-related heterogeneous networks. The network-based methods can capture the implicit feature information in the topological structure of related biological heterogeneous networks containing lncRNAs, which is often ignored by sequence-based methods. In this paper, we summarize and discuss the materials, interaction score calculation algorithms, advantages and disadvantages of state-of-the-art algorithms for lncRNA–protein interaction prediction based on network methods to assist researchers in selecting a suitable method for acquiring more dependable results. All the related network data are also collected and processed for the convenience of users, and are available at https://github.com/HAN-Siyu/APINet/.
|
83
|
Wang L, Xuan Z, Zhou S, Kuang L, Pei T. A Novel Model for Predicting LncRNA-disease Associations based on the LncRNA-MiRNA-Disease Interactive Network. Curr Bioinform 2019. [DOI: 10.2174/1574893613666180703105258] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Indexed: 12/11/2022]
Abstract
Background: Accumulating experimental studies have shown that long non-coding RNAs (lncRNAs) play an important part in various biological processes. It has been shown that their alterations and dysregulations are closely related to many critical complex diseases.
Objective: It is of great importance to develop effective computational models for predicting potential lncRNA-disease associations.
Method: Based on the hypotheses that a lncRNA and a disease are potentially associated if both are associated with the same group of microRNAs, and that similar diseases tend to be closely associated with functionally similar lncRNAs, a novel method for calculating similarities of both lncRNAs and diseases is proposed, and then a novel prediction model, LDLMD, for inferring potential lncRNA-disease associations is proposed.
Results: LDLMD can achieve an AUC of 0.8925 in the Leave-One-Out Cross Validation (LOOCV), which demonstrates that the newly proposed model LDLMD significantly outperforms previous state-of-the-art methods and could be a great addition to the biomedical research field.
Conclusion: Here, we present a new method for predicting lncRNA-disease associations; moreover, the method can decrease the time and cost of biological experiments.
Affiliation(s)
- Lei Wang
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Zhanwei Xuan
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Shunxian Zhou
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Linai Kuang
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
| | - Tingrui Pei
- College of Information Engineering, Xiangtan University, Xiangtan 411105, China
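The first hypothesis in the LDLMD abstract, that a lncRNA and a disease sharing the same microRNAs are candidates for association, can be illustrated with a small overlap score. The Jaccard index used here is an assumption chosen for simplicity, not the similarity measure defined in the paper, and the function name and all identifiers in the toy data are hypothetical.

```python
def shared_mirna_score(lnc_mirnas, dis_mirnas):
    """Jaccard overlap between the miRNA sets linked to a lncRNA and to a disease."""
    union = lnc_mirnas | dis_mirnas
    return len(lnc_mirnas & dis_mirnas) / len(union) if union else 0.0


# Toy association sets (hypothetical identifiers).
lnc2mir = {
    "lncA": {"miR-1", "miR-2", "miR-3"},
    "lncB": {"miR-9"},
}
dis2mir = {
    "diseaseX": {"miR-2", "miR-3", "miR-4"},
}

scores = {
    (lnc, dis): shared_mirna_score(lnc_set, dis_set)
    for lnc, lnc_set in lnc2mir.items()
    for dis, dis_set in dis2mir.items()
}
for pair, score in sorted(scores.items()):
    print(pair, score)
```

In this toy example lncA shares two of four miRNAs with diseaseX and therefore ranks above lncB, which shares none.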
|
84
|
Ping P, Wang L, Kuang L, Ye S, Iqbal MFB, Pei T. A Novel Method for LncRNA-Disease Association Prediction Based on an lncRNA-Disease Association Network. IEEE/ACM Trans Comput Biol Bioinform 2019; 16:688-693. [PMID: 29993639 DOI: 10.1109/tcbb.2018.2827373] [Citation(s) in RCA: 61] [Impact Index Per Article: 12.2] [Indexed: 06/08/2023]
Abstract
An increasing number of studies have indicated that long-non-coding RNAs (lncRNAs) play critical roles in many important biological processes. Predicting potential lncRNA-disease associations can improve our understanding of the molecular mechanisms of human diseases and aid in finding biomarkers for disease diagnosis, treatment, and prevention. In this paper, we constructed a bipartite network based on known lncRNA-disease associations; based on this work, we proposed a novel model for inferring potential lncRNA-disease associations. Specifically, we analyzed the properties of the bipartite network and found that it closely followed a power-law distribution. Moreover, to evaluate the performance of our model, a leave-one-out cross-validation (LOOCV) framework was implemented, and the simulation results showed that our computational model significantly outperformed previous state-of-the-art models, with AUCs of 0.8825, 0.9004, and 0.9292 for known lncRNA-disease associations obtained from the LncRNADisease database, Lnc2Cancer database, and MNDR database, respectively. Thus, our approach may be an excellent addition to the biomedical research field in the future.
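The AUC values reported under LOOCV in this and neighboring abstracts summarise how well held-out known associations are ranked above unverified candidates. The sketch below computes AUC as the fraction of positive/negative score pairs that are ordered correctly, counting ties as half. It is a generic illustration: the function name `auc` and the scores are hypothetical, and the held-out scoring procedure of any particular paper is not reproduced.

```python
def auc(pos_scores, neg_scores):
    """Area under the ROC curve as the probability that a positive outranks a negative."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: each positive was scored while held out of training (LOOCV style).
held_out_positive_scores = [0.9, 0.4]
candidate_negative_scores = [0.5, 0.3, 0.1]
value = auc(held_out_positive_scores, candidate_negative_scores)
print(round(value, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to every held-out association being ranked above every candidate.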
|
85
|
Fan XN, Zhang SW, Zhang SY, Zhu K, Lu S. Prediction of lncRNA-disease associations by integrating diverse heterogeneous information sources with RWR algorithm and positive pointwise mutual information. BMC Bioinformatics 2019; 20:87. [PMID: 30782113 PMCID: PMC6381749 DOI: 10.1186/s12859-019-2675-y] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/13/2018] [Accepted: 02/12/2019] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Long non-coding RNAs play an important role in human complex diseases. Identification of lncRNA-disease associations will provide insight into disease-related lncRNAs and benefit disease diagnosis and treatment. However, using experiments to explore lncRNA-disease associations is expensive and time-consuming. RESULTS In this study, we developed a novel method to identify potential lncRNA-disease associations by Integrating Diverse Heterogeneous Information sources with positive pointwise Mutual Information and Random Walk with restart algorithm (namely IDHI-MIRW). IDHI-MIRW first constructs multiple lncRNA similarity networks and disease similarity networks from diverse lncRNA-related and disease-related datasets, then implements the random walk with restart algorithm on these similarity networks to extract the topological similarities, which are fused with positive pointwise mutual information to build a large-scale lncRNA-disease heterogeneous network. Finally, IDHI-MIRW implements the random walk with restart algorithm on the lncRNA-disease heterogeneous network to infer potential lncRNA-disease associations. CONCLUSIONS Compared with other state-of-the-art methods, IDHI-MIRW achieves the best prediction performance. In case studies of breast cancer, stomach cancer, and colorectal cancer, 36/45 (80%) novel lncRNA-disease associations predicted by IDHI-MIRW are supported by recent literature. Furthermore, we found that lncRNA LINC01816 is associated with the survival of colorectal cancer patients. IDHI-MIRW is freely available at https://github.com/NWPU-903PR/IDHI-MIRW.
Affiliation(s)
- Xiao-Nan Fan
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Xi’an, 710072 Shaanxi China
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Pittsburgh, PA 15206 USA
- Shao-Wu Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Xi’an, 710072 Shaanxi China
- Song-Yao Zhang
  - Key Laboratory of Information Fusion Technology of Ministry of Education, School of Automation, Northwestern Polytechnical University, 127 West Youyi Road, Xi’an, 710072 Shaanxi China
- Kunju Zhu
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Pittsburgh, PA 15206 USA
  - The First Affiliated Hospital and Clinical Medicine Research Institute, Jinan University, Guangzhou, China
- Songjian Lu
  - Department of Biomedical Informatics, University of Pittsburgh, 5607 Baum Blvd, Pittsburgh, PA 15206 USA
|
86
|
Xuan Z, Li J, Yu J, Feng X, Zhao B, Wang L. A Probabilistic Matrix Factorization Method for Identifying lncRNA-disease Associations. Genes (Basel) 2019; 10:genes10020126. [PMID: 30744078 PMCID: PMC6410097 DOI: 10.3390/genes10020126] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 12/29/2018] [Revised: 01/31/2019] [Accepted: 02/04/2019] [Indexed: 12/15/2022] Open
Abstract
Recently, an increasing number of studies have indicated that long-non-coding RNAs (lncRNAs) can participate in various crucial biological processes and can also be used as the most promising biomarkers for the treatment of certain diseases such as coronary artery disease and various cancers. Due to costs and time complexity, the number of possible disease-related lncRNAs that can be verified by traditional biological experiments is very limited. Therefore, in recent years, it has been very popular to use computational models to predict potential disease-lncRNA associations. In this study, we constructed three kinds of association networks, namely the lncRNA-miRNA association network, the miRNA-disease association network, and the lncRNA-disease correlation network firstly. Then, through integrating these three newly constructed association networks, we constructed an lncRNA-disease weighted association network, which would be further updated by adopting the KNN algorithm based on the semantic similarity of diseases and the similarity of lncRNA functions. Thereafter, according to the updated lncRNA-disease weighted association network, a novel computational model called PMFILDA was proposed to infer potential lncRNA-disease associations based on the probability matrix decomposition. Finally, to evaluate the superiority of the new prediction model PMFILDA, we performed Leave One Out Cross-Validation (LOOCV) based on strongly validated data filtered from MNDR and the simulation results indicated that the performance of PMFILDA was better than some state-of-the-art methods. Moreover, case studies of breast cancer, lung cancer, and colorectal cancer were implemented to further estimate the performance of PMFILDA, and simulation results illustrated that PMFILDA could achieve satisfying prediction performance as well.
Affiliation(s)
- Zhanwei Xuan
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan 411105, China
- Jiechen Li
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan 411105, China
- Jingwen Yu
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan 411105, China
- Xiang Feng
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan 411105, China
- Bihai Zhao
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, XiangTan 411105, China
|
87
|
Wen Y, Han G, Anh VV. Laplacian normalization and bi-random walks on heterogeneous networks for predicting lncRNA-disease associations. BMC Systems Biology 2018; 12:122. [PMID: 30598088 PMCID: PMC6311918 DOI: 10.1186/s12918-018-0660-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 01/26/2023]
Abstract
BACKGROUND Evidences have increasingly indicated that lncRNAs (long non-coding RNAs) are deeply involved in important biological regulation processes leading to various human complex diseases. Experimental investigations of these disease associated lncRNAs are slow with high costs. Computational methods to infer potential associations between lncRNAs and diseases have become an effective prior-pinpointing approach to the experimental verification. RESULTS In this study, we develop a novel method for the prediction of lncRNA-disease associations using bi-random walks on a network merging the similarities of lncRNAs and diseases. Particularly, this method applies a Laplacian technique to normalize the lncRNA similarity matrix and the disease similarity matrix before the construction of the lncRNA similarity network and disease similarity network. The two networks are then connected via existing lncRNA-disease associations. After that, bi-random walks are applied on the heterogeneous network to predict the potential associations between the lncRNAs and the diseases. Experimental results demonstrate that the performance of our method is highly comparable to or better than the state-of-the-art methods for predicting lncRNA-disease associations. Our analyses on three cancer data sets (breast cancer, lung cancer, and liver cancer) also indicate the usefulness of our method in practical applications. CONCLUSIONS Our proposed method, including the construction of the lncRNA similarity network and disease similarity network and the bi-random walks algorithm on the heterogeneous network, could be used for prediction of potential associations between the lncRNAs and the diseases.
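The abstract above mentions normalizing each similarity matrix with a Laplacian technique before the networks are built, but it does not state the formula. One common form is the symmetric normalization D^(-1/2) S D^(-1/2), where D is the diagonal matrix of row sums of the similarity matrix S. The sketch below assumes that form; the function name and the toy matrix are hypothetical and are not taken from the paper.

```python
import math


def laplacian_normalize(similarity):
    """Symmetric normalisation S_norm[i][j] = S[i][j] / sqrt(d_i * d_j).

    similarity: square, symmetric list of lists with non-negative entries.
    d_i is the sum of row i. The choice of the symmetric form is an assumption.
    """
    n = len(similarity)
    degrees = [sum(row) for row in similarity]
    # Guard against isolated nodes whose row sum is zero.
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in degrees]
    return [
        [inv_sqrt[i] * similarity[i][j] * inv_sqrt[j] for j in range(n)]
        for i in range(n)
    ]


# Toy 2 x 2 similarity matrix (values are illustrative only).
toy_similarity = [[1.0, 0.5], [0.5, 1.0]]
normalized = laplacian_normalize(toy_similarity)
```

Normalizing in this way keeps the matrix symmetric and damps the influence of nodes with many strong similarities, which helps an iterative walk remain stable.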
Affiliation(s)
- Yaping Wen
  - School of Mathematics and Computational Science, Xiangtan University, Hunan, 411105, China
- Guosheng Han
  - School of Mathematics and Computational Science, Xiangtan University, Hunan, 411105, China
- Vo V Anh
  - School of Mathematics and Computational Science, Xiangtan University, Hunan, 411105, China
  - Department of Mathematics, Swinburne University of Technology, PO Box 218, Hawthorn, Vic 3122, Australia
|
88
|
Manzanarez-Ozuna E, Flores DL, Gutiérrez-López E, Cervantes D, Juárez P. Model based on GA and DNN for prediction of mRNA-Smad7 expression regulated by miRNAs in breast cancer. Theor Biol Med Model 2018; 15:24. [PMID: 30594253 PMCID: PMC6310970 DOI: 10.1186/s12976-018-0095-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 05/31/2018] [Accepted: 11/30/2018] [Indexed: 01/06/2023] Open
Abstract
Background The Smad7 protein is a negative regulator of the TGF-β signaling pathway, which is upregulated in patients with breast cancer. miRNAs regulate protein expression by arresting or degrading mRNAs. The purpose of this work is to identify a miRNA profile that regulates the expression of the mRNA coding for Smad7 in breast cancer, using data from patients with breast cancer obtained from the Cancer Genome Atlas Project. Methods We develop an automatic search method based on genetic algorithms to find a predictive model based on deep neural networks (DNN) that fits the set of biological data, and we apply the Olden algorithm to identify the relative importance of each miRNA. Results A computational model of non-linear regression based on deep neural networks is shown. It predicts the regulation exerted by the miRNAs on the target mRNA transcripts coding for the Smad7 protein in patients with breast cancer, with an R² of 0.99 and an MSE of 0.00001. In addition, the model is validated against the results of in vivo and in vitro experiments reported in the literature. The set of miRNAs hsa-mir-146a, hsa-mir-93, hsa-mir-375, hsa-mir-205, hsa-mir-15a, hsa-mir-21, hsa-mir-20a, hsa-mir-503, hsa-mir-29c, hsa-mir-497, hsa-mir-107, hsa-mir-125a, hsa-mir-200c, hsa-mir-212, hsa-mir-429, hsa-mir-34a, hsa-let-7c, hsa-mir-92b, hsa-mir-33a, hsa-mir-15b, hsa-mir-224, hsa-mir-185 and hsa-mir-10b forms a profile that critically regulates the expression of the mRNA coding for Smad7 in breast cancer. Conclusions We developed a genetic algorithm to select the best features (miRNAs) as DNN inputs. The genetic algorithm also builds the best DNN architecture by optimizing the parameters. Although the results have not yet been confirmed by laboratory experiments, they suggest that the miRNA profile could be used as biomarkers or as targets in targeted therapies.
Affiliation(s)
- Edgar Manzanarez-Ozuna
  - Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
- Dora-Luz Flores
  - Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
- Everardo Gutiérrez-López
  - Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
- David Cervantes
  - Universidad Autónoma de Baja California, Carretera Transpeninsular Ensenada-Tijuana 3917 Colonia Playitas, C.P. 22860, Ensenada, B.C., Mexico
- Patricia Juárez
  - Centro de Investigación Científica y de Educación Superior de Ensenada, Carretera Ensenada-Tijuana No. 3918, Zona Playitas, C.P. 22860, Ensenada, B.C., Mexico
|
89
|
Huang H, Fu S, Liu D. Detection and Analysis of the Hedgehog Signaling Pathway-Related Long Non-Coding RNA (lncRNA) Expression Profiles in Keloid. Med Sci Monit 2018; 24:9032-9044. [PMID: 30543583 PMCID: PMC6301256 DOI: 10.12659/msm.911159] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Indexed: 02/03/2023] Open
Abstract
BACKGROUND Hedgehog (Hh) signaling pathway-related genes have important roles in several physiological and disease processes that involve cell proliferation. Long non-coding region RNAs (lncRNAs) have a regulatory role on gene expression. Keloid is characterized by excessive proliferation of scar tissue following trauma. The aims of this study were to evaluate the Hh signaling pathway in keloid skin tissues and its downstream gene expression and lncRNAs, compared with normal skin. MATERIAL AND METHODS Four pairs of keloids and adjacent normal skin epidermis underwent total RNA extraction. Gene chip high-throughput real-time quantitative polymerase chain reaction (qPCR) was used to examine the differential expression profiles of the Hh signaling pathway-related lncRNAs and mRNAs in the human keloid and normal skin. The differentially expressed mRNAs were analyzed by Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) to identify their biological roles. RESULTS In keloid tissue, differential expression of 33 mRNAs and 30 lncRNAs relating to the Hh pathway, were verified by gene chip qPCR. The results of GO and KEGG analysis showed that the upregulated mRNAs were involved in cell proliferation, cell growth, and tissue repair, and down-regulated mRNAs were involved in apoptosis. The lncRNA, AC073257.2, affected cell keloid growth and proliferation by its upstream target the GLI2 gene at the transcriptional level. The lncRNA, HNF1A-AS1, affected cell keloid growth and proliferation by its neighboring target gene, HNF1A. CONCLUSIONS Differential expression occurred in Hh signaling pathway-related lncRNAs and mRNAs, which may provide further insight into the development of keloid.
Affiliation(s)
- Heping Huang
  - Institute of Burns, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China (mainland)
  - Department of Plastic and Aesthetic Surgery, Jingxi Maternal and Child Health Hospital, Nanchang, Jiangxi, China (mainland)
- Shangfeng Fu
  - Institute of Burns, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China (mainland)
- Dewu Liu
  - Institute of Burns, The First Affiliated Hospital of Nanchang University, Nanchang, Jiangxi, China (mainland)
|
90
|
Multiple Linear Regression Analysis of lncRNA-Disease Association Prediction Based on Clinical Prognosis Data. BioMed Research International 2018; 2018:3823082. [PMID: 30643802 PMCID: PMC6311254 DOI: 10.1155/2018/3823082] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 07/20/2018] [Revised: 10/23/2018] [Accepted: 11/05/2018] [Indexed: 01/06/2023]
Abstract
Long noncoding RNAs (lncRNAs) have an important role in various life processes of the body, especially cancer. The analysis of disease prognosis is ignored in current prediction on lncRNA-disease associations. In this study, a multiple linear regression model was constructed for lncRNA-disease association prediction based on clinical prognosis data (MlrLDAcp), which integrated the cancer data of clinical prognosis and the expression quantity of lncRNA transcript. MlrLDAcp could realize not only cancer survival prediction but also lncRNA-disease association prediction. Ultimately, 60 lncRNAs most closely related to prostate cancer survival were selected from 481 alternative lncRNAs. Then, the multiple linear regression relationship between the prognosis survival of 176 patients with prostate cancer and 60 lncRNAs was also given. Compared with previous studies, MlrLDAcp had a predominant survival predictive ability and could effectively predict lncRNA-disease associations. MlrLDAcp had an area under the curve (AUC) value of 0.875 for survival prediction and an AUC value of 0.872 for lncRNA-disease association prediction. It could be an effective biological method for biomedical research.
|
91
|
Yang YL, Hu F, Xue M, Jia YJ, Zheng ZJ, Li Y, Xue YM. Early growth response protein-1 upregulates long noncoding RNA Arid2-IR to promote extracellular matrix production in diabetic kidney disease. Am J Physiol Cell Physiol 2018; 316:C340-C352. [PMID: 30462533 DOI: 10.1152/ajpcell.00167.2018] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Indexed: 11/22/2022]
Abstract
Diabetic kidney disease (DKD) has surpassed chronic glomerulonephritis as the leading cause of end-stage renal disease. Previously, we showed that early growth response protein-1 (Egr1) plays a key role in DKD by enhancing mesangial cell proliferation and extracellular matrix (ECM) production. The long noncoding RNA (lncRNA) AT-rich interactive domain 2-IR (Arid2-IR) has been identified as a mothers against decapentaplegic homolog 3 (Smad3)-associated lncRNA in unilateral ureteral obstructive kidney disease. However, the effect of Egr1 on Arid2-IR in the development of DKD is still unknown. In this study, we found that Arid2-IR was increased in mice with high-fat diet and streptozotocin-induced type 2 diabetes and in mouse mesangial cells cultured with high glucose to mimic diabetes. Knockdown of Arid2-IR in mouse mesangial cells reduced the high expression levels of collagen-α1(I) (Col1a1) and α-smooth muscle actin (α-SMA) induced by high glucose. Furthermore, Arid2-IR expression changed the increased expression of Col1a1 and α-SMA caused by overexpression of Egr1. Overall, these data suggest that increased Arid2-IR likely contributes to ECM production in DKD and that Egr1 promotes ECM production in DKD partly by upregulating Arid2-IR. Thus, Arid2-IR may be a new target in the treatment of DKD.
Affiliation(s)
- Yan-Lin Yang
  - Department of Endocrinology and Metabolism, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Fang Hu
  - Department of Endocrinology and Metabolism, The Fifth Affiliated Hospital of Sun Yat-Sen University, Zhuhai, China
- Meng Xue
  - Department of Endocrinology and Metabolism, Shenzhen People's Hospital, Second Affiliated Hospital of Jinan University, Shenzhen, China
- Yi-Jie Jia
  - Department of Endocrinology and Metabolism, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Zong-Ji Zheng
  - Department of Endocrinology and Metabolism, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yang Li
  - Department of Geriatrics, Zhu Jiang Hospital, Southern Medical University, Guangzhou, China
- Yao-Ming Xue
  - Department of Endocrinology and Metabolism, Nanfang Hospital, Southern Medical University, Guangzhou, China
|
92
|
Xiao X, Zhu W, Liao B, Xu J, Gu C, Ji B, Yao Y, Peng L, Yang J. BPLLDA: Predicting lncRNA-Disease Associations Based on Simple Paths With Limited Lengths in a Heterogeneous Network. Front Genet 2018; 9:411. [PMID: 30459803 PMCID: PMC6232683 DOI: 10.3389/fgene.2018.00411] [Citation(s) in RCA: 34] [Impact Index Per Article: 5.7] [Received: 07/01/2018] [Accepted: 09/05/2018] [Indexed: 12/31/2022] Open
Abstract
In recent years, it has been increasingly clear that long noncoding RNAs (lncRNAs) play critical roles in many biological processes associated with human diseases. Inferring potential lncRNA-disease associations is essential to reveal the secrets behind diseases, develop novel drugs, and optimize personalized treatments. However, biological experiments to validate lncRNA-disease associations are very time-consuming and costly. Thus, it is critical to develop effective computational models. In this study, we have proposed a method called BPLLDA to predict lncRNA-disease associations based on paths of fixed lengths in a heterogeneous lncRNA-disease association network. Specifically, BPLLDA first constructs a heterogeneous lncRNA-disease network by integrating the lncRNA-disease association network, the lncRNA functional similarity network, and the disease semantic similarity network. It then infers the probability of an lncRNA-disease association based on paths connecting them and their lengths in the network. Compared to existing methods, BPLLDA has a few advantages, including not demanding negative samples and the ability to predict associations related to novel lncRNAs or novel diseases. BPLLDA was applied to a canonical lncRNA-disease association database called LncRNADisease, together with two popular methods LRLSLDA and GrwLDA. The leave-one-out cross-validation areas under the receiver operating characteristic curve of BPLLDA are 0.87117, 0.82403, and 0.78528, respectively, for predicting overall associations, associations related to novel lncRNAs, and associations related to novel diseases, higher than those of the two compared methods. In addition, cervical cancer, glioma, and non-small-cell lung cancer were selected as case studies, for which the predicted top five lncRNA-disease associations were verified by recently published literature. 
In summary, BPLLDA exhibits good performances in predicting novel lncRNA-disease associations and associations related to novel lncRNAs and diseases. It may contribute to the understanding of lncRNA-associated diseases like certain cancers.
Affiliation(s)
- Xiaofang Xiao
  - College of Information Science and Engineering, Hunan University, Changsha, China
- Wen Zhu
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Bo Liao
  - College of Information Science and Engineering, Hunan University, Changsha, China
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Junlin Xu
  - College of Information Science and Engineering, Hunan University, Changsha, China
- Changlong Gu
  - College of Information Science and Engineering, Hunan University, Changsha, China
- Binbin Ji
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Yuhua Yao
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
- Lihong Peng
  - School of Computer Science, Hunan University of Technology, Zhuzhou, China
- Jialiang Yang
  - School of Mathematics and Statistics, Hainan Normal University, Haikou, China
  - Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, United States
|
93
|
Zeng P, Chen J, Meng Y, Zhou Y, Yang J, Cui Q. Defining Essentiality Score of Protein-Coding Genes and Long Noncoding RNAs. Front Genet 2018; 9:380. [PMID: 30356729 PMCID: PMC6189311 DOI: 10.3389/fgene.2018.00380] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 07/25/2018] [Accepted: 08/27/2018] [Indexed: 12/16/2022] Open
Abstract
Measuring the essentiality of genes is critically important in biology and medicine. Here we proposed a computational method, GIC (Gene Importance Calculator), which can efficiently predict the essentiality of both protein-coding genes and long noncoding RNAs (lncRNAs) based on only sequence information. For identifying the essentiality of protein-coding genes, GIC outperformed well-established computational scores. In an independent mouse lncRNA dataset, GIC also achieved an exciting performance (AUC = 0.918). In contrast, the traditional computational methods are not applicable to lncRNAs. Moreover, we explored several potential applications of GIC score. Firstly, we revealed a correlation between gene GIC score and research hotspots of genes. Moreover, GIC score can be used to evaluate whether a gene in mouse is representative for its homolog in human by dissecting its cross-species difference. This is critical for basic medicine because many basic medical studies are performed in animal models. Finally, we showed that GIC score can be used to identify candidate genes from a transcriptomics study. GIC is freely available at http://www.cuilab.cn/gic/.
Affiliation(s)
- Pan Zeng
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
- Ji Chen
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
- Yuhong Meng
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
- Yuan Zhou
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
- Jichun Yang
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
- Qinghua Cui
  - School of Basic Medical Sciences, MOE Key Lab of Cardiovascular Sciences, Department of Biomedical Informatics, Department of Physiology and Pathophysiology, Centre for Noncoding RNA Medicine, Peking University, Beijing, China
|
94
|
Chen X, Cheng JY, Yin J. Predicting microRNA-disease associations using bipartite local models and hubness-aware regression. RNA Biol 2018; 15:1192-1205. [PMID: 30196756 DOI: 10.1080/15476286.2018.1517010] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Indexed: 01/21/2023] Open
Abstract
The development and progression of numerous complex human diseases have been confirmed to be associated with microRNAs (miRNAs) by various experimental and clinical studies. Predicting potential miRNA-disease associations can help us understand the underlying molecular and cellular mechanisms of diseases and promote the development of disease treatment and diagnosis. Due to the high cost of conventional experimental verification, proposing a new computational method for miRNA-disease association prediction is an efficient and economical way. Since previous computational models ignored the hubness phenomenon, we presented a novel computational model of Bipartite Local models and Hubness-Aware Regression for MiRNA-Disease Association prediction (BLHARMDA). In this method, we first used known miRNA-disease associations to calculate the Jaccard similarity between miRNAs and between diseases, then utilized a modified kNNs model in the bipartite local model method. As a result, we effectively alleviated the detriments from 'bad' hubs. BLHARMDA obtained AUCs of 0.9141 and 0.8390 in the global and local leave-one-out cross validation, respectively, which outperformed most of the previous models and proved high prediction performance of BLHARMDA. Besides, the standard deviation of 0.0006 in 5-fold cross validation confirmed our model's prediction stability and the averaged prediction accuracy of 0.9120 showed the high precision of our model. In addition, to further evaluate our model's accuracy, we implemented BLHARMDA on three typical human diseases in three different types of case studies. As a result, 49 (Esophageal Neoplasms), 50 (Lung Neoplasms) and 50 (Carcinoma Hepatocellular) out of the top 50 related miRNAs were validated by recent experimental discoveries.
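The abstract above computes Jaccard similarity between miRNAs (and between diseases) from known associations. As a minimal sketch of that single step, and not of BLHARMDA as a whole, the code below assumes the associations are stored as a binary matrix with one row per miRNA and one column per disease. The function names and the toy matrix are hypothetical.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two binary association profiles.

    Each profile is a sequence of 0/1 flags; the index is the size of the
    intersection of the associated items divided by the size of their union.
    """
    set_a = {idx for idx, flag in enumerate(profile_a) if flag}
    set_b = {idx for idx, flag in enumerate(profile_b) if flag}
    union = set_a | set_b
    if not union:
        # Two empty profiles share nothing; treating this as 0.0 is an assumption.
        return 0.0
    return len(set_a & set_b) / len(union)


def pairwise_jaccard(associations):
    """Similarity matrix between all rows of a binary association matrix."""
    return [
        [jaccard_similarity(row_i, row_j) for row_j in associations]
        for row_i in associations
    ]


# Toy matrix: 3 miRNAs (rows) x 3 diseases (columns), values are illustrative.
toy_associations = [[1, 1, 0], [1, 0, 1], [0, 0, 0]]
mirna_similarity = pairwise_jaccard(toy_associations)
```

Transposing the association matrix and calling `pairwise_jaccard` again would give the corresponding disease-disease similarities.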
Affiliation(s)
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Jun-Yan Cheng
  - College of Computer Science and Technology, Wuhan University of Science and Technology, Hubei, China
- Jun Yin
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
|
95
|
PR-LncRNA signature regulates glioma cell activity through expression of SOX factors. Sci Rep 2018; 8:12746. [PMID: 30143669 PMCID: PMC6109087 DOI: 10.1038/s41598-018-30836-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/29/2018] [Accepted: 08/07/2018] [Indexed: 11/27/2022] Open
Abstract
Long non-coding RNAs (LncRNAs) have emerged as a relevant class of genome regulators involved in a broad range of biological processes and with important roles in tumor initiation and malignant progression. We have previously identified a p53-regulated tumor suppressor signature of LncRNAs (PR-LncRNAs) in colorectal cancer. Our aim was to identify the expression and function of this signature in gliomas. We found that the expression of the four PR-LncRNAs tested was high in human low-grade glioma samples and diminished with increasing grade of disease, being the lowest in glioblastoma samples. Functional assays demonstrated that PR-LncRNA silencing increased glioma cell proliferation and oncosphere formation. Mechanistically, we found an inverse correlation between PR-LncRNA expression and SOX1, SOX2 and SOX9 stem cell factors in human glioma biopsies and in glioma cells in vitro. Moreover, knock-down of SOX activity abolished the effect of PR-LncRNA silencing in glioma cell activity. In conclusion, our results demonstrate that the expression and function of PR-LncRNAs are significantly altered in gliomagenesis and that their activity is mediated by SOX factors. These results may provide important insights into the mechanisms responsible for glioblastoma pathogenesis.
|
96
|
Zhao Y, Chen X, Yin J. A Novel Computational Method for the Identification of Potential miRNA-Disease Association Based on Symmetric Non-negative Matrix Factorization and Kronecker Regularized Least Square. Front Genet 2018; 9:324. [PMID: 30186308 PMCID: PMC6111239 DOI: 10.3389/fgene.2018.00324] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 07/03/2018] [Accepted: 07/30/2018] [Indexed: 12/31/2022] Open
Abstract
Increasing evidence has indicated that microRNAs (miRNAs) are associated with numerous human diseases. Studying the associations between miRNAs and diseases contributes to the exploration of effective diagnostic and treatment approaches for diseases. Unfortunately, the use of biological experiments to reveal the potential associations between miRNAs and diseases is time consuming and costly. Therefore, it is very necessary to use simple and efficient calculation models to predict potential disease-related miRNAs. Considering the limitations of other previous methods, we proposed a novel computational model of Symmetric Nonnegative Matrix Factorization for MiRNA-Disease Association prediction (SNMFMDA) to reveal the relation of miRNA-disease pairs. SNMFMDA could be applied to predict miRNAs associated with new diseases. Compared to the direct use of the integrated similarity in previous computational models, the integrated similarity need to be interpolated by symmetric non-negative matrix factorization (SymNMF) before application in SNMFMDA, and the relevant probability of disease-miRNA was obtained mainly through Kronecker regularized least square (KronRLS) method in our model. What's more, the AUC of global leave-one-out cross validation (LOOCV) reached 0.9007, and the AUC based on local LOOCV was 0.8426. Besides, the mean and the standard deviation of AUCs achieved 0.8830 and 0.0017 respectively in 5-fold cross validation. All of the above results demonstrated the superior prediction performance of SNMFMDA. We also conducted three different case studies on Esophageal Neoplasms, Breast Neoplasms and Lung Neoplasms, and 49, 49, and 48 of the top 50 of their predicted miRNAs respectively were confirmed by databases or related literatures. It could be expected that SNMFMDA would be a model with the ability to predict disease-related miRNAs efficiently and accurately.
Affiliation(s)
- Yan Zhao
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Xing Chen
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Jun Yin
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
|
97
|
He BS, Qu J, Zhao Q. Identifying and Exploiting Potential miRNA-Disease Associations With Neighborhood Regularized Logistic Matrix Factorization. Front Genet 2018; 9:303. [PMID: 30131824 PMCID: PMC6090164 DOI: 10.3389/fgene.2018.00303] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/25/2018] [Accepted: 07/18/2018] [Indexed: 12/12/2022] Open
Abstract
With the rapid development of biological research, microRNAs (miRNAs) have become an attractive topic because many experimental studies have revealed significant associations between miRNAs and diseases. However, considering that experiments are expensive and time-consuming, computational methods for predicting associations between miRNAs and diseases have become increasingly crucial. In this study, we proposed a neighborhood regularized logistic matrix factorization method for miRNA-disease association prediction (NRLMFMDA) by integrating miRNA functional similarity, disease semantic similarity, Gaussian interaction profile kernel similarity, and experimentally validated disease-miRNA associations. We used Gaussian interaction profile kernel similarity to compensate for the shortcomings of the traditional similarity measures and make them more reasonable and complete. Furthermore, NRLMFMDA also considered the important influence of neighborhood information and took full advantage of it to improve the accuracy of miRNA-disease association prediction. We also improved the accuracy by giving higher weights to the known association data when calculating the potential association probabilities. In the global and the local leave-one-out cross validation, NRLMFMDA obtained AUCs of 0.9068 and 0.8239, respectively. Moreover, the average AUC of NRLMFMDA in 5-fold cross validation was 0.8976 ± 0.0034. All three kinds of cross validation showed significant advantages over a number of previous models. In the case studies of breast neoplasms, esophageal neoplasms and lymphoma according to known miRNA-disease associations in the recent version of the HMDD database, 78%, 80%, and 74% of the top 50 predicted miRNAs were verified to have associations with these three diseases, respectively.
In the further case studies for a new disease without any known related miRNAs and for the previous version of the HMDD database, high proportions of the predicted miRNAs were also verified by experimental reports. All the validation results have demonstrated the effectiveness and practicability of NRLMFMDA in predicting potential miRNA-disease associations.
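The abstract describes a logistic matrix factorization in which known associations receive higher weights. The sketch below shows only that core idea: a logistic link over latent vectors and a log-likelihood in which each known association is counted `c` times. The neighborhood regularization terms are deliberately omitted, and the function names, the weight `c`, and the toy vectors are assumptions for illustration rather than the authors' implementation.

```python
import math


def association_probability(u, v):
    """Logistic link p = 1 / (1 + exp(-u.v)) for latent vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    # Branch on the sign so math.exp never receives a large positive argument.
    if dot >= 0:
        return 1.0 / (1.0 + math.exp(-dot))
    e = math.exp(dot)
    return e / (1.0 + e)


def weighted_log_likelihood(U, V, Y, c):
    """Log-likelihood in which each known association (Y[i][j] == 1) is weighted by c."""
    total = 0.0
    for i, u in enumerate(U):
        for j, v in enumerate(V):
            p = association_probability(u, v)
            if Y[i][j] == 1:
                total += c * math.log(p)      # known association, up-weighted
            else:
                total += math.log(1.0 - p)    # unknown pair, weight 1
    return total


# Toy example (hypothetical values): one miRNA, two diseases, 2-dimensional latent space.
U = [[0.0, 0.0]]
V = [[1.0, 2.0], [-1.0, 0.5]]
Y = [[1, 0]]
print(association_probability(U[0], V[0]))   # zero dot product -> 0.5
print(weighted_log_likelihood(U, V, Y, c=5))
```

Setting `c = 1` recovers the ordinary Bernoulli log-likelihood, so the weight can be read as how many times each confirmed association is counted relative to an unlabeled pair.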
Affiliation(s)
- Bin-Sheng He
- The First Affiliated Hospital, Changsha Medical University, Changsha, China
- Jia Qu
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, China
- Qi Zhao
- School of Mathematics, Liaoning University, Shenyang, China; Research Center for Computer Simulating and Information Processing of Bio-Macromolecules of Liaoning Province, Shenyang, China
|
98
|
A Novel Probability Model for LncRNA-Disease Association Prediction Based on the Naïve Bayesian Classifier. Genes (Basel) 2018; 9:genes9070345. [PMID: 29986541 PMCID: PMC6071012 DOI: 10.3390/genes9070345] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 05/26/2018] [Revised: 06/24/2018] [Accepted: 07/03/2018] [Indexed: 12/17/2022] Open
Abstract
An increasing number of studies have indicated that long non-coding RNAs (lncRNAs) play crucial roles in biological processes, complex disease diagnoses, prognoses, and treatments. However, experimentally validated associations between lncRNAs and diseases are still very limited. Recently, computational models have been developed to discover potential associations between lncRNAs and diseases by integrating multiple heterogeneous biological data; this has become a hot topic in biological research. In this article, we constructed a global tripartite network by integrating a variety of biological information including miRNA–disease, miRNA–lncRNA, and lncRNA–disease associations and interactions. Then, we constructed a global quadruple network by appending gene–lncRNA interaction, gene–disease association, and gene–miRNA interaction networks to the global tripartite network. Subsequently, based on these two global networks, a novel approach based on the naïve Bayesian classifier was proposed to predict potential lncRNA–disease associations (NBCLDA). Compared with the state-of-the-art methods, our new method does not entirely rely on known lncRNA–disease associations, and can achieve reliable performance with effective areas under the ROC curve (AUCs) in leave-one-out cross validation. Moreover, to further estimate the performance of NBCLDA, case studies of colorectal cancer, prostate cancer, and glioma were implemented in this paper, and the simulation results demonstrated that NBCLDA can be an excellent tool for biomedical research in the future.
|
99
|
A Novel Approach for Predicting Disease-lncRNA Associations Based on the Distance Correlation Set and Information of the miRNAs. Computational and Mathematical Methods in Medicine 2018; 2018:6747453. [PMID: 30046354 PMCID: PMC6038663 DOI: 10.1155/2018/6747453] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/06/2017] [Revised: 04/04/2018] [Accepted: 04/17/2018] [Indexed: 12/29/2022]
Abstract
Recently, accumulating laboratory studies have indicated that many long noncoding RNAs (lncRNAs) play important roles in various biological processes and are associated with many complex human diseases. Therefore, developing powerful computational models to predict correlations between lncRNAs and diseases based on heterogeneous biological datasets is important. However, there are few approaches to calculating and analyzing lncRNA-disease associations on the basis of information about miRNAs. In this article, a new computational method based on the distance correlation set is developed to predict lncRNA-disease associations (DCSLDA). Compared with existing state-of-the-art methods, the major novelty of DCSLDA lies in the introduction of the lncRNA-miRNA-disease network and the distance correlation set; thus DCSLDA can be applied to predict potential lncRNA-disease associations without requiring any known disease-lncRNA associations. Simulation results show that DCSLDA can significantly improve on existing models, with a reliable AUC of 0.8517 in leave-one-out cross-validation. Furthermore, when DCSLDA was used to prioritize candidate lncRNAs for three important cancers, 17 predicted associations in the first 0.5% of forecast results were verified by other independent studies and biological experimental studies. Hence, it is anticipated that DCSLDA could be a great addition to the biomedical research field.
|
100
|
Hu H, Zhang L, Ai H, Zhang H, Fan Y, Zhao Q, Liu H. HLPI-Ensemble: Prediction of human lncRNA-protein interactions based on ensemble strategy. RNA Biol 2018; 15:797-806. [PMID: 29583068 DOI: 10.1080/15476286.2018.1457935] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Indexed: 10/17/2022] Open
Abstract
LncRNAs play an important role in many biological processes and in disease progression by binding to related proteins. However, the experimental methods for studying lncRNA-protein interactions are time-consuming and expensive. Although there are a few models designed to predict ncRNA-protein interactions, they all have some common drawbacks that limit their predictive performance. In this study, we present a model called HLPI-Ensemble designed specifically for human lncRNA-protein interactions. HLPI-Ensemble adopts the ensemble strategy based on three mainstream machine learning algorithms, Support Vector Machines (SVM), Random Forests (RF) and Extreme Gradient Boosting (XGB), to generate HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble, respectively. The results of 10-fold cross-validation show that HLPI-SVM Ensemble, HLPI-RF Ensemble and HLPI-XGB Ensemble achieved AUCs of 0.95, 0.96 and 0.96, respectively, in the test dataset. Furthermore, we compared the performance of the HLPI-Ensemble models with the previous models on an external validation dataset. The results show that the false positives (FPs) of the HLPI-Ensemble models are much lower than those of the previous models, and the other evaluation indicators of the HLPI-Ensemble models are also higher than those of the previous models. This further shows that the HLPI-Ensemble models are superior to previous models in predicting human lncRNA-protein interactions. HLPI-Ensemble is publicly available at http://ccsipb.lnu.edu.cn/hlpiensemble/.
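The AUCs in this abstract come from 10-fold cross-validation. The sketch below illustrates only the partitioning step: samples are shuffled, dealt into k folds, and each fold serves once as the held-out test set. It is a generic illustration, not the authors' protocol; the function name `k_fold_indices`, the seed, and the sample count are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Return k (train_indices, test_indices) pairs that partition range(n_samples)."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]     # deal indices round-robin into k folds
    splits = []
    for held_out in range(k):
        test = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        splits.append((train, test))
    return splits


# Hypothetical usage: 25 lncRNA-protein pairs split into 10 folds.
for train, test in k_fold_indices(25, 10):
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so every prediction used for the AUC is made by a model that did not see that sample during training.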
Affiliation(s)
- Huan Hu
- School of Life Science, Liaoning University, Shenyang, China
- Li Zhang
- School of Life Science, Liaoning University, Shenyang, China
- Haixin Ai
- School of Life Science, Liaoning University, Shenyang, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, China
- Hui Zhang
- School of Life Science, Liaoning University, Shenyang, China
- Yetian Fan
- School of Mathematics, Liaoning University, Shenyang, China
- Qi Zhao
- School of Mathematics, Liaoning University, Shenyang, China
- Hongsheng Liu
- School of Life Science, Liaoning University, Shenyang, China; Research Center for Computer Simulating and Information Processing of Bio-macromolecules of Liaoning Province, Shenyang, China; Engineering Laboratory for Molecular Simulation and Designing of Drug Molecules of Liaoning, Shenyang, China
|