1
Caniza H, Cáceres JJ, Torres M, Paccanaro A. LanDis: the disease landscape explorer. Eur J Hum Genet 2024; 32:461-465. [PMID: 38200084 PMCID: PMC10999415 DOI: 10.1038/s41431-023-01511-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/13/2023] [Revised: 11/01/2023] [Accepted: 11/23/2023] [Indexed: 01/12/2024] Open
Abstract
From a network medicine perspective, a disease is the consequence of perturbations on the interactome. These perturbations tend to appear in a specific neighbourhood on the interactome, the disease module, and modules related to phenotypically similar diseases tend to be located in close-by regions. We present LanDis, a freely available web-based interactive tool ( https://paccanarolab.org/landis ) that allows domain experts, medical doctors and the larger scientific community to graphically navigate the interactome distances between the modules of over 44 million pairs of heritable diseases. The map-like interface provides detailed comparisons between pairs of diseases together with supporting evidence. Every disease in LanDis is linked to relevant entries in OMIM and UniProt, providing a starting point for in-depth analysis and an opportunity for novel insight into the aetiology of diseases as well as differential diagnosis.
Affiliation(s)
- Horacio Caniza
  - Universidad Paraguayo Alemana de Ciencias Aplicadas, Facultad de Ciencias de la Ingeniería, San Lorenzo, Paraguay
  - Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
- Juan J Cáceres
  - Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
- Mateo Torres
  - Escola de Matemática Aplicada, Fundação Getúlio Vargas, Rio de Janeiro, Brazil
- Alberto Paccanaro
  - Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway University of London, Egham, UK
  - Escola de Matemática Aplicada, Fundação Getúlio Vargas, Rio de Janeiro, Brazil
2
Xu L, Fu X, Zhuo L, Zhou Z, Liao X, Tian S, Kang R, Chen Y. SGAE-MDA: Exploring the MiRNA-disease associations in herbal medicines based on semi-supervised graph autoencoder. Methods 2024; 221:73-81. [PMID: 38123109 DOI: 10.1016/j.ymeth.2023.12.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 11/28/2023] [Accepted: 12/12/2023] [Indexed: 12/23/2023] Open
Abstract
Research indicates that miRNAs present in herbal medicines are crucial for identifying disease markers, advancing gene therapy, and facilitating drug delivery, among other uses. These miRNAs remain stable in the extracellular environment, making them viable tools for disease diagnosis. They can withstand the digestive processes in the gastrointestinal tract, positioning them as potential carriers for specific oral drug delivery. By engineering plants to generate effective, non-toxic miRNA interference sequences, it is possible to broaden their applicability, including the treatment of diseases such as hepatitis C. Consequently, delving into the miRNA-disease associations (MDAs) within herbal medicines holds immense promise for diagnosing and addressing miRNA-related diseases. In our research, we propose the SGAE-MDA model, which harnesses the strengths of a graph autoencoder (GAE) combined with a semi-supervised approach to uncover potential MDAs in herbal medicines more effectively. Leveraging the GAE framework, the SGAE-MDA model integrates the inherent feature vectors of miRNA and disease nodes with the regulatory data in the miRNA-disease network. Additionally, the proposed semi-supervised learning approach randomly hides part of the structure of the miRNA-disease network and then reconstructs it within the GAE framework. This technique effectively minimizes network noise interference. In comparisons against other leading deep learning models, the results consistently highlighted the superior performance of the proposed SGAE-MDA model. Our code and dataset are available at: https://github.com/22n9n23/SGAE-MDA.
Affiliation(s)
- Lei Xu
  - Wenzhou University of Technology, Wenzhou, China
- Xiangzheng Fu
  - Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, China; College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
- Linlin Zhuo
  - Wenzhou University of Technology, Wenzhou, China
- Xuefeng Liao
  - Wenzhou University of Technology, Wenzhou, China
- Sha Tian
  - Department of Internal Medicine, College of Integrated Chinese and Western Medicine, Hunan University of Chinese Medicine, Changsha, Hunan, China
- Ruofei Kang
  - Xuhui Excellent Health Information Technology Co., Ltd., China
- Yifan Chen
  - College of Information Science and Engineering, Hunan University, Changsha, Hunan, China
3
Chen M, Deng Y, Li Z, Ye Y, He Z. KATZNCP: a miRNA-disease association prediction model integrating KATZ algorithm and network consistency projection. BMC Bioinformatics 2023; 24:229. [PMID: 37268893 DOI: 10.1186/s12859-023-05365-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Accepted: 05/26/2023] [Indexed: 06/04/2023] Open
Abstract
BACKGROUND: Clinical studies have shown that miRNAs are closely related to human health. The study of potential associations between miRNAs and diseases will contribute to a profound understanding of the mechanisms of disease development, as well as to human disease prevention and treatment. MiRNA-disease associations predicted by computational methods are the best complement to biological experiments. RESULTS: In this research, a federated computational model, KATZNCP, was proposed on the basis of the KATZ algorithm and network consistency projection to infer potential miRNA-disease associations. In KATZNCP, a heterogeneous network was first constructed by integrating the known miRNA-disease associations, integrated miRNA similarities, and integrated disease similarities; then, the KATZ algorithm was applied to the heterogeneous network to obtain estimated miRNA-disease prediction scores. Finally, refined scores were obtained by the network consistency projection method as the final prediction results. KATZNCP achieved reliable predictive performance in leave-one-out cross-validation (LOOCV) with an AUC value of 0.9325, which was better than that of comparable state-of-the-art algorithms. Furthermore, case studies of lung neoplasms and esophageal neoplasms demonstrated the excellent predictive performance of KATZNCP. CONCLUSION: A new computational model, KATZNCP, was proposed for predicting potential miRNA-disease associations based on KATZ and network consistency projection, and it can effectively predict potential miRNA-disease interactions. Therefore, KATZNCP can be used to provide guidance for future experiments.
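As an illustration of the KATZ step described in this abstract, the following is a minimal Python sketch of a truncated Katz measure on a heterogeneous miRNA-disease network. It is not the authors' implementation: the function names, the attenuation factor `beta`, the walk-length cutoff `max_length` and the toy matrices are all assumptions made for the example, and the network consistency projection step is omitted.

```python
import numpy as np


def heterogeneous_network(mirna_sim, disease_sim, assoc):
    """Stack similarities and associations into [[SM, A], [A.T, SD]]."""
    return np.block([[mirna_sim, assoc], [assoc.T, disease_sim]])


def katz_scores(adjacency, beta=0.05, max_length=3):
    """Truncated Katz measure: sum over l = 1..max_length of beta**l * A**l."""
    a = np.asarray(adjacency, dtype=float)
    scores = np.zeros_like(a)
    power = np.eye(a.shape[0])
    for length in range(1, max_length + 1):
        power = power @ a  # entries of A**length weight walks of that length
        scores += (beta ** length) * power
    return scores


# Hypothetical toy data: 3 miRNAs, 2 diseases; miRNA 2 has no known association.
assoc = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)
mirna_sim = np.array([[1.0, 0.2, 0.9], [0.2, 1.0, 0.1], [0.9, 0.1, 1.0]])
disease_sim = np.array([[1.0, 0.3], [0.3, 1.0]])

network = heterogeneous_network(mirna_sim, disease_sim, assoc)
scores = katz_scores(network, beta=0.05, max_length=3)
mirna_disease_scores = scores[:3, 3:]  # top-right block: miRNA x disease
```

In this toy example, miRNA 2 is most similar to miRNA 0, so it receives a higher score for the disease already associated with miRNA 0. Truncating the sum at a small walk length sidesteps the convergence condition on `beta` that the full infinite Katz series would need.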
Affiliation(s)
- Min Chen
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
- Yingwei Deng
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
- Zejun Li
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
- Yifan Ye
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
- Ziyi He
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, 421002, China
4
Justyna M, Antczak M, Szachniuk M. Machine learning for RNA 2D structure prediction benchmarked on experimental data. Brief Bioinform 2023; 24:7140288. [PMID: 37096592 DOI: 10.1093/bib/bbad153] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 01/13/2023] [Revised: 03/15/2023] [Accepted: 03/29/2023] [Indexed: 04/26/2023] Open
Abstract
Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL) and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.
Affiliation(s)
- Marek Justyna
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
- Maciej Antczak
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
- Marta Szachniuk
  - Institute of Computing Science, Poznan University of Technology, Piotrowo 2, 60-965 Poznan, Poland
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland
5
Li L, Gao Z, Zheng CH, Qi R, Wang YT, Ni JC. Predicting miRNA-Disease Association Based on Improved Graph Regression. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3604-3613. [PMID: 34757912 DOI: 10.1109/tcbb.2021.3127017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Recently, as a growing number of associations between microRNAs (miRNAs) and diseases have been discovered, researchers have gradually realized that miRNAs are closely related to several complicated biological processes and human diseases. Hence, it is especially important to construct effective models to infer associations between miRNAs and diseases. In this study, we present Improved Graph Regression for miRNA-Disease Association Prediction (IGRMDA) to uncover potential relationships between miRNAs and diseases. To reduce the inherent noise in the acquired biological datasets, we used a matrix decomposition algorithm to process miRNA functional similarity and disease semantic similarity, and then combined the results with existing similarity information to obtain the final miRNA similarity data and disease similarity data. Then, we used the miRNA-disease association data, miRNA similarity data and disease similarity data to form the corresponding latent spaces. Furthermore, we ran an improved graph regression algorithm in these latent spaces, namely the miRNA-disease association space, the miRNA similarity space and the disease similarity space. Non-negative matrix factorization and partial least squares were used in the graph regression process to obtain important related attributes. Cross-validation experiments and case studies were also carried out to demonstrate the effectiveness of IGRMDA, and they showed that IGRMDA could predict potential associations between miRNAs and diseases.
6
Xu D, Liu B, Wang J, Zhang Z. Bibliometric analysis of artificial intelligence for biotechnology and applied microbiology: Exploring research hotspots and frontiers. Front Bioeng Biotechnol 2022; 10:998298. [PMID: 36277390 PMCID: PMC9585160 DOI: 10.3389/fbioe.2022.998298] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/19/2022] [Accepted: 09/23/2022] [Indexed: 11/13/2022] Open
Abstract
Background: In the biotechnology and applied microbiology sectors, artificial intelligence (AI) has been extensively used in disease diagnostics, drug research and development, functional genomics, biomarker recognition, and medical imaging diagnostics. In our study, from 2000 to 2021, science publications focusing on AI in biotechnology were reviewed, and quantitative, qualitative, and modeling analyses were performed. Methods: On 6 May 2022, the Web of Science Core Collection (WoSCC) was screened for AI applications in biotechnology and applied microbiology; 3,529 studies were identified between 2000 and 2022, and analyzed. The following information was collected: publication, country or region, references, knowledgebase, institution, keywords, journal name, and research hotspots, and examined using VOSviewer and CiteSpace V bibliometric platforms. Results: We showed that 128 countries published articles related to AI in biotechnology and applied microbiology; the United States had the most publications. In addition, 584 global institutions contributed to publications, with the Chinese Academy of Science publishing the most. Reference clusters from studies were categorized into ten headings: deep learning, prediction, support vector machines (SVM), object detection, feature representation, synthetic biology, amyloid, human microRNA precursors, systems biology, and single cell RNA-Sequencing. Research frontier keywords were represented by microRNA (2012–2020) and protein-protein interactions (PPIs) (2012–2020). Conclusion: We systematically, objectively, and comprehensively analyzed AI-related biotechnology and applied microbiology literature, and additionally, identified current hot spots and future trends in this area. Our review provides researchers with a comprehensive overview of the dynamic evolution of AI in biotechnology and applied microbiology and identifies future key research areas.
Affiliation(s)
- Dongyu Xu
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
- Bing Liu
  - Department of Bone Oncology, The People’s Hospital of Liaoning Province, Shenyang, Liaoning, China
- Jian Wang
  - Department of Pathogenic Biology, School of Basic Medicine, China Medical University, Shenyang, Liaoning, China
- Zhichang Zhang
  - Department of Computer, School of Intelligent Medicine, China Medical University, Shenyang, Liaoning, China
  - *Correspondence: Zhichang Zhang
7
Xie G, Xu H, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W, Wang Y, Shao J. DRPADC: A novel drug repositioning algorithm predicting adaptive drugs for COVID-19. Comput Chem Eng 2022; 166:107947. [PMID: 35942213 PMCID: PMC9349049 DOI: 10.1016/j.compchemeng.2022.107947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2021] [Revised: 04/13/2022] [Accepted: 07/27/2022] [Indexed: 12/25/2022]
Abstract
Given that the usual process of developing a new vaccine or drug for COVID-19 demands significant time and funds, drug repositioning has emerged as a promising therapeutic strategy. We propose a method named DRPADC to predict novel drug-disease associations effectively from the original sparse drug-disease association adjacency matrix. Specifically, DRPADC processes the original association matrix with the WKNKN algorithm to reduce its sparsity. Furthermore, multiple types of similarity information are fused by a CKA-MKL algorithm. Finally, a compressed sensing algorithm is used to predict the potential drug-disease (virus) association scores. Experimental results show that DRPADC outperforms several competitive methods in terms of AUC values and case studies. DRPADC achieved AUC values of 0.941, 0.955 and 0.876 on the Fdataset, Cdataset and HDVD dataset, respectively. In addition, the case studies conducted on COVID-19 show that DRPADC can predict drug candidates accurately.
Affiliation(s)
- Guobo Xie
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Haojie Xu
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Jianming Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Guosheng Gu
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China (corresponding author)
- Yuping Sun
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Zhiyi Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Yinting Zhu
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Weiming Wang
  - School of Computer Science, Guangdong University of Technology, Guangzhou 510006, China
- Youfu Wang
  - Huaneng Qinghai Power Generation Co., Ltd. New Energy Branch, Xining 810000, China
- Jiang Shao
  - School of Architecture & Design, China University of Mining and Technology, Xuzhou 221116, China
8
Fujimura Y, Kumazoe M, Tachibana H. 67-kDa Laminin Receptor-Mediated Cellular Sensing System of Green Tea Polyphenol EGCG and Functional Food Pairing. Molecules 2022; 27:5130. [PMID: 36014370 PMCID: PMC9416087 DOI: 10.3390/molecules27165130] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 07/27/2022] [Revised: 08/07/2022] [Accepted: 08/10/2022] [Indexed: 11/16/2022] Open
Abstract
The body is equipped with a “food factor-sensing system” that senses food factors taken into the body, such as polyphenols, sulfur-containing compounds, and vitamins, and plays an essential role in manifesting their physiological effects. For example, (–)-epigallocatechin-3-O-gallate (EGCG), the representative catechin in green tea (Camellia sinensis L.), exerts various effects, including anti-cancer, anti-inflammatory, and anti-allergic effects, when sensed by the cell-surface protein 67-kDa laminin receptor (67LR). Here, we focus on three representative effects of EGCG and describe their specific signaling mechanisms, the 67LR-mediated EGCG-sensing systems. Various components present in foods, such as eriodictyol, hesperetin, sulfide, vitamin A, and fatty acids, have been found to act on the food factor-sensing system and affect the functionality of other foods/food factors, such as green tea extract, EGCG, or its O-methylated derivative, at different experimental levels, i.e., in vitro, in animal models, and/or in clinical trials. These effects arise through increases or decreases in the activity or expression of EGCG-sensing-related molecules. Such functional interaction between food factors is called “functional food pairing”. In this review, we introduce examples of functional food pairings using EGCG.
9
Toor R, Chana I. Exploring diet associations with Covid-19 and other diseases: a Network Analysis-based approach. Med Biol Eng Comput 2022; 60:991-1013. [PMID: 35171411 PMCID: PMC8852958 DOI: 10.1007/s11517-022-02505-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/30/2021] [Accepted: 01/10/2022] [Indexed: 02/07/2023]
Abstract
The current global pandemic, Covid-19, is a severe threat to human health and existence, especially as it mutates very frequently. Being a novel disease, Covid-19 affects patients with comorbidities and is predicted to have long-term consequences, even for those who have recovered from it. To clearly recognize its impact, it is important to comprehend the complex relationship between Covid-19 and other diseases. It has also been observed that people with a good immune system are less susceptible to the disease. If a correlation between Covid-19, other diseases, and diet is established, caregivers would be able to improve their course of medical action and recommendations. Network Analysis is one technique that can bring forth such complex interdependencies and associations. In this paper, a Network Analysis-based approach is proposed for analyzing the interplay of diets/foods with Covid-19 and other diseases. Relationships between Covid-19, diabetes mellitus type 2 (T2DM), non-alcoholic fatty liver disease (NAFLD), and diets have been curated, visualized, and further analyzed in this study so as to predict unknown associations. Network algorithms including the Louvain graph algorithm (LA), K nearest neighbors (KNN), and PageRank (PR) have been employed to predict a total of 60 disease-diet associations, of which 46 have been found, as validated using PubMed literature, to be significant either in disease risk prevention/mitigation or in disease progression. A precision of 76.7% (46 of 60) has been achieved, which is significant considering the involvement of a novel disease like Covid-19. The generated interdependencies can be further explored by medical professionals and caregivers in order to plan healthy eating patterns for Covid-19 patients. The proposed approach can also be utilized for finding beneficial diets for different combinations of comorbidities with Covid-19 as per the underlying health conditions of a patient.
Affiliation(s)
- Rashmeet Toor
  - Cloud and IoT Research Lab, Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
- Inderveer Chana
  - Cloud and IoT Research Lab, Computer Science and Engineering Department, Thapar Institute of Engineering and Technology, Patiala, India
10
Gao Z, Wang YT, Wu QW, Li L, Ni JC, Zheng CH. A New Method Based on Matrix Completion and Non-Negative Matrix Factorization for Predicting Disease-Associated miRNAs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:763-772. [PMID: 32991287 DOI: 10.1109/tcbb.2020.3027444] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/11/2023]
Abstract
Numerous studies have shown that microRNAs are associated with the occurrence and development of human diseases. Thus, studying disease-associated miRNAs is of significant value to the prevention, diagnosis and treatment of diseases. In this paper, we propose a novel method based on matrix completion and non-negative matrix factorization (MCNMF) for predicting disease-associated miRNAs. Because information on miRNA similarities and disease similarities is inadequate, we calculated the latter via two models and introduced the Gaussian interaction profile kernel similarity. In addition, matrix completion (MC) was employed to further replenish the miRNA and disease similarities and so improve prediction performance. To reduce the sparsity of the miRNA-disease association matrix, the weighted K nearest known neighbors (WKNKN) method was used as a pre-processing step. We also used non-negative matrix factorization (NMF) with a dual L2,1-norm, graph Laplacian regularization, and Tikhonov regularization to effectively avoid overfitting during prediction. Finally, several experiments and a case study were carried out to evaluate the effectiveness and performance of the proposed MCNMF model. The results indicated that our method could reliably and effectively predict disease-associated miRNAs.
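The Gaussian interaction profile (GIP) kernel similarity mentioned in this abstract is commonly computed from the rows of the association matrix. The following Python sketch shows that commonly used formulation, not the authors' code: the function name `gip_kernel`, the bandwidth parameter `gamma_prime` and the toy association matrix are assumptions for illustration.

```python
import numpy as np


def gip_kernel(assoc, gamma_prime=1.0):
    """GIP similarity between rows: exp(-gamma * ||IP(i) - IP(j)||^2).

    gamma is gamma_prime divided by the mean squared norm of the profiles,
    so the bandwidth adapts to how dense the association matrix is.
    """
    profiles = np.asarray(assoc, dtype=float)
    sq_norms = np.sum(profiles ** 2, axis=1)
    mean_sq_norm = sq_norms.mean()
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq_norm
    # Pairwise squared Euclidean distances between interaction profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-gamma * sq_dists)


# Hypothetical toy matrix: rows are miRNAs, columns are diseases.
assoc = np.array([[1, 0, 1], [1, 0, 1], [0, 1, 0]], dtype=float)
mirna_gip = gip_kernel(assoc)      # miRNA-miRNA similarity (3 x 3)
disease_gip = gip_kernel(assoc.T)  # disease-disease similarity (3 x 3)
```

Two miRNAs with identical interaction profiles get similarity 1, and the similarity decays as their profiles diverge. Passing the transposed matrix gives the disease-side kernel.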
11
Turning Data to Knowledge: Online Tools, Databases, and Resources in microRNA Research. Advances in Experimental Medicine and Biology 2022; 1385:133-160. [DOI: 10.1007/978-3-031-08356-3_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/11/2022]
12
Xie G, Li J, Gu G, Sun Y, Lin Z, Zhu Y, Wang W. BGMSDDA: a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction. Mol Omics 2021; 17:997-1011. [PMID: 34610633 DOI: 10.1039/d1mo00237f] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/21/2022]
Abstract
Drug repositioning, a method that relies on the information from the original drug-disease association matrix, aims to identify new indications for existing drugs and is expected to greatly reduce the cost and time of drug development. However, most current drug repositioning methods use the original drug-disease association matrix directly without preconditioning. As relatively few associations between drugs and diseases have been determined from actual observations, the original drug-disease association matrix used in the prediction is sparse, which affects the performance of the prediction method. A method for mining similar features of drugs and diseases is still lacking. To solve these problems, we developed a bipartite graph diffusion algorithm with multiple similarity integration for drug-disease association prediction (BGMSDDA). First, the weighted K nearest known neighbors (WKNKN) algorithm was used to reconstruct the drug-disease association matrix. Second, an effective method was designed to extract similar characteristics of drugs and diseases by integrating linear neighborhood similarity and Gaussian kernel similarity. Finally, bipartite graph diffusion was used to infer undiscovered drug-disease associations. In 10-fold cross-validation experiments, BGMSDDA showed excellent performance on two datasets, with AUC values of 0.939 (Fdataset) and 0.954 (Cdataset), and AUPR values of 0.466 (Fdataset) and 0.565 (Cdataset). Furthermore, to evaluate the accuracy of the results of BGMSDDA, we conducted case studies on three medically used drugs selected from Fdataset and Cdataset and validated the predicted associated diseases of each drug against several databases. Based on the results obtained, BGMSDDA was demonstrated to be useful for predicting drug-disease associations.
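To make the WKNKN preprocessing step in this abstract more concrete, here is a minimal Python sketch of one commonly described formulation of weighted K nearest known neighbors. It is an assumption-laden illustration rather than the BGMSDDA code: the function names, the neighbor count `k`, the decay term `eta` and the toy matrices are hypothetical.

```python
import numpy as np


def _neighbor_fill(y, sim, k, eta):
    """Rebuild each row of y from its k most similar rows with known associations."""
    filled = np.zeros_like(y)
    known = np.flatnonzero(y.sum(axis=1) > 0)  # rows with at least one association
    for i in range(y.shape[0]):
        candidates = known[known != i]
        if candidates.size == 0:
            continue
        nearest = candidates[np.argsort(-sim[i, candidates])][:k]
        # Weight of the j-th nearest neighbor decays as eta**j times its similarity.
        weights = (eta ** np.arange(nearest.size)) * sim[i, nearest]
        norm = sim[i, nearest].sum()
        if norm > 0:
            filled[i] = weights @ y[nearest] / norm
    return filled


def wknkn(assoc, drug_sim, disease_sim, k=2, eta=0.5):
    """Densify a sparse association matrix; known entries are never lowered."""
    y = np.asarray(assoc, dtype=float)
    from_drugs = _neighbor_fill(y, np.asarray(drug_sim, dtype=float), k, eta)
    from_diseases = _neighbor_fill(y.T, np.asarray(disease_sim, dtype=float), k, eta).T
    return np.maximum(y, (from_drugs + from_diseases) / 2.0)


# Hypothetical toy data: 3 drugs, 2 diseases; drug 2 has no known association.
assoc = np.array([[1, 0], [0, 1], [0, 0]], dtype=float)
drug_sim = np.array([[1.0, 0.2, 0.9], [0.2, 1.0, 0.1], [0.9, 0.1, 1.0]])
disease_sim = np.array([[1.0, 0.3], [0.3, 1.0]])

dense_assoc = wknkn(assoc, drug_sim, disease_sim, k=2, eta=0.5)
```

The all-zero row for drug 2 is replaced by likelihood values borrowed from its most similar drugs, which is the sparsity reduction the abstract refers to. Taking the element-wise maximum with the original matrix keeps every known association at its original value.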
Affiliation(s)
- Guobo Xie
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Jianming Li
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Guosheng Gu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yuping Sun
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Zhiyi Lin
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Yinting Zhu
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
- Weiming Wang
  - School of Computer Science, Guangdong University of Technology, Guangzhou, China
13
Zhang Y, Chen M, Huang L, Xie X, Li X, Jin H, Wang X, Wei H. Fusion of KATZ measure and space projection to fast probe potential lncRNA-disease associations in bipartite graphs. PLoS One 2021; 16:e0260329. [PMID: 34807960 PMCID: PMC8608294 DOI: 10.1371/journal.pone.0260329] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/12/2021] [Accepted: 11/06/2021] [Indexed: 11/19/2022] Open
Abstract
It is well known that numerous long noncoding RNAs (lncRNAs) are closely related to the physiological and pathological processes of human diseases and can serve as potential biomarkers. Therefore, lncRNA-disease associations identified by computational methods as targeted candidates reduce the cost of the biological experiments needed for further in-depth study. However, inherent problems such as the inaccurate construction of similarity networks and the inadequate number of known lncRNA-disease associations mean that many mature computational methods, developed over many years, still have limitations. This motivates us to explore a new computational method, KATZSP, which fuses the KATZ measure with space projection to quickly probe potential lncRNA-disease associations. KATZSP comprises the following key steps: combining all the global information to change the Boolean network of known lncRNA-disease associations into weighted networks; changing the similarity calculation into counting the number of walks that connect lncRNA nodes and disease nodes in bipartite graphs; and obtaining the space projection scores to refine the primary prediction scores. The fusion of the KATZ measure and space projection is simple, requiring only one attenuation factor. Leave-one-out cross validation (LOOCV) results showed that, compared with other state-of-the-art methods (NCPLDA, LDAI-ISPS and IIRWR), KATZSP had higher predictive accuracy, as measured by the area-under-the-curve (AUC) value, on the three datasets built, and that it worked well for inferring potential associations related to new lncRNAs (or isolated diseases). Results from real case studies (such as pancreatic cancer, lung cancer and colorectal cancer) further confirmed that KATZSP has superior predictive ability and can serve as a guide for traditional biological experiments.
Affiliation(s)
- Yi Zhang
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
  - Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin, China
- Min Chen
  - School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, China
  - The Future Laboratory, Tsinghua University, Beijing, China
- Xiaolan Xie
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Xin Li
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Hong Jin
  - School of Information Science and Engineering, Guilin University of Technology, Guilin, China
- Xiaohua Wang
  - Pharmacy School, Guilin Medical University, Guilin, China
- Hanyan Wei
  - Pharmacy School, Guilin Medical University, Guilin, China
14
Zeng M, Lu C, Fei Z, Wu FX, Li Y, Wang J, Li M. DMFLDA: A Deep Learning Framework for Predicting lncRNA-Disease Associations. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2021; 18:2353-2363. [PMID: 32248123 DOI: 10.1109/tcbb.2020.2983958] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Indexed: 06/11/2023]
Abstract
A growing amount of evidence suggests that long non-coding RNAs (lncRNAs) play important roles in the regulation of biological processes in many human diseases. However, the number of experimentally verified lncRNA-disease associations is very limited, so various computational approaches have been proposed to predict lncRNA-disease associations. Current matrix factorization-based methods cannot capture the complex non-linear relationship between lncRNAs and diseases, and traditional machine learning-based methods are not sufficiently powerful to learn the representation of lncRNAs and diseases. Considering these limitations, we propose a deep matrix factorization model to predict lncRNA-disease associations (DMFLDA for short). DMFLDA uses a cascade of non-linear hidden layers to learn latent representations of lncRNAs and diseases. By using non-linear hidden layers, DMFLDA captures more complex non-linear relationships between lncRNAs and diseases than traditional matrix factorization-based methods. In addition, DMFLDA learns features directly from the lncRNA-disease interaction matrix and can thus learn more accurate representations of lncRNAs and diseases than traditional machine learning methods. The low-dimensional representations of the lncRNAs and diseases are fused to estimate the new interaction value. To evaluate the performance of DMFLDA, we perform leave-one-out cross-validation and 5-fold cross-validation on known experimentally verified lncRNA-disease associations. The experimental results show that DMFLDA performs better than the existing methods. The case studies show that many predicted interactions for colorectal cancer, prostate cancer, and renal cancer have been verified by recent biomedical literature. The source code and datasets can be obtained from https://github.com/CSUBioGroup/DMFLDA.
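The idea of passing lncRNA and disease profiles through non-linear hidden layers and fusing the resulting low-dimensional representations can be sketched as a forward pass. This is not the DMFLDA implementation (see the authors' repository for that): using matrix rows and columns as inputs, the layer sizes, the ReLU activation, concatenation as the fusion step, and the random untrained weights are all assumptions made for illustration.

```python
import numpy as np

def mlp(x, weights):
    """Apply a cascade of dense layers, each followed by a ReLU."""
    for W in weights:
        x = np.maximum(x @ W, 0.0)
    return x

def predict_pair(Y, i, j, lnc_weights, dis_weights, out_w):
    """Score the (lncRNA i, disease j) pair from the interaction matrix Y."""
    lnc_latent = mlp(Y[i, :], lnc_weights)   # row i: interaction profile of lncRNA i
    dis_latent = mlp(Y[:, j], dis_weights)   # column j: interaction profile of disease j
    fused = np.concatenate([lnc_latent, dis_latent])
    return 1.0 / (1.0 + np.exp(-(fused @ out_w)))  # sigmoid -> value in (0, 1)

rng = np.random.default_rng(0)
# Hypothetical interaction matrix: 4 lncRNAs x 3 diseases
Y = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 1, 0],
              [0, 0, 1]], dtype=float)
# Untrained weights, for shape illustration only
lnc_weights = [0.1 * rng.standard_normal((3, 8)), 0.1 * rng.standard_normal((8, 4))]
dis_weights = [0.1 * rng.standard_normal((4, 8)), 0.1 * rng.standard_normal((8, 4))]
out_w = 0.1 * rng.standard_normal(8)

score = predict_pair(Y, 0, 1, lnc_weights, dis_weights, out_w)
print(float(score))
```

In a trained model the weights would be fitted to the known associations; here they are random, so the printed score carries no biological meaning.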
|
15
|
Zhao Q, Zhao Z, Fan X, Yuan Z, Mao Q, Yao Y. Review of machine learning methods for RNA secondary structure prediction. PLoS Comput Biol 2021; 17:e1009291. [PMID: 34437528 PMCID: PMC8389396 DOI: 10.1371/journal.pcbi.1009291] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 12/05/2022] Open
Abstract
Secondary structure plays an important role in determining the function of noncoding RNAs. Hence, identifying RNA secondary structures is of great value to research. Computational prediction is a mainstream approach for predicting RNA secondary structure. Unfortunately, even though new methods have been proposed over the past 40 years, the performance of computational prediction methods has stagnated in the last decade. Recently, with the increasing availability of RNA structure data, new methods based on machine learning (ML) technologies, especially deep learning, have alleviated the issue. In this review, we provide a comprehensive overview of RNA secondary structure prediction methods based on ML technologies and a tabularized summary of the most important methods in this field. The current pending challenges in the field of RNA secondary structure prediction and future trends are also discussed.
Affiliation(s)
- Qi Zhao
- College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, Liaoning, China
| | - Zheng Zhao
- School of Information Science and Technology, Dalian Maritime University, Dalian, Liaoning, China
| | - Xiaoya Fan
- School of Software, Key Laboratory for Ubiquitous Network and Service Software of Liaoning Province, Dalian University of Technology, Dalian, Liaoning, China
| | - Zhengwei Yuan
- Key Laboratory of Health Ministry for Congenital Malformation, Shengjing Hospital of China Medical University, Shenyang, Liaoning, China
| | - Qian Mao
- College of Light Industry, Liaoning University, Shenyang, Liaoning, China
- Key Laboratory of Agroproducts Processing Technology, Changchun University, Changchun, Jilin, China
| | - Yudong Yao
- Department of Electrical and Computer Engineering, Stevens Institute of Technology, Hoboken, New Jersey, United States of America
|
16
|
Li A, Deng Y, Tan Y, Chen M. A novel miRNA-disease association prediction model using dual random walk with restart and space projection federated method. PLoS One 2021; 16:e0252971. [PMID: 34138933 PMCID: PMC8211179 DOI: 10.1371/journal.pone.0252971] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/30/2021] [Accepted: 05/26/2021] [Indexed: 12/27/2022] Open
Abstract
A large number of studies have shown that the variation and dysregulation of miRNAs are important causes of disease. Recognizing disease-related miRNAs has become an important topic in biological research. However, identifying disease-related miRNAs by biological experiments is expensive and time consuming, so computational models that predict disease-related miRNAs are needed. A novel network projection-based dual random walk with restart (NPRWR) was used to predict potential disease-related miRNAs. The NPRWR model uses dual random walk with restart to estimate miRNA-disease associations and network projection to predict them accurately. Leave-one-out cross validation (LOOCV) was adopted to evaluate the prediction performance of NPRWR. The area under the receiver operating characteristic curve (AUC) of NPRWR was 0.9029, which is superior to that of other advanced miRNA-disease association prediction methods. In addition, lung and kidney neoplasms were selected as case studies: of the top 50 predicted miRNAs, 50 and 49, respectively, have been confirmed by databases or the relevant literature. Moreover, NPRWR can be applied to isolated diseases and new miRNAs. Given the good results in LOOCV and the case studies, NPRWR is an effective and accurate miRNA-disease association prediction model.
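As background for the random walk with restart (RWR) component named in the abstract, a generic single-network RWR can be sketched as follows. This is a sketch, not the NPRWR model: the dual walk over two networks and the network projection step are omitted, and the function name, restart probability, and toy graph are assumptions.

```python
import numpy as np

def random_walk_with_restart(W, seed, restart=0.3, tol=1e-10, max_iter=1000):
    """Iterate p <- (1 - r) * P @ p + r * p0 until the L1 change is below tol.

    W is a non-negative adjacency (or similarity) matrix, seed marks the
    starting nodes, and P is W with each column normalised to sum to 1.
    """
    W = np.asarray(W, dtype=float)
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    P = W / col_sums
    p0 = np.asarray(seed, dtype=float)
    p0 = p0 / p0.sum()
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * P @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p

# Toy path graph 0 - 1 - 2 with the walk restarting at node 0
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
p = random_walk_with_restart(W, seed=[1, 0, 0], restart=0.3)
print(p)
```

The stationary vector ranks nodes by proximity to the seed set; node 0 scores higher than node 2 because the walker keeps returning to the seed.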
Affiliation(s)
- Ang Li
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Yingwei Deng
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
- Hainan Key Laboratory for Computational Science and Application, Haikou, China
| | - Yan Tan
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
| | - Min Chen
- Hunan Institute of Technology, School of Computer Science and Technology, Hengyang, China
|
17
|
Sun Y, Hou G. Analysis on the Spatial-Temporal Evolution Characteristics and Spatial Network Structure of Tourism Eco-Efficiency in the Yangtze River Delta Urban Agglomeration. International Journal of Environmental Research and Public Health 2021; 18:ijerph18052577. [PMID: 33806633 PMCID: PMC7967336 DOI: 10.3390/ijerph18052577] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/05/2021] [Revised: 02/28/2021] [Accepted: 03/02/2021] [Indexed: 11/18/2022]
Abstract
Based on panel data for 41 cities in the Yangtze River Delta from 2008 to 2017, this paper constructs an evaluation indicator system for urban tourism eco-efficiency. By measuring tourism eco-efficiency in the Yangtze River Delta urban agglomeration, we analyze its spatial-temporal evolution characteristics. Furthermore, a modified gravity model and social network analysis are introduced to explore the spatial network structure of tourism eco-efficiency and its evolution trend. The results show that: (1) The overall tourism eco-efficiency of the Yangtze River Delta region presents a fluctuating downward trend; Jiangsu and Zhejiang have high eco-efficiency, while Shanghai and Anhui are relatively low. The gap within the region first increased and then decreased. (2) During this decade, the spatial network structure of tourism eco-efficiency in the Yangtze River Delta has become increasingly loose. The weakening of network connection strength has, to a great extent, led to a decrease in regional tourism eco-efficiency. (3) The network centrality of cities such as Zhoushan, Huzhou, and Huangshan has remained high, and these cities have firmly occupied the core position of the network. (4) The spatial association network of tourism eco-efficiency can be divided into four blocks: “two-way spillover”, “net spillover”, “net benefit” and “agent”. The synergy and spillover effects between blocks are significant, and there is a spatial polarization trend centered on a few cities. On this basis, this paper puts forward optimization suggestions for the spatial network structure of the Yangtze River Delta urban agglomeration, with the aim of promoting the improvement of regional tourism eco-efficiency.
Affiliation(s)
- Yiyang Sun
- School of Geographic Science, Nanjing Normal University, Nanjing 210023, China;
- Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
| | - Guolin Hou
- School of Geographic Science, Nanjing Normal University, Nanjing 210023, China;
- Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing 210023, China
- Correspondence: Tel.: +86-25-8589-1347
|
18
|
Meng Y, Jin M, Tang X, Xu J. Drug repositioning based on similarity constrained probabilistic matrix factorization: COVID-19 as a case study. Appl Soft Comput 2021; 103:107135. [PMID: 33519322 PMCID: PMC7825831 DOI: 10.1016/j.asoc.2021.107135] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 09/14/2020] [Revised: 01/10/2021] [Accepted: 01/20/2021] [Indexed: 12/21/2022]
Abstract
The novel coronavirus disease 2019 (COVID-19) pandemic has caused a massive health crisis worldwide and upended the global economy. However, vaccines and traditional drug discovery for COVID-19 cost too much in terms of time, manpower, and money. Drug repurposing has become one of the promising treatment strategies amid the COVID-19 crisis. At present, there are no public databases of experimentally supported human drug-virus interactions, and most existing drug repurposing methods require rich information that is not always available, especially for a new virus. In this study, we first made a sizeable effort to collect drug-virus interaction entries from the literature and build the Human Drug Virus Database (HDVD). Second, we propose a new approach, called SCPMF (similarity constrained probabilistic matrix factorization), to identify new drug-virus interactions for drug repurposing. SCPMF is applied to the adjacency matrix of a heterogeneous drug-virus network, which integrates the known drug-virus interactions, drug chemical structures, and virus genomic sequences. SCPMF projects the drug-virus interaction matrix into two latent feature matrices, one for the drugs and one for the viruses, which reconstruct the drug-virus interaction matrix when multiplied together, and then introduces the weighted similarity interaction matrix as constraints for drugs and viruses. Benchmarking comparisons on two different datasets demonstrate that SCPMF has reliable prediction performance and outperforms several recent approaches. Moreover, the drug candidates that SCPMF predicts for COVID-19 also confirm its accuracy and reliability.
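The factorization step described in the abstract, two latent feature matrices whose product reconstructs the interaction matrix, can be sketched with plain regularised matrix factorization. This is a sketch under stated assumptions, not SCPMF itself: the similarity constraints are left out, and the function name, rank, learning rate, regularisation weight, and toy matrix are hypothetical. The L2 penalty plays the role that zero-mean Gaussian priors play in probabilistic matrix factorization.

```python
import numpy as np

def factorize(R, rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit R ~ U @ V.T by gradient descent on squared error + L2 penalty."""
    rng = np.random.default_rng(seed)
    R = np.asarray(R, dtype=float)
    U = 0.1 * rng.standard_normal((R.shape[0], rank))  # drug latent features
    V = 0.1 * rng.standard_normal((R.shape[1], rank))  # virus latent features
    for _ in range(epochs):
        E = R - U @ V.T                       # reconstruction error
        U_new = U + lr * (E @ V - reg * U)    # descend along the gradient in U
        V = V + lr * (E.T @ U - reg * V)      # descend along the gradient in V
        U = U_new
    return U, V

# Hypothetical drug x virus interaction matrix
R = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 0, 1]], dtype=float)
U, V = factorize(R)
print(np.round(U @ V.T, 2))
```

Entries of `U @ V.T` at positions where `R` is zero act as prediction scores for unobserved drug-virus pairs.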
Affiliation(s)
- Yajie Meng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Min Jin
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Xianfang Tang
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
| | - Junlin Xu
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China
|
19
|
Lei X, Mudiyanselage TB, Zhang Y, Bian C, Lan W, Yu N, Pan Y. A comprehensive survey on computational methods of non-coding RNA and disease association prediction. Brief Bioinform 2020; 22:6042241. [PMID: 33341893 DOI: 10.1093/bib/bbaa350] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 08/28/2020] [Revised: 10/20/2020] [Accepted: 11/01/2020] [Indexed: 02/06/2023] Open
Abstract
Studies on the relationships between non-coding RNAs and diseases have been widely carried out in recent years, and a large number of experimental methods and technologies for producing biological data have been developed. However, because of the high labor cost and long production time of experiments, computational methods, especially machine learning and deep learning methods, have received a lot of attention and are now commonly used to address these problems. From a computational point of view, this survey introduces three common non-coding RNAs, i.e. miRNAs, lncRNAs and circRNAs, and the related computational methods for predicting their association with diseases. First, the mainstream databases for these three non-coding RNAs are introduced in detail. Then, we present several methods for calculating RNA similarity and disease similarity. Next, we investigate ncRNA-disease prediction methods in detail and classify them into five types: network propagation, recommender systems, matrix completion, machine learning and deep learning. Furthermore, we summarize the applications of these five types of computational methods in predicting the associations between diseases and miRNAs, lncRNAs and circRNAs, respectively. Finally, the advantages and limitations of the various methods are identified, and future research directions and challenges are discussed.
Affiliation(s)
- Xiujuan Lei
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | | | - Yuchen Zhang
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Chen Bian
- School of Computer Science, Shaanxi Normal University, Xi'an, China
| | - Wei Lan
- School of Computer, Electronics and Information at Guangxi University, Nanning, China
| | - Ning Yu
- Department of Computing Sciences at the College at Brockport, State University of New York, Rochester, NY, USA
| | - Yi Pan
- Computer Science Department at Georgia State University, Atlanta, GA, USA
|
20
|
Sun P, Yang S, Cao Y, Cheng R, Han S. Prediction of Potential Associations Between miRNAs and Diseases Based on Matrix Decomposition. Front Genet 2020; 11:598185. [PMID: 33304393 PMCID: PMC7701300 DOI: 10.3389/fgene.2020.598185] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 08/24/2020] [Accepted: 10/22/2020] [Indexed: 01/06/2023] Open
Abstract
It is known that miRNAs play an increasingly important role in many physiological processes. Disease-related miRNAs could be potential biomarkers for clinical diagnosis, prognosis, and treatment. Therefore, accurately inferring potential disease-related miRNAs has recently become a hot topic in the bioinformatics community. In this study, we propose a mathematical model based on matrix decomposition, named MFMDA, to identify potential miRNA-disease associations by integrating known miRNA- and disease-related data with the similarities between miRNAs and between diseases. We also compared MFMDA with some of the latest algorithms on several established miRNA-disease databases. MFMDA reached an AUC of 0.9061 in fivefold cross-validation. The experimental results show that MFMDA effectively infers novel miRNA-disease associations. In addition, we conducted case studies by applying MFMDA to three types of high-risk human cancers. While most predicted miRNAs are confirmed by external databases of experimental literature, we also identified a few novel disease-related miRNAs for further experimental validation.
Affiliation(s)
- Pengcheng Sun
- Department of Obstetrics and Gynecology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Shuyan Yang
- Department of Obstetrics and Gynecology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Ye Cao
- Department of Obstetrics and Gynecology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Rongjie Cheng
- Department of Obstetrics and Gynecology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
| | - Shiyu Han
- Department of Obstetrics and Gynecology, The Fourth Affiliated Hospital of Harbin Medical University, Harbin, China
|
21
|
Wang C, Sun K, Wang J, Guo M. Data fusion-based algorithm for predicting miRNA–Disease associations. Comput Biol Chem 2020; 88:107357. [DOI: 10.1016/j.compbiolchem.2020.107357] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/02/2020] [Revised: 07/24/2020] [Accepted: 08/05/2020] [Indexed: 11/30/2022]
|
22
|
Ji BY, You ZH, Chen ZH, Wong L, Yi HC. NEMPD: a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. BMC Bioinformatics 2020; 21:401. [PMID: 32912137 PMCID: PMC7646193 DOI: 10.1186/s12859-020-03716-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/10/2020] [Accepted: 08/19/2020] [Indexed: 12/25/2022] Open
Abstract
Background: As an important non-coding RNA, microRNA (miRNA) plays a significant role in a series of life processes and is closely associated with a variety of human diseases. Hence, identifying potential miRNA-disease associations can contribute greatly to the research and treatment of human diseases. However, to our knowledge, many existing computational methods use only a single type of known association information between miRNAs and diseases to predict potential associations, without considering their interactions or associations with other types of molecules. Results: In this paper, we propose a network embedding-based method for predicting miRNA-disease associations by preserving behavior and attribute information. First, a heterogeneous network is constructed by integrating known associations among miRNAs, proteins and diseases, and the network representation method Learning Graph Representations with Global Structural Information (GraRep) is used to learn the behavior information of miRNAs and diseases in the network. Then, the behavior information of miRNAs and diseases is combined with their attribute information to represent miRNA-disease association pairs. Finally, the prediction model is built with the Random Forest algorithm. Under five-fold cross validation, the proposed NEMPD model obtained an average prediction accuracy of 85.41% and a sensitivity of 80.96%, with an AUC of 91.58%. Furthermore, the performance of NEMPD is also validated by case studies. Among the top 50 predicted disease-related miRNAs, 48 (breast neoplasms), 47 (colon neoplasms) and 47 (lung neoplasms) were confirmed by two other databases. Conclusions: The proposed NEMPD model performs well in predicting potential associations between miRNAs and diseases and has great potential for future miRNA-disease association prediction.
Affiliation(s)
- Bo-Ya Ji
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhu-Hong You
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Zhan-Heng Chen
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Leon Wong
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Hai-Cheng Yi
- Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
|
23
|
Zhuang H, Zhang Y, Yang S, Cheng L, Liu SL. A Mendelian Randomization Study on Infant Length and Type 2 Diabetes Mellitus Risk. Curr Gene Ther 2020; 19:224-231. [PMID: 31553296 DOI: 10.2174/1566523219666190925115535] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/30/2019] [Revised: 06/15/2019] [Accepted: 06/16/2019] [Indexed: 12/12/2022]
Abstract
OBJECTIVE: Infant length (IL) is a phenotype positively associated with type 2 diabetes mellitus (T2DM), but whether the relationship is causal is still unclear. Here, we applied a Mendelian randomization (MR) study to explore the causal relationship between IL and T2DM, which has the potential to provide guidance for assessing T2DM activity and for T2DM prevention in young at-risk populations. MATERIALS AND METHODS: A two-sample MR design, which uses genetic instrumental variables (IVs) to explore causal effects, was applied to test the influence of IL on the risk of T2DM. MR was carried out on GWAS data using 8 independent IL SNPs as IVs. The pooled odds ratio (OR) of these SNPs was calculated by the inverse-variance weighted (IVW) method to assess the risk of T2DM conferred by shorter IL. Sensitivity validation was conducted to identify the effect of individual SNPs. MR-Egger regression was used to detect pleiotropic bias of the IVs. RESULTS: The pooled odds ratio from the IVW method was 1.03 (95% CI 0.89-1.18, P = 0.0785), the intercept was low (-0.477, P = 0.252), and the ORs fluctuated only slightly in leave-one-out validation, from -0.062 ((0.966 - 1.03) / 1.03) to 0.05 ((1.081 - 1.03) / 1.03). CONCLUSION: We found that shorter IL confers no additional risk of T2DM. The sensitivity analysis and the MR-Egger regression analysis also provided adequate evidence that this result was not due to heterogeneity or pleiotropic effects of the IVs.
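The inverse-variance weighted pooling mentioned in the abstract can be sketched with the standard fixed-effect IVW formula, in which each SNP contributes a ratio estimate (SNP-outcome effect divided by SNP-exposure effect) weighted by its precision. The function name and the numbers below are hypothetical illustrations, not the study's data, and the study's exact IVW variant is not stated in the abstract.

```python
import numpy as np

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted MR estimate.

    Each SNP gives a ratio estimate by/bx with approximate variance
    se**2 / bx**2, so its weight is bx**2 / se**2.
    Returns the pooled causal effect (log-odds scale for a binary outcome)
    and its standard error.
    """
    bx = np.asarray(beta_exposure, dtype=float)
    by = np.asarray(beta_outcome, dtype=float)
    se = np.asarray(se_outcome, dtype=float)
    weights = bx**2 / se**2
    ratios = by / bx
    beta = np.sum(weights * ratios) / np.sum(weights)
    se_beta = np.sqrt(1.0 / np.sum(weights))
    return beta, se_beta

# Hypothetical summary statistics for three SNPs
bx = np.array([0.10, 0.20, 0.30])      # SNP -> exposure effects
by = np.array([0.004, 0.005, 0.010])   # SNP -> outcome effects (log-odds)
se = np.array([0.05, 0.05, 0.05])      # standard errors of by
beta, se_beta = ivw_estimate(bx, by, se)
odds_ratio = np.exp(beta)
ci = (np.exp(beta - 1.96 * se_beta), np.exp(beta + 1.96 * se_beta))
print(odds_ratio, ci)
```

A pooled odds ratio whose 95% confidence interval spans 1, as in the reported result of 1.03 (0.89-1.18), gives no evidence of a causal effect.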
Affiliation(s)
- He Zhuang
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, China
- HMU-UCFM Centre for Infection and Genomics, Harbin Medical University, Harbin, China
| | - Ying Zhang
- Department of Pharmacy, Heilongjiang Province Land Reclamation Headquarters General Hospital, 150001, Harbin, China
| | - Shuo Yang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Shu-Lin Liu
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, China
- HMU-UCFM Centre for Infection and Genomics, Harbin Medical University, Harbin, China
- Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, Canada
- Department of Infectious Diseases, The First Affiliated Hospital, Harbin Medical University, Harbin, China
- Translational Medicine Research and Cooperation Center of Northern China, Heilongjiang Academy of Medical Sciences, Harbin, China
|
24
|
Cheng L, Qi C, Zhuang H, Fu T, Zhang X. gutMDisorder: a comprehensive database for dysbiosis of the gut microbiota in disorders and interventions. Nucleic Acids Res 2020; 48:D554-D560. [PMID: 31584099 PMCID: PMC6943049 DOI: 10.1093/nar/gkz843] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Received: 08/03/2019] [Revised: 09/18/2019] [Accepted: 10/01/2019] [Indexed: 12/11/2022] Open
Abstract
gutMDisorder (http://bio-annotation.cn/gutMDisorder) is a manually curated database that aims to provide a comprehensive resource on dysbiosis of the gut microbiota in disorders and interventions. Alterations in the composition of the gut microbial community play crucial roles in the development of chronic disorders, and the beneficial effects of drugs, foods and other intervention measures on disorders could be microbially mediated. The current version of gutMDisorder documents 2263 curated associations between 579 gut microbes and 123 disorders or 77 intervention measures in human, and 930 curated associations between 273 gut microbes and 33 disorders or 151 intervention measures in mouse. Each entry in gutMDisorder contains detailed information on an association, including an intestinal microbe, a disorder name, intervention measures, the experimental technology and platform, characteristics of the samples, websites for downloading the sequencing data, a brief description of the association, and a literature reference. gutMDisorder provides a user-friendly interface to browse and retrieve entries by gut microbe, disorder, or intervention measure. It also offers pages for downloading all the entries and submitting new experimentally validated associations.
Affiliation(s)
- Liang Cheng
- NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, Heilongjiang, China, 150028
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China, 150081
| | - Changlu Qi
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China, 150081
| | - He Zhuang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China, 150081
| | - Tongze Fu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang, China, 150081
| | - Xue Zhang
- NHC and CAMS Key Laboratory of Molecular Probe and Targeted Theranostics, Harbin Medical University, Harbin, Heilongjiang, China, 150028
- McKusick-Zhang Center for Genetic Medicine, Peking Union Medical College, Beijing, China, 100005
|
25
|
Ni P, Wang J, Zhong P, Li Y, Wu FX, Pan Y. Constructing Disease Similarity Networks Based on Disease Module Theory. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:906-915. [PMID: 29993782 DOI: 10.1109/tcbb.2018.2817624] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 06/08/2023]
Abstract
Quantifying the associations between diseases now plays an important role in modern biology and medicine. Discovering associations between diseases could help us gain deeper insights into the pathogenic mechanisms of complex diseases and could thus lead to improvements in disease diagnosis, drug repositioning, and drug development. Owing to the growing body of high-throughput biological data, a number of methods for computing similarity between diseases have been developed during the past decade. However, these methods rarely consider the interconnections of the genes related to each disease in the protein-protein interaction network (PPIN). Recently, the disease module theory has been proposed, which states that disease-related genes or proteins tend to interact with each other in the same neighborhood of a PPIN. In this study, we propose a new method called ModuleSim to measure associations between diseases by using disease-gene association data and PPIN data based on disease module theory. The experimental results show that, by considering the interactions between disease modules and their modularity, the disease similarity calculated by ModuleSim correlates significantly with the disease classification of Disease Ontology (DO). Furthermore, ModuleSim outperforms four other popular methods that all use disease-gene association data and PPIN data to measure disease-disease associations. In addition, the disease similarity network constructed by ModuleSim suggests that ModuleSim is capable of finding potential associations between diseases.
|
26
|
Zhu X, Wang X, Zhao H, Pei T, Kuang L, Wang L. BHCMDA: A New Biased Heat Conduction Based Method for Potential MiRNA-Disease Association Prediction. Front Genet 2020; 11:384. [PMID: 32425979 PMCID: PMC7212362 DOI: 10.3389/fgene.2020.00384] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/11/2020] [Accepted: 03/27/2020] [Indexed: 01/04/2023] Open
Abstract
Recent studies have indicated that microRNAs (miRNAs) are closely related to various complex human diseases. Based on the assumption that functionally similar miRNAs are more likely to be associated with phenotypically similar diseases, researchers have proposed a variety of computational models that integrate known miRNA-disease associations, disease semantic similarity, miRNA functional similarity, and Gaussian interaction profile kernel similarity to discover potential miRNA-disease relationships. Taking into account the limitations of previous computational models, this paper proposes a new computational model based on biased heat conduction for miRNA-disease association prediction (BHCMDA), which achieves an AUC of 0.8890 in leave-one-out cross validation (LOOCV) and mean AUCs of 0.9060 and 0.8931 under twofold and fivefold cross validation, respectively. In addition, BHCMDA was applied to case studies of three important human cancers, and the results showed that 88% (Esophageal Neoplasms), 92% (Colonic Neoplasms) and 92% (Lymphoma) of the top 50 predicted miRNAs have been confirmed by the experimental literature, which further demonstrates the good performance of BHCMDA. BHCMDA would therefore be a useful computational resource for potential miRNA-disease association prediction.
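AUC values such as those quoted above summarise how well predicted scores rank known associations above unknown ones. A minimal sketch of the computation, using the equivalence between AUC and the probability that a randomly chosen positive outranks a randomly chosen negative, is given below; the function name and toy data are assumptions, and this is not the evaluation code of any of the cited papers.

```python
import numpy as np

def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    diff = pos[:, None] - neg[None, :]        # every positive vs every negative
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return wins / (pos.size * neg.size)

# Toy example: two held-out known associations (label 1) and two unknown pairs
scores = [0.9, 0.8, 0.3, 0.1]
labels = [1, 0, 1, 0]
print(auc_from_scores(scores, labels))  # 0.75
```

In leave-one-out cross validation, each known association is held out in turn, scored by a model trained on the rest, and ranked against the candidate pairs; the held-out scores are then pooled into a computation of this kind.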
Affiliation(s)
- Xianyou Zhu
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China
| | - Xuzai Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Haochen Zhao
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Tingrui Pei
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Linai Kuang
- College of Computer Science and Technology, Hengyang Normal University, Hengyang, China.,Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China
| | - Lei Wang
- Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan, China; College of Computer Engineering & Applied Mathematics, Changsha University, Changsha, China
|
27
|
Gao Z, Wang YT, Wu QW, Ni JC, Zheng CH. Graph regularized L2,1-nonnegative matrix factorization for miRNA-disease association prediction. BMC Bioinformatics 2020; 21:61. [PMID: 32070280 PMCID: PMC7029547 DOI: 10.1186/s12859-020-3409-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/28/2019] [Accepted: 02/11/2020] [Indexed: 01/24/2023]
Abstract
BACKGROUND The aberrant expression of microRNAs is closely connected to the occurrence and development of a great many human diseases. To study human diseases, researchers have presented numerous effective computational models. RESULTS Here, we present a computational framework based on graph Laplacian regularized L2,1-nonnegative matrix factorization (GRL2,1-NMF) for inferring possible human disease-connected miRNAs. First, manually validated disease-connected microRNAs were integrated, and microRNA functional similarity information along with two kinds of disease semantic similarities were calculated. Next, we measured Gaussian interaction profile (GIP) kernel similarities for both diseases and microRNAs. Then, we adopted a preprocessing step, namely, weighted K nearest known neighbours (WKNKN), to decrease the sparsity of the miRNA-disease association matrix. Finally, the GRL2,1-NMF framework was used to predict links between microRNAs and diseases. CONCLUSIONS The new method (GRL2,1-NMF) achieved AUC values of 0.9280 and 0.9276 in global leave-one-out cross validation (global LOOCV) and five-fold cross validation (5-CV), respectively, showing that GRL2,1-NMF can powerfully discover potential disease-related miRNAs, even if there is no known associated disease.
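The Gaussian interaction profile (GIP) kernel similarity mentioned in the abstract above is commonly defined as exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is row i of the binary association matrix and gamma is a bandwidth divided by the mean squared norm of the profiles. The sketch below follows that common definition; the abstract does not state the exact normalization the authors used, so the `gamma_prime` default and the toy matrix are assumptions.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`."""
    sq_norms = [sum(x * x for x in row) for row in profiles]
    mean_sq_norm = sum(sq_norms) / len(sq_norms)
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    # Bandwidth normalised by the average squared profile norm (assumed convention).
    gamma = gamma_prime / mean_sq_norm
    return [
        [math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b))) for b in profiles]
        for a in profiles
    ]


def transpose(matrix):
    """Swap rows and columns so that disease profiles become rows."""
    return [list(column) for column in zip(*matrix)]


# Hypothetical association matrix: rows are miRNAs, columns are diseases.
association = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
mirna_similarity = gip_kernel(association)
disease_similarity = gip_kernel(transpose(association))
```

Each entry lies in (0, 1], with 1 on the diagonal; miRNAs that share more known diseases receive higher similarity.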
Affiliation(s)
- Zhen Gao
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Yu-Tian Wang
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Qing-Wen Wu
- School of Software, Qufu Normal University, Qufu, 273165, China
| | - Jian-Cheng Ni
- School of Software, Qufu Normal University, Qufu, 273165, China.
| | - Chun-Hou Zheng
- School of Software, Qufu Normal University, Qufu, 273165, China.
|
28
|
Huang Q, Zhang J, Wei L, Guo F, Zou Q. 6mA-RicePred: A Method for Identifying DNA N6-Methyladenine Sites in the Rice Genome Based on Feature Fusion. FRONTIERS IN PLANT SCIENCE 2020; 11:4. [PMID: 32076430 PMCID: PMC7006724 DOI: 10.3389/fpls.2020.00004] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 12/04/2019] [Accepted: 01/06/2020] [Indexed: 06/01/2023]
Abstract
MOTIVATION The biological function of N6-methyladenine DNA (6mA) in plants is largely unknown. Rice is one of the most important crops worldwide and is a model species for molecular and genetic studies. There are few methods for 6mA site recognition in the rice genome, and an effective computational method is needed. RESULTS In this paper, we propose a new computational method called 6mA-Pred to identify 6mA sites in the rice genome. 6mA-Pred employs a feature fusion method to combine advantageous features from other methods and thus obtain a new feature to identify 6mA sites. This method achieved an accuracy of 87.27% in the identification of 6mA sites with 10-fold cross-validation and achieved an accuracy of 85.6% in independent test sets.
Affiliation(s)
- Qianfei Huang
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Jun Zhang
- Rehabilitation Department, Heilongjiang Province Land Reclamation Headquarters General Hospital, Harbin, China
| | - Leyi Wei
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China
|
29
|
Abstract
BACKGROUND Collections of disease-associated data contribute to the study of associations between diseases. Discovering closely related diseases plays a crucial role in revealing their common pathogenic mechanisms. This may further suggest treatments that can be repurposed from one disease to another. During the past decades, a number of approaches for calculating disease similarity have been developed. However, most of them are designed to take advantage of a single data source or only a few, which results in low accuracy. METHODS In this paper, we propose a novel method, called MultiSourcDSim, to calculate disease similarity by integrating multiple data sources, namely, gene-disease associations, GO biological process-disease associations and symptom-disease associations. Firstly, we establish three disease similarity networks according to the three disease-related data sources respectively. Secondly, the representation of each node is obtained by integrating the three small disease similarity networks. Finally, the learned representations are applied to calculate the similarity between diseases. RESULTS Our approach shows the best performance compared to the other three popular methods. Besides, the similarity network built by MultiSourcDSim suggests that our method can also uncover the latent relationships between diseases. CONCLUSIONS MultiSourcDSim is an efficient approach to predict similarity between diseases.
Affiliation(s)
- Lei Deng
- School of Computer Science and Engineering, Central South University, Changsha, 410075 China
| | - Danyi Ye
- School of Computer Science and Engineering, Central South University, Changsha, 410075 China
| | - Junmin Zhao
- School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
| | - Jingpu Zhang
- School of Computer and Data Science, Henan University of Urban Construction, Pingdingshan, 467000 China
|
30
|
Osone T, Yoshida N. The Relationship Between the miRNA Sequence and Disease May be Revealed by Focusing on Hydrogen Bonding Sites in RNA-RNA Interactions. Cells 2019; 8:cells8121615. [PMID: 31835885 PMCID: PMC6952923 DOI: 10.3390/cells8121615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2019] [Revised: 11/26/2019] [Accepted: 12/09/2019] [Indexed: 11/18/2022]
Abstract
MicroRNAs are important genes in biological processes. Although the function of microRNAs has been elucidated, the relationship between their sequence and disease is not sufficiently clear. Clarifying this relationship is important because it would make it possible to clarify the meaning of the microRNA genetic code, which consists of four nucleobases. Since seed theory is based on sequences, its development can be expected to reveal the meaning of microRNA sequences. However, this method has many false positives and false negatives. On the other hand, disease-related microRNA searches using network analysis are not based on sequences, so it is difficult for them to clarify the relationship between sequences and diseases. Therefore, this study focused on RNA–RNA interactions caused by hydrogen bonding. By calculating the electric field in a microRNA modeled as a torus, it was shown that sequences and diseases are highly correlated. It was also suggested that four diseases with different major classifications can be distinguished. Conventionally, RNA has been interpreted as a one-dimensional array of four nucleobases, but the new approach to RNA in this study can be expected to provide a new perspective on RNA-RNA interactions.
Affiliation(s)
- Tatsunori Osone
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Yokohama 226-8502, Japan
- Correspondence: Tel.: +81-50-3568-0281
| | - Naohiro Yoshida
- School of Materials and Chemical Technology, Tokyo Institute of Technology, Yokohama 226-8502, Japan
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo 152-8550, Japan
|
31
|
Su L, Liu G, Wang J, Xu D. A rectified factor network based biclustering method for detecting cancer-related coding genes and miRNAs, and their interactions. Methods 2019; 166:22-30. [PMID: 31121299 PMCID: PMC6708461 DOI: 10.1016/j.ymeth.2019.05.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/16/2018] [Revised: 04/14/2019] [Accepted: 05/13/2019] [Indexed: 12/12/2022]
Abstract
Detecting cancer-related genes and their interactions is a crucial task in cancer research. For this purpose, we propose an efficient method to detect coding genes, microRNAs (miRNAs), and their interactions related to a particular cancer or cancer subtype using their expression data from the same set of samples. Firstly, biclusters specific to a particular type of cancer are detected based on rectified factor networks and ranked according to their associations with general cancers. Secondly, coding genes and miRNAs in each bicluster are prioritized by considering their differential expression and differential correlation values, protein-protein interaction data, and potential cancer markers. Finally, a rank fusion process is used to obtain the final comprehensive rank by combining multiple ranking results. We applied our proposed method to breast cancer datasets. Results show that our method outperforms other methods in detecting breast cancer-related coding genes and miRNAs. Furthermore, our method is computationally efficient: it can handle tens of thousands of genes/miRNAs and hundreds of patients in hours on a desktop. This work may aid researchers in studying the genetic architecture of complex diseases and in improving the accuracy of diagnosis.
Affiliation(s)
- Lingtao Su
- Department of Computer Science and Technology, Jilin University, Changchun 130012, China; Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Guixia Liu
- Department of Computer Science and Technology, Jilin University, Changchun 130012, China
| | - Juexin Wang
- Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
| | - Dong Xu
- Department of Electrical Engineering & Computer Science and Christopher S. Bond Life Sciences Center, University of Missouri, Columbia, MO 65211, USA.
|
32
|
Liu B, Han L, Liu X, Wu J, Ma Q. Computational Prediction of Sigma-54 Promoters in Bacterial Genomes by Integrating Motif Finding and Machine Learning Strategies. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1211-1218. [PMID: 29993815 DOI: 10.1109/tcbb.2018.2816032] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 06/08/2023]
Abstract
The sigma factor, as a unit of the RNA polymerase holoenzyme, is a critical factor in the process of gene transcriptional regulation. It recognizes specific DNA sites and brings the core enzyme of RNA polymerase to the upstream regions of target genes. Therefore, the prediction of the promoters for a particular sigma factor is essential for interpreting functional genomic data and observations. This paper develops a new method to predict sigma-54 promoters in bacterial genomes. The new method integrates motif finding and machine learning strategies to capture the intrinsic features of sigma-54 promoters. Experiments on an E. coli benchmark test set show that our method is able to distinguish sigma-54 promoters from surrounding or randomly selected DNA sequences. Applications to three other bacterial genomes indicate the potential robustness and applicability of our method to a large number of bacterial genomes. The source code of our method can be freely downloaded at https://github.com/maqin2001/PromotePredictor.
|
33
|
Pan Z, Zhang H, Liang C, Li G, Xiao Q, Ding P, Luo J. Self-Weighted Multi-Kernel Multi-Label Learning for Potential miRNA-Disease Association Prediction. MOLECULAR THERAPY-NUCLEIC ACIDS 2019; 17:414-423. [PMID: 31319245 PMCID: PMC6637211 DOI: 10.1016/j.omtn.2019.06.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/28/2019] [Revised: 05/22/2019] [Accepted: 06/12/2019] [Indexed: 11/23/2022]
Abstract
Researchers have realized that microRNAs (miRNAs) play significant roles in the pathogenesis of various diseases. Although many computational models have been proposed to predict the associations between miRNAs and diseases, prediction performance could still be improved. In this paper, we propose a novel self-weighted, multi-kernel, multi-label learning (SwMKML) method to predict disease-related miRNAs. SwMKML adaptively learns two optimal kernel matrices for both miRNAs and diseases from multiple kernels constructed from known miRNA-disease associations. Moreover, the miRNA-disease associations predicted from both spaces are updated simultaneously based on a multi-label framework. Compared with four state-of-the-art computational models, SwMKML achieved the best results of 95.5%, 93.1%, and 84.1% in global leave-one-out cross-validation, 5-fold cross-validation, and overall prediction accuracy, respectively. A case study conducted on head and neck neoplasms further identified two potential prognostic biomarkers, hsa-mir-125b-1 and hsa-mir-125b-2, for the disease. SwMKML is freely available on GitHub, and we anticipate that it may become an effective tool for potential miRNA-disease association prediction.
Affiliation(s)
- Zhenxia Pan
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China
| | - Huaxiang Zhang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China.
| | - Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan 250358, China.
| | - Guanghui Li
- School of Information Engineering, East China Jiaotong University, Nanchang 330013, China
| | - Qiu Xiao
- College of Information Science and Engineering, Hunan Normal University, Changsha 410006, China
| | - Pingjian Ding
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China
|
34
|
Chen M, Zhang Y, Li A, Li Z, Liu W, Chen Z. Bipartite Heterogeneous Network Method Based on Co-neighbor for MiRNA-Disease Association Prediction. Front Genet 2019; 10:385. [PMID: 31080459 PMCID: PMC6497741 DOI: 10.3389/fgene.2019.00385] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/28/2018] [Accepted: 04/10/2019] [Indexed: 12/22/2022]
Abstract
In recent years, miRNA variation and dysregulation have been found to be closely related to human tumors. Identifying miRNA-disease associations helps in understanding the mechanisms of disease or tumor development and is highly significant for the prognosis, diagnosis, and treatment of human diseases. This article proposes a Bipartite Heterogeneous network link prediction method based on co-neighbor to predict miRNA-disease associations (BHCN). According to the structural characteristics of the bipartite network, the concept of bipartite network co-neighbors is proposed, and the co-neighbors are used to represent the probability of association between a disease and a miRNA. To make predictions for isolated diseases and new miRNAs based on the association probability expressed by co-neighbors, we utilized the similarity between disease nodes and the similarity between miRNA nodes in heterogeneous networks to represent the association probability between disease and miRNA. The model's predictive performance was evaluated by leave-one-out cross validation (LOOCV) on different datasets. The AUC value of BHCN on the gold benchmark dataset was 0.7973, and the AUC obtained on the prediction dataset was 0.9349, which was better than that of the classic global algorithm. In the case study, we conducted predictive studies on breast neoplasms and colon neoplasms. Most of the top 50 predicted results were confirmed by three databases, namely, HMDD, miR2disease, and dbDEMC, with accuracy rates of 96% and 82%, respectively. In addition, BHCN can be used for predicting isolated diseases (without any known associated miRNAs) and new miRNAs (without any known associated diseases). In the isolated disease case study, the top 50 miRNAs predicted for breast neoplasms and colon neoplasms reached accuracies of 100% and 96%, respectively, thereby demonstrating the favorable predictive power of BHCN for potentially relevant miRNAs.
Affiliation(s)
- Min Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Yi Zhang
- School of Information Science and Engineering, Guilin University of Technology, Guilin, China
| | - Ang Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zejun Li
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Wenhua Liu
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
| | - Zheng Chen
- School of Computer Science and Technology, Hunan Institute of Technology, Hengyang, China
|
35
|
Zeng X, Wang W, Deng G, Bing J, Zou Q. Prediction of Potential Disease-Associated MicroRNAs by Using Neural Networks. MOLECULAR THERAPY. NUCLEIC ACIDS 2019; 16:566-575. [PMID: 31077936 PMCID: PMC6510966 DOI: 10.1016/j.omtn.2019.04.010] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 01/09/2019] [Revised: 04/11/2019] [Accepted: 04/11/2019] [Indexed: 12/13/2022]
Abstract
Identifying disease-related microRNAs (miRNAs) is an essential but challenging task in bioinformatics research. Much effort has been devoted to discovering the underlying associations between miRNAs and diseases. However, most studies mainly focus on designing advanced methods to improve prediction accuracy while neglecting to investigate the link predictability of the relationships between miRNAs and diseases. In this work, we construct a heterogeneous network by integrating neighborhood information in the neural network to predict potential associations between miRNAs and diseases, while also accounting for dataset imbalance. We also employ a new computational method called a neural network model for miRNA-disease association prediction (NNMDA). This model predicts miRNA-disease associations by integrating multiple biological data resources. Comparison of our work with other algorithms reveals the reliable performance of NNMDA. Its average AUC was 0.937 over 15 diseases in 5-fold cross-validation, and its AUC was 0.8439 in leave-one-out cross-validation. The results indicate that NNMDA could be used in evaluating the accuracy of miRNA-disease associations. Moreover, NNMDA was applied to two common human diseases in two types of case studies. In the first type, 26 out of the top 30 predicted miRNAs of lung neoplasms were confirmed by experiments. In the second type of case study, for new diseases without any known related miRNAs, we selected breast neoplasms as the test example by hiding the association information between the miRNAs and this disease. The results verified 50 out of the top 50 predicted breast-neoplasm-related miRNAs.
Affiliation(s)
- Xiangxiang Zeng
- Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, Guangdong, China; Department of Information Science and Technology, Xiamen University, Xiamen 361005, Fujian, China
| | - Wen Wang
- Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, Guangdong, China
| | - Gaoshan Deng
- Department of Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Jiaxin Bing
- Shenzhen Research Institute of Xiamen University, Xiamen University, Shenzhen 518000, Guangdong, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610000, China; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610000, China.
|
36
|
Zhuang H, Han J, Cheng L, Liu SL. A Positive Causal Influence of IL-18 Levels on the Risk of T2DM: A Mendelian Randomization Study. Front Genet 2019; 10:295. [PMID: 31024619 PMCID: PMC6459887 DOI: 10.3389/fgene.2019.00295] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 01/10/2019] [Accepted: 03/19/2019] [Indexed: 12/21/2022]
Abstract
A large number of clinical studies have shown that interleukin-18 (IL-18) plasma levels are positively correlated with the pathogenesis and development of type 2 diabetes mellitus (T2DM), but it remains unclear whether IL-18 causes T2DM, primarily due to the influence of reverse causality and residual confounding factors. Genome-wide association studies have led to the discovery of numerous common variants associated with IL-18 and T2DM and opened unprecedented opportunities for investigating possible associations between genetic traits and diseases. In this study, we employed a two-sample Mendelian randomization (MR) method to analyze the causal relationships between IL-18 plasma levels and T2DM using IL18-related SNPs as genetic instrumental variables (IVs). We first selected eight SNPs that were significantly associated with IL-18 but independent of T2DM. We then used these SNPs as IVs to evaluate their effects on T2DM using the inverse-variance weighted (IVW) method. Finally, we conducted sensitivity analysis and MR-Egger regression analysis to evaluate the heterogeneity and pleiotropic effects of each variant. The results based on the IVW method demonstrate that high IL-18 plasma levels significantly increase the risk of T2DM, and no heterogeneity or pleiotropic effects appeared after the sensitivity and MR-Egger analyses.
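The inverse-variance weighted (IVW) step described in the abstract above can be sketched as a weighted mean of per-SNP Wald ratios (outcome effect divided by exposure effect), with weights equal to the squared exposure effect over the squared outcome standard error. This is the standard fixed-effect IVW form and may differ in detail from the authors' analysis; the function name `ivw_estimate` and all numbers below are hypothetical, not values from the study.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and its standard error."""
    # Weight of each SNP: (exposure effect / outcome standard error) squared.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    # Wald ratio of each SNP: outcome effect per unit of exposure effect.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical summary statistics for three instrumental SNPs.
snp_exposure_effects = [0.10, 0.20, 0.15]  # SNP effect on the exposure (e.g. IL-18 level)
snp_outcome_effects = [0.02, 0.04, 0.03]   # SNP effect on the outcome (e.g. T2DM log-odds)
snp_outcome_se = [0.01, 0.02, 0.01]        # standard errors of the outcome effects

causal_effect, causal_se = ivw_estimate(
    snp_exposure_effects, snp_outcome_effects, snp_outcome_se
)
```

A positive estimate whose confidence interval excludes zero would be read as evidence that higher exposure raises risk, subject to the sensitivity and MR-Egger checks the abstract mentions.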
Affiliation(s)
- He Zhuang
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, China
| | - Junwei Han
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Liang Cheng
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, China
| | - Shu-Lin Liu
- Systemomics Center, College of Pharmacy, and Genomics Research Center (State-Province Key Laboratories of Biomedicine-Pharmaceutics of China), Harbin Medical University, Harbin, China; Department of Microbiology, Immunology and Infectious Diseases, University of Calgary, Calgary, AB, Canada
|
37
|
FCMDAP: using miRNA family and cluster information to improve the prediction accuracy of disease related miRNAs. BMC SYSTEMS BIOLOGY 2019; 13:26. [PMID: 30953512 PMCID: PMC6449885 DOI: 10.1186/s12918-019-0696-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 12/16/2022]
Abstract
Background Biological experiments have confirmed the association between miRNAs and various diseases. However, such experiments are costly and time consuming. Computational methods help select potential disease-related miRNAs to improve the efficiency of biological experiments. Methods In this work, we develop a novel method that uses multiple types of data to calculate miRNA and disease similarity based on mutual information, and adds miRNA family and cluster information to predict human disease-related miRNAs (FCMDAP). This method not only depends on known miRNA-disease associations but also accurately measures miRNA and disease similarity and resolves the problem of overestimation. FCMDAP uses the k most similar neighbor recommendation algorithm to predict the association score between miRNA and disease. Information about miRNA clusters is also used to improve prediction accuracy. Result FCMDAP achieves an average AUC of 0.9165 based on leave-one-out cross validation. In case studies on colorectal, lung, and pancreatic neoplasms, 100%, 98% and 96% of the top 50 predicted miRNAs, respectively, were confirmed. FCMDAP also exhibits satisfactory performance in predicting diseases without any related miRNAs and miRNAs without any related diseases. Conclusions In this study, we present a computational method, FCMDAP, to improve the prediction accuracy of disease-related miRNAs. FCMDAP could be an effective tool for further biological experiments.
|
38
|
Liang C, Yu S, Luo J. Adaptive multi-view multi-label learning for identifying disease-associated candidate miRNAs. PLoS Comput Biol 2019; 15:e1006931. [PMID: 30933970 PMCID: PMC6459551 DOI: 10.1371/journal.pcbi.1006931] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Received: 01/03/2019] [Revised: 04/11/2019] [Accepted: 03/05/2019] [Indexed: 11/29/2022]
Abstract
Increasing evidence has indicated that microRNAs (miRNAs) play vital roles in various pathological processes and thus are closely related to many complex human diseases. The identification of potential disease-related miRNAs offers new opportunities to understand disease etiology and pathogenesis. Although numerous computational methods have been proposed to predict reliable miRNA-disease associations, they suffer from various limitations that affect their prediction accuracy and applicability. In this study, we develop a novel method to discover disease-related candidate miRNAs based on Adaptive Multi-View Multi-Label learning (AMVML). Specifically, considering the inherent noise in the current dataset, we propose to learn a new affinity graph adaptively for both diseases and miRNAs from multiple similarity profiles. We then simultaneously update the miRNA-disease associations predicted from both spaces based on multi-label learning. In particular, we prove the convergence of AMVML theoretically, and the corresponding analysis indicates that it has a fast convergence rate. To comprehensively illustrate the prediction performance of our method, we compared AMVML with four state-of-the-art methods under different validation frameworks. As a result, our method achieved comparable performance under various evaluation metrics, which suggests that our method is capable of discovering a greater number of true miRNA-disease associations. The case study conducted on thyroid neoplasms further identified a potential diagnostic biomarker. Together, the experimental results confirm the utility of our method, and we anticipate that it could serve as a reliable and efficient tool for uncovering novel disease-related miRNAs.
Affiliation(s)
- Cheng Liang
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Shengpeng Yu
- School of Information Science and Engineering, Shandong Normal University, Jinan, China
| | - Jiawei Luo
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, China
|
39
|
Ha J, Park C, Park S. PMAMCA: prediction of microRNA-disease association utilizing a matrix completion approach. BMC SYSTEMS BIOLOGY 2019; 13:33. [PMID: 30894171 PMCID: PMC6425656 DOI: 10.1186/s12918-019-0700-4] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 06/16/2018] [Accepted: 01/29/2019] [Indexed: 01/29/2023]
Abstract
Background Numerous experimental results have indicated that microRNAs (miRNAs) play a vital role in biological processes, as well as in outbreaks of diseases at the molecular level. Despite their important role in biological processes, knowledge regarding specific functions of miRNAs in the development of human diseases is very limited. To address this problem, many computational approaches have been proposed and have attracted significant attention. However, most previous approaches suffer from the common problem of being inapplicable to new diseases without any known miRNA-disease associations. Results This paper proposes a novel method for inferring disease-miRNA associations utilizing a machine learning technique called matrix factorization, which is widely used in recommendation systems. In recommendation systems, the goal is to predict rating scores that a user might assign to specific items. By replacing users with miRNAs and items with diseases, we can efficiently predict miRNA-disease associations without seed miRNAs. As a result, our proposed model, called prediction of microRNA-disease association utilizing a matrix completion approach, achieves excellent performance compared to previous approaches, with a reliable AUC value of 0.882 under five-fold cross validation. Conclusions To the best of our knowledge, the proposed method applies the matrix completion technique to infer miRNA-disease associations and overcomes the seed-miRNA problem that negatively affects existing computational models.
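The recommender-style matrix factorization that the abstract above describes can be illustrated with a generic latent-factor model fitted by stochastic gradient descent on the observed entries only. This is a minimal sketch of the general technique, not the PMAMCA algorithm itself, whose objective is not given in the abstract; the function names, hyperparameters and toy matrix are all assumptions.

```python
import random


def factorize(ratings, rank=2, epochs=500, lr=0.05, reg=0.02, seed=0):
    """Fit ratings[i][j] ~ dot(U[i], V[j]) on observed (non-None) entries via SGD."""
    rng = random.Random(seed)
    n_rows, n_cols = len(ratings), len(ratings[0])
    U = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(n_cols)]
    observed = [(i, j) for i in range(n_rows) for j in range(n_cols)
                if ratings[i][j] is not None]
    for _ in range(epochs):
        for i, j in observed:
            error = ratings[i][j] - predict(U, V, i, j)
            for k in range(rank):
                u_ik, v_jk = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularisation.
                U[i][k] += lr * (error * v_jk - reg * u_ik)
                V[j][k] += lr * (error * u_ik - reg * v_jk)
    return U, V


def predict(U, V, i, j):
    """Predicted association score for row i (miRNA) and column j (disease)."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def training_error(ratings, U, V):
    """Sum of squared errors over the observed entries."""
    return sum((ratings[i][j] - predict(U, V, i, j)) ** 2
               for i in range(len(ratings)) for j in range(len(ratings[0]))
               if ratings[i][j] is not None)


# Hypothetical miRNA-by-disease matrix; None marks pairs treated as unobserved.
association = [
    [1, 1, 0, None],
    [1, None, 0, 0],
    [0, 0, 1, 1],
    [None, 0, 1, 1],
]
U, V = factorize(association)
score_for_unobserved_pair = predict(U, V, 0, 3)
```

Scores for the unobserved pairs can then be ranked to propose candidate associations, which is the sense in which users map to miRNAs and items to diseases.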
Affiliation(s)
- Jihwan Ha
- Department of Computer Science, Yonsei University, 134 Sinchon-dong, Seodaemun-gu, Seoul, South Korea
| | - Chihyun Park
- Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, 9211 Euclid Ave., Cleveland, OH, 44106, USA
| | - Sanghyun Park
- Department of Computer Science, Yonsei University, 134 Sinchon-dong, Seodaemun-gu, Seoul, South Korea.
|
40
|
Zhao J, Ma X. Multiple Partial Regularized Nonnegative Matrix Factorization for Predicting Ontological Functions of lncRNAs. Front Genet 2019; 9:685. [PMID: 30728826 PMCID: PMC6351489 DOI: 10.3389/fgene.2018.00685] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/29/2018] [Accepted: 12/10/2018] [Indexed: 02/02/2023]
Abstract
Long non-coding RNAs (lncRNAs) are critical regulators of biological processes and are highly related to complex diseases. Even though next-generation sequencing technology facilitates the discovery of a great number of lncRNAs, knowledge about the functions of lncRNAs is limited. Thus, it is promising to predict the functions of lncRNAs, which can shed light on the mechanisms of complex diseases. Current algorithms predict the functions of lncRNAs by using the features of protein-coding genes. Generally speaking, these algorithms fuse heterogeneous genomic data to construct lncRNA-gene associations via a linear combination, which cannot fully characterize the function-lncRNA relations. To overcome this issue, we present a nonnegative matrix factorization algorithm with multiple partial regularization (MPrNMF) to predict the functions of lncRNAs without fusing the heterogeneous genomic data. In detail, for each type of genomic data, we construct the lncRNA-gene associations, resulting in multiple associations. The proposed method integrates them separately via a regularization strategy, rather than fusing them into a single type of association. The results demonstrate that the proposed algorithm outperforms state-of-the-art methods based on network analysis. The model and algorithm provide an effective way to explore the functions of lncRNAs.
Affiliation(s)
- Jianbang Zhao
  - College of Information Engineering, Northwest Agriculture & Forestry University, Xianyang, China
- Xiaoke Ma
  - School of Computer Science and Technology, Xidian University, Xi'an, China
|
41
|
Shen C, Ding Y, Tang J, Guo F. Multivariate Information Fusion With Fast Kernel Learning to Kernel Ridge Regression in Predicting LncRNA-Protein Interactions. Front Genet 2019; 9:716. [PMID: 30697228 PMCID: PMC6340980 DOI: 10.3389/fgene.2018.00716] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 09/22/2018] [Accepted: 12/21/2018] [Indexed: 12/31/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) constitute a large class of transcribed RNA molecules. They have a characteristic length of more than 200 nucleotides and do not encode proteins. They play an important role in regulating gene expression by interacting with the homologous RNA-binding proteins. Due to the laborious and time-consuming nature of wet experimental methods, researchers should pay greater attention to computational approaches for the prediction of lncRNA-protein interactions (LPI). An in-depth review of state-of-the-art in silico investigations leads to the conclusion that there is still room for improving accuracy and speed. This paper proposes a novel method for identifying LPI by employing Kernel Ridge Regression based on Fast Kernel Learning (LPI-FKLKRR). This approach uses four distinct similarity measures for each of the lncRNA and protein spaces. Notably, we extract Gene Ontology (GO) information for proteins in order to improve the quality of information in the protein space. The integration of heterogeneous kernels applies Fast Kernel Learning (FastKL) to handle weight optimization. The final predicted associations are then obtained using Kernel Ridge Regression (KRR). Experimental results show that LPI-FKLKRR performs markedly better than existing LPI prediction schemes. On a benchmark dataset, the best Area Under the Precision-Recall Curve (AUPR) of 0.6950 is obtained by our proposed model LPI-FKLKRR, which outperforms the integrated LPLNP (AUPR: 0.4584), RWR (AUPR: 0.2827), CF (AUPR: 0.2357), LPIHN (AUPR: 0.2299), and LPBNI (AUPR: 0.3302). Together with the experimental results of a case study on a novel dataset, it is anticipated that LPI-FKLKRR will be a useful tool for LPI prediction.
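The Kernel Ridge Regression step named in this abstract can be sketched generically. The code below is not LPI-FKLKRR: it uses a single Gaussian (RBF) kernel rather than the learned combination of kernels, and the function names, the regularization value `lam`, and the one-dimensional toy data are assumptions for illustration. It solves (K + lam·I)·alpha = y and predicts with a kernel-weighted sum.

```python
import math


def rbf(x, z, gamma=1.0):
    """Gaussian kernel between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def krr_fit(X, y, lam=1e-3, gamma=1.0):
    """Return dual coefficients alpha = (K + lam * I)^-1 y."""
    K = [[rbf(a, b, gamma) for b in X] for a in X]
    for i in range(len(X)):
        K[i][i] += lam
    return solve(K, y)


def krr_predict(X, alpha, x, gamma=1.0):
    """Predict at x as a kernel-weighted sum over the training points."""
    return sum(a * rbf(x, xi, gamma) for a, xi in zip(alpha, X))


# Hypothetical toy data: three training points on a line.
X_train = [[0.0], [1.0], [2.0]]
y_train = [0.0, 1.0, 0.0]
alpha = krr_fit(X_train, y_train)
fitted_middle = krr_predict(X_train, alpha, [1.0])
```

With a small `lam` the fitted values at the training points stay close to the targets; increasing `lam` trades that fit for smoother predictions.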
Affiliation(s)
- Cong Shen
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
- Yijie Ding
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, United States
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
42
|
Yan C, Wang J, Ni P, Lan W, Wu FX, Pan Y. DNRLMF-MDA: Predicting microRNA-Disease Associations Based on Similarities of microRNAs and Diseases. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:233-243. [PMID: 29990253 DOI: 10.1109/tcbb.2017.2776101] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 06/08/2023]
Abstract
MicroRNAs (miRNAs) are a class of non-coding RNAs of about 22 nucleotides. Studies have shown that miRNAs play key roles in many complex human diseases. Therefore, discovering miRNA-disease associations is beneficial to understanding disease mechanisms, developing drugs, and treating complex diseases. It is well known that discovering miRNA-disease associations via biological experiments is a time-consuming and expensive process. Alternatively, computational models could provide a low-cost and high-efficiency way of predicting miRNA-disease associations. In this study, we propose a method (called DNRLMF-MDA) to predict miRNA-disease associations based on dynamic neighborhood regularized logistic matrix factorization. DNRLMF-MDA integrates known miRNA-disease associations, functional similarity and Gaussian Interaction Profile (GIP) kernel similarity of miRNAs, and functional similarity and GIP kernel similarity of diseases. In particular, positive observations (known miRNA-disease associations) are assigned higher importance levels than negative observations (unknown miRNA-disease associations). DNRLMF-MDA computes the probability that a miRNA would interact with a disease by a logistic matrix factorization method, where latent vectors of miRNAs and diseases represent the properties of miRNAs and diseases, respectively, and further improves prediction performance via dynamic neighborhood regularization. Five-fold cross validation is adopted to assess the performance of DNRLMF-MDA, as well as other competing methods for comparison. The computational experiments show that DNRLMF-MDA outperforms the state-of-the-art method PBMDA. The AUC values of DNRLMF-MDA on three datasets are 0.9357, 0.9411, and 0.9416, respectively, which are superior to PBMDA's results of 0.9218, 0.9187, and 0.9262. The average computation times per 5-fold cross validation of DNRLMF-MDA on three datasets are 38, 46, and 50 seconds, which are shorter than PBMDA's average computation times of 10869, 916, and 8448 seconds, respectively. DNRLMF-MDA can also predict potential diseases for new miRNAs. Furthermore, case studies illustrate that DNRLMF-MDA is an effective method to predict miRNA-disease associations.
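The Gaussian Interaction Profile (GIP) kernel similarity mentioned in this abstract can be sketched as follows. This follows the commonly used definition, in which the bandwidth is normalized by the mean squared norm of the interaction profiles; whether DNRLMF-MDA uses exactly this normalization is an assumption, as are the function name, the `gamma_prime` parameter, and the toy profiles.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Pairwise GIP similarity between binary interaction profiles.

    Each profile is one row of the association matrix, for example the
    0/1 vector of diseases known to be associated with one miRNA.
    Assumes at least one profile is non-zero.
    """
    n = len(profiles)
    # Bandwidth normalized by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2
                                  for a, b in zip(profiles[i], profiles[j])))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical profiles: the first two miRNAs share all known diseases.
profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(profiles)
```

Identical profiles get similarity 1, and the similarity decays with the number of diseases on which two profiles disagree.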
|
43
|
Paul S, Brahma D. An Integrated Approach for Identification of Functionally Similar MicroRNAs in Colorectal Cancer. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:183-192. [PMID: 29990005 DOI: 10.1109/tcbb.2017.2765332] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/08/2023]
Abstract
Colorectal cancer (CRC) is one of the most prevalent cancers around the globe. However, the molecular causes of CRC pathogenesis are still poorly understood. Recently, the role of microRNAs (miRNAs) in the initiation and progression of CRC has been studied. MicroRNAs are small, endogenous noncoding RNAs found in plants, animals, and some viruses, which function in RNA silencing and posttranscriptional regulation of gene expression. They have been found to be potential biomarkers in the diagnosis and treatment of CRC. Therefore, identification of functionally similar CRC-related miRNAs may help in the development of a prognostic tool. In this regard, this paper presents a new algorithm, called μSim, an integrative approach for identification of functionally similar miRNAs associated with CRC. It judiciously integrates miRNA expression data and miRNA-miRNA functionally synergistic network data, and the functional similarity is calculated based on both sources. The effectiveness of the proposed method in comparison to other related methods is shown on four CRC miRNA data sets. The proposed method selected more significant miRNAs related to CRC as compared to other related methods.
|
44
|
Jiang L, Xiao Y, Ding Y, Tang J, Guo F. FKL-Spa-LapRLS: an accurate method for identifying human microRNA-disease association. BMC Genomics 2018; 19:911. [PMID: 30598109 PMCID: PMC6311941 DOI: 10.1186/s12864-018-5273-x] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND In post-transcriptional regulation, microRNAs (miRNAs) are closely related to various complex human diseases. Traditional verification methods for miRNA-disease associations are time-consuming and expensive, so it is especially important to design computational methods for detecting potential associations. Considering the restrictions of previous computational methods for predicting potential miRNA-disease associations, we develop the FKL-Spa-LapRLS model (Fast Kernel Learning Sparse kernel Laplacian Regularized Least Squares) to overcome these limitations. RESULTS First, we extract three miRNA similarity kernels and three disease similarity kernels. Then, we combine these kernels into a single kernel through the Fast Kernel Learning (FKL) model, and use a sparse kernel (Spa) to eliminate noise in the integrated similarity kernel. Finally, we find the associations via Laplacian Regularized Least Squares (LapRLS). Under three evaluation methods, namely global and local leave-one-out cross validation (LOOCV) and 5-fold cross validation, the AUCs of our method reach 0.9563, 0.8398 and 0.9535, respectively, indicating that our method is reliable. Then, we use case studies of eight neoplasms to further analyze the performance of our method. We find that most of the predicted miRNA-disease associations are confirmed by previous traditional experiments, and that some important miRNAs, which uncover more associations with various neoplasms than other miRNAs, deserve more attention. CONCLUSIONS Our proposed model can reveal miRNA-disease associations and improve the accuracy of association prediction for various diseases. Our method can also be easily extended with more similarity kernels.
Affiliation(s)
- Limin Jiang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - Tianjin University Institute of Computational Biology, Tianjin University, Tianjin, China
- Yongkang Xiao
  - School of Chemical Engineering and Technology, Tianjin University, Tianjin, China
- Yijie Ding
  - School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou, China
- Jijun Tang
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
  - Tianjin University Institute of Computational Biology, Tianjin University, Tianjin, China
  - Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA
- Fei Guo
  - School of Computer Science and Technology, College of Intelligence and Computing, Tianjin University, Tianjin, China
|
45
|
Zhao H, Kuang L, Feng X, Zou Q, Wang L. A Novel Approach Based on a Weighted Interactive Network to Predict Associations of MiRNAs and Diseases. Int J Mol Sci 2018; 20:E110. [PMID: 30597923 PMCID: PMC6337518 DOI: 10.3390/ijms20010110] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/27/2018] [Revised: 12/23/2018] [Accepted: 12/24/2018] [Indexed: 01/15/2023] Open
Abstract
Accumulating evidence from many experimental studies indicates that microRNAs (miRNAs) play a significant role in the pathogenesis of diseases; therefore, developing powerful computational models to identify potential human miRNA-disease associations is vital for an understanding of disease etiology and pathogenesis. In this paper, a weighted interactive network was first constructed by combining known miRNA-disease associations, as well as the integrated similarity between diseases and the integrated similarity between miRNAs. Then, a new computational method implementing the newly weighted interactive network was developed for discovering potential miRNA-disease associations (WINMDA) by integrating the T most similar neighbors and the shortest path algorithm. Simulation results show that WINMDA can achieve reliable area under the receiver operating characteristic (ROC) curve (AUC) results of 0.9183 ± 0.0007 in 5-fold cross-validation, 0.9200 ± 0.0004 in 10-fold cross-validation, 0.9243 in global leave-one-out cross-validation (LOOCV), and 0.8856 in local LOOCV. Furthermore, case studies of colon neoplasms, gastric neoplasms, and prostate neoplasms based on the Human microRNA Disease Database (HMDD) were implemented, for which 94% (colon neoplasms), 96% (gastric neoplasms), and 96% (prostate neoplasms) of the top 50 predicted miRNAs were confirmed by recent experimental reports, which also demonstrates that WINMDA can effectively uncover potential miRNA-disease associations.
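Since this abstract and several others in the list report AUC values, a short sketch of how the metric can be computed may help. It uses the rank-based reading of AUC: the probability that a randomly chosen positive pair is scored above a randomly chosen negative pair, with ties counted as one half. The function name and the toy scores are assumptions; the cross-validation splitting used in the papers is not reproduced here.

```python
def auc(scores, labels):
    """Area under the ROC curve from scores and 0/1 labels."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four candidate miRNA-disease pairs.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

In this toy case three of the four positive/negative pairs are ordered correctly, so the value is 0.75; a perfect ranking gives 1.0 and a random one about 0.5.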
Affiliation(s)
- Haochen Zhao
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411105, China.
- Linai Kuang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China.
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411105, China.
- Xiang Feng
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China.
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411105, China.
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610000, China.
  - School of Computer Science and Technology, Tianjin University, Tianjin 300000, China.
- Lei Wang
  - College of Computer Engineering & Applied Mathematics, Changsha University, Changsha 410001, China.
  - Key Laboratory of Hunan Province for Internet of Things and Information Security, Xiangtan University, Xiangtan 411105, China.
|
46
|
Qu Y, Zhang H, Lyu C, Liang C. LLCMDA: A Novel Method for Predicting miRNA Gene and Disease Relationship Based on Locality-Constrained Linear Coding. Front Genet 2018; 9:576. [PMID: 30555511 PMCID: PMC6282048 DOI: 10.3389/fgene.2018.00576] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/14/2018] [Accepted: 11/08/2018] [Indexed: 01/03/2023] Open
Abstract
MiRNAs are small non-coding regulatory RNAs which are associated with multiple diseases. Increasing evidence has shown that miRNAs play important roles in various biological and physiological processes. Therefore, the identification of potential miRNA-disease associations could provide new clues to understanding the mechanism of pathogenesis. Although many traditional methods have been successfully applied to discover part of the associations, they are in general time-consuming and expensive. Consequently, computational methods are urgently needed to predict potential miRNA-disease associations in a more efficient and resource-saving way. In this paper, we propose a novel method to predict miRNA-disease associations based on Locality-constrained Linear Coding (LLC). Specifically, we first reconstruct similarity networks for both miRNAs and diseases using LLC and then apply label propagation on the similarity networks to obtain relevance scores. To comprehensively verify the performance of the proposed method, we compare our method with several state-of-the-art methods under different evaluation metrics. Moreover, two types of case studies conducted on two common diseases further demonstrate the validity and utility of our method. Extensive experimental results indicate that our method can effectively predict potential associations between miRNAs and diseases.
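The label propagation step described in this abstract can be sketched on a small similarity network. This is a generic iterative scheme, not the LLCMDA implementation: the LLC-based network reconstruction is omitted, and the row normalization, the `alpha` value, the function names, and the toy similarity matrix are assumptions for illustration.

```python
def row_normalize(S):
    """Scale each row of a similarity matrix to sum to one."""
    out = []
    for row in S:
        total = sum(row)
        out.append([v / total if total else 0.0 for v in row])
    return out


def propagate(S, y, alpha=0.5, iters=100):
    """Iterate f <- alpha * W f + (1 - alpha) * y until (near) convergence.

    S is a node-by-node similarity matrix, y holds the initial labels
    (1 for nodes with a known association, 0 otherwise), and alpha in
    (0, 1) controls how much score flows in from neighbours.
    """
    W = row_normalize(S)
    f = list(y)
    for _ in range(iters):
        f = [
            alpha * sum(w * fj for w, fj in zip(W[i], f)) + (1 - alpha) * y[i]
            for i in range(len(y))
        ]
    return f


# Hypothetical miRNA similarity network: miRNA 1 is very similar to the
# seed miRNA 0, while miRNA 2 is only weakly similar to both.
S = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.1],
    [0.1, 0.1, 0.0],
]
scores = propagate(S, [1.0, 0.0, 0.0])
```

The seed keeps the highest score, and the unlabeled node most similar to the seed ends up ranked above the weakly connected one, which is how candidates are prioritized.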
Affiliation(s)
- Yu Qu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Huaxiang Zhang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Chen Lyu
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
- Cheng Liang
  - School of Information Science and Engineering, Shandong Normal University, Jinan, China
|
47
|
Fan C, Lei X, Wu FX. Prediction of CircRNA-Disease Associations Using KATZ Model Based on Heterogeneous Networks. Int J Biol Sci 2018; 14:1950-1959. [PMID: 30585259 PMCID: PMC6299360 DOI: 10.7150/ijbs.28260] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Received: 07/02/2018] [Accepted: 09/30/2018] [Indexed: 01/08/2023] Open
Abstract
Circular RNAs (circRNAs) are a large group of endogenous non-coding RNAs which are key members of gene regulatory processes. In humans, circRNAs play significant roles in health and disease. Owing to their universality, specificity and stability, circRNAs are becoming an ideal class of biomarkers for disease diagnosis, treatment and prognosis. Identification of the relationships between circRNAs and diseases can help in understanding complex disease mechanisms. However, traditional experiments are costly and time-consuming, and few computational models have been developed to predict novel circRNA-disease associations. In this study, a heterogeneous network was constructed by employing the circRNA expression profiles, disease phenotype similarity and Gaussian interaction profile kernel similarity. Then, we developed a computational model of KATZ measures for human circRNA-disease association prediction (KATZHCDA). Leave-one-out cross validation (LOOCV) and 5-fold cross validation were implemented to investigate the effects of these four types of similarity measures. As a result, the KATZHCDA model yields AUCs of 0.8469 and 0.7936+/-0.0065 in LOOCV and 5-fold cross validation, respectively. Furthermore, we analyzed the candidate association between hsa_circ_0006054 and colorectal cancer, and the results showed that hsa_circ_0006054 may function as a miRNA sponge in the carcinogenesis of colorectal cancer. Overall, it is anticipated that our proposed model could become an effective resource for clinical experimental guidance.
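The KATZ measure named in this abstract scores a pair of nodes by counting the walks between them, with longer walks damped by a factor `beta`. The sketch below computes a truncated version, the sum of `beta**l * A**l` for walk lengths l = 1 to `max_len`, on a toy adjacency matrix. It is not the KATZHCDA implementation: the heterogeneous circRNA-disease network is not built here, and the function names, `beta`, and `max_len` are assumptions.

```python
def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [
        [sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def katz(A, beta=0.5, max_len=3):
    """Truncated Katz scores: sum over l = 1..max_len of beta**l * A**l."""
    n = len(A)
    S = [[0.0] * n for _ in range(n)]
    power = [row[:] for row in A]  # holds A**l, starting at l = 1
    weight = beta                  # holds beta**l
    for _ in range(max_len):
        for i in range(n):
            for j in range(n):
                S[i][j] += weight * power[i][j]
        power = matmul(power, A)
        weight *= beta
    return S


# Hypothetical network: a path 0 - 1 - 2, so nodes 0 and 2 are only
# connected indirectly through node 1.
A = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
S = katz(A)
```

On this path graph the directly linked pair (0, 1) scores 0.75 and the indirectly linked pair (0, 2) scores 0.25, showing how unobserved pairs still receive a score through short walks.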
Affiliation(s)
- Chunyan Fan
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
- Xiujuan Lei
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
- Fang-Xiang Wu
  - School of Computer Science, Shaanxi Normal University, Xi'an 710119, China
  - Department of Mechanical Engineering and Division of Biomedical Engineering, University of Saskatchewan, Saskatoon, SK S7N 5A9, Canada
|
48
|
Lan W, Wang J, Li M, Liu J, Wu FX, Pan Y. Predicting MicroRNA-Disease Associations Based on Improved MicroRNA and Disease Similarities. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2018; 15:1774-1782. [PMID: 27392365 DOI: 10.1109/tcbb.2016.2586190] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Indexed: 06/06/2023]
Abstract
MicroRNAs (miRNAs) are a type of non-coding RNA of about 22 nucleotides. Increasing evidence has shown that miRNAs play critical roles in many human diseases. The identification of human disease-related miRNAs is helpful for exploring the underlying pathogenesis of diseases. More and more experimentally validated associations between miRNAs and diseases have been reported in recent studies, which provide useful information for the discovery of new miRNA-disease associations. In this study, we propose a computational framework, KBMF-MDI, to predict the associations between miRNAs and diseases based on their similarities. The sequence and function information of miRNAs is used to measure similarity among miRNAs, while the semantic and function information of diseases is used to measure similarity among diseases. In addition, the kernelized Bayesian matrix factorization method is employed to infer potential miRNA-disease associations by integrating these data sources. We applied this method to 6,084 known miRNA-disease associations and utilized 5-fold cross validation to evaluate the performance. The experimental results demonstrate that our method can effectively predict unknown miRNA-disease associations.
|
49
|
Hu Y, Dingerdissen H, Gupta S, Kahsay R, Shanker V, Wan Q, Yan C, Mazumder R. Identification of key differentially expressed MicroRNAs in cancer patients through pan-cancer analysis. Comput Biol Med 2018; 103:183-197. [PMID: 30384176 DOI: 10.1016/j.compbiomed.2018.10.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 08/14/2018] [Revised: 10/01/2018] [Accepted: 10/17/2018] [Indexed: 12/16/2022]
Abstract
microRNAs (miRNAs) functioning in gene silencing have been associated with cancer progression. However, common abnormal miRNA expression patterns and their potential roles in cancer have not yet been evaluated. To account for individual differences between patients, we retrieved miRNA sequencing data for 575 patients with both tumor and adjacent non-tumorous tissues from 14 cancer types from The Cancer Genome Atlas (TCGA). We then performed differential expression analysis using DESeq2 and edgeR. Results showed that cancer types can be grouped based on the distribution of miRNAs with different expression patterns between tumor and non-tumor samples. We found 81 significantly differentially expressed miRNAs (SDEmiRNAs) in a single cancer. We also found 21 key SDEmiRNAs (nine over-expressed and 12 under-expressed) associated with at least eight cancers each and enriched in more than 60% of patients per cancer, including four newly identified SDEmiRNAs (hsa-mir-4746, hsa-mir-3648, hsa-mir-3687, and hsa-mir-1269a). The downstream effects of these 21 SDEmiRNAs on cellular function were evaluated through enrichment and pathway analysis of 7186 protein-coding gene targets mined from literature reports of differential expression of miRNAs in cancer. This analysis enables identification of SDEmiRNA functional similarity in cell proliferation control across a wide range of cancers, and assembly of common regulatory networks over cancer-related pathways. These findings were validated by construction of a regulatory network in the PI3K pathway. This study provides evidence for the value of further analysis of SDEmiRNAs as potential biomarkers and therapeutic targets for cancer diagnosis and treatment.
Affiliation(s)
- Yu Hu
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
- Hayley Dingerdissen
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
- Samir Gupta
  - Department of Computer and Information Science, University of Delaware, Newark, DE, 19716, USA.
- Robel Kahsay
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
- Vijay Shanker
  - Department of Computer and Information Science, University of Delaware, Newark, DE, 19716, USA.
- Quan Wan
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
- Cheng Yan
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
- Raja Mazumder
  - The Department of Biochemistry & Molecular Medicine, The George Washington University Medical Center, Washington, DC, 20037, USA.
  - The McCormick Genomic and Proteomic Center, The George Washington University, Washington, DC, 20037, USA.
|
50
|
Han K, Wang M, Zhang L, Wang C. Application of Molecular Methods in the Identification of Ingredients in Chinese Herbal Medicines. Molecules 2018; 23:E2728. [PMID: 30360419 PMCID: PMC6222746 DOI: 10.3390/molecules23102728] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 09/09/2018] [Revised: 10/19/2018] [Accepted: 10/20/2018] [Indexed: 11/16/2022] Open
Abstract
There are several kinds of Chinese herbal medicines originating from diverse sources. However, the rapid taxonomic identification of large quantities of Chinese herbal medicines is difficult using traditional methods, and the process of identification itself is prone to error. Therefore, the traditional methods of Chinese herbal medicine identification must meet higher standards of accuracy. With the rapid development of bioinformatics, methods relying on bioinformatics strategies offer advantages with respect to the speed and accuracy of the identification of Chinese herbal medicine ingredients. This article reviews the applicability and limitations of biochip and DNA barcoding technology in the identification of Chinese herbal medicines. Furthermore, the future development of the two technologies of interest is discussed.
Affiliation(s)
- Ke Han
  - School of Computer and Information Engineering, Harbin University of Commerce, Harbin 150028, China.
- Miao Wang
  - Life sciences and Environmental Sciences Development Center, Harbin University of Commerce, Harbin 150010, China.
- Lei Zhang
  - Life sciences and Environmental Sciences Development Center, Harbin University of Commerce, Harbin 150010, China.
- Chunyu Wang
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China.
|