1
|
Moturi S, Rao SNT, Vemuru S. Grey wolf assisted dragonfly-based weighted rule generation for predicting heart disease and breast cancer. Comput Med Imaging Graph 2021; 91:101936. [PMID: 34218121 DOI: 10.1016/j.compmedimag.2021.101936] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2020] [Revised: 01/06/2021] [Accepted: 05/07/2021] [Indexed: 11/29/2022]
Abstract
Disease prediction plays a significant role in people's lives, since anticipating the threat of disease helps citizens stay healthy. Recent developments in data mining have produced several systems concerned with disease prediction. Although such systems offer many advantages, challenges such as prediction efficiency and information protection still limit their realistic use. This paper develops an improved disease prediction model with three phases: weighted Coalesce rule generation, optimized feature extraction, and classification. First, Coalesce rule generation is carried out after data transformation, which involves normalization and sequential labeling. Rules are generated based on the weights (priority levels) that an expert assigns to each attribute. The support of each rule is multiplied by the proposed weighted function, and the resulting weighted support is compared with the minimum support to select the rules. The selected rules are then subjected to an optimal feature selection process. A hybrid classifier that merges a Support Vector Machine (SVM) and a Deep Belief Network (DBN) performs the classification, determining whether or not the patient is affected by the disease. The optimized feature selection relies on a new hybrid optimization algorithm that links Grey Wolf Optimization (GWO) with the Dragonfly Algorithm (DA); the presented model is therefore termed Grey Wolf Levy Updated-DA (GWU-DA). Heart disease and breast cancer data are used, and the efficiency of the proposed model is validated by comparison with state-of-the-art models.
From the analysis, at the 60th learning percentage the accuracy of the proposed GWU-DA model is 65.98 %, 53.61 %, 42.27 %, 35.05 %, 34.02 %, 11.34 %, 13.4 %, 10.31 %, 9.28 % and 9.89 % better than that of the CBA + CPAR, MKL + ANFIS, RF + EA, WCBA, IQR + KNN + PSO, NL-DA + SVM + DBN, AWFS-RA, HCS-RFRS, ADS-SM-DNN and OSSVM-HGSA models, respectively.
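The rule-selection step described in this abstract (support multiplied by a weighting function, then compared with the minimum support) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the weighted function, so the mean of the expert-assigned attribute weights is used as a stand-in, and the function name `select_rules`, the attribute names, and all numbers are hypothetical.

```python
def select_rules(rules, attribute_weights, min_support):
    """Keep rules whose weighted support reaches the minimum support.

    rules: list of (attributes, support) pairs, where attributes is a tuple
    of attribute names and support is the plain support of the rule.
    """
    selected = []
    for attributes, support in rules:
        # Assumed weighting function: mean expert weight of the rule's attributes.
        weight = sum(attribute_weights[a] for a in attributes) / len(attributes)
        weighted_support = support * weight
        if weighted_support >= min_support:
            selected.append((attributes, weighted_support))
    return selected


# Hypothetical rules and expert weights, for illustration only.
rules = [(("age", "chol"), 0.5), (("sex",), 0.5)]
weights = {"age": 1.0, "chol": 0.5, "sex": 0.25}
selected = select_rules(rules, weights, min_support=0.25)
```

With these toy values, both rules have the same plain support, but only the rule built from highly weighted attributes passes the threshold, which is the effect the weighting is meant to have.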
Affiliation(s)
- Sireesha Moturi
  - Research Scholar, Computer Science and Engineering, KLEF, Green Fields, Vaddeswaram, Andhra Pradesh, 522502, India.
- S N Tirumala Rao
  - Professor, Computer Science and Engineering, Narasaraopeta Engineering College, Narasaraopet, Guntur(Dt), Andhra Pradesh, India
- Srikanth Vemuru
  - Professor, Computer Science and Engineering, KLEF, Green Fields, Vaddeswaram, Andhra Pradesh, 522502, India
|
2
|
Chu Y, Wang X, Dai Q, Wang Y, Wang Q, Peng S, Wei X, Qiu J, Salahub DR, Xiong Y, Wei DQ. MDA-GCNFTG: identifying miRNA-disease associations based on graph convolutional networks via graph sampling through the feature and topology graph. Brief Bioinform 2021; 22:6261915. [PMID: 34009265 DOI: 10.1093/bib/bbab165] [Citation(s) in RCA: 34] [Impact Index Per Article: 11.3] [Received: 02/10/2021] [Revised: 04/02/2021] [Accepted: 04/08/2021] [Indexed: 11/13/2022] Open
Abstract
Accurate identification of miRNA-disease associations (MDAs) helps to understand the etiology and mechanisms of various diseases. However, experimental methods are costly and time-consuming, so computational methods for predicting MDAs are urgently needed. Based on graph theory, the present study treats MDA prediction as a node classification task. To solve this task, we propose a novel method, MDA-GCNFTG, which predicts MDAs using Graph Convolutional Networks (GCNs) via graph sampling through the Feature and Topology Graph to improve training efficiency and accuracy. The method models both the potential connections in feature space and the structural relationships of the MDA data. The nodes of the graphs are represented by disease semantic similarity, miRNA functional similarity and Gaussian interaction profile kernel similarity. Moreover, for the first time, we consider six tasks simultaneously on the MDA prediction problem, which ensures that, under both balanced and unbalanced sample distributions, MDA-GCNFTG can predict not only new MDAs but also new diseases without known related miRNAs and new miRNAs without known related diseases. The results of 5-fold cross-validation show that MDA-GCNFTG achieves satisfactory performance on all six tasks and is significantly superior to classic machine learning methods and state-of-the-art MDA prediction methods. The effectiveness of GCNs via the graph sampling strategy and of the feature and topology graph in MDA-GCNFTG is also demonstrated. More importantly, case studies for two diseases and three miRNAs were conducted and achieved satisfactory performance.
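The abstract names Gaussian interaction profile (GIP) kernel similarity as one of the node representations but does not define it. The sketch below uses the definition commonly found in the association-prediction literature, K(i, j) = exp(-γ ||IP(i) - IP(j)||²), with γ set to γ' divided by the mean squared norm of the interaction profiles. Whether MDA-GCNFTG uses exactly this normalisation is an assumption; the function name `gip_kernel` and the toy matrix are hypothetical.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between rows of a binary association matrix."""
    n = len(profiles)
    # Bandwidth normalised by the mean squared norm of the profiles.
    # Assumes at least one known association, so the mean is non-zero.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    gamma = gamma_prime / mean_sq_norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-gamma * sq_dist(profiles[i], profiles[j])) for j in range(n)]
        for i in range(n)
    ]


# Toy association matrix (hypothetical): rows are miRNAs, columns are diseases.
profiles = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
kernel = gip_kernel(profiles)
```

Rows with identical interaction profiles get similarity 1, and the similarity decays as the profiles diverge. Applying the same function to the transposed matrix would give the disease-side kernel.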
Affiliation(s)
- Yanyi Chu
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Xuhong Wang
  - School of Electronic, Information and Electrical Engineering (SEIEE), Shanghai Jiao Tong University, China
- Qiuying Dai
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Yanjing Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Qiankun Wang
  - School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, China
- Shaoliang Peng
  - College of Computer Science and Electronic Engineering, Hunan University, China
- Dennis Russell Salahub
  - Department of Chemistry, University of Calgary, Fellow Royal Society of Canada and Fellow of the American Association for the Advancement of Science, China
- Yi Xiong
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Dong-Qing Wei
  - State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad-Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
|
3
|
Zheng K, You ZH, Wang L, Zhou Y, Li LP, Li ZW. DBMDA: A Unified Embedding for Sequence-Based miRNA Similarity Measure with Applications to Predict and Validate miRNA-Disease Associations. Mol Ther Nucleic Acids 2019; 19:602-611. [PMID: 31931344 PMCID: PMC6957846 DOI: 10.1016/j.omtn.2019.12.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 07/08/2019] [Revised: 10/09/2019] [Accepted: 12/10/2019] [Indexed: 11/24/2022]
Abstract
MicroRNAs (miRNAs) play a critical role in human diseases. Determining the associations between miRNAs and diseases contributes to elucidating the pathogenesis of liver diseases and to finding effective treatments. Despite great recent advances in the study of miRNA-disease associations, verifying and recognizing associations efficiently at scale remains a serious challenge for biological experimental approaches. Thus, computational methods for predicting miRNA-disease associations have become a research hotspot. In this paper, we present a new computational method, named distance-based sequence similarity for miRNA-disease association prediction (DBMDA), that directly learns a mapping from miRNA sequences to a Euclidean space. The notable feature of our approach is that it infers global similarity from region distances, which are computed with the chaos game representation algorithm applied to the miRNA sequences. In the 5-fold cross-validation experiment, the area under the curve (AUC) obtained by DBMDA in predicting potential miRNA-disease associations reached 0.9129. To assess DBMDA more thoroughly, we compared it with different classifiers and earlier prediction models. In addition, we conducted two case studies, for prostate neoplasms and colon neoplasms. Results show that 39 and 39 out of the top 40 predicted miRNAs, respectively, were confirmed by other databases. DBMDA makes a new attempt at sequence similarity and achieves excellent results, while also providing a new perspective for predicting the relationships between diseases and miRNAs. The source code and datasets explored in this work are available online from the University of Chinese Academy of Sciences (http://220.171.34.3:81/).
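The chaos game representation (CGR) mentioned in this abstract can be illustrated with a short sketch. It shows only the standard CGR iteration, in which each nucleotide moves the current point halfway toward that nucleotide's corner of the unit square; it does not reproduce DBMDA's region distances or its learned embedding. The corner assignment, the function name `cgr_points`, and the example sequence are assumptions made for illustration.

```python
# One corner of the unit square per nucleotide (the assignment is a convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "U": (1.0, 0.0)}


def cgr_points(sequence):
    """Return the CGR trajectory of an RNA sequence as a list of (x, y) points."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    # Treat DNA-style T as U so both alphabets are accepted.
    for base in sequence.upper().replace("T", "U"):
        cx, cy = CORNERS[base]
        # Move halfway from the current point toward the base's corner.
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


# Hypothetical short sequence, for illustration only.
points = cgr_points("UGAG")
```

Because each step halves the distance to a corner, sequences that share a suffix end up in the same sub-square, so counting points per grid cell gives a fixed-size description of a variable-length sequence.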
Affiliation(s)
- Kai Zheng
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China.
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China.
- Lei Wang
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277100, China.
- Yong Zhou
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
- Li-Ping Li
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Urumqi 830011, China
- Zheng-Wei Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
|
4
|
Zheng K, You ZH, Wang L, Zhou Y, Li LP, Li ZW. MLMDA: a machine learning approach to predict and validate MicroRNA-disease associations by integrating of heterogenous information sources. J Transl Med 2019; 17:260. [PMID: 31395072 PMCID: PMC6688360 DOI: 10.1186/s12967-019-2009-x] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 05/14/2019] [Accepted: 07/31/2019] [Indexed: 02/01/2023] Open
Abstract
Background: Emerging evidence shows that microRNA (miRNA) plays an important role in many complex human diseases. However, because traditional in vitro experiments are inherently time-consuming and expensive, more and more attention has been paid to developing efficient and feasible computational methods for predicting potential associations between miRNAs and diseases. Methods: In this work, we present a machine learning-based model called MLMDA for predicting miRNA-disease associations. More specifically, we first use the k-mer sparse matrix to extract miRNA sequence information and combine it with miRNA functional similarity, disease semantic similarity and Gaussian interaction profile kernel similarity information. Then, more representative features are extracted from them with a deep auto-encoder neural network (AE). Finally, a random forest classifier is used to predict potential miRNA–disease associations. Results: The experimental results show that the MLMDA model achieves promising performance under fivefold cross-validation, with an AUC of 0.9172, which is higher than that of the methods using different classifiers or different feature combinations mentioned in this paper. In addition, to further evaluate the prediction performance of the MLMDA model, case studies are carried out on three complex human diseases: Lymphoma, Lung Neoplasm, and Esophageal Neoplasms. As a result, 39, 37 and 36 out of the top 40 predicted miRNAs, respectively, are confirmed by other miRNA–disease association databases. Conclusions: These experimental results suggest that the MLMDA model could serve as a useful tool for guiding future experimental validation of promising miRNA biomarker candidates. The source code and datasets explored in this work are available at http://220.171.34.3:81/.
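The first step in the Methods, extracting sequence information with k-mers, can be illustrated with a plain k-mer count vector. This is only a sketch of the general idea: the abstract does not specify how MLMDA builds its k-mer sparse matrix, so the choice of k = 3, the function name `kmer_counts`, and the example sequence are assumptions.

```python
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGU"):
    """Count every k-mer of an RNA sequence into a fixed-length vector.

    Returns (counts, index), where index maps each k-mer to its position
    in counts. Windows containing characters outside the alphabet are skipped.
    """
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    counts = [0] * len(index)
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if kmer in index:
            counts[index[kmer]] += 1
    return counts, index


# Hypothetical short sequence, for illustration only.
counts, index = kmer_counts("UGAGGUAG", k=3)
```

For k = 3 the vector has 4³ = 64 entries, most of them zero for a short miRNA, which is why such representations are usually described as sparse.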
Affiliation(s)
- Kai Zheng
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China.
- Zhu-Hong You
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China.
- Lei Wang
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China; College of Information Science and Engineering, Zaozhuang University, Zaozhuang, 277100, China.
- Yong Zhou
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
- Li-Ping Li
  - Xinjiang Technical Institutes of Physics and Chemistry, Chinese Academy of Sciences, Ürümqi, 830011, China
- Zheng-Wei Li
  - School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China
|
5
|
Zhu X, Shao J, Zhang J. Pattern discovery from multi-source data. Pattern Recognit Lett 2018. [DOI: 10.1016/j.patrec.2018.03.012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
|