Xiong Y, Ye M, Wu C. Cancer Classification with a Cost-Sensitive Naive Bayes Stacking Ensemble.
COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE 2021;
2021:5556992. [PMID:
33986823 PMCID:
PMC8093037 DOI:
10.1155/2021/5556992]
[Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/06/2021] [Revised: 03/17/2021] [Accepted: 04/15/2021] [Indexed: 02/07/2023]
Abstract
Ensemble learning combines multiple learners to perform combinatorial learning, which has advantages of good flexibility and higher generalization performance. To achieve higher quality cancer classification, in this study, the fast correlation-based feature selection (FCBF) method was used to preprocess the data to eliminate irrelevant and redundant features. Then, the classification was carried out in the stacking ensemble learner. A library for support vector machine (LIBSVM), K-nearest neighbor (KNN), decision tree C4.5 (C4.5), and random forest (RF) were used as the primary learners of the stacking ensemble. Given the imbalanced characteristics of cancer gene expression data, the embedding cost-sensitive naive Bayes was used as the metalearner of the stacking ensemble, which was represented as CSNB stacking. The proposed CSNB stacking method was applied to nine cancer datasets to further verify the classification performance of the model. Compared with other classification methods, such as single classifier algorithms and ensemble algorithms, the experimental results showed the effectiveness and robustness of the proposed method in processing different types of cancer data. This method may therefore help guide cancer diagnosis and research.
Collapse