Adeoye J, Zheng LW, Thomson P, Choi SW, Su YX. Explainable ensemble learning model improves identification of candidates for oral cancer screening. Oral Oncol 2023;136:106278. [PMID: 36525782 DOI: 10.1016/j.oraloncology.2022.106278]
[Received: 11/04/2022] [Revised: 11/26/2022] [Accepted: 12/06/2022] [Indexed: 12/15/2022]
Abstract
OBJECTIVES
Artificial intelligence could improve on the crude method of using disparate risk factors to stratify patients for oral cancer screening. This study aims to construct a meta-classifier that considers diverse risk factors to identify patients at risk of oral cancer and other suspicious oral diseases for targeted screening.
MATERIALS AND METHODS
A retrospective dataset from a community oral cancer screening program was used to construct and train the novel voting meta-classifier. Comprehensive risk factor information from this dataset served as input features for eleven supervised learning algorithms, which acted as base learners and produced predicted probabilities that the meta-classifier weighted and aggregated. The training dataset was augmented using SMOTE-ENN. Additionally, Shapley additive explanations (SHAP) values were generated to make the model explainable and to display the important risk factors.
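The weighting and aggregation step can be illustrated with a minimal sketch of weighted soft voting. The abstract does not state the weights, the decision threshold, or which eleven algorithms were used, so the function name `weighted_soft_vote`, the three toy learners, the weights, and the 0.5 cut-off below are all hypothetical and are not the authors' implementation.

```python
from typing import List, Sequence


def weighted_soft_vote(
    base_probas: Sequence[Sequence[float]],
    weights: Sequence[float],
) -> List[float]:
    """Combine per-learner positive-class probabilities into one score per patient.

    base_probas[i][j] is the probability base learner i assigns to patient j.
    weights[i] is the (hypothetical) voting weight of base learner i.
    """
    if len(base_probas) != len(weights):
        raise ValueError("one weight is required per base learner")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_patients = len(base_probas[0])
    if any(len(p) != n_patients for p in base_probas):
        raise ValueError("every base learner must score the same patients")
    # Weighted average of the base learners' probabilities for each patient.
    return [
        sum(w * p[j] for w, p in zip(weights, base_probas)) / total
        for j in range(n_patients)
    ]


# Toy example (made-up numbers): three base learners scoring two patients.
probas = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.1]]
weights = [2.0, 1.0, 1.0]  # assumed weights, not taken from the study
scores = weighted_soft_vote(probas, weights)
flags = [s >= 0.5 for s in scores]  # assumed threshold for referral to screening
```

In this toy run the first patient receives an aggregated score of 0.675 and is flagged, while the second receives 0.225 and is not.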
RESULTS
Our meta-classifier had an internal validation recall, specificity, and AUROC of 0.83, 0.86, and 0.85 for identifying the risk of oral cancer and 0.92, 0.60, and 0.76 for identifying suspicious oral mucosal disease, respectively. Upon external validation, the meta-classifier had a significantly higher AUROC than the crude/current method used for identifying the risk of oral cancer (0.78 vs 0.46; p = 0.001). Also, the meta-classifier had better recall than the crude method for predicting the risk of suspicious oral mucosal diseases (0.78 vs 0.47).
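For readers less familiar with the reported metrics, the sketch below shows how recall, specificity, and AUROC can be computed from true labels and predicted scores. The labels, scores, and 0.5 threshold are invented for illustration and do not come from the study; the helper names `recall_specificity` and `auroc` are hypothetical.

```python
from typing import Sequence, Tuple


def recall_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Recall = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative.

    Ties count as half a win (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (made up): 3 patients with disease, 4 without.
y_true = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.6, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold
recall, specificity = recall_specificity(y_true, y_pred)
area = auroc(y_true, scores)
```

Recall and specificity depend on the chosen threshold, whereas AUROC summarizes ranking quality across all thresholds, which is why the abstract reports both kinds of measure.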
CONCLUSION
Overall, these findings show that our approach optimizes the use of risk factors in identifying patients for oral screening, which suggests potential clinical application.