Nopour R. Prediction of five-year survival among esophageal cancer patients using machine learning. Heliyon 2023; 9:e22654. [PMID: 38125437 PMCID: PMC10730993 DOI: 10.1016/j.heliyon.2023.e22654]
[Received: 10/12/2023] [Revised: 11/16/2023] [Accepted: 11/16/2023] [Indexed: 12/23/2023]
Abstract
Background and aim
Because esophageal cancer (EC) progresses silently, predicting survival is crucial for improving the quality of life of these patients globally. So far, no machine learning (ML)-based solution for predicting EC survival has been introduced in Iran. This study therefore aims to develop an ML-based prediction model for the five-year survival of EC patients to improve clinical outcomes and inform treatment and preventive plans.
Material and methods
In this retrospective study, we investigated 1656 cases of surviving and non-surviving EC patients from Imam Khomeini Hospital in Sari City from 2013 to 2020. Multivariable regression analysis was used to select the best predictors of five-year survival. We used random forest, eXtreme Gradient Boosting, support vector machine, artificial neural network, Bayesian network, J-48 decision tree, and k-nearest neighbors algorithms to develop the prediction models. To identify the best model for predicting the five-year survival of EC, we compared them using the area under the receiver operating characteristic curve.
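As an illustration of the comparison metric only, the following minimal sketch computes the area under the receiver operating characteristic curve in its rank-based form (the probability that a randomly chosen positive case is scored above a randomly chosen negative case) and picks the highest-scoring model. The function name `roc_auc`, the labels, the scores, and the model names are all hypothetical; none of them come from the study or its data.

```python
from itertools import product


def roc_auc(labels, scores):
    """Rank-based AUC: share of (positive, negative) pairs where the
    positive case gets the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels: 1 = survived five years, 0 = did not.
y_true = [1, 1, 1, 0, 0, 0]

# Hypothetical predicted survival probabilities from two made-up models.
model_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],
    "model_b": [0.6, 0.3, 0.7, 0.8, 0.4, 0.2],
}

# Score every model with the same metric and keep the best one.
aucs = {name: roc_auc(y_true, s) for name, s in model_scores.items()}
best = max(aucs, key=aucs.get)

for name, auc in sorted(aucs.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc:.3f}")
print("best model:", best)
```

In practice the scores would come from each fitted classifier's predictions on held-out cases, and the pairwise loop would be replaced by a library routine for larger samples.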
Results
Based on the regression analysis, age at diagnosis, body mass index, smoking, obstruction, dysphagia, weight loss, lymphadenopathy, chemotherapy, radiotherapy, family history of EC, tumor stage, type of appearance, histological type, grade of differentiation, tumor location, tumor size, lymphatic invasion, vascular invasion, and platelet-to-albumin ratio were identified as the best predictors of the five-year survival of EC. The random forest, with an area under the receiver operating characteristic curve of 0.95, was identified as the superior model.
Conclusion
The experimental results of the current study suggest that the random forest model could play a significant role in enhancing the quality of care for EC patients by increasing the effectiveness of follow-up and treatment measures introduced by care providers.