Baik SM, Hong KS, Lee JM, Park DJ. Integrating ensemble and machine learning models for early prediction of pneumonia mortality using laboratory tests. Heliyon 2024; 10:e34525. [PMID: 39149016 PMCID: PMC11324817 DOI: 10.1016/j.heliyon.2024.e34525]
[Received: 08/10/2023] [Revised: 07/09/2024] [Accepted: 07/10/2024] [Indexed: 08/17/2024]
Abstract
Background
The recent use of artificial intelligence (AI) in medical research is noteworthy. However, most research has focused on medical imaging. Although the importance of laboratory tests in the clinical field is acknowledged by clinicians, they are undervalued in medical AI research. Our study aims to develop an early prediction AI model for pneumonia mortality, primarily using laboratory test results.
Materials and methods
We developed a mortality prediction model using initial laboratory results and basic clinical information of patients with pneumonia. Several machine learning (ML) models and one deep learning method, the multilayer perceptron (MLP), were selected for model development. The models were tuned to optimize the area under the receiver operating characteristic curve (AUROC) and the F1-score. In addition, an ensemble model was developed by blending several models to improve the prediction performance. We used 80,940 data instances for model development.
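The blending and evaluation steps described above can be illustrated with a minimal sketch. It assumes that blending means a weighted average of each model's predicted mortality probability (soft voting); the abstract does not state the exact blending scheme, so this is an assumption. The function names and the toy probabilities are hypothetical and are not taken from the study.

```python
def blend(prob_lists, weights=None):
    """Weighted average of per-model predicted probabilities (soft voting)."""
    n_models = len(prob_lists)
    if weights is None:
        weights = [1.0] * n_models  # equal weights by default
    total = sum(weights)
    n_samples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_samples)
    ]


def auroc(y_true, scores):
    """AUROC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true, scores, threshold=0.5):
    """F1-score after thresholding probabilities into 0/1 predictions."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical toy data: 1 = died, 0 = survived. The three lists stand in
# for probabilities produced by three separately trained models.
y_true = [1, 0, 1, 0, 1, 0]
model_a = [0.9, 0.4, 0.45, 0.2, 0.7, 0.6]
model_b = [0.6, 0.1, 0.8, 0.55, 0.4, 0.3]
model_c = [0.8, 0.3, 0.6, 0.35, 0.65, 0.2]

blended = blend([model_a, model_b, model_c])
print("AUROC (model_a):", auroc(y_true, model_a))
print("AUROC (blend):  ", auroc(y_true, blended))
print("F1 (blend):     ", f1_score(y_true, blended))
```

On this toy data the averaged probabilities rank the cases better than the single model shown, which is the intuition behind blending; real gains depend on how different the base models' errors are.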
Results
Among the ML models, XGBoost exhibited the best performance (AUROC = 0.8989, accuracy = 0.88, F1-score = 0.80). MLP achieved an AUROC of 0.8498, accuracy of 0.86, and F1-score of 0.75. The ensemble model performed best among the developed models, with an AUROC of 0.9006, accuracy of 0.90, and F1-score of 0.81. Risk factors affecting pneumonia mortality were identified using the "Feature importance" technique and SHapley Additive exPlanations (SHAP). Several variables, including systolic blood pressure, serum glucose level, age, aspartate aminotransferase-to-alanine aminotransferase ratio, and monocyte-to-lymphocyte ratio, were identified as significant predictors of mortality in patients with pneumonia.
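To make the idea behind SHapley Additive exPlanations concrete, the following sketch computes exact Shapley values for a toy risk function by brute force over all feature orderings. It is not the authors' pipeline, and it does not use the SHAP library; the risk function, its coefficients, and the patient and baseline values are all hypothetical and chosen only for illustration.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values by averaging marginal contributions.

    For every ordering of the features, start from the baseline input,
    switch features to the patient's values one at a time, and credit
    each feature with the change in prediction it causes.
    """
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for i in order:
            current[i] = x[i]
            new = predict(current)
            phi[i] += new - prev
            prev = new
    return [v / len(orders) for v in phi]


def toy_risk(features):
    """Hypothetical risk score; coefficients are invented for this sketch."""
    sbp, glucose, age = features
    return -0.02 * sbp + 0.01 * glucose + 0.03 * age + 0.0001 * glucose * age


feature_names = ["systolic_bp", "serum_glucose", "age"]
patient = [90.0, 200.0, 80.0]     # hypothetical patient
reference = [120.0, 100.0, 60.0]  # hypothetical baseline (e.g. cohort average)

phi = shapley_values(toy_risk, patient, reference)
for name, value in zip(feature_names, phi):
    print(f"{name}: {value:+.3f}")
```

The contributions sum to the difference between the patient's score and the baseline score, which is the additivity property that lets per-feature attributions be read as a decomposition of a single prediction. Brute-force enumeration is only feasible for a handful of features; practical tools rely on approximations or model-specific algorithms.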
Conclusions
Our study demonstrates that the ensemble model, incorporating XGBoost, CatBoost, and LGBM techniques, outperforms individual ML and deep learning models in predicting pneumonia mortality. Our findings emphasize the importance of integrating AI techniques to leverage laboratory test data effectively, offering a promising direction for advancing AI applications in medical research and clinical decision-making.