1
Ding Z, Zhang L, Zhang Y, Yang J, Luo Y, Ge M, Yao W, Hei Z, Chen C. A Supervised Explainable Machine Learning Model for Perioperative Neurocognitive Disorder in Liver-Transplantation Patients and External Validation on the Medical Information Mart for Intensive Care IV Database: Retrospective Study. J Med Internet Res 2025; 27:e55046. [PMID: 39813086 PMCID: PMC11780294 DOI: 10.2196/55046] [Received: 11/30/2023] [Revised: 04/12/2024] [Accepted: 10/30/2024] [Indexed: 01/16/2025] Open
Abstract
BACKGROUND Patients undergoing liver transplantation (LT) are at risk of perioperative neurocognitive dysfunction (PND), which significantly affects patient prognosis. OBJECTIVE This study used machine learning (ML) algorithms to identify critical predictors and develop an ML model to predict PND among LT recipients. METHODS In this retrospective study, data from 958 patients who underwent LT between January 2015 and January 2020 were extracted from the Third Affiliated Hospital of Sun Yat-sen University. Six ML algorithms were used to predict post-LT PND, and model performance was evaluated using the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity, specificity, and F1-score. The best-performing model was additionally validated using a temporal external dataset of 309 LT cases from February 2020 to August 2022 and an independent external dataset of 325 patients extracted from the Medical Information Mart for Intensive Care Ⅳ (MIMIC-Ⅳ) database. RESULTS In the development cohort, 201 out of 751 (33.5%) patients were diagnosed with PND. The logistic regression model achieved the highest AUC (0.799) in the internal validation set, with comparable AUC in the temporal external (0.826) and MIMIC-Ⅳ validation sets (0.72). The top 3 features contributing to post-LT PND diagnosis, as revealed by the Shapley additive explanations method, were preoperative overt hepatic encephalopathy, platelet level, and postoperative sequential organ failure assessment score. CONCLUSIONS A real-time, logistic regression-based online predictor of post-LT PND was developed, providing a highly interoperable tool for use across medical institutions to support early risk stratification and decision making for LT recipients.
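The five evaluation metrics named in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the 0.5 decision threshold, and the toy labels and scores are assumptions made purely for illustration, and the AUC is computed here with the pairwise-ranking definition rather than any particular library routine.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def auc(y_true, y_score):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def evaluate(y_true, y_score, threshold=0.5):
    """Compute the five metrics; assumes both classes and at least one positive prediction."""
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    return {
        "auc": auc(y_true, y_score),
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }


# Toy data (hypothetical, not from the study): 1 = PND, 0 = no PND.
toy_labels = [1, 1, 1, 0, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.35, 0.3, 0.6, 0.1, 0.7]
metrics = evaluate(toy_labels, toy_scores)
print(metrics)
```

Note that AUC depends only on how the scores rank patients, whereas accuracy, sensitivity, specificity, and F1 all change with the chosen threshold.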
Affiliation(s)
- Zhendong Ding
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Linan Zhang
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yihan Zhang
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Jing Yang
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Yuheng Luo
  - Guangzhou AI & Data Cloud Technology Co., LTD, Guangzhou, China
- Mian Ge
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Weifeng Yao
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Ziqing Hei
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
- Chaojin Chen
  - Department of Anesthesiology, The Third Affiliated Hospital of Sun Yat-sen University, Guangzhou, China
2
Chen H, Yu D, Zhang J, Li J. Machine Learning for Prediction of Postoperative Delirium in Adult Patients: A Systematic Review and Meta-analysis. Clin Ther 2024; 46:1069-1081. [PMID: 39395856 DOI: 10.1016/j.clinthera.2024.09.013] [Received: 07/10/2024] [Revised: 09/03/2024] [Accepted: 09/09/2024] [Indexed: 10/14/2024]
Abstract
PURPOSE This meta-analysis aimed to evaluate the performance of machine learning (ML) models in predicting postoperative delirium (POD) and to provide guidance for clinical application. METHODS PubMed, Embase, Cochrane Library, and Web of Science databases were searched from inception to April 29, 2024. Studies that reported ML models for predicting POD in adult patients were included. Data extraction and risk of bias assessment were performed using the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis - AI (TRIPOD-AI) and Prediction model Risk Of Bias ASsessment Tool (PROBAST) tools. Meta-analysis with the area under the curve (AUC) was performed using MedCalc software. FINDINGS A total of 23 studies were included after screening. Age (n = 20, 86.95%) and Random Forest (RF) (n = 24, 17.27%) were the most frequently used feature and ML algorithm, respectively. The meta-analysis showed an overall AUC of 0.792. The ensemble models (AUC = 0.805) showed better predictive performance than single models (AUC = 0.782). Additionally, considerable variations in AUC were found among different ML algorithms, with AdaBoost (AB) demonstrating good performance with an AUC of 0.870. Notably, the generalizability of these models was uncertain due to limitations in external validation and bias assessment. IMPLICATIONS The performance of ensemble models was higher than that of single models, and the AB algorithm demonstrated better performance than the other algorithms. However, further research is needed to enhance the generalizability and transparency of ML models.
Affiliation(s)
- Hao Chen
  - Department of Anesthesiology, Hebei General Hospital, Shijiazhuang, Hebei Province, China; North China University of Science and Technology, Tangshan, China
- Dongdong Yu
  - Department of Anesthesiology, Hebei General Hospital, Shijiazhuang, Hebei Province, China
- Jing Zhang
  - Department of Anesthesiology, Hebei General Hospital, Shijiazhuang, Hebei Province, China
- Jianli Li
  - Department of Anesthesiology, Hebei General Hospital, Shijiazhuang, Hebei Province, China
3
Han H, Li R, Fu D, Zhou H, Zhan Z, Wu Y, Meng B. Revolutionizing spinal interventions: a systematic review of artificial intelligence technology applications in contemporary surgery. BMC Surg 2024; 24:345. [PMID: 39501233 PMCID: PMC11536876 DOI: 10.1186/s12893-024-02646-2] [Received: 10/12/2023] [Accepted: 10/28/2024] [Indexed: 11/09/2024] Open
Abstract
Leveraging its ability to handle large and complex datasets, artificial intelligence can uncover subtle patterns and correlations that human observation may overlook. This is particularly valuable for understanding the intricate dynamics of spinal surgery and its multifaceted impacts on patient prognosis. This review aims to delineate the role of artificial intelligence in spinal surgery. A search of the PubMed database from 1992 to 2023 was conducted using relevant English publications related to the application of artificial intelligence in spinal surgery. The search strategy involved a combination of the following keywords: "Artificial neural network," "deep learning," "artificial intelligence," "spinal," "musculoskeletal," "lumbar," "vertebra," "disc," "cervical," "cord," "stenosis," "procedure," "operation," "surgery," "preoperative," "postoperative," and "operative." A total of 1,182 articles were retrieved. After a careful evaluation of abstracts, 90 articles were found to meet the inclusion criteria for this review. Our review highlights various applications of artificial neural networks in spinal disease management, including (1) assessing surgical indications, (2) assisting in surgical procedures, (3) preoperatively predicting surgical outcomes, and (4) estimating the occurrence of various surgical complications and adverse events. By utilizing these technologies, surgical outcomes can be improved, ultimately enhancing the quality of life for patients.
Affiliation(s)
- Hao Han
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Ran Li
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Dongming Fu
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Hongyou Zhou
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Zihao Zhan
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Yi'ang Wu
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
- Bin Meng
  - Department of Orthopedics, The First Affiliated Hospital of Soochow University, Suzhou, China
4
Zong R, Ma X, Shi Y, Geng L. Can Machine Learning Models Based on Computed Tomography Radiomics and Clinical Characteristics Provide Diagnostic Value for Epstein-Barr Virus-Associated Gastric Cancer? J Comput Assist Tomogr 2024; 48:859-867. [PMID: 38924393 DOI: 10.1097/rct.0000000000001636] [Indexed: 06/28/2024]
Abstract
OBJECTIVE The aim of this study was to explore whether machine learning (ML) models based on computed tomography (CT) radiomics and clinical characteristics can differentiate Epstein-Barr virus-associated gastric cancer (EBVaGC) from non-EBVaGC. METHODS Contrast-enhanced CT images were collected from 158 patients with GC (46 EBV-positive, 112 EBV-negative) between April 2018 and February 2023. Radiomics features were extracted from the volumes of interest. A radiomics signature was built from the radiomics features using the least absolute shrinkage and selection operator logistic regression algorithm. Multivariate analyses were used to identify significant clinicoradiological variables. We developed 6 ML models for EBVaGC, including logistic regression, Extreme Gradient Boosting, random forest (RF), support vector machine, Gaussian Naive Bayes, and K-nearest neighbor algorithm. The area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AP), calibration plots, and decision curve analysis were applied to assess the effectiveness of each model. RESULTS The six ML models achieved AUCs of 0.706-0.854 and APs of 0.480-0.793 for predicting EBV status in GC. With an AUC of 0.854 and an AP of 0.793, the RF model performed the best. The forest plot of the AUC score revealed that the RF model had the most stable performance, with a standard deviation of 0.003 for the AUC score. RF also performed well in the testing dataset, with an AUC of 0.832 (95% confidence interval: 0.679-0.951), accuracy of 0.833, sensitivity of 0.857, and specificity of 0.824. CONCLUSIONS The RF model based on clinical variables and Rad_score can serve as a noninvasive tool to evaluate the EBV status of gastric cancer.
Affiliation(s)
- Ruilong Zong
  - Department of Radiology, Xuzhou Central Hospital, Xuzhou, China
- Xijuan Ma
  - Department of Radiology, Xuzhou Central Hospital, Xuzhou, China
- Yibing Shi
  - Department of Radiology, Xuzhou Central Hospital, Xuzhou, China
- Li Geng
  - Department of Radiology, The Affiliated Hospital of Xuzhou Medical University
5
Zhang Y, Ren M, Zhai W, Han J, Guo Z. Construction and validation of a risk prediction model for postoperative delirium in patients with off‑pump coronary artery bypass grafting. J Thorac Dis 2024; 16:3944-3955. [PMID: 38983165 PMCID: PMC11228710 DOI: 10.21037/jtd-24-578] [Received: 04/09/2024] [Accepted: 05/24/2024] [Indexed: 07/11/2024]
Abstract
Background Compared with cardiopulmonary bypass surgery, off-pump coronary artery bypass grafting (OPCABG) reduces trauma to the body. However, there is still a risk of neurological complications, including postoperative delirium (POD). To date, few studies have been conducted on the risk of POD in OPCABG patients, and no standardized prediction model has been established. Thus, this study sought to analyze the factors influencing POD in OPCABG patients and to construct a risk prediction model. Methods A total of 1,258 patients with OPCABG were enrolled and divided into the training set for model construction (944 cases) and the test set for model validation (314 cases). A risk prediction model for POD in OPCABG patients was established by least absolute shrinkage and selection operator (LASSO) regression and multivariate logistic regression, and a nomogram was drawn. The discrimination and calibration of the model were evaluated using the receiver operating characteristic (ROC) curve and the calibration curve. Results Eight variables [i.e., age, tissue oxygen saturation, mean arterial pressure (MAP), carotid stenosis, the anterior-posterior diameter of the aortic sinus, ventricular septum thickness, left ventricular ejection fraction (LVEF), and Mini-Mental State Examination (MMSE) scores] were screened out by the LASSO regression and multivariate logistic regression, and the model was constructed. The area under the ROC curve of the training set was 0.702 [95% confidence interval (CI): 0.662-0.743], and that of the test set was 0.658 (95% CI: 0.585-0.730). The results of the Hosmer-Lemeshow goodness-of-fit test showed that the predicted POD risk of OPCABG patients in the training and test sets was consistent with the actual POD risk (χ2=5.154, P=0.74).
Conclusions The occurrence of POD in OPCABG patients is related to age, tissue oxygen saturation, MAP, carotid artery stenosis, the anterior-posterior diameter of aortic sinus, ventricular septal thickness, LVEF, and MMSE scores. The prediction model constructed with the above variables had high predictive performance, and thus may be helpful in the early identification of such patients.
Affiliation(s)
- Ying Zhang
  - Department of Anesthesiology, Tianjin University Chest Hospital, Tianjin, China
- Min Ren
  - Tianjin Institute of Cardiovascular Diseases, Tianjin, China
- Wenqian Zhai
  - Department of Anesthesiology, Tianjin University Chest Hospital, Tianjin, China
  - Tianjin Key Laboratory of Cardiovascular Emergency and Critical Care, Tianjin, China
- Jiange Han
  - Department of Anesthesiology, Tianjin University Chest Hospital, Tianjin, China
  - Tianjin Key Laboratory of Cardiovascular Emergency and Critical Care, Tianjin, China
- Zhigang Guo
  - Tianjin Key Laboratory of Cardiovascular Emergency and Critical Care, Tianjin, China
  - Department of Cardiovascular Surgery, Tianjin University Chest Hospital, Tianjin, China
6
Kim YJ, Lee H, Woo HG, Lee SW, Hong M, Jung EH, Yoo SH, Lee J, Yon DK, Kang B. Machine learning-based model to predict delirium in patients with advanced cancer treated with palliative care: a multicenter, patient-based registry cohort. Sci Rep 2024; 14:11503. [PMID: 38769382 PMCID: PMC11106243 DOI: 10.1038/s41598-024-61627-w] [Received: 11/15/2023] [Accepted: 05/07/2024] [Indexed: 05/22/2024] Open
Abstract
This study aimed to present a new approach to predicting delirium in patients admitted to the acute palliative care unit. To achieve this, the study employed machine learning models to predict delirium in patients in palliative care and identified the significant features that influenced the model. A multicenter, patient-based registry cohort study was conducted in South Korea between January 1, 2019, and December 31, 2020. Delirium was identified by reviewing the medical records based on the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition. The study dataset included 165 patients with delirium among 2314 patients with advanced cancer admitted to the acute palliative care unit. Seven machine learning models, including extreme gradient boosting, adaptive boosting, gradient boosting, light gradient boosting, logistic regression, support vector machine, and random forest, were evaluated to predict delirium in patients with advanced cancer admitted to the acute palliative care unit. An ensemble approach was adopted to determine the optimal model. In k-fold cross-validation, the combination of extreme gradient boosting and random forest provided the best performance, achieving the following metrics: 68.83% sensitivity, 70.85% specificity, 69.84% balanced accuracy, and 74.55% area under the receiver operating characteristic curve. Performance was also validated on an isolated testing dataset, and the machine learning model was successfully deployed on a public website ( http://ai-wm.khu.ac.kr/Delirium/ ) to provide public access to delirium prediction results in patients with advanced cancer. Furthermore, using feature importance analysis, sex was determined to be the top contributor in predicting delirium, followed by a history of delirium, chemotherapy, smoking status, alcohol consumption, and living with family.
Based on a large-scale, multicenter, patient-based registry cohort, a machine learning prediction model for delirium in patients with advanced cancer was developed in South Korea. We believe that this model will assist healthcare providers in treating patients with delirium and advanced cancer.
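As a small worked check of the figures reported in the abstract above, balanced accuracy is the mean of sensitivity and specificity. The sketch below uses only the two percentages quoted in the abstract; the function name is an assumption chosen for illustration.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the arithmetic mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Percentages as reported in the abstract.
reported_sensitivity = 68.83
reported_specificity = 70.85
ba = balanced_accuracy(reported_sensitivity, reported_specificity)
print(round(ba, 2))
```

The result agrees with the 69.84% balanced accuracy stated in the abstract, so the three reported figures are mutually consistent.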
Affiliation(s)
- Yu Jung Kim
  - Division of Hematology and Medical Oncology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, South Korea
- Hayeon Lee
  - Department of Biomedical Engineering, Kyung Hee University, 1732 Deogyeong-daero, Giheung-gu, Yongin, 17104, South Korea
- Ho Geol Woo
  - Department of Neurology, Kyung Hee University Medical Center, Kyung Hee University College of Medicine, Seoul, South Korea
- Si Won Lee
  - Division of Medical Oncology, Department of Internal Medicine, Yonsei Cancer Center, Yonsei University Health System, Seoul, South Korea
  - Palliative Cancer Center, Yonsei Cancer Center, Yonsei University Health System, Seoul, South Korea
- Moonki Hong
  - Division of Medical Oncology, Department of Internal Medicine, Yonsei Cancer Center, Yonsei University Health System, Seoul, South Korea
  - Palliative Cancer Center, Yonsei Cancer Center, Yonsei University Health System, Seoul, South Korea
- Eun Hee Jung
  - Division of Hematology and Medical Oncology, Department of Internal Medicine, Seoul National University Bundang Hospital, Seoul National University College of Medicine, Seongnam, South Korea
- Shin Hye Yoo
  - Center for Palliative Care and Clinical Ethics, Seoul National University Hospital, Seoul, South Korea
- Jinseok Lee
  - Department of Biomedical Engineering, Kyung Hee University, 1732 Deogyeong-daero, Giheung-gu, Yongin, 17104, South Korea
- Dong Keon Yon
  - Center for Digital Health, Medical Science Research Institute, Kyung Hee University Medical Center, Kyung Hee University College of Medicine, Seoul, South Korea
  - Department of Pediatrics, Kyung Hee University College of Medicine, 23 Kyungheedae-ro, Dongdaemun-gu, Seoul, 02447, South Korea
- Beodeul Kang
  - Division of Medical Oncology, Department of Internal Medicine, CHA Bundang Medical Center, CHA University School of Medicine, 59 Yatap-ro, Bundang-gu, Seongnam, 13496, South Korea
7
Zhang L, Zhou X, Cao J. Establishment and validation of a heart failure risk prediction model for elderly patients after coronary rotational atherectomy based on machine learning. PeerJ 2024; 12:e16867. [PMID: 38313005 PMCID: PMC10838101 DOI: 10.7717/peerj.16867] [Received: 06/26/2023] [Accepted: 01/10/2024] [Indexed: 02/06/2024] Open
Abstract
Objective To develop and validate a heart failure risk prediction model for elderly patients after coronary rotational atherectomy (CRA) based on machine learning methods. Methods A retrospective cohort study was conducted in 303 elderly patients with severe coronary calcification. According to the occurrence of postoperative heart failure, the study subjects were divided into the heart failure group (n = 53) and the non-heart failure group (n = 250). Clinical data recorded during hospitalization were collected retrospectively. After missing values in the original data were processed and sample imbalance was addressed using the Adaptive Synthetic Sampling (ADASYN) method, the final dataset consisted of 502 samples: 250 negative samples (i.e., patients not suffering from heart failure) and 252 positive samples (i.e., patients with heart failure). The 502 samples were randomly divided in a 7:3 ratio into a training set (n = 351) and a validation set (n = 151). On the training set, logistic regression (LR), extreme gradient boosting (XGBoost), support vector machine (SVM), and light gradient boosting machine (LightGBM) algorithms were used to construct heart failure risk prediction models. Model performance was evaluated on the validation set by calculating the area under the receiver operating characteristic (ROC) curve (AUC), sensitivity, specificity, positive predictive value, negative predictive value, F1-score, and prediction accuracy. Result Postoperative heart failure occurred in 17.49% of the 303 patients. The AUCs of the LR, XGBoost, SVM, and LightGBM models in the training set were 0.872, 1.000, 0.699, and 1.000, respectively. After 10-fold cross-validation, the AUCs in the training set were 0.863, 0.972, 0.696, and 0.963, respectively. Among them, XGBoost had the highest AUC and better predictive performance, while the SVM model had the worst performance.
The XGBoost model also showed good predictive performance in the validation set (AUC = 0.972, 95% CI [0.951-0.994]). The Shapley additive explanation (SHAP) method suggested that the six characteristic variables of blood cholesterol, serum creatinine, fasting blood glucose, age, triglyceride and NT-proBNP were important positive factors for the occurrence of heart failure, and LVEF was an important negative factor. Conclusion The seven characteristic variables of blood cholesterol, blood creatinine, fasting blood glucose, NT-proBNP, age, triglyceride and LVEF are all important factors affecting the occurrence of heart failure. The prediction model of heart failure risk for elderly patients after CRA based on the XGBoost algorithm is superior to SVM, LightGBM and the traditional LR model. This model could be used to assist clinical decision-making and improve the adverse outcomes of patients after CRA.
Affiliation(s)
- Lixiang Zhang
  - Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Xiaojuan Zhou
  - Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, Anhui, China
- Jiaoyu Cao
  - Department of Cardiology, The First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Hefei, Anhui, China
8
Wang Q, Wang X, Jiang X, Lin C. Machine learning in female urinary incontinence: A scoping review. Digit Health 2024; 10:20552076241281450. [PMID: 39381822 PMCID: PMC11459541 DOI: 10.1177/20552076241281450] [Received: 06/03/2024] [Accepted: 08/20/2024] [Indexed: 10/10/2024] Open
Abstract
Introduction and Hypothesis The aim was to conduct a scoping review of the literature on the use of machine learning (ML) in female urinary incontinence (UI) over the last decade. Methods A systematic search was performed in the Medline, Google Scholar, PubMed, and Web of Science databases using the following keywords: [Urinary incontinence] and [(Machine learning) or (Predict) or (Prediction model)]. Studies were eligible if they applied ML models to explore different management processes of female UI. Data analyzed included the field of application, type of ML, input variables, and results of model validation. Results A total of 798 papers were identified, of which 23 met the inclusion criteria. The vast majority of studies applied logistic regression to establish models (91.3%, 21/23). ML was most frequently applied to predict postpartum UI (39.1%, 9/23), followed by de novo incontinence after pelvic floor surgery (34.8%, 8/23). Three papers used ML models to predict treatment outcomes, and three used ML models to assist in diagnosis. Variables for modeling included demographic characteristics, clinical data, pelvic floor ultrasound, and urodynamic parameters. The area under the receiver operating characteristic curve of these models ranged from 0.56 to 0.95, and only 11 studies reported sensitivity and specificity, with sensitivity ranging from 20% to 96.2% and specificity from 59.8% to 94.5%. Conclusion Machine learning modeling demonstrated good predictive and diagnostic abilities in some aspects of female UI, showing promising prospects for the near future. However, the lack of standardization and transparency in the validation and evaluation of the models, together with insufficient external validation, greatly diminished their applicability and reproducibility; future research is strongly recommended to focus on filling this gap.
Affiliation(s)
- Qi Wang
  - College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
  - Department of Gynecology, Fujian Maternity and Child Health Hospital, Fuzhou, China
- Xiaoxiao Wang
  - College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
  - Department of Gynecology, Fujian Maternity and Child Health Hospital, Fuzhou, China
- Xiaoxiang Jiang
  - Fujian Provincial Key Laboratory of Women and Children's Critical Diseases Research, Fuzhou, China
- Chaoqin Lin
  - College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, Fuzhou, China
  - Department of Gynecology, Fujian Maternity and Child Health Hospital, Fuzhou, China
9
Ghanem M, Ghaith AK, El-Hajj VG, Bhandarkar A, de Giorgio A, Elmi-Terander A, Bydon M. Limitations in Evaluating Machine Learning Models for Imbalanced Binary Outcome Classification in Spine Surgery: A Systematic Review. Brain Sci 2023; 13:1723. [PMID: 38137171 PMCID: PMC10741524 DOI: 10.3390/brainsci13121723] [Received: 11/24/2023] [Revised: 12/12/2023] [Accepted: 12/15/2023] [Indexed: 12/24/2023] Open
Abstract
Clinical prediction models for spine surgery applications are on the rise, with an increasing reliance on machine learning (ML) and deep learning (DL). Many of the predicted outcomes are uncommon; therefore, to ensure the models' effectiveness in clinical practice, it is crucial to evaluate them properly. This systematic review aims to identify and evaluate current research-based ML and DL models applied to spine surgery, specifically those predicting binary outcomes, with a focus on their evaluation metrics. Overall, 60 papers were included, and the findings were reported according to the PRISMA guidelines. A total of 13 papers focused on length of stay (LOS), 12 on readmissions, 12 on non-home discharge, 6 on mortality, and 5 on reoperations. The target outcomes exhibited data imbalances ranging from 0.44% to 42.4%. A total of 59 papers reported the model's area under the receiver operating characteristic curve (AUROC), 28 reported accuracy, 33 sensitivity, 29 specificity, 28 positive predictive value (PPV), 24 negative predictive value (NPV), 25 the Brier score (10 of which also provided a null-model Brier score), and 8 the F1 score. Additionally, data visualization varied among the included papers. This review discusses the use of appropriate evaluation schemes in ML and identifies several common errors and potential bias sources in the literature. Embracing these recommendations as the field advances may facilitate the integration of reliable and effective ML models in clinical settings.
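To make the imbalance point in the abstract above concrete, the following minimal Python sketch compares accuracy, the Brier score, and a null-model Brier score on a toy cohort with a 5% event rate. The data, function names, and predicted probabilities are hypothetical and are not taken from the review; the null model is assumed here to predict the observed prevalence for every patient.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def null_brier_score(y_true):
    """Brier score of a null model that predicts the observed prevalence for everyone."""
    prevalence = sum(y_true) / len(y_true)
    return brier_score(y_true, [prevalence] * len(y_true))


# Toy cohort (hypothetical): 1 event among 20 patients, i.e. 5% prevalence.
toy_outcomes = [1] + [0] * 19
# Hypothetical model output: 0.3 for the event, 0.1 for every non-event.
toy_probs = [0.3] + [0.1] * 19

# A classifier that always predicts "no event" already reaches 95% accuracy.
always_negative_accuracy = sum(1 for y in toy_outcomes if y == 0) / len(toy_outcomes)

model_brier = brier_score(toy_outcomes, toy_probs)
null_brier = null_brier_score(toy_outcomes)
print(always_negative_accuracy, model_brier, null_brier)
```

In this toy case the null-model Brier score equals prevalence × (1 − prevalence), so a raw Brier score looks small for any rare outcome and is only informative when read against that baseline.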
Affiliation(s)
- Marc Ghanem
  - Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
  - School of Medicine, Lebanese American University, Byblos 4504, Lebanon
- Abdul Karim Ghaith
  - Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- Victor Gabriel El-Hajj
  - Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
- Archis Bhandarkar
  - Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
- Andrea de Giorgio
  - Artificial Engineering, Via del Rione Sirignano, 80121 Naples, Italy
- Adrian Elmi-Terander
  - Department of Clinical Neuroscience, Karolinska Institutet, 17177 Stockholm, Sweden
  - Department of Surgical Sciences, Uppsala University, 75236 Uppsala, Sweden
- Mohamad Bydon
  - Mayo Clinic Neuro-Informatics Laboratory, Mayo Clinic, Rochester, MN 55902, USA
  - Department of Neurological Surgery, Mayo Clinic, Rochester, MN 55902, USA
10
Hu YL, Wang PY, Xie ZY, Ren GR, Zhang C, Ji HY, Xie XH, Zhuang SY, Wu XT. Interpretable Machine Learning Model to Predict Bone Cement Leakage in Percutaneous Vertebral Augmentation for Osteoporotic Vertebral Compression Fracture Based on SHapley Additive exPlanations. Global Spine J 2023:21925682231204159. [PMID: 37922496 DOI: 10.1177/21925682231204159] [Indexed: 11/05/2023] Open
Abstract
STUDY DESIGN Retrospective study. OBJECTIVES Our objective is to create interpretable machine learning (ML) models that can forecast bone cement leakage in percutaneous vertebral augmentation (PVA) for individuals with osteoporotic vertebral compression fracture (OVCF) while also identifying the associated risk factors. METHODS We incorporated data from patients (n = 425) who underwent PVA. To predict cement leakage, we devised six models based on a variety of parameters. Their predictive performance was evaluated and compared using measures of discrimination, calibration, and clinical utility. The SHapley Additive exPlanations (SHAP) methodology was used to interpret the model and evaluate the risk factors associated with cement leakage. RESULTS The occurrence rate of cement leakage was 50.4%. A binary logistic regression analysis identified cortical disruption (OR 6.880, 95% CI 4.209-11.246), the basivertebral foramen sign (OR 2.142, 95% CI 1.303-3.521), the fracture type (OR 1.683, 95% CI 1.083-2.617), and the volume of bone cement (OR 1.198, 95% CI 1.070-1.341) as independent predictors of cement leakage. The XGBoost model outperformed all others in predicting cement leakage in the testing set, with an AUC of .8819, accuracy of .8025, recall of .7872, F1 score of .8315, and precision of .881. Several important factors related to cement leakage were identified based on the analysis of SHAP values and their clinical significance. CONCLUSION The ML-based predictive model demonstrated significant accuracy in forecasting bone cement leakage for patients with OVCF undergoing PVA. When combined with SHAP, ML facilitated personalized prediction and offered a visual interpretation of feature importance.
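The F1 score reported in the abstract above is the harmonic mean of precision and recall, so it can be recomputed from the other two reported values. The short Python sketch below does that; the function name is an assumption for illustration, and the inputs are the precision and recall quoted in the abstract.

```python
def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported for the XGBoost model in the testing set.
reported_precision = 0.881
reported_recall = 0.7872
f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(round(f1, 4))
```

Rounded to four decimals, the recomputed value matches the reported F1 score of .8315.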
Affiliation(s)
- Yi-Li Hu
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Pei-Yang Wang
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Zhi-Yang Xie
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Guan-Rui Ren
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Cong Zhang
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Hang-Yu Ji
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Xin-Hui Xie
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Su-Yang Zhuang
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
- Xiao-Tao Wu
  - Department of Spine Surgery, ZhongDa Hospital, School of Medicine, Southeast University, Nanjing, China
|
11
|
Matsumoto K, Nohara Y, Sakaguchi M, Takayama Y, Fukushige S, Soejima H, Nakashima N, Kamouchi M. Temporal Generalizability of Machine Learning Models for Predicting Postoperative Delirium Using Electronic Health Record Data: Model Development and Validation Study. JMIR Perioper Med 2023; 6:e50895. [PMID: 37883164 PMCID: PMC10636625 DOI: 10.2196/50895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2023] [Revised: 09/24/2023] [Accepted: 09/29/2023] [Indexed: 10/27/2023] Open
Abstract
BACKGROUND Although machine learning models demonstrate significant potential in predicting postoperative delirium, the advantages of their implementation in real-world settings remain unclear and require a comparison with conventional models in practical applications. OBJECTIVE The objective of this study was to validate the temporal generalizability of decision tree ensemble and sparse linear regression models for predicting delirium after surgery compared with that of the traditional logistic regression model. METHODS The health record data of patients hospitalized at an advanced emergency and critical care medical center in Kumamoto, Japan, were collected electronically. We developed a decision tree ensemble model using extreme gradient boosting (XGBoost) and a sparse linear regression model using least absolute shrinkage and selection operator (LASSO) regression. To evaluate the predictive performance of the models, we used the area under the receiver operating characteristic curve (AUROC) and the Matthews correlation coefficient (MCC) to measure discrimination, and the slope and intercept of the regression between predicted and observed probabilities to measure calibration. The Brier score was evaluated as an overall performance metric. We included 11,863 consecutive patients who underwent surgery with general anesthesia between December 2017 and February 2022. The patients were divided into a derivation cohort before the COVID-19 pandemic and a validation cohort during the COVID-19 pandemic. Postoperative delirium was diagnosed according to the confusion assessment method. RESULTS A total of 6497 patients (mean age 68.5, SD 14.4 years; women n=2627, 40.4%) were included in the derivation cohort, and 5366 patients (mean age 67.8, SD 14.6 years; women n=2105, 39.2%) were included in the validation cohort. Regarding discrimination, the XGBoost model (AUROC 0.87-0.90 and MCC 0.34-0.44) did not significantly outperform the LASSO model (AUROC 0.86-0.89 and MCC 0.34-0.41). The logistic regression model (AUROC 0.84-0.88, MCC 0.33-0.40, slope 1.01-1.19, intercept -0.16 to 0.06, and Brier score 0.06-0.07), with 8 predictors (age, intensive care unit, neurosurgery, emergency admission, anesthesia time, BMI, blood loss during surgery, and use of an ambulance), achieved good predictive performance. CONCLUSIONS The XGBoost model did not significantly outperform the LASSO model in predicting postoperative delirium. Furthermore, a parsimonious logistic model with a few important predictors achieved comparable performance to machine learning models in predicting postoperative delirium.
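Two of the metrics named in this abstract, the Brier score and the Matthews correlation coefficient (MCC), have short closed-form definitions. The sketch below computes both in plain Python on a small made-up set of labels and predicted probabilities. The data, the 0.5 decision threshold, and the function names are illustrative assumptions; none of them come from the study.

```python
import math


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def matthews_corrcoef(y_true, y_pred):
    """MCC computed from the four cells of the binary confusion matrix."""
    tp = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 1)
    tn = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 0)
    fp = sum(1 for y, q in zip(y_true, y_pred) if y == 0 and q == 1)
    fn = sum(1 for y, q in zip(y_true, y_pred) if y == 1 and q == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy data (1 = delirium, 0 = no delirium); not from the study.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.8, 0.1, 0.3, 0.6, 0.2, 0.7, 0.1, 0.4]

# Hypothetical 0.5 threshold to turn probabilities into class labels.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

brier = brier_score(y_true, y_prob)
mcc = matthews_corrcoef(y_true, y_pred)
print(f"Brier score: {brier:.3f}, MCC: {mcc:.3f}")
```

A lower Brier score indicates better overall probabilistic accuracy, while MCC ranges from -1 to 1 and, unlike accuracy, stays informative when the outcome classes are imbalanced.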
Affiliation(s)
- Yasunobu Nohara
  - Big Data Science and Technology, Faculty of Advanced Science and Technology, Kumamoto University, Kumamoto, Japan
- Mikako Sakaguchi
  - Department of Nursing, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Yohei Takayama
  - Department of Nursing, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Syota Fukushige
  - Department of Inspection, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Hidehisa Soejima
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto, Japan
- Naoki Nakashima
  - Medical Information Center, Kyushu University Hospital, Fukuoka, Japan
- Masahiro Kamouchi
  - Department of Health Care Administration and Management, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
  - Center for Cohort Studies, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
|
12
|
Ren Y, Zhang Y, Zhan J, Sun J, Luo J, Liao W, Cheng X. Machine learning for prediction of delirium in patients with extensive burns after surgery. CNS Neurosci Ther 2023; 29:2986-2997. [PMID: 37122154 PMCID: PMC10493655 DOI: 10.1111/cns.14237] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2023] [Revised: 02/23/2023] [Accepted: 04/15/2023] [Indexed: 05/02/2023] Open
Abstract
AIMS Machine learning-based identification of key variables and prediction of postoperative delirium in patients with extensive burns. METHODS Five hundred and eighteen patients with extensive burns who underwent surgery were included and randomly divided into a training set, a validation set, and a testing set. Multifactorial logistic regression analysis was used to screen for significant variables. Nine prediction models were constructed in the training and validation sets (80% of dataset). The testing set (20% of dataset) was used to further evaluate the models. The area under the receiver operating curve (AUROC) was used to compare model performance. SHapley Additive exPlanations (SHAP) was used to interpret the best model, which was then externally validated in another large tertiary hospital. RESULTS Seven variables were used in the development of nine prediction models: physical restraint, diabetes, sex, preoperative hemoglobin, acute physiological and chronic health assessment, time in the Burn Intensive Care Unit, and total body surface area. Random Forest (RF) outperformed the other eight models in terms of predictive performance (AUROC: 84.00%). When external validation was performed, RF performed well (accuracy: 77.12%, sensitivity: 67.74%, and specificity: 80.46%). CONCLUSION The first machine learning-based delirium prediction model for patients with extensive burns was successfully developed and validated. Patients at high risk of delirium can be effectively identified, and targeted interventions can be made to reduce the incidence of delirium.
Affiliation(s)
- Yujie Ren
  - Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Yu Zhang
  - Medical Innovation Center, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Jianhua Zhan
  - Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Junfeng Sun
  - Medical Center of Burns and Plastic, Ganzhou People's Hospital, Ganzhou, China
- Jinhua Luo
  - Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Wenqiang Liao
  - Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, China
- Xing Cheng
  - Medical Center of Burn Plastic and Wound Repair, The First Affiliated Hospital of Nanchang University, Nanchang, China
|
13
|
Mueller B, Street WN, Carnahan RM, Lee S. Evaluating the performance of machine learning methods for risk estimation of delirium in patients hospitalized from the emergency department. Acta Psychiatr Scand 2023; 147:493-505. [PMID: 36999191 PMCID: PMC10147581 DOI: 10.1111/acps.13551] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/02/2022] [Revised: 03/06/2023] [Accepted: 03/23/2023] [Indexed: 04/01/2023]
Abstract
INTRODUCTION Delirium is a cerebral dysfunction seen commonly in the acute care setting. It is associated with increased mortality and morbidity and is frequently missed in the emergency department (ED) and inpatient care by clinical gestalt alone. Identifying those at risk of delirium may help prioritize screening and interventions in the hospital setting. OBJECTIVE Our objective was to leverage electronic health records to identify a clinically valuable risk estimation model for prevalent delirium in patients being transferred from the ED to inpatient units. METHODS This was a retrospective cohort study to develop and validate a risk model to detect delirium using patient data available from prior visits and the ED encounter. Electronic health records were extracted for patients hospitalized from the ED between January 1, 2014, and December 31, 2020. Eligible patients were aged 65 or older, admitted to an inpatient unit from the ED, and had at least one DOSS assessment or CAM-ICU recorded within 72 h of hospitalization. Six machine learning models were developed to estimate the risk of delirium using clinical variables including demographic features, physiological measurements, medications administered, lab results, and diagnoses. RESULTS A total of 28,531 patients met the inclusion criteria, with 8057 (28.4%) having a positive delirium screening within the outcome observation period. Machine learning models were compared using the area under the receiver operating curve (AUC). The gradient boosted machine achieved the best performance with an AUC of 0.839 (95% CI, 0.837-0.841). At a 90% sensitivity threshold, this model achieved a specificity of 53.5% (95% CI 53.0%-54.0%), a positive predictive value of 43.5% (95% CI 43.2%-43.9%), and a negative predictive value of 93.1% (95% CI 93.1%-93.2%). A random forest model and L1-penalized logistic regression also demonstrated notable performance, with AUCs of 0.837 (95% CI, 0.835-0.838) and 0.831 (95% CI, 0.830-0.833), respectively. CONCLUSION This study demonstrated the use of machine learning algorithms to identify a combination of variables that enables estimation of the risk of positive delirium screens early in hospitalization, which can inform the development of prevention or management protocols.
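The positive and negative predictive values quoted in this abstract follow from sensitivity, specificity, and prevalence through Bayes' rule. The following minimal sketch recomputes them from the rounded figures in the abstract (90% sensitivity, 53.5% specificity, 8057 positive screens among 28,531 patients). The function name `predictive_values` is an illustrative assumption, and because the inputs are rounded, the outputs should only be expected to land near the reported values.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Figures quoted in the abstract (rounded as published).
prevalence = 8057 / 28531   # share of patients with a positive delirium screen
sensitivity = 0.90          # operating point chosen by the authors
specificity = 0.535         # reported specificity at that operating point

ppv, npv = predictive_values(sensitivity, specificity, prevalence)
print(f"PPV: {ppv:.1%}, NPV: {npv:.1%}")
```

The recomputed values fall close to the reported PPV of 43.5% and NPV of 93.1%, which illustrates why a high-sensitivity threshold at this prevalence yields a strong rule-out (high NPV) but a modest rule-in (PPV below 50%).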
Affiliation(s)
- Brianna Mueller
  - Tippie College of Business, The University of Iowa, Iowa City, Iowa, USA
- W Nick Street
  - Tippie College of Business, The University of Iowa, Iowa City, Iowa, USA
- Ryan M Carnahan
  - Department of Epidemiology, The University of Iowa College of Public Health, Iowa City, Iowa, USA
- Sangil Lee
  - Department of Emergency Medicine, The University of Iowa, Iowa City, Iowa, USA
|
14
|
Matsumoto K, Nohara Y, Sakaguchi M, Takayama Y, Fukushige S, Soejima H, Nakashima N. Delirium Prediction Using Machine Learning Interpretation Method and Its Incorporation into a Clinical Workflow. Applied Sciences 2023; 13:1564. [DOI: 10.3390/app13031564] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 01/02/2025]
Abstract
Delirium in hospitalized patients is a worldwide problem, causing a burden on healthcare professionals and impacting patient prognosis. A machine learning interpretation method (ML interpretation method) presents the results of machine learning predictions and promotes guided decisions. This study focuses on visualizing the predictors of delirium using a ML interpretation method and implementing the analysis results in clinical practice. Retrospective data of 55,389 patients hospitalized in a single acute care center in Japan between December 2017 and February 2022 were collected. Patients were categorized into three analysis populations, according to inclusion and exclusion criteria, to develop delirium prediction models. The predictors were then visualized using Shapley additive explanation (SHAP) and fed back to clinical practice. The machine learning-based prediction of delirium in each population exhibited excellent predictive performance. SHAP was used to visualize the body mass index and albumin levels as critical contributors to delirium prediction. In addition, the cutoff value for age, which was previously unknown, was visualized, and the risk threshold for age was raised. By using the SHAP method, we demonstrated that data-driven decision support is possible using electronic medical record data.
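To illustrate the idea behind Shapley-value attributions mentioned in this abstract, the sketch below computes exact Shapley values for a tiny hand-written risk function by averaging marginal contributions over all feature orderings. It is a simplified baseline-replacement formulation, not the SHAP library and not the authors' pipeline. The feature names echo the abstract (age, BMI, albumin), but the coefficients, the patient values, and the baseline values are all hypothetical.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values by enumerating every feature ordering.

    Features not yet "revealed" take their baseline value; each feature is
    credited with the change in prediction when it switches to its actual
    value. Only practical for a handful of features.
    """
    n = len(instance)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = instance[i]
            updated = predict(current)
            phi[i] += updated - previous
            previous = updated
    return [total / len(orderings) for total in phi]


def toy_risk(x):
    """Hypothetical linear risk score; coefficients are made up."""
    age, bmi, albumin = x
    return 0.02 * age - 0.03 * bmi - 0.10 * albumin


# Hypothetical patient and reference values: [age, BMI, albumin].
instance = [80.0, 18.0, 2.5]
baseline = [65.0, 23.0, 4.0]

phi = shapley_values(toy_risk, instance, baseline)
for name, value in zip(["age", "BMI", "albumin"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for the patient and the prediction for the baseline, which is the additivity property that makes per-feature SHAP plots interpretable as a decomposition of a single prediction.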
Affiliation(s)
- Koutarou Matsumoto
  - Biostatistics Center, Graduate School of Medicine, Kurume University, Kurume 830-0011, Japan
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
- Yasunobu Nohara
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
  - Big Data Science and Technology, Faculty of Advanced Science and Technology, Kumamoto University, Kumamoto 860-8555, Japan
- Mikako Sakaguchi
  - Department of Nursing, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
- Yohei Takayama
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
  - Department of Nursing, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
- Shota Fukushige
  - Department of Laboratory, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
- Hidehisa Soejima
  - Institute for Medical Information Research and Analysis, Saiseikai Kumamoto Hospital, Kumamoto 861-4193, Japan
- Naoki Nakashima
  - Medical Information Center, Kyushu University Hospital, Fukuoka 812-8582, Japan
|