1
Askar M, Småbrekke L, Holsbø E, Bongo LA, Svendsen K. Using network analysis modularity to group health code systems and decrease dimensionality in machine learning models. Exploratory Research in Clinical and Social Pharmacy 2024; 14:100463. [PMID: 38974056 PMCID: PMC11227014 DOI: 10.1016/j.rcsop.2024.100463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2024] [Revised: 06/03/2024] [Accepted: 06/08/2024] [Indexed: 07/09/2024] Open
Abstract
Background Machine learning (ML) prediction models in healthcare and pharmacy-related research face challenges with encoding high-dimensional Healthcare Coding Systems (HCSs) such as ICD, ATC, and DRG codes, given the trade-off between reducing model dimensionality and minimizing information loss. Objectives To investigate using Network Analysis modularity as a method to group HCSs to improve encoding in ML models. Methods The MIMIC-III dataset was utilized to create a multimorbidity network in which ICD-9 codes are the nodes and the edge weights are the number of patients sharing the same ICD-9 code pairs. A modularity detection algorithm was applied using different resolution thresholds to generate 6 sets of modules. The impact of four grouping strategies on the performance of predicting 90-day Intensive Care Unit readmissions was assessed. The grouping strategies compared were: 1) binary encoding of codes, 2) encoding codes grouped by network modules, 3) grouping codes to the highest level of the ICD-9 hierarchy, and 4) grouping using the single-level Clinical Classification Software (CCS). The same methodology was also applied to encode DRG codes, but the comparison was limited to a single modularity threshold versus binary encoding. The performance was assessed using Logistic Regression, Support Vector Machine with a non-linear kernel, and Gradient Boosting Machines algorithms. Accuracy, Precision, Recall, AUC, and F1-score with 95% confidence intervals were reported. Results Models that used modularity encoding outperformed models that used binary encoding of ungrouped codes. Accuracy improved across all algorithms, ranging from 0.736 to 0.78 for modularity encoding compared with 0.727 to 0.779 for binary encoding. AUC, recall, and precision also improved across almost all algorithms. In comparison with other grouping approaches, modularity encoding generally showed slightly higher performance in AUC, ranging from 0.813 to 0.837, and precision, ranging from 0.752 to 0.782.
Conclusions Modularity encoding enhances the performance of ML models in pharmacy research by effectively reducing dimensionality and retaining necessary information. Across the three algorithms used, models utilizing modularity encoding showed superior or comparable performance to other encoding approaches. Modularity encoding offers further advantages: it can be used for both hierarchical and non-hierarchical HCSs, it is clinically relevant, and it can enhance the clinical interpretation of ML models. A Python package has been developed to facilitate the use of the approach for future research.
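The network construction described in the Methods above can be sketched as follows. This is a minimal standard-library illustration under stated assumptions, not the authors' Python package: the toy patient records, the hand-made `modules` assignment, and all function names are hypothetical, and the partition is supplied by hand rather than found by a modularity-detection algorithm. The sketch builds co-occurrence edge weights, scores a given partition with the weighted Newman modularity Q, and encodes each patient as binary module indicators.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(patients):
    """Edge weight = number of patients whose record contains both codes."""
    edges = Counter()
    for codes in patients:
        for a, b in combinations(sorted(set(codes)), 2):
            edges[(a, b)] += 1
    return edges


def modularity(edges, modules):
    """Weighted Newman modularity Q for a given node -> module assignment."""
    total = sum(edges.values())      # m: total edge weight
    strength = Counter()             # weighted degree of each node
    internal = Counter()             # edge weight falling inside each module
    for (a, b), w in edges.items():
        strength[a] += w
        strength[b] += w
        if modules[a] == modules[b]:
            internal[modules[a]] += w
    module_strength = Counter()
    for node, s in strength.items():
        module_strength[modules[node]] += s
    return sum(
        internal[c] / total - (module_strength[c] / (2 * total)) ** 2
        for c in module_strength
    )


def encode_by_module(codes, modules):
    """One binary indicator per module: 1 if the patient has any code in it."""
    labels = sorted(set(modules.values()))
    present = {modules[c] for c in codes if c in modules}
    return [1 if label in present else 0 for label in labels]


# Toy data: hypothetical ICD-9 code lists, one list per patient.
patients = [
    ["250.00", "401.9", "272.4"],
    ["250.00", "401.9"],
    ["486", "518.81"],
    ["486", "518.81", "401.9"],
]
# Hypothetical hand-made partition; a real pipeline would detect it.
modules = {"250.00": "A", "401.9": "A", "272.4": "A", "486": "B", "518.81": "B"}

edges = cooccurrence_edges(patients)
q = modularity(edges, modules)
features = [encode_by_module(p, modules) for p in patients]
```

With five codes collapsed into two modules, each patient is described by two columns instead of five, which is the dimensionality reduction the abstract refers to.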
Affiliation(s)
- Mohsen Askar
  - Department of Pharmacy, Faculty of Health Sciences, UiT-The Arctic University of Norway, PO Box 6050, Stakkevollan, N-9037 Tromsø, Norway
- Lars Småbrekke
  - Department of Pharmacy, Faculty of Health Sciences, UiT-The Arctic University of Norway, PO Box 6050, Stakkevollan, N-9037 Tromsø, Norway
- Einar Holsbø
  - Department of Computer Science, Faculty of Science and Technology, UiT-The Arctic University of Norway, PO Box 6050, Stakkevollan, N-9037 Tromsø, Norway
- Lars Ailo Bongo
  - Department of Computer Science, Faculty of Science and Technology, UiT-The Arctic University of Norway, PO Box 6050, Stakkevollan, N-9037 Tromsø, Norway
- Kristian Svendsen
  - Department of Pharmacy, Faculty of Health Sciences, UiT-The Arctic University of Norway, PO Box 6050, Stakkevollan, N-9037 Tromsø, Norway
2
Mercurio G, Gottardelli B, Lenkowicz J, Patarnello S, Bellavia S, Scala I, Rizzo P, de Belvis AG, Del Signore AB, Maviglia R, Bocci MG, Olivi A, Franceschi F, Urbani A, Calabresi P, Valentini V, Antonelli M, Frisullo G. A novel risk score predicting 30-day hospital re-admission of patients with acute stroke by machine learning model. Eur J Neurol 2024; 31:e16153. [PMID: 38015472 PMCID: PMC11235732 DOI: 10.1111/ene.16153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Revised: 09/29/2023] [Accepted: 10/31/2023] [Indexed: 11/29/2023]
Abstract
BACKGROUND The 30-day hospital re-admission rate is a quality measure of hospital care to monitor the efficiency of the healthcare system. The hospital re-admission of acute stroke (AS) patients is often associated with higher mortality rates, greater levels of disability and increased healthcare costs. The aim of our study was to identify predictors of unplanned 30-day hospital re-admissions after discharge of AS patients and define an early re-admission risk score (RRS). METHODS This observational, retrospective study was performed on AS patients who were discharged between 2014 and 2019. Early re-admission predictors were identified by machine learning models. The performances of these models were assessed by receiver operating characteristic curve analysis. RESULTS Of 7599 patients with AS, 3699 patients met the inclusion criteria, and 304 patients (8.22%) were re-admitted within 30 days from discharge. After identifying the predictors of early re-admission by logistic regression analysis, RRS was obtained and consisted of seven variables: hemoglobin level, atrial fibrillation, brain hemorrhage, discharge home, chronic obstructive pulmonary disease, one hospitalization in the previous year, and more than one hospitalization in the previous year. The cohort of patients was then stratified into three risk categories: low (RRS = 0-1), medium (RRS = 2-3) and high (RRS >3) with re-admission rates of 5%, 8% and 14%, respectively. CONCLUSIONS The identification of risk factors for early re-admission after AS and the elaboration of a score to stratify at discharge time the risk of re-admission can provide a tool for clinicians to plan a personalized follow-up and contain healthcare costs.
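The stratification rule stated in the Results can be written as a small lookup. This is only a sketch of the three cut-offs given in the abstract; the function name is hypothetical, and the point values assigned to the seven variables are not reported here, so the score itself is taken as an input.

```python
def rrs_category(rrs):
    """Map a re-admission risk score (RRS) to the three strata in the abstract."""
    if rrs <= 1:
        return "low"      # RRS = 0-1
    if rrs <= 3:
        return "medium"   # RRS = 2-3
    return "high"         # RRS > 3


# Overall re-admission rate reported above: 304 of 3699 included patients.
readmission_rate = 304 / 3699
categories = [rrs_category(score) for score in range(6)]
```

The same arithmetic reproduces the reported 8.22% overall re-admission rate.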
Affiliation(s)
- Giovanna Mercurio
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Benedetta Gottardelli
  - Department of Diagnostic Imaging, Oncological Radiotherapy and Hematology, Università Cattolica del Sacro Cuore, Rome, Italy
- Jacopo Lenkowicz
  - Gemelli Generator RWD, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Stefano Patarnello
  - Gemelli Generator RWD, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Simone Bellavia
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Irene Scala
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Pierandrea Rizzo
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Antonio Giulio de Belvis
  - Department of Life Sciences and Public Health, Section of Hygiene, Università Cattolica del Sacro Cuore, Rome, Italy
  - Clinical Pathways and Outcome Evaluation Unit, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Anna Benedetta Del Signore
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Global Medical Department-Primary Care Unit, Angelini Pharma, Rome, Italy
- Riccardo Maviglia
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Maria Grazia Bocci
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Alessandro Olivi
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Francesco Franceschi
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Andrea Urbani
  - Catholic University of Sacred Heart, Rome, Italy
  - Department of Laboratory and Infectious Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
- Paolo Calabresi
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Vincenzo Valentini
  - Department of Diagnostic Imaging, Oncological Radiotherapy and Hematology, Università Cattolica del Sacro Cuore, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Massimo Antonelli
  - Department of Emergency Science, Anesthesiology and Intensive Care, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
  - Catholic University of Sacred Heart, Rome, Italy
- Giovanni Frisullo
  - Department of Aging, Neurological, Orthopedic and Head and Neck Sciences, Fondazione Policlinico Universitario A. Gemelli IRCCS, Rome, Italy
3
Han S, Sohn TJ, Ng BP, Park C. Predicting unplanned readmission due to cardiovascular disease in hospitalized patients with cancer: a machine learning approach. Sci Rep 2023; 13:13491. [PMID: 37596346 PMCID: PMC10439193 DOI: 10.1038/s41598-023-40552-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/25/2023] [Accepted: 08/12/2023] [Indexed: 08/20/2023] Open
Abstract
Cardiovascular disease (CVD) in cancer patients can affect the risk of unplanned readmissions, which have been reported to be costly and associated with worse mortality and prognosis. We aimed to demonstrate the feasibility of using machine learning techniques in predicting the risk of unplanned 180-day readmission attributable to CVD among hospitalized cancer patients using the 2017-2018 Nationwide Readmissions Database. We included hospitalized cancer patients, and the outcome was unplanned hospital readmission due to any CVD within 180 days after discharge. CVD included atrial fibrillation, coronary artery disease, heart failure, stroke, peripheral artery disease, cardiomegaly, and cardiomyopathy. Decision tree (DT), random forest, extreme gradient boost (XGBoost), and AdaBoost were implemented. Accuracy, precision, recall, F2 score, and area under the receiver operating characteristic curve (AUC) were used to assess model performance. Among 358,629 hospitalized patients with cancer, 5.86% (n = 21,021) experienced unplanned readmission due to any CVD. The three ensemble algorithms outperformed the DT, with XGBoost displaying the best performance. We found that length of stay, age, and cancer surgery were important predictors of CVD-related unplanned hospitalization in cancer patients. Machine learning models can predict the risk of unplanned readmission due to CVD among hospitalized cancer patients.
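The F2 score named above is the F-beta score with beta = 2, which weights recall more heavily than precision; that is a common choice when the positive class is rare, as with the 5.86% readmission rate here. A minimal sketch of the standard formula follows; the function name and the example precision and recall values are illustrative, not figures from the study.

```python
def f_beta(precision, recall, beta=2.0):
    """F-beta score: (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0  # conventional value when both terms vanish
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Illustrative values only (not taken from the paper).
f2 = f_beta(0.5, 0.8)            # recall-weighted
f1 = f_beta(0.5, 0.8, beta=1.0)  # harmonic mean of precision and recall
```

With recall higher than precision, F2 comes out above F1 for the same inputs.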
Affiliation(s)
- Sola Han
  - Health Outcomes Division, College of Pharmacy, The University of Texas at Austin, Austin, TX, 78712, USA
- Ted J Sohn
  - Health Outcomes Division, College of Pharmacy, The University of Texas at Austin, Austin, TX, 78712, USA
- Boon Peng Ng
  - College of Nursing, University of Central Florida, Orlando, FL, USA
  - Disability, Aging, and Technology Cluster, University of Central Florida, Orlando, FL, USA
- Chanhyun Park
  - Health Outcomes Division, College of Pharmacy, The University of Texas at Austin, Austin, TX, 78712, USA
4
Song X, Tong Y, Luo Y, Chang H, Gao G, Dong Z, Wu X, Tong R. Predicting 7-day unplanned readmission in elderly patients with coronary heart disease using machine learning. Front Cardiovasc Med 2023; 10:1190038. [PMID: 37614939 PMCID: PMC10442485 DOI: 10.3389/fcvm.2023.1190038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 07/24/2023] [Indexed: 08/25/2023] Open
Abstract
Background Short-term unplanned readmission is often neglected, especially for elderly patients with coronary heart disease (CHD). Moreover, tools to predict unplanned readmission are lacking. This study aimed to establish the most effective predictive model for the unplanned 7-day readmission in elderly CHD patients using machine learning (ML) algorithms. Methods The detailed clinical data of elderly CHD patients were collected retrospectively. Five ML algorithms, including extreme gradient boosting (XGB), random forest, multilayer perceptron, categorical boosting, and logistic regression, were used to establish predictive models. We used the area under the receiver operating characteristic curve (AUC), accuracy, precision, recall, the F1 value, the Brier score, the area under the precision-recall curve (AUPRC), and the calibration curve to evaluate the performance of ML models. The SHapley Additive exPlanations (SHAP) value was used to interpret the best model. Results The final study included 834 elderly CHD patients, whose average age was 73.5 ± 8.4 years, among whom 426 (51.08%) were men and 139 had 7-day unplanned readmissions. The XGB model had the best performance, exhibiting the highest AUC (0.9729), accuracy (0.9173), F1 value (0.9134), and AUPRC (0.9766). The Brier score of the XGB model was 0.08. The calibration curve of the XGB model showed good performance. The SHAP method showed that fracture, hypertension, length of stay, aspirin, and D-dimer were the most important indicators for the risk of 7-day unplanned readmissions. The top 10 variables were used to build a compact XGB, which also showed good predictive performance. Conclusions In this study, five ML algorithms were used to predict 7-day unplanned readmissions in elderly patients with CHD. The XGB model had the best predictive performance and shows potential for clinical application.
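The Brier score reported above is the mean squared difference between predicted probabilities and observed binary outcomes, so lower is better and a constant prediction of 0.5 scores 0.25. A minimal sketch follows; the function name and the toy labels and probabilities are illustrative and are not data from the study.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy example (hypothetical labels and predicted probabilities).
labels = [1, 0, 1, 0]
score = brier_score(labels, [0.9, 0.2, 0.6, 0.1])
uninformative = brier_score(labels, [0.5, 0.5, 0.5, 0.5])
```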
Affiliation(s)
- Xuewu Song
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Yitong Tong
  - Chengdu Second People’s Hospital, Chengdu, China
- Yi Luo
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Huan Chang
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Guangjie Gao
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Ziyi Dong
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Xingwei Wu
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
- Rongsheng Tong
  - Department of Pharmacy, Sichuan Provincial People’s Hospital, University of Electronic Science and Technology of China, Chengdu, China
  - Chinese Academy of Sciences Sichuan Translational Medicine Research Hospital, Chengdu, China
5
Ru B, Tan X, Liu Y, Kannapur K, Ramanan D, Kessler G, Lautsch D, Fonarow G. Comparison of Machine Learning Algorithms for Predicting Hospital Readmissions and Worsening Heart Failure Events in Patients With Heart Failure With Reduced Ejection Fraction: Modeling Study. JMIR Form Res 2023; 7:e41775. [PMID: 37067873 PMCID: PMC10152335 DOI: 10.2196/41775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2022] [Revised: 02/16/2023] [Accepted: 02/19/2023] [Indexed: 04/18/2023] Open
Abstract
BACKGROUND Heart failure (HF) is highly prevalent in the United States. Approximately one-third to one-half of HF cases are categorized as HF with reduced ejection fraction (HFrEF). Patients with HFrEF are at risk of worsening HF, have a high risk of adverse outcomes, and experience higher health care use and costs. Therefore, it is crucial to identify patients with HFrEF who are at high risk of subsequent events after HF hospitalization. OBJECTIVE Machine learning (ML) has been used to predict HF-related outcomes. The objective of this study was to compare different ML prediction models and feature construction methods to predict 30-, 90-, and 365-day hospital readmissions and worsening HF events (WHFEs). METHODS We used the Veradigm PINNACLE outpatient registry linked to Symphony Health's Integrated Dataverse data from July 1, 2013, to September 30, 2017. Adults with a confirmed diagnosis of HFrEF and HF-related hospitalization were included. WHFEs were defined as HF-related hospitalizations or outpatient intravenous diuretic use within 1 year of the first HF hospitalization. We used different approaches to construct ML features from clinical codes, including frequencies of clinical classification software (CCS) categories, Bidirectional Encoder Representations From Transformers (BERT) trained with CCS sequences (BERT + CCS), BERT trained on raw clinical codes (BERT + raw), and prespecified features based on clinical knowledge. A multilayer perceptron neural network, extreme gradient boosting (XGBoost), random forest, and logistic regression prediction models were applied and compared. RESULTS A total of 30,687 adult patients with HFrEF were included in the analysis; 11.41% (3184/27,917) of adults experienced a hospital readmission within 30 days of their first HF hospitalization, and nearly half (9231/21,562, 42.81%) of the patients experienced at least 1 WHFE within 1 year after HF hospitalization. 
The prediction models and feature combinations with the best area under the receiver operating characteristic curve (AUC) for each outcome were XGBoost with CCS frequency (AUC=0.595) for 30-day readmission, random forest with CCS frequency (AUC=0.630) for 90-day readmission, XGBoost with CCS frequency (AUC=0.649) for 365-day readmission, and XGBoost with CCS frequency (AUC=0.640) for WHFEs. Our ML models could discriminate between readmission and WHFE among patients with HFrEF. Our model performance was mediocre, especially for the 30-day readmission events, most likely owing to limitations of the data, including an imbalance between positive and negative cases and high missing rates of many clinical variables and outcome definitions. CONCLUSIONS We predicted readmissions and WHFEs after HF hospitalizations in patients with HFrEF. Features identified by data-driven approaches may be comparable with those identified by clinical domain knowledge. Future work may be warranted to validate and improve the models using more longitudinal electronic health records that are complete, are comprehensive, and have a longer follow-up time.
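The "CCS frequency" feature construction mentioned above can be illustrated as counting, per patient, how many recorded codes fall into each category. The sketch below is an assumption about the general idea rather than the authors' pipeline: the function name is hypothetical, and the mapping uses placeholder identifiers, not the real CCS tables.

```python
from collections import Counter


def ccs_frequency_features(codes, code_to_category, categories):
    """Count how often each category occurs among a patient's codes."""
    counts = Counter(
        code_to_category[c] for c in codes if c in code_to_category
    )
    # Fixed column order so every patient yields a vector of the same length.
    return [counts[cat] for cat in categories]


# Placeholder mapping and history (NOT real clinical codes or CCS categories).
code_to_category = {"CODE_A": "CAT_1", "CODE_B": "CAT_1", "CODE_C": "CAT_2"}
categories = ["CAT_1", "CAT_2", "CAT_3"]
history = ["CODE_A", "CODE_B", "CODE_A", "CODE_C", "CODE_X"]

features = ccs_frequency_features(history, code_to_category, categories)
```

Unmapped codes such as `CODE_X` are skipped in this sketch; a real pipeline would need an explicit policy for them.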
Affiliation(s)
- Boshu Ru
  - Merck & Co, Inc, Rahway, NJ, United States
- Xi Tan
  - Merck & Co, Inc, Rahway, NJ, United States
- Yu Liu
  - Merck & Co, Inc, Rahway, NJ, United States
- Garin Kessler
  - Amazon Web Services Inc, Seattle, WA, United States
  - School of Continuing Studies, Georgetown University, Washington, DC, United States
- Gregg Fonarow
  - Ahmanson-UCLA Cardiomyopathy Center, University of California, Los Angeles, Los Angeles, CA, United States
6
Tang S, Tariq A, Dunnmon JA, Sharma U, Elugunti P, Rubin DL, Patel BN, Banerjee I. Predicting 30-day all-cause hospital readmission using multimodal spatiotemporal graph neural networks. IEEE J Biomed Health Inform 2023; PP:10.1109/JBHI.2023.3236888. [PMID: 37018684 PMCID: PMC11073780 DOI: 10.1109/jbhi.2023.3236888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/15/2023]
Abstract
Reduction in 30-day readmission rate is an important quality factor for hospitals as it can reduce the overall cost of care and improve patient post-discharge outcomes. While deep-learning-based studies have shown promising empirical results, several limitations exist in prior models for hospital readmission prediction, such as: (a) only patients with certain conditions are considered, (b) do not leverage data temporality, (c) individual admissions are assumed independent of each other, which ignores patient similarity, (d) limited to single modality or single center data. In this study, we propose a multimodal, spatiotemporal graph neural network (MM-STGNN) for prediction of 30-day all-cause hospital readmission, which fuses in-patient multimodal, longitudinal data and models patient similarity using a graph. Using longitudinal chest radiographs and electronic health records from two independent centers, we show that MM-STGNN achieved an area under the receiver operating characteristic curve (AUROC) of 0.79 on both datasets. Furthermore, MM-STGNN significantly outperformed the current clinical reference standard, LACE+ (AUROC=0.61), on the internal dataset. For subset populations of patients with heart disease, our model significantly outperformed baselines, such as gradient-boosting and Long Short-Term Memory models (e.g., AUROC improved by 3.7 points in patients with heart disease). Qualitative interpretability analysis indicated that while patients' primary diagnoses were not explicitly used to train the model, features crucial for model prediction may reflect patients' diagnoses. Our model could be utilized as an additional clinical decision aid during discharge disposition and triaging high-risk patients for closer post-discharge follow-up for potential preventive measures.
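The AUROC values compared above can be read as the probability that a randomly chosen readmitted patient receives a higher score than a randomly chosen non-readmitted patient, with ties counted as one half. A minimal pairwise sketch of that definition follows; the function name and the toy labels and scores are illustrative, and the quadratic loop is only suitable for small examples.

```python
def auroc(y_true, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and model scores).
auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```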
7
Davis S, Zhang J, Lee I, Rezaei M, Greiner R, McAlister FA, Padwal R. Effective hospital readmission prediction models using machine-learned features. BMC Health Serv Res 2022; 22:1415. [PMID: 36434628 PMCID: PMC9700920 DOI: 10.1186/s12913-022-08748-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/12/2022] [Revised: 10/05/2022] [Accepted: 10/14/2022] [Indexed: 11/26/2022] Open
Abstract
BACKGROUND Hospital readmissions are one of the costliest challenges facing healthcare systems, but conventional models fail to predict readmissions well. Many existing models use exclusively manually-engineered features, which are labor intensive and dataset-specific. Our objective was to develop and evaluate models to predict hospital readmissions using derived features that are automatically generated from longitudinal data using machine learning techniques. METHODS We studied patients discharged from acute care facilities in 2015 and 2016 in Alberta, Canada, excluding those who were hospitalized to give birth or for a psychiatric condition. We used population-level linked administrative hospital data from 2011 to 2017 to train prediction models using both manually derived features and features generated automatically from observational data. The target value of interest was 30-day all-cause hospital readmissions, with the success of prediction measured using the area under the curve (AUC) statistic. RESULTS Data from 428,669 patients (62% female, 38% male, 27% 65 years or older) were used for training and evaluating models: 24,974 (5.83%) were readmitted within 30 days of discharge for any reason. Patients were more likely to be readmitted if they utilized hospital care more, had more physician office visits, had more prescriptions, had a chronic condition, or were 65 years old or older. The LACE readmission prediction model had an AUC of 0.66 ± 0.0064 while the machine learning model's test set AUC was 0.83 ± 0.0045, based on learning a gradient boosting machine on a combination of machine-learned and manually-derived features. CONCLUSION Applying a machine learning model to the computer-generated and manual features improved prediction accuracy over the LACE model and a model that used only manually-derived features. Our model can be used to identify high-risk patients, for whom targeted interventions may potentially prevent readmissions.
Affiliation(s)
- Sacha Davis
  - Department of Computing Science, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
- Jin Zhang
  - Alberta School of Business, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
- Ilbin Lee
  - Alberta School of Business, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
- Mostafa Rezaei
  - ESCP Business School, Paris, France (GRID: grid.462233.2; ISNI: 0000 0001 1544 4083)
- Russell Greiner
  - Department of Computing Science, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
  - Alberta Machine Intelligence Institute, Edmonton, AB, Canada
- Finlay A. McAlister
  - Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
- Raj Padwal
  - Medicine and Dentistry, University of Alberta, Edmonton, AB, Canada (GRID: grid.17089.37; ISNI: 0000 0001 2190 316X)
8
Gopukumar D, Ghoshal A, Zhao H. A Machine Learning Approach for Predicting Readmission Charges Billed by Hospitals. JMIR Med Inform 2022; 10:e37578. [PMID: 35896038 PMCID: PMC9472041 DOI: 10.2196/37578] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/25/2022] [Revised: 05/02/2022] [Accepted: 07/26/2022] [Indexed: 11/29/2022] Open
Abstract
Background The Centers for Medicare and Medicaid Services projects that health care costs will continue to grow over the next few years. Rising readmission costs contribute significantly to increasing health care costs. Multiple areas of health care, including readmissions, have benefited from the application of various machine learning algorithms in several ways. Objective We aimed to identify suitable models for predicting readmission charges billed by hospitals. Our literature review revealed that this application of machine learning is underexplored. We used various predictive methods, ranging from glass-box models (such as regularization techniques) to black-box models (such as deep learning–based models). Methods We defined readmissions as readmission with the same major diagnostic category (RSDC) and all-cause readmission category (RADC). For these readmission categories, 576,701 and 1,091,580 individuals, respectively, were identified from the Nationwide Readmission Database of the Healthcare Cost and Utilization Project by the Agency for Healthcare Research and Quality for 2013. Linear regression, lasso regression, elastic net, ridge regression, eXtreme gradient boosting (XGBoost), and a deep learning model based on multilayer perceptron (MLP) were the 6 machine learning algorithms we tested for RSDC and RADC through 10-fold cross-validation. Results Our preliminary analysis using a data-driven approach revealed that within RADC, the subsequent readmission charge billed per patient was higher than the previous charge for 541,090 individuals, and this number was 319,233 for RSDC. The top 3 major diagnostic categories (MDCs) for such instances were the same for RADC and RSDC. The average readmission charge billed was higher than the previous charge for 21 of the MDCs in the case of RSDC, whereas it was only for 13 of the MDCs in RADC. We recommend XGBoost and the deep learning model based on MLP for predicting readmission charges. 
The following performance metrics were obtained for XGBoost: (1) RADC (mean absolute percentage error [MAPE]=3.121%; root mean squared error [RMSE]=0.414; mean absolute error [MAE]=0.317; root relative squared error [RRSE]=0.410; relative absolute error [RAE]=0.399; normalized RMSE [NRMSE]=0.040; mean absolute deviation [MAD]=0.031) and (2) RSDC (MAPE=3.171%; RMSE=0.421; MAE=0.321; RRSE=0.407; RAE=0.393; NRMSE=0.041; MAD=0.031). The performance obtained for MLP-based deep neural networks are as follows: (1) RADC (MAPE=3.103%; RMSE=0.413; MAE=0.316; RRSE=0.410; RAE=0.397; NRMSE=0.040; MAD=0.031) and (2) RSDC (MAPE=3.202%; RMSE=0.427; MAE=0.326; RRSE=0.413; RAE=0.399; NRMSE=0.041; MAD=0.032). Repeated measures ANOVA revealed that the mean RMSE differed significantly across models with P<.001. Post hoc tests using the Bonferroni correction method indicated that the mean RMSE of the deep learning/XGBoost models was statistically significantly (P<.001) lower than that of all other models, namely linear regression/elastic net/lasso/ridge regression. Conclusions Models built using XGBoost and MLP are suitable for predicting readmission charges billed by hospitals. The MDCs allow models to accurately predict hospital readmission charges.
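Several of the error metrics listed above have standard textbook definitions, sketched below for reference. The function name and the toy values are illustrative; the abstract does not state how the charges were scaled before scoring, and NRMSE and MAD are left out because their normalizations vary between sources.

```python
from math import sqrt


def regression_metrics(actual, predicted):
    """MAPE, RMSE, MAE, RRSE and RAE using their usual definitions."""
    n = len(actual)
    mean_actual = sum(actual) / n
    errors = [p - a for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)    # sum of squared errors
    sae = sum(abs(e) for e in errors)   # sum of absolute errors
    return {
        "MAPE": 100 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
        "RMSE": sqrt(sse / n),
        "MAE": sae / n,
        # Relative metrics compare against always predicting the mean.
        "RRSE": sqrt(sse / sum((a - mean_actual) ** 2 for a in actual)),
        "RAE": sae / sum(abs(a - mean_actual) for a in actual),
    }


# Toy example (hypothetical values, not charges from the study).
metrics = regression_metrics([10, 12, 14, 16], [11, 12, 13, 18])
```

RRSE and RAE below 1 indicate that the model beats a baseline that always predicts the mean of the observed values.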
Affiliation(s)
- Deepika Gopukumar
  - Department of Health and Clinical Outcomes Research, School of Medicine, Saint Louis University, SALUS Center, 3545 Lafayette Ave., 4th floor, Room 409 B, St. Louis, US
- Abhijeet Ghoshal
  - Department of Business Administration, Gies College of Business, University of Illinois Urbana-Champaign, Champaign, US
- Huimin Zhao
  - Sheldon B. Lubar College of Business, University of Wisconsin-Milwaukee, Milwaukee, US
9
Forecasting Hospital Readmissions with Machine Learning. Healthcare (Basel) 2022; 10:healthcare10060981. [PMID: 35742033 PMCID: PMC9222500 DOI: 10.3390/healthcare10060981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2022] [Revised: 05/21/2022] [Accepted: 05/21/2022] [Indexed: 11/17/2022] Open
Abstract
Hospital readmissions are regarded as a compounding economic factor for healthcare systems. In fact, the readmission rate is used in many countries as an indicator of the quality of services provided by a health institution. The ability to forecast patients’ readmissions allows for timely intervention and better post-discharge strategies, preventing future life-threatening events, and reducing medical costs to either the patient or the healthcare system. In this paper, four machine learning models are used to forecast readmissions: support vector machines with a linear kernel, support vector machines with an RBF kernel, balanced random forests, and weighted random forests. The dataset consists of 11,172 actual records of hospitalizations obtained from the General Hospital of Komotini “Sismanogleio” with a total of 24 independent variables. Each record is composed of administrative, medical-clinical, and operational variables. The experimental results indicate that the balanced random forest model outperforms the competition, reaching a sensitivity of 0.70 and an AUC value of 0.78.
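The balanced random forest named above is commonly described as growing each tree on a bootstrap sample that contains the same number of cases from every class, so the majority class is effectively down-sampled per tree. The sketch below shows only that sampling step, under the assumption that this common formulation is what was used; the function name and toy labels are hypothetical, and tree fitting is omitted.

```python
import random


def balanced_bootstrap(labels, rng):
    """Draw, with replacement, as many indices per class as the rarest class has."""
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    n_minority = min(len(indices) for indices in by_class.values())
    sample = []
    for indices in by_class.values():
        sample.extend(rng.choices(indices, k=n_minority))
    return sample


# Toy imbalanced labels: 90 non-readmitted (0) and 10 readmitted (1) records.
labels = [0] * 90 + [1] * 10
sample = balanced_bootstrap(labels, random.Random(0))
```

Each tree in the ensemble would be trained on a fresh sample of this kind.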
|
10
|
Shanbehzadeh M, Yazdani A, Shafiee M, Kazemi-Arpanahi H. Predictive modeling for COVID-19 readmission risk using machine learning algorithms. BMC Med Inform Decis Mak 2022; 22:139. [PMID: 35596167 PMCID: PMC9122247 DOI: 10.1186/s12911-022-01880-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/31/2021] [Accepted: 05/18/2022] [Indexed: 12/15/2022]
Abstract
Introduction The COVID-19 pandemic overwhelmed healthcare systems with severe shortages in hospital resources such as ICU beds, specialized doctors, and respiratory ventilators. In this situation, reducing COVID-19 readmissions could potentially maintain hospital capacity. By employing machine learning (ML), we can predict the likelihood of COVID-19 readmission risk, which can assist in the optimal allocation of restricted resources to seriously ill patients. Methods In this retrospective single-center study, the data of 1225 COVID-19 patients discharged between January 9, 2020, and October 20, 2021 were analyzed. First, the most important predictors were selected using the horse herd optimization algorithms. Then, three classical ML algorithms, including decision tree, support vector machine, and k-nearest neighbors, and a hybrid algorithm, namely water wave optimization (WWO) as a precise metaheuristic evolutionary algorithm combined with a neural network were used to construct predictive models for COVID-19 readmission. Finally, the performance of prediction models was measured, and the best-performing one was identified. Results The ML algorithms were trained using 17 validated features. Among the four selected ML algorithms, the WWO had the best average performance in tenfold cross-validation (accuracy: 0.9705, precision: 0.9729, recall: 0.9869, specificity: 0.9259, F-measure: 0.9795). Conclusions Our findings show that the WWO algorithm predicts the risk of readmission of COVID-19 patients more accurately than other ML algorithms. The models developed herein can inform frontline clinicians and healthcare policymakers to manage and optimally allocate limited hospital resources to seriously ill COVID-19 patients.
Affiliation(s)
- Mostafa Shanbehzadeh
- Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
| | - Azita Yazdani
- Clinical Education Research Center, Health Human Resources Research Center, Department of Health Information Management, School of Health Management and Information Sciences, Shiraz University of Medical Sciences, Shiraz, Iran
| | - Mohsen Shafiee
- Department of Nursing, Abadan University of Medical Sciences, Abadan, Iran
| | - Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
- Department of Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
|
11
|
Shafiekhani S, Namdar P, Rafiei S. A COVID-19 forecasting system for hospital needs using ANFIS and LSTM models: A graphical user interface unit. Digit Health 2022; 8:20552076221085057. [PMID: 35355809 PMCID: PMC8961204 DOI: 10.1177/20552076221085057] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/25/2021] [Revised: 01/13/2022] [Accepted: 02/16/2022] [Indexed: 12/23/2022]
Abstract
Background Centers for Disease Control and Prevention data showed that about 40% of coronavirus disease 2019 (COVID-19) patients who had at least one underlying medical condition were hospitalized, and nearly 33% of them needed to be admitted to the intensive care unit (ICU) to receive specialized medical services. Our study aimed to find a proper machine learning algorithm that can predict confirmed COVID-19 hospital admissions with high accuracy. Methods We obtained data on daily COVID-19 cases in regular medical inpatient units, the emergency department, and the ICU in the time window between 21 July 2020 and 21 November 2021. Data for the first 183 days (training data set) were used to train long short-term memory (LSTM) network, adaptive neuro-fuzzy inference system (ANFIS), support vector regression (SVR), and decision tree models, whilst the data for the last 60 days (test data set) were used for model validation. We used these models to predict the number of ICU and non-ICU patients. Finally, a user-friendly graphical user interface unit was designed to load any time series data (here, the trend in the population of COVID-19 patients) and train LSTM, ANFIS, SVR, or tree models to predict COVID-19 cases one week ahead. Results All models predicted the dynamics of COVID-19 cases in ICU and non-ICU wards. The values of root-mean-square error and R² as model assessment metrics showed that the ANFIS model had better predictive power than the other models. Conclusion Artificial intelligence-based forecasting models, such as the ANFIS system, deep learning approaches based on LSTM, or regression models including SVR or tree regression, play a key role in forecasting the required number of beds or other types of medical facilities during the coronavirus pandemic. Thus, the graphical user interface designed in the present study can be used for optimum management of resources by health care systems amid the COVID-19 pandemic.
Affiliation(s)
- Sajad Shafiekhani
- Department of Biomedical Engineering, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Research Center for Biomedical Technologies and Robotics, Tehran, Iran
- Students’ Scientific Research Center, Tehran University of Medical Sciences, Tehran, Iran
| | - Peyman Namdar
- Social Determinants of Health Research Center, Research Institute for Prevention of Non-Communicable Diseases, Qazvin University of Medical Sciences, Qazvin, Iran
| | - Sima Rafiei
- Department of Healthcare Management, School of Health, Qazvin University of Medical Sciences, Qazvin, Iran
|
12
|
Afrash MR, Kazemi-Arpanahi H, Shanbehzadeh M, Nopour R, Mirbagheri E. Predicting hospital readmission risk in patients with COVID-19: A machine learning approach. INFORMATICS IN MEDICINE UNLOCKED 2022; 30:100908. [PMID: 35280933 PMCID: PMC8901230 DOI: 10.1016/j.imu.2022.100908] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/18/2022] [Revised: 02/18/2022] [Accepted: 03/06/2022] [Indexed: 01/20/2023]
Abstract
Introduction The coronavirus disease 2019 (COVID-19) epidemic stunned health systems with severe scarcities in hospital resources. In this critical situation, decreasing COVID-19 readmissions could potentially sustain hospital capacity. This study aimed to select the most influential features of COVID-19 readmission and compare the capability of Machine Learning (ML) algorithms to predict COVID-19 readmission based on the selected features. Material and methods The data of 5791 hospitalized patients with COVID-19 were retrospectively recruited from a hospital registry system. The LASSO feature selection algorithm was used to select the most important features related to COVID-19 readmission. HistGradientBoosting classifier (HGB), Bagging classifier, Multi-Layered Perceptron (MLP), Support Vector Machine (SVM; kernel = linear), SVM (kernel = RBF), and Extreme Gradient Boosting (XGBoost) classifiers were used for prediction. We evaluated the performance of ML algorithms with a 10-fold cross-validation method using six performance evaluation metrics. Results Out of the 42 features, 14 were identified as the most relevant predictors. The XGBoost classifier outperformed the other six ML models with an average accuracy of 91.7%, specificity of 91.3%, sensitivity of 91.6%, F-measure of 91.8%, and AUC of 0.91. Conclusion The experimental results indicate that ML models can satisfactorily predict COVID-19 readmission. Besides considering the risk factors prioritized in this work, categorizing cases with a high risk of reinfection can make the patient triaging procedure and hospital resource utilization more effective.
Key Words
- AUC, Area under the curve
- Artificial intelligent
- CDSS, Clinical Decision Support Systems
- COVID-19
- COVID-19, Coronavirus disease 2019
- CRISP, Cross-Industry Standard Process
- Coronavirus
- HGB, Hist Gradient Boosting
- LASSO, Least Absolute Shrinkage and Selection Operator
- ML, Machine learning
- MLP, Multi-Layered Perceptron
- Machine learning
- Readmission
- SVM, Support Vector Machine
- XGBoost, Extreme Gradient Boosting
Affiliation(s)
- Mohammad Reza Afrash
- Department of Health Information Technology and Management, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan Faculty of Medical Sciences, Abadan, Iran
- Student Research Committee, Abadan Faculty of Medical Sciences, Abadan, Iran
| | - Mostafa Shanbehzadeh
- Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
| | - Raoof Nopour
- Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran
| | - Esmat Mirbagheri
- Department of Health Information Management, School of Health Management and Information Sciences, Iran University of Medical Sciences, Tehran, Iran
|