1
Karimi Z, Malak JS, Aghakhani A, Najafi MS, Ariannejad H, Zeraati H, Yekaninejad MS. Machine learning approaches to predict the need for intensive care unit admission among Iranian COVID-19 patients based on ICD-10: A cross-sectional study. Health Sci Rep 2024; 7:e70041. [PMID: 39229475 PMCID: PMC11369020 DOI: 10.1002/hsr2.70041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 07/16/2024] [Accepted: 08/16/2024] [Indexed: 09/05/2024] Open
Abstract
Background & Aim Timely identification of patients requiring intensive care unit (ICU) admission could be life-saving. We aimed to compare different machine learning algorithms for predicting the need for ICU admission in COVID-19 patients. Methods We screened all patients with COVID-19 at six academic hospitals in Tehran, which comprised our study population. A total of 44,112 COVID-19 patients (≥18 years old) were included, of whom 7722 were hospitalized. We used a Random Forest algorithm to select significant variables. Then, prediction models were developed using the Support Vector Machine, Naïve Bayes, logistic regression, LightGBM, decision tree, and K-Nearest Neighbor algorithms. Sensitivity, specificity, accuracy, F1 score, and area under the receiver operating characteristic curve (AUC) were used to compare the prediction performance of the different models. Results Based on Random Forest, the following predictors were selected: age, cardiac disease, cough, hypertension, diabetes, influenza & pneumonia, malignancy, and nervous system disease. Age was found to have the strongest association with ICU admission among COVID-19 patients. All six models achieved an AUC greater than 0.60. Naïve Bayes achieved the best predictive performance (AUC = 0.71). Conclusion Naïve Bayes and LightGBM demonstrated promising results in predicting ICU admission needs in COVID-19 patients. Machine learning models could help quickly identify high-risk patients upon entry and reduce mortality and morbidity among COVID-19 patients.
Affiliation(s)
- Zahra Karimi
  - Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Jaleh S. Malak
  - Department of Digital Health, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Amirhossein Aghakhani
  - Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Mohammad S. Najafi
  - Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
- Hamid Ariannejad
  - Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
  - Department of Artificial Intelligence in Medical Sciences, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran, Iran
- Hojjat Zeraati
  - Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Mir S. Yekaninejad
  - Department of Epidemiology and Biostatistics, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
2
Zhang LF, Chen LX, Yang WJ, Hu B. Machine learning in predicting postoperative complications in Crohn's disease. World J Gastrointest Surg 2024; 16:2745-2747. [PMID: 39220079 PMCID: PMC11362926 DOI: 10.4240/wjgs.v16.i8.2745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Revised: 07/06/2024] [Accepted: 07/15/2024] [Indexed: 08/16/2024] Open
Abstract
Crohn's disease (CD) is a chronic inflammatory bowel disease of unknown origin that can cause significant disability and morbidity with its progression. Due to the unique nature of CD, surgery is often necessary for many patients during their lifetime, and the incidence of postoperative complications is high, which can affect the prognosis of patients. Therefore, it is essential to identify and manage postoperative complications. Machine learning (ML) has become increasingly important in the medical field, and ML-based models can be used to predict postoperative complications of intestinal resection for CD. Recently, a valuable article titled "Predicting short-term major postoperative complications in intestinal resection for Crohn's disease: A machine learning-based study" was published by Wang et al. We appreciate the authors' creative work, and we are willing to share our views and discuss them with the authors.
Affiliation(s)
- Li-Fan Zhang
  - Department of Gastroenterology and Hepatology, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
  - Digestive Endoscopy Medical Engineering Research Laboratory, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
- Liu-Xiang Chen
  - Department of Gastroenterology and Hepatology, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
  - Digestive Endoscopy Medical Engineering Research Laboratory, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
- Wen-Juan Yang
  - Department of Gastroenterology and Hepatology, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
  - Digestive Endoscopy Medical Engineering Research Laboratory, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
- Bing Hu
  - Department of Gastroenterology and Hepatology, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
  - Digestive Endoscopy Medical Engineering Research Laboratory, West China Hospital, Sichuan University, Chengdu 610041, Sichuan Province, China
3
Su Y, Li Y, Zhang H, Yang W, Liu M, Luo X, Liu L. Machine learning model for prediction of permanent stoma after anterior resection of rectal cancer: A multicenter study. Eur J Surg Oncol 2024; 50:108386. [PMID: 38776864 DOI: 10.1016/j.ejso.2024.108386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2024] [Revised: 04/23/2024] [Accepted: 05/01/2024] [Indexed: 05/25/2024]
Abstract
BACKGROUND The conversion from a temporary to a permanent stoma (PS) following rectal cancer surgery significantly impacts the quality of life of patients. However, there is currently a lack of practical preoperative tools to predict PS formation. The purpose of this study is to establish a preoperative predictive model for PS using machine learning algorithms to guide clinical practice. METHODS In this retrospective study, we analyzed clinical data from a total of 655 patients who underwent anterior resection for rectal cancer, with 552 patients from one medical center and 103 from another. Through machine learning algorithms, five predictive models were developed, and each was thoroughly evaluated for predictive performance. The model with superior predictive accuracy underwent additional validation using both an independent testing cohort and the external validation cohort. The Shapley Additive exPlanations (SHAP) approach was employed to elucidate the predictive factors influencing the model, providing an in-depth visual analysis of its decision-making process. RESULTS Eight variables were selected for the construction of the model. The support vector machine (SVM) model exhibited superior predictive performance in the training set, evidenced by an AUC of 0.854 (95% CI: 0.803-0.904). This performance was corroborated in both the testing set and external validation set, where the model demonstrated an AUC of 0.851 (95% CI: 0.748-0.954) and 0.815 (95% CI: 0.710-0.919), respectively, indicating its efficacy in identifying the PS. CONCLUSIONS The model (https://yangsu2023.shinyapps.io/psrisk/) indicated robust predictive performance in identifying PS after anterior resection for rectal cancer, potentially guiding surgeons in the preoperative stratification of patients, thus informing individualized treatment plans and improving patient outcomes.
Affiliation(s)
- Yang Su
  - Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
- Yanqi Li
  - Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
- Heng Zhang
  - Department of Gastrointestinal Surgery, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, 441100, Xiangyang, China.
- Wangshuo Yang
  - Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
- Mengdie Liu
  - Department of Biliary-Pancreatic Surgery, Affiliated Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
- Xuelai Luo
  - Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
- Lu Liu
  - Department of Gastrointestinal Surgery Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China; Molecular Medicine Center, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, 430030, Wuhan, China.
4
Gu S, Zhu F. BAGAIL: Multi-modal imitation learning from imbalanced demonstrations. Neural Netw 2024; 174:106251. [PMID: 38552352 DOI: 10.1016/j.neunet.2024.106251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 01/19/2024] [Accepted: 03/18/2024] [Indexed: 04/14/2024]
Abstract
Expert demonstrations in imitation learning often contain different behavioral modes, e.g., driving modes such as driving on the left, keeping the lane, and driving on the right in the driving tasks. Although most existing multi-modal imitation learning methods allow learning from demonstrations of multiple modes, they have strict constraints on the data of each mode, generally requiring a near data ratio of all modes. Otherwise, it tends to fall into a mode collapse or only learn the data distribution of the mode that has the largest data volume. To address the problem, an algorithm that balances real-fake loss and classification loss by modifying the output of the discriminator, referred to as BAlanced Generative Adversarial Imitation Learning (BAGAIL), is proposed. With this modification, the generator is only rewarded for generating real trajectories with correct modes. BAGAIL is therefore able to deal with imbalanced expert demonstrations and carry out efficient learning for each mode. The learning process of BAGAIL is divided into a pre-training stage and an imitation learning stage. During the pre-training stage, BAGAIL initializes the generator parameters by means of conditional Behavioral Cloning, laying the foundation for the direction of parameter optimization. During the imitation learning stage, BAGAIL optimizes the parameters by using the adversary between the generator and the modified discriminator so that the finally obtained policy can successfully learn the distribution of imbalanced expert data. The experiments showed that BAGAIL accurately distinguished different behavioral modes with imbalanced demonstrations. What is more, the learning result of each mode is close to the expert standard and more stable than other multi-modal imitation learning methods.
Affiliation(s)
- Sijia Gu
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu, 215006, China.
- Fei Zhu
  - School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu, 215006, China.
5
Chaudhary R, Nourelahi M, Thoma FW, Gellad WF, Lo-Ciganic WH, Bliden KP, Gurbel PA, Neal MD, Jain SK, Bhonsale A, Mulukutla SR, Wang Y, Harinstein ME, Saba S, Visweswaran S. Machine Learning-Based Bleeding Risk Predictions in Atrial Fibrillation Patients on Direct Oral Anticoagulants. medRxiv 2024:2024.05.27.24307985. [PMID: 38854094 PMCID: PMC11160827 DOI: 10.1101/2024.05.27.24307985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Importance Accurately predicting major bleeding events in non-valvular atrial fibrillation (AF) patients on direct oral anticoagulants (DOACs) is crucial for personalized treatment and improving patient outcomes, especially with emerging alternatives like left atrial appendage closure devices. The left atrial appendage closure devices reduce stroke risk comparably but with significantly fewer non-procedural bleeding events. Objective To evaluate the performance of machine learning (ML) risk models in predicting clinically significant bleeding events requiring hospitalization and hemorrhagic stroke in non-valvular AF patients on DOACs compared to conventional bleeding risk scores (HAS-BLED, ORBIT, and ATRIA) at the index visit to a cardiologist for AF management. Design Prognostic modeling with retrospective cohort study design using electronic health record (EHR) data, with clinical follow-up at one-, two-, and five-years. Setting University of Pittsburgh Medical Center (UPMC) system. Participants 24,468 non-valvular AF patients aged ≥18 years treated with DOACs, excluding those with prior history of significant bleeding, other indications for DOACs, on warfarin or contraindicated to DOACs. Exposures DOAC therapy for non-valvular AF. Main Outcomes and Measures The primary endpoint was clinically significant bleeding requiring hospitalization within one year of index visit. The models incorporated demographic, clinical, and laboratory variables available in the EHR at the index visit. Results Among 24,468 patients, 553 (2.3%) had bleeding events within one year, 829 (3.5%) within two years, and 1,292 (5.8%) within five years of index visit. We evaluated multivariate logistic regression and ML models including random forest, classification trees, k-nearest neighbor, naive Bayes, and extreme gradient boosting (XGBoost) which modestly outperformed HAS-BLED, ATRIA, and ORBIT scores in predicting clinically significant bleeding at 1-year follow-up. 
The best-performing model (random forest) showed an area under the curve (AUC-ROC) of 0.76 (0.70-0.81), a G-Mean score of 0.67, and a net reclassification index of 0.14, compared to an AUC-ROC of 0.57 (0.50-0.63) and a G-Mean score of 0.57 for the HAS-BLED score (p-value for difference <0.001). The ML models had improved performance compared to conventional risk scores at the 2-year and 5-year time points and within the subgroup of hemorrhagic stroke. SHAP analysis identified novel risk factors, including measures from body mass index, cholesterol profile, and insurance type, beyond those used in conventional risk scores. Conclusions and Relevance Our findings demonstrate the superior performance of ML models compared to conventional bleeding risk scores and identify novel risk factors, highlighting the potential for personalized bleeding risk assessment in AF patients on DOACs.
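The abstract reports a G-Mean score without defining it. In imbalanced classification the term usually denotes the geometric mean of sensitivity and specificity, and the sketch below assumes that definition. The function name and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def g_mean(tp: int, fn: int, tn: int, fp: int) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    sensitivity = tp / (tp + fn)  # share of true events that were flagged
    specificity = tn / (tn + fp)  # share of non-events that were cleared
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts for illustration only; not the study's data.
example = g_mean(tp=30, fn=20, tn=800, fp=200)
```

Because the two rates are multiplied, a model that ignores the rare bleeding class scores near zero however high its overall accuracy, which may be why such a measure is reported alongside AUC when events are as infrequent as they are here.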
6
Liu P, Sun Y, Zhao X, Yan Y. Deep learning algorithm performance in contouring head and neck organs at risk: a systematic review and single-arm meta-analysis. Biomed Eng Online 2023; 22:104. [PMID: 37915046 PMCID: PMC10621161 DOI: 10.1186/s12938-023-01159-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Accepted: 09/21/2023] [Indexed: 11/03/2023] Open
Abstract
PURPOSE The contouring of organs at risk (OARs) in head and neck cancer radiation treatment planning is a crucial, yet repetitive and time-consuming process. Recent studies have applied deep learning (DL) algorithms to automatically contour head and neck OARs. This study aims to conduct a systematic review and meta-analysis to summarize and analyze the performance of DL algorithms in contouring head and neck OARs. The objective is to assess the advantages and limitations of DL algorithms in contour planning of head and neck OARs. METHODS This study conducted a literature search of the PubMed, Embase and Cochrane Library databases to include studies related to DL contouring of head and neck OARs, and the Dice similarity coefficient (DSC) values of four categories of OARs from the results of each study were selected as effect sizes for meta-analysis. Furthermore, this study conducted a subgroup analysis of OARs characterized by image modality and image type. RESULTS 149 articles were retrieved, and 22 studies were included in the meta-analysis after excluding duplicate literature, primary screening, and re-screening. The combined effect sizes of DSC for brainstem, spinal cord, mandible, left eye, right eye, left optic nerve, right optic nerve, optic chiasm, left parotid, right parotid, left submandibular, and right submandibular are 0.87, 0.83, 0.92, 0.90, 0.90, 0.71, 0.74, 0.62, 0.85, 0.85, 0.82, and 0.82, respectively. For subgroup analysis, the combined effect sizes for segmentation of the brainstem, mandible, left optic nerve, and left parotid gland using CT and MRI images are 0.86/0.92, 0.92/0.90, 0.71/0.73, and 0.84/0.87, respectively. Pooled effect sizes using 2D and 3D images of the brainstem, mandible, left optic nerve, and left parotid gland for contouring are 0.88/0.87, 0.92/0.92, 0.75/0.71 and 0.87/0.85. 
CONCLUSIONS The use of automated contouring technology based on DL algorithms is an essential tool for contouring head and neck OARs, achieving high accuracy, reducing the workload of clinical radiation oncologists, and providing individualized, standardized, and refined treatment plans for implementing "precision radiotherapy". Improving DL performance requires the construction of high-quality data sets and enhancing algorithm optimization and innovation.
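The effect size pooled in this meta-analysis, the Dice similarity coefficient, is defined as DSC = 2|A ∩ B| / (|A| + |B|) for an automatic contour A and a reference contour B. The following is a minimal sketch of that formula over sets of voxel coordinates. The function name, the toy coordinates, and the convention for two empty contours are assumptions made for illustration, not details from the reviewed studies.

```python
def dice_coefficient(pred: set, truth: set) -> float:
    """DSC = 2 * |A ∩ B| / (|A| + |B|) for two sets of voxel indices."""
    if not pred and not truth:
        return 1.0  # convention chosen here: two empty contours agree
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Hypothetical voxel coordinates for illustration only.
auto_contour = {(0, 0), (0, 1), (1, 0), (1, 1)}
manual_contour = {(0, 1), (1, 1), (2, 1)}
dsc = dice_coefficient(auto_contour, manual_contour)  # 2 * 2 / (4 + 3)
```

Since the denominator is the combined size of both contours, missing a few voxels of a small structure costs proportionally more overlap than missing the same number in a large one. This is one plausible reading of why the optic nerves and chiasm score lower than the mandible or brainstem in the pooled results above.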
Affiliation(s)
- Peiru Liu
  - General Hospital of Northern Theater Command, Department of Radiation Oncology, Shenyang, China
  - Beifang Hospital of China Medical University, Shenyang, China
- Ying Sun
  - General Hospital of Northern Theater Command, Department of Radiation Oncology, Shenyang, China
- Xinzhuo Zhao
  - Shenyang University of Technology, School of Electrical Engineering, Shenyang, China
- Ying Yan
  - General Hospital of Northern Theater Command, Department of Radiation Oncology, Shenyang, China.
7
Von Rehlingen-Prinz F, Leiderer M, Dehoust J, Dust T, Kowald B, Frosch KH, Izadpanah K, Henes FO, Krause M. Association of medial collateral ligament complex injuries with anterior cruciate ligament ruptures based on posterolateral tibial plateau injuries. Sports Med Open 2023; 9:70. [PMID: 37553489 PMCID: PMC10409938 DOI: 10.1186/s40798-023-00611-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Accepted: 07/12/2023] [Indexed: 08/10/2023]
Abstract
BACKGROUND The combined injury of the medial collateral ligament complex and the anterior cruciate ligament (ACL) is the most common two ligament injury of the knee. Additional injuries to the medial capsuloligamentous structures are associated with rotational instability and a high failure rate of ACL reconstruction. The study aimed to analyze the specific pattern of medial injuries and their associated risk factors, with the goal of enabling early diagnosis and initiating appropriate therapeutic interventions, if necessary. RESULTS Between January 2017 and December 2018, 151 patients with acute ACL ruptures with a mean age of 32 ± 12 years were included in this study. The MRIs performed during the acute phase were analyzed by four independent investigators-two radiologists and two orthopedic surgeons. The trauma impact on the posterolateral tibial plateau and associated injuries to the medial complex (POL, dMCL, and sMCL) were examined and revealed an injury to the medial collateral ligament complex in 34.4% of the patients. The dMCL was the most frequently injured structure (92.2%). A dMCL injury was significantly associated with an increase in trauma severity at the posterolateral tibial plateau (p < 0.02) and additional injuries to the sMCL (OR 4.702, 95% CL 1.3-133.3, p = 0.03) and POL (OR 20.818, 95% CL 5.9-84.4, p < 0.0001). Isolated injuries to the sMCL were not observed. Significant risk factors for acquiring an sMCL injury were age (p < 0.01) and injury to the lateral meniscus (p < 0.01). CONCLUSION In about one-third of acute ACL ruptures the medial collateral ligament complex is also injured. This might be associated with an increased knee laxity as well as anteromedial rotational instability. Also, this might be associated with an increased risk for failure of revision ACL reconstruction. In addition, we show risk factors and predictors that point to an injury of medial structures and facilitate their diagnosis. 
This should help physicians and surgeons to precisely diagnose and to assess its scope in order to initiate proper therapies. With this in mind, we would like to draw attention to a frequently occurring combination injury, the so-called "unlucky triad" (ACL, MCL, and lateral meniscus). Level of evidence Level III Retrospective cohort study.
Affiliation(s)
- Fidelius Von Rehlingen-Prinz
  - Department of Trauma and Orthopaedic Surgery, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20246, Hamburg, Germany
  - Department of Trauma Surgery, Orthopaedics and Sports Traumatology, BG Hospital Hamburg, Bergedorfer Str. 10, 21033, Hamburg, Germany
- Miriam Leiderer
  - Department of Diagnostic and Interventional Radiology and Nuclear Medicine, University Medical Center Hamburg-Eppendorf, Martinistrasse 52, 20251, Hamburg, Germany
- Julius Dehoust
  - Department of Trauma Surgery, Orthopaedics and Sports Traumatology, BG Hospital Hamburg, Bergedorfer Str. 10, 21033, Hamburg, Germany
- Tobias Dust
  - Department of Trauma and Orthopaedic Surgery, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20246, Hamburg, Germany
- Birgitt Kowald
  - Department of Trauma Surgery, Orthopaedics and Sports Traumatology, BG Hospital Hamburg, Bergedorfer Str. 10, 21033, Hamburg, Germany
- Karl-Heinz Frosch
  - Department of Trauma and Orthopaedic Surgery, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20246, Hamburg, Germany
  - Department of Trauma Surgery, Orthopaedics and Sports Traumatology, BG Hospital Hamburg, Bergedorfer Str. 10, 21033, Hamburg, Germany
- Kaywan Izadpanah
  - Department of Orthopaedic and Trauma Surgery, University Medical Center Freiburg, Hugstetter Strasse 55, 79106, Freiburg, Germany
- Frank Oliver Henes
  - Department of Diagnostic and Interventional Radiology and Nuclear Medicine, University Medical Center Hamburg-Eppendorf, Martinistrasse 52, 20251, Hamburg, Germany
  - Department of Diagnostic and Interventional Radiology, BG Hospital Hamburg, Bergedorfer Str. 10, 21033, Hamburg, Germany
- Matthias Krause
  - Department of Trauma and Orthopaedic Surgery, University Medical Center Hamburg-Eppendorf, Martinistraße 52, 20246, Hamburg, Germany.
8
Zakariaee SS, Naderi N, Ebrahimi M, Kazemi-Arpanahi H. Comparing machine learning algorithms to predict COVID-19 mortality using a dataset including chest computed tomography severity score data. Sci Rep 2023; 13:11343. [PMID: 37443373 PMCID: PMC10345104 DOI: 10.1038/s41598-023-38133-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 07/04/2023] [Indexed: 07/15/2023] Open
Abstract
Since the beginning of the COVID-19 pandemic, new and non-invasive digital technologies such as artificial intelligence (AI) had been introduced for mortality prediction of COVID-19 patients. The prognostic performances of the machine learning (ML)-based models for predicting clinical outcomes of COVID-19 patients had been mainly evaluated using demographics, risk factors, clinical manifestations, and laboratory results. There is a lack of information about the prognostic role of imaging manifestations in combination with demographics, clinical manifestations, and laboratory predictors. The purpose of the present study is to develop an efficient ML prognostic model based on a more comprehensive dataset including chest CT severity score (CT-SS). Fifty-five primary features in six main classes were retrospectively reviewed for 6854 suspected cases. The independence test of Chi-square was used to determine the most important features in the mortality prediction of COVID-19 patients. The most relevant predictors were used to train and test ML algorithms. The predictive models were developed using eight ML algorithms including the J48 decision tree (J48), support vector machine (SVM), multi-layer perceptron (MLP), k-nearest neighbourhood (k-NN), Naïve Bayes (NB), logistic regression (LR), random forest (RF), and eXtreme gradient boosting (XGBoost). The performances of the predictive models were evaluated using accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC) metrics. After applying the exclusion criteria, a total of 815 positive RT-PCR patients were the final sample size, where 54.85% of the patients were male and the mean age of the study population was 57.22 ± 16.76 years. The RF algorithm with an accuracy of 97.2%, the sensitivity of 100%, a precision of 94.8%, specificity of 94.5%, F1-score of 97.3%, and AUC of 99.9% had the best performance. 
Other ML algorithms with AUC ranging from 81.2 to 93.9% had also good prediction performances in predicting COVID-19 mortality. Results showed that timely and accurate risk stratification of COVID-19 patients could be performed using ML-based predictive models fed by routine data. The proposed algorithm with the more comprehensive dataset including CT-SS could efficiently predict the mortality of COVID-19 patients. This could lead to promptly targeting high-risk patients on admission, the optimal use of hospital resources, and an increased probability of survival of patients.
Affiliation(s)
- Negar Naderi
  - Department of Midwifery, Ilam University of Medical Sciences, Ilam, Iran
- Mahdi Ebrahimi
  - Department of Emergency Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Hadi Kazemi-Arpanahi
  - Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
9
Welvaars K, Oosterhoff JHF, van den Bekerom MPJ, Doornberg JN, van Haarst EP. Implications of resampling data to address the class imbalance problem (IRCIP): an evaluation of impact on performance between classification algorithms in medical data. JAMIA Open 2023; 6:ooad033. [PMID: 37266187 PMCID: PMC10232287 DOI: 10.1093/jamiaopen/ooad033] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/03/2023] [Revised: 04/04/2023] [Accepted: 05/11/2023] [Indexed: 06/03/2023] Open
Abstract
Objective When correcting for the "class imbalance" problem in medical data, the effects of resampling applied on classifier algorithms remain unclear. We examined the effect on performance over several combinations of classifiers and resampling ratios. Materials and Methods Multiple classification algorithms were trained on 7 resampled datasets: no correction, random undersampling, 4 ratios of Synthetic Minority Oversampling Technique (SMOTE), and random oversampling with the Adaptive Synthetic algorithm (ADASYN). Performance was evaluated in Area Under the Curve (AUC), precision, recall, Brier score, and calibration metrics. A case study on prediction modeling for 30-day unplanned readmissions in previously admitted Urology patients was presented. Results For most algorithms, using resampled data showed a significant increase in AUC and precision, ranging from 0.74 (CI: 0.69-0.79) to 0.93 (CI: 0.92-0.94), and 0.35 (CI: 0.12-0.58) to 0.86 (CI: 0.81-0.92) respectively. All classification algorithms showed significant increases in recall, and significant decreases in Brier score with distorted calibration overestimating positives. Discussion Imbalance correction resulted in an overall improved performance, yet poorly calibrated models. There can still be clinical utility due to a strong discriminating performance, specifically when predicting only low and high risk cases is clinically more relevant. Conclusion Resampling data resulted in increased performances in classification algorithms, yet produced an overestimation of positive predictions. Based on the findings from our case study, a thoughtful predefinition of the clinical prediction task may guide the use of resampling techniques in future studies aiming to improve clinical decision support tools.
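Of the resampling strategies named in this abstract, random undersampling is the simplest to show. The sketch below is a generic standard-library illustration, not the authors' pipeline. The function name, the seed, and the toy labels are hypothetical, and the 1:1 target ratio is an assumption, since the abstract does not state the ratio used.

```python
import random


def random_undersample(rows, labels, seed=0):
    """Drop majority-class rows at random until both classes are equal in size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    # Keep every minority row plus an equally sized random majority subset.
    kept = sorted(minority + rng.sample(majority, len(minority)))
    return [rows[i] for i in kept], [labels[i] for i in kept]


# Hypothetical toy data: 2 positives (e.g. readmissions) among 10 cases.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
rows = list(range(10))
bal_rows, bal_labels = random_undersample(rows, labels)
```

In this toy example the share of positives rises from 20% to 50% after resampling. A model trained on the balanced set sees positives far more often than it would in practice, which is one plausible mechanism behind the overestimation of positive predictions that the abstract reports.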
Affiliation(s)
- Koen Welvaars
  - Corresponding Author: Koen Welvaars, MSc, Data Science Team, OLVG, Jan Tooropstraat 164, 1061 AE Amsterdam, the Netherlands
10
Santos CY, Tuboi S, de Jesus Lopes de Abreu A, Abud DA, Lobao Neto AA, Pereira R, Siqueira JB. A machine learning model to assess potential misdiagnosed dengue hospitalization. Heliyon 2023; 9:e16634. [PMID: 37313173 PMCID: PMC10258378 DOI: 10.1016/j.heliyon.2023.e16634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2022] [Revised: 05/23/2023] [Accepted: 05/23/2023] [Indexed: 06/15/2023] Open
Abstract
Dengue, like other arboviruses with broad clinical spectra, can easily be misdiagnosed as other infectious diseases due to the overlap of signs and symptoms. During large outbreaks, severe dengue cases have the potential to overwhelm the health care system and understanding the burden of dengue hospitalizations is therefore important to better allocate medical care and public health resources. A machine learning model that used data from the Brazilian public healthcare system database and the National Institute of Meteorology (INMET) was developed to estimate potential misdiagnosed dengue hospitalizations in Brazil. The data was modeled into a hospitalization level linked dataset. Then, Random Forest, Logistic Regression and Support Vector Machine algorithms were assessed. The algorithms were trained by dividing the dataset in training/test set and performing a cross validation to select the best hyperparameters in each algorithm tested. The evaluation was done based on accuracy, precision, recall, F1 score, sensitivity, and specificity. The best model developed was Random Forest with an accuracy of 85% on the final reviewed test. This model shows that 3.4% (13,608) of all hospitalizations in the public healthcare system from 2014 to 2020 could have been dengue misdiagnosed as other diseases. The model was helpful in finding potentially misdiagnosed dengue and might be a useful tool to help public health decision makers in planning resource allocation.
Affiliation(s)
- Claudia Yang Santos
- Takeda Pharmaceuticals Brazil, Av. das Nações Unidas 14401, São Paulo, SP, Brazil
- Suely Tuboi
- Takeda Pharmaceuticals Brazil, Av. das Nações Unidas 14401, São Paulo, SP, Brazil
- Denise Alves Abud
- Takeda Pharmaceuticals Brazil, Av. das Nações Unidas 14401, São Paulo, SP, Brazil
- Ramon Pereira
- IQVIA Brazil, Rua Verbo Divino 2001, São Paulo, SP, Brazil
|
11
|
Ye Z, Zhang T, Wu C, Qiao Y, Su W, Chen J, Xie G, Dong S, Xu J, Zhao J. Predicting the Objective and Subjective Clinical Outcomes of Anterior Cruciate Ligament Reconstruction: A Machine Learning Analysis of 432 Patients: Response. Am J Sports Med 2023; 51:NP17-NP18. [PMID: 37002726 DOI: 10.1177/03635465231161060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2023]
|
12
|
Li X, Ma J. Distributed search and fusion for wine label image retrieval. PeerJ Comput Sci 2022; 8:e1116. [PMID: 36262126 PMCID: PMC9575874 DOI: 10.7717/peerj-cs.1116] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Accepted: 08/31/2022] [Indexed: 06/16/2023]
Abstract
With the popularity of wine culture and the development of artificial intelligence (AI) technology, wine label image retrieval is becoming increasingly important. Taking a wine label image as input, the goal of this task is to return the wine information that the user wants to know, such as the main brand and sub-brand of the wine. The main challenge in wine label image retrieval is that there are a large number of wine brands whose sample images are imbalanced, which strongly affects the training of retrieval systems based on deep learning. To solve this problem, this article adopts a distributed strategy and proposes two distributed retrieval frameworks. Experimental results on a large-scale wine label dataset and the Oxford flowers dataset show that both proposed distributed retrieval frameworks are effective and even greatly outperform the previous state-of-the-art retrieval models.
Affiliation(s)
- Xiaoqing Li
- School of Statistics, Capital University of Economics and Business, Beijing, China
- School of Mathematical Sciences and LMAM, Peking University, Beijing, China
- Jinwen Ma
- School of Mathematical Sciences and LMAM, Peking University, Beijing, China
|
13
|
Sun L, Hu N, Ye Y, Tan W, Wu M, Wang X, Huang Z. Ensemble stacking rockburst prediction model based on Yeo-Johnson, K-means SMOTE, and optimal rockburst feature dimension determination. Sci Rep 2022; 12:15352. [PMID: 36097043 PMCID: PMC9468028 DOI: 10.1038/s41598-022-19669-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2022] [Accepted: 09/01/2022] [Indexed: 11/09/2022] Open
Abstract
Rockburst forecasting plays a crucial role in the prevention and control of rockburst disasters. To improve the accuracy of rockburst prediction at the data structure and algorithm levels, the Yeo–Johnson transform, K-means SMOTE oversampling, and optimal rockburst feature dimension determination are used to optimize the data structure. At the algorithm optimization level, ensemble stacking rockburst prediction is performed based on the data structure optimization. First, the Yeo–Johnson transform and the K-means SMOTE algorithm are used to address, respectively, the many outliers and the class imbalance in the distribution of rockburst data. Then, based on six original rockburst features, 21 new features are generated using the PolynomialFeatures function in Sklearn. Principal component analysis (PCA) dimensionality reduction is applied to eliminate the correlations between the 27 features. Thirteen types of machine learning algorithms are used to predict datasets that retain different numbers of features after dimensionality reduction to determine the optimal rockburst feature dimension. Finally, the 14-feature rockburst dataset is used as the input for integrated stacking. The results show that the ensemble stacking model based on Yeo–Johnson, K-means SMOTE, and optimal rockburst feature dimension determination can improve the accuracy of rockburst prediction by 0.1602–0.3636. Compared with the 13 single machine learning models without data preprocessing, this data structure optimization and algorithm optimization method effectively improves the accuracy of rockburst prediction.
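The feature counts quoted in this abstract (six original features, 21 new ones, 27 in total) are consistent with a degree-2 polynomial expansion without the constant term. The abstract does not state the degree, so degree 2 is an assumption here, and the feature names are placeholders. The sketch below enumerates the monomials with the standard library rather than calling scikit-learn.

```python
from itertools import combinations_with_replacement

def polynomial_terms(names, degree):
    """All monomials of total degree 1..degree, as tuples of feature names."""
    terms = []
    for d in range(1, degree + 1):
        terms.extend(combinations_with_replacement(names, d))
    return terms

# Placeholder names for the six original rockburst features (hypothetical).
original = ["f1", "f2", "f3", "f4", "f5", "f6"]

# Assumed degree of 2: squares and pairwise products on top of the originals.
terms = polynomial_terms(original, degree=2)
new_terms = [t for t in terms if len(t) > 1]

print(len(original), len(new_terms), len(terms))  # 6 21 27
```

The 21 new terms are the 6 squares plus the 15 pairwise products. Because these terms are strongly correlated with the originals, a decorrelating step such as the PCA mentioned in the abstract is a natural follow-up.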
Affiliation(s)
- Lijun Sun
- School of Resources and Environmental Engineering, Wuhan University of Science and Technology, Wuhan, 430081, Hubei, China
- Nanyan Hu
- School of Resources and Environmental Engineering, Wuhan University of Science and Technology, Wuhan, 430081, Hubei, China
- Yicheng Ye
- School of Resources and Environmental Engineering, Wuhan University of Science and Technology, Wuhan, 430081, Hubei, China
- Wenkan Tan
- School of Resources and Environmental Engineering, Wuhan University of Science and Technology, Wuhan, 430081, Hubei, China
- Menglong Wu
- School of Resources and Environmental Engineering, Wuhan University of Science and Technology, Wuhan, 430081, Hubei, China
- Xianhua Wang
- Wuhan Safety and Environmental Protection Research Institute of Sinosteel Group, Wuhan, 430081, Hubei, China
- Zhaoyun Huang
- Hubei Jingshen Safety Technology Co., Ltd., Yichang, 443000, Hubei, China
|
14
|
Cost-Sensitive Metaheuristic Optimization-Based Neural Network with Ensemble Learning for Financial Distress Prediction. APPLIED SCIENCES-BASEL 2022. [DOI: 10.3390/app12146918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
Abstract
Financial distress prediction is crucial in the financial domain because of its implications for banks, businesses, and corporations. Serious financial losses may occur because of poor financial distress prediction. As a result, significant efforts have been made to develop prediction models that can assist decision-makers to anticipate events before they occur and avoid bankruptcy, thereby helping to improve the quality of such tasks. Because of the usual highly imbalanced distribution of data, financial distress prediction is a challenging task. Hence, a wide range of methods and algorithms have been developed over recent decades to address the classification of imbalanced datasets. Metaheuristic optimization-based artificial neural networks have shown exciting results in a variety of applications, as well as classification problems. However, less consideration has been paid to using a cost sensitivity fitness function in metaheuristic optimization-based artificial neural networks to solve the financial distress prediction problem. In this work, we propose ENS_PSONNcost and ENS_CSONNcost: metaheuristic optimization-based artificial neural networks that utilize a particle swarm optimizer and a competitive swarm optimizer and five cost sensitivity fitness functions as the base learners in a majority voting ensemble learning paradigm. Three extremely imbalanced datasets from Spanish, Taiwanese, and Polish companies were considered to avoid dataset bias. The results showed significant improvements in the g-mean (the geometric mean of sensitivity and specificity) metric and the F1 score (the harmonic mean of precision and sensitivity) while maintaining adequately high accuracy.
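The two metrics defined in this abstract can be computed directly from a confusion matrix. The sketch below is a minimal illustration with made-up counts (not data from the paper) showing why accuracy alone is misleading on a highly imbalanced dataset; the function name and the example counts are hypothetical.

```python
import math

def imbalance_metrics(tp, fn, fp, tn):
    """Accuracy, g-mean and F1 from confusion-matrix counts.

    Assumes each denominator is non-zero (at least one actual positive,
    one actual negative and one predicted positive).
    """
    sensitivity = tp / (tp + fn)   # recall on the positive (distressed) class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        # Geometric mean of sensitivity and specificity.
        "g_mean": math.sqrt(sensitivity * specificity),
        # Harmonic mean of precision and sensitivity.
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Made-up example: 10 distressed firms out of 1000, only 1 of them detected.
m = imbalance_metrics(tp=1, fn=9, fp=1, tn=989)
print({name: round(value, 3) for name, value in m.items()})
```

In this example the accuracy is 0.99 even though nine of the ten positive cases are missed, while the g-mean and F1 score both stay low, which is the behaviour that motivates reporting them for imbalanced problems.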
|