1
Niu S, Dong R, Jiang G, Zhang Y. Identification of diagnostic signature and immune microenvironment subtypes of venous thromboembolism. Cytokine 2024; 181:156685. [PMID: 38945040 DOI: 10.1016/j.cyto.2024.156685] [Received: 03/28/2024] [Revised: 06/20/2024] [Accepted: 06/24/2024] [Indexed: 07/02/2024]
Abstract
The close link between immunity and the pathogenesis of venous thromboembolism (VTE) has been recognized but not fully elucidated. The current study was designed to identify an immune microenvironment-related signature and subtypes in VTE using explainable machine learning. We first observed an alteration of the immune microenvironment in VTE patients and identified eight key immune cells involved in VTE. PTPN6, ITGB2, CR2, FPR2, MMP9 and ISG15 were then determined to be key immune microenvironment-related genes, which could divide VTE patients into two subtypes with different immune and metabolic characteristics. We also found that prunetin and torin-2 may be the most promising agents for treating VTE patients in Cluster 1 and Cluster 2, respectively. A comparison of six machine learning models in both training and external validation sets identified XGBoost as the best model for predicting the risk of VTE, and the contribution of each immune microenvironment-related gene to the model was then interpreted. Moreover, CR2 and FPR2 distinguished VTE from controls with high accuracy and may act as diagnostic biomarkers of VTE; their expression was validated by qPCR. Collectively, the immune microenvironment-related genes PTPN6, ITGB2, CR2, FPR2, MMP9 and ISG15 are key genes involved in the pathogenesis of VTE. The VTE risk prediction model and immune microenvironment subtypes based on those genes might benefit prevention, diagnosis, and individualized treatment of VTE in clinical practice.
Affiliation(s)
- Shuai Niu
  - Department of Vascular Surgery, the Third Hospital of Hebei Medical University, Shijiazhuang, Hebei, China; Department of Vascular Surgery, Hebei General Hospital, Shijiazhuang, Hebei, China
- Ruoyu Dong
  - Department of Vascular Surgery, Hebei General Hospital, Shijiazhuang, Hebei, China
- Guangwei Jiang
  - Department of Vascular Surgery, Hebei General Hospital, Shijiazhuang, Hebei, China
- Yanrong Zhang
  - Department of Vascular Surgery, the Third Hospital of Hebei Medical University, Shijiazhuang, Hebei, China.
2
Alie MS, Negesse Y, Kindie K, Merawi DS. Machine learning algorithms for predicting COVID-19 mortality in Ethiopia. BMC Public Health 2024; 24:1728. [PMID: 38943093 PMCID: PMC11212371 DOI: 10.1186/s12889-024-19196-0] [Received: 10/23/2023] [Accepted: 06/19/2024] [Indexed: 07/01/2024] Open
Abstract
BACKGROUND Coronavirus disease 2019 (COVID-19), a global public health crisis, continues to pose challenges despite preventive measures. The daily rise in COVID-19 cases is concerning, and the testing process is both time-consuming and costly. While several models have been created to predict mortality in COVID-19 patients, only a few have shown sufficient accuracy. Machine learning algorithms offer a promising approach to data-driven prediction of clinical outcomes, surpassing traditional statistical modeling. Leveraging machine learning (ML) algorithms could potentially provide a solution for predicting mortality in hospitalized COVID-19 patients in Ethiopia. Therefore, the aim of this study is to develop and validate machine learning models for accurately predicting mortality in hospitalized COVID-19 patients in Ethiopia. METHODS Our study involved analyzing electronic medical records of COVID-19 patients who were admitted to public hospitals in Ethiopia. Specifically, we developed seven different machine learning models to predict COVID-19 patient mortality. These models included the J48 decision tree, random forest (RF), k-nearest neighbors (k-NN), multi-layer perceptron (MLP), Naïve Bayes (NB), eXtreme gradient boosting (XGBoost), and logistic regression (LR). We then compared the performance of these models using data from a cohort of 696 patients through statistical analysis. To evaluate the effectiveness of the models, we utilized metrics derived from the confusion matrix such as sensitivity, specificity, precision, and receiver operating characteristic (ROC). RESULTS The study included a total of 696 patients, with more females (440 patients, accounting for 63.2%) than males. The median age of the participants was 35.0 years, with an interquartile range of 18-79.
After conducting different feature selection procedures, 23 features were examined and identified as predictors of mortality; gender, intensive care unit (ICU) admission, and alcohol drinking/addiction were the top three predictors of COVID-19 mortality. On the other hand, loss of smell, loss of taste, and hypertension were identified as the three weakest predictors of COVID-19 mortality. The experimental results revealed that the k-nearest neighbors (k-NN) algorithm outperformed the other machine learning algorithms, achieving an accuracy of 95.25%, sensitivity of 95.30%, precision of 92.7%, specificity of 93.30%, F1 score of 93.98%, and a receiver operating characteristic (ROC) score of 96.90%. These findings highlight the effectiveness of the k-NN algorithm in predicting COVID-19 outcomes based on the selected features. CONCLUSION Our study has developed an innovative model that utilizes hospital data to accurately predict the mortality risk of COVID-19 patients. The main objective of this model is to prioritize early treatment for high-risk patients and optimize strained healthcare systems during the ongoing pandemic. By integrating machine learning with comprehensive hospital databases, our model effectively classifies patients' mortality risk, enabling targeted medical interventions and improved resource management. Among the various methods tested, the k-nearest neighbors (k-NN) algorithm demonstrated the highest accuracy, allowing for early identification of high-risk patients. Through k-NN feature identification, we identified 23 predictors that significantly contribute to predicting COVID-19 mortality. The top five predictors are gender (female), intensive care unit (ICU) admission, alcohol drinking, smoking, and symptoms of headache and chills. This advancement holds great promise in enhancing healthcare outcomes and decision-making during the pandemic.
By providing services and prioritizing patients based on the identified predictors, healthcare facilities and providers can improve the chances of survival for individuals. This model provides valuable insights that can guide healthcare professionals in allocating resources and delivering appropriate care to those at highest risk.
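The confusion-matrix metrics named in this abstract (sensitivity, specificity, precision, accuracy, F1 score) can be sketched as follows. This is a minimal illustration rather than the authors' code; the function name `confusion_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def confusion_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Derive common classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall, true-positive rate
    specificity = tn / (tn + fp)   # true-negative rate
    precision = tp / (tp + fp)     # positive predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not the study's data).
m = confusion_metrics(tp=90, fp=10, tn=80, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Reporting several of these together matters when classes are unequal in size, because accuracy alone can look high even when sensitivity for the rarer outcome is poor.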
Affiliation(s)
- Melsew Setegn Alie
  - Department Public Health, School of Public Health, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia.
- Yilkal Negesse
  - Department of Public Health, College of Medicine and Health Science, Debre Markos University, Gojjam, Ethiopia
- Kassa Kindie
  - Department Nursing, College of Medicine and Health Science, Mizan-Tepi University, Mizan-Aman, Ethiopia
- Dereje Senay Merawi
  - Department of Information Technology, Faculty of Technology, Debre Tabor University, Gonder, Ethiopia
3
Yue G. Screening of lung cancer serum biomarkers based on Boruta-shap and RFC-RFECV algorithms. J Proteomics 2024; 301:105180. [PMID: 38663548 DOI: 10.1016/j.jprot.2024.105180] [Received: 03/21/2024] [Revised: 04/16/2024] [Accepted: 04/22/2024] [Indexed: 05/03/2024]
Abstract
OBJECTIVE This study aimed to identify a set of serum miRNAs as potential biomarkers for lung cancer diagnosis using algorithmic approaches. METHODS Serum miRNA expression data from lung cancer patients and non-tumor controls were obtained. The top six miRNAs were selected using Boruta-shap and RFC-RFECV algorithms. A Gaussian Naive Bayes (NB) classifier was trained and evaluated using cross-validation, ROC curve analysis, and evaluation metrics. RESULTS Six miRNAs (hsa-miRNA-144, hsa-miRNA-107, hsa-miRNA-484, hsa-miRNA-103, hsa-miRNA-26b, and hsa-miRNA-641) were identified as feature genes. The NB classifier achieved an area under curve (AUC) of 0.8966 and a mean AUC of 0.88 in cross-validation. Accuracy, recall, and F1 scores exhibited promising results, with an accuracy of 82%. In the validation set, the AUC values for the NB and SVC classifiers were 0.9345 and 0.9423, respectively, with a mean AUC of 0.95 in cross-validation. The classifiers demonstrated an accuracy of 89% in diagnosing lung cancer. CONCLUSION This study identified a panel of six serum miRNAs with potential as non-invasive biomarkers for lung cancer diagnosis. These miRNAs show promise in providing sensitive and specific tools for detecting lung cancer. SIGNIFICANCE Lung cancer is one of the top cancers worldwide, threatening the health and lives of tens of thousands of people. miRNA is a biomarker, which can be used as a potential clinical tool for diagnosis and prognosis of cancer patients. Therefore, the use of multiple miRNAs to construct diagnostic models may be one of the future methods of accurate diagnosis of lung cancer. In this study, we used the Boruta-shap and RFC-RFECV algorithms to automatically identify and extract characteristic miRNAs highly associated with lung cancer, thereby establishing an accurate classifier for the diagnosis of lung cancer with characteristic miRNAs.
Affiliation(s)
- Guangcheng Yue
  - Department of Thoracic Surgery, Anyang Tumor Hospital, The Affiliated Anyang Tumor Hospital of Henan University of Science and Technology, China.
4
Zhai Y, Lan D, Lv S, Mo L. Interpretability-based machine learning for predicting the risk of death from pulmonary inflammation in Chinese intensive care unit patients. Front Med (Lausanne) 2024; 11:1399527. [PMID: 38933112 PMCID: PMC11200536 DOI: 10.3389/fmed.2024.1399527] [Received: 03/12/2024] [Accepted: 05/13/2024] [Indexed: 06/28/2024] Open
Abstract
Objective The objective of this research was to create a machine learning predictive model that could be easily interpreted in order to precisely determine the risk of premature death in patients receiving intensive care after pulmonary inflammation. Methods In this study, information from the China intensive care units (ICU) Open Source database was used to examine data from 2790 patients who had infections between January 2019 and December 2020. A 7:3 ratio was used to randomly assign the whole patient population to training and validation groups. This study used six machine learning techniques: logistic regression, random forest, gradient boosting tree, extreme gradient boosting tree (XGBoost), multilayer perceptron, and K-nearest neighbor. A cross-validation grid search method was used to search the parameters in each model. Eight metrics were used to assess the models' performance: accuracy, precision, recall, F1 score, area under the curve (AUC) value, Brier score, Jordon's index, and calibration slope. The machine methods were ranked based on how well they performed in each of these metrics. The best-performing models were selected for interpretation using both the Shapley Additive exPlanations (SHAP) and Local interpretable model-agnostic explanations (LIME) interpretable techniques. Results A subset of the study cohort's patients (120/1668, or 7.19%) died in the hospital following screening for inclusion and exclusion criteria. Using a cross-validated grid search to evaluate the six machine learning techniques, XGBoost showed good discriminative ability, achieving an accuracy score of 0.889 (0.874-0.904), precision score of 0.871 (0.849-0.893), recall score of 0.913 (0.890-0.936), F1 score of 0.891 (0.876-0.906), and AUC of 0.956 (0.939-0.973). Additionally, XGBoost exhibited excellent performance with a Brier score of 0.050, Jordon index of 0.947, and calibration slope of 1.074. 
It was also possible to create an interactive internet page using the XGBoost model. Conclusion By identifying patients at higher risk of early mortality, machine learning-based mortality risk prediction models have the potential to significantly improve patient care by directing clinical decision making and enabling early detection of survival and mortality issues in patients with pulmonary inflammation disease.
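Two of the metrics listed in this abstract can be illustrated with a short sketch. It assumes that the "Jordon" index refers to Youden's J statistic (sensitivity + specificity - 1), which is an interpretation and not something the abstract states. The function names and sample values are hypothetical.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: 0 means no better than chance, 1 means perfect."""
    return sensitivity + specificity - 1.0


def brier_score(y_true, y_prob) -> float:
    """Mean squared difference between predicted probability and outcome (0 or 1).

    Lower is better; 0 is a perfect, fully confident prediction.
    """
    if len(y_true) != len(y_prob) or not y_true:
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical values for illustration only (not the study's data).
j = youden_index(sensitivity=0.9, specificity=0.8)
bs = brier_score([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.1])
print(f"Youden's J = {j:.3f}, Brier score = {bs:.3f}")
```

The Brier score reflects calibration as well as discrimination, which is why it is often reported alongside AUC instead of in place of it.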
Affiliation(s)
- Liqin Mo
  - Cardiothoracic Surgery Intensive Care Unit, The First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi, China
5
Asteris PG, Gandomi AH, Armaghani DJ, Kokoris S, Papandreadi AT, Roumelioti A, Papanikolaou S, Tsoukalas MZ, Triantafyllidis L, Koutras EI, Bardhan A, Mohammed AS, Naderpour H, Paudel S, Samui P, Ntanasis-Stathopoulos I, Dimopoulos MA, Terpos E. Prognosis of COVID-19 severity using DERGA, a novel machine learning algorithm. Eur J Intern Med 2024:S0953-6205(24)00094-3. [PMID: 38458880 DOI: 10.1016/j.ejim.2024.02.037] [Received: 12/04/2023] [Revised: 02/23/2024] [Accepted: 02/29/2024] [Indexed: 03/10/2024]
Abstract
It is important to determine the risk of admission to the intensive care unit (ICU) in patients with COVID-19 presenting at the emergency department. Using artificial neural networks, we propose a new Data Ensemble Refinement Greedy Algorithm (DERGA) based on 15 easily accessible hematological indices. A database of 1596 patients with COVID-19 was used; it was divided into 1257 training datasets (80% of the database) for training the algorithms and 339 testing datasets (20% of the database) to check the reliability of the algorithms. The optimal combination of hematological indicators that gives the best prediction consists of only four indicators: neutrophil-to-lymphocyte ratio (NLR), lactate dehydrogenase, ferritin, and albumin. The best prediction corresponds to a particularly high accuracy of 97.12%. In conclusion, our novel approach provides a robust model, based only on basic hematological parameters, for predicting the risk of ICU admission and optimizing COVID-19 patient management in clinical practice.
Affiliation(s)
- Panagiotis G Asteris
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Amir H Gandomi
  - Faculty of Engineering & IT, University of Technology Sydney, Sydney, NSW 2007, Australia; University Research and Innovation Center (EKIK), Óbuda University, 1034 Budapest, Hungary
- Danial J Armaghani
  - School of Civil and Environmental Engineering, University of Technology Sydney, NSW 2007, Australia
- Styliani Kokoris
  - Laboratory of Hematology and Hospital Blood Transfusion Department, University General Hospital "Attikon", National and Kapodistrian University of Athens, Medical School, Greece
- Anastasia T Papandreadi
  - Software and Applications Department, University General Hospital "Attikon", National and Kapodistrian University of Athens, Medical School, Greece
- Anna Roumelioti
  - Department of Hematology and Lymphoma BMTU, Evangelismos General Hospital, Athens, Greece
- Stefanos Papanikolaou
  - NOMATEN Centre of Excellence, National Center for Nuclear Research, ulica A. Sołtana 7, 05-400 Swierk/Otwock, Poland
- Markos Z Tsoukalas
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Leonidas Triantafyllidis
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Evangelos I Koutras
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Abidhan Bardhan
  - Civil Engineering Department, National Institute of Technology Patna, Bihar, India
- Ahmed Salih Mohammed
  - Engineering Department, American University of Iraq, Sulaimani, Kurdistan-Region, Iraq
- Hosein Naderpour
  - Institute of Industrial Science, University of Tokyo, Tokyo, Japan
- Satish Paudel
  - Department of Civil and Environmental Engineering, University of Nevada, Reno, US
- Pijush Samui
  - Civil Engineering Department, National Institute of Technology Patna, Bihar, India
- Ioannis Ntanasis-Stathopoulos
  - Department of Clinical Therapeutics, Medical School, Faculty of Medicine, National Kapodistrian University of Athens, Athens, Greece
- Meletios A Dimopoulos
  - Department of Clinical Therapeutics, Medical School, Faculty of Medicine, National Kapodistrian University of Athens, Athens, Greece
- Evangelos Terpos
  - Department of Clinical Therapeutics, Medical School, Faculty of Medicine, National Kapodistrian University of Athens, Athens, Greece.
6
Viderman D, Kotov A, Popov M, Abdildin Y. Machine and deep learning methods for clinical outcome prediction based on physiological data of COVID-19 patients: a scoping review. Int J Med Inform 2024; 182:105308. [PMID: 38091862 DOI: 10.1016/j.ijmedinf.2023.105308] [Received: 09/15/2023] [Revised: 11/20/2023] [Accepted: 12/03/2023] [Indexed: 01/07/2024]
Abstract
INTRODUCTION Since the beginning of the COVID-19 pandemic, numerous machine and deep learning (MDL) methods have been proposed in the literature to analyze patient physiological data. The objective of this review is to summarize various aspects of these methods and assess their practical utility for predicting various clinical outcomes. METHODS We searched PubMed, Scopus, and Cochrane Library, screened and selected the studies matching the inclusion criteria. The clinical analysis focused on the characteristics of the patient cohorts in the studies included in this review, the specific tasks in the context of the COVID-19 pandemic that machine and deep learning methods were used for, and their practical limitations. The technical analysis focused on the details of specific MDL methods and their performance. RESULTS Analysis of the 48 selected studies revealed that the majority (∼54 %) of them examined the application of MDL methods for the prediction of survival/mortality-related patient outcomes, while a smaller fraction (∼13 %) of studies also examined applications to the prediction of patients' physiological outcomes and hospital resource utilization. 21 % of the studies examined the application of MDL methods to multiple clinical tasks. Machine and deep learning methods have been shown to be effective at predicting several outcomes of COVID-19 patients, such as disease severity, complications, intensive care unit (ICU) transfer, and mortality. MDL methods also achieved high accuracy in predicting the required number of ICU beds and ventilators. CONCLUSION Machine and deep learning methods have been shown to be valuable tools for predicting disease severity, organ dysfunction and failure, patient outcomes, and hospital resource utilization during the COVID-19 pandemic. The discovered knowledge and our conclusions and recommendations can also be useful to healthcare professionals and artificial intelligence researchers in managing future pandemics.
Affiliation(s)
- Dmitriy Viderman
  - Department of Surgery, School of Medicine, Nazarbayev University, Astana, Kazakhstan; Department of Anesthesiology, Intensive Care, and Pain Medicine, National Research Oncology Center, Astana, Kazakhstan.
- Alexander Kotov
  - Department of Computer Science, College of Engineering, Wayne State University, Detroit, USA.
- Maxim Popov
  - Department of Computer Science, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.
- Yerkin Abdildin
  - Department of Mechanical and Aerospace Engineering, School of Engineering and Digital Sciences, Nazarbayev University, Astana, Kazakhstan.
7
Zhang P, Wu L, Zou TT, Zou Z, Tu J, Gong R, Kuang J. Machine Learning for Early Prediction of Major Adverse Cardiovascular Events After First Percutaneous Coronary Intervention in Patients With Acute Myocardial Infarction: Retrospective Cohort Study. JMIR Form Res 2024; 8:e48487. [PMID: 38170581 PMCID: PMC10794958 DOI: 10.2196/48487] [Received: 04/25/2023] [Revised: 08/29/2023] [Accepted: 09/15/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND The incidence of major adverse cardiovascular events (MACEs) remains high in patients with acute myocardial infarction (AMI) who undergo percutaneous coronary intervention (PCI), and early prediction models to guide their clinical management are lacking. OBJECTIVE This study aimed to develop machine learning-based early prediction models for MACEs in patients with newly diagnosed AMI who underwent PCI. METHODS A total of 1531 patients with AMI who underwent PCI from January 2018 to December 2019 were enrolled in this consecutive cohort. The data comprised demographic characteristics, clinical investigations, laboratory tests, and disease-related events. Four machine learning models (artificial neural network [ANN], k-nearest neighbors, support vector machine, and random forest) were developed and compared with the logistic regression model. Our primary outcome was the performance of the models in predicting MACEs, which was determined by accuracy, area under the receiver operating characteristic curve, and F1-score. RESULTS In total, 1362 patients were successfully followed up. With a median follow-up of 25.9 months, the incidence of MACEs was 18.5% (252/1362). The areas under the receiver operating characteristic curve of the ANN, random forest, k-nearest neighbors, support vector machine, and logistic regression models were 80.49%, 72.67%, 79.80%, 77.20%, and 71.77%, respectively. The top 5 predictors in the ANN model were left ventricular ejection fraction, the number of implanted stents, age, diabetes, and the number of vessels with coronary artery disease. CONCLUSIONS The ANN model showed good MACE prediction after PCI for patients with AMI. The use of machine learning-based prediction models may improve patient management and outcomes in clinical practice.
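The area under the receiver operating characteristic curve used to compare the models above can be computed directly from predicted scores. The sketch below uses the rank interpretation of AUC (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half). It is an illustration only; the function name `roc_auc` and the sample data are hypothetical and do not come from the study.

```python
def roc_auc(y_true, scores) -> float:
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = event) and model scores, for illustration only.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.6]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")
```

This pairwise form is quadratic in the number of cases, which is fine for a cohort of this size; sorting-based implementations are preferable for large datasets.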
Affiliation(s)
- Pin Zhang
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
  - School of Public Health and Management, Nanchang Medical College, Nanchang, China
- Lei Wu
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- Ting-Ting Zou
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- ZiXuan Zou
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- JiaXin Tu
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
- Ren Gong
  - Department of Cardiology, The Second Affiliated Hospital of Nanchang University, Nanchang, China
- Jie Kuang
  - Jiangxi Provincial Key Laboratory of Preventive Medicine, School of Public Health, Nanchang University, Nanchang, China
8
Budiarto A, Tsang KCH, Wilson AM, Sheikh A, Shah SA. Machine Learning-Based Asthma Attack Prediction Models From Routinely Collected Electronic Health Records: Systematic Scoping Review. JMIR AI 2023; 2:e46717. [PMID: 38875586 PMCID: PMC11041490 DOI: 10.2196/46717] [Received: 02/23/2023] [Revised: 09/28/2023] [Accepted: 10/09/2023] [Indexed: 06/16/2024]
Abstract
BACKGROUND An early warning tool to predict attacks could enhance asthma management and reduce the likelihood of serious consequences. Electronic health records (EHRs), which provide access to historical data about patients with asthma, coupled with machine learning (ML), provide an opportunity to develop such a tool. Several studies have developed ML-based tools to predict asthma attacks. OBJECTIVE This study aims to critically evaluate ML-based models derived using EHRs for the prediction of asthma attacks. METHODS We systematically searched PubMed and Scopus (the search period was between January 1, 2012, and January 31, 2023) for papers meeting the following inclusion criteria: (1) used EHR data as the main data source, (2) used asthma attack as the outcome, and (3) compared ML-based prediction models' performance. We excluded non-English papers and nonresearch papers, such as commentary and systematic review papers. We also excluded papers that did not provide any details about the respective ML approach and its result, including protocol papers. The selected studies were then summarized across multiple dimensions including data preprocessing methods, ML algorithms, model validation, model explainability, and model implementation. RESULTS Overall, 17 papers were included at the end of the selection process. There was considerable heterogeneity in how asthma attacks were defined. Of the 17 studies, 8 (47%) used routinely collected data from both primary care and secondary care practices. Extremely imbalanced data were a notable issue in most studies (13/17, 76%), but only 38% (5/13) of them explicitly dealt with it in their data preprocessing pipeline. The gradient boosting-based method was the best ML method in 59% (10/17) of the studies. Of the 17 studies, 14 (82%) used a model explanation method to identify the most important predictors.
None of the studies followed the standard reporting guidelines, and none were prospectively validated. CONCLUSIONS Our review indicates that this research field is still underdeveloped, given the limited body of evidence, heterogeneity of methods, lack of external validation, and suboptimally reported models. We highlighted several technical challenges (class imbalance, external validation, model explanation, and adherence to reporting guidelines to aid reproducibility) that need to be addressed to make progress toward clinical adoption.
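One common way to handle the class imbalance noted in this review is to weight each class inversely to its frequency during training. The sketch below is an illustration only: the review does not say which technique the included studies used, and the function name `balanced_class_weights` and the label counts are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels) -> dict:
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Rare classes receive larger weights, so errors on them cost more
    when the weights are passed to a weighted loss function.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical labels: 95 records without an attack (0), 5 with an attack (1).
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

Resampling (oversampling the minority class or undersampling the majority class) is an alternative; whichever is chosen, it should be applied to the training split only so that validation data keep the real-world class ratio.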
Affiliation(s)
- Arif Budiarto
  - Asthma UK Center for Applied Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
  - Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia
- Kevin C H Tsang
  - Asthma UK Center for Applied Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Andrew M Wilson
  - Norwich Medical School, University of East Anglia, Norwich, United Kingdom
  - Norfolk and Norwich University Hospital NHS Foundation Trust, Norwich, United Kingdom
- Aziz Sheikh
  - Asthma UK Center for Applied Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
- Syed Ahmar Shah
  - Asthma UK Center for Applied Research, Usher Institute, University of Edinburgh, Edinburgh, United Kingdom
9
Zhuang Z, Qi Y, Yao Y, Yu Y. A predictive model for disease severity among COVID-19 elderly patients based on IgG subtypes and machine learning. Front Immunol 2023; 14:1286380. [PMID: 38106427 PMCID: PMC10723829 DOI: 10.3389/fimmu.2023.1286380] [Received: 08/31/2023] [Accepted: 11/15/2023] [Indexed: 12/19/2023] Open
Abstract
Objective Due to the increased likelihood of progression to severe pneumonia, the mortality rate of elderly patients infected with coronavirus disease 2019 (COVID-19) is high. However, there is a lack of models based on immunoglobulin G (IgG) subtypes to forecast the severity of COVID-19 in elderly individuals. The objective of this study was to create and verify a new algorithm for distinguishing elderly individuals with severe COVID-19. Methods In this retrospective study, laboratory data were gathered from 103 individuals who had confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection. These individuals were randomly split into training (80%) and testing (20%) cohorts. Furthermore, 22 elderly COVID-19 patients from two other centers formed an external validation cohort. Differential indicators were analyzed through univariate analysis, and variable selection was performed using least absolute shrinkage and selection operator (LASSO) regression. The severity of COVID-19 in elderly patients was predicted using a combination of five machine learning algorithms. Area under the curve (AUC) was utilized to evaluate the performance of these models. Calibration curves, decision curve analysis (DCA), and Shapley additive explanations (SHAP) plots were utilized to interpret and evaluate the model. Results The logistic regression model was chosen as the best machine learning model with four principal variables that could predict the probability of COVID-19 severity. In the training cohort, the model achieved an AUC of 0.889, while in the testing cohort, it obtained an AUC of 0.824. The calibration curve demonstrated excellent consistency between actual and predicted probabilities. According to the DCA curve, the model provided significant clinical advantages. Moreover, the model performed effectively in an external validation group (AUC=0.74).
Conclusion The present study developed a model that can distinguish between severe and non-severe COVID-19 in elderly patients, which might assist clinicians in evaluating the severity of COVID-19 and reducing adverse outcomes in elderly patients.
Affiliation(s)
- Zhenchao Zhuang
  - Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
- Yuxiang Qi
  - School of Medical Technology and Information Engineering, Zhejiang Chinese Medical University, Hangzhou, China
- Yimin Yao
  - Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
- Ying Yu
  - Department of Laboratory Medicine, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
10
Liontos A, Biros D, Matzaras R, Tsarapatsani KH, Kolios NG, Zarachi A, Tatsis K, Pappa C, Nasiou M, Pargana E, Tsiakas I, Lymperatou D, Filippas-Ntekouan S, Athanasiou L, Samanidou V, Konstantopoulou R, Vagias I, Panteli A, Milionis H, Christaki E. Inflammation and Venous Thromboembolism in Hospitalized Patients with COVID-19. Diagnostics (Basel) 2023; 13:3477. [PMID: 37998613 PMCID: PMC10670045 DOI: 10.3390/diagnostics13223477] [Received: 08/28/2023] [Revised: 11/04/2023] [Accepted: 11/17/2023] [Indexed: 11/25/2023] Open
Abstract
BACKGROUND A link between inflammation and venous thromboembolism (VTE) in COVID-19 disease has been suggested pathophysiologically and clinically. The aim of this study was to investigate the association between inflammation and disease outcomes in adult hospitalized COVID-19 patients with VTE. METHODS This was a retrospective observational study, including quantitative and qualitative data collected from COVID-19 patients hospitalized at the Infectious Diseases Unit (IDU) of the University Hospital of Ioannina, from 1 March 2020 to 31 May 2022. Venous thromboembolism was defined as a diagnosis of pulmonary embolism (PE) and/or vascular tree-in-bud in the lungs. The burden of disease, assessed by computed tomography of the lungs (CTBoD), was quantified as the percentage (%) of the affected lung parenchyma. The study outcomes were defined as death, intubation, and length of hospital stay (LoS). A chi-squared test and univariate logistic regression analyses were performed in IBM SPSS 28.0. RESULTS After propensity score matching, the final study cohort included 532 patients. VTE was found in 11.2% of the total population. In patients with VTE, we found that lymphocytopenia and a high neutrophil/lymphocyte ratio were associated with an increased risk of intubation and death, respectively. Similarly, CTBoD > 50% was associated with a higher risk of intubation and death in this group of patients. The triglyceride-glucose (TyG) index was also linked to worse outcomes. CONCLUSIONS Inflammatory indices were associated with VTE. Lymphocytopenia and an increased neutrophil-to-lymphocyte ratio negatively impacted the disease's prognosis and outcomes. Whether these indices unfavorably affect outcomes in COVID-19-associated VTE must be further evaluated.
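The two indices named in the abstract above are simple ratios of routine laboratory values. The sketch below assumes the commonly used definitions: the neutrophil/lymphocyte ratio as the quotient of absolute counts, and the TyG index as ln(fasting triglycerides [mg/dL] × fasting glucose [mg/dL] / 2). The abstract does not state how the authors computed either index, and the function names and input values here are hypothetical.

```python
import math


def neutrophil_lymphocyte_ratio(neutrophils, lymphocytes):
    """Ratio of absolute neutrophil to absolute lymphocyte counts (same units)."""
    if lymphocytes <= 0:
        raise ValueError("lymphocyte count must be positive")
    return neutrophils / lymphocytes


def tyg_index(triglycerides_mg_dl, glucose_mg_dl):
    """Triglyceride-glucose index, assuming fasting values in mg/dL."""
    if triglycerides_mg_dl <= 0 or glucose_mg_dl <= 0:
        raise ValueError("laboratory values must be positive")
    return math.log(triglycerides_mg_dl * glucose_mg_dl / 2.0)


# Hypothetical patient values, for illustration only.
nlr = neutrophil_lymphocyte_ratio(6.0, 1.0)
tyg = tyg_index(150.0, 100.0)
print(round(nlr, 2), round(tyg, 2))
```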
Affiliation(s)
- Angelos Liontos
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Dimitrios Biros
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Rafail Matzaras
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Nikolaos-Gavriel Kolios
  - Faculty of Medicine, University of Ioannina, 45110 Ioannina, Greece; (N.-G.K.); (C.P.); (M.N.); (E.P.)
- Athina Zarachi
  - Department of Otorhinolaryngology, Head and Neck Surgery, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 451100 Ioannina, Greece;
- Konstantinos Tatsis
  - Department of Respiratory Medicine, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 451100 Ioannina, Greece;
- Christiana Pappa
  - Faculty of Medicine, University of Ioannina, 45110 Ioannina, Greece; (N.-G.K.); (C.P.); (M.N.); (E.P.)
- Maria Nasiou
  - Faculty of Medicine, University of Ioannina, 45110 Ioannina, Greece; (N.-G.K.); (C.P.); (M.N.); (E.P.)
- Eleni Pargana
  - Faculty of Medicine, University of Ioannina, 45110 Ioannina, Greece; (N.-G.K.); (C.P.); (M.N.); (E.P.)
- Ilias Tsiakas
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Diamantina Lymperatou
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Sempastien Filippas-Ntekouan
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Lazaros Athanasiou
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Valentini Samanidou
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Revekka Konstantopoulou
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Ioannis Vagias
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Aikaterini Panteli
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Haralampos Milionis
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
- Eirini Christaki
  - 1st Division of Internal Medicine & Infectious Diseases Unit, University General Hospital of Ioannina, Faculty of Medicine, University of Ioannina, 45500 Ioannina, Greece; (A.L.); (D.B.); (R.M.); (I.T.); (D.L.); (S.F.-N.); (L.A.); (V.S.); (R.K.); (I.V.); (A.P.); (H.M.)
11
Sun S, Wang L, Lin J, Sun Y, Ma C. An effective prediction model based on XGBoost for the 12-month recurrence of AF patients after RFA. BMC Cardiovasc Disord 2023; 23:561. [PMID: 37974062 PMCID: PMC10655386 DOI: 10.1186/s12872-023-03599-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2023] [Accepted: 11/07/2023] [Indexed: 11/19/2023] Open
Abstract
BACKGROUND Atrial fibrillation (AF) is a common heart rhythm disorder that can lead to complications such as stroke and heart failure. Radiofrequency ablation (RFA) is a procedure used to treat AF, but it is not always successful in maintaining a normal heart rhythm. This study aimed to construct a clinical prediction model based on extreme gradient boosting (XGBoost) for AF recurrence 12 months after ablation. METHODS The 27-dimensional data of 359 patients with AF undergoing RFA in the First Affiliated Hospital of Soochow University from October 2018 to November 2021 were retrospectively analysed. We compared logistic regression, support vector machine (SVM), random forest (RF) and XGBoost methods. To evaluate predictive performance, we used the area under the receiver operating characteristic curve (AUC), the area under the precision-recall curve (AP), and calibration curves of both the training and testing sets. Finally, Shapley additive explanations (SHAP) were used to explain the significance of the variables. RESULTS Of the 27-dimensional variables, ejection fraction (EF) of the left atrial appendage (LAA), N-terminal pro-brain natriuretic peptide (NT-proBNP), global peak longitudinal strain of the LAA (LAAGPLS), left atrial diameter (LAD), diabetes mellitus (DM) history, and female sex had a significant role in the predictive model. The experimental results demonstrated that XGBoost exhibited the best performance among these methods, and the accuracy, specificity, sensitivity, precision and F1 score (a measure of test accuracy) of XGBoost were 86.1%, 89.7%, 71.4%, 62.5% and 0.67, respectively. In addition, SHAP analysis indicated that the 6 parameters were decisive for the XGBoost-based prediction model. CONCLUSIONS We proposed an effective model based on XGBoost that can be used to predict the recurrence of AF in patients after RFA. This prediction result can guide treatment decisions and help to optimize the management of AF.
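The five metrics reported in the abstract above all derive from a single confusion matrix. The sketch below shows the standard formulas. The counts are an assumption: the abstract does not report a confusion matrix, and the values used here are simply one set of counts that reproduces the rounded figures (86.1%, 89.7%, 71.4%, 62.5%, 0.67).

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall: share of true recurrences detected
    specificity = tn / (tn + fp)  # share of non-recurrences correctly cleared
    precision = tp / (tp + fp)    # share of predicted recurrences that are real
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts chosen for illustration; not taken from the paper.
m = classification_metrics(tp=10, fp=6, fn=4, tn=52)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Note that sensitivity (71.4%) is well below specificity (89.7%), so accuracy alone overstates how well recurrences are detected.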
Affiliation(s)
- ShiKun Sun
  - The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Li Wang
  - The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- Jia Lin
  - The First Affiliated Hospital of Soochow University, Suzhou, 215006, China
- YouFen Sun
  - The Shengcheng Street Health Center, Shouguang, 262700, China.
- ChangSheng Ma
  - The First Affiliated Hospital of Soochow University, Suzhou, 215006, China.
12
Giuste FO, He L, Lais P, Shi W, Zhu Y, Hornback A, Tsai C, Isgut M, Anderson B, Wang MD. Early and fair COVID-19 outcome risk assessment using robust feature selection. Sci Rep 2023; 13:18981. [PMID: 37923795 PMCID: PMC10624921 DOI: 10.1038/s41598-023-36175-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Accepted: 05/29/2023] [Indexed: 11/06/2023] Open
Abstract
Personalized medicine plays an important role in treatment optimization for COVID-19 patient management. Early treatment in patients at high risk of severe complications is vital to prevent death and ventilator use. Predicting COVID-19 clinical outcomes using machine learning may provide a fast and data-driven solution for optimizing patient care by estimating the need for early treatment. In addition, it is essential to accurately predict risk across demographic groups, particularly those underrepresented in existing models. Unfortunately, there is a lack of studies demonstrating the equitable performance of machine learning models across patient demographics. To overcome this existing limitation, we generate a robust machine learning model to predict patient-specific risk of death or ventilator use in COVID-19 positive patients using features available at the time of diagnosis. We establish the value of our solution across patient demographics, including gender and race. In addition, we improve clinical trust in our automated predictions by generating interpretable patient clustering, patient-level clinical feature importance, and global clinical feature importance within our large real-world COVID-19 positive patient dataset. We achieved 89.38% area under receiver operating curve (AUROC) performance for severe outcomes prediction and our robust feature ranking approach identified the presence of dementia as a key indicator for worse patient outcomes. We also demonstrated that our deep-learning clustering approach outperforms traditional clustering in separating patients by severity of outcome based on mutual information performance. Finally, we developed an application for automated and fair patient risk assessment with minimal manual data entry using existing data exchange standards.
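Checking for equitable performance across demographics, as described in the abstract above, can be done by computing AUROC separately within each group. This is a minimal sketch, assuming AUROC is estimated as the probability that a randomly chosen positive case scores higher than a randomly chosen negative case (ties count one half). The group labels, outcomes, and scores are hypothetical, and this is not the authors' pipeline.

```python
def auroc(labels, scores):
    """AUROC as the share of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


def auroc_by_group(labels, scores, groups):
    """Compute AUROC separately for each demographic group."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, x in enumerate(groups) if x == g]
        result[g] = auroc([labels[i] for i in idx], [scores[i] for i in idx])
    return result


# Hypothetical outcomes (1 = death or ventilator use), risks, and groups.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.6, 0.5, 0.3, 0.1]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = auroc_by_group(labels, scores, groups)
print(per_group)
```

A large gap between groups, as in this toy data, is the kind of signal a fairness audit would flag for further investigation.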
Affiliation(s)
- Felipe O Giuste
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
- Lawrence He
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
- Peter Lais
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
- Wenqi Shi
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30322, USA
- Yuanda Zhu
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, 30322, USA
- Andrew Hornback
  - School of Computer Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30322, USA
- Chiche Tsai
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA
- Monica Isgut
  - School of Biology, Georgia Institute of Technology, Atlanta, GA, 30322, USA
- Blake Anderson
  - Department of Medicine, Emory University, Atlanta, GA, 30322, USA
- May D Wang
  - The Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, 30322, USA.
13
Wickersham M, Bartelo N, Kulm S, Liu Y, Zhang Y, Elemento O. USING MACHINE LEARNING METHODS TO ASSESS THE RISK OF ALCOHOL MISUSE IN OLDER ADULTS. RESEARCH SQUARE 2023:rs.3.rs-3154584. [PMID: 37886491 PMCID: PMC10602059 DOI: 10.21203/rs.3.rs-3154584/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2023]
Abstract
The population of older adults, defined in this study as those 50 years of age or older, continues to increase every year. Substance misuse, particularly alcohol misuse, is often neglected in these individuals. To better identify older adults who might not be properly assessed for alcohol misuse, we derived a risk assessment tool using patients from the United Kingdom Biobank (UKB) and validated it on patients in the Weill Cornell Medicine (WCM) electronic health record (EHR). The resulting model and tool stratify the risk of alcohol misuse in older adults using 10 features that are commonly found in most EHR systems. The areas under the receiver operating characteristic curve (AUROC) for predicting alcohol misuse in older adults were 0.84 for the UKB model and 0.78 for the WCM model. We further show that, of those who self-identified as having ongoing alcohol misuse in the UKB cohort, only 12.5% had any alcohol-related F10 ICD-10 code. Extending this to the WCM cohort, we forecast that 7,838 out of 12,360 older adults with no F10 ICD-10 code (63.4%) may be missed as having alcohol misuse in the EHR. Overall, this study prioritizes the health of older adults by predicting alcohol misuse in an understudied population.
Affiliation(s)
- Matthew Wickersham
  - Weill-Cornell/Rockefeller/Sloan-Kettering Tri-Institutional MD-PhD Program, New York, New York, United States
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States
- Nicholas Bartelo
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States
- Scott Kulm
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States
- Yifan Liu
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States
- Yiye Zhang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, New York, United States
  - Department of Emergency Medicine, Weill Cornell Medicine, New York, New York, United States
- Olivier Elemento
  - Department of Physiology and Biophysics, Weill Cornell Medicine, New York, New York, United States
  - Caryl and Israel Englander Institute for Precision Medicine, Weill Cornell Medicine, New York, New York, United States
14
Ma FQ, He C, Yang HR, Hu ZW, Mao HR, Fan CY, Qi Y, Zhang JX, Xu B. Interpretable machine-learning model for Predicting the Convalescent COVID-19 patients with pulmonary diffusing capacity impairment. BMC Med Inform Decis Mak 2023; 23:169. [PMID: 37644543 PMCID: PMC10466769 DOI: 10.1186/s12911-023-02192-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 05/04/2023] [Indexed: 08/31/2023] Open
Abstract
INTRODUCTION Convalescent COVID-19 patients notably show pulmonary diffusing capacity impairment (PDCI). Pulmonary diffusing capacity is a frequently used indicator of pulmonary function prognosis in COVID-19 survivors, but studies on predicting it in this population are limited. The aim of this study was to develop and validate a machine learning (ML) model for predicting PDCI in COVID-19 patients using routinely available clinical data, thus assisting clinical diagnosis. METHODS Data, including demographic characteristics and clinical examination results, were collected in a follow-up study (August to September 2021) of 221 COVID-19 survivors 18 months after hospital discharge in Wuhan, and were randomly separated into a training data set (80%) and a validation data set (20%). Six popular machine learning models were developed to predict the pulmonary diffusing capacity of patients in the recovery stage of COVID-19. The performance indicators included area under the curve (AUC), accuracy, recall, precision, positive predictive value (PPV), negative predictive value (NPV) and F1. The model with the best performance was defined as the optimal model and was further used in the interpretability analysis. The MAHAKIL method was used to balance the class distribution of the samples, and recursive feature elimination with cross-validation (RFECV) was used for feature selection. RESULTS A total of 221 COVID-19 survivors were recruited in this study after discharge from hospitals in Wuhan. Of these participants, 117 (52.94%) were female, with a median age of 58.2 years (standard deviation (SD) = 12). After feature selection, 31 of the 37 clinical factors were selected for use in constructing the model. Among the six tested ML models, the XGBoost model performed best, with an AUC of 0.755 and an accuracy of 78.01% after experimental verification. The Shapley additive explanations (SHAP) summary analysis showed that hemoglobin (Hb), maximal voluntary ventilation (MVV), severity of illness, platelet (PLT), uric acid (UA) and blood urea nitrogen (BUN) were the six most important factors affecting the XGBoost model's decisions. CONCLUSION The XGBoost model reported here showed good prognostic prediction ability for PDCI in COVID-19 survivors during the recovery period. Based on SHAP value importance, Hb and MVV contributed the most to the prediction of PDCI outcomes in COVID-19 survivors in the recovery period.
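SHAP rankings such as the one reported above are built on Shapley values, which credit each feature with its average marginal contribution over all orders in which features could be added. The sketch below computes exact Shapley values by enumerating orderings for a tiny made-up scoring function. It illustrates the attribution idea only: the `toy_risk` function and its numbers are hypothetical, the feature names are borrowed from the abstract purely as labels, and the SHAP library uses more efficient algorithms than this brute-force enumeration.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: mean marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            after = value_fn(frozenset(present))
            totals[f] += after - before  # marginal contribution of f
    return {f: t / len(orderings) for f, t in totals.items()}


def toy_risk(present):
    """Hypothetical model output given which features are 'known'."""
    score = 0.10                      # baseline prediction
    if "Hb" in present:
        score += 0.20
    if "MVV" in present:
        score += 0.10
    if "Hb" in present and "MVV" in present:
        score += 0.06                 # interaction, split evenly between both
    return score                      # "PLT" has no effect in this toy model


phi = shapley_values(toy_risk, ["Hb", "MVV", "PLT"])
print({k: round(v, 3) for k, v in phi.items()})
```

The attributions sum to the difference between the full prediction and the baseline (0.46 - 0.10 = 0.36), which is the additivity property that SHAP plots rely on.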
Affiliation(s)
- Fu-Qiang Ma
  - Hubei University of Chinese Medicine, Wuhan, 430065, China
- Cong He
  - Hubei Provincial Hospital of Traditional Chinese Medicine, Wuhan, 430061, China
  - Affiliated Hospital of Hubei University of Traditional Chinese Medicine, Wuhan, 430061, China
  - Hubei Province Academy of Traditional Chinese Medicine, Wuhan, 430074, China
- Hao-Ran Yang
  - School of Software, HuaZhong University of Science and Technology, Wuhan, 430074, China
- Zuo-Wei Hu
  - Wuhan No.1 Hospital, Wuhan, 430022, China
- He-Rong Mao
  - Hubei University of Chinese Medicine, Wuhan, 430065, China
- Cun-Yu Fan
  - Hubei Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Wuhan, 430015, China
- Yu Qi
  - Hubei University of Chinese Medicine, Wuhan, 430065, China
- Ji-Xian Zhang
  - Hubei Provincial Hospital of Integrated Traditional Chinese and Western Medicine, Wuhan, 430015, China.
- Bo Xu
  - Hubei University of Chinese Medicine, Wuhan, 430065, China.
15
Zakariaee SS, Naderi N, Ebrahimi M, Kazemi-Arpanahi H. Comparing machine learning algorithms to predict COVID-19 mortality using a dataset including chest computed tomography severity score data. Sci Rep 2023; 13:11343. [PMID: 37443373 PMCID: PMC10345104 DOI: 10.1038/s41598-023-38133-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 07/04/2023] [Indexed: 07/15/2023] Open
Abstract
Since the beginning of the COVID-19 pandemic, new and non-invasive digital technologies such as artificial intelligence (AI) have been introduced for mortality prediction of COVID-19 patients. The prognostic performance of machine learning (ML)-based models for predicting clinical outcomes of COVID-19 patients has mainly been evaluated using demographics, risk factors, clinical manifestations, and laboratory results. There is a lack of information about the prognostic role of imaging manifestations in combination with demographics, clinical manifestations, and laboratory predictors. The purpose of the present study is to develop an efficient ML prognostic model based on a more comprehensive dataset including chest CT severity score (CT-SS). Fifty-five primary features in six main classes were retrospectively reviewed for 6854 suspected cases. The chi-square test of independence was used to determine the most important features in the mortality prediction of COVID-19 patients. The most relevant predictors were used to train and test ML algorithms. The predictive models were developed using eight ML algorithms including the J48 decision tree (J48), support vector machine (SVM), multi-layer perceptron (MLP), k-nearest neighbors (k-NN), Naïve Bayes (NB), logistic regression (LR), random forest (RF), and eXtreme gradient boosting (XGBoost). The performance of the predictive models was evaluated using accuracy, precision, sensitivity, specificity, and area under the ROC curve (AUC). After applying the exclusion criteria, the final sample comprised 815 RT-PCR-positive patients, of whom 54.85% were male; the mean age of the study population was 57.22 ± 16.76 years. The RF algorithm performed best, with an accuracy of 97.2%, sensitivity of 100%, precision of 94.8%, specificity of 94.5%, F1-score of 97.3%, and AUC of 99.9%. Other ML algorithms, with AUCs ranging from 81.2% to 93.9%, also performed well in predicting COVID-19 mortality. Results showed that timely and accurate risk stratification of COVID-19 patients could be performed using ML-based predictive models fed by routine data. The proposed algorithm with the more comprehensive dataset including CT-SS could efficiently predict the mortality of COVID-19 patients. This could lead to promptly targeting high-risk patients on admission, the optimal use of hospital resources, and an increased probability of survival of patients.
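The chi-square test of independence used above for feature screening compares observed counts in a contingency table with the counts expected if the feature and the outcome were unrelated. The following is a minimal sketch of the statistic. The table values are invented for illustration, and the critical value 3.841 applies only to one degree of freedom at the 0.05 level; the abstract does not report the authors' thresholds or software.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical 2 x 2 table: rows = feature present/absent,
# columns = died/survived. Not data from the study.
table = [[30, 10],
         [20, 40]]

stat = chi_square_statistic(table)
dof = (len(table) - 1) * (len(table[0]) - 1)
significant = stat > 3.841  # critical value for dof = 1, alpha = 0.05
print(round(stat, 4), dof, significant)
```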
Affiliation(s)
- Negar Naderi
  - Department of Midwifery, Ilam University of Medical Sciences, Ilam, Iran
- Mahdi Ebrahimi
  - Department of Emergency Medicine, Tehran University of Medical Sciences, Tehran, Iran
- Hadi Kazemi-Arpanahi
  - Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
16
Ahsan MM, Uddin MR, Ali MS, Islam MK, Farjana M, Sakib AN, Momin KA, Luna SA. Deep transfer learning approaches for Monkeypox disease diagnosis. EXPERT SYSTEMS WITH APPLICATIONS 2023; 216:119483. [PMID: 36624785 PMCID: PMC9814470 DOI: 10.1016/j.eswa.2022.119483] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/28/2022] [Revised: 12/24/2022] [Accepted: 12/27/2022] [Indexed: 06/01/2023]
Abstract
Monkeypox has become a significant global challenge as the number of cases increases daily. Those infected with the disease often display various skin symptoms and can spread the infection through contamination. Recently, Machine Learning (ML) has shown potential in image-based diagnoses, such as detecting cancer, identifying tumor cells, and identifying coronavirus disease 2019 (COVID-19) patients. Thus, ML could potentially be used to diagnose Monkeypox as well. In this study, we developed a Monkeypox diagnosis model using Generalization and Regularization-based Transfer Learning approaches (GRA-TLA) for binary and multiclass classification. We tested our proposed approach on ten different Convolutional Neural Network (CNN) models in three separate studies. The preliminary computational results showed that our proposed approach, combined with Extreme Inception (Xception), was able to distinguish between individuals with and without Monkeypox with an accuracy ranging from 77% to 88% in Studies One and Two, while Residual Network (ResNet)-101 had the best performance for multiclass classification in Study Three, with an accuracy ranging from 84% to 99%. In addition, we found that our proposed approach was computationally efficient compared to existing TL approaches in terms of the number of parameters (NP) and floating-point operations (FLOPs) required. We also used Local Interpretable Model-Agnostic Explanations (LIME) to explain our model's predictions and feature extractions, providing a deeper understanding of the specific features that may indicate the onset of Monkeypox.
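The two efficiency measures in the abstract above, parameter count and FLOPs, can be estimated layer by layer. The sketch below does this for a single 2D convolution, reading FLOPs as a count of floating-point operations and assuming the common convention of two operations per multiply-accumulate, with bias additions ignored. The layer shape is a generic example, not one taken from the paper's models.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Trainable parameters of a standard 2D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def conv2d_flops(kernel_h, kernel_w, in_channels, out_channels, out_h, out_w):
    """Approximate FLOPs, counting 2 operations per multiply-accumulate."""
    macs = kernel_h * kernel_w * in_channels * out_channels * out_h * out_w
    return 2 * macs


# Example: 3x3 convolution, 3 input channels, 64 filters,
# producing a 224 x 224 output map (stride 1 with "same" padding assumed).
params = conv2d_params(3, 3, 3, 64)
flops = conv2d_flops(3, 3, 3, 64, 224, 224)
print(params, flops)
```

Summing these per-layer figures over a network gives the totals usually quoted when comparing transfer-learning architectures.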
Affiliation(s)
- Md Manjurul Ahsan
  - Industrial and Systems Engineering, University of Oklahoma, Norman, OK 73019, USA
- Muhammad Ramiz Uddin
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Md Shahin Ali
  - Department of Biomedical Engineering, Islamic University, Kushtia 7003, Bangladesh
- Md Khairul Islam
  - Department of Biomedical Engineering, Islamic University, Kushtia 7003, Bangladesh
- Mithila Farjana
  - Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK, 73019, USA
- Ahmed Nazmus Sakib
  - Department of Aerospace and Mechanical Engineering, University of Oklahoma, Norman, OK, 73019, USA
- Khondhaker Al Momin
  - Department of Civil Engineering, Daffodil International University, Dhaka, 1341, Bangladesh
- Shahana Akter Luna
  - Medicine & Surgery, Dhaka Medical College & Hospital, Dhaka, 1000, Bangladesh
17
Li F, Chen A, Li Z, Gu L, Pan Q, Wang P, Fan Y, Feng J. Machine learning-based prediction of cerebral hemorrhage in patients with hemodialysis: A multicenter, retrospective study. Front Neurol 2023; 14:1139096. [PMID: 37077571 PMCID: PMC10109449 DOI: 10.3389/fneur.2023.1139096] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2023] [Accepted: 03/08/2023] [Indexed: 04/05/2023] Open
Abstract
Background Intracerebral hemorrhage (ICH) is one of the most serious complications in patients with chronic kidney disease undergoing long-term hemodialysis. It has high mortality and disability rates and imposes a serious economic burden on the patient's family and society. An early prediction of ICH is essential for timely intervention and improving prognosis. This study aims to build an interpretable machine learning-based model to predict the risk of ICH in patients undergoing hemodialysis. Methods The clinical data of 393 patients with end-stage kidney disease undergoing hemodialysis at three different centers between August 2014 and August 2022 were retrospectively analyzed. A total of 70% of the samples were randomly selected as the training set, and the remaining 30% were used as the validation set. Five machine learning (ML) algorithms, namely, support vector machine (SVM), extreme gradient boosting (XGB), complement Naïve Bayes (CNB), K-nearest neighbor (KNN), and logistic regression (LR), were used to develop a model to predict the risk of ICH in patients with uremia undergoing long-term hemodialysis. In addition, the area under the curve (AUC) values were evaluated to compare the performance of each algorithmic model. Global and individual interpretive analyses of the model were performed using importance ranking and Shapley additive explanations (SHAP) in the training set. Results A total of 73 patients undergoing hemodialysis developed spontaneous ICH among the 393 patients included in the study. The AUC of SVM, CNB, KNN, LR, and XGB models in the validation dataset were 0.725 (95% CI: 0.610 to 0.841), 0.797 (95% CI: 0.690 to 0.905), 0.675 (95% CI: 0.560 to 0.789), 0.922 (95% CI: 0.862 to 0.981), and 0.979 (95% CI: 0.953 to 1.000), respectively. Therefore, the XGB model had the best performance among the five algorithms. SHAP analysis revealed that the levels of LDL, HDL, CRP, and HGB and pre-hemodialysis blood pressure were the most important factors. Conclusion The XGB model developed in this study can efficiently predict the risk of a cerebral hemorrhage in patients with uremia undergoing long-term hemodialysis and can help clinicians to make more individualized and rational clinical decisions. ICH events in patients undergoing maintenance hemodialysis (MHD) are associated with serum LDL, HDL, CRP, HGB, and pre-hemodialysis SBP levels.
Affiliation(s)
- Fengda Li
  - Department of Neurosurgery, Changshu Hospital Affiliated to Soochow University, Changshu, China
- Anmin Chen
  - Department of Nephrology, The First People's Hospital of Jintan, Changzhou, China
- Zeyi Li
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
- Longyuan Gu
  - Department of Neurosurgery, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
- Qiyang Pan
  - Faculty of Informatics, Università della Svizzera italiana, Lugano, Ticino, Switzerland
- Pan Wang
  - School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
- Yuechao Fan
  - Department of Neurosurgery, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
  - *Correspondence: Yuechao Fan
- Jinhong Feng
  - Department of Nephrology, Affiliated Hospital of Xuzhou Medical University, Xuzhou, China
  - *Correspondence: Jinhong Feng
18
Tian J, Yan J, Han G, Du Y, Hu X, He Z, Han Q, Zhang Y. Machine learning prognosis model based on patient-reported outcomes for chronic heart failure patients after discharge. Health Qual Life Outcomes 2023; 21:31. [PMID: 36978124 PMCID: PMC10053412 DOI: 10.1186/s12955-023-02109-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2022] [Accepted: 03/03/2023] [Indexed: 03/30/2023] Open
Abstract
BACKGROUND Patient-reported outcomes (PROs) can be obtained outside hospitals and are of great significance for the evaluation of patients with chronic heart failure (CHF). The aim of this study was to establish a prediction model using PROs for out-of-hospital patients. METHODS CHF-PRO data were collected from 941 patients with CHF in a prospective cohort. Primary endpoints were all-cause mortality, HF hospitalization, and major adverse cardiovascular events (MACEs). To establish prognosis models over the two-year follow-up, six machine learning methods were used, including logistic regression, random forest classifier, extreme gradient boosting (XGBoost), light gradient boosting machine, naive Bayes, and multilayer perceptron. Models were established in four steps, namely, using general information as predictors, using the four domains of CHF-PRO, using both, and tuning the parameters. Discrimination and calibration were then estimated. Further analyses were performed for the best model. The top prediction variables were further assessed. The Shapley additive explanations (SHAP) method was used to explain the black-box models. Moreover, a self-developed web-based risk calculator was established to facilitate clinical application. RESULTS CHF-PRO showed strong prediction value and improved the performance of the models. Among the approaches, XGBoost of the parameter adjustment model had the highest prediction performance with an area under the curve of 0.754 (95% CI: 0.737 to 0.761) for death, 0.718 (95% CI: 0.717 to 0.721) for HF rehospitalization and 0.670 (95% CI: 0.595 to 0.710) for MACEs. The four domains of CHF-PRO, especially the physical domain, showed the most significant impact on the prediction of outcomes. CONCLUSION CHF-PRO showed strong prediction value in the models. The XGBoost models using variables based on CHF-PRO and the patient's general information provide prognostic assessment for patients with CHF. The self-developed web-based risk calculator can be conveniently used to predict the prognosis for patients after discharge. CLINICAL TRIAL REGISTRATION URL: http://www.chictr.org.cn/index.aspx; Unique identifier: ChiCTR2100043337.
Affiliation(s)
- Jing Tian
  - Department of Cardiology, the 1st Hospital of Shanxi Medical University, 85 South Jiefang Road, Taiyuan, Shanxi Province, 030001, China
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
- Jingjing Yan
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
- Gangfei Han
  - Department of Cardiology, the 1st Hospital of Shanxi Medical University, 85 South Jiefang Road, Taiyuan, Shanxi Province, 030001, China
- Yutao Du
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
- Xiaojuan Hu
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
- Zixuan He
  - Department of Cardiology, the 1st Hospital of Shanxi Medical University, 85 South Jiefang Road, Taiyuan, Shanxi Province, 030001, China
- Qinghua Han
  - Department of Cardiology, the 1st Hospital of Shanxi Medical University, 85 South Jiefang Road, Taiyuan, Shanxi Province, 030001, China
- Yanbo Zhang
  - Shanxi Provincial Key Laboratory of Major Diseases Risk Assessment, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
  - Department of Health Statistics, School of Public Health, Shanxi Medical University, 56 South XinJian Road, Taiyuan, Shanxi Province, 030001, China
  - Shanxi University of Chinese Medicine, 121 University Street, Jinzhong, Shanxi Province, 030619, China
19
Application of machine learning algorithms in thermal images for an automatic classification of lumbar sympathetic blocks. J Therm Biol 2023; 113:103523. [PMID: 37055127 DOI: 10.1016/j.jtherbio.2023.103523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2022] [Revised: 01/30/2023] [Accepted: 02/13/2023] [Indexed: 02/19/2023]
Abstract
PURPOSE No previous studies have developed machine learning algorithms to classify the performance of lumbar sympathetic blocks (LSBs) using infrared thermography data. The objective was to assess the performance of different machine learning algorithms in classifying LSBs carried out in patients diagnosed with lower-limb Complex Regional Pain Syndrome as successful or failed, based on the evaluation of thermal predictors. METHODS 66 LSBs previously performed and classified by the medical team were evaluated in 24 patients. 11 regions of interest on each plantar foot were selected within the thermal images acquired in the clinical setting. From every region of interest, different thermal predictors were extracted and analysed at three different moments (minutes 4, 5, and 6) along with the baseline time (just after the injection of a local anaesthetic around the sympathetic ganglia). Among them, the thermal variation of the ipsilateral foot and the thermal asymmetry variation between feet at each minute assessed, and the starting time for each region of interest, were fed into 4 different machine learning classifiers: an Artificial Neuronal Network, K-Nearest Neighbours, Random Forest, and a Support Vector Machine. RESULTS All classifiers presented an accuracy and specificity higher than 70%, sensitivity higher than 67%, and AUC higher than 0.73, and the Artificial Neuronal Network classifier performed the best with a maximum accuracy of 88%, sensitivity of 100%, specificity of 84% and AUC of 0.92, using 3 predictors. CONCLUSION These results suggest thermal data retrieved from plantar feet combined with a machine learning-based methodology can be an effective tool to automatically classify LSB performance.
20
Nopour R, Shanbezadeh M, Kazemi-Arpanahi H. Predicting intubation risk among COVID-19 hospitalized patients using artificial neural networks. Journal of Education and Health Promotion 2023; 12:16. [PMID: 37034879 PMCID: PMC10079178 DOI: 10.4103/jehp.jehp_20_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Accepted: 04/26/2022] [Indexed: 06/19/2023]
Abstract
BACKGROUND Accurately predicting the intubation risk in COVID-19 patients at the time of admission is critical to the optimal use of limited hospital resources, providing customized and evidence-based treatments, and improving the quality of delivered medical care services. This study aimed to design a statistical algorithm to select the best features influencing intubation prediction in coronavirus disease 2019 (COVID-19) hospitalized patients. Then, using the selected features, multiple artificial neural network (ANN) configurations were developed to predict intubation risk. MATERIAL AND METHODS In this retrospective single-center study, a dataset containing 482 COVID-19 patients who were hospitalized between February 9, 2020 and July 20, 2021 was used. First, the Phi correlation coefficient method was used to select the most important features affecting COVID-19 patients' intubation. Then, different ANN configurations were developed. Finally, the performance of the ANN configurations was assessed using several evaluation metrics, and the best structure was determined for predicting intubation requirements among hospitalized COVID-19 patients. RESULTS The ANN models were developed based on 18 validated features. The results indicated that the best performance belonged to the 18-20-1 ANN configuration with positive predictive value (PPV) = 0.907, negative predictive value (NPV) = 0.941, sensitivity = 0.898, specificity = 0.951, and area under curve (AUC) = 0.906. CONCLUSIONS The results demonstrate the effectiveness of the ANN models for timely and reliable prediction of intubation risk in COVID-19 hospitalized patients. Our models can inform clinicians and those involved in policymaking and decision making when prioritizing restricted mechanical ventilation and other related resources for critically ill COVID-19 patients.
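The Phi correlation coefficient mentioned in the methods above measures association between two binary variables, so it can rank binary candidate features against a binary outcome such as intubation. The following is a minimal sketch of that idea, not the authors' implementation: the function name, the feature names, and the toy data are all hypothetical and chosen only for illustration.

```python
from math import sqrt


def phi_coefficient(feature, outcome):
    """Phi coefficient between two equally long sequences of 0/1 values."""
    pairs = list(zip(feature, outcome))
    n11 = pairs.count((1, 1))  # feature present, outcome present
    n10 = pairs.count((1, 0))  # feature present, outcome absent
    n01 = pairs.count((0, 1))  # feature absent, outcome present
    n00 = pairs.count((0, 0))  # feature absent, outcome absent
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    # A constant feature or outcome makes the denominator zero; report no association.
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0


# Hypothetical toy data (not from the study): 1 = yes, 0 = no.
intubated = [1, 1, 1, 0, 0, 0, 0, 0]
features = {
    "dyspnea": [1, 1, 1, 0, 0, 0, 0, 1],
    "cough": [1, 0, 1, 0, 1, 0, 1, 0],
}

# Rank features by the absolute strength of their association with the outcome.
ranked = sorted(
    features,
    key=lambda name: abs(phi_coefficient(features[name], intubated)),
    reverse=True,
)
print(ranked)
```

In a sketch like this, features whose absolute Phi value falls below some chosen cutoff would be dropped before model training; the abstract does not state which cutoff the study used.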
Affiliation(s)
- Raoof Nopour
  - Department of Health Information Management, Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran
- Mostafa Shanbezadeh
  - Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Hadi Kazemi-Arpanahi
  - Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
21
Khounraz F, Khodadoost M, Gholamzadeh S, Pourhamidi R, Baniasadi T, Jafarbigloo A, Mohammadi G, Ahmadi M, Ayyoubzadeh SM. Prognosis of COVID-19 patients using lab tests: A data mining approach. Health Sci Rep 2023; 6:e1049. [PMID: 36628109 PMCID: PMC9826741 DOI: 10.1002/hsr2.1049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2022] [Revised: 12/24/2022] [Accepted: 12/28/2022] [Indexed: 01/10/2023] Open
Abstract
Background The rapid spread of coronavirus disease 2019 (COVID-19) has caused a worldwide pandemic and affected the lives of millions. The potential fatality of the disease has led to global public health concerns. Apart from clinical practice, artificial intelligence (AI) has provided a new model for the early diagnosis and prediction of disease based on machine learning (ML) algorithms. In this study, we aimed to build a prediction model for the prognosis of COVID-19 patients using data mining techniques. Methods A data set was obtained from the intelligent management system repository of 19 hospitals at Shahid Beheshti University of Medical Sciences in Iran. All admitted patients had positive polymerase chain reaction (PCR) test results and were hospitalized between February 19 and May 12, 2020. The extracted data set has 8621 data instances. The data include demographic information and results of 16 laboratory tests. In the first stage, preprocessing was performed on the data. Then, among 15 laboratory tests, four were selected. The models were created based on seven data mining algorithms, and finally, the performances of the models were compared with each other. Results The Random Forest (RF) and Gradient Boosted Trees models were the most accurate methods, with accuracies of 86.45% and 84.80%, respectively. In contrast, the Decision Tree exhibited the lowest accuracy (75.43%) among the seven models. Conclusion Data mining methods have the potential to predict outcomes of COVID-19 patients from lab tests and demographic features. After validation, these methods could be implemented in clinical decision support systems to improve the management of, and care for, patients with severe COVID-19.
Affiliation(s)
- Fariba Khounraz
  - Administration and Resources Development Affairs, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mahmood Khodadoost
  - School of Traditional Medicine, Traditional Medicine & Materia Medical Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Saeid Gholamzadeh
  - Administration and Resources Development Affairs, Shahid Beheshti University of Medical Sciences, Tehran, Iran
  - Legal Medicine Research Center, Legal Medicine Organization, Tehran, Iran
- Rashed Pourhamidi
  - Non Communicable Diseases Research Center, Bam University of Medical Sciences, Bam, Iran
- Tayebeh Baniasadi
  - Department of Health Information Technology, Faculty of Para‐Medicine, Hormozgan University of Medical Sciences, Bandar Abbas, Iran
- Aida Jafarbigloo
  - Department of Health Information Technology, School of Allied Medical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Gohar Mohammadi
  - Administration and Resources Development Affairs, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mahnaz Ahmadi
  - Department of Pharmaceutics and Pharmaceutical Nanotechnology, School of Pharmacy, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Seyed Mohammad Ayyoubzadeh
  - Department of Health Information Management, School of Allied Medical Sciences, Tehran University of Medical Sciences, Tehran, Iran
22
Zakariaee SS, Abdi AI, Naderi N, Babashahi M. Prognostic significance of chest CT severity score in mortality prediction of COVID-19 patients, a machine learning study. The Egyptian Journal of Radiology and Nuclear Medicine 2023; 54:73. [PMCID: PMC10116092 DOI: 10.1186/s43055-023-01022-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/19/2022] [Accepted: 04/13/2023] [Indexed: 04/05/2024] Open
Abstract
Background The high mortality rate of COVID-19 makes it necessary to seek early identification of high-risk patients with poor prognoses. Although the association between the chest CT severity score (CT-SS) and mortality of COVID-19 patients has been reported, its prognostic significance in combination with other prognostic parameters has not yet been evaluated. Methods This retrospective single-center study reviewed a total of 6854 suspected patients referred to Imam Khomeini hospital, Ilam city, west of Iran, from February 9, 2020 to December 20, 2020. The prognostic performances of k-Nearest Neighbors (kNN), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and J48 decision tree algorithms were evaluated based on the most important and relevant predictors. The metrics derived from the confusion matrix were used to determine the performance of the ML models. Results After applying exclusion criteria, 815 hospitalized cases were entered into the study. Of these, 447 (54.85%) were male and the mean (± SD) age of participants was 57.22 (± 16.76) years. The results showed that the performances of the ML algorithms improved when they were fed the dataset with CT-SS data. The kNN model, with an accuracy of 94.1%, sensitivity of 100.0%, precision of 89.5%, specificity of 88.3%, and AUC around 97.2%, had the best performance among the four ML techniques. Conclusions The integration of CT-SS data with demographics, risk factors, clinical manifestations, and laboratory parameters improved the prognostic performances of the ML algorithms. An ML model with a comprehensive collection of predictors could identify high-risk patients more efficiently and lead to the optimal use of hospital resources.
Affiliation(s)
- Seyed Salman Zakariaee
  - Department of Medical Physics, Faculty of Paramedical Sciences, Ilam University of Medical Sciences, Ilam, Iran
- Aza Ismail Abdi
  - Department of Radiology, Erbil Medical Technical Institute, Erbil Polytechnic University, Erbil, Iraq
- Negar Naderi
  - Department of Midwifery, Faculty of Nursing and Midwifery, Ilam University of Medical Sciences, Ilam, Iran
- Mashallah Babashahi
  - Department of Pathology, Faculty of Paramedical Sciences, Ilam University of Medical Sciences, Ilam, Iran
23
Asteris PG, Kokoris S, Gavriilaki E, Tsoukalas MZ, Houpas P, Paneta M, Koutzas A, Argyropoulos T, Alkayem NF, Armaghani DJ, Bardhan A, Cavaleri L, Cao M, Mansouri I, Mohammed AS, Samui P, Gerber G, Boumpas DT, Tsantes A, Terpos E, Dimopoulos MA. Early prediction of COVID-19 outcome using artificial intelligence techniques and only five laboratory indices. Clin Immunol 2023; 246:109218. [PMID: 36586431 PMCID: PMC9797218 DOI: 10.1016/j.clim.2022.109218] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 07/19/2022] [Revised: 10/25/2022] [Accepted: 12/21/2022] [Indexed: 12/29/2022]
Abstract
We aimed to develop a prediction model for intensive care unit (ICU) hospitalization of Coronavirus disease-19 (COVID-19) patients using artificial neural networks (ANN). We first assessed 25 laboratory parameters from 248 consecutive adult COVID-19 patients for database creation, training, and development of ANN models. We developed a new alpha-index to assess the association of each parameter with outcome. We used 166 records for training of computational simulations (training), 41 for documentation of computational simulations (validation), and 41 for reliability check of computational simulations (testing). The first five laboratory indices ranked by importance were Neutrophil-to-lymphocyte ratio, Lactate Dehydrogenase, Fibrinogen, Albumin, and D-Dimers. The best ANN based on these indices achieved accuracy 95.97%, precision 90.63%, sensitivity 93.55%, and F1-score 92.06%, verified in the validation cohort. Our preliminary findings reveal for the first time an ANN to predict ICU hospitalization accurately and early, using only 5 easily accessible laboratory indices.
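The four metrics reported above all derive from a binary confusion matrix, and a short calculation shows how they relate to one another. The sketch below is illustrative only: the counts (TP = 58, FP = 6, FN = 4, TN = 180) are hypothetical values chosen because they sum to the 248 records and reproduce the reported percentages to within rounding. The abstract itself does not state the underlying counts.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, sensitivity (recall) and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    # F1 is the harmonic mean of precision and sensitivity.
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


# Hypothetical counts (an assumption, not data from the paper).
metrics = classification_metrics(tp=58, fp=6, fn=4, tn=180)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Because F1 depends only on precision and sensitivity, it can be checked directly from the two reported values without knowing any counts.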
Affiliation(s)
- Panagiotis G. Asteris
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Styliani Kokoris
  - Laboratory of Hematology and Hospital Blood Transfusion Department, University General Hospital "Attikon", National and Kapodistrian University of Athens, Medical School, Greece
- Eleni Gavriilaki
  - Hematology Department – BMT Unit, G Papanicolaou Hospital, Thessaloniki, Greece
- Markos Z. Tsoukalas
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Panagiotis Houpas
  - Computational Mechanics Laboratory, School of Pedagogical and Technological Education, Athens, Greece
- Maria Paneta
  - Fourth Department of Internal Medicine, University General Hospital "Attikon", National and Kapodistrian University of Athens, Medical School, Greece
- Nizar Faisal Alkayem
  - Jiangxi Province Key Laboratory of Environmental Geotechnical Engineering and Hazards Control, Jiangxi University of Science and Technology, Ganzhou 341000, China
- Danial J. Armaghani
  - Department of Urban Planning, Engineering Networks and Systems, Institute of Architecture and Construction, South Ural State University, 76, Lenin Prospect, Chelyabinsk 454080, Russian Federation
- Abidhan Bardhan
  - Civil Engineering Department, National Institute of Technology Patna, Bihar, India
- Liborio Cavaleri
  - Department of Civil, Environmental, Aerospace and Materials Engineering, University of Palermo, Palermo, Italy
- Maosen Cao
  - Jiangxi Province Key Laboratory of Environmental Geotechnical Engineering and Hazards Control, Jiangxi University of Science and Technology, Ganzhou 341000, China
- Iman Mansouri
  - Department of Civil and Environmental Engineering, Princeton University, Princeton, NJ 08544, USA
- Ahmed Salih Mohammed
  - Engineering Department, American University of Iraq, Sulaimani, Kurdistan-Region, Iraq
- Pijush Samui
  - Civil Engineering Department, National Institute of Technology Patna, Bihar, India
- Gloria Gerber
  - Hematology Division, Johns Hopkins University, Baltimore, USA
- Dimitrios T. Boumpas
  - "Attikon" University Hospital of Athens, Rheumatology and Clinical Immunology, Medical School, National and Kapodistrian University of Athens, Athens, Attica, Greece
- Argyrios Tsantes
  - Laboratory of Hematology and Hospital Blood Transfusion Department, University General Hospital "Attikon", National and Kapodistrian University of Athens, Medical School, Greece
- Evangelos Terpos
  - Department of Clinical Therapeutics, Medical School, Faculty of Medicine, National Kapodistrian University of Athens, Athens, Greece
- Meletios A. Dimopoulos
  - Department of Clinical Therapeutics, Medical School, Faculty of Medicine, National Kapodistrian University of Athens, Athens, Greece
24
Afrash MR, Shanbehzadeh M, Kazemi-Arpanahi H. Predicting Risk of Mortality in COVID-19 Hospitalized Patients using Hybrid Machine Learning Algorithms. J Biomed Phys Eng 2022; 12:611-626. [PMID: 36569564 PMCID: PMC9759642 DOI: 10.31661/jbpe.v0i0.2105-1334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2021] [Accepted: 01/20/2022] [Indexed: 06/17/2023]
Abstract
BACKGROUND Since hospitalized patients with COVID-19 are considered at high risk of death, patients with severe clinical conditions should be identified. Despite the potential of machine learning (ML) techniques to predict the mortality of COVID-19 patients, high-dimensional data is considered a challenge, which can be addressed by metaheuristic and nature-inspired algorithms, such as the genetic algorithm (GA). OBJECTIVE This paper aimed to compare the efficiency of the GA combined with several ML techniques in predicting COVID-19 in-hospital mortality. MATERIAL AND METHODS In this retrospective study, 1353 COVID-19 in-hospital patients were examined from February 9 to December 20, 2020. The GA technique was applied to select the important features; then, using the selected features, several ML algorithms such as K-nearest-neighbor (K-NN), Decision Tree (DT), Support Vector Machines (SVM), and Artificial Neural Network (ANN) were trained to design predictive models. Finally, several evaluation metrics were used to compare the developed models. RESULTS A total of 10 features out of 56 were selected by 10 independent executions of the GA, including length of stay (LOS), age, cough, respiratory intubation, dyspnea, cardiovascular diseases, leukocytosis, blood urea nitrogen (BUN), C-reactive protein, and pleural effusion. The GA-SVM had the best performance, with an accuracy and specificity of 95.147% and 95.112%, respectively. CONCLUSION The hybrid ML models, especially the GA-SVM, can improve the treatment of COVID-19 patients, predict severe disease and mortality, and optimize the utilization of health resources based on the improvement of input features and the adaptation of the structure of the models.
Affiliation(s)
- Mohammad Reza Afrash
  - PhD, Department of Artificial Intelligence, Smart University of Medical Sciences, Tehran, Iran
- Mostafa Shanbehzadeh
  - PhD, Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
- Hadi Kazemi-Arpanahi
  - PhD, Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran
  - PhD, Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
25
Khadem H, Nemat H, Elliott J, Benaissa M. Interpretable Machine Learning for Inpatient COVID-19 Mortality Risk Assessments: Diabetes Mellitus Exclusive Interplay. Sensors (Basel, Switzerland) 2022; 22:s22228757. [PMID: 36433354 PMCID: PMC9692305 DOI: 10.3390/s22228757] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/06/2022] [Revised: 11/07/2022] [Accepted: 11/11/2022] [Indexed: 05/13/2023]
Abstract
People with diabetes mellitus (DM) are at elevated risk of in-hospital mortality from coronavirus disease-2019 (COVID-19). This vulnerability has spurred efforts to pinpoint distinctive characteristics of COVID-19 patients with DM. In this context, the present article develops ML models equipped with interpretation modules for inpatient mortality risk assessments of COVID-19 patients with DM. To this end, a cohort of 156 hospitalised COVID-19 patients with pre-existing DM is studied. For creating risk assessment platforms, this work explores a pool of historical, on-admission, and during-admission data that are DM-related or, according to preliminary investigations, are exclusively attributed to the COVID-19 susceptibility of DM patients. First, a set of careful pre-modelling steps is executed on the clinical data, including cleaning, pre-processing, subdivision, and feature elimination. Subsequently, standard machine learning (ML) modelling analysis is performed on the curated data. Initially, a classifier is tasked with forecasting COVID-19 fatality from selected features. The model undergoes thorough evaluation analysis. The results achieved substantiate the efficacy of the undertaken data curation and modelling steps. Afterwards, the SHapley Additive exPlanations (SHAP) technique is applied to interpret the generated mortality risk prediction model by rating the predictors' global and local influence on the model's outputs. These interpretations advance the comprehensibility of the analysis by explaining the formation of outcomes and, in this way, foster the adoption of the proposed methodologies. Next, a clustering algorithm demarcates patients into four separate groups based on their SHAP values, providing a practical risk stratification method. Finally, a re-evaluation analysis is performed to verify the robustness of the proposed framework.
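The local influence ratings described above rest on Shapley values: each predictor is credited with its average marginal contribution to one prediction, taken over all subsets of the other predictors. The sketch below computes exact Shapley values by brute force for a tiny hypothetical risk score. It is not the SHAP library and not the authors' pipeline; the model, the feature names, the patient values, and the single reference profile used to represent "absent" features are all assumptions made for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline values.
    """
    n = len(x)

    def value(coalition):
        z = [x[j] if j in coalition else baseline[j] for j in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        contribution = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                contribution += weight * (value(s | {i}) - value(s))
        phi.append(contribution)
    return phi


def toy_risk(z):
    """Hypothetical risk score with one interaction term (not a clinical model)."""
    age, glucose, crp = z
    return 0.02 * age + 0.1 * glucose + 0.05 * crp + 0.01 * glucose * crp


patient = [70.0, 9.0, 40.0]    # hypothetical patient
reference = [60.0, 6.0, 10.0]  # hypothetical reference profile

phi = shapley_values(toy_risk, patient, reference)
for name, contribution in zip(["age", "glucose", "crp"], phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the additivity property that makes per-patient explanations possible. Brute-force enumeration grows exponentially with the number of features, so practical tools rely on approximations or model-specific algorithms instead.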
Affiliation(s)
- Heydar Khadem
  - Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield S10 2TN, UK
- Hoda Nemat
  - Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield S10 2TN, UK
- Jackie Elliott
  - Department of Oncology and Metabolism, University of Sheffield, Sheffield S10 2TN, UK
  - Teaching Hospitals, Diabetes and Endocrine Centre, Northern General Hospital, Sheffield S5 7AU, UK
- Mohammed Benaissa
  - Department of Electronic and Electrical Engineering, University of Sheffield, Sheffield S10 2TN, UK
26
Wang J, Wang Y, Zeng Y, Huang D. Feature selection approaches identify potential plasma metabolites in postmenopausal osteoporosis patients. Metabolomics 2022; 18:86. [PMID: 36318345 DOI: 10.1007/s11306-022-01937-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/20/2022] [Accepted: 09/27/2022] [Indexed: 11/25/2022]
Abstract
INTRODUCTION Postmenopausal women with osteoporosis (PMOP) are prone to fragility fractures. Osteoporosis is associated with alterations in the levels of specific circulating metabolites. OBJECTIVES To analyze the metabolic profile of individuals with PMOP and identify novel metabolites associated with bone mineral density (BMD). METHODS We performed an unsupervised metabolomics analysis of plasma samples from participants with PMOP and from normal controls (NC) with normal bone mass. BMD values for the lumbar spine and the proximal femur were determined using dual-energy X-ray absorptiometry. Principal component analysis (PCA) and supervised partial least squares discriminant analysis (PLS-DA) were performed for metabolomic profile analyses. Metabolites with P < 0.05 in the t-test, VIP > 1 in the PLS-DA model, and SNR > 0.3 between the PMOP and NC groups were defined as differentially abundant metabolites (DAMs). The SHapley additive explanations (SHAP) method was utilized to determine the importance of permutation of each DAM in the predictive model between the two groups. ROC analysis and correlation analysis of metabolite relative abundance and BMD/T-scores were conducted. KEGG pathway analysis was used for functional annotation of the candidate metabolites. RESULTS Overall, 527 annotated molecular markers were extracted in the positive and negative total ion chromatogram (TIC) of each sample. The PMOP and NC groups could be differentiated using the PLS-DA model. Sixty-eight DAMs were identified, with most relative abundances decreasing in the PMOP samples. SHAP was used to identify 9 DAMs as factors distinguishing PMOP from NC. The logistic regression model including Triethanolamine, Linoleic acid, and PC(18:1(9Z)/18:1(9Z)) metabolites demonstrated excellent discrimination performance (sensitivity = 97.0, specificity = 96.6, AUC = 0.993).
The correlation analysis revealed that the abundances of Triethanolamine, PC(18:1(9Z)/18:1(9Z)), 16-Hydroxypalmitic acid, and Palmitic acid were significantly positively correlated with the BMD/T score (Pearson correlation coefficients > 0.5, P < 0.05). Most candidate metabolites were involved in lipid metabolism based on KEGG functional annotations. CONCLUSION The plasma metabolomic signature of PMOP patients differed from that of healthy controls. Marker metabolites may help provide information for the diagnosis, therapy, and prevention of PMOP. We highlight the application of feature selection approaches in the analysis of high-dimensional biological data.
Affiliation(s)
- Jihan Wang
  - Xi'an Key Laboratory of Stem Cell and Regenerative Medicine, Institute of Medical Research, Northwestern Polytechnical University, Xi'an, China
- Yangyang Wang
  - School of Electronics and Information, Northwestern Polytechnical University, Xi'an, China
- Yuhong Zeng
  - Department of Osteoporosis, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
- Dageng Huang
  - Department of Spine Surgery, Honghui Hospital, Xi'an Jiaotong University, Xi'an, China
27
Zou Y, Shi Y, Sun F, Liu J, Guo Y, Zhang H, Lu X, Gong Y, Xia S. Extreme gradient boosting model to assess risk of central cervical lymph node metastasis in patients with papillary thyroid carcinoma: Individual prediction using SHapley Additive exPlanations. Computer Methods and Programs in Biomedicine 2022; 225:107038. [PMID: 35930861 DOI: 10.1016/j.cmpb.2022.107038] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/30/2021] [Revised: 07/02/2022] [Accepted: 07/22/2022] [Indexed: 05/06/2023]
Abstract
BACKGROUND AND OBJECTIVES Central cervical lymph node metastasis (CLNM) is considered a risk factor for recurrence in patients with papillary thyroid carcinoma (PTC). Traditional machine learning models suffered from "black-box" problems, which could not exactly explain the interactive effects of the risk factors. We aimed to develop an eXtreme Gradient Boosting (XGBoost) model to assess CLNM, including positive and negative effects. METHODS 1,122 patients with PTC admitted at Tianjin First Central Hospital from 2016 to 2020 were retrospectively selected. They were randomly divided into the training and test datasets with an 8:2 ratio. 108 patients with PTC admitted at Binzhou Medical University Hospital in 2020 served as the validation dataset. The XGBoost model was used to assess CLNM. The 10-fold cross-validation was utilized for model selection, and the metric used to evaluate classification performance was the average area under the curve (AUC) of 10-fold cross-validation. Interpretation and transparency of the "black-box" problem were performed. SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanation (LIME) were used to ensure the stability and reliability of the model. RESULTS The XGBoost model based on ultrasound and dual-energy computed tomography images of the solitary primary lesion had an excellent performance for assessing CLNM, with average AUCs of 0.918, 0.903, and 0.881 in the training, test, and validation datasets, respectively. SHAP plots showed the influence of each parameter on the XGBoost model, including positive (i.e., capsular invasion, diameter, iodine concentration in the venous phase, and calcification) and negative (i.e., sex and age) impacts. For all cases, the capsular invasion prediction weight was the highest; for individual cases, different predictors were assigned different weights. Moreover, the performance of the XGBoost model was better than classical machine-learning models. 
CONCLUSIONS This study developed and validated an XGBoost model for assessing CLNM in patients with PTC. The ability to visually interpret the positive and negative effects made the XGBoost model an effective tool for guiding clinical treatment.
Affiliation(s)
- Ying Zou
- Department of Radiology, First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China; Department of Radiology, National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China
| | - Yan Shi
- Department of Ultrasonography, Binzhou Medical University Hospital, No. 661 Huanghe 2nd Road, Binzhou City, Shandong 256603, China
| | - Fang Sun
- Department of Ultrasonography, Binzhou Medical University Hospital, No. 661 Huanghe 2nd Road, Binzhou City, Shandong 256603, China
| | - Jihua Liu
- Department of Radiology, First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China; Department of Radiology, National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China
| | - Yu Guo
- Department of Radiology, Tianjin First Central Hospital, School of Medicine, Nankai University, No.24 Fukang Road, Nankai District, Tianjin 300192, China
| | - Huanlei Zhang
- Department of Radiologist, Yidu central hospital of Weifang, No. 4138 LingLongShan nan Road, Qing Zhou City, Shandong, 262500, China
| | - Xiudi Lu
- Department of Radiology, First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China; Department of Radiology, National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, No. 314 Anshan West Road, Nan Kai District, Tianjin 300193, China
| | - Yan Gong
- Department of Radiology, Tianjin Hospital of ITCWM Nan Kai Hospital, No.6 Changjiang Road, Nan Kai District, Tianjin 300100, China
| | - Shuang Xia
- Department of Radiology, Tianjin First Central Hospital, School of Medicine, Nankai University, No.24 Fukang Road, Nankai District, Tianjin 300192, China.
|
28
|
Maestre-Muñiz MM, Arias Á, Lucendo AJ. Predicting In-Hospital Mortality in Severe COVID-19: A Systematic Review and External Validation of Clinical Prediction Rules. Biomedicines 2022; 10:biomedicines10102414. [PMID: 36289676 PMCID: PMC9599062 DOI: 10.3390/biomedicines10102414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2022] [Revised: 09/02/2022] [Accepted: 09/05/2022] [Indexed: 12/03/2022]
Abstract
Multiple prediction models for risk of in-hospital mortality from COVID-19 have been developed but not applied to patient cohorts different from those from which they were derived. The MEDLINE, EMBASE, Scopus, and Web of Science (WOS) databases were searched. Risk of bias and applicability were assessed with PROBAST. Nomograms whose variables were available in a well-defined cohort of 444 patients from our site were externally validated. Overall, 71 studies that derived a clinical prediction rule for mortality outcome from COVID-19 were identified. Predictive variables consisted of combinations of patients' age, chronic conditions, dyspnea/tachypnea, radiographic chest alteration, and analytical values (LDH, CRP, lymphocytes, D-dimer); and markers of respiratory, renal, liver, and myocardial damage, which were major predictors in several nomograms. Twenty-five models could be externally validated. Areas under the receiver operating characteristic curve (AUROC) in predicting mortality ranged from 0.71 to 1 in derivation cohorts; C-index values ranged from 0.823 to 0.970. Overall, 37/71 models provided very-good-to-outstanding test performance. Externally validated nomograms provided lower predictive performances for mortality than in their respective derivation cohorts, with the AUROC being 0.654 to 0.806 (poor to acceptable performance). We can conclude that available nomograms were limited in predicting mortality when applied to populations different from those from which they were derived.
Affiliation(s)
- Modesto M. Maestre-Muñiz
- Department of Internal Medicine, Hospital General de Tomelloso, 13700 Ciudad Real, Spain
- Department of Medicine and Medical Specialties, Universidad de Alcalá, 28801 Alcalá de Henares, Spain
| | - Ángel Arias
- Hospital General La Mancha Centro, Research Unit, Alcázar de San Juan, 13600 Ciudad Real, Spain
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28006 Madrid, Spain
- Instituto de Investigación Sanitaria La Princesa, 28006 Madrid, Spain
- Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 13700 Tomelloso, Spain
| | - Alfredo J. Lucendo
- Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas (CIBERehd), Instituto de Salud Carlos III, 28006 Madrid, Spain
- Instituto de Investigación Sanitaria La Princesa, 28006 Madrid, Spain
- Instituto de Investigación Sanitaria de Castilla-La Mancha (IDISCAM), 13700 Tomelloso, Spain
- Department of Gastroenterology, Hospital General de Tomelloso, 13700 Ciudad Real, Spain
- Correspondence: Tel.: +34-926-525-927
|
29
|
Wu M, Zhao Y, Dong X, Jin Y, Cheng S, Zhang N, Xu S, Gu S, Wu Y, Yang J, Yao L, Wang Y. Artificial intelligence-based preoperative prediction system for diagnosis and prognosis in epithelial ovarian cancer: A multicenter study. Front Oncol 2022; 12:975703. [PMID: 36212430 PMCID: PMC9532858 DOI: 10.3389/fonc.2022.975703] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/22/2022] [Accepted: 08/11/2022] [Indexed: 11/13/2022]
Abstract
Background Ovarian cancer (OC) is the most lethal gynecological malignancy, with limited early screening methods and poor prognosis. Artificial intelligence technology has made a great breakthrough in cancer diagnosis. Purpose We aim to develop a specific interpretable machine learning (ML) prediction model for the diagnosis and prognosis of epithelial ovarian cancer (EOC) based on a variety of biomarkers. Methods A total of 521 patients with EOC and 144 patients with benign gynecological diseases were enrolled, comprising derivation datasets and an external validation cohort. Predictions were obtained from 9 supervised ML methods using 34 parameters. The reasons behind the predictions of the best-performing ML model were explained using the SHapley Additive exPlanations (SHAP) algorithm. In addition, the prognosis of EOC was analyzed by unsupervised clustering and Kaplan–Meier (KM) survival analysis. Results ML technology was superior to conventional logistic regression in predicting EOC diagnosis, and XGBoost performed best in the external validation datasets. The AUC values for distinguishing EOC from benign disease and for determining pathological type, grade, and clinical stage were 0.958 (0.926-0.989), 0.792 (0.701-0.8834), 0.819 (0.687-0.950) and 0.68 (0.573-0.788), respectively. For CA-125-negative EOC patients, the AUC of the XGBoost model was 0.835 (0.763-0.907). We used unsupervised cluster analysis to identify EOC subgroups with significantly poor overall survival (p-value <0.0001) and recurrence-free survival (p-value <0.0001). Conclusions Based on preoperative characteristics, we showed that ML algorithms can provide an acceptable diagnosis and prognosis prediction model for EOC patients. Meanwhile, SHAP analysis can improve the interpretability of ML models and contribute to precision medicine.
Affiliation(s)
- Meixuan Wu
- Department of Obstetrics and Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Yaqian Zhao
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Xuhui Dong
- Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
| | - Yue Jin
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Shanshan Cheng
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Nan Zhang
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Shilin Xu
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Sijia Gu
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Yongsong Wu
- Department of Obstetrics and Gynecology, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Jiani Yang
- Department of Obstetrics and Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- *Correspondence: Yu Wang; Liangqing Yao; Jiani Yang
| | - Liangqing Yao
- Obstetrics and Gynecology Hospital, Fudan University, Shanghai, China
- *Correspondence: Yu Wang; Liangqing Yao; Jiani Yang
| | - Yu Wang
- Department of Obstetrics and Gynecology, Shanghai First Maternity and Infant Hospital, School of Medicine, Tongji University, Shanghai, China
- *Correspondence: Yu Wang; Liangqing Yao; Jiani Yang
|
30
|
LASSO Model Better Predicted the Prognosis of DLBCL than Random Forest Model: A Retrospective Multicenter Analysis of HHLWG. JOURNAL OF ONCOLOGY 2022; 2022:1618272. [PMID: 36157230 PMCID: PMC9507678 DOI: 10.1155/2022/1618272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2022] [Accepted: 08/26/2022] [Indexed: 11/17/2022]
Abstract
Background. Diffuse large B-cell lymphoma (DLBCL) is a heterogeneous non-Hodgkin’s lymphoma with great clinical challenge. Machine learning (ML) has attracted substantial attention in diagnosis, prognosis, and treatment of diseases. This study is aimed at exploring the prognostic factors of DLBCL by ML. Methods. In total, 1211 DLBCL patients were retrieved from Huaihai Lymphoma Working Group (HHLWG). The least absolute shrinkage and selection operator (LASSO) and random forest algorithm were used to identify prognostic factors for the overall survival (OS) rate of DLBCL among twenty-five variables. Receiver operating characteristic (ROC) curve and decision curve analysis (DCA) were utilized to compare the predictive performance and clinical effectiveness of the two models, respectively. Results. The median follow-up time was 43.4 months, and the 5-year OS was 58.5%. The LASSO model achieved an Area under the curve (AUC) of 75.8% for the prognosis of DLBCL, which was higher than that of the random forest model (AUC: 71.6%). DCA analysis also revealed that the LASSO model could augment net benefits and exhibited a wider range of threshold probabilities by risk stratification than the random forest model. In addition, multivariable analysis demonstrated that age, white blood cell count, hemoglobin, central nervous system involvement, gender, and Ann Arbor stage were independent prognostic factors for DLBCL. The LASSO model showed better discrimination of outcomes compared with the IPI and NCCN-IPI models and identified three groups of patients: low risk, high-intermediate risk, and high risk. Conclusions. The prognostic model of DLBCL based on the LASSO regression was more accurate than the random forest, IPI, and NCCN-IPI models.
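The decision curve analysis (DCA) used above to compare the two models rests on the net-benefit quantity, which the following sketch computes at a single threshold probability. It uses the standard net-benefit definition (true positives per patient minus false positives per patient weighted by the threshold odds); the function names and the ten toy predictions are assumptions for illustration and are not data from the study.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # A false positive is weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy in a decision curve: treat everyone regardless of predicted risk."""
    return net_benefit(labels, [1.0] * len(labels), threshold)


# Toy predictions (illustrative only): 4 events among 10 patients.
toy_labels = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
toy_probs = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.05, 0.8, 0.55]

for pt in (0.2, 0.5, 0.7):
    model_nb = net_benefit(toy_labels, toy_probs, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"threshold {pt:.1f}: model {model_nb:+.3f}, treat-all {all_nb:+.3f}, treat-none +0.000")
```

Plotting these values over a range of thresholds gives the decision curve; a model is preferable at thresholds where its curve lies above both the treat-all and treat-none references.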
|
31
|
Caires Silveira E, Mattos Pretti S, Santos BA, Santos Corrêa CF, Madureira Silva L, Freire de Melo F. Prediction of hospital mortality in intensive care unit patients from clinical and laboratory data: A machine learning approach. World J Crit Care Med 2022; 11:317-329. [PMID: 36160934 PMCID: PMC9483004 DOI: 10.5492/wjccm.v11.i5.317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2021] [Revised: 08/13/2021] [Accepted: 07/06/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND Intensive care unit (ICU) patients demand continuous monitoring of several clinical and laboratory parameters that directly influence their medical progress and the staff’s decision-making. Those data are vital in the assistance of these patients, being already used by several scoring systems. In this context, machine learning approaches have been used for medical predictions based on clinical data, which includes patient outcomes.
AIM To develop a binary classifier for the outcome of death in ICU patients based on clinical and laboratory parameters. A set of 1087 instances and 50 variables from ICU patients admitted to the emergency department was obtained from the “WiDS (Women in Data Science) Datathon 2020: ICU Mortality Prediction” dataset.
METHODS For categorical variables, frequencies and risk ratios were calculated. Numerical variables were computed as means and standard deviations and Mann-Whitney U tests were performed. We then divided the data into a training (80%) and test (20%) set. The training set was used to train a predictive model based on the Random Forest algorithm and the test set was used to evaluate the predictive effectiveness of the model.
RESULTS A statistically significant association was identified between need for intubation, as well as predominant systemic cardiovascular involvement, and hospital death. A number of the numerical variables analyzed (for instance Glasgow Coma Scale scores, mean arterial pressure, temperature, pH, and lactate, creatinine, albumin and bilirubin values) were also significantly associated with death outcome. The proposed binary Random Forest classifier obtained on the test set (n = 218) had an accuracy of 80.28%, sensitivity of 81.82%, specificity of 79.43%, positive predictive value of 73.26%, negative predictive value of 84.85%, F1 score of 0.74, and area under the curve score of 0.85. The predictive variables of the greatest importance were the maximum and minimum lactate values, adding up to a predictive importance of 15.54%.
CONCLUSION We demonstrated the efficacy of a Random Forest machine learning algorithm for handling clinical and laboratory data from patients under intensive monitoring. Therefore, we endorse the emerging notion that machine learning has great potential to provide us support to critically question existing methodologies, allowing improvements that reduce mortality.
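The test-set metrics reported above (accuracy, sensitivity, specificity, positive and negative predictive value, F1 score) all derive from the four cells of a binary confusion matrix. The sketch below shows those definitions in plain Python. The counts passed in are hypothetical and chosen only for easy arithmetic; they are not the study's confusion matrix, which the abstract does not give.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary metrics from confusion-matrix counts (death = positive class)."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)   # recall: share of actual positives detected
    specificity = tn / (tn + fp)   # share of actual negatives correctly cleared
    ppv = tp / (tp + fp)           # precision: share of positive calls that are right
    npv = tn / (tn + fn)           # share of negative calls that are right
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": 2 * ppv * sensitivity / (ppv + sensitivity),
    }


# Hypothetical counts for illustration (100 patients, 50 of whom died).
example = classification_metrics(tp=30, fp=10, fn=20, tn=40)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are conditioned on the true outcome, whereas the predictive values are conditioned on the prediction, so the latter shift with outcome prevalence in the test set.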
Affiliation(s)
- Elena Caires Silveira
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
| | - Soraya Mattos Pretti
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
| | - Bruna Almeida Santos
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
| | - Caio Fellipe Santos Corrêa
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
| | - Leonardo Madureira Silva
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
| | - Fabrício Freire de Melo
- Multidisciplinary Institute of Health, Federal University of Bahia, Vitória da Conquista 45-029094, Brazil
|
32
|
Shi Y, Zou Y, Liu J, Wang Y, Chen Y, Sun F, Yang Z, Cui G, Zhu X, Cui X, Liu F. Ultrasound-based radiomics XGBoost model to assess the risk of central cervical lymph node metastasis in patients with papillary thyroid carcinoma: Individual application of SHAP. Front Oncol 2022; 12:897596. [PMID: 36091102 PMCID: PMC9458917 DOI: 10.3389/fonc.2022.897596] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/16/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022]
Abstract
Objectives A radiomics-based explainable eXtreme Gradient Boosting (XGBoost) model was developed to predict central cervical lymph node metastasis (CCLNM) in patients with papillary thyroid carcinoma (PTC), including positive and negative effects. Methods A total of 587 PTC patients admitted at Binzhou Medical University Hospital from 2017 to 2021 were analyzed retrospectively. The patients were randomized into the training and test cohorts with an 8:2 ratio. Radiomics features were extracted from ultrasound images of the primary PTC lesions. The minimum redundancy maximum relevance algorithm and the least absolute shrinkage and selection operator regression were used to select CCLNM positively-related features and radiomics scores were constructed. Clinical features, ultrasound features, and radiomics score were screened out by the Boruta algorithm, and the XGBoost model was constructed from these characteristics. SHapley Additive exPlanations (SHAP) was used for individualized and visualized interpretation. SHAP addressed the cognitive opacity of machine learning models. Results Eleven radiomics features were used to calculate the radiomics score. Five critical elements were used to build the XGBoost model: capsular invasion, radiomics score, diameter, age, and calcification. The area under the curve was 91.53% and 90.88% in the training and test cohorts, respectively. SHAP plots showed the influence of each parameter on the XGBoost model, including positive (i.e., capsular invasion, radiomics score, diameter, and calcification) and negative (i.e., age) impacts. The XGBoost model outperformed the radiologist, increasing the AUC by 44%. Conclusions The radiomics-based XGBoost model predicted CCLNM in PTC patients. Visual interpretation using SHAP made the model an effective tool for preoperative guidance of clinical procedures, including positive and negative impacts.
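The positive and negative per-feature impacts that SHAP reports are Shapley values of the model's prediction. The sketch below computes exact Shapley values by enumerating feature subsets for a tiny hand-written function. It is an illustration of the underlying idea only: the three-feature `toy_model`, the single all-zero baseline, and the input are assumptions, and the SHAP library itself uses background datasets and faster tree-specific algorithms rather than this brute-force enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline) to each feature."""
    n = len(x)

    def value(present):
        # Features in `present` take their observed value; the rest stay at the baseline.
        return predict([x[i] if i in present else baseline[i] for i in range(n)])

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi += weight * (value(present | {i}) - value(present))
        contributions.append(phi)
    return contributions


def toy_model(z):
    """Hypothetical risk score: feature 0 raises it, feature 1 lowers it, 0 and 2 interact."""
    return 2.0 * z[0] - 1.0 * z[1] + 0.5 * z[0] * z[2]


phi = shapley_values(toy_model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0])
print([round(p, 3) for p in phi])
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the sign of each value indicates whether that feature pushed this individual prediction up or down.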
Affiliation(s)
- Yan Shi
- Binzhou Medical University Hospital, Binzhou, China
| | - Ying Zou
- First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, Tianjin, China
- National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, Tianjin, China
| | - Jihua Liu
- First Teaching Hospital of Tianjin University of Traditional Chinese Medicine, Tianjin, China
- National Clinical Research Center for Chinese Medicine Acupuncture and Moxibustion, Tianjin, China
| | | | | | - Fang Sun
- Binzhou Medical University Hospital, Binzhou, China
| | - Zhi Yang
- Binzhou Medical University Hospital, Binzhou, China
| | - Guanghe Cui
- Binzhou Medical University Hospital, Binzhou, China
| | - Xijun Zhu
- Binzhou Medical University Hospital, Binzhou, China
| | - Xu Cui
- Binzhou Medical University Hospital, Binzhou, China
| | - Feifei Liu
- Binzhou Medical University Hospital, Binzhou, China
- Peking University People’s Hospital, Beijing, China
- *Correspondence: Feifei Liu
|
33
|
Nopour R, Shanbehzadeh M, Kazemi-Arpanahi H. Predicting the Need for Intubation among COVID-19 Patients Using Machine Learning Algorithms: A Single-Center Study. Med J Islam Repub Iran 2022; 36:30. [PMID: 35999913 PMCID: PMC9386770 DOI: 10.47176/mjiri.36.30] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/07/2021] [Accepted: 04/04/2022] [Indexed: 12/15/2022]
Abstract
Background: Owing to the shortage of ventilators, there is a crucial demand for an objective and accurate prognosis for critical 2019 coronavirus disease (COVID-19) patients who may require a mechanical ventilator (MV). This study aimed to construct a predictive model using machine learning (ML) algorithms for frontline clinicians to better triage endangered patients and prioritize those who would need MV.
Methods: In this retrospective single-center study, the data of 482 COVID-19 patients from February 9, 2020, to December 20, 2020, were analyzed by several ML algorithms, including multi-layer perceptron (MLP), logistic regression (LR), J-48 decision tree, and Naïve Bayes (NB). First, the most important clinical variables were identified using the Chi-square test at P < 0.01. Then, by comparing the ML algorithms' performance using evaluation criteria including TP-Rate, FP-Rate, precision, recall, F-Score, MCC, and Kappa, the best performing one was identified.
Results: Predictive models were trained using 15 validated features, including cough, contusion, oxygen therapy, dyspnea, loss of taste, rhinorrhea, blood pressure, absolute lymphocyte count, pleural fluid, activated partial thromboplastin time, blood glucose, white cell count, cardiac diseases, length of hospitalization, and other underlying diseases. The results indicated that the J-48, with F-score = 0.868 and AUC = 0.892, yielded the best performance for predicting intubation requirement.
Conclusion: ML algorithms have the potential to improve on traditional clinical criteria for forecasting the necessity of intubation in hospitalized COVID-19 patients. Such ML-based prediction models may help physicians optimize the timing of intubation, better allocate MV resources and personnel, and improve patient clinical status.
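Two of the evaluation criteria named in the Methods, the Matthews correlation coefficient (MCC) and Cohen's kappa, are less familiar than precision and recall. The sketch below computes both from binary confusion-matrix counts using their standard definitions. The function names and the counts are hypothetical, chosen for easy arithmetic, and are not taken from the study.

```python
from math import sqrt


def matthews_corrcoef(tp, fp, fn, tn):
    """MCC: correlation between predicted and actual binary labels, in [-1, 1]."""
    denominator = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when any row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def cohen_kappa(tp, fp, fn, tn):
    """Kappa: observed agreement corrected for the agreement expected by chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of predictions and true labels.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # degenerate case: only one class appears in both margins
    return (observed - expected) / (1.0 - expected)


# Hypothetical counts for illustration (100 patients, 50 of whom needed intubation).
counts = dict(tp=30, fp=10, fn=20, tn=40)
print(f"MCC:   {matthews_corrcoef(**counts):.4f}")
print(f"Kappa: {cohen_kappa(**counts):.4f}")
```

Both statistics equal 1 for perfect agreement and sit near 0 for a classifier that does no better than chance, which makes them more informative than raw accuracy when the classes are imbalanced.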
Affiliation(s)
- Raoof Nopour
- Student Research Committee, School of Health Management and Information Sciences Branch, Iran University of Medical Sciences, Tehran, Iran
| | - Mostafa Shanbehzadeh
- Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
| | - Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran; Department of Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran
|
34
|
Zhang G, Shi Y, Yin P, Liu F, Fang Y, Li X, Zhang Q, Zhang Z. A machine learning model based on ultrasound image features to assess the risk of sentinel lymph node metastasis in breast cancer patients: Applications of scikit-learn and SHAP. Front Oncol 2022; 12:944569. [PMID: 35957890 PMCID: PMC9359803 DOI: 10.3389/fonc.2022.944569] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/15/2022] [Accepted: 07/01/2022] [Indexed: 11/16/2022]
Abstract
Background This study aimed to determine an optimal machine learning (ML) model for evaluating the preoperative diagnostic value of ultrasound signs of breast cancer lesions for sentinel lymph node (SLN) status. Method This study retrospectively analyzed the ultrasound images and postoperative pathological findings of lesions in 952 breast cancer patients. First, univariate analysis was used to examine the relationship between the ultrasonographic morphological features of breast cancer and SLN metastasis. Then, based on the ultrasound signs of breast cancer lesions, we screened ten ML models: support vector machine (SVM), extreme gradient boosting (XGBoost), random forest (RF), linear discriminant analysis (LDA), logistic regression (LR), naive Bayes (NB), k-nearest neighbors (KNN), multilayer perceptron (MLP), long short-term memory (LSTM), and convolutional neural network (CNN). The diagnostic performance of the models was evaluated using the area under the receiver operating characteristic (ROC) curve (AUC), Kappa value, accuracy, F1-score, sensitivity, and specificity. We then constructed a clinical prediction model based on the ML algorithm with the best diagnostic performance. Finally, we used SHapley Additive exPlanation (SHAP) to visualize and analyze the diagnostic process of the ML model. Results Of 952 patients with breast cancer, 394 (41.4%) had SLN metastasis, and 558 (58.6%) had no metastasis. Univariate analysis found that the shape, orientation, margin, posterior features, calcifications, architectural distortion, duct changes and suspicious lymph node of breast cancer lesions in ultrasound signs were associated with SLN metastasis. Among the 10 ML algorithms, XGBoost had the best comprehensive diagnostic performance for SLN metastasis, with Average-AUC of 0.952, Average-Kappa of 0.763, and Average-Accuracy of 0.891. 
The AUC of the XGBoost model in the validation cohort was 0.916, the accuracy was 0.846, the sensitivity was 0.870, the specificity was 0.862, and the F1-score was 0.826. The diagnostic performance of the XGBoost model was significantly higher than that of experienced radiologists in some cases (P<0.001). Using SHAP to visualize the interpretation of the ML model screen, it was found that the ultrasonic detection of suspicious lymph nodes, microcalcifications in the primary tumor, burrs on the edge of the primary tumor, and distortion of the tissue structure around the lesion contributed greatly to the diagnostic performance of the XGBoost model. Conclusions The XGBoost model based on the ultrasound signs of the primary breast tumor and its surrounding tissues and lymph nodes has a high diagnostic performance for predicting SLN metastasis. Visual explanation using SHAP made it an effective tool for guiding clinical courses preoperatively.
Affiliation(s)
- Gaosen Zhang
- Department of Ultrasound, First Affiliated Hospital of China Medical University, Shenyang, China
| | - Yan Shi
- Department of Ultrasound, Binzhou Medical University Hospital, Binzhou, China
| | - Peipei Yin
- Department of Ultrasound, Binzhou Medical University Hospital, Binzhou, China
| | - Feifei Liu
- Department of Ultrasound Medicine, Peking University People’s Hospital, Beijing, China
| | - Yi Fang
- Department of Ultrasound, First Affiliated Hospital of China Medical University, Shenyang, China
| | - Xiang Li
- Department of Ultrasound, First Affiliated Hospital of China Medical University, Shenyang, China
| | - Qingyu Zhang
- College of Information Science and Engineering, Northeastern University, Shenyang, China
| | - Zhen Zhang
- Department of Ultrasound, First Affiliated Hospital of China Medical University, Shenyang, China
- *Correspondence: Zhen Zhang
|
35
|
Development and Validation of a Multimodal-Based Prognosis and Intervention Prediction Model for COVID-19 Patients in a Multicenter Cohort. SENSORS 2022; 22:s22135007. [PMID: 35808502 PMCID: PMC9269794 DOI: 10.3390/s22135007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Revised: 06/29/2022] [Accepted: 06/29/2022] [Indexed: 02/04/2023]
Abstract
The ability to accurately predict the prognosis and intervention requirements for treating highly infectious diseases, such as COVID-19, can greatly support the effective management of patients, especially in resource-limited settings. The aim of the study is to develop and validate a multimodal artificial intelligence (AI) system using clinical findings, laboratory data and AI-interpreted features of chest X-rays (CXRs), and to predict the prognosis and the required interventions for patients diagnosed with COVID-19, using multi-center data. In total, 2282 real-time reverse transcriptase polymerase chain reaction-confirmed COVID-19 patients’ initial clinical findings, laboratory data and CXRs were retrospectively collected from 13 medical centers in South Korea, between January 2020 and June 2021. The prognostic outcomes collected included intensive care unit (ICU) admission and in-hospital mortality. Intervention outcomes included the use of oxygen (O2) supplementation, mechanical ventilation and extracorporeal membrane oxygenation (ECMO). A deep learning algorithm detecting 10 common CXR abnormalities (DLAD-10) was used to infer the initial CXR taken. A random forest model with a quantile classifier was used to predict the prognostic and intervention outcomes, using multimodal data. The area under the receiver operating curve (AUROC) values for the single-modal model, using clinical findings, laboratory data and the outputs from DLAD-10, were 0.742 (95% confidence interval [CI], 0.696−0.788), 0.794 (0.745−0.843) and 0.770 (0.724−0.815), respectively. The AUROC of the combined model, using clinical findings, laboratory data and DLAD-10 outputs, was significantly higher at 0.854 (0.820−0.889) than that of all other models (p < 0.001, using DeLong’s test). In the order of importance, age, dyspnea, consolidation and fever were significant clinical variables for prediction. The most predictive DLAD-10 output was consolidation. 
We have shown that a multimodal AI model can improve the performance of predicting both the prognosis and intervention in COVID-19 patients, and this could assist in effective treatment and subsequent resource management. Further, image feature extraction using an established AI engine with well-defined clinical outputs, and combining them with different modes of clinical data, could be a useful way of creating an understandable multimodal prediction model.
|
36
|
An C, Yang H, Yu X, Han ZY, Cheng Z, Liu F, Dou J, Li B, Li Y, Li Y, Yu J, Liang P. A Machine Learning Model Based on Health Records for Predicting Recurrence After Microwave Ablation of Hepatocellular Carcinoma. J Hepatocell Carcinoma 2022; 9:671-684. [PMID: 35923613 PMCID: PMC9342890 DOI: 10.2147/jhc.s358197] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/19/2022] [Accepted: 07/08/2022] [Indexed: 12/24/2022]
Abstract
Background and Aim Early recurrence (ER) presents a challenge for the survival prognosis of patients with hepatocellular carcinoma (HCC). The aim of this study was to investigate machine learning (ML) models using clinical data for predicting ER after microwave ablation (MWA). Methods Between August 2005 and December 2019, 1574 patients with early-stage HCC who underwent MWA at four hospitals were reviewed. Then, 36 clinical data points per patient were collected, and the patients were assigned to the training, internal validation, and external validation sets. Apart from traditional logistic regression (LR), three ML models (random forest, support vector machine, and eXtreme Gradient Boosting [XGBoost]) were built and validated for their predictive ability with the area under the ROC curve (AUC). Algorithms such as SHapley Additive exPlanations (SHAP) and local interpretable model-agnostic explanations (LIME) were used to realize their interpretability. Results The three ML models all outperformed LR (P < 0.001 for all) in predictive ability. When nine variables (tumor number, platelet, α-fetoprotein, comorbidity score, white blood cell, cholinesterase, prothrombin time, neutrophils, and etiology) were extracted simultaneously using recursive feature elimination with cross-validation, the XGBoost model achieved the best discrimination among all models, with an AUC value of 0.75 (95% CI [confidence interval]: 0.72–0.78) in the training set, 0.74 (95% CI: 0.69–0.80) in the internal validation set, and 0.76 (95% CI: 0.70–0.82) in the external validation set, and it was interpreted through visualization of risk factors by the SHAP and LIME algorithms. A predictive system for post-ablation recurrence risk stratification based on the XGBoost analysis was provided online (http://114.251.235.51:8001/). Conclusion The XGBoost model based on clinical data can effectively predict ER risk after MWA, which can contribute to surveillance, prevention, and treatment strategies for HCC.
Affiliation(s)
- Chao An
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Hongcai Yang
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
- School of Medicine, Nankai University, Tianjin, People’s Republic of China
| | - Xiaoling Yu
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Zhi-Yu Han
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Zhigang Cheng
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Fangyi Liu
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Jianping Dou
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Bing Li
- National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, People’s Republic of China
| | - Yansheng Li
- DHC Mediway Technology CO, Ltd, Beijing, People’s Republic of China
| | - Yichao Li
- DHC Mediway Technology CO, Ltd, Beijing, People’s Republic of China
| | - Jie Yu
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
| | - Ping Liang
- Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China
- Correspondence: Ping Liang; Jie Yu, Department of Ultrasound, PLA Medical College & 5th Medical Center of Chinese PLA General Hospital, Beijing, 100853, People’s Republic of China, Tel +86-10-66939530, Fax +86-10-68161218
| |
|
37
|
Machine Learning Models to Predict In-Hospital Mortality among Inpatients with COVID-19: Underestimation and Overestimation Bias Analysis in Subgroup Populations. JOURNAL OF HEALTHCARE ENGINEERING 2022; 2022:1644910. [PMID: 35756093 PMCID: PMC9226971 DOI: 10.1155/2022/1644910] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/01/2022] [Revised: 04/17/2022] [Accepted: 05/22/2022] [Indexed: 12/13/2022]
Abstract
Predicting death among COVID-19 patients can help healthcare providers manage these patients better. We aimed to develop machine learning models to predict in-hospital death among these patients. We developed different models using different feature sets and datasets developed using the data balancing method. We used demographic and clinical data from a multicenter COVID-19 registry. We extracted 10,657 records for patients confirmed by PCR or CT scan who were hospitalized for at least 24 hours at the end of March 2021. The death rate was 16.06%. Generally, models with 60 and 40 features performed better. Among the 240 models, the C5 models with 60 and 40 features performed well. The C5 model with 60 features outperformed the rest based on all evaluation metrics; however, in external validation, C5 with 32 features performed better. This model had high accuracy (91.18%), F-score (0.916), area under the curve (0.96), sensitivity (94.2%), and specificity (88%). The model suggested in this study uses simple and available data and can be applied to predict death among COVID-19 patients. Furthermore, we concluded that machine learning models may perform differently in different subpopulations in terms of gender and age groups.
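The accuracy, F-score, sensitivity, and specificity quoted in this abstract all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name `binary_metrics` and the counts are hypothetical, and the F-score is assumed to be the usual F1 (the abstract does not say which variant was used).

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # recall: share of deaths correctly flagged
    specificity = tn / (tn + fp)   # share of survivors correctly cleared
    precision = tp / (tp + fp)     # share of flagged patients who died
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, not the study's confusion matrix.
metrics = binary_metrics(tp=90, fp=20, tn=80, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a death rate of 16.06%, a model that predicted survival for every patient would already reach about 84% accuracy, which is why sensitivity and specificity are worth reading alongside it.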
|
38
|
Mohammadi Z, Faghih Dinevari M, Vahed N, Ebrahimi Bakhtavar H, Rahmani F. Clinical and Laboratory Predictors of COVID-19-Related In-hospital Mortality; a Cross-sectional Study of 1000 Cases. ARCHIVES OF ACADEMIC EMERGENCY MEDICINE 2022; 10:e49. [PMID: 36033996 PMCID: PMC9397590 DOI: 10.22037/aaem.v10i1.1574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
INTRODUCTION Identifying patients at risk for mortality and using appropriate treatment for each patient based on their situation could be an effective strategy in improving their outcome. This study aimed to evaluate the predictors of COVID-19 in-hospital mortality. METHODS This descriptive cross-sectional study was conducted on all adult COVID-19 patients who were managed in Imam-Reza and Sina Hospitals, Tabriz, Iran, from November 2020 until December 2021. The demographic, clinical, and laboratory characteristics of patients were evaluated, and predictors of in-hospital mortality were identified using a logistic regression model. RESULTS 1000 patients with a mean age of 56.34 ± 18.00 years were studied (65.7% male). There were significant associations between COVID-19 in-hospital mortality and hospitalization above five days (p = 0.001), white blood cell count (WBC) > 4000 cells × 10^3/mL (p < 0.01), aspartate aminotransferase (AST) above 40 IU/L (p = 0.001), alanine transaminase (ALT) above 40 IU/L (p = 0.001), creatinine above 1.4 mg/dL (p = 0.007), urea above 100 mg/dL (p = 0.024), and SaO2 below 80% (p = 0.001). Hospital stay above five days (OR: 3.473; 95%CI: 1.272 - 9.479; p = 0.15), AST above 40 IU/L (OR: 0.269, 95%CI: 0.179 - 0.402; p = 0.001), creatinine above 1.4 mg/dL (OR: 0.529; 95%CI: 0.344 - 0.813; p = 0.004), urea above 100 mg/dL (OR: 0.327, 95%CI: 0.189 - 0.567; p = 0.001), and SaO2 below 80% (OR: 8.754, 95%CI: 5.413 - 14.156; p = 0.001) were among the independent predictors of COVID-19 in-hospital mortality. CONCLUSION The mortality rate of patients with COVID-19 in our study was 29.9%. Hospitalization of more than five days, AST above 40 IU/L, creatinine above 1.4 mg/dL, urea above 100 mg/dL and SaO2 < 80% were independent risk factors of in-hospital mortality among patients with COVID-19.
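Odds ratios from a logistic regression are exponentiated coefficients, so a reported OR and its 95% CI can be mapped back to the log-odds scale. The sketch below does that under one explicit assumption that the abstract does not confirm: that the intervals are Wald intervals of the form exp(beta ± 1.96 × SE). The helper name `wald_from_or_ci` is hypothetical. Under that assumption, an interval that excludes 1 should pair with p < 0.05, which gives a quick consistency check on reported values.

```python
import math


def wald_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, its standard error, the Wald z and the two-sided p-value.

    Assumes (not confirmed by the source) that the 95% CI is a Wald
    interval: exp(beta - z_crit * se) to exp(beta + z_crit * se).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p


# Odds ratios and intervals as reported in the abstract above.
reported = [
    ("hospital stay > 5 days", 3.473, 1.272, 9.479),
    ("SaO2 < 80%", 8.754, 5.413, 14.156),
]
for name, odds_ratio, low, high in reported:
    beta, se, z, p = wald_from_or_ci(odds_ratio, low, high)
    print(f"{name}: beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.3g}")
```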
Affiliation(s)
- Zohreh Mohammadi
- Emergency and Trauma Care Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
| | | | | | | | - Farzad Rahmani
- Liver and Gastrointestinal Diseases Research Center, Tabriz University of Medical Sciences, Tabriz, Iran. Corresponding Author: Farzad Rahmani; Emam Reza Medical Research and Training Hospital, Tabriz University of Medical Sciences, Tabriz, Iran. Tel: 00984133352078, Fax: 00984133352078, ORCID: 0000-0001-5582-9156
| |
|
39
|
Zheng Y, Guo Z, Zhang Y, Shang J, Yu L, Fu P, Liu Y, Li X, Wang H, Ren L, Zhang W, Hou H, Tan X, Wang W. Rapid triage for ischemic stroke: a machine learning-driven approach in the context of predictive, preventive and personalised medicine. EPMA J 2022; 13:285-298. [PMID: 35719136 PMCID: PMC9203613 DOI: 10.1007/s13167-022-00283-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/27/2022] [Accepted: 05/09/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND Recognising the early signs of ischemic stroke (IS) in emergency settings has been challenging. Machine learning (ML), a robust tool for predictive, preventive and personalised medicine (PPPM/3PM), presents a possible solution for this issue and produces accurate predictions for real-time data processing. METHODS This investigation evaluated 4999 IS patients among a total of 10,476 adults included in the initial dataset, and 1076 IS subjects among 3935 participants in the external validation dataset. Six ML-based models for the prediction of IS were trained on the initial dataset of 10,476 participants (split into a training set [80%] and an internal validation set [20%]). Selected clinical laboratory features routinely assessed at admission were used to inform the models. Model performance was mainly evaluated by the area under the receiver operating characteristic curve (AUC). Additional techniques, namely permutation feature importance (PFI), local interpretable model-agnostic explanations (LIME), and SHapley Additive exPlanations (SHAP), were applied to explain the black-box ML models. RESULTS Fifteen routine haematological and biochemical features were selected to establish ML-based models for the prediction of IS. The XGBoost-based model achieved the highest predictive performance, reaching AUCs of 0.91 (0.90-0.92) and 0.92 (0.91-0.93) in the internal and external datasets respectively. PFI revealed that, globally, age (a demographic feature), haemoglobin and neutrophil count (routine haematological parameters), and total protein and high-density lipoprotein cholesterol (biochemical analytes) were more influential on the model's prediction. LIME and SHAP showed similar local feature attribution explanations. CONCLUSION In the context of PPPM/3PM, we used the selected predictors obtained from the results of common blood tests to develop and validate ML-based models for the diagnosis of IS. The XGBoost-based model offers the most accurate prediction. 
By incorporating the individualised patient profile, this prediction tool is simple and quick to administer. This is promising to support subjective decision making in resource-limited settings or primary care, thereby shortening the time window for the treatment, and improving outcomes after IS. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s13167-022-00283-4.
Affiliation(s)
- Yulu Zheng
- Centre for Precision Health, Edith Cowan University, 270 Joondalup Drive, Joondalup, 6027 Western Australia, Australia
| | - Zheng Guo
- Centre for Precision Health, Edith Cowan University, 270 Joondalup Drive, Joondalup, 6027 Western Australia, Australia
| | - Yanbo Zhang
- The Second Affiliated Hospital of Shandong First Medical University, Tai’an, Shandong China
| | | | - Leilei Yu
- Tai’an City Central Hospital, Tai’an, Shandong China
| | - Ping Fu
- Ti’men Township Central Hospital, Tai’an, Shandong China
| | - Yizhi Liu
- School of Public Health, Shandong First Medical University & Shandong Academy of Medical Sciences, 619 Changcheng Road, Tai’an, 271016 Shandong China
| | - Xingang Li
- Centre for Precision Health, Edith Cowan University, 270 Joondalup Drive, Joondalup, 6027 Western Australia, Australia
| | - Hao Wang
- Department of Clinical Epidemiology and Evidence-Based Medicine, National Clinical Research Centre for Digestive Disease, Beijing Friendship Hospital, Capital Medical University, Beijing, China
- Beijing Key Laboratory of Clinical Epidemiology, School of Public Health, Capital Medical University, Beijing, China
| | - Ling Ren
- Beijing United Family Hospital, No.2 Jiangtai Road, Chaoyang District, Beijing, China
| | - Wei Zhang
- Centre for Cognitive Neurology, Department of Neurology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
| | - Haifeng Hou
- Centre for Precision Health, Edith Cowan University, 270 Joondalup Drive, Joondalup, 6027 Western Australia, Australia
- The Second Affiliated Hospital of Shandong First Medical University, Tai’an, Shandong China
- School of Public Health, Shandong First Medical University & Shandong Academy of Medical Sciences, 619 Changcheng Road, Tai’an, 271016 Shandong China
| | - Xuerui Tan
- The First Affiliated Hospital of Shantou University Medical College, Shantou, Guangdong China
| | - Wei Wang
- Centre for Precision Health, Edith Cowan University, 270 Joondalup Drive, Joondalup, 6027 Western Australia, Australia
- School of Public Health, Shandong First Medical University & Shandong Academy of Medical Sciences, 619 Changcheng Road, Tai’an, 271016 Shandong China
- Beijing Key Laboratory of Clinical Epidemiology, School of Public Health, Capital Medical University, Beijing, China
- The First Affiliated Hospital of Shantou University Medical College, Shantou, Guangdong China
- Institute for Nutrition Research, Edith Cowan University, Joondalup, WA Australia
| | | |
|
40
|
Föll S, Lison A, Maritsch M, Klingberg K, Lehmann V, Züger T, Srivastava D, Jegerlehner S, Feuerriegel S, Fleisch E, Exadaktylos A, Wortmann F. A Scalable Risk Scoring System for COVID-19 Inpatients Based on Consumer-grade Wearables: Statistical Analysis and Model Development. JMIR Form Res 2022; 6:e35717. [PMID: 35613417 PMCID: PMC9217156 DOI: 10.2196/35717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2021] [Revised: 04/06/2022] [Accepted: 05/09/2022] [Indexed: 11/13/2022] Open
Abstract
Background To provide effective care for inpatients with COVID-19, clinical practitioners need systems that monitor patient health and subsequently allow for risk scoring. Existing approaches for risk scoring in patients with COVID-19 focus primarily on intensive care units (ICUs) with specialized medical measurement devices but not on hospital general wards. Objective In this paper, we aim to develop a risk score for inpatients with COVID-19 in general wards based on consumer-grade wearables (smartwatches). Methods Patients wore consumer-grade wearables to record physiological measurements, such as the heart rate (HR), heart rate variability (HRV), and respiration frequency (RF). Based on Bayesian survival analysis, we validated the association between these measurements and patient outcomes (ie, discharge or ICU admission). To build our risk score, we generated a low-dimensional representation of the physiological features. Subsequently, a pooled ordinal regression with time-dependent covariates inferred the probability of either hospital discharge or ICU admission. We evaluated the predictive performance of our developed system for risk scoring in a single-center, prospective study based on 40 inpatients with COVID-19 in a general ward of a tertiary referral center in Switzerland. Results First, Bayesian survival analysis showed that physiological measurements from consumer-grade wearables are significantly associated with patient outcomes (ie, discharge or ICU admission). Second, our risk score achieved a time-dependent area under the receiver operating characteristic curve (AUROC) of 0.73-0.90 based on leave-one-subject-out cross-validation. Conclusions Our results demonstrate the effectiveness of consumer-grade wearables for risk scoring in inpatients with COVID-19. Due to their low cost and ease of use, consumer-grade wearables could enable a scalable monitoring system. 
Trial Registration Clinicaltrials.gov NCT04357834; https://www.clinicaltrials.gov/ct2/show/NCT04357834
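Leave-one-subject-out cross-validation, as named in this abstract, holds out every measurement from one patient per fold so that a patient never appears in both the training and the test data. The sketch below shows only the splitting step; the function name and the patient identifiers are hypothetical, and the study's own pipeline is not described in enough detail to reproduce.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical example: six measurement windows from three patients.
subject_ids = ["p01", "p01", "p02", "p03", "p03", "p03"]
folds = list(leave_one_subject_out(subject_ids))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

Splitting by subject rather than by row matters for wearable data, because repeated windows from the same patient are strongly correlated and a row-level split would leak information into the test fold.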
Affiliation(s)
- Simon Föll
- Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH
| | - Adrian Lison
- Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH
| | - Martin Maritsch
- Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH
| | - Karsten Klingberg
- Department of Emergency Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, Bern, CH
| | - Vera Lehmann
- Department of Diabetes, Endocrinology, Nutritional Medicine and Metabolism, Inselspital, Bern University Hospital, University of Bern, Bern, CH
| | - Thomas Züger
- Department of Diabetes, Endocrinology, Nutritional Medicine and Metabolism, Inselspital, Bern University Hospital, University of Bern, Bern, CH; Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH
| | - David Srivastava
- Department of Emergency Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, Bern, CH
| | - Sabrina Jegerlehner
- Department of Emergency Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, Bern, CH
| | - Stefan Feuerriegel
- Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH; Institute of AI in Management, LMU Munich, Munich, DE
| | - Elgar Fleisch
- Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH; Institute of Technology Management, University of St. Gallen, St. Gallen, CH
| | - Aristomenis Exadaktylos
- Department of Emergency Medicine, Inselspital, Bern University Hospital, University of Bern, Freiburgstrasse 16C, Bern, CH
| | - Felix Wortmann
- Institute of Technology Management, University of St. Gallen, St. Gallen, CH; Department of Management, Technology, and Economics, ETH Zürich, Zürich, CH
| |
|
41
|
Safaei N, Safaei B, Seyedekrami S, Talafidaryani M, Masoud A, Wang S, Li Q, Moqri M. E-CatBoost: An efficient machine learning framework for predicting ICU mortality using the eICU Collaborative Research Database. PLoS One 2022; 17:e0262895. [PMID: 35511882 PMCID: PMC9070907 DOI: 10.1371/journal.pone.0262895] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/24/2021] [Accepted: 01/09/2022] [Indexed: 11/19/2022] Open
Abstract
Improving the Intensive Care Unit (ICU) management network and building cost-effective and well-managed healthcare systems are high priorities for healthcare units. Creating accurate and explainable mortality prediction models helps identify the most critical risk factors in the patients' survival/death status and detect the most in-need patients early. This study proposes a highly accurate and efficient machine learning model for predicting ICU mortality status upon discharge using the information available during the first 24 hours of admission. The most important features in mortality prediction are identified, and the effects of changing each feature on the prediction are studied. We used supervised machine learning models and illness severity scoring systems to benchmark the mortality prediction. We also implemented a combination of SHAP, LIME, partial dependence, and individual conditional expectation plots to explain the predictions made by the best-performing model (CatBoost). We proposed E-CatBoost, an optimized and efficient patient mortality prediction model, which can accurately predict the patients' discharge status using only ten input features. We used eICU-CRD v2.0 to train and validate the models; the dataset contains information on over 200,000 ICU admissions. The patients were divided into twelve disease groups, and models were fitted and tuned for each group. The models' predictive performance was evaluated using the area under the receiver operating characteristic curve (AUROC). The AUROC scores were 0.86 [std:0.02] to 0.92 [std:0.02] for CatBoost and 0.83 [std:0.02] to 0.91 [std:0.03] for E-CatBoost models across the defined disease groups; if measured over the entire patient population, their AUROC scores were 7 to 18 and 2 to 12 percent higher than the baseline models, respectively. 
Based on SHAP explanations, we identified age, heart rate, respiratory rate, blood urea nitrogen, and creatinine level as the most critical cross-disease features in mortality predictions.
Affiliation(s)
- Nima Safaei
- Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
| | - Babak Safaei
- Civil and Environmental Engineering Department, Michigan State University, East Lansing, MI, United States of America
| | - Seyedhouman Seyedekrami
- Department of Computer Science and Engineering, University of Nevada, Reno, NV, United States of America
| | | | - Arezoo Masoud
- Department of Business Analytics and Information Systems, Tippie College of Business, University of Iowa, Iowa City, IA, United States of America
| | - Shaodong Wang
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
| | - Qing Li
- Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA, United States of America
| | - Mahdi Moqri
- Department of Information Systems and Business Analytics, Ivy College of Business, Iowa State University, Ames, IA, United States of America
| |
|
42
|
Shanbehzadeh M, Nopour R, Kazemi-Arpanahi H. Design of an artificial neural network to predict mortality among COVID-19 patients. INFORMATICS IN MEDICINE UNLOCKED 2022; 31:100983. [PMID: 35664686 PMCID: PMC9148440 DOI: 10.1016/j.imu.2022.100983] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/13/2022] [Revised: 05/26/2022] [Accepted: 05/26/2022] [Indexed: 12/23/2022] Open
Abstract
Introduction The rapidly spreading coronavirus disease 2019 (COVID-19) pandemic has challenged clinicians with many uncertainties and ambiguities regarding disease outcomes and complications. To deal with these uncertainties, our study aimed to develop and evaluate several artificial neural networks (ANNs) to predict the mortality risk in hospitalized COVID-19 patients. Material and methods The data of 1710 hospitalized COVID-19 patients were used in this retrospective and developmental study. First, a Chi-square test (P < 0.05), Eta coefficient (η > 0.4), and binary logistic regression (BLR) analysis were performed to determine the factors affecting COVID-19 mortality. Then, using the selected variables, two types of feed-forward (FF) models, back-propagation (BP) and distributed time delay (DTD), were trained. The models' performance was assessed using mean squared error (MSE), error histogram (EH), and area under the ROC curve (AUC-ROC) metrics. Results After applying the univariate and multivariate analysis, 13 variables were selected as important features in predicting COVID-19 mortality at P < 0.05. A comparison of the two ANN architectures using the MSE showed that the BP-ANN (validation error: 0.067, most of the classified samples having 0.049 and 0.05 error rates, and AUC-ROC: 0.888) was the best model. Conclusions Our findings show the acceptable performance of ANN for predicting the risk of mortality in hospitalized COVID-19 patients. Application of the developed ANN-based CDSS in a real clinical environment will improve patient safety and reduce disease severity and mortality.
|
43
|
Khadem H, Nemat H, Eissa MR, Elliott J, Benaissa M. COVID-19 mortality risk assessments for individuals with and without diabetes mellitus: Machine learning models integrated with interpretation framework. Comput Biol Med 2022; 144:105361. [PMID: 35255295 PMCID: PMC8887960 DOI: 10.1016/j.compbiomed.2022.105361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/22/2021] [Revised: 02/25/2022] [Accepted: 02/26/2022] [Indexed: 11/17/2022]
Abstract
This research develops machine learning models equipped with interpretation modules for mortality risk prediction and stratification in cohorts of hospitalised coronavirus disease-2019 (COVID-19) patients with and without diabetes mellitus (DM). To this end, routinely collected clinical data from 156 COVID-19 patients with DM and 349 COVID-19 patients without DM were scrutinised. First, a random forest classifier forecasted in-hospital COVID-19 fatality utilising admission data for each cohort. For the DM cohort, the model predicted mortality risk with an accuracy of 82%, area under the receiver operating characteristic curve (AUC) of 80%, sensitivity of 80%, and specificity of 56%. For the non-DM cohort, the achieved accuracy, AUC, sensitivity, and specificity were 80%, 84%, 91%, and 56%, respectively. The models were then interpreted using SHapley Additive exPlanations (SHAP), which explained predictors’ global and local influences on model outputs. Finally, the k-means algorithm was applied to cluster patients on their SHAP values. The algorithm demarcated patients into three clusters. Average mortality rates within the generated clusters were 8%, 20%, and 76% for the DM cohort and 2.7%, 28%, and 41.9% for the non-DM cohort, providing a functional method of risk stratification.
|
44
|
Kuo KM, Talley PC, Chang CS. The Accuracy of Machine Learning Approaches Using Non-image Data for the Prediction of COVID-19: A Meta-Analysis. Int J Med Inform 2022; 164:104791. [PMID: 35594810 PMCID: PMC9098530 DOI: 10.1016/j.ijmedinf.2022.104791] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 12/23/2021] [Revised: 04/08/2022] [Accepted: 05/09/2022] [Indexed: 12/12/2022]
Abstract
Objective COVID-19 is a novel, severely contagious disease with enormous negative impact on humanity as well as the world economy. An expeditious, feasible tool for detecting COVID-19 remains elusive. Recently, there has been a surge of interest in applying machine learning techniques to predict COVID-19 using non-image data. We have therefore undertaken a meta-analysis to quantify the diagnostic performance of machine learning models facilitating the prediction of COVID-19. Materials and methods A comprehensive electronic database search for the period between January 1st, 2021 and December 3rd, 2021 was undertaken in order to identify eligible studies relevant to this meta-analysis. Summary sensitivity, specificity, and the area under receiver operating characteristic curves were used to assess potential diagnostic accuracy. Risk of bias was assessed by means of a revised Quality Assessment of Diagnostic Studies. Results A total of 30 studies, including 34 models, met all of the inclusion criteria. Summary sensitivity, specificity, and area under receiver operating characteristic curves were 0.86, 0.86, and 0.91, respectively. The purpose of machine learning models, class imbalance, and feature selection are significant covariates useful in explaining the between-study heterogeneity, in terms of both sensitivity and specificity. Conclusions Our study findings show that non-image data can be used to predict COVID-19 with an acceptable performance. Further, we suggest that class imbalance and feature selection be taken into account whenever building models for the prediction of COVID-19, thus further improving diagnostic performance.
|
45
|
Liu Y, Gao K, Deng H, Ling T, Lin J, Yu X, Bo X, Zhou J, Gao L, Wang P, Hu J, Zhang J, Tong Z, Liu Y, Shi Y, Ke L, Gao Y, Li W. A time-incorporated SOFA score-based machine learning model for predicting mortality in critically ill patients: A multicenter, real-world study. Int J Med Inform 2022; 163:104776. [PMID: 35512625 DOI: 10.1016/j.ijmedinf.2022.104776] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 02/17/2022] [Revised: 04/11/2022] [Accepted: 04/14/2022] [Indexed: 11/16/2022]
Abstract
BACKGROUND Organ dysfunction (OD) assessment is essential in intensive care units (ICUs). However, current OD assessment scores merely describe the number and the severity of each OD, without evaluating the duration of organ injury. The objective of this study is to develop and validate a machine learning model based on the Sequential Organ Failure Assessment (SOFA) score for the prediction of mortality in critically ill patients. MATERIAL AND METHODS Data from the eICU Collaborative Research Database and Medical Information Mart for Intensive Care (MIMIC)-III were combined for model development. The MIMIC-IV and Nanjing Jinling Hospital Surgical ICU databases were used as external test set A and set B, respectively. The outcome of interest was in-ICU mortality. A modified SOFA model incorporating the time dimension (T-SOFA) was developed stepwise to predict ICU mortality using extreme gradient boosting (XGBoost), support vector machine, random forest and logistic regression algorithms. Time-dimensional features were calculated based on six consecutive SOFA scores collected every 12 h within the first three days of admission. The predictive performance was assessed with the area under the receiver operating characteristic curve (AUROC) and calibration plots. RESULTS A total of 82,132 patients from the real-world datasets were included in this study, and 7,494 patients (9.12%) died during their ICU stay. The T-SOFA M3 that incorporated the time-dimension features and age, using the XGBoost algorithm, significantly outperformed the original SOFA score in the validation set (AUROC 0.800 95% CI [0.787-0.813] vs. 0.693 95% CI [0.678-0.709], p < 0.01). Good discrimination and calibration were maintained in test sets A and B, with AUROC of 0.803, 95% CI [0.791-0.815] and 0.830, 95% CI [0.789-0.870], respectively. 
CONCLUSIONS The time-incorporated T-SOFA model could significantly improve the prediction performance of the original SOFA score and has potential for identifying high-risk patients in future clinical application.
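The abstract states that time-dimensional features were computed from six consecutive SOFA scores taken every 12 h, but it does not list the features themselves. The sketch below is therefore purely illustrative: the feature set (mean, maximum, first-to-last change, and least-squares slope per 12 h interval), the function name, and the example scores are all assumptions, not the T-SOFA definition.

```python
def sofa_time_features(scores):
    """Summarise a sequence of equally spaced SOFA scores (hypothetical feature set)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores to describe a trend")
    mean = sum(scores) / n
    t_mean = (n - 1) / 2  # time index 0..n-1, one step per 12 h
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (s - mean) for t, s in enumerate(scores))
    return {
        "mean": mean,
        "max": max(scores),
        "delta": scores[-1] - scores[0],  # change from first to last score
        "slope_per_12h": sxy / sxx,       # least-squares trend per interval
    }


# Hypothetical patient: six SOFA scores over the first 72 h of admission.
features = sofa_time_features([8, 9, 9, 10, 11, 13])
print(features)
```

Features like these could then be passed, together with age, to any of the classifiers named in the abstract; which ones the authors actually used is not stated here.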
Affiliation(s)
- Yang Liu
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Kun Gao
- Department of Critical Care Medicine, Jinling Hospital, Nanjing Medical University, Nanjing 210002, PR China
| | - Hongbin Deng
- Department of Critical Care Medicine, Jinling Hospital, Nanjing Medical University, Nanjing 210002, PR China
| | - Tong Ling
- National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, PR China
| | - Jiajia Lin
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Xianqiang Yu
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Xiangwei Bo
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Jing Zhou
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Lin Gao
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Peng Wang
- Department of Critical Care Medicine, Jinling Hospital, Nanjing Medical University, Nanjing 210002, PR China
| | - Jiajun Hu
- National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, PR China
| | - Jian Zhang
- National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, PR China
| | - Zhihui Tong
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China
| | - Yuxiu Liu
- Department of Critical Care Medicine, Jinling Hospital, Nanjing Medical University, Nanjing 210002, PR China
| | - Yinghuan Shi
- National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, PR China
| | - Lu Ke
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China; National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China
| | - Yang Gao
- National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, 210023, PR China
| | - Weiqin Li
- Department of Critical Care Medicine, Affiliated Jinling Hospital, School of Medicine, Southeast University & Nanjing University, Nanjing 210002, PR China; National Institute of Healthcare Data Science at Nanjing University, Nanjing, 210023, PR China
| |
|
46
|
Li J, Liu S, Hu Y, Zhu L, Mao Y, Liu J. Predicting mortality in ICU Patients with heart failure using interpretable machine learning model (Preprint). J Med Internet Res 2022; 24:e38082. [PMID: 35943767 PMCID: PMC9399880 DOI: 10.2196/38082] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 03/18/2022] [Revised: 07/07/2022] [Accepted: 07/15/2022] [Indexed: 01/01/2023]
Affiliation(s)
- Jili Li
- West China School of Medicine, Sichuan University, Chengdu, China
| | - Siru Liu
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States
| | - Yundi Hu
- School of Data Science, Fudan University, Shanghai, China
| | - Lingfeng Zhu
- Department of Computer Science, Sichuan University, Chengdu, China
| | - Yujia Mao
- West China School of Medicine, Sichuan University, Chengdu, China
| | - Jialin Liu
- Department of Medical Informatics, West China Hospital, Sichuan University, Chengdu, China
| |
|
47
|
Kurban LAS, AlDhaheri S, Elkkari A, Khashkhusha R, AlEissaee S, AlZaabi A, Ismail M, Bakoush O. Predicting Severe Disease and Critical Illness on Initial Diagnosis of COVID-19: Simple Triage Tools. Front Med (Lausanne) 2022; 9:817549. [PMID: 35223916 PMCID: PMC8866724 DOI: 10.3389/fmed.2022.817549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Accepted: 01/17/2022] [Indexed: 01/08/2023]
Abstract
Rationale This study was conducted to develop, validate, and compare prediction models for severe disease and critical illness among symptomatic patients with confirmed COVID-19. Methods The development cohort comprised 433 symptomatic patients diagnosed with COVID-19 between April 15, 2020 and June 30, 2020 who presented to Tawam Public Hospital, Abu Dhabi, United Arab Emirates. The cohort included both severe and non-severe patients, as all cases were admitted for the purpose of isolation as per hospital policy. We examined 19 potential predictors of severe disease and critical illness that were recorded at the time of initial assessment. Univariate and multivariate logistic regression analyses were used to construct predictive models. Discrimination was assessed by the area under the receiver operating characteristic curve (AUC). Calibration and goodness of fit of the models were assessed. A cohort of 213 patients assessed at another public hospital in the country during the same period was used to validate the models. Results One hundred and eighty-six patients were classified as severe, while the remaining 247 were categorized as non-severe. For prediction of progression to severe disease, the three independent predictive factors were age, serum lactate dehydrogenase (LDH), and serum albumin (ALA model). For progression to critical illness, the four independent predictive factors were age, serum LDH, kidney function (eGFR), and serum albumin (ALKA model). The AUCs for the ALA and ALKA models were 0.88 (95% CI, 0.86–0.89) and 0.85 (95% CI, 0.83–0.86), respectively. Calibration of the two models showed good fit, and the validation cohort showed excellent discrimination, with an AUC of 0.91 (95% CI, 0.83–0.99) for the ALA model and 0.89 (95% CI, 0.80–0.99) for the ALKA model. A free web-based risk calculator was developed.
Conclusions The ALA and ALKA predictive models were developed and validated based on simple, readily available clinical and laboratory tests assessed at presentation. These models may help frontline clinicians to triage patients for admission or discharge, as well as for early identification of patients at risk of developing critical illness.
Affiliation(s)
| | - Sharina AlDhaheri
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Abdulbaset Elkkari
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Ramzi Khashkhusha
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Shaikha AlEissaee
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Amna AlZaabi
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Mohamed Ismail
- Department of Internal Medicine, Tawam Hospital, Al Ain, United Arab Emirates
| | - Omran Bakoush
- Department of Internal Medicine, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates
| |
|
48
|
Cao Z, Qiu Z, Tang F, Liang S, Wang Y, Long H, Chen C, Zhang B, Zhang C, Wang Y, Tang K, Tang J, Chen J, Yang C, Xu Y, Yang Y, Xiao S, Tian D, Jiang G, Du X. Drivers and forecasts of multiple waves of the coronavirus disease 2019 pandemic: a systematic analysis based on an interpretable machine learning framework. Transbound Emerg Dis 2022; 69:e1584-e1594. [PMID: 35192224 DOI: 10.1111/tbed.14492] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/19/2021] [Revised: 02/20/2022] [Accepted: 02/21/2022] [Indexed: 11/26/2022]
Abstract
Coronavirus disease 2019 (COVID-19) has become a global pandemic and continues to prevail with multiple rebound waves in many countries. The driving factors for the spread of COVID-19 and their quantitative contributions, especially to rebound waves, are not well studied. Multidimensional time-series data, including policy, travel, medical, socioeconomic, environmental, mutant, and vaccine-related data, were collected from 39 countries up to June 30, 2021, and an interpretable machine learning framework (XGBoost model with SHapley Additive exPlanations interpretation) was used to systematically analyze the effect of multiple factors on the spread of COVID-19, using the daily effective reproduction number as an indicator. Based on a model of the pre-vaccine era, policy-related factors were shown to be the main drivers of the spread of COVID-19, with a contribution of 60.81%. In the post-vaccine era, the contribution of policy-related factors decreased to 28.34%, accompanied by an increase in the contribution of travel-related factors, such as domestic flights, and contributions emerged for mutant-related (16.49%) and vaccine-related (7.06%) factors. For single-peak countries, policy-related factors were dominant during both the rising and fading stages, with overall contributions of 33.7% and 37.7%, respectively. For double-peak countries, factors from the rebound stage contributed 45.8%, and policy-related factors showed the greatest contribution in both the rebound (32.6%) and fading (25.0%) stages. For multiple-peak countries, the Delta variant, domestic flights (current month), and the daily vaccination population were the three greatest contributors (8.12%, 7.59%, and 7.26%, respectively). Forecasting models to predict the rebound risk were built based on these findings, with accuracies of 0.78 and 0.81 for the pre- and post-vaccine eras, respectively.
These findings quantitatively demonstrate the systematic drivers of the spread of COVID-19, and the framework proposed in this study will facilitate the targeted prevention and control of the ongoing COVID-19 pandemic.
Affiliation(s)
- Zicheng Cao
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Zekai Qiu
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Feng Tang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,Foshan Center for Disease Control and Prevention, Foshan, 528010, P.R. China
| | - Shiwen Liang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,Fujian Provincial Center for Disease Control and Prevention, Fuzhou, 350001, P.R. China
| | - Yinghan Wang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,Clinical research center, Second Affiliated Hospital of Kunming Medical University, Kunming, 650033, P.R. China
| | - Haoyu Long
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Cai Chen
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China
| | - Bing Zhang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Chi Zhang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Yaqi Wang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Kang Tang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Jing Tang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Junhong Chen
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Chunhui Yang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Yuzhe Xu
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Yulin Yang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Shenglan Xiao
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Dechao Tian
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Guozhi Jiang
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China
| | - Xiangjun Du
- School of Public Health (Shenzhen), Sun Yat-sen University, Guangzhou, 510275, P.R. China.,School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen, 518107, P.R. China.,Key Laboratory of Tropical Disease Control, Ministry of Education, Sun Yat-sen University, Guangzhou, 510030, P.R. China
| |
|
49
|
Kim J, Blaum C, Ferris R, Arcila-Mesa M, Do H, Pulgarin C, Dolle J, Scherer J, Kalyanaraman Marcello R, Zhong J. Factors associated with hospital admission and severe outcomes for older patients with COVID-19. J Am Geriatr Soc 2022; 70:1906-1917. [PMID: 35179781 PMCID: PMC9115084 DOI: 10.1111/jgs.17718] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 10/12/2021] [Revised: 01/20/2022] [Accepted: 02/06/2022] [Indexed: 11/28/2022]
Abstract
Background Morbidity and death due to coronavirus disease 2019 (COVID‐19) experienced by older adults in nursing homes have been well described, but COVID‐19's impact on community‐living older adults is less studied. Similarly, the previous ambulatory care experience of such patients has rarely been considered in studies of COVID‐19 risks and outcomes. Methods To investigate how advanced age (65+) relates to risk factors associated with COVID‐19 outcomes in community‐living elders, we identified an electronic health records cohort of patients aged 65+ with laboratory‐confirmed COVID‐19, with and without an ambulatory care visit in the past 24 months (n = 47,219), in the New York City (NYC) academic medical institutions and the NYC public hospital system from January 2020 to February 2021. The main outcomes were COVID‐19 hospitalization; severe outcomes (intensive care unit (ICU) admission, intubation, dialysis, stroke, in‐hospital death); and in‐hospital death. The exposures included demographic characteristics and, for those with ambulatory records, comorbidities, frailty, and laboratory results. Results The 31,770 patients with an ambulatory history had a median age of 74 years and were 47.4% male, 24.3% non‐Hispanic white, 23.3% non‐Hispanic black, and 18.4% Hispanic. With increasing age, the odds ratios and attributable fractions of sex, race–ethnicity, comorbidities, and biomarkers decreased, except for dementia and frailty (Hospital Frailty Risk Score). Patients without ambulatory care histories, compared to those with, had significantly higher adjusted rates of COVID‐19 hospitalization and severe outcomes, with the strongest effect in the oldest group. Conclusions In this cohort of community‐dwelling older adults, we provided evidence of age‐specific risk factors for COVID‐19 hospitalization and severe outcomes.
Future research should explore the impact of frailty and dementia in severe COVID‐19 outcomes in community‐living older adults, and the role of engagement in ambulatory care in mitigating severe disease.
Affiliation(s)
- Jiyu Kim
- Division of Biostatistics, Department of Population Health, NYU Grossman School of Medicine
| | - Caroline Blaum
- Division of Geriatric Medicine and Palliative Care, Department of Medicine, NYU Grossman School of Medicine.,National Center for Quality Assurance
| | - Rosie Ferris
- Division of Geriatric Medicine and Palliative Care, Department of Medicine, NYU Grossman School of Medicine
| | - Mauricio Arcila-Mesa
- Division of Geriatric Medicine and Palliative Care, Department of Medicine, NYU Grossman School of Medicine
| | - Hyungrok Do
- Division of Biostatistics, Department of Population Health, NYU Grossman School of Medicine
| | - Claudia Pulgarin
- Department of Population Health, NYU Grossman School of Medicine
| | - Johanna Dolle
- Office of Ambulatory Care and Population Health, NYC Health + Hospitals
| | - Jennifer Scherer
- Division of Palliative Care and Division of Nephrology, NYU Grossman School of Medicine
| | | | - Judy Zhong
- Division of Biostatistics, Department of Population Health, NYU Grossman School of Medicine
| |
|
50
|
Moulaei K, Shanbehzadeh M, Mohammadi-Taghiabad Z, Kazemi-Arpanahi H. Comparing machine learning algorithms for predicting COVID-19 mortality. BMC Med Inform Decis Mak 2022; 22:2. [PMID: 34983496 PMCID: PMC8724649 DOI: 10.1186/s12911-021-01742-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 05/26/2021] [Accepted: 12/28/2021] [Indexed: 12/23/2022]
Abstract
BACKGROUND Hospitalized patients with coronavirus disease (COVID-19) are always at risk of death. Machine learning (ML) algorithms are a potential solution for predicting mortality in these patients. This study aimed to compare several ML algorithms for predicting COVID-19 mortality using patient data from the time of first admission, and to choose the best-performing algorithm as a predictive tool for decision-making. METHODS After feature selection based on the confirmed predictors, information about 1500 eligible patients (1386 survivors and 144 deaths) was extracted from the registry of Ayatollah Taleghani Hospital, Abadan city, Iran. Afterwards, several ML algorithms were trained to predict COVID-19 mortality. Finally, to assess the models' performance, the metrics derived from the confusion matrix were calculated. RESULTS The study participants were 1500 patients; the number of men was higher than that of women (836 vs. 664), and the median age was 57.25 years old (interquartile 18-100). After feature selection, out of 38 features, dyspnea, ICU admission, and oxygen therapy were the top three predictors. Smoking, alanine aminotransferase, and platelet count were the three lowest-ranked predictors of COVID-19 mortality. Experimental results demonstrated that random forest (RF) performed better than the other ML algorithms, with accuracy, sensitivity, precision, specificity, and receiver operating characteristic (ROC) of 95.03%, 90.70%, 94.23%, 95.10%, and 99.02%, respectively. CONCLUSION ML enables a reasonable level of accuracy in predicting COVID-19 mortality. Therefore, ML-based predictive models, particularly the RF algorithm, can potentially help identify patients at high risk of mortality and inform proper interventions by clinicians.
Affiliation(s)
- Khadijeh Moulaei
- Medical Informatics Research Center, Institute for Futures Studies in Health, Kerman University of Medical Sciences, Kerman, Iran
| | - Mostafa Shanbehzadeh
- Department of Health Information Technology, School of Paramedical, Ilam University of Medical Sciences, Ilam, Iran
| | - Zahra Mohammadi-Taghiabad
- Department of Health Information Management, School of Health Management and Information Sciences, Kerman University of Medical Sciences, Kerman, Iran
| | - Hadi Kazemi-Arpanahi
- Department of Health Information Technology, Abadan University of Medical Sciences, Abadan, Iran.
- Student Research Committee, Abadan University of Medical Sciences, Abadan, Iran.
| |
|