1
Wang K, Hong T, Liu W, Xu C, Yin C, Liu H, Wei X, Wu SN, Li W, Rong L. Development and validation of a machine learning-based prognostic risk stratification model for acute ischemic stroke. Sci Rep 2023; 13:13782. [PMID: 37612344 PMCID: PMC10447537 DOI: 10.1038/s41598-023-40411-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 08/09/2023] [Indexed: 08/25/2023]
Abstract
Acute ischemic stroke (AIS) is one of the most prevalent causes of serious long-term disability worldwide. Accurate prediction of stroke prognosis is highly valuable for effective intervention and treatment. The present retrospective study therefore aims to provide a reliable machine learning-based model for prognosis prediction in AIS patients. Data from AIS patients were collected retrospectively from the Second Affiliated Hospital of Xuzhou Medical University between August 2017 and July 2019. Independent prognostic factors were identified by univariate and multivariate logistic regression analysis and used to develop machine learning (ML) models. ML model performance was assessed by the area under the receiver operating characteristic curve (AUC) and a radar plot. SHapley Additive exPlanations (SHAP) values were used to interpret the importance of all features included in the predictive model. A total of 677 AIS patients were included in the present study. Poor prognosis was observed in 209 patients (30.9%). Six variables, namely neuron-specific enolase (NSE), homocysteine (HCY), S-100β, dysphagia, C-reactive protein (CRP), and anticoagulation, were included to establish the ML models. Six different ML algorithms were tested, and the random forest (RF) model was selected as the final predictive model, with the greatest AUC of 0.908. Moreover, according to the SHAP results, NSE impacted the predictive model the most, followed by HCY, S-100β, dysphagia, CRP, and anticoagulation. Based on the RF model, an online tool was constructed to predict the prognosis of AIS patients and assist clinicians in optimizing patient treatment. The present study revealed that NSE, HCY, CRP, S-100β, anticoagulation, and dysphagia were important factors for poor prognosis in AIS patients. ML algorithms were used to develop models for predicting the prognosis of AIS patients, with the RF model showing the best performance.
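The abstract above selects its final model by comparing AUC values across candidate algorithms. As a minimal sketch of that selection step, the snippet below computes AUC directly from its pairwise definition and picks the candidate with the highest value. The outcome labels, the predicted risks, and the names `model_a` and `model_b` are hypothetical illustrations and are not taken from the study.

```python
def auc(scores, labels):
    """Probability that a randomly chosen positive case is scored above a
    randomly chosen negative case; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical validation outcomes (1 = poor prognosis) and predicted risks.
labels = [1, 0, 1, 0, 1, 0]
candidates = {
    "model_a": [0.9, 0.2, 0.7, 0.4, 0.6, 0.1],
    "model_b": [0.6, 0.5, 0.4, 0.3, 0.8, 0.7],
}

# Score every candidate and keep the one with the highest AUC.
aucs = {name: auc(scores, labels) for name, scores in candidates.items()}
best = max(aucs, key=aucs.get)
```

The pairwise form is quadratic in sample size, which is acceptable for a cohort of a few hundred patients; a rank-based implementation would be the usual choice for larger data.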
Affiliation(s)
- Kai Wang
  - Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Tao Hong
  - Pediatric Surgery Ward, Fuwai Hospital Chinese Academy of Medical Sciences, Shenzhen, China
  - Department of Cardiovascular Surgery, General Hospital of Northern Theater Command, Shenyang, 110000, China
  - Postgraduate College, Dalian Medical University, Dalian, 116000, China
- Wencai Liu
  - Department of Orthopaedics, Shanghai Jiao Tong University Affiliated Sixth People's Hospital, 600 Yishan Road, Shanghai, 200233, China
- Chan Xu
  - The State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics & Center for Molecular Imaging and Translational Medicine, School of Public Health, Xiamen University, Xiamen, China
- Chengliang Yin
  - Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Haiyan Liu
  - Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Xiu'e Wei
  - Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
- Shi-Nan Wu
  - School of Medicine, Eye Institute of Xiamen University, Xiamen University, Xiamen, Fujian, China
- Wenle Li
  - Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
  - The State Key Laboratory of Molecular Vaccinology and Molecular Diagnostics & Center for Molecular Imaging and Translational Medicine, School of Public Health, Xiamen University, Xiamen, China
- Liangqun Rong
  - Department of Neurology, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
  - Key Laboratory of Neurological Diseases, The Second Affiliated Hospital of Xuzhou Medical University, Xuzhou, Jiangsu, China
2
Song Y, Shen F, Dong Q, Wang L, Mi J. Prediction of Late Hospital Arrival in Patients with Mild and Rapidly Improving Acute Ischemic Stroke in a Rural Area of China. Risk Manag Healthc Policy 2023; 16:1119-1129. [PMID: 37360537 PMCID: PMC10290495 DOI: 10.2147/rmhp.s414700] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/28/2023] [Accepted: 06/03/2023] [Indexed: 06/28/2023]
Abstract
Purpose: Among all ischemic stroke patients, more than half are mild and rapidly improving acute ischemic stroke (MaRAIS) patients. However, many MaRAIS patients do not recognize the disease early on, and thus they delay access to the treatment that would be most effective if provided earlier. This is especially true in rural areas. The aim of this study was to develop and validate a late hospital arrival risk nomogram in a rural Chinese population of patients with MaRAIS.
Methods: We developed a prediction model based on a training dataset of 173 MaRAIS patients collected from September 9, 2019 to May 13, 2020. Data analyzed included demographics and disease characteristics. A least absolute shrinkage and selection operator (LASSO) regression model was used to optimize feature selection for the late hospital arrival risk model. Multivariable logistic regression analysis was applied to build a prediction model incorporating the features selected in the LASSO regression models. The discrimination, calibration, and clinical usefulness of the prediction model were assessed using the C-index, calibration plot, and decision curve analysis, respectively. Internal validation was then assessed using bootstrapping validation.
Results: Variables contained in the prediction nomogram included transportation mode, history of diabetes, knowledge of stroke symptoms, and thrombolytic therapy. The model had moderate predictive power with a C-index of 0.709 (95% confidence interval: 0.636-0.783) and good calibration. In the internal validation, the C-index reached 0.692. The risk threshold was 30-97% according to the analysis of the decision curve, and the nomogram could be applied in clinical practice.
Conclusion: This novel nomogram, which incorporates transportation mode, history of diabetes, knowledge of stroke symptoms, and thrombolytic therapy, was conveniently applied to facilitate individual late hospital arrival risk prediction among MaRAIS patients in a rural area of Shanghai, China.
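The abstract above reports a usable risk-threshold range from decision curve analysis. A decision curve plots net benefit against the threshold probability, where net benefit is commonly defined as TP/n minus FP/n weighted by the odds of the threshold. The sketch below implements that standard definition; the predicted probabilities and outcomes are hypothetical, and the function name `net_benefit` is chosen here for illustration, not taken from the study.

```python
def net_benefit(probs, labels, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Hypothetical predicted risks of late arrival and observed outcomes (1 = late).
probs = [0.9, 0.8, 0.2, 0.6]
labels = [1, 0, 0, 1]

nb_model = net_benefit(probs, labels, 0.5)
# The "treat all" reference strategy flags every patient regardless of risk.
nb_treat_all = net_benefit([1.0] * len(labels), labels, 0.5)
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the "treat all" strategy and the "treat none" strategy, whose net benefit is zero by definition.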
Affiliation(s)
- Yeping Song
  - Cerebrovascular Disease Center, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, People’s Republic of China
- Fei Shen
  - Cerebrovascular Disease Center, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, People’s Republic of China
- Qing Dong
  - Cerebrovascular Disease Center, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, People’s Republic of China
- Liling Wang
  - Cerebrovascular Disease Center, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, People’s Republic of China
- Jianhua Mi
  - Health Management Center, Renji Hospital, School of Medicine, Shanghai Jiaotong University, Shanghai, 200127, People’s Republic of China
3
Abebe TG, Feleke SF, Dessie AM, Anteneh RM, Anteneh ZA. Development and internal validation of a clinical risk score for in-hospital mortality after stroke: a single-centre retrospective cohort study in Northwest Ethiopia. BMJ Open 2023; 13:e063170. [PMID: 36977538 PMCID: PMC10069517 DOI: 10.1136/bmjopen-2022-063170] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/30/2023]
Abstract
OBJECTIVE: To develop and validate a clinical risk score for in-hospital stroke mortality.
DESIGN: The study used a retrospective cohort study design.
SETTING: The study was carried out in a tertiary hospital in the Northwest Ethiopian region.
PARTICIPANTS: The study included 912 patients who had a stroke admitted to a tertiary hospital between 11 September 2018 and 7 March 2021.
MAIN OUTCOME MEASURES: Clinical risk score for in-hospital stroke mortality.
METHODS: We used EpiData V.3.1 and R V.4.0.4 for data entry and analysis, respectively. Predictors of mortality were identified by multivariable logistic regression. A bootstrapping technique was performed to internally validate the model. Simplified risk scores were established from the beta coefficients of predictors of the final reduced model. Model performance was evaluated using the area under the receiver operating characteristic curve and calibration plot.
RESULTS: From the total stroke cases, 132 (14.5%) patients died during the hospital stay. We developed a risk prediction model from eight prognostic determinants (age, sex, type of stroke, diabetes mellitus, temperature, Glasgow Coma Scale, pneumonia and creatinine). The area under the curve (AUC) of the model was 0.895 (95% CI: 0.859-0.932) for the original model and was the same for the bootstrapped model. The AUC of the simplified risk score model was 0.893 (95% CI: 0.856-0.929) with a calibration test p value of 0.225.
CONCLUSIONS: The prediction model was developed from eight easy-to-collect predictors. The model has excellent discrimination and calibration performance, similar to that of the risk score model. It is simple, easily remembered, and helps clinicians identify the risk of patients and manage it properly. Prospective studies in different healthcare settings are required to externally validate our risk score.
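The abstract above states that simplified risk scores were derived from the beta coefficients of the final model but does not give the scaling rule. One common convention, assumed here purely for illustration, is to divide each coefficient by the smallest absolute coefficient and round to the nearest integer. The coefficient values and predictor names below are hypothetical and are not the study's estimates.

```python
def points_from_betas(betas):
    """Scale each coefficient by the smallest absolute coefficient and round,
    giving integer points per predictor (an assumed, commonly used convention)."""
    base = min(abs(b) for b in betas.values())
    if base == 0:
        raise ValueError("coefficients must be non-zero")
    return {name: round(b / base) for name, b in betas.items()}


def risk_score(patient, points):
    """Sum the points for every predictor that is present in the patient."""
    return sum(points[name] for name, present in patient.items() if present)


# Hypothetical log-odds coefficients for three binary predictors.
betas = {"pneumonia": 1.30, "low_gcs": 1.95, "diabetes": 0.62}
points = points_from_betas(betas)

# Hypothetical patient with pneumonia and diabetes but a normal coma score.
patient = {"pneumonia": True, "low_gcs": False, "diabetes": True}
score = risk_score(patient, points)
```

Rounding discards information, so a score built this way should be validated in its own right rather than assumed to inherit the performance of the underlying regression model.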
Affiliation(s)
- Zelalem Alamrew Anteneh
  - Epidemiology, Bahir Dar University College of Medical and Health Sciences, Bahir Dar, Ethiopia
4
Subramanian V, Mascha EJ, Kattan MW. Developing a Clinical Prediction Score: Comparing Prediction Accuracy of Integer Scores to Statistical Regression Models. Anesth Analg 2021; 132:1603-1613. [PMID: 33464759 DOI: 10.1213/ane.0000000000005362] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022]
Abstract
Researchers often convert prediction tools built on statistical regression models into integer scores and risk classification systems in the name of simplicity. However, this workflow discards useful information and reduces prediction accuracy. We, therefore, investigated the impact on prediction accuracy when researchers simplify a regression model into an integer score using a simulation study and an example clinical data set. Simulated independent training and test sets (n = 1000) were randomly generated such that a logistic regression model would perform at a specified target area under the receiver operating characteristic curve (AUC) of 0.7, 0.8, or 0.9. After fitting a logistic regression with continuous covariates to each data set, continuous variables were dichotomized using data-dependent cut points. A logistic regression was refit, and the coefficients were scaled and rounded to create an integer score. A risk classification system was built by stratifying integer scores into low-, intermediate-, and high-risk tertiles. Discrimination and calibration were assessed by calculating the AUC and index of prediction accuracy (IPA) for each model. The optimism in performance between the training set and test set was calculated for both AUC and IPA. The logistic regression model using the continuous form of covariates outperformed all other models. In the simulation study, converting the logistic regression model to an integer score and subsequent risk classification system incurred an average decrease of 0.057-0.094 in AUC, and an absolute 6.2%-17.5% in IPA. The largest decrease in both AUC and IPA occurred in the dichotomization step. The dichotomization and risk stratification steps also increased the optimism of the resulting models, such that they appeared to be able to predict better than they actually would on new data. 
In the clinical data set, converting the logistic regression with continuous covariates to an integer score incurred a decrease in externally validated AUC of 0.06 and a decrease in externally validated IPA of 13%. Converting a regression model to an integer score decreases model performance considerably. Therefore, we recommend developing a regression model that incorporates all available information to make the most accurate predictions possible, and using the unaltered regression model when making predictions for individual patients. In all cases, researchers should be mindful that they correctly validate the specific model that is intended for clinical use.
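The abstract above attributes the largest loss of accuracy to the dichotomization step. The toy simulation below is a sketch of that effect only and is not a reproduction of the authors' simulation design: the single covariate, the slope of 2, the median cut point, and the random seed are all assumptions made here. Because a one-covariate logistic model with a positive slope ranks patients exactly as the covariate does, the raw covariate is used as the score, and AUC is measured on the same sample rather than on a separate test set.

```python
import math
import random
import statistics


def auc(scores, labels):
    """Pairwise AUC: share of positive/negative pairs ranked correctly,
    with ties counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 1000

# One continuous covariate; outcome drawn from a logistic model with slope 2.
x = [rng.gauss(0, 1) for _ in range(n)]
y = [1 if rng.random() < 1 / (1 + math.exp(-2 * xi)) else 0 for xi in x]

# Dichotomize at a data-dependent cut point (the sample median).
cut = statistics.median(x)
x_binary = [1 if xi > cut else 0 for xi in x]

auc_continuous = auc(x, y)
auc_binary = auc(x_binary, y)
loss = auc_continuous - auc_binary  # discrimination lost by dichotomizing
```

After dichotomization, every pair of patients on the same side of the cut point becomes a tie, so any ranking information the covariate carried within each half is lost.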
Affiliation(s)
- Vigneshwar Subramanian
  - From the Cleveland Clinic Lerner College of Medicine at Case Western Reserve University, Cleveland, Ohio
- Edward J Mascha
  - Departments of Quantitative Health Sciences and Outcomes Research and
- Michael W Kattan
  - Department of Quantitative Health Sciences, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio