1
Dong T, Sinha S, Zhai B, Fudulu D, Chan J, Narayan P, Judge A, Caputo M, Dimagli A, Benedetto U, Angelini GD. Performance Drift in Machine Learning Models for Cardiac Surgery Risk Prediction: Retrospective Analysis. JMIRx Med 2024; 5:e45973. [PMID: 38889069 PMCID: PMC11217160 DOI: 10.2196/45973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Revised: 02/27/2024] [Accepted: 04/29/2024] [Indexed: 06/20/2024]
Abstract
Background The Society of Thoracic Surgeons and European System for Cardiac Operative Risk Evaluation (EuroSCORE) II risk scores are the most commonly used risk prediction models for in-hospital mortality after adult cardiac surgery. However, they are prone to miscalibration over time and poor generalization across data sets; thus, their use remains controversial. Despite increased interest, a gap in understanding the effect of data set drift on the performance of machine learning (ML) over time remains a barrier to its wider use in clinical practice. Data set drift occurs when an ML system underperforms because of a mismatch between the data it was developed from and the data on which it is deployed. Objective In this study, we analyzed the extent of performance drift using models built on a large UK cardiac surgery database. The objectives were to (1) rank and assess the extent of performance drift in cardiac surgery risk ML models over time and (2) investigate any potential influence of data set drift and variable importance drift on performance drift. Methods We conducted a retrospective analysis of prospectively, routinely gathered data on adult patients undergoing cardiac surgery in the United Kingdom between 2012 and 2019. We temporally split the data 70:30 into a training and validation set and a holdout set. Five novel ML mortality prediction models were developed and assessed, along with EuroSCORE II, for relationships between and within variable importance drift, performance drift, and actual data set drift. Performance was assessed using a consensus metric. Results A total of 227,087 adults underwent cardiac surgery during the study period, with a mortality rate of 2.76% (n=6258). There was strong evidence of a decrease in overall performance across all models (P<.0001). 
Extreme gradient boosting (clinical effectiveness metric [CEM] 0.728, 95% CI 0.728-0.729) and random forest (CEM 0.727, 95% CI 0.727-0.728) were the overall best-performing models, both temporally and nontemporally. EuroSCORE II performed the worst across all comparisons. Sharp changes in variable importance and data set drift from October to December 2017, from June to July 2018, and from December 2018 to February 2019 mirrored the effects of performance decrease across models. Conclusions All models show a decrease in at least 3 of the 5 individual metrics. CEM and variable importance drift detection demonstrate the limitation of logistic regression methods used for cardiac surgery risk prediction and the effects of data set drift. Future work will be required to determine the interplay between ML models and whether ensemble models could improve on their respective performance advantages.
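The temporal 70:30 split described in the Methods can be illustrated with a short sketch. This is not the authors' code: the helper name `temporal_split`, the record layout, and the toy data are all hypothetical, and the only idea taken from the abstract is that earlier cases are used for development and later cases are held out.

```python
from datetime import date


def temporal_split(records, train_fraction=0.7):
    """Hypothetical helper: order records by date, then cut once.

    The earliest `train_fraction` of cases form the development set and
    the most recent remainder forms the holdout set, so the model is
    never evaluated on cases that precede its training data.
    """
    ordered = sorted(records, key=lambda r: r["date"])
    cut = round(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Toy records (invented for illustration only).
records = [
    {"date": date(2012 + i % 8, 1 + i % 12, 1), "died": i % 37 == 0}
    for i in range(100)
]
train, holdout = temporal_split(records)
print(len(train), len(holdout))
```

Splitting by time rather than at random is what makes drift visible: a random split would mix later cases into the training data and hide any change in case mix or outcome rates.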
Affiliation(s)
- Tim Dong
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Shubhra Sinha
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Ben Zhai
- School of Computing Science, Northumbria University, Newcastle upon Tyne, United Kingdom
- Daniel Fudulu
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Jeremy Chan
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Pradeep Narayan
- Department of Cardiac Surgery, Rabindranath Tagore International Institute of Cardiac Sciences, West Bengal, India
- Andy Judge
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Massimo Caputo
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Arnaldo Dimagli
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Umberto Benedetto
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Gianni D Angelini
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
2
Zeng J, Zhang D, Lin S, Su X, Wang P, Zhao Y, Zheng Z. Comparative analysis of machine learning vs. traditional modeling approaches for predicting in-hospital mortality after cardiac surgery: temporal and spatial external validation based on a nationwide cardiac surgery registry. European Heart Journal. Quality of Care & Clinical Outcomes 2024; 10:121-131. [PMID: 37218710 DOI: 10.1093/ehjqcco/qcad028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/11/2023] [Revised: 05/12/2023] [Accepted: 05/21/2023] [Indexed: 05/24/2023]
Abstract
AIMS Preoperative risk assessment is crucial for cardiac surgery. Although previous studies suggested machine learning (ML) may improve in-hospital mortality predictions after cardiac surgery compared to traditional modeling approaches, the validity is doubted due to lacking external validation, limited sample sizes, and inadequate modeling considerations. We aimed to assess predictive performance between ML and traditional modelling approaches, while addressing these major limitations. METHODS AND RESULTS Adult cardiac surgery cases (n = 168 565) between 2013 and 2018 in the Chinese Cardiac Surgery Registry were used to develop, validate, and compare various ML vs. logistic regression (LR) models. The dataset was split for temporal (2013-2017 for training, 2018 for testing) and spatial (geographically-stratified random selection of 83 centers for training, 22 for testing) experiments, respectively. Model performances were evaluated in testing sets for discrimination and calibration. The overall in-hospital mortality was 1.9%. In the temporal testing set (n = 32 184), the best-performing ML model demonstrated a similar area under the receiver operating characteristic curve (AUC) of 0.797 (95% CI 0.779-0.815) to the LR model (AUC 0.791 [95% CI 0.775-0.808]; P = 0.12). In the spatial experiment (n = 28 323), the best ML model showed a statistically better but modest performance improvement (AUC 0.732 [95% CI 0.710-0.754]) than LR (AUC 0.713 [95% CI 0.691-0.737]; P = 0.002). Varying feature selection methods had relatively smaller effects on ML models. Most ML and LR models were significantly miscalibrated. CONCLUSION ML provided only marginal improvements over traditional modelling approaches in predicting cardiac surgery mortality with routine preoperative variables, which calls for more judicious use of ML in practice.
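The AUC figures compared above measure discrimination. As a minimal sketch of what the number means, assuming nothing about the authors' software: the AUC equals the probability that a randomly chosen patient who died receives a higher predicted risk than a randomly chosen survivor, with ties counted as one half. The function name `auc` and the toy values are illustrative only.

```python
def auc(y_true, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which
    the positive case has the higher score; ties count as one half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy example (invented values): 3 of the 4 pairs are ranked correctly.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, scores))  # 0.75
```

This pairwise form is quadratic in the number of cases and is meant only to show the definition; it says nothing about calibration, which the abstract reports separately.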
Affiliation(s)
- Juntong Zeng
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Danwei Zhang
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Department of Cardiac Surgery, Fujian Children's Hospital (Fujian Branch of Shanghai Children's Medical Center), College of Clinical Medicine for Obstetrics & Gynecology and Pediatrics, Fujian Medical University, 966 Hengyu Road, Jinan, Fuzhou, 350014, People's Republic of China
- Shen Lin
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Department of Cardiovascular Surgery, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Xiaoting Su
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Peng Wang
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Yan Zhao
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Zhe Zheng
- National Clinical Research Center of Cardiovascular Diseases, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Chinese Academy of Medical Sciences and Peking Union Medical College, 9 Dongdansantiao, Dongcheng, Beijing, 100730, People's Republic of China
- Department of Cardiovascular Surgery, Fuwai Hospital, National Center for Cardiovascular Diseases, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
- Key Laboratory of Coronary Heart Disease Risk Prediction and Precision Therapy, Chinese Academy of Medical Sciences and Peking Union Medical College, 167 Beilishi Road, Xicheng, Beijing, 100037, People's Republic of China
3
Allou N, Allyn J, Provenchere S, Delmas B, Braunberger E, Oliver M, De Brux JL, Ferdynus C. Clinical utility of a deep-learning mortality prediction model for cardiac surgery decision making. J Thorac Cardiovasc Surg 2023; 166:e567-e578. [PMID: 36858843 DOI: 10.1016/j.jtcvs.2023.01.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/28/2022] [Revised: 01/17/2023] [Accepted: 01/18/2023] [Indexed: 02/05/2023]
Abstract
OBJECTIVES The aim of this study using decision curve analysis (DCA) was to evaluate the clinical utility of a deep-learning mortality prediction model for cardiac surgery decision making compared with the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II and to 2 machine-learning models. METHODS Using data from a French prospective database, this retrospective study evaluated all patients who underwent cardiac surgery in 43 hospital centers between January 2012 and December 2020. A receiver operating characteristic analysis was performed to compare the accuracy of the EuroSCORE II, machine-learning models, and an adapted Tabular Bidirectional Encoder Representations from Transformers deep-learning model in predicting postoperative in-hospital mortality. The clinical utility of these models for cardiac surgery decision making was compared using DCA. RESULTS Over the study period, 165,640 patients underwent cardiac surgery, with a mean EuroSCORE II of 3.99 ± 6.67%. In the receiver operating characteristic analysis, the area under the curve was significantly greater for the deep-learning model (0.834; 95% confidence interval, 0.831-0.838) than the EuroSCORE II (P < .001), the random forest model (P = .03), and the Extreme Gradient Boosting model (P = .03). In the DCA, the clinical utility of the 3 artificial intelligence models was superior to that of the EuroSCORE II, especially when the threshold probability of death was high (>45%). The deep-learning model showed the greatest advantage over the EuroSCORE II. CONCLUSIONS The deep-learning model had better predictive accuracy and greater clinical utility than the EuroSCORE II and the 2 machine-learning models. These findings suggest that deep learning with Tabular Bidirectional Encoder Representations from Transformers prediction model could be used in the future as the gold standard for cardiac surgery decision making.
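Decision curve analysis, as used above, compares models by net benefit across threshold probabilities. The sketch below uses the standard net-benefit definition (true positives minus false positives weighted by the odds of the threshold, both per patient); it is not taken from the study, and the function name `net_benefit` and the toy data are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit at one threshold probability.

    Patients with predicted risk >= threshold are treated as high risk.
    Net benefit = TP/n - FP/n * threshold / (1 - threshold).
    """
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy data (invented values).
y_true = [1, 0, 0, 1]
y_prob = [0.9, 0.6, 0.1, 0.4]
print(net_benefit(y_true, y_prob, 0.2))  # 0.4375
print(net_benefit(y_true, y_prob, 0.5))  # 0.0
```

A decision curve is this quantity plotted over a range of thresholds for each model, alongside the "treat all" and "treat none" strategies; the weighting term grows with the threshold, which is why differences between models can change at high threshold probabilities.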
Affiliation(s)
- Nicolas Allou
- Intensive Care Unit, Félix Guyon University Hospital, Saint Denis, France; Clinical Informatics Department, Félix Guyon University Hospital, Saint Denis, France
- Jérôme Allyn
- Intensive Care Unit, Félix Guyon University Hospital, Saint Denis, France; Clinical Informatics Department, Félix Guyon University Hospital, Saint Denis, France
- Sophie Provenchere
- Anesthesia and Cardiac Surgery, Bichat Claude Bernard University Hospital, Paris, France
- Benjamin Delmas
- Anesthesia and Cardiac Surgery, Félix Guyon University Hospital, Saint Denis, France
- Eric Braunberger
- Anesthesia and Cardiac Surgery, Félix Guyon University Hospital, Saint Denis, France
- Matthieu Oliver
- Clinical Informatics Department, Félix Guyon University Hospital, Saint Denis, France; Unité de Soutien Méthodologique, Centre Hospitalier Universitaire Félix Guyon, Saint-Denis, France
- Cyril Ferdynus
- Clinical Informatics Department, Félix Guyon University Hospital, Saint Denis, France; Unité de Soutien Méthodologique, Centre Hospitalier Universitaire Félix Guyon, Saint-Denis, France; INSERM, Saint-Pierre, France
4
Perduca V, Bouaziz O, Zannis K, Beaussier M, Untereiner O. Can machine learning provide preoperative predictions of biological hemostasis after extracorporeal circulation for cardiac surgery? J Thorac Cardiovasc Surg 2023:S0022-5223(23)01019-X. [PMID: 37931798 DOI: 10.1016/j.jtcvs.2023.10.062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 10/12/2023] [Accepted: 10/31/2023] [Indexed: 11/08/2023]
Abstract
OBJECTIVES The goal of this study was to improve decision making regarding the transfusion of patients at the end of extracorporeal circulation for cardiac surgery through machine learning predictions of the evolution of platelet counts, prothrombin ratio, and fibrinogen assay. METHODS Prospective data with information about patient preoperative biology and surgery characteristics were collected at Institut Mutualiste Montsouris Hospital (Paris, France) for 10 months (n = 598). For each outcome of interest, instead of arbitrarily choosing 1 machine learning algorithm, we trained and tested a variety of algorithms together with the super learning algorithm, a state-of-the-art ensemble method that aggregates all the predictions and selects the best performing algorithm (total, 137 algorithms). We considered the top-performing algorithms and compared them to more standard and interpretable multivariable linear regression models. All algorithms were evaluated through their root mean squared error, a measure of the average difference between true and predicted values. RESULTS The root mean squared errors of the top algorithms for predicting the difference between pre- and postoperative platelet counts, prothrombin ratio, and fibrinogen assay were 38.27 × 10^9/L, 8.66%, and 0.44 g/L, respectively. The linear models had similar performances. CONCLUSIONS Our machine learning algorithms accurately predicted prothrombin ratio and fibrinogen assay and less accurately platelet counts. As such, our models could provide a decision-aid tool for anesthetists in an operating room; future clinical trials addressing this hypothesis are warranted.
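The root mean squared error used to evaluate the algorithms above can be computed as follows. This is a minimal sketch with invented numbers, not the study's data or code; the function name `rmse` is illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: square the prediction errors, average
    them, and take the square root, so the result is in the same unit
    as the outcome (for example g/L for fibrinogen)."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy values (invented): errors of 0, 0 and 2 give sqrt(4/3).
print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```

Because errors are squared before averaging, a few large misses weigh more heavily than many small ones, which is worth keeping in mind when comparing the three outcomes reported in different units.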
Affiliation(s)
- Kostantinos Zannis
- Department of Cardiac Surgery, Institut Mutualiste Montsouris, Paris, France
- Marc Beaussier
- Department of Anesthesiology, Institut Mutualiste Montsouris, Paris, France
- Olivier Untereiner
- Department of Anesthesiology, Institut Mutualiste Montsouris, Paris, France
5
Sinha S, Dong T, Dimagli A, Vohra HA, Holmes C, Benedetto U, Angelini GD. Comparison of machine learning techniques in prediction of mortality following cardiac surgery: analysis of over 220 000 patients from a large national database. Eur J Cardiothorac Surg 2023; 63:ezad183. [PMID: 37154705 PMCID: PMC10275911 DOI: 10.1093/ejcts/ezad183] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 11/07/2022] [Revised: 04/19/2023] [Accepted: 05/05/2023] [Indexed: 05/10/2023] Open
Abstract
OBJECTIVES To perform a systematic comparison of in-hospital mortality risk prediction post-cardiac surgery between the predominant scoring system, the European System for Cardiac Operative Risk Evaluation (EuroSCORE) II; logistic regression (LR) retrained on the same variables; and alternative machine learning (ML) techniques: random forest (RF), neural networks (NN), XGBoost and weighted support vector machine. METHODS Retrospective analyses of prospectively routinely collected data on adult patients undergoing cardiac surgery in the UK from January 2012 to March 2019. Data were temporally split 70:30 into training and validation subsets. Mortality prediction models were created using the 18 variables of EuroSCORE II. Comparisons of discrimination, calibration and clinical utility were then conducted. Changes in model performance, variable-importance over time and hospital/operation-based model performance were also reviewed. RESULTS Of the 227 087 adults who underwent cardiac surgery during the study period, there were 6258 deaths (2.76%). In the testing cohort, there was an improvement in discrimination [XGBoost (95% confidence interval (CI) area under the receiver operator curve (AUC), 0.834-0.834, F1 score, 0.276-0.280) and RF (95% CI AUC, 0.833-0.834, F1, 0.277-0.281)] compared with EuroSCORE II (95% CI AUC, 0.817-0.818, F1, 0.243-0.245). There was no significant improvement in calibration with ML and retrained-LR compared to EuroSCORE II. However, EuroSCORE II overestimated risk across all deciles of risk and over time. The calibration drift was lowest in NN, XGBoost and RF compared with EuroSCORE II. Decision curve analysis showed XGBoost and RF to have greater net benefit than EuroSCORE II. CONCLUSIONS ML techniques showed some statistical improvements over retrained-LR and EuroSCORE II. The clinical impact of this improvement is modest at present. However, the incorporation of additional risk factors in future studies may improve upon these findings and warrants further study.
Affiliation(s)
- Shubhra Sinha
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Tim Dong
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Arnaldo Dimagli
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Hunaid A Vohra
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Chris Holmes
- Alan Turing Institute, London, UK
- Department of Statistics, University of Oxford, Oxford, UK
- Umberto Benedetto
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Gianni D Angelini
- Division of Cardiac Surgery, Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
6
Xiao S, Liu F, Yu L, Li X, Ye X, Gong X. Development and validation of a nomogram for blood transfusion during intracranial aneurysm clamping surgery: a retrospective analysis. BMC Med Inform Decis Mak 2023; 23:71. [PMID: 37076865 PMCID: PMC10114399 DOI: 10.1186/s12911-023-02157-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2022] [Accepted: 03/17/2023] [Indexed: 04/21/2023] Open
Abstract
PURPOSE Intraoperative blood transfusion is associated with adverse events. We aimed to establish a machine learning model to predict the probability of intraoperative blood transfusion during intracranial aneurysm surgery. METHODS Patients who underwent intracranial aneurysm surgery in our hospital between January 2019 and December 2021 were enrolled. Four machine learning models were benchmarked and the best learning model was used to establish the nomogram, before conducting a discriminative assessment. RESULTS A total of 375 patients were included for analysis in this model, among whom 108 received an intraoperative blood transfusion during the intracranial aneurysm surgery. The least absolute shrinkage and selection operator identified six related preoperative factors: hemoglobin, platelet, D-dimer, sex, white blood cell, and aneurysm rupture before surgery. Performance evaluation of the classification error demonstrated the following: K-nearest neighbor, 0.2903; logistic regression, 0.2290; ranger, 0.2518; and extreme gradient boosting model, 0.2632. A nomogram based on a logistic regression algorithm was established using the above six parameters. The AUC values of the nomogram were 0.828 (0.775, 0.881) and 0.796 (0.710, 0.882) in the development and validation groups, respectively. CONCLUSIONS Machine learning algorithms performed well in predicting intraoperative blood transfusion. The nomogram established using a logistic regression algorithm showed a good discriminative ability to predict intraoperative blood transfusion during aneurysm surgery.
Affiliation(s)
- Shugen Xiao
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
- Fan Liu
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
- Liyuan Yu
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
- Xiaopei Li
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
- Xihong Ye
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
- Xingrui Gong
- Institute of Brain Disease and Neuroscience, Department of Anesthesiology, Xiangyang Central Hospital, Affiliated Hospital of Hubei University of Arts and Science, Xiangyang, Hubei, China
7
Behnoush AH, Khalaji A, Rezaee M, Momtahen S, Mansourian S, Bagheri J, Masoudkabir F, Hosseini K. Machine learning-based prediction of 1-year mortality in hypertensive patients undergoing coronary revascularization surgery. Clin Cardiol 2023; 46:269-278. [PMID: 36588391 PMCID: PMC10018097 DOI: 10.1002/clc.23963] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 10/17/2022] [Revised: 12/12/2022] [Accepted: 12/19/2022] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Machine learning (ML) has shown promising results in all fields of medicine, including preventive cardiology. Hypertensive patients are at higher risk of mortality after coronary artery bypass graft (CABG) surgery; thus, we aimed to design and evaluate five ML models to predict 1-year mortality among hypertensive patients who underwent CABG. HYPOTHESIS ML algorithms can significantly improve mortality prediction after CABG. METHODS Tehran Heart Center's CABG data registry was used to extract several baseline and peri-procedural characteristics and mortality data. The best features were chosen using the random forest (RF) feature selection algorithm. Five ML models were developed to predict 1-year mortality: logistic regression (LR), RF, artificial neural network (ANN), extreme gradient boosting (XGB), and naïve Bayes (NB). The area under the curve (AUC), sensitivity, and specificity were used to evaluate the models. RESULTS Among the 8,493 hypertensive patients who underwent CABG (mean age of 68.27 ± 9.27 years), 303 died in the first year. Eleven features were selected as the best predictors, among which total ventilation hours and ejection fraction were the leading ones. LR showed the best prediction ability with an AUC of 0.82, while the lowest AUC was for the NB model (0.79). Among the subgroups, the highest AUC for the LR model was for two age range groups (50-59 and 80-89 years), overweight, diabetic, and smoker subgroups of hypertensive patients. CONCLUSIONS All ML models had excellent performance in predicting 1-year mortality among hypertensive CABG patients, while LR was the best regarding AUC. These models can help clinicians assess the risk of mortality in specific subgroups at higher risk (such as hypertensive ones).
Affiliation(s)
- Amir Hossein Behnoush
- Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Amirmohammad Khalaji
- Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran
- Malihe Rezaee
- Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Non-Communicable Diseases Research Center, Endocrinology and Metabolism Population Sciences Institute, Tehran University of Medical Sciences, Tehran, Iran; School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Shahram Momtahen
- Department of Surgery, Tehran Heart Center, Tehran University of Medical Sciences, Tehran, Iran
- Soheil Mansourian
- Department of Surgery, Tehran Heart Center, Tehran University of Medical Sciences, Tehran, Iran
- Jamshid Bagheri
- Department of Surgery, Tehran Heart Center, Tehran University of Medical Sciences, Tehran, Iran
- Farzad Masoudkabir
- Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
- Kaveh Hosseini
- Tehran Heart Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran; Cardiac Primary Prevention Research Center, Cardiovascular Diseases Research Institute, Tehran University of Medical Sciences, Tehran, Iran
8
Dong T, Sinha S, Zhai B, Fudulu DP, Chan J, Narayan P, Judge A, Caputo M, Dimagli A, Benedetto U, Angelini GD. Cardiac surgery risk prediction using ensemble machine learning to incorporate legacy risk scores: A benchmarking study. Digit Health 2023; 9:20552076231187605. [PMID: 37492033 PMCID: PMC10363892 DOI: 10.1177/20552076231187605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2023] [Accepted: 06/23/2023] [Indexed: 07/27/2023] Open
Abstract
Objective The introduction of new clinical risk scores (e.g. European System for Cardiac Operative Risk Evaluation (EuroSCORE) II) superseding original scores (e.g. EuroSCORE I) with different variable sets typically results in disparate datasets due to high levels of missingness for new score variables prior to time of adoption. Little is known about the use of ensemble learning to incorporate disparate data from legacy scores. We tested the hypothesis that homogeneous and heterogeneous machine learning (ML) ensembles will have better performance than ensembles of Dynamic Model Averaging (DMA) for combining knowledge from EuroSCORE I legacy data with EuroSCORE II data to predict cardiac surgery risk. Methods Using the National Adult Cardiac Surgery Audit dataset, we trained 12 different base learner models, based on two different variable sets from either EuroSCORE I (LogES) or EuroSCORE II (ES II), partitioned by the time of score adoption (1996-2016 or 2012-2016) and evaluated on a holdout set (2017-2019). These base learner models were ensembled using nine different combinations of six ML algorithms to produce homogeneous or heterogeneous ensembles. Performance was assessed using a consensus metric. Results XGBoost homogeneous ensemble (HE) was the highest performing model (clinical effectiveness metric (CEM) 0.725) with area under the curve (AUC) (0.8327; 95% confidence interval (CI) 0.8323-0.8329) followed by Random Forest HE (CEM 0.723; AUC 0.8325; 95% CI 0.8320-0.8326). Across different heterogeneous ensembles, significantly better performance was obtained by combining siloed datasets across time (CEM 0.720) than building ensembles of either 1996-2011 (t-test adjusted, p = 1.67×10^-6) or 2012-2019 (t-test adjusted, p = 1.35×10^-193) datasets alone. Conclusions Both homogeneous and heterogeneous ML ensembles performed significantly better than the DMA ensemble of Bayesian Update models. Time-dependent ensemble combination of variables, having differing qualities according to time of score adoption, enabled previously siloed data to be combined, leading to increased power, clinical interpretability of variables and usage of data.
Affiliation(s)
- Tim Dong
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Shubhra Sinha
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Ben Zhai
- School of Computing Science, Northumbria University, Newcastle upon Tyne, UK
- Daniel P Fudulu
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Jeremy Chan
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Pradeep Narayan
- Department of Cardiac Surgery, Rabindranath Tagore International Institute of Cardiac Sciences, Kolkata, India
- Andy Judge
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Massimo Caputo
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Arnaldo Dimagli
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Umberto Benedetto
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
- Gianni D Angelini
- Translational Health Sciences, Bristol Heart Institute, University of Bristol, Bristol, UK
|
9
|
Hsu W, Warren J, Riddle P. Multivariate Sequential Analytics for Cardiovascular Disease Event Prediction. Methods Inf Med 2022; 61:e149-e171. [PMID: 36564011 PMCID: PMC9788915 DOI: 10.1055/s-0042-1758687] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/25/2022]
Abstract
BACKGROUND Automated clinical decision support for risk assessment is a powerful tool in combating cardiovascular disease (CVD), enabling targeted early intervention that could avoid issues of overtreatment or undertreatment. However, current CVD risk prediction models use observations at baseline without explicitly representing patient history as a time series. OBJECTIVE The aim of this study is to examine whether explicitly modelling the temporal dimension of patient history may improve event prediction. METHODS This study investigates methods for multivariate sequential modelling with a particular emphasis on long short-term memory (LSTM) recurrent neural networks. Data from a CVD decision support tool are linked to routinely collected national datasets including pharmaceutical dispensing, hospitalization, laboratory test results, and deaths. The study uses a 2-year observation window and a 5-year prediction window. Selected methods are applied to the linked dataset. The experiments performed focus on CVD event prediction. CVD death or hospitalization in a 5-year interval was predicted for patients with a history of lipid-lowering therapy. RESULTS The results of the experiments showed that temporal models are valuable for CVD event prediction over a 5-year interval. This is especially the case for LSTM, which produced the best predictive performance among all models compared, achieving an AUROC of 0.801 and an average precision of 0.425. The non-temporal comparator, a ridge classifier (RC) trained either on all quarterly data or on aggregated quarterly data (averaging time-varying features), was highly competitive, achieving an AUROC of 0.799 and an average precision of 0.420, and an AUROC of 0.800 and an average precision of 0.421, respectively. CONCLUSION This study provides evidence that the use of deep temporal models, particularly LSTM, in clinical decision support for chronic disease would be advantageous, with LSTM significantly improving on commonly used regression models such as logistic regression and Cox proportional hazards on the task of CVD event prediction.
Affiliation(s)
- William Hsu
- School of Computer Science, University of Auckland, Auckland, New Zealand. Address for correspondence: William Hsu, PhD, School of Computer Science, University of Auckland, Private Bag 92019, Auckland 1142, New Zealand
- Jim Warren
- School of Computer Science, University of Auckland, Auckland, New Zealand
- Patricia Riddle
- School of Computer Science, University of Auckland, Auckland, New Zealand
|
10
|
Fransvea P, Fransvea G, Liuzzi P, Sganga G, Mannini A, Costa G. Study and validation of an explainable machine learning-based mortality prediction following emergency surgery in the elderly: A prospective observational study. Int J Surg 2022; 107:106954. [PMID: 36229017 DOI: 10.1016/j.ijsu.2022.106954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 09/07/2022] [Accepted: 10/03/2022] [Indexed: 10/31/2022]
Abstract
INTRODUCTION The heterogeneity of procedures and the variety of comorbidities of patients undergoing surgery in an emergency setting make perioperative risk stratification, planning, and risk mitigation crucial. In this context, machine learning can derive data-driven predictions based on multivariate interactions across thousands of instances. Our aim was to cross-validate and test interpretable models for the prediction of post-operative mortality after any surgery in an emergency setting in elderly patients. METHODS This study is a secondary analysis derived from the FRAILESEL study, a multi-center (N = 29 emergency care units), nationwide, observational prospective study with data collected between 06-2017 and 06-2018 investigating perioperative outcomes of elderly patients (age ≥ 65 years) undergoing emergency surgery. Demographic and clinical data, medical and surgical history, preoperative risk factors, frailty, biochemical blood examination, vital parameters, and operative details were collected, and the primary outcome was 30-day mortality. RESULTS Of the 2570 included patients (50.66% males, median age 77 [IQR = 13] years), 238 (9.26%) were in the non-survivors group. The best performing solution (MultiLayer Perceptron) resulted in a test accuracy of 94.9% (sensitivity = 92.0%, specificity = 95.2%). Model explanations showed how non-chronic cardiac-related comorbidities, reduced activities of daily living, low consciousness levels, high creatinine and low saturation increase the risk of death following surgery. CONCLUSIONS In this prospective observational study, a robustly cross-validated model resulted in better predictive performance than existing tools and scores in the literature. By using only preoperative features and by deriving patient-specific explanations, the model provides crucial information during the shared decision-making processes required for risk mitigation procedures.
Affiliation(s)
- Pietro Fransvea
- Emergency Surgery and Trauma, Fondazione Policlinico Universitario A. Gemelli IRCCS, Università Cattolica del Sacro Cuore, Largo A. Gemelli 8, Rome, Italy; The BioRobotics Institute, Scuola Superiore Sant'Anna, Viale Rinaldo Piaggio 34, Pontedera, PI, Italy; IRCCS Fondazione Don Carlo Gnocchi ONLUS, Via di Scandicci 269, Firenze, FI, Italy; Surgery Center, Colorectal Surgery Unit - Fondazione Policlinico Campus Bio-Medico, University Hospital of University Campus Bio-Medico of Rome, Rome, Italy
|
11
|
Bi S, Chen S, Li J, Gu J. Machine learning-based prediction of in-hospital mortality for post cardiovascular surgery patients admitting to intensive care unit: a retrospective observational cohort study based on a large multi-center critical care database. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2022; 226:107115. [PMID: 36126435 DOI: 10.1016/j.cmpb.2022.107115] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/11/2021] [Revised: 07/15/2022] [Accepted: 09/04/2022] [Indexed: 06/15/2023]
Abstract
BACKGROUND AND OBJECTIVES The acute physiology and chronic health evaluation-IV model (APACHE-IV), and the sequential organ failure assessment (SOFA) score are two traditional severity assessment systems that can be applied to cardiac surgery patients admitted to intensive care units (ICUs). However, the performance of machine learning approaches in post cardiovascular surgery (PCS) patients admitted to the ICU remains unknown. METHODS The clinical data of adult subjects were collected from the eICU database. Seven models were constructed based on the training set (70% random sample) for predicting hospital mortality, including two traditional models based on APACHE-IV and SOFA scores and five machine learning models. We measured the models' performance in the remaining 30% of the sample by computing AUC-ROC values, prospective prediction results, and decision curves and compared the models with net reclassification improvement. RESULTS This study included 5860 PCS patients. The AUC-ROC value of the Xgboost model significantly outperformed the APACHE-IV and SOFA scores (0.12 [0.06-0.17] p < 0.01, 0.18 [0.1-0.26] p < 0.01 respectively). The use of ML models would also gain more clinical net benefits than traditional models based on decision curve analysis. There was a significant improvement in integrated discrimination when comparing the backward stepwise linear regression model with the APACHE-IV model (0.11 [0.05, 0.16], p < 0.01) and SOFA model (0.12 [0.06, 0.17], p < 0.01). CONCLUSIONS In conclusion, the predictive ability of ML models was better than that of traditional models. The present study suggested that developing advanced prognosis prediction tools could support clinical decision-making in the ICU for PCS patients.
Affiliation(s)
- Siwei Bi
- Department of Burn and Plastic Surgery, West China Hospital, Sichuan University, Chengdu 610041, China
- Shanshan Chen
- West China School of Medicine, Sichuan University, Chengdu 610041, China
- Jingyi Li
- West China School of Medicine, Sichuan University, Chengdu 610041, China
- Jun Gu
- Department of Cardiovascular Surgery, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
12
|
Gao Y, Liu X, Wang L, Wang S, Yu Y, Ding Y, Wang J, Ao H. Machine learning algorithms to predict major bleeding after isolated coronary artery bypass grafting. Front Cardiovasc Med 2022; 9:881881. [PMID: 35966564 PMCID: PMC9366116 DOI: 10.3389/fcvm.2022.881881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2022] [Accepted: 06/27/2022] [Indexed: 11/13/2022]
Abstract
Objectives Postoperative major bleeding is a common problem in patients undergoing cardiac surgery and is associated with poor outcomes. We evaluated the performance of machine learning (ML) methods to predict postoperative major bleeding. Methods A total of 1,045 patients who underwent isolated coronary artery bypass graft surgery (CABG) were enrolled. Their datasets were assigned randomly to training (70%) or a testing set (30%). The primary outcome was major bleeding defined as the universal definition of perioperative bleeding (UDPB) classes 3–4. We constructed a reference logistic regression (LR) model using known predictors. We also developed several modern ML algorithms. In the test set, we compared the area under the receiver operating characteristic curves (AUCs) of these ML algorithms with the reference LR model results, and the TRUST and WILL-BLEED risk score. Calibration analysis was undertaken using the calibration belt method. Results The prevalence of postoperative major bleeding was 7.1% (74/1,045). For major bleeds, the conditional inference random forest (CIRF) model showed the highest AUC [0.831 (0.732–0.930)], and the stochastic gradient boosting (SGBT) and random forest models demonstrated the next best results [0.820 (0.742–0.899) and 0.810 (0.719–0.902)]. The AUCs of all ML models were higher than [0.629 (0.517–0.641) and 0.557 (0.449–0.665)], as achieved by TRUST and WILL-BLEED, respectively. Conclusion ML methods successfully predicted major bleeding after cardiac surgery, with greater performance compared with previous scoring models. Modern ML models may enhance the identification of high-risk major bleeding subpopulations.
Affiliation(s)
- Yuchen Gao
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Xiaojie Liu
- Department of Anesthesiology, The Affiliated Hospital of Qingdao University, Qingdao, China
- Lijuan Wang
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Sudena Wang
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yang Yu
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yao Ding
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Jingcan Wang
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Hushan Ao
- Department of Anesthesiology, Fuwai Hospital, State Key Laboratory of Cardiovascular Disease, National Center of Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- *Correspondence: Hushan Ao
|
13
|
Fan Y, Dong J, Wu Y, Shen M, Zhu S, He X, Jiang S, Shao J, Song C. Development of machine learning models for mortality risk prediction after cardiac surgery. Cardiovasc Diagn Ther 2022; 12:12-23. [PMID: 35282663 PMCID: PMC8898685 DOI: 10.21037/cdt-21-648] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/09/2021] [Accepted: 12/28/2021] [Indexed: 02/12/2024]
Abstract
BACKGROUND We developed machine learning models that combine preoperative and intraoperative risk factors to predict mortality after cardiac surgery. METHODS Machine learning involving random forest, neural network, support vector machine, and gradient boosting machine was developed and compared with the risk scores of EuroSCORE I and II, Society of Thoracic Surgeons (STS), as well as a logistic regression model. Clinical data were collected from patients undergoing adult cardiac surgery at the First Medical Centre of Chinese PLA General Hospital between December 2008 and December 2017. The primary outcome was post-operative mortality. Model performance was estimated using several metrics, including sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC). The visualization algorithm was implemented using Shapley's additive explanations. RESULTS A total of 5,443 patients were enrolled during the study period. The mean EuroSCORE II score was 3.7%, and the actual in-hospital mortality rate was 2.7%. For predicting operative mortality after cardiac surgery, the AUC scores were 0.87, 0.79, 0.81, and 0.82 for random forest, neural network, support vector machine, and gradient boosting machine, compared with 0.70, 0.73, 0.71, and 0.74 for EuroSCORE I and II, STS, and logistic regression model. Shapley's additive explanations analysis of random forest yielded the top-20 predictors and individual-level explanations for each prediction. CONCLUSIONS Machine learning models based on available clinical data may be superior to clinical scoring tools in predicting postoperative mortality in patients following cardiac surgery. Explanatory models show the potential to provide personalized risk profiles for individuals by accounting for the contribution of influencing factors. Additional prospective multicenter studies are warranted to confirm the clinical benefit of these machine learning-driven models.
Affiliation(s)
- Yunlong Fan
- Medical School of Chinese PLA, Beijing, China
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
- Junfeng Dong
- Department of Organ Transplantation, Changzhen Hospital, Navy Medical University, Shanghai, China
- Yuanbin Wu
- Medical School of Chinese PLA, Beijing, China
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
- Ming Shen
- Department of Cardiology, The First Hospital of Hebei Medical University, Shijiazhuang, China
- Siming Zhu
- Medical School of Chinese PLA, Beijing, China
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
- Xiaoyi He
- Medical School of Chinese PLA, Beijing, China
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
- Shengli Jiang
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
- Chao Song
- Medical School of Chinese PLA, Beijing, China
- Department of Cardiovascular Surgery, the First Medical Centre of Chinese PLA General Hospital, Beijing, China
|
14
|
Rellum SR, Schuurmans J, van der Ven WH, Eberl S, Driessen AHG, Vlaar APJ, Veelo DP. Machine learning methods for perioperative anesthetic management in cardiac surgery patients: a scoping review. J Thorac Dis 2022; 13:6976-6993. [PMID: 35070381 PMCID: PMC8743411 DOI: 10.21037/jtd-21-765] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/01/2021] [Accepted: 08/27/2021] [Indexed: 12/27/2022]
Abstract
Background Machine learning (ML) is developing fast with promising prospects within medicine and already has several applications in perioperative care. We conducted a scoping review to examine the extent and potential limitations of ML implementation in perioperative anesthetic care, specifically in cardiac surgery patients. Methods We mapped the current literature by searching three databases: MEDLINE (Ovid), EMBASE (Ovid), and Cochrane Library. Articles were eligible if they reported on perioperative ML use in the field of cardiac surgery with relevance to anesthetic practices. Data on the applicability of ML and comparability to conventional statistical methods were extracted. Results Forty-six articles on ML relevant to the work of the anesthesiologist in cardiac surgery were identified. Three main categories emerged: (I) event and risk prediction, (II) hemodynamic monitoring, and (III) automation of echocardiography. Prediction models based on ML tend to behave similarly to conventional statistical methods. Using dynamic hemodynamic or ultrasound data in ML models, however, shifts the potential to promising results. Conclusions ML in cardiac surgery is increasingly used in perioperative anesthetic management. The majority is used for prediction purposes similar to conventional clinical scores. Remarkable ML model performances are achieved when using real-time dynamic parameters. However, beneficial clinical outcomes of ML integration have yet to be determined. Nonetheless, the first steps introducing ML in perioperative anesthetic care for cardiac surgery have been taken.
Affiliation(s)
- Santino R Rellum
- Department of Anesthesiology, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands; Department of Intensive Care, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Jaap Schuurmans
- Department of Anesthesiology, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands; Department of Intensive Care, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Ward H van der Ven
- Department of Anesthesiology, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands; Department of Intensive Care, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Susanne Eberl
- Department of Anesthesiology, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Antoine H G Driessen
- Department of Cardiothoracic Surgery, Heart Center, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Alexander P J Vlaar
- Department of Intensive Care, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
- Denise P Veelo
- Department of Anesthesiology, Amsterdam UMC, Location AMC, Amsterdam, The Netherlands
|
15
|
Ostberg NP, Zafar MA, Elefteriades JA. Machine learning: principles and applications for thoracic surgery. Eur J Cardiothorac Surg 2021; 60:213-221. [PMID: 33748840 DOI: 10.1093/ejcts/ezab095] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 08/05/2020] [Revised: 01/25/2021] [Accepted: 01/27/2021] [Indexed: 12/20/2022]
Abstract
OBJECTIVES Machine learning (ML) has experienced a revolutionary decade with advances across many disciplines. We seek to understand how recent advances in ML are going to specifically influence the practice of surgery in the future with a particular focus on thoracic surgery. METHODS Review of relevant literature in both technical and clinical domains. RESULTS ML is a revolutionary technology that promises to change the way that surgery is practiced in the near future. Spurred by an advance in computing power and the volume of data produced in healthcare, ML has shown remarkable ability to master tasks that had once been reserved for physicians. Supervised learning, unsupervised learning and reinforcement learning are all important techniques that can be leveraged to improve care. Five key applications of ML to cardiac surgery include diagnostics, surgical skill assessment, postoperative prognostication, augmenting intraoperative performance and accelerating translational research. Some key limitations of ML include lack of interpretability, low quality and volumes of relevant clinical data, ethical limitations and difficulties with clinical implementation. CONCLUSIONS In the future, the practice of cardiac surgery will be greatly augmented by ML technologies, ultimately leading to improved surgical performance and better patient outcomes.
Affiliation(s)
- Nicolai P Ostberg
- Aortic Institute at Yale-New Haven Hospital, Yale University School of Medicine, New Haven, CT, USA; New York University Grossman School of Medicine, New York, NY, USA
- Mohammad A Zafar
- Aortic Institute at Yale-New Haven Hospital, Yale University School of Medicine, New Haven, CT, USA
- John A Elefteriades
- Aortic Institute at Yale-New Haven Hospital, Yale University School of Medicine, New Haven, CT, USA
|
16
|
Hernandez-Vaquero C, Hernandez-Vaquero D. Neural networks may outperform classical regressions, but only when non-linear relationships are considered. Eur J Cardiothorac Surg 2021; 60:433. [PMID: 34263297 DOI: 10.1093/ejcts/ezab051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2020] [Accepted: 01/24/2021] [Indexed: 11/13/2022]
|
17
|
Giang KW, Helgadottir S, Dellborg M, Volpe G, Mandalenakis Z. Enhanced prediction of atrial fibrillation and mortality among patients with congenital heart disease using nationwide register-based medical hospital data and neural networks. EUROPEAN HEART JOURNAL. DIGITAL HEALTH 2021; 2:568-575. [PMID: 36713111 PMCID: PMC9707883 DOI: 10.1093/ehjdh/ztab065] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/11/2021] [Revised: 06/28/2021] [Accepted: 07/14/2021] [Indexed: 02/01/2023]
Abstract
Aims To improve short- and long-term predictions of mortality and atrial fibrillation (AF) among patients with congenital heart disease (CHD) from a nationwide population using neural networks (NN). Methods and results The Swedish National Patient Register and the Cause of Death Register were used to identify all patients with CHD born from 1970 to 2017. A total of 71 941 CHD patients were identified and followed up from birth until the event or the end of the study in 2017. Based on data from a nationwide population, an NN model was obtained to predict mortality and AF. Logistic regression (LR) based on the same data was used as a baseline comparison. Of 71 941 CHD patients, a total of 5768 (8.02%) died and 995 (1.38%) developed AF over time, with a mean follow-up time of 16.47 years (standard deviation 12.73 years). The performance of NN models in predicting mortality and AF was higher than the performance of LR regardless of the complexity of the disease, with an average area under the receiver operating characteristic curve of >0.80 and >0.70, respectively. The largest differences were observed in mortality and complexity of CHD over time. Conclusion We found that NN can be used to predict mortality and AF on a nationwide scale using data that are easily obtainable by clinicians. In addition, NN showed high performance overall and, in most cases, better predictive performance than more traditional regression methods.
Affiliation(s)
- Kok Wai Giang
- Institute of Medicine, Department of Molecular and Clinical Medicine, Sahlgrenska Academy, University of Gothenburg, Diagnosvägen 11, 416 50 Gothenburg, Sweden. Corresponding author. Tel: +46736488997
- Saga Helgadottir
- Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Mikael Dellborg
- Institute of Medicine, Department of Molecular and Clinical Medicine, Sahlgrenska Academy, University of Gothenburg, Diagnosvägen 11, 416 50 Gothenburg, Sweden; Adult Congenital Heart Unit, Department of Medicine, Sahlgrenska University Hospital/Östra, Gothenburg, Sweden
- Giovanni Volpe
- Department of Physics, University of Gothenburg, Gothenburg, Sweden
- Zacharias Mandalenakis
- Institute of Medicine, Department of Molecular and Clinical Medicine, Sahlgrenska Academy, University of Gothenburg, Diagnosvägen 11, 416 50 Gothenburg, Sweden; Adult Congenital Heart Unit, Department of Medicine, Sahlgrenska University Hospital/Östra, Gothenburg, Sweden
|
18
|
Sinha S, Benedetto U. Reply to Hernandez-Vaquero and Hernandez-Vaquero. Eur J Cardiothorac Surg 2021; 60:433-434. [PMID: 34263299 DOI: 10.1093/ejcts/ezab057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2021] [Accepted: 01/24/2021] [Indexed: 11/12/2022]
Affiliation(s)
- Shubhra Sinha
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
- Umberto Benedetto
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, UK
|
19
|
Cho SM, Austin PC, Ross HJ, Abdel-Qadir H, Chicco D, Tomlinson G, Taheri C, Foroutan F, Lawler PR, Billia F, Gramolini A, Epelman S, Wang B, Lee DS. Machine Learning Compared With Conventional Statistical Models for Predicting Myocardial Infarction Readmission and Mortality: A Systematic Review. Can J Cardiol 2021; 37:1207-1214. [PMID: 33677098 DOI: 10.1016/j.cjca.2021.02.020] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 12/12/2020] [Revised: 02/23/2021] [Accepted: 02/27/2021] [Indexed: 12/22/2022]
Abstract
BACKGROUND Machine learning (ML) methods are increasingly used in addition to conventional statistical modelling (CSM) for predicting readmission and mortality in patients with myocardial infarction (MI). However, the two approaches have not been systematically compared across studies of prognosis in patients with MI. METHODS Following PRISMA guidelines, we systematically reviewed the literature via Medline, EPub, Cochrane Central, Embase, Inspec, ACM Digital Library, and Web of Science. Eligible studies included primary research articles published from January 2000 to March 2020, comparing ML and CSM for prognostication after MI. RESULTS Of 7,348 articles, 112 underwent full-text review, with the final set composed of 24 articles representing 374,365 patients. ML methods included artificial neural networks (n = 12 studies), random forests (n = 11), decision trees (n = 8), support vector machines (n = 8), and Bayesian techniques (n = 7). CSM included logistic regression (n = 19 studies), existing CSM-derived risk scores (n = 12), and Cox regression (n = 2). Thirteen of 19 studies examining mortality reported higher C-indexes with the use of ML compared with CSM. One study examined readmissions at 2 different time points, with C-indexes that were higher for ML than CSM. Across all studies, a total of 29 comparisons were performed, but the majority (n = 26, 90%) found small (< 0.05) absolute differences in the C-index between ML and CSM. With the use of a modified CHARMS checklist, sources of bias were identifiable in the majority of studies, and only 2 were externally validated. CONCLUSION Although ML algorithms tended to have higher C-indexes than CSM for predicting death or readmission after MI, these studies exhibited threats to internal validity and were often unvalidated. Further comparisons are needed, with adherence to clinical quality standards for prognosis research. (Trial registration: PROSPERO CRD42019134896).
Affiliation(s)
- Sung Min Cho
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Peter C Austin
- Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada; Institute for Health Policy, Management and Evaluation, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Heather J Ross
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Husam Abdel-Qadir
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada; Institute for Health Policy, Management and Evaluation, Toronto, Ontario, Canada; Women's College Hospital, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- George Tomlinson
- Institute for Health Policy, Management and Evaluation, Toronto, Ontario, Canada; Biostatistics Research Unit, University Health Network, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Cameron Taheri
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Farid Foroutan
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada
- Patrick R Lawler
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; Toronto General Hospital Research Institute, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Filio Billia
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; Toronto General Hospital Research Institute, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Anthony Gramolini
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Slava Epelman
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; Toronto General Hospital Research Institute, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Bo Wang
- Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
- Douglas S Lee
- Ted Rogers Centre for Heart Research, Toronto, Ontario, Canada; Peter Munk Cardiac Centre, University Health Network, Toronto, Ontario, Canada; Institute for Clinical Evaluative Sciences, Toronto, Ontario, Canada; Institute for Health Policy, Management and Evaluation, Toronto, Ontario, Canada; Toronto General Hospital Research Institute, Toronto, Ontario, Canada; University of Toronto, Toronto, Ontario, Canada
|