Reference Citation Analysis

For: Graeßner M, Jungwirth B, Frank E, Schaller SJ, Kochs E, Ulm K, Blobner M, Ulm B, Podtschaske AH, Kagerbauer SM. Enabling personalized perioperative risk prediction by using a machine-learning model based on preoperative data. Sci Rep 2023;13:7128. [PMID: 37130884 PMCID: PMC10153050 DOI: 10.1038/s41598-023-33981-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/21/2022] [Accepted: 04/21/2023] [Indexed: 05/04/2023] Open

Cited by Other Article(s)

Yang C, Zheng P, Li L, Zhang Q, Luo Z, Shi Z, Zhao S, Li Q. Machine learning-based model development for predicting risk factors of prolonged intra-aortic balloon pump therapy in patients with coronary artery bypass grafting. J Cardiothorac Surg 2024;19:383. [PMID: 38926828 PMCID: PMC11201335 DOI: 10.1186/s13019-024-02830-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 06/14/2024] [Indexed: 06/28/2024] Open

Chung P, Fong CT, Walters AM, Aghaeepour N, Yetisgen M, O’Reilly-Shah VN. Large Language Model Capabilities in Perioperative Risk Prediction and Prognostication. JAMA Surg 2024:2819795. [PMID: 38837145 PMCID: PMC11154375 DOI: 10.1001/jamasurg.2024.1621] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 03/08/2024] [Indexed: 06/06/2024]

Abstract

Importance

General-domain large language models may be able to perform risk stratification and predict postoperative outcome measures using a description of the procedure and a patient's electronic health record notes.

Objective

To examine predictive performance on 8 different tasks: prediction of American Society of Anesthesiologists Physical Status (ASA-PS), hospital admission, intensive care unit (ICU) admission, unplanned admission, hospital mortality, postanesthesia care unit (PACU) phase 1 duration, hospital duration, and ICU duration.

Design, Setting, and Participants

This prognostic study included task-specific datasets constructed from 2 years of retrospective electronic health records data collected during routine clinical care. Case and note data were formatted into prompts and given to the large language model GPT-4 Turbo (OpenAI) to generate a prediction and explanation. The setting included a quaternary care center comprising 3 academic hospitals and affiliated clinics in a single metropolitan area. Patients who had a surgery or procedure with anesthesia and at least 1 clinician-written note filed in the electronic health record before surgery were included in the study. Data were analyzed from November to December 2023.

Exposures

Compared original notes, note summaries, few-shot prompting, and chain-of-thought prompting strategies.

Main Outcomes and Measures

F1 score for binary and categorical outcomes. Mean absolute error for numerical duration outcomes.
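As a reminder of what these two measures compute, here is a minimal sketch in plain Python. The function names and toy values are hypothetical and are not taken from the study. Only the binary F1 score is shown, because the abstract does not state how F1 was averaged across classes for categorical outcomes such as ASA-PS.

```python
def f1_binary(y_true, y_pred, positive=1):
    """F1 score for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so F1 is reported as 0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data (hypothetical): an admission label and a duration in minutes.
admitted_true = [1, 0, 1, 1, 0]
admitted_pred = [1, 1, 0, 1, 0]
duration_true = [30, 60, 90]
duration_pred = [40, 50, 120]

f1 = f1_binary(admitted_true, admitted_pred)
mae = mean_absolute_error(duration_true, duration_pred)
```

Mean absolute error is expressed in the unit of the outcome (minutes or days), which is why the duration results in this abstract are reported with units.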

Results

Study results were measured on task-specific datasets, each with 1000 cases, with the exception of unplanned admission, which had 949 cases, and hospital mortality, which had 576 cases. The best results for each task included an F1 score of 0.50 (95% CI, 0.47-0.53) for ASA-PS, 0.64 (95% CI, 0.61-0.67) for hospital admission, 0.81 (95% CI, 0.78-0.83) for ICU admission, 0.61 (95% CI, 0.58-0.64) for unplanned admission, and 0.86 (95% CI, 0.83-0.89) for hospital mortality prediction. Performance on duration prediction tasks was universally poor across all prompt strategies: the large language model achieved a mean absolute error of 49 minutes (95% CI, 46-51 minutes) for PACU phase 1 duration, 4.5 days (95% CI, 4.2-5.0 days) for hospital duration, and 1.1 days (95% CI, 0.9-1.3 days) for ICU duration prediction.

Conclusions and Relevance

Current general-domain large language models may assist clinicians in perioperative risk stratification on classification tasks but are inadequate for numerical duration predictions. Their ability to produce high-quality natural language explanations for the predictions may make them useful tools in clinical workflows and may be complementary to traditional risk prediction models.

Baumgart A, Beck G, Ghezel-Ahmadi D. [Artificial intelligence in intensive care medicine]. Med Klin Intensivmed Notfmed 2024;119:189-198. [PMID: 38546864 DOI: 10.1007/s00063-024-01117-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 01/29/2024] [Accepted: 02/05/2024] [Indexed: 04/05/2024]

Kagerbauer SM, Ulm B, Podtschaske AH, Andonov DI, Blobner M, Jungwirth B, Graessner M. Susceptibility of AutoML mortality prediction algorithms to model drift caused by the COVID pandemic. BMC Med Inform Decis Mak 2024;24:34. [PMID: 38308256 PMCID: PMC10837894 DOI: 10.1186/s12911-024-02428-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/16/2024] [Indexed: 02/04/2024] Open

Abstract

BACKGROUND

Concept drift and covariate shift lead to a degradation of machine learning (ML) models. The objective of our study was to characterize sudden data drift as caused by the COVID pandemic. Furthermore, we investigated the suitability of certain methods in model training to prevent model degradation caused by data drift.

METHODS

We trained different ML models with the H2O AutoML method on a dataset comprising 102,666 cases of surgical patients collected in the years 2014-2019 to predict postoperative mortality using preoperatively available data. Models applied were Generalized Linear Model with regularization, Default Random Forest, Gradient Boosting Machine, eXtreme Gradient Boosting, Deep Learning and Stacked Ensembles comprising all base models. Further, we modified the original models by applying three different methods when training on the original pre-pandemic dataset: (1) we gave older data less weight, (2) we used only the most recent data for model training, and (3) we performed a z-transformation of the numerical input parameters. Afterwards, we tested model performance on a pre-pandemic and an in-pandemic data set not used in the training process, and analysed common features.
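The three training modifications described above can be illustrated with a small sketch. The abstract does not give the weighting scheme, the cut-off for "most recent" data, or the exact standardization procedure, so the exponential decay, the `half_life` and `n_years` parameters, and the use of the population standard deviation are all assumptions made for illustration. All function names are hypothetical.

```python
from statistics import mean, pstdev


def recency_weights(years, reference_year, half_life=2.0):
    """Assumed scheme: a case's weight halves every `half_life` years of age."""
    return [0.5 ** ((reference_year - y) / half_life) for y in years]


def keep_most_recent(cases, years, reference_year, n_years=1):
    """Assumed cut-off: keep only cases from the last `n_years` years."""
    return [c for c, y in zip(cases, years) if reference_year - y < n_years]


def fit_z_transform(values):
    """Estimate mean and standard deviation on the training data only."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("constant feature cannot be z-transformed")
    return mu, sigma


def apply_z_transform(values, mu, sigma):
    """Standardize values with previously fitted parameters."""
    return [(v - mu) / sigma for v in values]


# Toy example (hypothetical values, not study data).
years = [2019, 2017, 2015]
weights = recency_weights(years, reference_year=2019)
recent = keep_most_recent(["a", "b", "c"], years, reference_year=2019)

train_feature = [2, 4, 4, 4, 5, 5, 7, 9]
mu, sigma = fit_z_transform(train_feature)
z_new = apply_z_transform([5, 9], mu, sigma)
```

In this sketch the z-transformation parameters are fitted on training data and reused for new data. Whether the study standardized this way or within each dataset is not stated in the abstract.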

RESULTS

The models produced showed excellent areas under the receiver-operating characteristic curve and acceptable precision-recall curves when tested on a dataset from January-March 2020, but significant degradation when tested on a dataset collected in the first wave of the COVID pandemic from April-May 2020. When comparing the probability distributions of the input parameters, significant differences between pre-pandemic and in-pandemic data were found. The endpoint of our models, in-hospital mortality after surgery, did not differ significantly between pre- and in-pandemic data and was about 1% in each case. However, the models varied considerably in the composition of their input parameters. None of our applied modifications prevented a loss of performance, although very different models emerged from it, using a large variety of parameters.
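One common way to compare the distribution of a numerical input parameter between two periods is the two-sample Kolmogorov-Smirnov statistic, sketched below. The abstract does not say which comparison method the authors used, so this is an illustrative assumption rather than their procedure. The function name and the toy values are hypothetical.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples (0 = same, 1 = disjoint)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a_sorted = sorted(sample_a)
    b_sorted = sorted(sample_b)
    largest_gap = 0.0
    # The empirical CDFs only change at observed values, so checking those suffices.
    for x in sorted(set(a_sorted) | set(b_sorted)):
        cdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Toy values for one input parameter in two periods (hypothetical).
pre_pandemic = [1, 2, 3, 4]
in_pandemic = [3, 4, 5, 6]
shift = ks_statistic(pre_pandemic, in_pandemic)
```

A statistic near 0 suggests similar distributions and a value near 1 suggests strong covariate shift. Turning the statistic into a significance test requires a p-value calculation that is omitted here.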

CONCLUSIONS

Our results show that none of our tested easy-to-implement measures in model training can prevent deterioration in the case of sudden external events. Therefore, we conclude that, in the presence of concept drift and covariate shift, close monitoring and critical review of model predictions are necessary.

Sander J, Simon P, Hinske C. [Big data and artificial intelligence in anesthesia : Reality or fiction?]. DIE ANAESTHESIOLOGIE 2024;73:77-84. [PMID: 38066215 DOI: 10.1007/s00101-023-01362-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/28/2023] [Indexed: 02/08/2024]