1
Kale AU, Hogg HDJ, Pearson R, Glocker B, Golder S, Coombe A, Waring J, Liu X, Moore DJ, Denniston AK. Detecting Algorithmic Errors and Patient Harms for AI-Enabled Medical Devices in Randomized Controlled Trials: Protocol for a Systematic Review. JMIR Res Protoc 2024; 13:e51614. [PMID: 38941147 DOI: 10.2196/51614]
Abstract
BACKGROUND Artificial intelligence (AI) medical devices have the potential to transform existing clinical workflows and ultimately improve patient outcomes. AI medical devices have shown potential for a range of clinical tasks such as diagnostics, prognostics, and therapeutic decision-making (eg, drug dosing). There is, however, an urgent need to ensure that these technologies remain safe for all populations. Recent literature demonstrates the need for rigorous performance error analysis to identify issues such as algorithmic encoding of spurious correlations (eg, with protected characteristics) or specific failure modes that may lead to patient harm. Guidelines for reporting on studies that evaluate AI medical devices require the mention of performance error analysis; however, there is still a lack of understanding around how performance errors should be analyzed in clinical studies and what harms authors should aim to detect and report. OBJECTIVE This systematic review will assess the frequency and severity of AI errors and adverse events (AEs) in randomized controlled trials (RCTs) investigating AI medical devices as interventions in clinical settings. The review will also explore how performance errors are analyzed, including whether the analysis includes the investigation of subgroup-level outcomes. METHODS This systematic review will identify and select RCTs assessing AI medical devices. Search strategies will be deployed in MEDLINE (Ovid), Embase (Ovid), Cochrane CENTRAL, and clinical trial registries to identify relevant papers. RCTs identified in bibliographic databases will be cross-referenced with clinical trial registries. The primary outcomes of interest are the frequency and severity of AI errors, patient harms, and reported AEs. Quality assessment of RCTs will be based on version 2 of the Cochrane risk-of-bias tool (RoB 2).
Data analysis will include a comparison of error rates and patient harms between study arms, and a meta-analysis of the rates of patient harm in control versus intervention arms will be conducted if appropriate. RESULTS The project was registered on PROSPERO in February 2023. Preliminary searches have been completed and the search strategy has been designed in consultation with an information specialist and a methodologist. Title and abstract screening started in September 2023. Full-text screening is ongoing, and data collection and analysis began in April 2024. CONCLUSIONS Evaluations of AI medical devices have shown promising results; however, reporting of studies has been variable. Detection, analysis, and reporting of performance errors and patient harms are vital to robustly assess the safety of AI medical devices in RCTs. Scoping searches have illustrated that the reporting of harms is variable, often with no mention of AEs. The findings of this systematic review will identify the frequency and severity of AI performance errors and patient harms and generate insights into how errors should be analyzed to account for both overall and subgroup performance. TRIAL REGISTRATION PROSPERO CRD42023387747; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=387747. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID) PRR1-10.2196/51614.
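The planned comparison of harm rates between control and intervention arms can be illustrated with a standard fixed-effect (inverse-variance) pooling of risk ratios. This is a generic sketch of the technique, not the review's prespecified analysis, and the trial counts below are hypothetical:

```python
import math

def pooled_risk_ratio(trials):
    """Fixed-effect (inverse-variance) pooled risk ratio.

    trials: list of (events_intervention, n_intervention,
                     events_control, n_control) tuples.
    Returns (pooled RR, 95% CI lower bound, 95% CI upper bound).
    """
    weights, log_rrs = [], []
    for a, n1, c, n2 in trials:
        log_rr = math.log((a / n1) / (c / n2))
        # Standard error of log RR for a 2x2 table (no zero cells).
        se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)
        weights.append(1 / se ** 2)
        log_rrs.append(log_rr)
    pooled = sum(w * lr for w, lr in zip(weights, log_rrs)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled))

# Hypothetical harm counts from three trials:
# (harms_AI_arm, n_AI_arm, harms_control_arm, n_control_arm)
trials = [(8, 200, 12, 200), (5, 150, 9, 150), (11, 300, 10, 300)]
rr, lo, hi = pooled_risk_ratio(trials)
print(f"pooled RR {rr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

A pooled RR below 1 with a confidence interval excluding 1 would suggest fewer harms in the intervention arms; in practice heterogeneity checks and random-effects models would also be considered.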
Affiliation(s)
- Aditya U Kale
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
- NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
- Henry David Jeffry Hogg
- Population Health Science Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Russell Pearson
- Medicines and Healthcare Products Regulatory Agency, London, United Kingdom
- Ben Glocker
- Kheiron Medical Technologies, London, United Kingdom
- Department of Computing, Imperial College London, London, United Kingdom
- Su Golder
- Department of Health Sciences, University of York, York, United Kingdom
- April Coombe
- Institute of Applied Health Research, University of Birmingham, Birmingham, United Kingdom
- Justin Waring
- Health Services Management Centre, University of Birmingham, Birmingham, United Kingdom
- Xiaoxuan Liu
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
- NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
- David J Moore
- Institute of Applied Health Research, University of Birmingham, Birmingham, United Kingdom
- Alastair K Denniston
- Institute of Inflammation and Ageing, University of Birmingham, Birmingham, United Kingdom
- University Hospitals Birmingham NHS Foundation Trust, Birmingham, United Kingdom
- NIHR Birmingham Biomedical Research Centre, Birmingham, United Kingdom
- NIHR Incubator for AI and Digital Health Research, Birmingham, United Kingdom
2
Faust L, Wilson P, Asai S, Fu S, Liu H, Ruan X, Storlie C. Considerations for Quality Control Monitoring of Machine Learning Models in Clinical Practice. JMIR Med Inform 2024; 12:e50437. [PMID: 38941140 DOI: 10.2196/50437]
Abstract
Integrating machine learning (ML) models into clinical practice presents the challenge of maintaining their efficacy over time. While existing literature offers valuable strategies for detecting declining model performance, there is a need to document the broader challenges and solutions associated with the real-world development and integration of model monitoring solutions. This work details the development and use of a platform for monitoring the performance of a production-level ML model operating at Mayo Clinic. In this paper, we aimed to provide a series of considerations and guidelines necessary for integrating such a platform into a team's technical infrastructure and workflow. We have documented our experiences with this integration process, discussed the broader challenges encountered with real-world implementation and maintenance, and included the source code for the platform. Our monitoring platform was built as an R Shiny application, developed and implemented over the course of 6 months. The platform has been used and maintained for 2 years and was still in use as of July 2023. The considerations necessary for the implementation of the monitoring platform center around 4 pillars: feasibility (what resources can be used for platform development?); design (through what statistics or models will the model be monitored, and how will these results be efficiently displayed to the end user?); implementation (how will this platform be built, and where will it exist within the IT ecosystem?); and policy (based on monitoring feedback, when and what actions will be taken to fix problems, and how will these problems be communicated to clinical staff?). While much of the literature surrounding ML performance monitoring emphasizes methodological approaches for capturing changes in performance, a battery of other challenges and considerations must be addressed for successful real-world implementation.
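The "policy" pillar — mapping monitoring feedback to concrete actions — can be sketched minimally. This is not the authors' R Shiny platform; the thresholds, window construction, and action names are illustrative assumptions:

```python
def monitoring_action(baseline_error, recent_errors, warn=0.02, alarm=0.05):
    """Map a monitored statistic to a policy action.

    baseline_error: error rate observed at validation time.
    recent_errors: list of 0/1 error indicators from the latest window.
    warn/alarm: illustrative drift tolerances, not taken from the paper.
    """
    recent_rate = sum(recent_errors) / len(recent_errors)
    drift = recent_rate - baseline_error
    if drift >= alarm:
        return "alarm"   # eg, pause the model and notify clinical staff
    if drift >= warn:
        return "warn"    # eg, schedule a recalibration review
    return "ok"

# 13 errors in a 100-prediction window vs a 10% baseline → 3-point drift.
print(monitoring_action(0.10, [1] * 13 + [0] * 87))  # prints "warn"
```

The point of encoding the policy explicitly is that the response to degradation is decided before deployment, rather than improvised when a dashboard turns red.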
Affiliation(s)
- Louis Faust
- Robert D and Patricia E Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, United States
- Patrick Wilson
- Robert D and Patricia E Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, United States
- Shusaku Asai
- Robert D and Patricia E Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, United States
- Sunyang Fu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, United States
- Hongfang Liu
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, United States
- Xiaoyang Ruan
- Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN, United States
- Curt Storlie
- Robert D and Patricia E Kern Center for the Science of Health Care Delivery, Mayo Clinic, Rochester, MN, United States
3
Dong T, Sinha S, Zhai B, Fudulu D, Chan J, Narayan P, Judge A, Caputo M, Dimagli A, Benedetto U, Angelini GD. Performance Drift in Machine Learning Models for Cardiac Surgery Risk Prediction: Retrospective Analysis. JMIRx Med 2024; 5:e45973. [PMID: 38889069 PMCID: PMC11217160 DOI: 10.2196/45973]
Abstract
Background The Society of Thoracic Surgeons and European System for Cardiac Operative Risk Evaluation (EuroSCORE) II risk scores are the most commonly used risk prediction models for in-hospital mortality after adult cardiac surgery. However, they are prone to miscalibration over time and poor generalization across data sets; thus, their use remains controversial. Despite increased interest, a gap in understanding the effect of data set drift on the performance of machine learning (ML) over time remains a barrier to its wider use in clinical practice. Data set drift occurs when an ML system underperforms because of a mismatch between the data it was developed from and the data on which it is deployed. Objective In this study, we analyzed the extent of performance drift using models built on a large UK cardiac surgery database. The objectives were to (1) rank and assess the extent of performance drift in cardiac surgery risk ML models over time and (2) investigate any potential influence of data set drift and variable importance drift on performance drift. Methods We conducted a retrospective analysis of prospectively, routinely gathered data on adult patients undergoing cardiac surgery in the United Kingdom between 2012 and 2019. We temporally split the data 70:30 into a training and validation set and a holdout set. Five novel ML mortality prediction models were developed and assessed, along with EuroSCORE II, for relationships between and within variable importance drift, performance drift, and actual data set drift. Performance was assessed using a consensus metric. Results A total of 227,087 adults underwent cardiac surgery during the study period, with a mortality rate of 2.76% (n=6258). There was strong evidence of a decrease in overall performance across all models (P<.0001). 
Extreme gradient boosting (clinical effectiveness metric [CEM] 0.728, 95% CI 0.728-0.729) and random forest (CEM 0.727, 95% CI 0.727-0.728) were the overall best-performing models, both temporally and nontemporally. EuroSCORE II performed the worst across all comparisons. Sharp changes in variable importance and data set drift from October to December 2017, from June to July 2018, and from December 2018 to February 2019 mirrored the effects of performance decrease across models. Conclusions All models show a decrease in at least 3 of the 5 individual metrics. CEM and variable importance drift detection demonstrate the limitation of logistic regression methods used for cardiac surgery risk prediction and the effects of data set drift. Future work will be required to determine the interplay between ML models and whether ensemble models could improve on their respective performance advantages.
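Temporal performance drift of the kind measured in this study can be sketched by scoring a model period by period and flagging windows whose discrimination falls below the earliest period. The AUC here is the standard rank-based (Mann-Whitney) estimate, not the study's consensus clinical effectiveness metric, and the scores, labels, and tolerance are hypothetical:

```python
def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney) formula, stdlib only."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count score comparisons won by positives; ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def performance_drift(periods, tolerance=0.05):
    """Flag period indices whose AUC falls more than `tolerance`
    below the first (baseline) period."""
    baseline = auc(*periods[0])
    return [i for i, (scores, labels) in enumerate(periods)
            if baseline - auc(scores, labels) > tolerance]

period_0 = ([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])  # validation-era window
period_1 = ([0.6, 0.4, 0.5, 0.3], [1, 1, 0, 0])  # later deployment window
print(performance_drift([period_0, period_1]))  # [1]
```

On real data each period would hold thousands of cases, and a single threshold on one metric would be complemented by calibration and subgroup checks, as the study's multi-metric consensus approach implies.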
Affiliation(s)
- Tim Dong
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Shubhra Sinha
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Ben Zhai
- School of Computing Science, Northumbria University, Newcastle upon Tyne, United Kingdom
- Daniel Fudulu
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Jeremy Chan
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Pradeep Narayan
- Department of Cardiac Surgery, Rabindranath Tagore International Institute of Cardiac Sciences, West Bengal, India
- Andy Judge
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Massimo Caputo
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Arnaldo Dimagli
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Umberto Benedetto
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
- Gianni D Angelini
- Bristol Heart Institute, Translational Health Sciences, University of Bristol, Bristol, United Kingdom
4
Chung A, Opoku-Pare GA, Tibble H. Cause of death coding in asthma. BMC Med Res Methodol 2024; 24:129. [PMID: 38840045 PMCID: PMC11151540 DOI: 10.1186/s12874-024-02238-x]
Abstract
BACKGROUND While clinical coding is intended to be an objective and standardized practice, in reality it falls short of this ideal. The clinical and bureaucratic practices that lie between the event of death and a case being entered into a research dataset are important context for analysing and interpreting these data. Variation in practice can influence the accuracy of the final coded record at two different stages: the completion of the death certificate, and the International Classification of Diseases (Version 10; ICD-10) coding of that certificate. METHODS This study investigated 91,022 deaths recorded in the Scottish Asthma Learning Healthcare System dataset between 2000 and 2017. Asthma-related deaths were identified by the presence of ICD-10 code J45 or J46 in any position. These codes were categorized as relating either specifically to asthma attacks (status asthmaticus; J46) or generally to an asthma diagnosis (J45). RESULTS We found that one in every 200 deaths in this dataset was coded as asthma related. Less than 1% of asthma-related mortality records used both the J45 and J46 ICD-10 codes as causes. Infection (predominantly pneumonia) was more commonly reported as a contributing cause of death when J45 was the primary coded cause, compared to J46, which specifically denotes asthma attacks. CONCLUSION Further inspection of patient history can be essential to validate deaths recorded as caused by asthma, and to identify potentially mis-recorded non-asthma deaths, particularly in those with complex comorbidities.
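The study's case definition — any J45 or J46 code in any position on the record — translates directly into a prefix match over a record's cause-of-death codes. The records below are hypothetical illustrations, not study data:

```python
def asthma_related(cause_codes):
    """True if any ICD-10 cause-of-death code, in any position, denotes
    asthma (J45, general diagnosis) or status asthmaticus (J46)."""
    # Prefix match handles both dotted (J45.9) and undotted (J459) forms.
    return any(code.upper().startswith(("J45", "J46")) for code in cause_codes)

records = [
    {"id": 1, "codes": ["I21.9", "J45.9"]},  # asthma as a contributing cause
    {"id": 2, "codes": ["J46"]},             # status asthmaticus as sole cause
    {"id": 3, "codes": ["C34.1"]},           # unrelated cause
]
asthma_deaths = [r["id"] for r in records if asthma_related(r["codes"])]
print(asthma_deaths)  # [1, 2]
```

Matching in any position, rather than only the underlying cause, is what lets the analysis distinguish asthma as a primary versus contributing cause.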
Affiliation(s)
- Holly Tibble
- Usher Institute, University of Edinburgh, Edinburgh, Scotland
- Asthma UK Centre for Applied Research, Edinburgh, Scotland
5
Petrella RJ. The AI Future of Emergency Medicine. Ann Emerg Med 2024:S0196-0644(24)00043-X. [PMID: 38795081 DOI: 10.1016/j.annemergmed.2024.01.031]
Abstract
In the coming years, artificial intelligence (AI) and machine learning will likely give rise to profound changes in the field of emergency medicine, and medicine more broadly. This article discusses these anticipated changes in terms of 3 overlapping yet distinct stages of AI development. It reviews some fundamental concepts in AI and explores their relation to clinical practice, with a focus on emergency medicine. In addition, it describes some of the applications of AI in disease diagnosis, prognosis, and treatment, as well as some of the practical issues that they raise, the barriers to their implementation, and some of the legal and regulatory challenges they create.
Affiliation(s)
- Robert J Petrella
- Emergency Departments, CharterCARE Health Partners, Providence and North Providence, RI; Emergency Department, Boston VA Medical Center, Boston, MA; Emergency Departments, Steward Health Care System, Boston and Methuen, MA; Harvard Medical School, Boston, MA; Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA; Department of Medicine, Brigham and Women's Hospital, Boston, MA
6
Cherblanc J, Gaboury S, Maître J, Côté I, Cadell S, Bergeron-Leclerc C. Predicting levels of prolonged grief disorder symptoms during the COVID-19 pandemic: An integrated approach of classical data exploration, predictive machine learning, and explainable AI. J Affect Disord 2024; 351:746-754. [PMID: 38290589 DOI: 10.1016/j.jad.2024.01.236]
Abstract
BACKGROUND Prior studies on Prolonged Grief Disorder (PGD) primarily employed classical approaches to link bereaved individuals' characteristics with PGD symptom levels. This study utilized machine learning to identify key factors influencing PGD symptoms during the COVID-19 pandemic. METHODS We analyzed data from 479 participants in an online survey, employing classical data exploration, predictive machine learning, and SHapley Additive exPlanations (SHAP) to determine, from 19 variables, the key factors influencing PGD symptoms measured with the Traumatic Grief Inventory - Self Report (TGI-SR), comparing five predictive models. RESULTS The classical approach identified eight variables associated with possible PGD (TGI-SR score ≥ 59): unexpected cause of death, living alone, seeking professional support, taking anxiety and/or depression medications, using more grief services (telephone or online supports), using more confrontation-oriented coping strategies, and higher levels of depression and anxiety. Among the machine learning techniques, the CatBoost algorithm provided the best predictive model of the TGI-SR score (r2 = 0.6479). The three variables most strongly influencing the level of PGD symptoms were anxiety and the levels of avoidance and confrontation coping strategies used. CONCLUSIONS This pioneering approach within the field of grief research enabled us to leverage the extensive dataset collected during the pandemic, facilitating a deeper comprehension of the predominant factors influencing the grieving process for individuals who experienced loss during this period. LIMITATIONS This study acknowledges self-selection bias and limited sample diversity, and suggests that further research is needed to fully understand the predictors of PGD symptoms.
7
Ghanta SN, Gautam N, Mehta JL, Al'Aref SJ. Machine Learning for Predicting Intubations in Heart Failure Patients: the Challenge of the Right Approach. Cardiovasc Drugs Ther 2024; 38:211-214. [PMID: 36593325 PMCID: PMC9807425 DOI: 10.1007/s10557-022-07423-y]
Affiliation(s)
- Sai Nikhila Ghanta
- Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Nitesh Gautam
- Department of Internal Medicine, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Jawahar L. Mehta
- Department of Medicine, Division of Cardiology, University of Arkansas for Medical Sciences, 4301 W. Markham St, Little Rock, AR, USA
- Subhi J. Al'Aref
- Department of Medicine, Division of Cardiology, University of Arkansas for Medical Sciences, 4301 W. Markham St, Little Rock, AR, USA
8
Kore A, Abbasi Bavil E, Subasri V, Abdalla M, Fine B, Dolatabadi E, Abdalla M. Empirical data drift detection experiments on real-world medical imaging data. Nat Commun 2024; 15:1887. [PMID: 38424096 PMCID: PMC10904813 DOI: 10.1038/s41467-024-46142-w]
Abstract
While it is common to monitor deployed clinical artificial intelligence (AI) models for performance degradation, it is less common for the input data to be monitored for data drift, that is, systemic changes to input distributions. However, when real-time evaluation is not practical (eg, due to labeling costs) or when gold labels are automatically generated, we argue that tracking data drift becomes a vital addition for AI deployments. In this work, we perform empirical experiments on real-world medical imaging data to evaluate the ability of three data drift detection methods to detect data drift caused (a) naturally (the emergence of COVID-19 in X-rays) and (b) synthetically. We find that monitoring performance alone is not a good proxy for detecting data drift and that drift detection depends heavily on sample size and patient features. Our work discusses the need for and utility of data drift detection in various scenarios and highlights gaps in knowledge for the practical application of existing methods.
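One common building block for input-drift detection — not necessarily one of the three methods this paper evaluates — is the two-sample Kolmogorov-Smirnov test on a scalar feature extracted from each image (eg, mean intensity). A stdlib-only sketch, with synthetic data standing in for real feature values:

```python
import bisect
import math

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    def ecdf(xs, v):
        return bisect.bisect_right(xs, v) / len(xs)
    return max(abs(ecdf(a, v) - ecdf(b, v)) for v in set(a) | set(b))

def drifted(reference, current, alpha=0.05):
    """Flag drift when the KS statistic exceeds the large-sample
    critical value c(alpha) * sqrt((n + m) / (n * m))."""
    n, m = len(reference), len(current)
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return ks_statistic(reference, current) > c_alpha * math.sqrt((n + m) / (n * m))

reference = [i / 100 for i in range(100)]      # stand-in for a deployment-time feature
shifted = [0.5 + i / 200 for i in range(100)]  # the same feature after a shift
print(drifted(reference, list(reference)), drifted(reference, shifted))  # False True
```

As the abstract notes, sensitivity of such tests depends strongly on sample size; with small windows the critical value is large and modest shifts go undetected.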
Affiliation(s)
- Ali Kore
- Vector Institute, Toronto, Canada
- Vallijah Subasri
- Peter Munk Cardiac Center, University Health Network, Toronto, ON, Canada
- Moustafa Abdalla
- Department of Surgery, Harvard Medical School, Massachusetts General Hospital, Boston, USA
- Benjamin Fine
- Institute for Better Health, Trillium Health Partners, Mississauga, Canada
- Department of Medical Imaging, University of Toronto, Toronto, Canada
- Elham Dolatabadi
- Vector Institute, Toronto, Canada
- School of Health Policy and Management, Faculty of Health, York University, Toronto, Canada
- Mohamed Abdalla
- Institute for Better Health, Trillium Health Partners, Mississauga, Canada
9
Hashimoto DA, Sambasastry SK, Singh V, Kurada S, Altieri M, Yoshida T, Madani A, Jogan M. A foundation for evaluating the surgical artificial intelligence literature. Eur J Surg Oncol 2024:108014. [PMID: 38360498 DOI: 10.1016/j.ejso.2024.108014]
Abstract
With increasing growth in applications of artificial intelligence (AI) in surgery, it has become essential for surgeons to gain a foundation of knowledge to critically appraise the scientific literature, commercial claims regarding products, and regulatory and legal frameworks that govern the development and use of AI. This guide offers surgeons a framework with which to evaluate manuscripts that incorporate the use of AI. It provides a glossary of common terms, an overview of prerequisite knowledge to maximize understanding of methodology, and recommendations on how to carefully consider each element of a manuscript to assess the quality of the data on which an algorithm was trained, the appropriateness of the methodological approach, the potential for reproducibility of the experiment, and the applicability to surgical practice, including considerations on generalizability and scalability.
Affiliation(s)
- Daniel A Hashimoto
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA; Global Surgical AI Collaborative, Toronto, ON, Canada
- Sai Koushik Sambasastry
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Vivek Singh
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Sruthi Kurada
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, USA
- Maria Altieri
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA; Global Surgical AI Collaborative, Toronto, ON, Canada
- Takuto Yoshida
- Surgical AI Research Academy, Department of Surgery, University Health Network, Toronto, ON, Canada
- Amin Madani
- Global Surgical AI Collaborative, Toronto, ON, Canada; Surgical AI Research Academy, Department of Surgery, University Health Network, Toronto, ON, Canada
- Matjaz Jogan
- Penn Computer Assisted Surgery and Outcomes Laboratory, Department of Surgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
10
Conners KM, Avery CL, Syed FF. Advancing Cardiovascular Risk Assessment with Artificial Intelligence: Opportunities and Implications in North Carolina. N C Med J 2024; 85:10.18043/001c.91424. [PMID: 38938760 PMCID: PMC11208038 DOI: 10.18043/001c.91424]
Abstract
Cardiovascular disease mortality is increasing in North Carolina with persistent inequality by race, income, and location. Artificial intelligence (AI) can repurpose the widely available electrocardiogram (ECG) for enhanced assessment of cardiac dysfunction. By identifying accelerated cardiac aging from the ECG, AI offers novel insights into risk assessment and prevention.
Affiliation(s)
- Katherine M Conners
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Christy L Avery
- Department of Epidemiology, Gillings School of Global Public Health, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina
- Faisal F Syed
- Division of Cardiology, University of North Carolina School of Medicine, Chapel Hill, North Carolina
11
Hekman DJ, Barton HJ, Maru AP, Wills G, Cochran AL, Fritsch C, Wiegmann DA, Liao F, Patterson BW. Dashboarding to Monitor Machine-Learning-Based Clinical Decision Support Interventions. Appl Clin Inform 2024; 15:164-169. [PMID: 38029792 PMCID: PMC10901643 DOI: 10.1055/a-2219-5175]
Abstract
BACKGROUND Existing monitoring of machine-learning-based clinical decision support (ML-CDS) is focused predominantly on the ML outputs and accuracy thereof. Improving patient care requires not only accurate algorithms but also systems of care that enable the output of these algorithms to drive specific actions by care teams, necessitating expanding their monitoring. OBJECTIVES In this case report, we describe the creation of a dashboard that allows the intervention development team and operational stakeholders to govern and identify potential issues that may require corrective action by bridging the monitoring gap between model outputs and patient outcomes. METHODS We used an iterative development process to build a dashboard to monitor the performance of our intervention in the broader context of the care system. RESULTS Our investigation of best practices elsewhere, iterative design, and expert consultation led us to anchor our dashboard on alluvial charts and control charts. Both the development process and the dashboard itself illuminated areas to improve the broader intervention. CONCLUSION We propose that monitoring ML-CDS algorithms with regular dashboards that allow both a context-level view of the system and a drilled down view of specific components is a critical part of implementing these algorithms to ensure that these tools function appropriately within the broader care system.
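The control charts this dashboard is anchored on can be illustrated with Shewhart 3-sigma limits for a monitored proportion. The monitored statistic here (weekly fraction of alerts acknowledged by the care team) and all numbers are hypothetical, chosen only to show the mechanics:

```python
import math

def p_chart_limits(p_bar, n):
    """3-sigma Shewhart limits for a proportion monitored in
    fixed-size samples of n observations."""
    sigma = math.sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - 3 * sigma), min(1.0, p_bar + 3 * sigma)

def out_of_control(weekly_props, p_bar, n):
    """Indices of weeks whose proportion falls outside the limits."""
    lcl, ucl = p_chart_limits(p_bar, n)
    return [i for i, p in enumerate(weekly_props) if not lcl <= p <= ucl]

# Hypothetical: historical acknowledgement rate 0.80, ~100 alerts/week.
weeks = [0.82, 0.79, 0.81, 0.62, 0.80]
print(out_of_control(weeks, p_bar=0.80, n=100))  # [3]
```

A point outside the limits (week index 3 above) signals special-cause variation in the care process around the model, not necessarily in the model's own accuracy, which is exactly the gap the dashboard is meant to cover.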
Affiliation(s)
- Daniel J. Hekman
- Berbee-Walsh Department of Emergency Medicine, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Hanna J. Barton
- Berbee-Walsh Department of Emergency Medicine, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Apoorva P. Maru
- Berbee-Walsh Department of Emergency Medicine, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Graham Wills
- Department of Applied Data Science, UWHealth Hospitals and Clinics, Madison, Wisconsin, United States
- Amy L. Cochran
- Department of Population Health, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Corey Fritsch
- Department of Applied Data Science, UWHealth Hospitals and Clinics, Madison, Wisconsin, United States
- Douglas A. Wiegmann
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States
- Frank Liao
- Department of Applied Data Science, UWHealth Hospitals and Clinics, Madison, Wisconsin, United States
- Brian W. Patterson
- Berbee-Walsh Department of Emergency Medicine, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Department of Population Health, University of Wisconsin-Madison, School of Medicine and Public Health, Madison, Wisconsin, United States
- Department of Industrial and Systems Engineering, University of Wisconsin-Madison, Madison, Wisconsin, United States
12
van Velzen M, de Graaf-Waar HI, Ubert T, van der Willigen RF, Muilwijk L, Schmitt MA, Scheper MC, van Meeteren NLU. 21st century (clinical) decision support in nursing and allied healthcare. Developing a learning health system: a reasoned design of a theoretical framework. BMC Med Inform Decis Mak 2023; 23:279. [PMID: 38053104 PMCID: PMC10699040 DOI: 10.1186/s12911-023-02372-4]
Abstract
In this paper, we present a framework for developing a Learning Health System (LHS) that provides the foundation for a computerized clinical decision support system for allied healthcare and nursing professionals. LHSs are well suited to transforming healthcare systems through a mission-oriented approach and are being adopted by an increasing number of countries. Our theoretical framework provides a blueprint for organizing such a transformation with the help of evidence-based, state-of-the-art methodologies and techniques, with the aim of optimizing personalized health and healthcare. Learning via health information technologies in an LHS enables users to learn both individually and collectively, independently of their location. These developments demand healthcare innovations that go beyond a disease-focused orientation, since clinical decision-making in allied healthcare and nursing is based mainly on aspects of individuals' functioning, wellbeing, and (dis)abilities. Developing LHSs depends heavily on intertwined social and technological innovation, research, and development. A crucial factor may be the transformation of the Internet of Things into the Internet of FAIR data & services. However, up to 80% of Electronic Health Record (EHR) data is unstructured, including free-text narratives, and is stored in various inaccessible data warehouses; enabling the use of these data as a driver for learning is therefore challenged by limited interoperability and reusability. To address technical needs, key enabling technologies can convert relevant health data into machine-actionable data and support the development of algorithms for computerized decision support. To enable these data conversions, existing classification and terminology systems serve as definition providers for natural language processing through (un)supervised learning.
To facilitate clinical reasoning and personalized healthcare using LHSs, the development of personomics and functionomics is useful in allied healthcare and nursing. These omics will be developed via text and data mining, focusing on the relationships between social, psychological, cultural, behavioral, and economic determinants and human functioning. Furthermore, multiparty collaboration is crucial to developing LHSs, and human-machine interaction studies are required to develop a functional design and prototype. During development, validation, and maintenance of an LHS, continuous attention to challenges such as data drift and ethical, technical, and practical implementation difficulties is required.
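The data-conversion step the authors describe, mapping free-text narratives onto an existing terminology system, can be illustrated with a deliberately simple sketch. The phrases and codes below are invented for illustration; a real system would use a standard terminology such as SNOMED CT or ICF and a trained NLP model rather than plain string matching.

```python
import re

# Hypothetical mini-terminology: maps free-text phrases from nursing notes
# to illustrative concept codes (these codes are invented, not real ones).
TERMINOLOGY = {
    "shortness of breath": "FN-101",
    "difficulty walking": "FN-204",
    "low mood": "FN-330",
}

def extract_concepts(note: str) -> list[str]:
    """Return the terminology codes whose phrases occur in the note."""
    text = re.sub(r"\s+", " ", note.lower())
    return [code for phrase, code in TERMINOLOGY.items() if phrase in text]

codes = extract_concepts("Patient reports difficulty walking and low mood.")
# codes now holds the machine-actionable concepts found in the narrative.
```

Once narratives are reduced to codes like these, they become structured, reusable inputs for the decision-support algorithms discussed above.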
Affiliation(s)
- Mark van Velzen
  - Data Supported Healthcare: Data-Science unit, Research Center Innovations in care, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
  - Department of Anesthesiology, Erasmus Medical Center, Rotterdam, the Netherlands
- Helen I de Graaf-Waar
  - Data Supported Healthcare: Data-Science unit, Research Center Innovations in care, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
  - Department of Anesthesiology, Erasmus Medical Center, Rotterdam, the Netherlands
- Tanja Ubert
  - Institute for Communication, media and information Technology, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
- Robert F van der Willigen
  - Institute for Communication, media and information Technology, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
- Lotte Muilwijk
  - Data Supported Healthcare: Data-Science unit, Research Center Innovations in care, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
  - Institute for Communication, media and information Technology, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
- Maarten A Schmitt
  - Data Supported Healthcare: Data-Science unit, Research Center Innovations in care, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
- Mark C Scheper
  - Data Supported Healthcare: Data-Science unit, Research Center Innovations in care, Rotterdam University of Applied Sciences, Rotterdam, the Netherlands
  - Department of Anesthesiology, Erasmus Medical Center, Rotterdam, the Netherlands
  - Allied Health professions, faculty of medicine and science, Macquarrie University, Sydney, Australia
- Nico L U van Meeteren
  - Department of Anesthesiology, Erasmus Medical Center, Rotterdam, the Netherlands
  - Top Sector Life Sciences and Health (Health~Holland), The Hague, the Netherlands

13
Sahiner B, Chen W, Samala RK, Petrick N. Data drift in medical machine learning: implications and potential remedies. Br J Radiol 2023; 96:20220878. [PMID: 36971405] [PMCID: PMC10546450] [DOI: 10.1259/bjr.20220878]
Abstract
Data drift refers to differences between the data used to train a machine learning (ML) model and the data the model encounters in real-world operation. Medical ML systems can be exposed to various forms of data drift, including differences between the data sampled for training and that seen in clinical operation, differences in medical practice or context of use between training and clinical deployment, and time-related changes in patient populations, disease patterns, and data acquisition, to name a few. In this article, we first review the terminology used in the ML literature related to data drift, define distinct types of drift, and discuss potential causes in detail within the context of medical applications, with an emphasis on medical imaging. We then review the recent literature on the effects of data drift on medical ML systems, which overwhelmingly shows that data drift can be a major cause of performance deterioration. We then discuss methods for monitoring data drift and mitigating its effects, with an emphasis on pre- and post-deployment techniques, including potential methods for drift detection and issues around model retraining once drift is detected. Based on our review, we find that data drift is a major concern in medical ML deployment and that more research is needed so that ML models can identify drift early, incorporate effective mitigation strategies, and resist performance decay.
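A minimal example of the kind of post-deployment drift monitoring this review discusses is the Population Stability Index (PSI), which compares a feature's training-time distribution against a recent production window. The sketch below uses synthetic data; the 0.2 alert threshold is a common rule of thumb, not a value taken from this article.

```python
import numpy as np

rng = np.random.default_rng(0)

# Reference window: a feature's distribution at model-training time.
train_feature = rng.normal(loc=50.0, scale=10.0, size=5000)
# Two production windows: one stable, one shifted upward by one SD.
stable_window = rng.normal(loc=50.0, scale=10.0, size=1000)
drifted_window = rng.normal(loc=60.0, scale=10.0, size=1000)

def population_stability_index(expected, actual, bins=10):
    """PSI between a reference sample and a monitoring window.
    Rule of thumb: < 0.1 stable, > 0.2 major distribution shift."""
    # Decile edges of the reference distribution (interior edges only).
    edges = np.quantile(expected, np.linspace(0.0, 1.0, bins + 1))[1:-1]
    e = np.bincount(np.digitize(expected, edges), minlength=bins) / len(expected)
    a = np.bincount(np.digitize(actual, edges), minlength=bins) / len(actual)
    e, a = np.clip(e, 1e-6, None), np.clip(a, 1e-6, None)
    return float(np.sum((a - e) * np.log(a / e)))

psi_stable = population_stability_index(train_feature, stable_window)
psi_drifted = population_stability_index(train_feature, drifted_window)
```

Running such a check per feature on a schedule is one lightweight way to trigger the retraining workflows discussed in the article.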
Affiliation(s)
- Berkman Sahiner
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Weijie Chen
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Ravi K. Samala
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002
- Nicholas Petrick
  - Center for Devices and Radiological Health, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD 20993-0002

14
Mallio CA, Radbruch A, Deike-Hofmann K, van der Molen AJ, Dekkers IA, Zaharchuk G, Parizel PM, Beomonte Zobel B, Quattrocchi CC. Artificial Intelligence to Reduce or Eliminate the Need for Gadolinium-Based Contrast Agents in Brain and Cardiac MRI: A Literature Review. Invest Radiol 2023; 58:746-753. [PMID: 37126454] [DOI: 10.1097/rli.0000000000000983]
Abstract
Brain and cardiac MRI are fundamental noninvasive imaging tools that provide important clinical information and can be performed with or without gadolinium-based contrast agents (GBCAs), depending on the clinical indication. Whether it is feasible to extract the same information as standard gadolinium-enhanced MRI while injecting less GBCA, or none at all, is currently a topic of debate. Artificial intelligence (AI) is a major source of innovation in medical imaging and has been explored as a method to synthesize virtual contrast MR images, potentially yielding similar diagnostic performance without the need to administer GBCAs. If successful, this would bring significant benefits, including reductions in cost, acquisition time, and environmental impact compared with conventional contrast-enhanced MRI examinations. Given its promise, we believe additional research is needed to strengthen the evidence and make these AI solutions feasible, reliable, and robust enough to be integrated into the clinical workflow. Here, we review recent AI studies aimed at reducing or replacing gadolinium in brain and cardiac imaging while maintaining diagnostic image quality.
Affiliation(s)
- Alexander Radbruch
  - Clinic for Diagnostic and Interventional Neuroradiology, University Clinic Bonn, and German Center for Neurodegenerative Diseases, DZNE, Bonn, Germany
- Katerina Deike-Hofmann
  - Clinic for Diagnostic and Interventional Neuroradiology, University Clinic Bonn, and German Center for Neurodegenerative Diseases, DZNE, Bonn, Germany
- Aart J van der Molen
  - Department of Radiology, Leiden University Medical Center, Leiden, the Netherlands
- Ilona A Dekkers
  - Department of Radiology, Leiden University Medical Center, Leiden, the Netherlands
- Greg Zaharchuk
  - Department of Radiology, Stanford University, Stanford, CA

15
McFadden BR, Reynolds M, Inglis TJJ. Developing machine learning systems worthy of trust for infection science: a requirement for future implementation into clinical practice. Front Digit Health 2023; 5:1260602. [PMID: 37829595] [PMCID: PMC10565494] [DOI: 10.3389/fdgth.2023.1260602]
Abstract
Infection science is a discipline of healthcare that includes clinical microbiology, public health microbiology, mechanisms of microbial disease, and antimicrobial countermeasures. Its importance became especially apparent during the SARS-CoV-2 (COVID-19) pandemic, which highlighted the critical operational domains within infection science, including the hospital, clinical laboratory, and public health environments, for preventing, managing, and treating infectious diseases. As the global community moves beyond the pandemic, however, infection science remains important, with emerging infectious diseases, bloodstream infections, sepsis, and antimicrobial resistance making increasingly significant contributions to the global burden of disease. Machine learning (ML) is frequently applied in healthcare and medical domains, and there is growing interest in applying ML techniques to problems in infection science. This has the potential to improve patient outcomes, optimize workflows in the clinical laboratory, and support the management of public health. Despite promising results, however, the implementation of ML into clinical practice and workflows remains limited. Migrating ML models from the research environment to the real world requires the development of trustworthy ML systems that meet the requirements of users, stakeholders, and regulatory agencies. This paper provides readers with a brief introduction to infection science, outlines the principles of trustworthy ML systems, gives examples of the application of these principles in infection science, and proposes future directions for moving toward the development of trustworthy ML systems in infection science.
Affiliation(s)
- Benjamin R. McFadden
  - School of Physics, Mathematics and Computing, University of Western Australia, Perth, WA, Australia
- Mark Reynolds
  - School of Physics, Mathematics and Computing, University of Western Australia, Perth, WA, Australia
- Timothy J. J. Inglis
  - Western Australian Country Health Service, Perth, WA, Australia
  - School of Medicine, University of Western Australia, Perth, WA, Australia
  - Department of Microbiology, Pathwest Laboratory Medicine, Perth, WA, Australia

16
Xu Y, Sun X, Liu Y, Huang Y, Liang M, Sun R, Yin G, Song C, Ding Q, Du B, Bi X. Prediction of subjective cognitive decline after corpus callosum infarction by an interpretable machine learning-derived early warning strategy. Front Neurol 2023; 14:1123607. [PMID: 37416313] [PMCID: PMC10321713] [DOI: 10.3389/fneur.2023.1123607]
Abstract
Background and purpose Corpus callosum (CC) infarction is an extremely rare subtype of ischemic stroke; however, its symptoms of cognitive impairment often fail to attract patients' attention early, which seriously worsens long-term prognosis, with consequences including high mortality, personality changes, mood disorders, psychotic reactions, and financial burden. This study sought to develop and validate machine learning (ML) models for early prediction of the risk of subjective cognitive decline (SCD) after CC infarction. Methods This prospective study enrolled 213 (3.7%) patients with CC infarction from a nine-year cohort of 8,555 patients with acute ischemic stroke. Telephone follow-up surveys were carried out one year after disease onset for patients with a definite diagnosis of CC infarction, and SCD was identified with the Behavioral Risk Factor Surveillance System (BRFSS) questionnaire. Based on significant features selected by the least absolute shrinkage and selection operator (LASSO), seven ML models were established: Extreme Gradient Boosting (XGBoost), Logistic Regression (LR), Light Gradient Boosting Machine (LightGBM), Adaptive Boosting (AdaBoost), Gaussian Naïve Bayes (GNB), Complement Naïve Bayes (CNB), and Support Vector Machine (SVM); their predictive performance was compared across several metrics. SHapley Additive exPlanations (SHAP) was also used to examine the internal behavior of the highest-performing classifier. Results The LR model performed better than the other six ML models in predicting SCD after CC infarction, with an area under the receiver operating characteristic curve (AUC) of 77.1% in the validation set.
Using LASSO and SHAP analysis, we found that CC infarction subregion, female sex, 3-month modified Rankin Scale (mRS) score, age, homocysteine, location of angiostenosis, neutrophil-to-lymphocyte ratio, pure CC infarction, and number of angiostenoses were, in order of importance, the nine most significant predictors for the output of the LR model. CC infarction subregion, female sex, 3-month mRS score, and pure CC infarction were independently associated with the cognitive outcome. Conclusion Our study is the first to demonstrate that an LR model with nine common variables performs best in predicting the risk of post-stroke SCD due to CC infarction. In particular, the combination of the LR model and the SHAP explainer could support personalized risk prediction and serve as a decision-making tool for early intervention, given the poor long-term outcome of this condition.
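The modeling pipeline described here, LASSO-based feature selection followed by a logistic regression whose outputs are explained with SHAP, can be sketched on synthetic data. The SHAP values below use the closed-form expression for linear models rather than the shap library; this is an illustration of the workflow, not a reproduction of the study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)

# Synthetic stand-in for the clinical dataset: 6 candidate predictors,
# only the first two actually drive the binary outcome.
n = 400
X = rng.normal(size=(n, 6))
logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logits))).astype(int)

# Step 1: LASSO-style (L1-penalized) selection of significant features.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y)
selected = np.flatnonzero(lasso.coef_[0])

# Step 2: fit a plain logistic regression on the selected features.
lr = LogisticRegression().fit(X[:, selected], y)

# Step 3: SHAP values. For a linear model they have a closed form on the
# log-odds scale: phi_j = coef_j * (x_j - mean(x_j)).
phi = lr.coef_[0] * (X[:, selected] - X[:, selected].mean(axis=0))
importance = np.abs(phi).mean(axis=0)  # global ranking, as in a SHAP summary plot
```

Ranking features by mean absolute SHAP value is what produces the kind of importance ordering the abstract reports for its nine predictors.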
Affiliation(s)
- Bingying Du
- Xiaoying Bi

17
Rahmani K, Thapa R, Tsou P, Casie Chetty S, Barnes G, Lam C, Foon Tso C. Assessing the effects of data drift on the performance of machine learning models used in clinical sepsis prediction. Int J Med Inform 2023; 173:104930. [PMID: 36893656] [DOI: 10.1016/j.ijmedinf.2022.104930]
Abstract
BACKGROUND Data drift can negatively impact the performance of machine learning algorithms (MLAs) that were trained on historical data. As such, MLAs should be continuously monitored and tuned to overcome the systematic changes that occur in the distribution of data. In this paper, we study the extent of data drift and provide insights about its characteristics for sepsis onset prediction. This study will help elucidate the nature of data drift for prediction of sepsis and similar diseases. This may aid with the development of more effective patient monitoring systems that can stratify risk for dynamic disease states in hospitals. METHODS We devise a series of simulations that measure the effects of data drift in patients with sepsis, using electronic health records (EHR). We simulate multiple scenarios in which data drift may occur, namely the change in the distribution of the predictor variables (covariate shift), the change in the statistical relationship between the predictors and the target (concept shift), and the occurrence of a major healthcare event (major event) such as the COVID-19 pandemic. We measure the impact of data drift on model performances, identify the circumstances that necessitate model retraining, and compare the effects of different retraining methodologies and model architecture on the outcomes. We present the results for two different MLAs, eXtreme Gradient Boosting (XGB) and Recurrent Neural Network (RNN). RESULTS Our results show that the properly retrained XGB models outperform the baseline models in all simulation scenarios, hence signifying the existence of data drift. In the major event scenario, the area under the receiver operating characteristic curve (AUROC) at the end of the simulation period is 0.811 for the baseline XGB model and 0.868 for the retrained XGB model. In the covariate shift scenario, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.853 and 0.874 respectively. 
In the concept shift scenario under the mixed labeling method, the retrained XGB models performed worse than the baseline model for most simulation steps; under the full relabeling method, however, the AUROC at the end of the simulation period for the baseline and retrained XGB models was 0.852 and 0.877, respectively. The results for the RNN models were mixed, suggesting that retraining based on a fixed network architecture may be inadequate for an RNN. We also present the results in terms of other performance metrics, such as the ratio of observed to expected probabilities (calibration) and the normalized rate of positive predictive values (PPV) by prevalence, referred to as lift, at a sensitivity of 0.8. CONCLUSION Our simulations indicate that retraining every couple of months, or after accruing several thousand patients, is likely to be adequate for monitoring machine learning models that predict sepsis. This suggests that a machine learning system for sepsis prediction will probably need less infrastructure for performance monitoring and retraining than applications in which data drift is more frequent and continuous. Our results also show that in the event of a concept shift, a full overhaul of the sepsis prediction model may be necessary, because a concept shift indicates a discrete change in the definition of sepsis labels, and mixing old and new labels for the sake of incremental training may not produce the desired results.
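The retraining simulations described here can be sketched in miniature: train a model, let the covariate distribution drift over successive periods, and compare a frozen model against one retrained on a sliding window. GradientBoostingClassifier stands in for XGBoost, and all data, shift sizes, and window lengths below are illustrative assumptions, not values from the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)

def make_batch(n, shift):
    """Synthetic cohort. `shift` moves the feature distribution only
    (covariate shift); the label rule P(y|x) stays fixed."""
    X = rng.normal(loc=shift, scale=1.0, size=(n, 5))
    p = 1.0 / (1.0 + np.exp(-(X[:, 0] - 0.5 * X[:, 1])))
    y = (rng.random(n) < p).astype(int)
    return X, y

# Baseline model, frozen at deployment time.
X0, y0 = make_batch(2000, shift=0.0)
frozen = GradientBoostingClassifier(random_state=0).fit(X0, y0)

# In-distribution performance on a held-out batch from the same period.
X_test0, y_test0 = make_batch(1000, shift=0.0)
auc_in_dist = roc_auc_score(y_test0, frozen.predict_proba(X_test0)[:, 1])

# Simulated drift: each period the population shifts further; the
# retrained model always uses the most recent 2000 patients.
hist_X, hist_y = X0, y0
for shift in (1.0, 2.0, 3.0):
    Xt, yt = make_batch(1000, shift)
    retrained = GradientBoostingClassifier(random_state=0).fit(
        hist_X[-2000:], hist_y[-2000:]
    )
    auc_frozen = roc_auc_score(yt, frozen.predict_proba(Xt)[:, 1])
    auc_retrained = roc_auc_score(yt, retrained.predict_proba(Xt)[:, 1])
    hist_X = np.vstack([hist_X, Xt])
    hist_y = np.concatenate([hist_y, yt])
```

Comparing `auc_frozen` and `auc_retrained` across periods is the basic shape of the experiments the abstract reports for its covariate shift and major event scenarios.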
Affiliation(s)
- Keyvan Rahmani
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Rahul Thapa
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Peiling Tsou
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Satish Casie Chetty
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Gina Barnes
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Carson Lam
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA
- Chak Foon Tso
  - Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, TX 77080-2059, USA

18
Im JE, Park S, Kim YJ, Yoon SA, Lee JH. Predicting the need for intubation within 3 h in the neonatal intensive care unit using a multimodal deep neural network. Sci Rep 2023; 13:6213. [PMID: 37069174] [PMCID: PMC10106895] [DOI: 10.1038/s41598-023-33353-2]
Abstract
Respiratory distress is a common chief complaint in neonates admitted to the neonatal intensive care unit. Despite the increasing use of non-invasive ventilation in neonates with respiratory difficulty, some require advanced airway support. Delayed intubation is associated with increased morbidity, particularly in urgent, unplanned cases. Early and accurate prediction of the need for intubation may provide more time for preparation and increase safety margins by avoiding late intubation in high-risk infants. This study aimed to predict, using a multimodal deep neural network, the need for intubation within 3 h in neonates initially managed with non-invasive ventilation for respiratory distress during the first 48 h of life. We developed a multimodal deep neural network that simultaneously analyzes four time-series signals collected at 1-h intervals and 19 variables, including demographic, physiological, and laboratory parameters. Evaluated on a dataset of 128 neonates with respiratory distress who underwent non-invasive ventilation, our model achieved an area under the curve of 0.917, a sensitivity of 85.2%, and a specificity of 89.2%. These findings demonstrate the promise of the multimodal model for predicting the need for neonatal intubation within 3 h.
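The general shape of such a multimodal network, one branch for hourly time-series signals and one for static variables, fused into a single prediction, can be sketched as a toy forward pass in NumPy. All layer sizes and weights below are illustrative assumptions; the authors' actual architecture is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy batch: 8 infants, 4 time-series channels over 6 hourly steps,
# plus 19 static variables (demographics, labs). Dimensions are invented.
ts = rng.normal(size=(8, 6, 4))
static = rng.normal(size=(8, 19))

def relu(x):
    return np.maximum(x, 0.0)

# Time-series branch: per-step linear layer, then mean-pool over time.
W_ts, b_ts = rng.normal(size=(4, 16)) * 0.1, np.zeros(16)
h_ts = relu(ts @ W_ts + b_ts).mean(axis=1)             # (8, 16)

# Static branch: a single dense layer over the tabular features.
W_st, b_st = rng.normal(size=(19, 16)) * 0.1, np.zeros(16)
h_st = relu(static @ W_st + b_st)                      # (8, 16)

# Fusion head: concatenate both embeddings, output intubation probability.
W_out, b_out = rng.normal(size=(32, 1)) * 0.1, np.zeros(1)
fused = np.concatenate([h_ts, h_st], axis=1)           # (8, 32)
prob = 1.0 / (1.0 + np.exp(-(fused @ W_out + b_out)))  # (8, 1)
```

A real implementation would use a deep learning framework, a recurrent or convolutional time-series branch, and trained weights; the point of the sketch is the two-branch fusion pattern.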
Affiliation(s)
- Jueng-Eun Im
  - Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Seung Park
  - Biomedical Engineering, Chungbuk National University Hospital, Cheongju, Republic of Korea
- Yoo-Jin Kim
  - Department of Pediatrics, Chungbuk National University Hospital, Chungbuk National University College of Medicine, Chungdae-ro 1, Seowon-gu, Cheongju, 28644, Republic of Korea
- Shin Ae Yoon
  - Department of Pediatrics, Chungbuk National University Hospital, Chungbuk National University College of Medicine, Chungdae-ro 1, Seowon-gu, Cheongju, 28644, Republic of Korea
- Ji Hyuk Lee
  - Department of Pediatrics, Chungbuk National University Hospital, Chungbuk National University College of Medicine, Chungdae-ro 1, Seowon-gu, Cheongju, 28644, Republic of Korea

19
Andonov DI, Ulm B, Graessner M, Podtschaske A, Blobner M, Jungwirth B, Kagerbauer SM. Impact of the Covid-19 pandemic on the performance of machine learning algorithms for predicting perioperative mortality. BMC Med Inform Decis Mak 2023; 23:67. [PMID: 37046259] [PMCID: PMC10092913] [DOI: 10.1186/s12911-023-02151-1]
Abstract
BACKGROUND Machine-learning models are susceptible to external influences, which can result in performance deterioration. The aim of our study was to elucidate the impact of a sudden shift in covariates, like the one caused by the Covid-19 pandemic, on model performance. METHODS After ethical approval and registration at ClinicalTrials.gov (NCT04092933, initial release 17/09/2019), we developed different models for the prediction of perioperative mortality based on preoperative data: one for the pre-pandemic period until March 2020, one including data before the pandemic and from the first wave until May 2020, and one covering the complete period before and during the pandemic until October 2021. We applied XGBoost as well as a deep learning neural network (DL). Performance metrics of each model during the different pandemic phases were determined, and the XGBoost models were analysed for changes in feature importance. RESULTS XGBoost and DL provided similar performance on the pre-pandemic data with respect to the area under the receiver operating characteristic curve (AUROC, 0.951 vs. 0.942) and the area under the precision-recall curve (AUPR, 0.144 vs. 0.187). Validation in patient cohorts from the different pandemic waves showed high fluctuations in both AUROC and AUPR for DL, whereas the XGBoost models appeared more stable. Changes in variable frequencies with the onset of the pandemic were visible in age, ASA score, and a higher proportion of emergency operations, among others. Age consistently showed the highest information gain. Models based on pre-pandemic data performed worse during the first pandemic wave (AUROC 0.914 for XGBoost and DL), whereas models augmented with data from the first wave lost performance after the first wave (AUROC 0.907 for XGBoost and 0.747 for DL). The deterioration was also visible in AUPR, which worsened by over 50% for both XGBoost and DL in the first phase after retraining. CONCLUSIONS A sudden shift in data impacts model performance.
Retraining a model with updated data may degrade predictive accuracy if the changes are only transient. Premature retraining should therefore be avoided, and close model surveillance is necessary.
Affiliation(s)
- D I Andonov
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- B Ulm
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- M Graessner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- A Podtschaske
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
- M Blobner
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- B Jungwirth
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany
- S M Kagerbauer
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, Technical University of Munich, Munich, Germany
  - Department of Anaesthesiology and Intensive Care Medicine, School of Medicine, University Hospital Ulm, University of Ulm, Albert-Einstein-Allee 23, Ulm, 89081, Germany

20
Albahra S, Gorbett T, Robertson S, D'Aleo G, Kumar SVS, Ockunzzi S, Lallo D, Hu B, Rashidi HH. Artificial intelligence and machine learning overview in pathology & laboratory medicine: A general review of data preprocessing and basic supervised concepts. Semin Diagn Pathol 2023; 40:71-87. [PMID: 36870825] [DOI: 10.1053/j.semdp.2023.02.002]
Abstract
Machine learning (ML) is becoming an integral part of several domains in medicine. Yet most pathologists and laboratory professionals remain unfamiliar with such tools and are unprepared for their inevitable integration. To bridge this knowledge gap, we present an overview of key elements of this emerging data science discipline. First, we cover general, well-established ML concepts, such as data types, data preprocessing methods, and ML study design. We then describe common supervised and unsupervised learning algorithms together with their associated terminology, collected in a comprehensive glossary of the terms discussed in this review. Overall, this review offers a broad overview of the key concepts and algorithms in machine learning, with a focus on pathology and laboratory medicine. The objective is to provide an up-to-date, useful reference for those new to this field and for those who require a refresher.
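As an example of the data preprocessing methods such a review covers, the sketch below builds a small pipeline that imputes and standardizes numeric laboratory values and one-hot encodes a categorical column. The column names and values are invented for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy laboratory dataset with one missing value; column names are invented.
df = pd.DataFrame({
    "hemoglobin": [13.5, np.nan, 11.2, 14.8],
    "wbc_count": [6.1, 9.4, 12.0, 5.3],
    "specimen_type": ["blood", "blood", "marrow", "blood"],
})

numeric = ["hemoglobin", "wbc_count"]
categorical = ["specimen_type"]

preprocess = ColumnTransformer([
    # Numeric: impute missing values with the median, then standardize.
    ("num", Pipeline([("impute", SimpleImputer(strategy="median")),
                      ("scale", StandardScaler())]), numeric),
    # Categorical: one-hot encode, tolerating unseen categories at predict time.
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])

X = preprocess.fit_transform(df)
# X has 2 standardized numeric columns plus 2 one-hot columns (blood, marrow).
```

Fitting such a transformer on the training split only, then applying it to the test split, is the standard way to avoid the data leakage pitfalls this kind of review warns about.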
Affiliation(s)
- Samer Albahra
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Tom Gorbett
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Scott Robertson
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Giana D'Aleo
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Sushasree Vasudevan Suseel Kumar
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Samuel Ockunzzi
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Daniel Lallo
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Bo Hu
  - Department of Quantitative Health Sciences, Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States
- Hooman H Rashidi
  - Pathology and Laboratory Medicine Institute (PLMI), Cleveland Clinic, Cleveland, OH, United States; PLMI's Center for Artificial Intelligence & Data Science, Cleveland Clinic, Cleveland, OH, United States

21
Lu SC, Swisher CL, Chung C, Jaffray D, Sidey-Gibbons C. On the importance of interpretable machine learning predictions to inform clinical decision making in oncology. Front Oncol 2023; 13:1129380. [PMID: 36925929] [PMCID: PMC10013157] [DOI: 10.3389/fonc.2023.1129380]
Abstract
Machine learning-based tools are capable of guiding individualized clinical management and decision-making by providing predictions of a patient's future health state. Through their ability to model complex nonlinear relationships, ML algorithms can often outperform traditional statistical prediction approaches, but the use of nonlinear functions can mean that ML techniques may also be less interpretable than traditional statistical methodologies. While there are benefits of intrinsic interpretability, many model-agnostic approaches now exist and can provide insight into the way in which ML systems make decisions. In this paper, we describe how different algorithms can be interpreted and introduce some techniques for interpreting complex nonlinear algorithms.
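One widely used model-agnostic approach of the kind described here is permutation importance: shuffle one feature at a time on held-out data and measure how much the model's score drops. The sketch below applies it to a random forest on synthetic data; this is one example technique, not the full set the paper reviews.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: 3 informative features out of 6. With shuffle=False,
# the informative features occupy the first three columns.
X, y = make_classification(n_samples=600, n_features=6, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Any fitted model works here; the explanation method never looks inside it.
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)

# Permute each feature 20 times on held-out data and record the score drop.
result = permutation_importance(model, X_te, y_te, n_repeats=20, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first
```

Because the method only queries predictions, the same code can interpret a nonlinear ensemble or a neural network without modification, which is the appeal of model-agnostic approaches for clinical ML.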
Affiliation(s)
- Sheng-Chieh Lu
- Section of Patient-Centered Analytics, Division of Internal Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Christine L Swisher
- The Ronin Project, San Mateo, CA, United States; The Lawrence J. Ellison Institute for Transformative Medicine, Los Angeles, CA, United States
- Caroline Chung
- Department of Radiation Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States; Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- David Jaffray
- Institute for Data Science in Oncology, The University of Texas MD Anderson Cancer Center, Houston, TX, United States; Department of Imaging Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States; Department of Radiation Physics, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
- Chris Sidey-Gibbons
- Section of Patient-Centered Analytics, Division of Internal Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX, United States
22
The use of machine learning and artificial intelligence within pediatric critical care. Pediatr Res 2023; 93:405-412. [PMID: 36376506 PMCID: PMC9660024 DOI: 10.1038/s41390-022-02380-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0]
Abstract
The field of pediatric critical care has been hampered in the era of precision medicine by our inability to accurately define and subclassify disease phenotypes. This has been caused by heterogeneity across age groups that further challenges the ability to perform randomized controlled trials in pediatrics. One approach to overcome these inherent challenges includes the use of machine learning algorithms that can assist in generating more meaningful interpretations from clinical data. This review summarizes machine learning and artificial intelligence techniques that are currently in use for clinical data modeling with relevance to pediatric critical care. Focus has been placed on the differences between techniques and the role of each in the clinical arena. The various forms of clinical decision support that utilize machine learning are also described. We review the applications and limitations of machine learning techniques to empower clinicians to make informed decisions at the bedside. IMPACT: Critical care units generate large amounts of under-utilized data that can be processed through artificial intelligence. This review summarizes the machine learning and artificial intelligence techniques currently being used to process clinical data. The review highlights the applications and limitations of these techniques within a clinical context to aid providers in making more informed decisions at the bedside.
23
Loh HW, Ooi CP, Seoni S, Barua PD, Molinari F, Acharya UR. Application of explainable artificial intelligence for healthcare: A systematic review of the last decade (2011-2022). Comput Methods Programs Biomed 2022; 226:107161. [PMID: 36228495 DOI: 10.1016/j.cmpb.2022.107161] [Citation(s) in RCA: 84] [Impact Index Per Article: 42.0]
Abstract
BACKGROUND AND OBJECTIVES Artificial intelligence (AI) has branched out to various applications in healthcare, such as health services management, predictive medicine, clinical decision-making, and patient data and diagnostics. Although AI models have achieved human-like performance, their use is still limited because they are seen as a black box. This lack of trust remains the main reason for their low use in practice, especially in healthcare. Hence, explainable artificial intelligence (XAI) has been introduced as a technique that can provide confidence in the model's prediction by explaining how the prediction is derived, thereby encouraging the use of AI systems in healthcare. The primary goal of this review is to provide areas of healthcare that require more attention from the XAI research community. METHODS Multiple journal databases were thoroughly searched using the 2020 PRISMA guidelines. Studies not published in Q1 journals, which are considered highly credible, were excluded. RESULTS In this review, we surveyed 99 Q1 articles covering the following XAI techniques: SHAP, LIME, GradCAM, LRP, Fuzzy classifier, EBM, CBR, rule-based systems, and others. CONCLUSION We discovered that detecting abnormalities in 1D biosignals and identifying key text in clinical notes are areas that require more attention from the XAI research community. We hope this review will encourage the development of a holistic cloud system for a smart city.
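The core idea behind LIME, one of the techniques this review surveys, can be sketched from first principles (a toy illustration under assumed synthetic data, not the authors' or the LIME library's implementation): sample perturbations around one instance, query the black-box model, weight the samples by proximity, and fit a local linear surrogate whose coefficients act as the explanation.

```python
# LIME-style local surrogate: explain one prediction of a black-box
# classifier with a proximity-weighted linear model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import Ridge

X, y = make_classification(n_samples=400, n_features=4, random_state=1)
black_box = GradientBoostingClassifier(random_state=1).fit(X, y)

def local_linear_explanation(x, n_samples=1000, kernel_width=1.0, seed=0):
    rng = np.random.default_rng(seed)
    # Perturb the instance with Gaussian noise scaled to the training data.
    Z = x + rng.normal(scale=X.std(axis=0), size=(n_samples, x.shape[0]))
    p = black_box.predict_proba(Z)[:, 1]           # black-box outputs
    d = np.linalg.norm((Z - x) / X.std(axis=0), axis=1)
    w = np.exp(-(d ** 2) / kernel_width ** 2)      # proximity kernel
    # The surrogate's coefficients are the local feature effects.
    surrogate = Ridge(alpha=1.0).fit(Z, p, sample_weight=w)
    return surrogate.coef_

coefs = local_linear_explanation(X[0])
print("local feature effects:", np.round(coefs, 3))
```

Production LIME adds sparsity and discretization, but the weighted-surrogate mechanism is the same.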
Affiliation(s)
- Hui Wen Loh
- School of Science and Technology, Singapore University of Social Sciences, Singapore
- Chui Ping Ooi
- School of Science and Technology, Singapore University of Social Sciences, Singapore
- Silvia Seoni
- Department of Electronics and Telecommunications, Biolab, Politecnico di Torino, Torino 10129, Italy
- Prabal Datta Barua
- Faculty of Engineering and Information Technology, University of Technology Sydney, Australia; School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Australia
- Filippo Molinari
- Department of Electronics and Telecommunications, Biolab, Politecnico di Torino, Torino 10129, Italy
- U Rajendra Acharya
- School of Science and Technology, Singapore University of Social Sciences, Singapore; School of Business (Information Systems), Faculty of Business, Education, Law & Arts, University of Southern Queensland, Australia; School of Engineering, Ngee Ann Polytechnic, Singapore; Department of Bioinformatics and Medical Engineering, Asia University, Taiwan; Research Organization for Advanced Science and Technology (IROAST), Kumamoto University, Kumamoto, Japan.
24
Di Martino F, Delmastro F. Explainable AI for clinical and remote health applications: a survey on tabular and time series data. Artif Intell Rev 2022; 56:5261-5315. [PMID: 36320613 PMCID: PMC9607788 DOI: 10.1007/s10462-022-10304-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0]
Abstract
Nowadays, Artificial Intelligence (AI) has become a fundamental component of healthcare applications, both clinical and remote, but the best-performing AI systems are often too complex to be self-explaining. Explainable AI (XAI) techniques are designed to unveil the reasoning behind a system's predictions and decisions, and they become even more critical when dealing with sensitive and personal health data. XAI has not gathered the same attention across different research areas and data types, especially in healthcare. In particular, many clinical and remote health applications are based on tabular and time series data, respectively, yet XAI is not commonly analysed on these data types, while computer vision and Natural Language Processing (NLP) remain the reference applications. To provide an overview of XAI methods most suitable for tabular and time series data in the healthcare domain, this paper reviews the literature of the last 5 years, illustrating the types of explanations generated and the efforts made to evaluate their relevance and quality. Specifically, we identify clinical validation, consistency assessment, objective and standardised quality evaluation, and human-centered quality assessment as key features for ensuring effective explanations for end users. Finally, we highlight the main research challenges in the field as well as the limitations of existing XAI methods.
25
Rahmani K, Thapa R, Tsou P, Chetty SC, Barnes G, Lam C, Tso CF. Assessing the effects of data drift on the performance of machine learning models used in clinical sepsis prediction. medRxiv 2022:2022.06.06.22276062. [PMID: 35702157 PMCID: PMC9196120 DOI: 10.1101/2022.06.06.22276062] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0]
Abstract
Background Data drift can negatively impact the performance of machine learning algorithms (MLAs) that were trained on historical data. As such, MLAs should be continuously monitored and tuned to overcome the systematic changes that occur in the distribution of data. In this paper, we study the extent of data drift and provide insights about its characteristics for sepsis onset prediction. This study will help elucidate the nature of data drift for prediction of sepsis and similar diseases. This may aid with the development of more effective patient monitoring systems that can stratify risk for dynamic disease states in hospitals. Methods We devise a series of simulations that measure the effects of data drift in patients with sepsis. We simulate multiple scenarios in which data drift may occur, namely the change in the distribution of the predictor variables (covariate shift), the change in the statistical relationship between the predictors and the target (concept shift), and the occurrence of a major healthcare event (major event) such as the COVID-19 pandemic. We measure the impact of data drift on model performances, identify the circumstances that necessitate model retraining, and compare the effects of different retraining methodologies and model architecture on the outcomes. We present the results for two different MLAs, eXtreme Gradient Boosting (XGB) and Recurrent Neural Network (RNN). Results Our results show that the properly retrained XGB models outperform the baseline models in all simulation scenarios, hence signifying the existence of data drift. In the major event scenario, the area under the receiver operating characteristic curve (AUROC) at the end of the simulation period is 0.811 for the baseline XGB model and 0.868 for the retrained XGB model. In the covariate shift scenario, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.853 and 0.874 respectively. 
In the concept shift scenario and under the mixed labeling method, the retrained XGB models perform worse than the baseline model for most simulation steps. However, under the full relabeling method, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.852 and 0.877 respectively. The results for the RNN models were mixed, suggesting that retraining based on a fixed network architecture may be inadequate for an RNN. We also present the results in the form of other performance metrics such as the ratio of observed to expected probabilities (calibration) and the normalized rate of positive predictive values (PPV) by prevalence, referred to as lift, at a sensitivity of 0.8. Conclusion Our simulations reveal that retraining periods of a couple of months or using several thousand patients are likely to be adequate to monitor machine learning models that predict sepsis. This indicates that a machine learning system for sepsis prediction will probably need less infrastructure for performance monitoring and retraining compared to other applications in which data drift is more frequent and continuous. Our results also show that in the event of a concept shift, a full overhaul of the sepsis prediction model may be necessary because it indicates a discrete change in the definition of sepsis labels, and mixing the labels for the sake of incremental training may not produce the desired results.
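The concept-shift scenario described above can be reproduced in miniature (synthetic data, not the study's sepsis cohort or models): train a model under one predictor-outcome relationship, then score it as that relationship drifts toward a different one and watch AUROC fall toward chance.

```python
# Concept-shift simulation: the trained model's AUROC degrades as the
# true predictor-outcome relationship drifts away from the one it learned.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

def make_cohort(n, beta):
    """Synthetic patients whose outcome follows a logistic model with weights beta."""
    X = rng.normal(size=(n, 5))
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = (rng.random(n) < p).astype(int)
    return X, y

beta_old = np.array([1.5, -1.0, 0.8, 0.0, 0.0])   # relationship at training time
beta_new = np.array([0.0, 0.0, 0.0, 1.5, -1.0])   # relationship after full drift

X_tr, y_tr = make_cohort(5000, beta_old)
model = LogisticRegression().fit(X_tr, y_tr)

aurocs = []
for drift in [0.0, 0.5, 1.0]:
    # Concept shift: predictive weight rotates onto previously uninformative features.
    beta = (1.0 - drift) * beta_old + drift * beta_new
    X_te, y_te = make_cohort(2000, beta)
    aurocs.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    print(f"drift={drift:.1f}  AUROC={aurocs[-1]:.3f}")
```

Because a concept shift changes P(y | X) itself, no amount of covariate monitoring catches it; performance must be tracked against fresh labels, which is why the authors find that full relabeling is needed for retraining to help.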
Affiliation(s)
- Keyvan Rahmani
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
- Rahul Thapa
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
- Peiling Tsou
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
- Gina Barnes
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
- Carson Lam
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
- Chak Foon Tso
- Dascena, Inc., 12333 Sowden Rd Ste B PMB 65148, Houston, Texas 77080-2059
26
Clinical artificial intelligence quality improvement: towards continual monitoring and updating of AI algorithms in healthcare. NPJ Digit Med 2022; 5:66. [PMID: 35641814 PMCID: PMC9156743 DOI: 10.1038/s41746-022-00611-y] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5]
Abstract
Machine learning (ML) and artificial intelligence (AI) algorithms have the potential to derive insights from clinical data and improve patient outcomes. However, these highly complex systems are sensitive to changes in the environment and liable to performance decay. Even after their successful integration into clinical practice, ML/AI algorithms should be continuously monitored and updated to ensure their long-term safety and effectiveness. To bring AI into maturity in clinical care, we advocate for the creation of hospital units responsible for quality assurance and improvement of these algorithms, which we refer to as “AI-QI” units. We discuss how tools that have long been used in hospital quality assurance and quality improvement can be adapted to monitor static ML algorithms. On the other hand, procedures for continual model updating are still nascent. We highlight key considerations when choosing between existing methods and opportunities for methodological innovation.
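One classic quality-assurance tool that can be adapted to monitor a static ML algorithm, in the spirit of the AI-QI units this paper advocates, is a one-sided CUSUM control chart over the model's error rate. The sketch below uses simulated weekly error rates; the target, allowance, and alarm threshold are illustrative assumptions, not values from the paper.

```python
# One-sided CUSUM chart: accumulate excess error above an in-control
# target and raise an alarm when the cumulative excess crosses a threshold.
import numpy as np

rng = np.random.default_rng(7)

target_error = 0.10   # acceptable in-control error rate
slack = 0.02          # CUSUM allowance (k): ignore small fluctuations
threshold = 0.08      # alarm threshold (h)

# Simulated weekly error rates: in control for 30 weeks, then the model
# silently degrades and the error rate rises.
weekly_error = np.concatenate([
    rng.normal(0.10, 0.01, 30),
    rng.normal(0.16, 0.01, 10),
])

cusum, alarms = 0.0, []
for week, e in enumerate(weekly_error):
    # Accumulate only error in excess of target + slack; floor at zero.
    cusum = max(0.0, cusum + (e - target_error - slack))
    if cusum > threshold:
        alarms.append(week)

print("first alarm at week:", alarms[0] if alarms else None)
```

The chart stays quiet through ordinary noise but flags the sustained shift within a few weeks, which is the kind of ongoing surveillance the paper argues deployed algorithms need.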
27
AI and Clinical Decision Making: The Limitations and Risks of Computational Reductionism in Bowel Cancer Screening. Appl Sci (Basel) 2022. [DOI: 10.3390/app12073341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Advances in artificial intelligence in healthcare are frequently promoted as ‘solutions’ to improve the accuracy, safety, and quality of clinical decisions, treatments, and care. Despite some diagnostic success, however, AI systems rely on forms of reductive reasoning and computational determinism that embed problematic assumptions about clinical decision-making and clinical practice. Clinician autonomy, experience, and judgement are reduced to inputs and outputs framed as binary or multi-class classification problems benchmarked against a clinician’s capacity to identify or predict disease states. This paper examines this reductive reasoning in AI systems for colorectal cancer (CRC) to highlight their limitations and risks: (1) in AI systems themselves due to inherent biases in (a) retrospective training datasets and (b) embedded assumptions in underlying AI architectures and algorithms; (2) in the problematic and limited evaluations being conducted on AI systems prior to system integration in clinical practice; and (3) in marginalising socio-technical factors in the context-dependent interactions between clinicians, their patients, and the broader health system. The paper argues that to optimise benefits from AI systems and to avoid negative unintended consequences for clinical decision-making and patient care, there is a need for more nuanced and balanced approaches to AI system deployment and evaluation in CRC.
28
Is Artificial Intelligence (AI) a Pipe Dream? Why Legal Issues Present Significant Hurdles to AI Autonomy. AJR Am J Roentgenol 2022; 219:152-156. [PMID: 35138133 DOI: 10.2214/ajr.21.27224] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5]
Abstract
Proponents of artificial intelligence ("AI") technology have suggested that in the near future, AI software may replace human radiologists. While AI's assimilation into the specialty has occurred more slowly than predicted, developments in machine learning, deep learning, and neural networks suggest that technological hurdles and costs will eventually be overcome. However, beyond these technological hurdles, formidable legal hurdles threaten AI's impact on the specialty. Legal liability for errors committed by AI will influence AI's ultimate role within radiology and whether AI remains a simple decision support tool or develops into an autonomous member of the healthcare team. Additional areas of uncertainty include the potential application of products liability law to AI, and the approach taken by the U.S. FDA in potentially classifying autonomous AI as a medical device. The current ambiguity of the legal treatment of AI will profoundly impact autonomous AI development given that vendors, radiologists, and hospitals will be unable to reliably assess their liability from implementing such tools. Advocates of AI in radiology and health care in general should lobby for legislative action to better clarify the liability risks of AI in a way that does not deter technological development.