1. Ryvlin J, Kim SW, De la Garza Ramos R, Hamad M, Stock A, Owolo E, Fourman MS, Eleswarapu A, Gelfand Y, Murthy S, Yassari R. External Validation of an Online Wound Infection and Wound Reoperation Risk Calculator After Metastatic Spinal Tumor Surgery. World Neurosurg 2024; 185:e351-e356. PMID: 38342175. DOI: 10.1016/j.wneu.2024.02.005.
Abstract
STUDY DESIGN This was a single-institution retrospective cohort study. OBJECTIVE Wound infections are common following spine metastasis surgery and can result in unplanned reoperations. A recent study published an online wound complication risk calculator, but the calculator has not yet undergone external validation. Our aim was to evaluate its accuracy in predicting 30-day wound infections and 30-day wound reoperations in our operative spine metastasis population. METHODS An internal operative database was used to identify patients treated between 2012 and 2022. The primary outcomes were 1) any surgical site infection and 2) wound-related revision surgery within 30 days of surgery. Patient details were manually collected from electronic medical records and entered into the calculator to determine predicted complication risk percentages. Predicted risks were compared to observed outcomes using receiver operating characteristic (ROC) curves with areas under the curve (AUC). RESULTS A total of 153 patients were included. The observed 30-day postoperative wound infection incidence was 5%, while the predicted incidence was 6%. In ROC analysis, good discrimination was found for the wound infection model (AUC = 0.737; P = 0.024). The observed wound reoperation rate was 5% and the predicted rate was 6%. ROC analysis demonstrated poor discrimination for wound reoperations (AUC = 0.559; P = 0.597). CONCLUSIONS The online wound-related risk calculator accurately predicted wound infections but not wound reoperations in our metastatic spine surgery cohort. We suggest that the model may be clinically useful despite underlying population differences, but further work is needed to generate and validate accurate prediction tools.
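The external validation above reduces to comparing calculator-predicted risks against observed 30-day outcomes via the c-statistic (AUC). A minimal pure-Python sketch of that comparison, using made-up toy risks rather than the study's cohort:

```python
def auc(y_true, y_score):
    """Concordance (c-statistic): probability that a randomly chosen
    event receives a higher predicted risk than a non-event
    (Mann-Whitney formulation; ties count 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need both events and non-events")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# toy cohort: calculator-predicted 30-day infection risks vs observed outcomes
predicted = [0.02, 0.05, 0.10, 0.08, 0.01, 0.15, 0.03, 0.12]
observed  = [0,    0,    1,    0,    0,    1,    0,    0]
print(round(auc(observed, predicted), 3))  # 0.917
```

An AUC near 0.74, as reported for the infection model, means a randomly chosen infected patient outranks a randomly chosen uninfected one about 74% of the time.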
Affiliation(s)
- Jessica Ryvlin
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Seung Woo Kim
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Rafael De la Garza Ramos
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Mousa Hamad
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Ariel Stock
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Edwin Owolo
- Department of Orthopedic Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Yaroslav Gelfand
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Saikiran Murthy
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
- Reza Yassari
- Department of Neurological Surgery, Montefiore Medical Center, Albert Einstein College of Medicine, Bronx, New York, USA
2. Tanner KT, Keogh RH, Coupland CAC, Hippisley-Cox J, Diaz-Ordaz K. Dynamic updating of clinical survival prediction models in a changing environment. Diagn Progn Res 2023; 7:24. PMID: 38082429. PMCID: PMC10714456. DOI: 10.1186/s41512-023-00163-z.
Abstract
BACKGROUND Over time, the performance of clinical prediction models may deteriorate due to changes in clinical management, data quality, disease risk and/or patient mix. Such prediction models must be updated in order to remain useful. In this study, we investigate dynamic model updating of clinical survival prediction models. In contrast to discrete or one-time updating, dynamic updating refers to a repeated process for updating a prediction model with new data. We aim to extend previous research, which focused largely on binary outcome prediction models, by concentrating on time-to-event outcomes. We were motivated by the rapidly changing environment seen during the COVID-19 pandemic, where mortality rates changed over time and new treatments and vaccines were introduced. METHODS We illustrate three methods for dynamic model updating: Bayesian dynamic updating, recalibration, and full refitting. We use a simulation study to compare performance in a range of scenarios including changing mortality rates, predictors with low prevalence and the introduction of a new treatment. Next, the updating strategies were applied to a model for predicting 70-day COVID-19-related mortality using patient data from QResearch, an electronic health records database from general practices in the UK. RESULTS In simulated scenarios with mortality rates changing over time, all updating methods resulted in better calibration than not updating. Moreover, dynamic updating outperformed ad hoc updating. In the simulation scenario with a new predictor and a small updating dataset, Bayesian updating improved the C-index over not updating and refitting. In the motivating example with a rare outcome, no single updating method offered the best performance. CONCLUSIONS We found that a dynamic updating process outperformed one-time discrete updating in the simulations. Bayesian updating offered good performance overall, even in scenarios with new predictors and few events. Intercept recalibration was effective in scenarios with smaller sample size and changing baseline hazard. Refitting performance depended on sample size and produced abrupt changes in hazard ratio estimates between periods.
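Of the three updating methods compared above, intercept recalibration (calibration-in-the-large) is the simplest: shift the model's intercept so the mean predicted risk matches the event rate in the new data, leaving all other coefficients fixed. A hedged sketch with invented toy risks (written for a logistic model; the paper's survival setting would adjust the baseline hazard instead):

```python
import math

def logit(p): return math.log(p / (1 - p))
def expit(x): return 1 / (1 + math.exp(-x))

def recalibrate_intercept(pred_risks, observed_rate, lo=-10.0, hi=10.0, tol=1e-10):
    """Find the intercept shift delta such that the mean of
    expit(logit(p) + delta) equals the observed event rate
    (calibration-in-the-large), by bisection on the monotone
    mean-risk function."""
    lps = [logit(p) for p in pred_risks]
    def mean_risk(delta):
        return sum(expit(lp + delta) for lp in lps) / len(lps)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_risk(mid) < observed_rate:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

preds = [0.10, 0.20, 0.40, 0.05]            # original model risks (mean 0.1875)
delta = recalibrate_intercept(preds, 0.10)  # new data: 10% event rate
updated = [round(expit(logit(p) + delta), 3) for p in preds]
print(delta, updated)
```

Because the new event rate is below the mean predicted risk, the fitted shift is negative and every updated risk moves down while the ranking of patients is preserved.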
Affiliation(s)
- Kamaryn T Tanner
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, WC1E 7HT, UK
- Ruth H Keogh
- Department of Medical Statistics, London School of Hygiene and Tropical Medicine, London, WC1E 7HT, UK
- Carol A C Coupland
- Nuffield Department of Primary Health Care Sciences, University of Oxford, Oxford, OX2 6HT, UK
- Centre for Academic Primary Care, University of Nottingham, Nottingham, NG7 2UH, UK
- Julia Hippisley-Cox
- Nuffield Department of Primary Health Care Sciences, University of Oxford, Oxford, OX2 6HT, UK
- Karla Diaz-Ordaz
- Department of Statistical Science, University College London, London, WC1E 6BT, UK
3. van Dijk WB, Leeuwenberg AM, Grobbee DE, Siregar S, Houterman S, Daeter EJ, de Vries MC, Groenwold RHH, Schuit E. Dynamics in cardiac surgery: trends in population characteristics and the performance of the EuroSCORE II over time. Eur J Cardiothorac Surg 2023; 64:ezad301. PMID: 37672025. PMCID: PMC10504469. DOI: 10.1093/ejcts/ezad301.
Abstract
OBJECTIVES The aim of this study was to investigate the performance of the EuroSCORE II over time and dynamics in the values of predictors included in the model. METHODS A cohort study was performed using data from the Netherlands Heart Registration. All cardiothoracic surgical procedures performed between 1 January 2013 and 31 December 2019 were included for analysis. Performance of the EuroSCORE II was assessed across 3-month intervals in terms of calibration and discrimination. For subgroups of major surgical procedures, performance was assessed across 12-month intervals. Changes in the values of individual EuroSCORE II predictors over time were assessed graphically. RESULTS A total of 103,404 cardiothoracic surgical procedures were included. Observed mortality risk ranged between 1.9% [95% confidence interval (CI) 1.6-2.4] and 3.6% (95% CI 2.6-4.4) across 3-month intervals, while the mean predicted mortality risk ranged between 3.4% (95% CI 3.3-3.6) and 4.2% (95% CI 3.9-4.6). The corresponding observed:expected ratios ranged from 0.50 (95% CI 0.46-0.61) to 0.95 (95% CI 0.74-1.16). Discriminative performance in terms of the c-statistic ranged between 0.82 (95% CI 0.78-0.89) and 0.89 (95% CI 0.87-0.93). The EuroSCORE II consistently overestimated mortality compared to observed mortality, a finding that was consistent across all major cardiothoracic surgical procedures. Distributions of the values of individual predictors varied broadly over time. The most notable trends were a decrease in elective surgery from 75% to 54% and a rise in patients with no heart failure or New York Heart Association class I heart failure from 27% to 33%. CONCLUSIONS The EuroSCORE II shows good discriminative performance but consistently overestimates the mortality risks of all types of major cardiothoracic surgical procedures in the Netherlands.
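The calibration summary used above, the observed:expected (O:E) ratio, is simply total observed events divided by total predicted events; values below 1, as reported for the EuroSCORE II, mean the model overestimates risk. A toy illustration with invented numbers, not registry data:

```python
def observed_expected_ratio(outcomes, predicted_risks):
    """Observed:expected event ratio. Values below 1 indicate the
    model overestimates risk (the pattern reported for the
    EuroSCORE II in this study)."""
    observed = sum(outcomes)
    expected = sum(predicted_risks)
    return observed / expected

# toy interval: 2 deaths observed among 100 procedures,
# while the model expected 4 deaths in total
outcomes = [1, 1] + [0] * 98
risks = [0.04] * 100
print(round(observed_expected_ratio(outcomes, risks), 2))  # 0.5
```

An O:E of 0.5, matching the lowest interval in the abstract, means only half the predicted deaths were actually observed.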
Affiliation(s)
- Wouter B van Dijk
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Artuur M Leeuwenberg
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Diederick E Grobbee
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
- Sabrina Siregar
- Department of Cardiothoracic Surgery, Erasmus Medical Center, Erasmus University, Rotterdam, Netherlands
- Edgar J Daeter
- Netherlands Heart Registration, Utrecht, Netherlands
- Department of Cardiothoracic Surgery, St. Antonius Hospital, Nieuwegein, Netherlands
- Martine C de Vries
- Department of Medical Ethics and Health Law, Leiden University Medical Center, Leiden University, Leiden, Netherlands
- Rolf H H Groenwold
- Department of Clinical Epidemiology, Leiden University Medical Center, Leiden University, Leiden, Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden University, Leiden, Netherlands
- Ewoud Schuit
- Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, Netherlands
4. Yadava OP. Predicting the unpredictable in cardiothoracic surgery. Indian J Thorac Cardiovasc Surg 2023; 39:109-111. PMID: 36785611. PMCID: PMC9918619. DOI: 10.1007/s12055-023-01478-8.
5. Binuya MAE, Engelhardt EG, Schats W, Schmidt MK, Steyerberg EW. Methodological guidance for the evaluation and updating of clinical prediction models: a systematic review. BMC Med Res Methodol 2022; 22:316. PMID: 36510134. PMCID: PMC9742671. DOI: 10.1186/s12874-022-01801-8.
Abstract
BACKGROUND Clinical prediction models are often not properly evaluated in specific settings or updated, for instance, with information from new markers. These key steps are needed so that models are fit for purpose and remain relevant in the long term. We aimed to present an overview of methodological guidance for the evaluation (i.e., validation and impact assessment) and updating of clinical prediction models. METHODS We systematically searched nine databases from January 2000 to January 2022 for articles in English with methodological recommendations for the post-derivation stages of interest. Qualitative analysis was used to summarize the 70 selected guidance papers. RESULTS Key aspects of validation are the assessment of statistical performance using measures of discrimination (e.g., C-statistic) and calibration (e.g., calibration-in-the-large and calibration slope). For assessing impact or usefulness in clinical decision-making, recent papers advise using decision-analytic measures (e.g., the Net Benefit) over simplistic classification measures that ignore clinical consequences (e.g., accuracy, overall Net Reclassification Index). Commonly recommended methods for model updating are recalibration (i.e., adjustment of intercept or baseline hazard and/or slope), revision (i.e., re-estimation of individual predictor effects), and extension (i.e., addition of new markers). Additional methodological guidance is needed for newer types of updating (e.g., meta-model and dynamic updating) and for machine learning-based models. CONCLUSION Substantial guidance was found for model evaluation and for more conventional updating of regression-based models. An important development in model evaluation is the introduction of a decision-analytic framework for assessing clinical usefulness. Consensus is emerging on methods for model updating.
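One of the decision-analytic measures recommended above, the Net Benefit, weighs true positives against false positives at a chosen risk threshold, so that a false alarm only "costs" as much as the threshold implies. A short sketch with fabricated example predictions:

```python
def net_benefit(y_true, y_score, threshold):
    """Decision-analytic Net Benefit at risk threshold pt:
    (TP - FP * pt / (1 - pt)) / N."""
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return (tp - fp * threshold / (1 - threshold)) / n

# fabricated predictions for 10 patients
y =      [1,   0,   1,   0,   0,   0,   1,   0,    0,    0]
scores = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.8, 0.05, 0.25, 0.15]
print(round(net_benefit(y, scores, 0.30), 3))  # 0.214
```

Comparing this value against "treat all" and "treat none" strategies across a range of thresholds gives the decision curve that the guidance papers recommend over accuracy-style metrics.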
Affiliation(s)
- M. A. E. Binuya
- Division of Molecular Pathology, the Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- E. G. Engelhardt
- Division of Molecular Pathology, the Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Division of Psychosocial Research and Epidemiology, the Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital, Amsterdam, The Netherlands
- W. Schats
- Scientific Information Service, The Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital, Amsterdam, The Netherlands
- M. K. Schmidt
- Division of Molecular Pathology, the Netherlands Cancer Institute – Antoni van Leeuwenhoek Hospital, Plesmanlaan 121, 1066 CX Amsterdam, The Netherlands
- Department of Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands
- E. W. Steyerberg
- Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, The Netherlands
6. van Klaveren D, Zanos TP, Nelson J, Levy TJ, Park JG, Retel Helmrich IRA, Rietjens JAC, Basile MJ, Hajizadeh N, Lingsma HF, Kent DM. Prognostic models for COVID-19 needed updating to warrant transportability over time and space. BMC Med 2022; 20:456. PMID: 36424619. PMCID: PMC9686462. DOI: 10.1186/s12916-022-02651-3.
Abstract
BACKGROUND Supporting decisions for patients who present to the emergency department (ED) with COVID-19 requires accurate prognostication. We aimed to evaluate prognostic models for predicting outcomes in hospitalized patients with COVID-19, in different locations and across time. METHODS We included patients who presented to the ED with suspected COVID-19 and were admitted to 12 hospitals in the New York City (NYC) area and 4 large Dutch hospitals. We used second-wave patients who presented between September and December 2020 (2137 and 3252 in NYC and the Netherlands, respectively) to evaluate models that were developed on first-wave patients who presented between March and August 2020 (12,163 and 5831). We evaluated two prognostic models for in-hospital death: the Northwell COVID-19 Survival (NOCOS) model was developed on NYC data and the COVID Outcome Prediction in the Emergency Department (COPE) model was developed on Dutch data. These models were validated on subsequent second-wave data at the same site (temporal validation) and at the other site (geographic validation). We assessed model performance by the Area Under the receiver operating characteristic Curve (AUC), by the E-statistic, and by net benefit. RESULTS Twenty-eight-day mortality was considerably higher in the NYC first-wave data (21.0%) compared to the second wave (10.1%) and the Dutch data (first wave 10.8%; second wave 10.0%). COPE discriminated well at temporal validation (AUC 0.82), with excellent calibration (E-statistic 0.8%). At geographic validation, discrimination was satisfactory (AUC 0.78), but with moderate over-prediction of mortality risk, particularly in higher-risk patients (E-statistic 2.9%). While discrimination was adequate when NOCOS was tested on second-wave NYC data (AUC 0.77), NOCOS systematically overestimated the mortality risk (E-statistic 5.1%). Discrimination in the Dutch data was good (AUC 0.81), but with over-prediction of risk, particularly in lower-risk patients (E-statistic 4.0%). Recalibration of COPE and NOCOS led to limited net benefit improvement in Dutch data, but to substantial net benefit improvement in NYC data. CONCLUSIONS NOCOS performed moderately worse than COPE, probably reflecting unique aspects of the early pandemic in NYC. Frequent updating of prognostic models is likely to be required for transportability over time and space during a dynamic pandemic.
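The E-statistic reported above summarizes calibration as the average absolute gap between predicted and observed risk along a smoothed calibration curve. As a rough stand-in for the smoothing, the sketch below bins risk-sorted predictions; the toy data and quantile binning are my simplifying assumptions, not the paper's method:

```python
def binned_calibration_error(y_true, y_score, n_bins=4):
    """Simplified E-statistic: mean absolute difference between
    predicted and observed risk, averaged over risk-sorted bins.
    (The paper uses a smoothed calibration curve; equal-size bins
    are a coarse stand-in.)"""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    err, count = 0.0, 0
    for b in range(n_bins):
        chunk = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not chunk:
            continue
        mean_pred = sum(s for s, _ in chunk) / len(chunk)
        mean_obs = sum(y for _, y in chunk) / len(chunk)
        err += abs(mean_pred - mean_obs) * len(chunk)
        count += len(chunk)
    return err / count

# toy data: six patients, three risk bins of two
y = [0, 0, 1, 0, 1, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.8, 0.9]
print(round(binned_calibration_error(y, scores, n_bins=3), 3))  # 0.15
```

A value of 0 would mean predicted and observed risks coincide in every bin; the E-statistics of 2.9-5.1% quoted above correspond to systematic over-prediction of a few risk percentage points.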
Affiliation(s)
- David van Klaveren
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Theodoros P Zanos
- Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jason Nelson
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Todd J Levy
- Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
- Jinny G Park
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
- Isabel R A Retel Helmrich
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Judith A C Rietjens
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- Melissa J Basile
- Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Negin Hajizadeh
- Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
- Hester F Lingsma
- Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
- David M Kent
- Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
7. Zhang X, Xue Y, Su X, Chen S, Liu K, Chen W, Liu M, Hu Y. A Transfer Learning Approach to Correct the Temporal Performance Drift of Clinical Prediction Models: Retrospective Cohort Study. JMIR Med Inform 2022; 10:e38053. DOI: 10.2196/38053.
Abstract
Background
Clinical prediction models suffer performance drift as the patient population shifts over time, so there is a great need for model updating approaches or modeling frameworks that can make effective use of both old and new data.
Objective
Based on the paradigm of transfer learning, we aimed to develop a novel modeling framework that transfers old knowledge to the new environment for prediction tasks, thereby helping to correct performance drift.
Methods
The proposed predictive modeling framework maintains a logistic regression–based stacking ensemble of 2 gradient boosting machine (GBM) models representing old and new knowledge learned from old and new data, respectively (referred to as transfer learning gradient boosting machine [TransferGBM]). The ensemble learning procedure can dynamically balance the old and new knowledge. Using 2010-2017 electronic health record data on a retrospective cohort of 141,696 patients, we validated TransferGBM for hospital-acquired acute kidney injury prediction.
Results
The baseline models (ie, transported models) that were trained on 2010 and 2011 data showed significant performance drift in the temporal validation with 2012-2017 data. Refitting these models using updated samples resulted in performance gains in nearly all cases. The proposed TransferGBM model succeeded in achieving uniformly better performance than the refitted models.
Conclusions
Under the scenario of population shift, incorporating new knowledge while preserving old knowledge is essential for maintaining stable performance. Transfer learning combined with stacking ensemble learning can help achieve a balance of old and new knowledge in a flexible and adaptive way, even in the case of insufficient new data.
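The TransferGBM idea above, a logistic-regression stacker over an "old" and a "new" base model, can be sketched in miniature. The two GBMs are replaced here by hand-made probability scorers and the stacker is fit by plain gradient descent; the data, models, and hyperparameters are all invented for illustration:

```python
import math

def expit(x): return 1 / (1 + math.exp(-x))

# Stand-ins for the two GBMs: any callables mapping a feature vector
# to a predicted probability (here, hand-made "old" and "new" scorers).
old_model = lambda x: expit(2.0 * x[0] - 1.0)
new_model = lambda x: expit(3.0 * x[0] + 0.5 * x[1] - 1.5)

def fit_stacker(X, y, base_models, lr=0.5, epochs=2000):
    """Logistic-regression stacker over base-model probabilities,
    trained by plain gradient descent on log loss. The learned
    weights dynamically balance old vs new knowledge, as in the
    TransferGBM idea."""
    w = [0.0] * len(base_models)
    b = 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * len(w), 0.0
        for x, t in zip(X, y):
            feats = [m(x) for m in base_models]
            p = expit(sum(wi * f for wi, f in zip(w, feats)) + b)
            for i, f in enumerate(feats):
                gw[i] += (p - t) * f
            gb += p - t
        w = [wi - lr * gi / len(X) for wi, gi in zip(w, gw)]
        b -= lr * gb / len(X)
    def stacked(x):
        feats = [m(x) for m in base_models]
        return expit(sum(wi * f for wi, f in zip(w, feats)) + b)
    return stacked

# invented "new-period" training data
X = [(0.1, 0.0), (0.9, 1.0), (0.2, 0.5), (0.8, 0.2), (0.4, 0.9), (0.7, 0.7)]
y = [0, 1, 0, 1, 0, 1]
ensemble = fit_stacker(X, y, [old_model, new_model])
print([round(ensemble(x), 2) for x in X])
```

If the new data contradict the old model, the stacker simply learns a small weight for it; with scarce new data, the old model's weight keeps predictions stable.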
8. Machine Learning Model Drift: Predicting Diagnostic Imaging Follow-Up as a Case Example. J Am Coll Radiol 2022; 19:1162-1169. PMID: 35981636. DOI: 10.1016/j.jacr.2022.05.030.
Abstract
OBJECTIVE To address model drift in a machine learning (ML) model for predicting diagnostic imaging follow-up, using data augmentation with more recent data versus retraining new predictive models. METHODS This institutional review board-approved retrospective study was conducted January 1, 2016, to December 31, 2020, at a large academic institution. A previously trained ML model had been trained on 1,000 radiology reports from 2016 (old data). An additional 1,385 randomly selected reports from 2019 to 2020 (new data) were annotated for follow-up recommendations and randomly divided into two sets: training (n = 900) and testing (n = 485). Support vector machine and random forest (RF) models were constructed and trained using the 900 new training reports plus old data (augmented data, new models) and using only new data (new data, new models). The 2016 baseline model served as the comparator, both as is and retrained with augmented data. Recall was compared with baseline using McNemar's test. RESULTS Follow-up recommendations were contained in 11.3% of reports (157 of 1,385). The baseline model retrained with new data had precision = 0.83 and recall = 0.54, neither significantly different from baseline. A new RF model trained with augmented data had significantly better recall versus the baseline model (0.80 versus 0.66, P = .04) and comparable precision (0.90 versus 0.86). DISCUSSION ML methods for monitoring follow-up recommendations in radiology reports suffer model drift over time. A newly developed RF model trained on augmented data achieved better recall with comparable precision versus simply retraining the previously trained original model. Thus, these models must be regularly assessed and updated with more recent historical data.
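McNemar's test, used above to compare recall between the retrained and newly trained models, operates on the discordant pairs, i.e., cases where exactly one of the two models is correct. A small sketch of the continuity-corrected statistic on fabricated paired results:

```python
def mcnemar_statistic(correct_a, correct_b):
    """Continuity-corrected McNemar chi-square computed from paired
    per-case correctness indicators (1 = correct, 0 = wrong) of two
    models evaluated on the same test cases."""
    b = sum(1 for a, c in zip(correct_a, correct_b) if a and not c)
    c = sum(1 for a, c in zip(correct_a, correct_b) if not a and c)
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

# fabricated results: model A correct on 8 cases where B was wrong,
# B correct on 2 cases where A was wrong, both correct on 10
a  = [1] * 8 + [0] * 2 + [1] * 10
b_ = [0] * 8 + [1] * 2 + [1] * 10
print(mcnemar_statistic(a, b_))  # (|8-2|-1)**2 / 10 = 2.5
```

The statistic is referred to a chi-square distribution with one degree of freedom; cases where both models agree contribute nothing, which is why paired tests are more sensitive here than comparing raw recall values.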
9. Davis SE, Brown JR, Dorn C, Westerman D, Solomon RJ, Matheny ME. Maintaining a National Acute Kidney Injury Risk Prediction Model to Support Local Quality Benchmarking. Circ Cardiovasc Qual Outcomes 2022; 15:e008635. PMID: 35959674. PMCID: PMC9388604. DOI: 10.1161/circoutcomes.121.008635.
Abstract
BACKGROUND The utility of quality dashboards to inform decision-making and improve clinical outcomes is tightly linked to the accuracy of the information they provide and, in turn, the accuracy of the underlying prediction models. Despite recognition of the need to update prediction models to maintain accuracy over time, there is limited guidance on updating strategies. We compare predefined and surveillance-based updating strategies applied to a model supporting quality evaluations among US veterans. METHODS We evaluated the performance of a US Department of Veterans Affairs-specific model for post-cardiac catheterization acute kidney injury using routinely collected observational data over the 6 years following model development (n=90,295 procedures in 2013-2019). Predicted probabilities were generated from the original model, an annually retrained model, and a surveillance-based approach that monitored performance to inform the timing and method of updates. We evaluated how updating the national model impacted regional quality profiles. We compared observed-to-expected outcome ratios, where values above and below 1 indicated more and fewer adverse outcomes than expected, respectively. RESULTS The original model overpredicted risk at the national level (observed-to-expected outcome ratio, 0.75 [0.74-0.77]). Annual retraining updated the model five times; surveillance-based updating retrained once and recalibrated twice. While both strategies improved performance, the surveillance-based approach provided superior calibration (observed-to-expected outcome ratio, 1.01 [0.99-1.03] versus 0.94 [0.92-0.96]). Overprediction by the original model led to optimistic quality assessments, incorrectly indicating that most of the US Department of Veterans Affairs' 18 regions observed fewer acute kidney injury events than predicted. Both updating strategies revealed that 16 regions performed as expected and 2 regions increasingly underperformed, having more acute kidney injury events than predicted. CONCLUSIONS Miscalibrated clinical prediction models provide inaccurate pictures of performance across clinical units, and degrading calibration further complicates our understanding of quality. Updating strategies tailored to health system needs and capacity should be incorporated into model implementation plans to promote the utility and longevity of quality reporting tools.
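A surveillance-based strategy like the one above can be caricatured as a loop that monitors the O:E ratio each period and only intervenes when it drifts outside a tolerance band. The sketch below (invented counts, thresholds, and update rule, not the VA model's actual logic) applies a multiplicative calibration fix on drift:

```python
def surveillance_update(periods, low=0.9, high=1.1):
    """Surveillance-based maintenance sketch: for each period of
    (observed_events, expected_events), check the O:E ratio against
    the current calibration factor and recalibrate only when the
    ratio drifts outside [low, high]. Returns the per-period actions
    and the final multiplicative calibration factor."""
    factor, actions = 1.0, []
    for observed, expected in periods:
        oe = observed / (expected * factor)
        if low <= oe <= high:
            actions.append("keep")
        else:
            factor *= oe          # calibration-in-the-large style fix
            actions.append("recalibrate")
    return actions, factor

# invented quarterly counts: (observed AKI events, model-expected events)
periods = [(30, 40.0), (28, 38.0), (31, 30.0)]
actions, factor = surveillance_update(periods)
print(actions)  # ['recalibrate', 'keep', 'recalibrate']
```

The appeal, as in the paper, is that the model is touched only when monitoring demands it, rather than on a fixed retraining schedule.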
Affiliation(s)
- Sharon E. Davis
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Jeremiah R. Brown
- Departments of Epidemiology and Biomedical Data Science, Dartmouth Geisel School of Medicine, Hanover, NH, USA
- Chad Dorn
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Dax Westerman
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Richard J. Solomon
- Department of Medicine, Larner College of Medicine, University of Vermont, Burlington, VT, USA
- Michael E. Matheny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
- Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA
- Tennessee Valley Healthcare System VA Medical Center, Veterans Health Administration, Nashville, TN, USA
10. Mehta Y, Kapoor PM, Maheswarappa HM, Saxena G. Noninvasive Bioreactance-Based Fluid Management Monitoring: A Review of Literature. Journal of Cardiac Critical Care TSS 2022. DOI: 10.1055/s-0041-1741491.
Abstract
Body fluid balance is an independent predictor of mortality. For each liter of fluid over and above 5 L, risk-adjusted excess mortality is seen: mortality increases by 2.3% and hospital costs increase by $999 for each additional liter. Accordingly, most recent guidelines have endorsed dynamic modeling. A passive leg raising-induced increase of aortic blood flow ≥ 10% predicts fluid responsiveness with a sensitivity of 97% and a specificity of 94%; passive leg raising is therefore often used as the gold standard for validating other procedures (although its usefulness for assessing respiratory variation in the vena cava is not conclusive). STARLING, a device based on bioreactance, works on phase shift or time delay, whereas bioimpedance works on the amplitude of the thoracic impedance. Unlike bioimpedance, bioreactance is not affected by the size of the patient, thoracic fluids, or the position of sensors. STARLING is equipped with four sensor pads; each pad contains two sensors, the outer a transmitting electrode and the inner a receiving electrode. The STARLING monitor induces a 75-kHz AC current and then measures the time delay/phase shift. The STARLING system, a bioreactance-based dynamic assessment system for fluid responsiveness, predicts it accurately, precisely, and noninvasively. It reduces invasive risks, has been independently validated against the pulmonary artery catheter, is not affected by vasopressors or shock, and has a wide range of applications.
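The phase-shift measurement at the heart of bioreactance can be illustrated with a lock-in style estimate: correlate the received signal with quadrature references at the carrier frequency and take the arctangent. This is only a conceptual sketch of phase detection on a simulated carrier, not the STARLING algorithm:

```python
import math

def phase_shift(received, freq_hz, sample_rate_hz):
    """Lock-in style phase estimate: correlate the received signal
    with cosine/sine references at the carrier frequency and take
    atan2 of the quadrature sums. Assumes the capture spans an
    integer number of carrier periods."""
    i_sum = q_sum = 0.0
    for k, v in enumerate(received):
        t = k / sample_rate_hz
        i_sum += v * math.cos(2 * math.pi * freq_hz * t)
        q_sum += v * math.sin(2 * math.pi * freq_hz * t)
    return math.atan2(q_sum, i_sum)

# simulated 75 kHz carrier received with a 0.3 rad phase lag
fs, f, true_phase = 1_000_000, 75_000, 0.3
sig = [math.cos(2 * math.pi * f * k / fs - true_phase) for k in range(1000)]
print(round(phase_shift(sig, f, fs), 3))  # 0.3
```

In a bioreactance monitor, variations in this recovered phase over the cardiac cycle (rather than its absolute value) carry the hemodynamic signal.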
Affiliation(s)
- Yatin Mehta
- Medanta Institute of Critical Care and Anesthesiology, Medanta the Medicity, Gurugram, Haryana, India
- Poonam Malhotra Kapoor
- Department of Cardiac Anesthesiology, All India Institute of Medical Sciences, New Delhi, India
- Harish Mallapura Maheswarappa
- Division of Critical Care Medicine, Critical Care and Pain, Department of Anaesthesiology, Tata Memorial Hospital, Mumbai, Maharashtra, India
- Gaurav Saxena
- Medical Affairs Division, Baxter India Pvt Ltd, Gurugram, Haryana, India
11. Antoniou T, Mamdani M. Évaluation des solutions fondées sur l'apprentissage machine en santé [Evaluation of machine learning-based solutions in health care]. CMAJ 2021; 193:E1720-E1724. PMID: 34750185. PMCID: PMC8584374. DOI: 10.1503/cmaj.210036-f.
Affiliation(s)
- Tony Antoniou
- Li Ka Shing Centre for Healthcare Analytics Research & Training (Antoniou, Mamdani), Unity Health Toronto; Li Ka Shing Knowledge Institute (Antoniou, Mamdani), Unity Health Toronto; Department of Family and Community Medicine (Antoniou), Unity Health Toronto and University of Toronto; Temerty Faculty of Medicine (Mamdani) and Leslie Dan Faculty of Pharmacy (Mamdani), University of Toronto; Institute of Health Policy, Management, and Evaluation (Mamdani), University of Toronto, Toronto, Ont.
- Muhammad Mamdani
- Li Ka Shing Centre for Healthcare Analytics Research & Training (Antoniou, Mamdani), Unity Health Toronto; Li Ka Shing Knowledge Institute (Antoniou, Mamdani), Unity Health Toronto; Department of Family and Community Medicine (Antoniou), Unity Health Toronto and University of Toronto; Temerty Faculty of Medicine (Mamdani) and Leslie Dan Faculty of Pharmacy (Mamdani), University of Toronto; Institute of Health Policy, Management, and Evaluation (Mamdani), University of Toronto, Toronto, Ont.
|
12
|
Reynard C, Martin GP, Kontopantelis E, Jenkins DA, Heagerty A, McMillan B, Jafar A, Garlapati R, Body R. Advanced cardiovascular risk prediction in the emergency department: updating a clinical prediction model - a large database study protocol. Diagn Progn Res 2021; 5:16. [PMID: 34620253 PMCID: PMC8499458 DOI: 10.1186/s41512-021-00105-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/27/2021] [Accepted: 09/27/2021] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Patients presenting with chest pain represent a large proportion of attendances to emergency departments. In these patients clinicians often consider the diagnosis of acute myocardial infarction (AMI), the timely recognition and treatment of which is clinically important. Clinical prediction models (CPMs) have been used to enhance early diagnosis of AMI. The Troponin-only Manchester Acute Coronary Syndromes (T-MACS) decision aid is currently in clinical use across Greater Manchester. CPMs have been shown to deteriorate over time through calibration drift. We aim to assess potential calibration drift with T-MACS and compare methods for updating the model. METHODS We will use routinely collected electronic data from patients who were treated using T-MACS at two large NHS hospitals. This is estimated to include approximately 14,000 patient episodes spanning June 2016 to October 2020. The primary outcome of acute myocardial infarction will be sourced from NHS Digital's admitted patient care dataset. We will assess the calibration drift of the existing model and the benefit of updating the CPM by model recalibration, model extension and dynamic updating. These models will be validated by bootstrapping and one-step-ahead prequential testing. We will evaluate predictive performance using calibration plots and c-statistics. We will also examine the reclassification of predicted probability with the updated T-MACS model. DISCUSSION CPMs are widely used in modern medicine, but are vulnerable to deteriorating calibration over time. Ongoing refinement using routinely collected electronic data will inevitably be more efficient than deriving and validating new models. In this analysis we will seek to exemplify methods for updating CPMs to protect the initial investment of time and effort. If successful, the updating methods could be used to continually refine the algorithm used within T-MACS, maintaining or even improving predictive performance over time.
TRIAL REGISTRATION ISRCTN number: ISRCTN41008456.
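Of the updating strategies the protocol names, model recalibration is the simplest to illustrate: shift every prediction's log-odds by a single constant so that the mean predicted probability matches the observed event rate. The sketch below is an intercept-only toy version (the function name, the Newton solver, and the synthetic data are illustrative; the protocol itself also considers slope updates, model extension, and dynamic updating):

```python
import numpy as np

def recalibrate_intercept(pred_prob, observed, iters=50):
    """Intercept-only recalibration: find the constant log-odds shift
    delta such that the mean recalibrated probability equals the
    observed event rate, using Newton's method on the calibration gap.
    """
    logit = np.log(pred_prob / (1 - pred_prob))
    target = observed.mean()
    delta = 0.0
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-(logit + delta)))
        grad = p.mean() - target          # mean-calibration gap
        hess = (p * (1.0 - p)).mean()     # its derivative in delta
        delta -= grad / hess
    return delta

# A drifted model that now over-predicts: ~30% predicted vs ~15% observed
rng = np.random.default_rng(0)
pred = rng.uniform(0.2, 0.4, size=1000)
obs = (rng.uniform(size=1000) < 0.15).astype(float)
delta = recalibrate_intercept(pred, obs)  # negative: pulls predictions down
```

After the shift, the model's discrimination (ranking of patients) is unchanged; only its calibration-in-the-large is corrected.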
Affiliation(s)
- Charles Reynard
- Division of Cardiovascular Sciences, University of Manchester, Manchester, UK
- Emergency Department, Manchester University NHS Foundation Trust, Manchester, UK
- Glen P. Martin
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Evangelos Kontopantelis
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- David A. Jenkins
- Division of Informatics, Imaging and Data Science, Faculty of Biology, Medicine and Health, University of Manchester, Manchester Academic Health Science Centre, Manchester, UK
- Anthony Heagerty
- Division of Cardiovascular Sciences, University of Manchester, Manchester, UK
- Brian McMillan
- Centre for Primary Care and Health Services Research, Division of Population Health, Health Services Research and Primary Care, School of Health Sciences, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Anisa Jafar
- Humanitarian and Conflict Response Institute, University of Manchester, Manchester, UK
- Rajendar Garlapati
- Emergency Department, Royal Blackburn Hospital, East Lancashire Hospitals NHS Trust, Burnley, UK
- Richard Body
- Division of Cardiovascular Sciences, University of Manchester, Manchester, UK
- Emergency Department, Manchester University NHS Foundation Trust, Manchester, UK
|
13
|
Affiliation(s)
- Tony Antoniou
- Li Ka Shing Centre for Healthcare Analytics Research & Training (Antoniou, Mamdani), Unity Health Toronto; Li Ka Shing Knowledge Institute (Antoniou, Mamdani), Unity Health Toronto; Department of Family and Community Medicine (Antoniou), Unity Health Toronto and University of Toronto; Temerty Faculty of Medicine (Mamdani) and Leslie Dan Faculty of Pharmacy (Mamdani), University of Toronto; Institute of Health Policy, Management, and Evaluation (Mamdani), University of Toronto, Toronto, Ont.
- Muhammad Mamdani
- Li Ka Shing Centre for Healthcare Analytics Research & Training (Antoniou, Mamdani), Unity Health Toronto; Li Ka Shing Knowledge Institute (Antoniou, Mamdani), Unity Health Toronto; Department of Family and Community Medicine (Antoniou), Unity Health Toronto and University of Toronto; Temerty Faculty of Medicine (Mamdani) and Leslie Dan Faculty of Pharmacy (Mamdani), University of Toronto; Institute of Health Policy, Management, and Evaluation (Mamdani), University of Toronto, Toronto, Ont.
|
14
|
Guo LL, Pfohl SR, Fries J, Posada J, Fleming SL, Aftandilian C, Shah N, Sung L. Systematic Review of Approaches to Preserve Machine Learning Performance in the Presence of Temporal Dataset Shift in Clinical Medicine. Appl Clin Inform 2021; 12:808-815. [PMID: 34470057 PMCID: PMC8410238 DOI: 10.1055/s-0041-1735184] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2021] [Accepted: 07/12/2021] [Indexed: 10/20/2022] Open
Abstract
OBJECTIVE The change in performance of machine learning models over time as a result of temporal dataset shift is a barrier to machine learning-derived models facilitating decision-making in clinical practice. Our aim was to describe technical procedures used to preserve the performance of machine learning models in the presence of temporal dataset shifts. METHODS Studies were included if they were fully published articles that used machine learning and implemented a procedure to mitigate the effects of temporal dataset shift in a clinical setting. We described how dataset shift was measured, the procedures used to preserve model performance, and their effects. RESULTS Of 4,457 potentially relevant publications identified, 15 were included. The impact of temporal dataset shift was primarily quantified using changes, usually deterioration, in calibration or discrimination. Calibration deterioration was more common (n = 11) than discrimination deterioration (n = 3). Mitigation strategies were categorized as model level or feature level. Model-level approaches (n = 15) were more common than feature-level approaches (n = 2), with the most common approaches being model refitting (n = 12), probability calibration (n = 7), model updating (n = 6), and model selection (n = 6). In general, all mitigation strategies were successful at preserving calibration but not uniformly successful in preserving discrimination. CONCLUSION There was limited research in preserving the performance of machine learning models in the presence of temporal dataset shift in clinical medicine. Future research could focus on the impact of dataset shift on clinical decision making, benchmark the mitigation strategies on a wider range of datasets and tasks, and identify optimal strategies for specific settings.
Affiliation(s)
- Lin Lawrence Guo
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Stephen R. Pfohl
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Jason Fries
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Jose Posada
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Scott Lanyon Fleming
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Catherine Aftandilian
- Division of Pediatric Hematology/Oncology, Stanford University, Palo Alto, California, United States
- Nigam Shah
- Biomedical Informatics Research, Stanford University, Palo Alto, California, United States
- Lillian Sung
- Program in Child Health Evaluative Sciences, The Hospital for Sick Children, Toronto, Canada
- Division of Haematology/Oncology, The Hospital for Sick Children, Toronto, Canada
|
15
|
Davis SE, Greevy RA, Lasko TA, Walsh CG, Matheny ME. Detection of calibration drift in clinical prediction models to inform model updating. J Biomed Inform 2020; 112:103611. [PMID: 33157313 PMCID: PMC8627243 DOI: 10.1016/j.jbi.2020.103611] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2020] [Revised: 10/21/2020] [Accepted: 10/29/2020] [Indexed: 10/23/2022]
Abstract
Model calibration, critical to the success and safety of clinical prediction models, deteriorates over time in response to the dynamic nature of clinical environments. To support informed, data-driven model updating strategies, we present and evaluate a calibration drift detection system. Methods are developed for maintaining dynamic calibration curves with optimized online stochastic gradient descent and for detecting increasing miscalibration with adaptive sliding windows. These methods are generalizable to support diverse prediction models developed using a variety of learning algorithms and customizable to address the unique needs of clinical use cases. In both simulation and case studies, our system accurately detected calibration drift. When drift is detected, our system further provides actionable alerts by including information on a window of recent data that may be appropriate for model updating. Simulations showed these windows were primarily composed of data accruing after drift onset, supporting the potential utility of the windows for model updating. By promoting model updating as calibration deteriorates rather than on pre-determined schedules, implementations of our drift detection system may minimize interim periods of insufficient model accuracy and focus analytic resources on those models most in need of attention.
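The abstract's core idea — a calibration curve maintained by online stochastic gradient descent, with drift flagged from a window of recent data — can be sketched roughly as follows. This is a deliberately simplified caricature: the class name, learning rate, fixed window size, and tolerance are illustrative, whereas the paper's system uses optimized online SGD and adaptive sliding windows.

```python
import numpy as np

class CalibrationDriftMonitor:
    """Track a logistic calibration curve (intercept a, slope b) with
    online SGD, and flag drift when the gap between mean calibrated
    prediction and mean outcome in a recent window exceeds a tolerance."""

    def __init__(self, lr=0.05, window=200, tol=0.10):
        self.a, self.b = 0.0, 1.0     # identity calibration to start
        self.lr, self.window, self.tol = lr, window, tol
        self.buffer = []              # recent (calibrated prob, outcome) pairs

    def update(self, p, y):
        logit = np.log(p / (1 - p))
        q = 1.0 / (1.0 + np.exp(-(self.a + self.b * logit)))
        # SGD step on the log-loss of the calibration model
        self.a -= self.lr * (q - y)
        self.b -= self.lr * (q - y) * logit
        self.buffer.append((q, y))
        self.buffer = self.buffer[-self.window:]

    def drift_detected(self):
        if len(self.buffer) < self.window:
            return False
        q, y = np.array(self.buffer).T
        return abs(q.mean() - y.mean()) > self.tol
```

When drift fires, the window contents are exactly the "recent data that may be appropriate for model updating" the abstract describes.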
Affiliation(s)
- Sharon E Davis
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
- Robert A Greevy
- Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.
- Thomas A Lasko
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA.
- Colin G Walsh
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Psychiatry, Vanderbilt University Medical Center, Nashville, TN, USA.
- Michael E Matheny
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Geriatrics Research, Education, and Clinical Care, Tennessee Valley Healthcare System VA, Nashville, TN, USA.
|
16
|
Baxter RD, Fann JI, DiMaio JM, Lobdell K. Digital Health Primer for Cardiothoracic Surgeons. Ann Thorac Surg 2020; 110:364-372. [PMID: 32268139 DOI: 10.1016/j.athoracsur.2020.02.072] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/01/2019] [Revised: 01/03/2020] [Accepted: 02/23/2020] [Indexed: 12/12/2022]
Abstract
The burgeoning demands for quality, safety, and value in cardiothoracic surgery, in combination with the advancement and acceleration of digital health solutions and information technology, provide a unique opportunity to improve efficiency and effectiveness simultaneously in cardiothoracic surgery. This primer on digital health explores and reviews data integration, data processing, complex modeling, telehealth with remote monitoring, and cybersecurity as they shape the future of cardiothoracic surgery.
Affiliation(s)
- Ronald D Baxter
- Department of Cardiothoracic Surgery, Baylor Scott and White, The Heart Hospital, Plano, Texas
- James I Fann
- Department of Cardiothoracic Surgery, Stanford University Medical Center, Stanford, California
- J Michael DiMaio
- Department of Cardiothoracic Surgery, Baylor Scott and White, The Heart Hospital, Plano, Texas
- Kevin Lobdell
- Sanger Heart and Vascular Institute, Atrium Health, Charlotte, North Carolina
|
17
|
Siregar S, Nieboer D, Versteegh MIM, Steyerberg EW, Takkenberg JJM. Methods for updating a risk prediction model for cardiac surgery: a statistical primer. Interact Cardiovasc Thorac Surg 2019; 28:333-338. [DOI: 10.1093/icvts/ivy338] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2018] [Revised: 10/26/2018] [Accepted: 11/13/2018] [Indexed: 11/12/2022] Open
Affiliation(s)
- Sabrina Siregar
- Department of Cardio-thoracic Surgery, Leiden University Medical Center, Leiden, Netherlands
- Daan Nieboer
- Department of Public Health, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
- Michel I M Versteegh
- Department of Cardio-thoracic Surgery, Leiden University Medical Center, Leiden, Netherlands
- Board of the Netherlands Heart Registry, Utrecht, Netherlands
- Ewout W Steyerberg
- Department of Public Health, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
- Department of Statistics, Leiden University Medical Center, Leiden, Netherlands
- Johanna J M Takkenberg
- Department of Cardio-thoracic Surgery, Erasmus MC, University Medical Center Rotterdam, Rotterdam, Netherlands
|
18
|
Jenkins DA, Sperrin M, Martin GP, Peek N. Dynamic models to predict health outcomes: current status and methodological challenges. Diagn Progn Res 2018; 2:23. [PMID: 31093570 PMCID: PMC6460710 DOI: 10.1186/s41512-018-0045-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/09/2018] [Accepted: 11/19/2018] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Disease populations, clinical practice, and healthcare systems are constantly evolving. This can result in clinical prediction models quickly becoming outdated and less accurate over time. A potential solution is to develop 'dynamic' prediction models capable of retaining accuracy by evolving over time in response to observed changes. Our aim was to review the literature in this area to understand the current state-of-the-art in dynamic prediction modelling and identify unresolved methodological challenges. METHODS MEDLINE, Embase and Web of Science were searched for papers which used or developed dynamic clinical prediction models. Information was extracted on methods for model updating, choice of update windows and decay factors and validation of models. We also extracted reported limitations of methods and recommendations for future research. RESULTS We identified eleven papers that discussed seven dynamic clinical prediction modelling methods which split into three categories. The first category uses frequentist methods to update models in discrete steps, the second uses Bayesian methods for continuous updating and the third, based on varying coefficients, explicitly describes the relationship between predictors and outcome variable as a function of calendar time. These methods have been applied to a limited number of healthcare problems, and few empirical comparisons between them have been made. CONCLUSION Dynamic prediction models are not well established but they overcome one of the major issues with static clinical prediction models, calibration drift. However, there are challenges in choosing decay factors and in dealing with sudden changes. The validation of dynamic prediction models is still largely unexplored terrain.
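The review's central mechanism — and its noted challenge of "choosing decay factors" — can be reduced to a minimal sketch: an intercept-style correction computed as an exponentially weighted mean residual, where a forgetting factor down-weights old observations so the correction tracks recent data. This is a toy, not any of the seven reviewed methods; names, the decay value, and the synthetic stream are illustrative.

```python
def decayed_residual(probs, outcomes, decay=0.99):
    """Exponentially weighted mean residual (observed minus predicted).

    `decay` is the forgetting factor: each step, past evidence is
    multiplied by `decay`, so observations k steps old carry weight
    decay**k. decay=1.0 recovers the ordinary (static) mean residual.
    """
    num = den = 0.0
    for p, y in zip(probs, outcomes):
        num = decay * num + (y - p)
        den = decay * den + 1.0
    return num / den

# Event rate jumps midway from 0 to 1 while the model keeps predicting 0.3
probs = [0.3] * 200
outcomes = [0] * 100 + [1] * 100
print(round(decayed_residual(probs, outcomes, decay=0.9), 2))  # tracks the recent shift
print(round(decayed_residual(probs, outcomes, decay=1.0), 2))  # static average of both regimes
```

A smaller decay reacts faster to sudden change but is noisier on stable streams, which is exactly the trade-off the review flags as unresolved.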
Affiliation(s)
- David A. Jenkins
- Health e-Research Centre, Farr Institute, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- NIHR Greater Manchester Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
- Faculty of Biology, Medicine and Health, University of Manchester, City Labs 1.0, Nelson Street, Manchester, M13 9NQ UK
- Matthew Sperrin
- Health e-Research Centre, Farr Institute, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- Glen P. Martin
- Health e-Research Centre, Farr Institute, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- Niels Peek
- Health e-Research Centre, Farr Institute, Faculty of Biology, Medicine and Health, Manchester Academic Health Science Centre, University of Manchester, Manchester, UK
- NIHR Greater Manchester Patient Safety Translational Research Centre, The University of Manchester, Manchester, UK
|