Reference Citation Analysis

For: Chi S, Tian Y, Wang F, Zhou T, Jin S, Li J. A novel lifelong machine learning-based method to eliminate calibration drift in clinical prediction models. Artif Intell Med 2022;125:102256. [DOI: 10.1016/j.artmed.2022.102256] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/18/2021] [Revised: 01/14/2022] [Accepted: 02/09/2022] [Indexed: 02/03/2023]

Cited by Other Article(s)

Pruski M. What does it mean for a clinical AI to be just: conflicts between local fairness and being fit-for-purpose? J Med Ethics 2024:jme-2023-109675. [PMID: 38423759 DOI: 10.1136/jme-2023-109675] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2023] [Accepted: 02/15/2024] [Indexed: 03/02/2024]

Kagerbauer SM, Ulm B, Podtschaske AH, Andonov DI, Blobner M, Jungwirth B, Graessner M. Susceptibility of AutoML mortality prediction algorithms to model drift caused by the COVID pandemic. BMC Med Inform Decis Mak 2024;24:34. [PMID: 38308256 PMCID: PMC10837894 DOI: 10.1186/s12911-024-02428-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 01/16/2024] [Indexed: 02/04/2024] [Open Access]

Abstract

BACKGROUND

Concept drift and covariate shift lead to a degradation of machine learning (ML) models. The objective of our study was to characterize sudden data drift as caused by the COVID pandemic. Furthermore, we investigated the suitability of certain methods in model training to prevent model degradation caused by data drift.

METHODS

We trained different ML models with the H2O AutoML method on a dataset comprising 102,666 cases of surgical patients collected in the years 2014-2019 to predict postoperative mortality using preoperatively available data. Models applied were Generalized Linear Model with regularization, Default Random Forest, Gradient Boosting Machine, eXtreme Gradient Boosting, Deep Learning and Stacked Ensembles comprising all base models. Further, we modified the original models by applying three different methods when training on the original pre-pandemic dataset: (1) we weighted older data weaker, (2) used only the most recent data for model training and (3) performed a z-transformation of the numerical input parameters. Afterwards, we tested model performance on a pre-pandemic and an in-pandemic data set not used in the training process, and analysed common features.
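The three training modifications named above can be illustrated with a small sketch. The abstract does not give the weighting function, the cutoff for "most recent" data, or how the z-transformation was fitted, so the exponential half-life, the 2018 cutoff, and the function names `recency_weights` and `z_transform` below are all assumptions made for illustration only.

```python
from statistics import mean, pstdev

def recency_weights(years, half_life=2.0):
    """Down-weight older cases exponentially (assumed scheme, not the paper's)."""
    newest = max(years)
    return [0.5 ** ((newest - y) / half_life) for y in years]

def z_transform(values, ref=None):
    """Standardize values with the mean and SD of a reference sample."""
    ref = values if ref is None else ref
    mu, sigma = mean(ref), pstdev(ref)
    if sigma == 0:
        raise ValueError("reference sample has zero variance")
    return [(v - mu) / sigma for v in values]

# Toy data: year of surgery and patient age for four hypothetical cases.
years = [2014, 2016, 2018, 2019]
ages = [40.0, 50.0, 60.0, 70.0]

weights = recency_weights(years)            # (1) older data weighted weaker
recent = [y for y in years if y >= 2018]    # (2) most recent data only (assumed cutoff)
z = z_transform(ages)                       # (3) z-transformed numeric input
```

Passing a separate `ref` sample to `z_transform` would standardize new data with training-set statistics; whether the authors did this is not stated in the abstract.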

RESULTS

The models produced showed excellent areas under receiver-operating characteristic and acceptable precision-recall curves when tested on a dataset from January-March 2020, but significant degradation when tested on a dataset collected in the first wave of the COVID pandemic from April-May 2020. When comparing the probability distributions of the input parameters, significant differences between pre-pandemic and in-pandemic data were found. The endpoint of our models, in-hospital mortality after surgery, did not differ significantly between pre- and in-pandemic data and was about 1% in each case. However, the models varied considerably in the composition of their input parameters. None of our applied modifications prevented a loss of performance, although very different models emerged from it, using a large variety of parameters.

CONCLUSIONS

Our results show that none of our tested easy-to-implement measures in model training can prevent deterioration in the case of sudden external events. Therefore, we conclude that, in the presence of concept drift and covariate shift, close monitoring and critical review of model predictions are necessary.

Hussain W, Mabrok M, Gao H, Rabhi FA, Rashed EA. Revolutionising healthcare with artificial intelligence: A bibliometric analysis of 40 years of progress in health systems. Digit Health 2024;10:20552076241258757. [PMID: 38817839 PMCID: PMC11138196 DOI: 10.1177/20552076241258757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/06/2024] [Accepted: 05/14/2024] [Indexed: 06/01/2024] [Open Access]

Tozzi AE, Croci I, Voicu P, Dotta F, Colafati GS, Carai A, Fabozzi F, Lacanna G, Premuselli R, Mastronuzzi A. A systematic review of data sources for artificial intelligence applications in pediatric brain tumors in Europe: implications for bias and generalizability. Front Oncol 2023;13:1285775. [PMID: 38016063 PMCID: PMC10646175 DOI: 10.3389/fonc.2023.1285775] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Accepted: 10/16/2023] [Indexed: 11/30/2023] [Open Access]

van Breugel M, Fehrmann RSN, Bügel M, Rezwan FI, Holloway JW, Nawijn MC, Fontanella S, Custovic A, Koppelman GH. Current state and prospects of artificial intelligence in allergy. Allergy 2023;78:2623-2643. [PMID: 37584170 DOI: 10.1111/all.15849] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Revised: 07/08/2023] [Accepted: 07/31/2023] [Indexed: 08/17/2023]

Affiliation(s)

Merlijn van Breugel: Department of Pediatric Pulmonology and Pediatric Allergology, Beatrix Children's Hospital, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; MIcompany, Amsterdam, the Netherlands
Rudolf S N Fehrmann: Department of Medical Oncology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
Marnix Bügel: MIcompany, Amsterdam, the Netherlands
Faisal I Rezwan: Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK; Department of Computer Science, Aberystwyth University, Aberystwyth, UK
John W Holloway: Human Development and Health, Faculty of Medicine, University of Southampton, Southampton, UK; National Institute for Health and Care Research Southampton Biomedical Research Centre, University Hospitals Southampton NHS Foundation Trust, Southampton, UK
Martijn C Nawijn: Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; Department of Pathology and Medical Biology, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands
Sara Fontanella: National Heart and Lung Institute, Imperial College London, London, UK; National Institute for Health and Care Research Imperial Biomedical Research Centre (BRC), London, UK
Adnan Custovic: National Heart and Lung Institute, Imperial College London, London, UK; National Institute for Health and Care Research Imperial Biomedical Research Centre (BRC), London, UK
Gerard H Koppelman: Department of Pediatric Pulmonology and Pediatric Allergology, Beatrix Children's Hospital, University Medical Center Groningen, University of Groningen, Groningen, the Netherlands; Groningen Research Institute for Asthma and COPD (GRIAC), University Medical Center Groningen, University of Groningen, Groningen, the Netherlands

Rahmani K, Thapa R, Tsou P, Casie Chetty S, Barnes G, Lam C, Foon Tso C. Assessing the effects of data drift on the performance of machine learning models used in clinical sepsis prediction. Int J Med Inform 2023;173:104930. [PMID: 36893656 DOI: 10.1016/j.ijmedinf.2022.104930] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 05/20/2022] [Revised: 10/30/2022] [Accepted: 11/15/2022] [Indexed: 11/21/2022]

Abstract

BACKGROUND

Data drift can negatively impact the performance of machine learning algorithms (MLAs) that were trained on historical data. As such, MLAs should be continuously monitored and tuned to overcome the systematic changes that occur in the distribution of data. In this paper, we study the extent of data drift and provide insights about its characteristics for sepsis onset prediction. This study will help elucidate the nature of data drift for prediction of sepsis and similar diseases. This may aid with the development of more effective patient monitoring systems that can stratify risk for dynamic disease states in hospitals.

METHODS

We devise a series of simulations that measure the effects of data drift in patients with sepsis, using electronic health records (EHR). We simulate multiple scenarios in which data drift may occur, namely the change in the distribution of the predictor variables (covariate shift), the change in the statistical relationship between the predictors and the target (concept shift), and the occurrence of a major healthcare event (major event) such as the COVID-19 pandemic. We measure the impact of data drift on model performances, identify the circumstances that necessitate model retraining, and compare the effects of different retraining methodologies and model architecture on the outcomes. We present the results for two different MLAs, eXtreme Gradient Boosting (XGB) and Recurrent Neural Network (RNN).

RESULTS

Our results show that the properly retrained XGB models outperform the baseline models in all simulation scenarios, hence signifying the existence of data drift. In the major event scenario, the area under the receiver operating characteristic curve (AUROC) at the end of the simulation period is 0.811 for the baseline XGB model and 0.868 for the retrained XGB model. In the covariate shift scenario, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.853 and 0.874 respectively. In the concept shift scenario and under the mixed labeling method, the retrained XGB models perform worse than the baseline model for most simulation steps. However, under the full relabeling method, the AUROC at the end of the simulation period for the baseline and retrained XGB models is 0.852 and 0.877 respectively. The results for the RNN models were mixed, suggesting that retraining based on a fixed network architecture may be inadequate for an RNN. We also present the results in the form of other performance metrics such as the ratio of observed to expected probabilities (calibration) and the normalized rate of positive predictive values (PPV) by prevalence, referred to as lift, at a sensitivity of 0.8.
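The two secondary metrics named above, the observed-to-expected ratio and lift at a fixed sensitivity, can be sketched as follows. This is a minimal illustration based only on the definitions given in the abstract (lift as PPV divided by prevalence, read at a sensitivity of 0.8); the function names, the tie handling, and the toy data are assumptions, and the authors' exact computation may differ.

```python
def observed_expected_ratio(y_true, y_prob):
    """Observed event rate divided by the mean predicted probability."""
    observed = sum(y_true) / len(y_true)
    expected = sum(y_prob) / len(y_prob)
    return observed / expected

def lift_at_sensitivity(y_true, y_prob, target_sensitivity=0.8):
    """PPV / prevalence at the highest score cutoff reaching the target sensitivity."""
    positives = sum(y_true)
    if positives == 0:
        raise ValueError("no positive cases")
    prevalence = positives / len(y_true)
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: pair[0], reverse=True)
    tp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        # Only evaluate a cutoff once all cases sharing this score are flagged.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie and tp / positives >= target_sensitivity:
            return (tp / (i + 1)) / prevalence
    raise ValueError("target_sensitivity must be at most 1")

# Toy data: ten hypothetical patients, five with the outcome.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.6, 0.2, 0.5, 0.4, 0.3, 0.1, 0.05]

oe = observed_expected_ratio(y_true, y_prob)
lift = lift_at_sensitivity(y_true, y_prob)
```

An observed-to-expected ratio above 1 means the model under-predicts risk on average, and a lift of 2 means flagged patients are twice as likely to have the outcome as the cohort overall.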

CONCLUSION

Our simulations reveal that retraining periods of a couple of months or using several thousand patients are likely to be adequate to monitor machine learning models that predict sepsis. This indicates that a machine learning system for sepsis prediction will probably need less infrastructure for performance monitoring and retraining compared to other applications in which data drift is more frequent and continuous. Our results also show that in the event of a concept shift, a full overhaul of the sepsis prediction model may be necessary because it indicates a discrete change in the definition of sepsis labels, and mixing the labels for the sake of incremental training may not produce the desired results.

Andonov DI, Ulm B, Graessner M, Podtschaske A, Blobner M, Jungwirth B, Kagerbauer SM. Impact of the Covid-19 pandemic on the performance of machine learning algorithms for predicting perioperative mortality. BMC Med Inform Decis Mak 2023;23:67. [PMID: 37046259 PMCID: PMC10092913 DOI: 10.1186/s12911-023-02151-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/06/2022] [Accepted: 03/15/2023] [Indexed: 04/14/2023] [Open Access]

Abstract

BACKGROUND

Machine-learning models are susceptible to external influences which can result in performance deterioration. The aim of our study was to elucidate the impact of a sudden shift in covariates, like the one caused by the Covid-19 pandemic, on model performance.

METHODS

After ethical approval and registration in Clinical Trials (NCT04092933, initial release 17/09/2019), we developed different models for the prediction of perioperative mortality based on preoperative data: one for the pre-pandemic data period until March 2020, one including data before the pandemic and from the first wave until May 2020, and one that covers the complete period before and during the pandemic until October 2021. We applied XGBoost as well as a Deep Learning neural network (DL). Performance metrics of each model during the different pandemic phases were determined, and XGBoost models were analysed for changes in feature importance.

RESULTS

XGBoost and DL provided similar performance on the pre-pandemic data with respect to area under receiver operating characteristic (AUROC, 0.951 vs. 0.942) and area under precision-recall curve (AUPR, 0.144 vs. 0.187). Validation in patient cohorts of the different pandemic waves showed high fluctuations in performance from both AUROC and AUPR for DL, whereas the XGBoost models seemed more stable. Changes in variable frequencies with onset of the pandemic were visible in age, ASA score, and the higher proportion of emergency operations, among others. Age consistently showed the highest information gain. Models based on pre-pandemic data performed worse during the first pandemic wave (AUROC 0.914 for XGBoost and DL) whereas models augmented with data from the first wave lacked performance after the first wave (AUROC 0.907 for XGBoost and 0.747 for DL). The deterioration was also visible in AUPR, which worsened by over 50% in both XGBoost and DL in the first phase after re-training.
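For readers less familiar with the two metrics reported above, the following sketch computes AUROC as the fraction of correctly ordered positive-negative pairs, and AUPR using average precision, which is one common estimator of the area under the precision-recall curve. The abstract does not say which estimator the authors used, so the choice of average precision, the function names, and the toy data are assumptions.

```python
def auroc(y_true, y_score):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for s, y in zip(y_score, y_true) if y == 1]
    neg = [s for s, y in zip(y_score, y_true) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, y_score):
    """Mean of the precision values at the rank of each positive case (ties not merged)."""
    ranked = sorted(zip(y_score, y_true), key=lambda pair: pair[0], reverse=True)
    tp, total = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            total += tp / rank
    if tp == 0:
        raise ValueError("no positive cases")
    return total / tp

# Toy data: four hypothetical cases, two with the outcome.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]

roc = auroc(y_true, y_score)
ap = average_precision(y_true, y_score)
```

The pairwise AUROC is quadratic in the number of cases, which is fine for illustration but slow for cohorts of the size described in these studies.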

CONCLUSIONS

A sudden shift in data impacts model performance. Re-training the model with updated data may cause degradation in predictive accuracy if the changes are only transient. Too early re-training should therefore be avoided, and close model surveillance is necessary.

Mo D, Zheng Q, Xiao B, Li L. Predicting thalassemia using deep neural network based on red blood cell indices. Clin Chim Acta 2023;543:117329. [PMID: 37019327 DOI: 10.1016/j.cca.2023.117329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Revised: 03/11/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]

van Klaveren D, Zanos TP, Nelson J, Levy TJ, Park JG, Retel Helmrich IRA, Rietjens JAC, Basile MJ, Hajizadeh N, Lingsma HF, Kent DM. Prognostic models for COVID-19 needed updating to warrant transportability over time and space. BMC Med 2022;20:456. [PMID: 36424619 PMCID: PMC9686462 DOI: 10.1186/s12916-022-02651-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/22/2022] [Accepted: 11/04/2022] [Indexed: 11/25/2022] [Open Access]

Abstract

BACKGROUND

Supporting decisions for patients who present to the emergency department (ED) with COVID-19 requires accurate prognostication. We aimed to evaluate prognostic models for predicting outcomes in hospitalized patients with COVID-19, in different locations and across time.

METHODS

We included patients who presented to the ED with suspected COVID-19 and were admitted to 12 hospitals in the New York City (NYC) area and 4 large Dutch hospitals. We used second-wave patients who presented between September and December 2020 (2137 and 3252 in NYC and the Netherlands, respectively) to evaluate models that were developed on first-wave patients who presented between March and August 2020 (12,163 and 5831). We evaluated two prognostic models for in-hospital death: The Northwell COVID-19 Survival (NOCOS) model was developed on NYC data and the COVID Outcome Prediction in the Emergency Department (COPE) model was developed on Dutch data. These models were validated on subsequent second-wave data at the same site (temporal validation) and at the other site (geographic validation). We assessed model performance by the Area Under the receiver operating characteristic Curve (AUC), by the E-statistic, and by net benefit.
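Of the three performance measures listed above, net benefit is the easiest to show in a few lines. The sketch below uses the standard decision-curve definition, true positives per patient minus false positives per patient weighted by the odds of the risk threshold. The threshold of 0.2, the function name, and the toy data are assumptions; the abstract does not state which thresholds were evaluated. The E-statistic is not covered here because the abstract does not define how it was computed.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on every patient whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalized by the odds of the chosen threshold.
    return tp / n - (fp / n) * threshold / (1 - threshold)

# Toy data: ten hypothetical patients, two of whom died in hospital.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.6, 0.3, 0.4, 0.1, 0.1, 0.1, 0.05, 0.05, 0.05, 0.05]

nb_model = net_benefit(y_true, y_prob, threshold=0.2)
# Reference strategy: treat everyone, i.e. every predicted risk is 1.
nb_treat_all = net_benefit(y_true, [1.0] * len(y_true), threshold=0.2)
```

A model is only useful at a given threshold if its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero by definition.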

RESULTS

Twenty-eight-day mortality was considerably higher in the NYC first-wave data (21.0%), compared to the second-wave (10.1%) and the Dutch data (first wave 10.8%; second wave 10.0%). COPE discriminated well at temporal validation (AUC 0.82), with excellent calibration (E-statistic 0.8%). At geographic validation, discrimination was satisfactory (AUC 0.78), but with moderate over-prediction of mortality risk, particularly in higher-risk patients (E-statistic 2.9%). While discrimination was adequate when NOCOS was tested on second-wave NYC data (AUC 0.77), NOCOS systematically overestimated the mortality risk (E-statistic 5.1%). Discrimination in the Dutch data was good (AUC 0.81), but with over-prediction of risk, particularly in lower-risk patients (E-statistic 4.0%). Recalibration of COPE and NOCOS led to limited net benefit improvement in Dutch data, but to substantial net benefit improvement in NYC data.

CONCLUSIONS

NOCOS performed moderately worse than COPE, probably reflecting unique aspects of the early pandemic in NYC. Frequent updating of prognostic models is likely to be required for transportability over time and space during a dynamic pandemic.

Affiliation(s)

David van Klaveren: Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands; Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
Theodoros P Zanos: Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
Jason Nelson: Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
Todd J Levy: Institute of Bioelectronic Medicine, Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
Jinny G Park: Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA
Isabel R A Retel Helmrich: Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
Judith A C Rietjens: Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
Melissa J Basile: Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
Negin Hajizadeh: Division of Pulmonary Critical Care and Sleep Medicine, Department of Medicine, Donald and Barbara Zucker School of Medicine at Hofstra/Northwell Health, Hempstead, NY, USA
Hester F Lingsma: Department of Public Health, Erasmus MC University Medical Center Rotterdam, Dr. Molewaterplein 50, 3015 GE, Rotterdam, The Netherlands
David M Kent: Predictive Analytics and Comparative Effectiveness Center, Institute for Clinical Research and Health Policy Studies, Tufts Medical Center, Boston, USA

Shreve JT, Khanani SA, Haddad TC. Artificial Intelligence in Oncology: Current Capabilities, Future Opportunities, and Ethical Considerations. Am Soc Clin Oncol Educ Book 2022;42:1-10. [PMID: 35687826 DOI: 10.1200/edbk_350652] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 01/04/2023]

Abstract

The promise of highly personalized oncology care using artificial intelligence (AI) technologies has been forecasted since the emergence of the field. Cumulative advances across the science are bringing this promise to realization, including refinement of machine learning- and deep learning algorithms; expansion in the depth and variety of databases, including multiomics; and the decreased cost of massively parallelized computational power. Examples of successful clinical applications of AI can be found throughout the cancer continuum and in multidisciplinary practice, with computer vision-assisted image analysis in particular having several U.S. Food and Drug Administration-approved uses. Techniques with emerging clinical utility include whole blood multicancer detection from deep sequencing, virtual biopsies, natural language processing to infer health trajectories from medical notes, and advanced clinical decision support systems that combine genomics and clinomics. Substantial issues have delayed broad adoption, with data transparency and interpretability suffering from AI's "black box" mechanism, and intrinsic bias against underrepresented persons limiting the reproducibility of AI models and perpetuating health care disparities. Midfuture projections of AI maturation involve increasing a model's complexity by using multimodal data elements to better approximate an organic system. Far-future positing includes living databases that accumulate all aspects of a person's health into discrete data elements; this will fuel highly convoluted modeling that can tailor treatment selection, dose determination, surveillance modality and schedule, and more. The field of AI has had a historical dichotomy between its proponents and detractors. The successful development of recent applications, and continued investment in prospective validation that defines their impact on multilevel outcomes, has established a momentum of accelerated progress.
