Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Van Calster B, Steyerberg EW, Wynants L, van Smeden M. There is no such thing as a validated prediction model. BMC Med 2023;21:70. [PMID: 36829188 PMCID: PMC9951847 DOI: 10.1186/s12916-023-02779-w] [Citation(s) in RCA: 110] [Impact Index Per Article: 55.0] [Received: 10/13/2022] [Accepted: 02/10/2023] [Indexed: 02/26/2023] Open



Cited by Other Article(s)

Zhou Q, He R, Li H, Gu M. Development and validation of a nomogram to predict the risk of in-hospital MACE for emergence NSTE-ACS: A retrospective multicenter study based on the Chinese population. Int J Med Inform 2025;199:105884. [PMID: 40147416 DOI: 10.1016/j.ijmedinf.2025.105884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2024] [Revised: 03/04/2025] [Accepted: 03/19/2025] [Indexed: 03/29/2025]

Abstract

PURPOSE

Our study aims to develop and validate an effective prediction model for in-hospital major adverse cardiovascular events (MACE) in patients with emergency non-ST-elevation acute coronary syndrome (NSTE-ACS).

METHODS

We retrospectively collected data on NSTE-ACS patients from three tertiary hospitals in Chongqing. In-hospital MACE was the predicted outcome. Patients from one hospital were divided into a training set and an internal validation set at a ratio of 7:3. In addition, 662 patients from two other tertiary hospitals were used for external validation. Patient information, including demographics, laboratory test results and disease course records, was analysed. LASSO was used to identify the predictors and develop the model. The model was then visualized as a nomogram and subjected to both internal and external validation. The receiver operating characteristic curve, calibration curve and clinical decision curve analysis (DCA) were used to assess the model's discrimination, calibration and clinical applicability, respectively.

RESULTS

A total of 3,308 patients were included, 375 of whom developed in-hospital MACE. The LR model demonstrated that length of stay, neutrophils, myoglobin, NYHA, CCI, NT-proBNP, LVEF and respiratory failure were risk factors for in-hospital MACE in emergency NSTE-ACS patients. In the training set, the AUC was 0.860 (95% CI: 0.831-0.889). In external validation, the AUC was 0.855 (95% CI: 0.808-0.902), and both the calibration curve and DCA in the validation set also indicated stable predictive accuracy and clinical validity. Additionally, MACE risk can be calculated online via the web page (https://cocozhou99.shinyapps.io/DynNomapp/).

CONCLUSION

The prediction model we constructed has good predictive performance and can help healthcare professionals accurately assess the risk of in-hospital MACE in emergency NSTE-ACS patients.


Negoi I. Personalized surveillance in colorectal cancer: Integrating circulating tumor DNA and artificial intelligence into post-treatment follow-up. World J Gastroenterol 2025;31:106670. [DOI: 10.3748/wjg.v31.i18.106670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2025] [Revised: 04/07/2025] [Accepted: 04/18/2025] [Indexed: 05/13/2025] Open

Alfaraj SA, Kist JM, Groenwold RHH, Spruit M, Mook-Kanamori D, Vos RC. External validation of SCORE2-Diabetes in The Netherlands across various socioeconomic levels in native-Dutch and non-Dutch populations. Eur J Prev Cardiol 2025;32:555-563. [PMID: 39485827 DOI: 10.1093/eurjpc/zwae354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 07/15/2024] [Accepted: 10/17/2024] [Indexed: 11/03/2024]

Abstract

AIMS

Adults with type 2 diabetes have an increased risk of cardiovascular events (CVEs), the world's leading cause of mortality. The SCORE2-Diabetes model is a tool designed to estimate the 10-year risk of CVE specifically in individuals with type 2 diabetes. However, the performance of such models may vary across different demographic and socioeconomic groups, necessitating validation and assessment in diverse populations. This study aims to externally validate SCORE2-Diabetes and assess its performance across various socioeconomic and migration origins in The Netherlands.

METHODS AND RESULTS

We selected adults with type 2 diabetes, aged 40-79 years and without previous CVE from the Extramural LUMC Academic Network (ELAN) primary care data cohort from 2007 to 2023. ELAN data were linked with Statistics Netherlands registry data to obtain information about the country of origin and socioeconomic status (SES). Cardiovascular event was defined as myocardial infarction, stroke, or CV mortality. Non-CV mortality was considered a competing event. Analyses were stratified by sex, Dutch vs. other non-Dutch countries of origin, and quintiles of SES. Of the 26 544 included adults with type 2 diabetes, 2518 developed CVE. SCORE2-Diabetes showed strong predictive accuracy for CVE in the Dutch population [observed-to-expected ratio (OE) = 1.000, 95% CI = 0.990-1.008 for men, and OE = 1.050, 95% CI = 1.042-1.057 for women]. For non-Dutch individuals, the model underestimated CVE risk (OE = 1.121, 95% CI = 1.108-1.131 for men, and OE = 1.100, 95% CI = 1.092-1.111 for women). The model also underestimated the CVE risk (OE > 1) in low SES groups and overestimated the risk (OE < 1) in high SES groups. Discrimination was moderate across subgroups with c-indices between 0.6 and 0.7.
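
As a rough illustration of the observed-to-expected (OE) ratio reported here, the sketch below divides the number of observed events by the sum of predicted risks. The function name and the toy data are hypothetical and not taken from the study, and the sketch ignores competing events, which the study accounted for, so it is a simplification rather than the authors' method.

```python
def observed_expected_ratio(outcomes, predicted_risks):
    """Return observed events divided by expected events.

    outcomes: 0/1 indicators of whether each person had the event.
    predicted_risks: model-predicted probabilities for the same people.
    Simplifying assumption: no censoring and no competing events.
    """
    if len(outcomes) != len(predicted_risks):
        raise ValueError("outcomes and predicted_risks must have the same length")
    expected = sum(predicted_risks)
    if expected == 0:
        raise ValueError("sum of predicted risks is zero")
    return sum(outcomes) / expected


# Hypothetical toy data (illustration only).
toy_outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
toy_risks = [0.30, 0.10, 0.20, 0.40, 0.10, 0.20, 0.10, 0.50, 0.30, 0.30]
toy_oe = observed_expected_ratio(toy_outcomes, toy_risks)
print(f"OE ratio: {toy_oe:.2f}")
```

An OE above 1 means more events were observed than the model predicted (underestimation of risk); an OE below 1 means the opposite.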

CONCLUSION

SCORE2-Diabetes accurately predicted the risk of CVE in the Dutch population. However, it underpredicted the risk of CVE in the low SES groups and non-Dutch origins, while overpredicting the risk in high SES men and women. Additional clinical judgment must be considered when using SCORE2-Diabetes for different SES and countries of origin.

LAY SUMMARY

A new study validates the SCORE2-Diabetes model for predicting the 10-year risk of cardiovascular events in type 2 diabetes. The model showed strong accuracy for the Dutch population but underestimated the risk for low SES and non-Dutch groups. SCORE2-Diabetes should be used with extra caution across diverse subgroups.


van de Klundert J, Perez-Galarce F, Olivares M, Pengel L, de Weerd A. The comparative performance of models predicting patient and graft survival after kidney transplantation: A systematic review. Transplant Rev (Orlando) 2025;39:100934. [PMID: 40339177 DOI: 10.1016/j.trre.2025.100934] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2025] [Revised: 04/25/2025] [Accepted: 04/26/2025] [Indexed: 05/10/2025]

Siemens K, Hunt BJ, Tibby SM. In Response. Anesth Analg 2025;140:e60-e61. [PMID: 39977338 DOI: 10.1213/ane.0000000000007464] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2025]

Butt AL, Allan PG, Dang DD, Tanaka KA. Test Driving an Old Car on a New Road-The Need for Context-Specific Adaptations in Predictive Modeling. Anesth Analg 2025;140:e59-e60. [PMID: 39977344 DOI: 10.1213/ane.0000000000007463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2025]

Neal SR, Sturrock SS, Musorowegomo D, Gannon H, Zaman M, Cortina-Borja M, Le Doare K, Heys M, Chimhini G, Fitzgerald F. Clinical prediction models to diagnose neonatal sepsis in low-income and middle-income countries: a scoping review. BMJ Glob Health 2025;10:e017582. [PMID: 40204466 DOI: 10.1136/bmjgh-2024-017582] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Accepted: 02/26/2025] [Indexed: 04/11/2025] Open

Abstract

INTRODUCTION

Neonatal sepsis causes significant morbidity and mortality worldwide but is difficult to diagnose clinically. Clinical prediction models (CPMs) could improve diagnostic accuracy, facilitating earlier treatment for cases and avoiding antibiotic overuse. Neonates in low-income and middle-income countries (LMICs) are disproportionately affected by sepsis, yet no review has comprehensively synthesised evidence for CPMs validated in this setting.

METHODS

We performed a scoping review of CPMs to diagnose neonatal sepsis using Ovid MEDLINE, Ovid Embase, Scopus, Web of Science, Global Index Medicus and the Cochrane Library. The most recent searches were performed on 16 June 2024. We included studies published in English or Spanish that validated a new or existing CPM for neonatal sepsis in any healthcare setting in an LMIC. Studies were excluded if they validated a prognostic model or where data for neonates could not be separated from a larger paediatric population. Studies were selected by two independent reviewers and summarised by narrative synthesis.

RESULTS

From 4598 unique records, we included 82 studies validating 44 distinct models in 24 252 neonates. Most studies were set in neonatal intensive or special care units (n=64, 78%) in middle-income countries (n=81, 99%) and included neonates already suspected of sepsis (n=58, 71%). Only four studies (5%) were set in the WHO African region, and only one study included data from a low-income country. Two-thirds of CPMs (n=30) required laboratory parameters, and three-quarters (n=34) were only validated in one study.

CONCLUSION

Our review highlights several literature gaps, particularly a paucity of studies validating models in the lowest-income countries where neonatal sepsis is most prevalent, and models for the undifferentiated neonatal population that do not rely on laboratory tests. Furthermore, heterogeneity in study populations, definitions of sepsis and reporting of models inhibits meaningful comparison between studies and may hinder progress towards useful diagnostic tools.


Rysstad T, Grotle M, Traeger AC, Aasdahl L, Vigdal ØN, Aanesen F, Øiestad BE, Pripp AH, Wynne-Jones G, Dunn KM, Fors EA, Linton SJ, Tveter AT. Predicting prolonged work absence due to musculoskeletal disorders: development, validation, and clinical usefulness of prognostic prediction models. Int Arch Occup Environ Health 2025:10.1007/s00420-025-02129-8. [PMID: 40198330 DOI: 10.1007/s00420-025-02129-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2024] [Accepted: 02/11/2025] [Indexed: 04/10/2025]

Affiliation(s)

Tarjei Rysstad Department of Rehabilitation Science and Health Technology, Faculty of Health Sciences, Oslo Metropolitan University, St. Olavs Plass, P.O. Box 4, 0130, Oslo, Norway.
Margreth Grotle Department of Rehabilitation Science and Health Technology, Faculty of Health Sciences, Oslo Metropolitan University, St. Olavs Plass, P.O. Box 4, 0130, Oslo, Norway; Department of Research and Innovation, Division of Clinical Neuroscience, Oslo University Hospital, Oslo, Norway
Adrian C Traeger Institute for Musculoskeletal Health, The University of Sydney and Sydney Local Health District, Sydney, Australia; School of Public Health, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
Lene Aasdahl Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway; Unicare Helsefort Rehabilitation Centre, Rissa, Norway
Ørjan Nesse Vigdal Department of Rehabilitation Science and Health Technology, Faculty of Health Sciences, Oslo Metropolitan University, St. Olavs Plass, P.O. Box 4, 0130, Oslo, Norway
Fiona Aanesen National Institute of Occupational Health, Majorstuen, Oslo, Norway
Britt Elin Øiestad Department of Rehabilitation Science and Health Technology, Faculty of Health Sciences, Oslo Metropolitan University, St. Olavs Plass, P.O. Box 4, 0130, Oslo, Norway
Are Hugo Pripp Oslo Centre of Biostatistics and Epidemiology, Research Support Services, Oslo University Hospital, Oslo, Norway; Faculty of Health Sciences, Oslo Metropolitan University, Oslo, Norway
Gwenllian Wynne-Jones School of Medicine, Keele University, Staffordshire, UK
Kate M Dunn School of Medicine, Keele University, Staffordshire, UK
Egil A Fors Department of Public Health and Nursing, Faculty of Medicine and Health Sciences, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
Steven J Linton Department of Law, Psychology, and Social Work, Örebro University, Örebro, Sweden
Anne Therese Tveter Department of Rehabilitation Science and Health Technology, Faculty of Health Sciences, Oslo Metropolitan University, St. Olavs Plass, P.O. Box 4, 0130, Oslo, Norway; Center for Treatment of Rheumatic and Musculoskeletal Diseases (REMEDY), Diakonhjemmet Hospital, Oslo, Norway


Wang Z, Wang W, Sun C, Li J, Xie S, Xu J, Zou K, Jin Y, Yan S, Liao X, Kang Y, Coopersmith CM, Sun X. A methodological systematic review of validation and performance of sepsis real-time prediction models. NPJ Digit Med 2025;8:190. [PMID: 40189694 PMCID: PMC11973177 DOI: 10.1038/s41746-025-01587-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2024] [Accepted: 03/26/2025] [Indexed: 04/09/2025] Open

Affiliation(s)

Zichen Wang Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China; West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China
Wen Wang Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Che Sun Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Jili Li Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; West China School of Medicine, West China Hospital, Sichuan University, Chengdu, 610041, China
Shuangyi Xie Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Jiayue Xu Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Kang Zou Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China
Yinghui Jin Center for Evidence-Based and Translational Medicine, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
Siyu Yan Center for Evidence-Based and Translational Medicine, Zhongnan Hospital of Wuhan University, Wuhan, 430071, China
Xuelian Liao Department of Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, 610041, China
Yan Kang Department of Critical Care Medicine, West China Hospital, Sichuan University, Chengdu, 610041, China
Craig M Coopersmith Emory Critical Care Center and Department of Surgery, Emory University School of Medicine, Atlanta, GA, USA
Xin Sun Department of Critical Care Medicine, Chinese Evidence-based Medicine Center, West China Hospital, Sichuan University, Chengdu, 610041, China; NMPA Key Laboratory for Real World Data Research and Evaluation in Hainan, Chengdu, 610041, China; Sichuan Center of Technology Innovation for Real World Data, Chengdu, China; West China School of Public Health and West China Fourth Hospital, Sichuan University, Chengdu, China


Heesen P, Christ SM, Ciobanu-Caraus O, Kahraman A, Schelling G, Studer G, Bode-Lesniewska B, Fuchs B. Clinical prognostic models for sarcomas: a systematic review and critical appraisal of development and validation studies. Diagn Progn Res 2025;9:7. [PMID: 40189567 PMCID: PMC11974052 DOI: 10.1186/s41512-025-00186-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2024] [Accepted: 02/28/2025] [Indexed: 04/09/2025] Open

Abstract

BACKGROUND

Current clinical guidelines recommend the use of clinical prognostic models (CPMs) for therapeutic decision-making in sarcoma patients. However, the number and quality of developed and externally validated CPMs is unknown. Therefore, we aimed to describe and critically assess CPMs for sarcomas.

METHODS

We performed a systematic review including all studies describing the development and/or external validation of a CPM for sarcomas. We searched the databases MEDLINE, EMBASE, Cochrane Central, and Scopus from inception until June 7th, 2022. The risk of bias was assessed using the prediction model risk of bias assessment tool (PROBAST).

RESULTS

Seven thousand six hundred fifty-six records were screened, of which 145 studies were eventually included, developing 182 and externally validating 59 CPMs. The most frequently modeled type of sarcoma was osteosarcoma (43/182; 23.6%), and the most frequently predicted outcome was overall survival (81/182; 44.5%). The most used predictors were the patient's age (133/182; 73.1%) and tumor size (116/182; 63.7%). Univariable screening was used in 137 (75.3%) CPMs, and only 7 (3.9%) CPMs were developed using pre-specified predictors based on clinical knowledge or literature. The median c-statistic on the development dataset was 0.74 (interquartile range [IQR] 0.71-0.78). Calibration was reported for 142 CPMs (142/182; 78.0%). The median c-statistic of external validations was 0.72 (IQR 0.68-0.75). Calibration was reported for 46 out of 59 (78.0%) externally validated CPMs. We found 169 out of 241 (70.1%) CPMs to be at high risk of bias, mostly due to the high risk of bias in the analysis domain.

DISCUSSION

While various CPMs for sarcomas have been developed, the clinical utility of most of them is hindered by a high risk of bias and limited external validation. Future research should prioritise validating and updating existing well-developed CPMs over developing new ones to ensure reliable prognostic tools.

TRIAL REGISTRATION

PROSPERO CRD42022335222.


Rysavy MA. Challenges in making an evidence-based prognosis. Semin Perinatol 2025;49:152054. [PMID: 40404235 DOI: 10.1016/j.semperi.2025.152054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/06/2025] [Revised: 02/11/2025] [Accepted: 02/11/2025] [Indexed: 05/24/2025]

Schots BBS, Pizarro CS, Arends BKO, Oerlemans MIFJ, Ahmetagić D, van der Harst P, van Es R. Deep learning for electrocardiogram interpretation: Bench to bedside. Eur J Clin Invest 2025;55 Suppl 1:e70002. [PMID: 40191935 PMCID: PMC11973865 DOI: 10.1111/eci.70002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2024] [Accepted: 01/23/2025] [Indexed: 04/09/2025]

Tian CH, Liu LY, Huang YF, Yang HJ, Lai YY, Li CL, Gan D, Yang J. Clinical prediction models for in vitro fertilization outcomes: a systematic review, meta-analysis, and external validation. Hum Reprod 2025;40:633-646. [PMID: 39983753 DOI: 10.1093/humrep/deaf013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2024] [Revised: 12/16/2024] [Indexed: 02/23/2025] Open

Abstract

STUDY QUESTION

What is the best-performing model currently predicting live birth outcomes for IVF or ICSI?

SUMMARY ANSWER

Among the identified prognostic models, McLernon's post-treatment model outperforms other models in both the meta-analysis and external validation of a Chinese cohort.

WHAT IS KNOWN ALREADY

Numerous similar IVF prognostic models have been developed across different time periods using various predictors, so there is a need to summarize and evaluate them, given the lack of validated evidence distinguishing high-quality from low-quality prediction tools. However, meta-analyses and external validations assessing how well models predict live births are notably scarce in this field.

STUDY DESIGN, SIZE, DURATION

The researchers conducted a comprehensive literature review in PubMed, EMBASE, and Web of Science, using keywords related to prognostic models and IVF/ICSI live birth outcomes. The search included studies published up to 3 April 2024, and was limited to English language studies.

PARTICIPANTS/MATERIALS, SETTING, METHODS

The review included studies that developed or validated prognostic models for IVF live birth outcomes while providing clear reports on model characteristics. Researchers extracted and analysed the data in accordance with the guidelines outlined in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses and other model-related guidelines. The choice between fixed-effects and random-effects models in the meta-analysis was based on heterogeneity, assessed using the I² statistic and the Cochran Q test. Model performance was evaluated by assessing the area under the receiver operating characteristic curve (AUC) and the calibration plots reported in the studies.
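
The heterogeneity measures named in this paragraph can be sketched in a few lines. The example below computes an inverse-variance pooled estimate, the Cochran Q statistic and I² from per-study estimates and standard errors. The function name and the input values are hypothetical, and pooling AUCs on their raw scale is a simplifying assumption (AUCs are often transformed before pooling), so this is not presented as the authors' procedure.

```python
def cochran_q_and_i2(estimates, standard_errors):
    """Return (fixed-effect pooled estimate, Cochran Q, I-squared in percent)."""
    if len(estimates) != len(standard_errors) or len(estimates) < 2:
        raise ValueError("need at least two studies with matching inputs")
    # Inverse-variance weights: more precise studies count more.
    weights = [1.0 / se ** 2 for se in standard_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Q is the weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I-squared: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical AUCs and standard errors from two studies (illustration only).
pooled_auc, q_stat, i2_pct = cochran_q_and_i2([0.60, 0.70], [0.05, 0.05])
print(f"pooled={pooled_auc:.2f}, Q={q_stat:.2f}, I2={i2_pct:.0f}%")
```

A large I² is commonly taken as a reason to prefer a random-effects model, while a small I² is compatible with a fixed-effects model; the thresholds used in any given review are a matter of convention.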

MAIN RESULTS AND THE ROLE OF CHANCE

This review provides a comprehensive summary of data derived from 72 studies with an overall ROB of high or unclear. These studies contained a total of 132 predictors and 86 prognostic models, and then meta-analyses were performed for each of the five selected models. The total random effects of Templeton's, Nelson's, McLernon's pre-treatment and post-treatment model demonstrated AUCs of 0.65 (95% CI: 0.61-0.69), 0.63 (95% CI: 0.63-0.64), 0.67 (95% CI: 0.62-0.71), and 0.73 (95% CI: 0.71-0.75), respectively. The total fixed effects of the intelligent data analysis score (iDAScore) model estimated an AUC of 0.66 (95% CI: 0.63-0.68). The external validation of the initial four models in our cohort produced AUCs ranging from 0.53 to 0.58, and the calibration was confirmed through calibration plots.

LIMITATIONS, REASONS FOR CAUTION

While the focus on English-language studies and live birth outcomes may constrain the generalizability of the findings to diverse populations, this approach equips clinicians, who view live births as the ultimate objective, with more precise and actionable reference guidelines.

WIDER IMPLICATIONS OF THE FINDINGS

This study represents the first meta-analysis in the field of IVF prediction models, definitively confirming the superior performance of McLernon's post-treatment model. The conclusion is reinforced by independent validation from another perspective. Nevertheless, further investigation is warranted to develop new models and to externally validate existing high-performing models for prognostic accuracy in IVF outcomes.

STUDY FUNDING/COMPETING INTEREST(S)

This study was supported by the National Natural Science Foundation of China (Grant No. 82174517). The authors report no conflict of interest.

REGISTRATION NUMBER

2022 CRD42022312018.


Ai C, Song J, Yuan C, Xu G, Yang J, Lv T, Jin S, Wu H, Xiang B, Yang J. Prediction model of the T cell-mediated rejection after liver transplantation in children and adults: A case-controlled study. Int J Surg 2025;111:2827-2837. [PMID: 39878165 DOI: 10.1097/js9.0000000000002279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2024] [Accepted: 01/05/2025] [Indexed: 01/31/2025]

Stassen RC, Maas CCHM, Leong SP, Kashani-Sabet M, White RL, Pockaj BA, Zager JS, Schneebaum S, Vetto JT, Avisar E, Harrison Howard J, O’Donoghue C, Kosiorek H, van Akkooi ACJ, Verhoef C, van Klaveren D, Grünhagen DJ, Olofsson Bagge R. External validation of a model to predict recurrence-free and melanoma-specific survival for patients with melanoma after sentinel node biopsy. Br J Surg 2025;112:znaf037. [PMID: 40243383 PMCID: PMC12004364 DOI: 10.1093/bjs/znaf037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2024] [Revised: 12/11/2024] [Accepted: 01/26/2025] [Indexed: 04/18/2025]

Affiliation(s)

Robert C Stassen Department of Surgical Oncology, Erasmus Medical Centre Cancer Institute, Rotterdam, The Netherlands
Carolien C H M Maas Department of Public Health, Erasmus University Medical Centre, Rotterdam, The Netherlands
Stanley P Leong Department of Surgery, California Pacific Medical Center and Research Institute, San Francisco, California, USA
Mohammed Kashani-Sabet Department of Surgery, California Pacific Medical Center and Research Institute, San Francisco, California, USA
Richard L White Department of Surgery, Levine Cancer Institute, Carolinas Medical Center, Atrium Health, Charlotte, North Carolina, USA
Barbara A Pockaj Department of Surgery, Mayo Clinic, Phoenix, Arizona, USA
Jonathan S Zager Department of Cutaneous Oncology, Moffitt Cancer Center, Tampa, Florida, USA
Schlomo Schneebaum Department of Surgery, Tel-Aviv Sourasky Medical Center, Tel Aviv, Israel
John T Vetto Division of Surgical Oncology, Oregon Health & Science University, Portland, Oregon, USA
Eli Avisar Department of Surgery, Division of Surgical Oncology at University of Miami Miller School of Medicine, Miami, Florida, USA
J Harrison Howard Department of Surgery, University of South Alabama, Mobile, Alabama, USA
Cristina O’Donoghue Department of Surgery, Rush University Medical Center, Chicago, Illinois, USA
Heidi Kosiorek Department of Quantitative Health Sciences, Mayo Clinic Arizona, Scottsdale, Arizona, USA
Alexander C J van Akkooi Melanoma Institute Australia, University of Sydney, Sydney, New South Wales, Australia; Faculty of Medicine and Health, University of Sydney, Sydney, New South Wales, Australia; Department of Melanoma and Surgical Oncology, Royal Prince Alfred Hospital, Sydney, New South Wales, Australia
Cornelis Verhoef Department of Surgical Oncology, Erasmus Medical Centre Cancer Institute, Rotterdam, The Netherlands
David van Klaveren Department of Public Health, Erasmus University Medical Centre, Rotterdam, The Netherlands
Dirk J Grünhagen Department of Surgical Oncology, Erasmus Medical Centre Cancer Institute, Rotterdam, The Netherlands
Roger Olofsson Bagge Department of Surgery, Institute of Clinical Sciences, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden; Department of Surgery, Sahlgrenska University Hospital, Gothenburg, Sweden


Hartmann S, Dwyer D, Scott I, Wannan CMJ, Nguyen J, Lin A, Middeldorp CM, Wood SJ, Yung AR, McGorry PD, Nelson B, Clark SR. Dynamic Updating of Psychosis Prediction Models in Individuals at Ultra-High Risk of Psychosis. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging 2025:S2451-9022(25)00119-3. [PMID: 40158694 DOI: 10.1016/j.bpsc.2025.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2025] [Revised: 03/11/2025] [Accepted: 03/15/2025] [Indexed: 04/02/2025]

Affiliation(s)

Simon Hartmann Discipline of Psychiatry, Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia; Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia.
Dominic Dwyer Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Isabelle Scott Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Cassandra M J Wannan Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Josh Nguyen Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Ashleigh Lin School of Population and Global Health, The University of Western Australia, Perth, Western Australia, Australia
Christel M Middeldorp Child Health Research Center, University of Queensland, St Lucia, Brisbane, Queensland, Australia; Child and Youth Mental Health Service, Children's Health Queensland Hospital and Health Service, Brisbane, Queensland, Australia; Department of Child and Adolescent Psychiatry and Psychology, Amsterdam University Medical Center, Amsterdam Public Health Research Institute, Amsterdam, the Netherlands; Arkin Mental Health Care, Amsterdam, the Netherlands; Levvel, Academic Center for Child and Adolescent Psychiatry, Amsterdam, the Netherlands
Stephen J Wood Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia; School of Psychology, University of Birmingham, Edgbaston, United Kingdom
Alison R Yung Deakin University, Institute of Mental and Physical Health and Clinical Translation, Geelong, Victoria, Australia; School of Health Science, University of Manchester, Manchester, United Kingdom
Patrick D McGorry Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Barnaby Nelson Orygen, Melbourne, Victoria, Australia; Centre for Youth Mental Health, The University of Melbourne, Melbourne, Victoria, Australia
Scott R Clark Discipline of Psychiatry, Adelaide Medical School, The University of Adelaide, Adelaide, South Australia, Australia

Mansmann U, Ön BI. The validation of prediction models deserves more recognition. BMC Med 2025;23:166. [PMID: 40102914 PMCID: PMC11921473 DOI: 10.1186/s12916-025-03994-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Key Words] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 03/22/2024] [Accepted: 03/11/2025] [Indexed: 03/20/2025] Open

Smart MH, Lin JY, Layden BT, Eisenberg Y, Pickard AS, Sharp LK, Danielson KK, Kong A. Diabetes Screening in the Emergency Department: Development of a Predictive Model for Elevated Hemoglobin A1c. J Diabetes Res 2025;2025:8830658. [PMID: 40109952 PMCID: PMC11922610 DOI: 10.1155/jdr/8830658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 07/18/2024] [Accepted: 02/04/2025] [Indexed: 03/22/2025] Open

Abstract

Aims: We developed a prediction model for elevated hemoglobin A1c (HbA1c) among patients presenting to the emergency department (ED) at risk for diabetes to identify important factors that may influence follow-up patient care. Methods: Retrospective electronic health records data among patients screened for diabetes at the ED in May 2021 was used. The primary outcome was elevated HbA1c (≥ 5.7%). The data was divided into a derivation set (80%) and a test set (20%) stratified by elevated HbA1c. In the derivation set, we estimated the optimal significance level for backward elimination using a 10-fold cross-validation method. A final model was derived using the entire derivation set and validated on the test set. Performance statistics included C-statistic, sensitivity, specificity, predictive values, Hosmer-Lemeshow test, and Brier score. Results: There were 590 ED patients screened for diabetes in May 2021. The final model included nine variables: age, race/ethnicity, insurance, chief complaints of back pain and fever/chills, and a past medical history of obesity, hyperlipidemia, chronic obstructive pulmonary disease, and substance misuse. Adequate model discrimination (C-statistic = 0.75; sensitivity, specificity, and predictive values > 0.70), no evidence of model ill fit (Hosmer-Lemeshow test = 0.29), and moderate Brier score (0.21) suggest acceptable model performance. Conclusion: In addition to age, obesity, and hyperlipidemia, a history of substance misuse was identified as an important predictor of elevated HbA1c levels among patients screened for diabetes in the ED. Our findings suggest that substance misuse may be an important factor to consider when facilitating follow-up care for patients identified with prediabetes or diabetes in the ED and warrants further investigation. Future research efforts should also include external validation in larger samples of ED patients.
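The two headline performance statistics named in this abstract, the C-statistic and the Brier score, can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the toy data are hypothetical and are not taken from the study.

```python
from itertools import product


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def c_statistic(y_true, y_prob):
    """Share of (case, non-case) pairs in which the case has the higher
    predicted risk; tied predictions count as half a concordant pair."""
    cases = [p for y, p in zip(y_true, y_prob) if y == 1]
    controls = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    concordant = 0.0
    for p_case, p_control in product(cases, controls):
        if p_case > p_control:
            concordant += 1.0
        elif p_case == p_control:
            concordant += 0.5
    return concordant / (len(cases) * len(controls))


# Hypothetical toy data: observed outcomes and predicted risks.
y = [1, 0, 1, 0, 1, 0]
p = [0.8, 0.3, 0.6, 0.6, 0.9, 0.1]
print(round(c_statistic(y, p), 3), round(brier_score(y, p), 3))
```

The pairwise loop is quadratic in sample size, which is acceptable for a test set of roughly a hundred patients but would normally be replaced by a rank-based computation for larger data.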

Drebin HM, Kurtansky NR, Hosein S, Nadelmann E, Moy AP, Ariyan CE, Bello DM, Brady MS, Coit DG, Marchetti MA, Bartlett EK. Declining Clinical Utility of Tools for Predicting Sentinel Lymph Node Biopsy Status: A Single Institution Experience from 2000 to 2021. Ann Surg Oncol 2025;32:1463-1472. [PMID: 39681721 DOI: 10.1245/s10434-024-16698-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2024] [Accepted: 11/28/2024] [Indexed: 12/18/2024]

Nanki T, Yamaguchi T, Umetsu K, Tanabe R, Maeda N, Kanazawa M, Furuno Y, Matsuda S, Takemoto S, Asao K, Kamiuchi T. Development and validation of a prediction model for serious infections in rheumatoid arthritis patients treated with tocilizumab in Japan. Clin Rheumatol 2025;44:1081-1093. [PMID: 39918730 PMCID: PMC11865113 DOI: 10.1007/s10067-025-07328-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/27/2024] [Revised: 12/13/2024] [Accepted: 01/09/2025] [Indexed: 02/27/2025]

Madathil S, Dhouib M, Lelong Q, Bourassine A, Monsonego J. A multimodal deep learning model for cervical pre-cancers and cancers prediction: Development and internal validation study. Comput Biol Med 2025;186:109710. [PMID: 39847948 DOI: 10.1016/j.compbiomed.2025.109710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2024] [Revised: 10/10/2024] [Accepted: 01/15/2025] [Indexed: 01/25/2025]

van Leeuwen FD, Steyerberg EW, van Klaveren D, Wessler B, Kent DM, van Zwet EW. Instability of the AUROC of Clinical Prediction Models. Stat Med 2025;44:e70011. [PMID: 39921554 PMCID: PMC11806515 DOI: 10.1002/sim.70011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2024] [Revised: 12/04/2024] [Accepted: 01/18/2025] [Indexed: 02/10/2025]

Abstract

BACKGROUND

External validations are essential to assess the performance of a clinical prediction model (CPM) before deployment. Apart from model misspecification, differences in patient population, the standard of care, predictor definitions, and other factors also influence a model's discriminative ability, as commonly quantified by the AUC (or c-statistic). We aimed to quantify the variation in AUCs across sets of external validation studies and propose ways to adjust expectations of a model's performance in a new setting.

METHODS

The Tufts-PACE CPM Registry holds a collection of CPMs for prognosis in cardiovascular disease. We analyzed the AUC estimates of 469 CPMs with at least one external validation. Combined, these CPMs had a total of 1603 external validations reported in the literature. For each CPM and its associated set of validation studies, we performed a random-effects meta-analysis to estimate the between-study standard deviation τ among the AUCs. Since the majority of these meta-analyses have only a handful of validations, this leads to very poor estimates of τ. So, instead of focusing on a single CPM, we estimated a log-normal distribution of τ across all 469 CPMs. We then used this distribution as an empirical prior. We used cross-validation to compare this empirical Bayesian approach with frequentist fixed and random-effects meta-analyses.

RESULTS

The 469 CPMs included in our study had a median of 2 external validations with an IQR of [1-3]. The estimated distribution of τ had a mean of 0.055 and a standard deviation of 0.015. If τ = 0.05, then the 95% prediction interval for the AUC in a new setting has a width of at least ±0.1, no matter how many validations have been done. When there are fewer than 5 validations, which is typically the case, the usual frequentist methods grossly underestimate the uncertainty about the AUC in a new setting. Accounting for τ in a Bayesian approach achieved near nominal coverage.

CONCLUSION

Due to large heterogeneity among the validated AUC values of a CPM, there is great irreducible uncertainty in predicting the AUC in a new setting. This uncertainty is underestimated by existing methods. The proposed empirical Bayes approach addresses this problem which merits wide application in judging the validity of prediction models.
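The "at least ±0.1" figure in the Results can be checked with a short calculation. The sketch below assumes a normal approximation, in which the half-width of a 95% prediction interval is 1.96 times the square root of τ² plus the squared standard error of the pooled mean; the function name is hypothetical, and the authors' Bayesian method is not reproduced here.

```python
import math


def prediction_half_width(tau, se_mean=0.0, z=1.96):
    """Half-width of an approximate 95% prediction interval for the AUC in a
    new setting: z * sqrt(tau^2 + se_mean^2) (normal approximation)."""
    return z * math.sqrt(tau ** 2 + se_mean ** 2)


# Even with infinitely many validations (se_mean -> 0), the between-study
# standard deviation tau alone sets a floor on the interval width.
floor = prediction_half_width(tau=0.05)
with_few_studies = prediction_half_width(tau=0.05, se_mean=0.02)
print(round(floor, 3), round(with_few_studies, 3))
```

With only a handful of validations a t-based multiplier is often used in place of 1.96, which widens the interval further, so the value computed here is a lower bound.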

Nong P, Maurer E, Dwivedi R. The urgency of centering safety-net organizations in AI governance. NPJ Digit Med 2025;8:117. [PMID: 39984650 PMCID: PMC11845669 DOI: 10.1038/s41746-025-01479-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/28/2024] [Accepted: 01/24/2025] [Indexed: 02/23/2025] Open

Ling XC, Chen HSL, Yeh PH, Cheng YC, Huang CY, Shen SC, Lee YS. Deep Learning in Glaucoma Detection and Progression Prediction: A Systematic Review and Meta-Analysis. Biomedicines 2025;13:420. [PMID: 40002833 PMCID: PMC11852503 DOI: 10.3390/biomedicines13020420] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2024] [Revised: 12/21/2024] [Accepted: 02/06/2025] [Indexed: 02/27/2025] Open

Gill SS, Ponniah HS, Giersztein S, Anantharaj RM, Namireddy SR, Killilea J, Ramsay D, Salih A, Thavarajasingam A, Scurtu D, Jankovic D, Russo S, Kramer A, Thavarajasingam SG. The diagnostic and prognostic capability of artificial intelligence in spinal cord injury: A systematic review. BRAIN & SPINE 2025;5:104208. [PMID: 40027293 PMCID: PMC11871462 DOI: 10.1016/j.bas.2025.104208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/23/2024] [Revised: 01/20/2025] [Accepted: 02/04/2025] [Indexed: 03/05/2025]

Van den Eynde R, Vrancken A, Foubert R, Tuand K, Vandendriessche T, Schrijvers A, Verbrugghe P, Devos T, Van Calster B, Rex S. Prognostic models for prediction of perioperative allogeneic red blood cell transfusion in adult cardiac surgery: A systematic review and meta-analysis. Transfusion 2025;65:397-409. [PMID: 39726297 PMCID: PMC11826302 DOI: 10.1111/trf.18108] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2024] [Revised: 12/04/2024] [Accepted: 12/04/2024] [Indexed: 12/28/2024]

Abstract

OBJECTIVES

Identifying cardiac surgical patients at risk of requiring red blood cell (RBC) transfusion is crucial for optimizing their outcome. We critically appraised prognostic models preoperatively predicting perioperative exposure to RBC transfusion in adult cardiac surgery and summarized model performance.

METHODS

Design: Systematic review and meta-analysis.

STUDY ELIGIBILITY CRITERIA

Studies developing and/or externally validating models preoperatively predicting perioperative RBC transfusion in adult cardiac surgery. Information sources: MEDLINE, CENTRAL & CDSR, Embase, Transfusion Evidence Library, Web of Science, Scopus, ClinicalTrials.gov, and WHO ICTRP. Risk of bias and applicability: Quality of reporting was assessed with the Transparent Reporting of studies on prediction models for Individual Prognosis or Diagnosis adherence form, and risk of bias and applicability with the Prediction model Risk of Bias ASsessment Tool.

SYNTHESIS METHODS

Random-effects meta-analyses of concordance-statistics and total observed:expected ratios for models externally validated ≥5 times.

RESULTS

Nine model development and 27 external validation studies were included. The average TRIPOD adherence score was 66.4% (range 44.1%-85.2%). All studies but 1 were rated high risk of bias. For TRUST and TRACK, the only models externally validated ≥5 times, summary c-statistics were 0.74 (95% CI: 0.65-0.84; 6 contributing studies) and 0.72 (95% CI: 0.68-0.75; 5 contributing studies), respectively, and summary total observed:expected ratios were 0.86 (95% CI: 0.71-1.05; 5 contributing studies) and 0.94 (95% CI: 0.74-1.19; 5 contributing studies), respectively. Considerable heterogeneity was observed in all meta-analyses.

DISCUSSION

Future high-quality external validation and model updating studies that strictly adhere to reporting guidelines are warranted.
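The random-effects pooling described under Synthesis Methods can be illustrated as follows. The abstract does not say which estimator was used; this sketch swaps in the common DerSimonian-Laird moment estimator and pools on the raw scale, whereas c-statistics are often pooled on the logit scale in practice. The function name and the input values are hypothetical, not data from the review.

```python
import math


def dersimonian_laird(estimates, std_errors):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2.

    Returns (pooled_mean, pooled_se, tau_squared).
    """
    k = len(estimates)
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]                     # fixed-effect weights
    mean_fe = sum(wi * yi for wi, yi in zip(w, estimates)) / sum(w)
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - mean_fe) ** 2 for wi, yi in zip(w, estimates))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                   # truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]         # random-effects weights
    mean_re = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_re = math.sqrt(1.0 / sum(w_re))
    return mean_re, se_re, tau2


# Hypothetical c-statistics and standard errors from three validations.
pooled, se, tau2 = dersimonian_laird([0.70, 0.74, 0.78], [0.02, 0.02, 0.02])
print(round(pooled, 3), round(se, 4), round(tau2, 4))
```

A positive `tau2` widens the pooled standard error relative to a fixed-effect analysis, which is one way the "considerable heterogeneity" reported above shows up in the summary intervals.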

van der Meijden SL, van Boekel AM, Schinkelshoek LJ, van Goor H, Steyerberg EW, Nelissen RG, Mesotten D, Geerts BF, de Boer MG, Arbous MS. Development and validation of artificial intelligence models for early detection of postoperative infections (PERISCOPE): a multicentre study using electronic health record data. THE LANCET REGIONAL HEALTH. EUROPE 2025;49:101163. [PMID: 39720095 PMCID: PMC11667051 DOI: 10.1016/j.lanepe.2024.101163] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 08/13/2024] [Revised: 11/20/2024] [Accepted: 11/21/2024] [Indexed: 12/26/2024]

Abstract

Background

Postoperative infections significantly impact patient outcomes and costs, exacerbated by late diagnoses, yet early reliable predictors are scarce. Existing artificial intelligence (AI) models for postoperative infection prediction often lack external validation or perform poorly in local settings when validated. We aimed to develop locally valid models as part of the PERISCOPE AI system to enable early detection, safer discharge, and more timely treatment of patients.

Methods

We developed and validated XGBoost models to predict postoperative infections within 7 and 30 days of surgery. Using retrospective pre-operative and intra-operative electronic health record data from 2014 to 2023 across various surgical specialities, the models were developed at Hospital A and validated and updated at Hospitals B and C in the Netherlands and Belgium. Model performance was evaluated before and after updating using the two most recent years of data as temporal validation datasets. Main outcome measures were model discrimination (area under the receiver operating characteristic curve (AUROC)), calibration (slope, intercept, and plots), and clinical utility (decision curve analysis with net benefit).

Findings

The study included 253,010 surgical procedures with 23,903 infections within 30-days. Discriminative performance, calibration properties, and clinical utility significantly improved after updating. Final AUROCs after updating for Hospitals A, B, and C were 0.82 (95% confidence interval (CI) 0.81-0.83), 0.82 (95% CI 0.81-0.83), and 0.91 (95% CI 0.90-0.91) respectively for 30-day predictions on the temporal validation datasets (2022-2023). Calibration plots demonstrated adequate correspondence between observed outcomes and predicted risk. All local models were deemed clinically useful as the net benefit was higher than default strategies (treat all and treat none) over a wide range of clinically relevant decision thresholds.

Interpretation

PERISCOPE can accurately predict overall postoperative infections within 7- and 30-days post-surgery. The robust performance implies potential for improving clinical care in diverse clinical target populations. This study supports the need for approaches to local updating of AI models to account for domain shifts in patient populations and data distributions across different clinical settings.

Funding

This study was funded by a REACT EU grant from European Regional Development Fund (ERDF) and Kansen voor West.
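The decision curve analysis referred to in the Methods and Findings compares a model's net benefit with the default strategies of treating all or treating none. A minimal sketch of the standard net benefit formula, TP/n − FP/n × pt/(1 − pt), is given below; the function names and toy data are hypothetical and unrelated to the PERISCOPE data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening when predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # harm-to-benefit weight
    return tp / n - fp / n * odds


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of intervening in everyone (every patient is flagged)."""
    prevalence = sum(y_true) / len(y_true)
    odds = threshold / (1.0 - threshold)
    return prevalence - (1.0 - prevalence) * odds


# Hypothetical outcomes and predicted risks; treat-none has net benefit 0.
y = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
p = [0.9, 0.2, 0.1, 0.7, 0.4, 0.05, 0.3, 0.6, 0.15, 0.1]
for pt in (0.1, 0.2, 0.3):
    print(pt, round(net_benefit(y, p, pt), 3),
          round(net_benefit_treat_all(y, pt), 3))
```

A model is usually judged clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the comparison the abstract reports.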

Memedovich A, Steele B, Orr T, Chaudhry S, Tadrous M, Kesselheim AS, Hollis A, Beall RF. Predicting patent challenges for small-molecule drugs: A cross-sectional study. PLoS Med 2025;22:e1004540. [PMID: 39937776 PMCID: PMC11867330 DOI: 10.1371/journal.pmed.1004540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/04/2024] [Revised: 02/27/2025] [Accepted: 01/22/2025] [Indexed: 02/14/2025] Open

Abstract

BACKGROUND

The high cost of prescription drugs in the United States is maintained by brand-name manufacturers' competition-free period made possible in part through patent protection, which generic competitors must challenge to enter the market early. Understanding the predictors of these challenges can inform policy development to encourage timely generic competition. Identifying categories of drugs systematically overlooked by challengers, such as those with low market size, highlights gaps where unchecked patent quality and high prices persist, and can help design policy interventions to help promote timely patient access to generic drugs including enhanced patent scrutiny or incentives for challenges. Our objective was to characterize and assess the extent to which market size and other drug characteristics can predict patent challenges for brand-name drugs.

METHODS AND FINDINGS

This cross-sectional study included new patented small-molecule drugs approved by the FDA from 2007 to 2018. Market size, patent, and patent challenge data came from IQVIA MIDAS pharmaceutical quarterly sales data, the FDA's Orange Book database, and the FDA's Paragraph IV list. Predictive models were constructed using random forest and elastic net classification. The primary outcome was the occurrence of a patent challenge within the first year of eligibility. Of the 210 new small-molecule drugs included in the sample, 55% experienced initiation of patent challenge within the first year of eligibility. Market value was the most important predictor variable, with larger markets being more likely to be associated with patent challenges. Drugs in the anti-infective therapeutic class or those with fast-track approval were less likely to be challenged. The limitations of this work arise from the exclusion of variables that were not readily available publicly, will be the target of future research, or were deemed beyond the scope of this project.

CONCLUSIONS

Generic competition does not occur with the same timeliness across all drug markets, which can leave granted patents of questionable merit in place and sustain high brand-name drug prices. Predictive models may help direct limited resources for post-grant patent validity review and adjust policy when generic competition is lacking.

Shamsutdinova D, Stamate D, Stahl D. Balancing accuracy and Interpretability: An R package assessing complex relationships beyond the Cox model and applications to clinical prediction. Int J Med Inform 2025;194:105700. [PMID: 39546831 DOI: 10.1016/j.ijmedinf.2024.105700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2024] [Accepted: 11/08/2024] [Indexed: 11/17/2024]

Abstract

BACKGROUND

Accurate and interpretable models are essential for clinical decision-making, where predictions can directly impact patient care. Machine learning (ML) survival methods can handle complex multidimensional data and achieve high accuracy but require post-hoc explanations. Traditional models such as the Cox Proportional Hazards Model (Cox-PH) are less flexible, but fast, stable, and intrinsically transparent. Moreover, ML does not always outperform Cox-PH in clinical settings, which warrants diligent model validation. We aimed to develop a set of R functions to help explore the limits of Cox-PH compared to the tree-based and deep learning survival models for clinical prediction modelling, employing ensemble learning and nested cross-validation.

METHODS

We developed a set of R functions, publicly available as the package "survcompare". It supports Cox-PH and Cox-Lasso as traditional models, Survival Random Forest (SRF) and DeepHit as ML alternatives, and ensemble methods that integrate Cox-PH with SRF or DeepHit and are designed to isolate the marginal value of ML. The package performs a repeated nested cross-validation and tests for statistical significance of the ML's superiority using survival-specific performance metrics: the concordance index, time-dependent AUC-ROC, and calibration slope. To get practical insights, we applied this methodology to clinical and simulated datasets with varying complexities and sizes.

RESULTS

In simulated data with non-linearities or interactions, ML models outperformed Cox-PH at sample sizes ≥ 500. ML superiority was also observed in imaging and high-dimensional clinical data. However, for tabular clinical data, the performance gains of ML were minimal; in some cases, regularised Cox-Lasso recovered much of the ML's performance advantage with significantly faster computations. Ensemble methods combining Cox-PH and ML predictions were instrumental in quantifying Cox-PH's limits and improving ML calibration. Traditional models like Cox-PH or Cox-Lasso should not be overlooked while developing clinical predictive models from tabular data or data of limited size.

CONCLUSION

Our package offers researchers a framework and practical tool for evaluating the accuracy-interpretability trade-off, helping make informed decisions about model selection.

Meijerink LM, Dunias ZS, Leeuwenberg AM, de Hond AAH, Jenkins DA, Martin GP, Sperrin M, Peek N, Spijker R, Hooft L, Moons KGM, van Smeden M, Schuit E. Updating methods for artificial intelligence-based clinical prediction models: a scoping review. J Clin Epidemiol 2025;178:111636. [PMID: 39662644 DOI: 10.1016/j.jclinepi.2024.111636] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2024] [Revised: 12/02/2024] [Accepted: 12/03/2024] [Indexed: 12/13/2024]

Abstract

OBJECTIVES

To give an overview of methods for updating artificial intelligence (AI)-based clinical prediction models based on new data.

STUDY DESIGN AND SETTING

We comprehensively searched Scopus and Embase up to August 2022 for articles that addressed developments, descriptions, or evaluations of prediction model updating methods. We specifically focused on articles in the medical domain involving AI-based prediction models that were updated based on new data, excluding regression-based updating methods as these have been extensively discussed elsewhere. We categorized and described the identified methods used to update the AI-based prediction model as well as the use cases in which they were used.

RESULTS

We included 78 articles. The majority of the included articles discussed updating for neural network methods (93.6%) with medical images as input data (65.4%). In many articles (51.3%) existing, pretrained models for broad tasks were updated to perform specialized clinical tasks. Other common reasons for model updating were to address changes in the data over time and cross-center differences; however, more unique use cases were also identified, such as updating a model from a broad population to a specific individual. We categorized the identified model updating methods into four categories: neural network-specific methods (described in 92.3% of the articles), ensemble-specific methods (2.5%), model-agnostic methods (9.0%), and other (1.3%). Variations of neural network-specific methods are further categorized based on the following: (1) the part of the original neural network that is kept, (2) whether and how the original neural network is extended with new parameters, and (3) to what extent the original neural network parameters are adjusted to the new data. The most frequently occurring method (n = 30) involved selecting the first layer(s) of an existing neural network, appending new, randomly initialized layers, and then optimizing the entire neural network.

CONCLUSION

We identified many ways to adjust or update AI-based prediction models based on new data, within a large variety of use cases. Updating methods for AI-based prediction models other than neural networks (eg, random forest) appear to be underexplored in clinical prediction research.

PLAIN LANGUAGE SUMMARY

AI-based prediction models are increasingly used in health care, helping clinicians with diagnosing diseases, guiding treatment decisions, and informing patients. However, these prediction models do not always work well when applied to hospitals, patient populations, or times different from those used to develop the models. Developing new models for every situation is neither practical nor desired, as it wastes resources, time, and existing knowledge. A more efficient approach is to adjust existing models to new contexts ('updating'), but there is limited guidance on how to do this for AI-based clinical prediction models. To address this, we reviewed 78 studies in detail to understand how researchers are currently updating AI-based clinical prediction models, and the types of situations in which these updating methods are used. Our findings provide a comprehensive overview of the available methods to update existing models. This is intended to serve as guidance and inspiration for researchers. Ultimately, this can lead to better reuse of existing models and improve the quality and efficiency of AI-based prediction models in health care.

Affiliation(s)

Lotta M Meijerink Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands.
Zoë S Dunias Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Artuur M Leeuwenberg Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Anne A H de Hond Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
David A Jenkins Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, United Kingdom
Glen P Martin Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, United Kingdom
Matthew Sperrin Division of Informatics, Imaging and Data Sciences, University of Manchester, Manchester, United Kingdom
Niels Peek Department of Public Health and Primary Care, The Healthcare Improvement Studies Institute, University of Cambridge, Cambridge, United Kingdom
René Spijker Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Lotty Hooft Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Karel G M Moons Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Maarten van Smeden Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands
Ewoud Schuit Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht University, Utrecht, The Netherlands

Bindels BJJ, Kuijten RH, Groot OQ, Huele EH, Gal R, de Groot MCH, van der Velden JM, Delawi D, Schwab JH, Verkooijen HM, Verlaan JJ, Tobert D, Rutges JPHJ. External validation of twelve existing survival prediction models for patients with spinal metastases. Spine J 2025:S1529-9430(25)00063-4. [PMID: 39894281 DOI: 10.1016/j.spinee.2025.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/29/2024] [Revised: 12/19/2024] [Accepted: 01/20/2025] [Indexed: 02/04/2025]

Abstract

BACKGROUND CONTEXT

Survival prediction models for patients with spinal metastases may inform patients and clinicians in shared decision-making.

PURPOSE

To externally validate all existing survival prediction models for patients with spinal metastases.

DESIGN

Prospective cohort study using retrospective data.

PATIENT SAMPLE

953 patients.

OUTCOME MEASURES

Survival in months, area under the curve (AUC), and calibration intercept and slope.

METHOD

This study included patients with spinal metastases referred to a single tertiary referral center between 2016 and 2021. Twelve models for predicting 3, 6, and 12-month survival were externally validated: Bollen, Mizumoto, Modified Bauer, New England Spinal Metastasis Score, Original Bauer, Oswestry Spinal Risk Index (OSRI), PathFx, Revised Katagiri, Revised Tokuhashi, Skeletal Oncology Research Group Machine Learning Algorithm (SORG-MLA), Tomita, and Van der Linden. Discrimination was assessed using the AUC and calibration using the intercept and slope. Calibration was considered appropriate if calibration measures were close to their ideal values with narrow confidence intervals.

RESULTS

In total, 953 patients were included. Survival was 76.4% at 3 months (728/953), 62.2% at 6 months (593/953), and 50.3% at 12 months (479/953). Revised Katagiri yielded AUCs of 0.79 (95% CI, 0.76-0.82) to 0.81 (95% CI, 0.79-0.84), Bollen yielded AUCs of 0.76 (95% CI, 0.73-0.80) to 0.77 (95% CI, 0.75-0.80), and OSRI yielded AUCs of 0.75 (95% CI, 0.72-0.78) to 0.77 (95% CI, 0.74-0.79). The other 9 prediction models yielded AUCs ranging from 0.59 (95% CI, 0.55-0.63) to 0.76 (95% CI, 0.74-0.79). None of the twelve models yielded appropriate calibration.

CONCLUSIONS

Twelve survival prediction models for patients with spinal metastases yielded poor to fair discrimination and poor calibration. Survival prediction models may inform decision-making in patients with spinal metastases, provided that recalibration using recent patient data is performed.
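The calibration intercept and slope used as outcome measures above can be sketched for a binary outcome at a fixed time point. The sketch assumes the usual definitions: the slope is the coefficient from a logistic regression of the outcome on the logit of the predicted risk, and the intercept comes from a logistic model with that logit as an offset (ideal values 1 and 0). The function names and toy data are hypothetical, and censoring is ignored.

```python
import math


def _logit(p):
    return math.log(p / (1.0 - p))


def _expit(x):
    return 1.0 / (1.0 + math.exp(-x))


def calibration_slope(y, p, iters=50):
    """Fit y ~ a + b * logit(p) by Newton-Raphson; return (a, b)."""
    lp = [_logit(pi) for pi in p]
    a, b = 0.0, 1.0
    for _ in range(iters):
        mu = [_expit(a + b * x) for x in lp]
        w = [m * (1.0 - m) for m in mu]
        g0 = sum(yi - m for yi, m in zip(y, mu))              # score for a
        g1 = sum((yi - m) * x for yi, m, x in zip(y, mu, lp))  # score for b
        h00 = sum(w)
        h01 = sum(wi * x for wi, x in zip(w, lp))
        h11 = sum(wi * x * x for wi, x in zip(w, lp))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b


def calibration_intercept(y, p, iters=50):
    """Fit y ~ a + offset(logit(p)) by Newton-Raphson; return a."""
    lp = [_logit(pi) for pi in p]
    a = 0.0
    for _ in range(iters):
        mu = [_expit(a + x) for x in lp]
        step = sum(yi - m for yi, m in zip(y, mu)) / sum(m * (1 - m) for m in mu)
        a += step
        if abs(step) < 1e-12:
            break
    return a


# Hypothetical overfitted model: predicts 10% and 90%, observed 20% and 80%.
y = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 2
p = [0.1] * 10 + [0.9] * 10
print(calibration_intercept(y, p), calibration_slope(y, p)[1])
```

A slope below 1, as in this toy example, indicates predictions that are too extreme; this is the kind of miscalibration that recalibration on recent patient data, as recommended in the Conclusions, is meant to correct.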

Affiliation(s)

B J J Bindels Department of Orthopedic Surgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
R H Kuijten Department of Orthopedic Surgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
O Q Groot Department of Orthopedic Surgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
E H Huele Division of Imaging and Oncology, Utrecht University, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
R Gal Division of Imaging and Oncology, Utrecht University, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
M C H de Groot Central Diagnostic Library, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
J M van der Velden Division of Imaging and Oncology, Utrecht University, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
D Delawi Department of Orthopedic Surgery, Antonius Medical Center, Koekoekslaan 1, 3435 CM, Nieuwegein, Utrecht, The Netherlands
J H Schwab Department of Orthopedic Surgery, Cedars-Sinai Medical Center, 8700 Beverly Boulevard, Los Angeles, CA, USA
H M Verkooijen Division of Imaging and Oncology, Utrecht University, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands; Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
J J Verlaan Department of Orthopedic Surgery, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands; Division of Imaging and Oncology, Utrecht University, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX, Utrecht, Utrecht, The Netherlands
D Tobert Department of Orthopedic Surgery, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA
J P H J Rutges Department of Orthopedics and Sports Medicine, Erasmus Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, Zuid-Holland, The Netherlands.

Tack B, Vita D, Mbuyamba J, Ntangu E, Vuvu H, Kahindo I, Ngina J, Luyindula A, Nama N, Mputu T, Im J, Jeon H, Marks F, Toelen J, Lunguya O, Jacobs J, Van Calster B. Developing a clinical prediction model to modify empirical antibiotics for non-typhoidal Salmonella bloodstream infection in children under-five in the Democratic Republic of Congo. BMC Infect Dis 2025;25:122. [PMID: 39871187 PMCID: PMC11771121 DOI: 10.1186/s12879-024-10319-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/13/2023] [Accepted: 12/05/2024] [Indexed: 01/29/2025] Open

Abstract

BACKGROUND

Non-typhoidal Salmonella (NTS) frequently cause bloodstream infection in children under-five in sub-Saharan Africa, particularly in malaria-endemic areas. Due to increasing drug resistance, NTS are often not covered by standard-of-care empirical antibiotics for severe febrile illness. We developed a clinical prediction model to orient the choice of empirical antibiotics (standard-of-care versus alternative antibiotics) for children admitted to hospital in settings with high proportions of drug-resistant NTS.

METHODS

Data were collected during a prospective cohort study in children (> 28 days-< 5 years) admitted with severe febrile illness to Kisantu district hospital, DR Congo. The outcome variable was blood culture confirmed NTS bloodstream infection; the comparison group were children without NTS bloodstream infection. Predictors were selected a priori based on systematic literature review. The prediction model was developed with multivariable logistic regression; a simplified scoring system was derived. Internal validation to estimate optimism-corrected performance was performed using bootstrapping and net benefits were calculated to evaluate clinical usefulness.

RESULTS

NTS bloodstream infection was diagnosed in 12.7% (295/2327) of enrolled children. The area under the curve was 0.79 (95%CI: 0.76-0.82) for the prediction model, and 0.78 (0.75-0.80) for the scoring system. The estimated calibration slopes were 0.95 (model) and 0.91 (scoring system). At a decision threshold of 20% NTS risk, the prediction model and scoring system had 57% and 53% sensitivity, and 85% specificity. The net benefit for decision thresholds < 30% ranged from 2.4 to 3.9 per 100 children.
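To make the net benefit figures above more concrete, the following minimal sketch applies the standard decision-curve formula (true positives minus false positives weighted by the threshold odds) to the rounded sensitivity, specificity, and prevalence quoted in this abstract. The function name and the use of rounded inputs are assumptions made for illustration; this is not the authors' analysis and is not expected to reproduce their reported values exactly.

```python
def net_benefit(sensitivity, specificity, prevalence, threshold):
    """Net benefit per patient of treating everyone with risk >= threshold.

    Standard decision-curve definition:
        NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    true_pos_rate = sensitivity * prevalence                    # TP / n
    false_pos_rate = (1.0 - specificity) * (1.0 - prevalence)   # FP / n
    threshold_odds = threshold / (1.0 - threshold)
    return true_pos_rate - false_pos_rate * threshold_odds


# Inputs taken from the abstract (rounded, so results are approximate).
prevalence = 295 / 2327   # NTS bloodstream infection among enrolled children
threshold = 0.20          # decision threshold of 20% NTS risk

nb_model = net_benefit(0.57, 0.85, prevalence, threshold)
nb_score = net_benefit(0.53, 0.85, prevalence, threshold)

print(f"prediction model: {100 * nb_model:.1f} net true positives per 100 children")
print(f"scoring system:   {100 * nb_score:.1f} net true positives per 100 children")
```

A net benefit of x per 100 children can be read as the equivalent of x additional true positives per 100 children with no increase in false positives, compared with treating no one with the alternative antibiotics.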

CONCLUSION

The model predicts NTS bloodstream infection and can support the choice of empiric antibiotics to include coverage of drug-resistant NTS, in particular for decision thresholds < 30%. External validation studies are needed to investigate generalizability.

TRIAL REGISTRATION

DeNTS study, clinicaltrials.gov: NCT04473768 (registration 16/07/2020) and TreNTS study, clinicaltrials.gov: NCT04850677 (registration 20/04/2021).

Affiliation(s)

Bieke Tack Department of Clinical Sciences, Institute of Tropical Medicine, Nationalestraat 155, 2000, Antwerp, Belgium. Department of Microbiology, Immunology and Transplantation, KU Leuven, Louvain, Belgium. Department of Pediatrics, University Hospitals Leuven, Louvain, Belgium.
Daniel Vita Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Jules Mbuyamba Department of Microbiology, Institut National de Recherche Biomédicale, Kinshasa, Democratic Republic of Congo; Department of Medical Biology, University Teaching Hospital of Kinshasa, Kinshasa, Democratic Republic of Congo
Emmanuel Ntangu Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Hornela Vuvu Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Immaculée Kahindo Department of Microbiology, Institut National de Recherche Biomédicale, Kinshasa, Democratic Republic of Congo
Japhet Ngina Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Aimée Luyindula Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Naomie Nama Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Tito Mputu Saint Luc Hôpital Général de Référence Kisantu, Kisantu, Democratic Republic of Congo
Justin Im International Vaccine Institute, Seoul, Republic of Korea
Hyonjin Jeon International Vaccine Institute, Seoul, Republic of Korea
Florian Marks International Vaccine Institute, Seoul, Republic of Korea; Cambridge Institute of Therapeutic Immunology and Infectious Disease, School of Clinical Medicine, University of Cambridge, Cambridge, UK; Heidelberg Institute of Global Health, University of Heidelberg, Heidelberg, Germany; Madagascar Institute for Vaccine Research, University of Antananarivo, Antananarivo, Madagascar
Jaan Toelen Department of Pediatrics, University Hospitals Leuven, Louvain, Belgium; Department of Development and Regeneration, KU Leuven, Louvain, Belgium
Octavie Lunguya Department of Microbiology, Institut National de Recherche Biomédicale, Kinshasa, Democratic Republic of Congo; Department of Medical Biology, University Teaching Hospital of Kinshasa, Kinshasa, Democratic Republic of Congo
Jan Jacobs Department of Clinical Sciences, Institute of Tropical Medicine, Nationalestraat 155, 2000, Antwerp, Belgium; Department of Microbiology, Immunology and Transplantation, KU Leuven, Louvain, Belgium
Ben Van Calster Department of Development and Regeneration, KU Leuven, Louvain, Belgium; Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, the Netherlands; EPI-Center, KU Leuven, Louvain, Belgium

Clift AK. How Outcome Prediction Could Aid Clinical Practice. Br J Hosp Med (Lond) 2025;86:1-6. [PMID: 39862035 DOI: 10.12968/hmed.2024.0781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2025]

Hillier B, Scandrett K, Coombe A, Hernandez-Boussard T, Steyerberg E, Takwoingi Y, Velickovic V, Dinnes J. Risk prediction tools for pressure injury occurrence: an umbrella review of systematic reviews reporting model development and validation methods. Diagn Progn Res 2025;9:2. [PMID: 39806510 PMCID: PMC11730812 DOI: 10.1186/s41512-024-00182-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/15/2024] [Accepted: 12/02/2024] [Indexed: 01/16/2025] Open

Abstract

BACKGROUND

Pressure injuries (PIs) place a substantial burden on healthcare systems worldwide. Risk stratification of those who are at risk of developing PIs allows preventive interventions to be focused on patients who are at the highest risk. The considerable number of risk assessment scales and prediction models available underscores the need for a thorough evaluation of their development, validation, and clinical utility. Our objectives were to identify and describe available risk prediction tools for PI occurrence, their content and the development and validation methods used.

METHODS

The umbrella review was conducted according to Cochrane guidance. MEDLINE, Embase, CINAHL, EPISTEMONIKOS, Google Scholar, and reference lists were searched to identify relevant systematic reviews. The risk of bias was assessed using adapted AMSTAR-2 criteria. Results were described narratively. All included reviews contributed to building a comprehensive list of risk prediction tools.

RESULTS

We identified 32 eligible systematic reviews, only seven of which described the development and validation of risk prediction tools for PI. Nineteen reviews assessed the prognostic accuracy of the tools and 11 assessed clinical effectiveness. Of the seven reviews reporting model development and validation, six included only machine learning models. Two reviews included external validations of models, although only one review reported any details on external validation methods or results. This was also the only review to report measures of both discrimination and calibration. Five reviews presented measures of discrimination, such as the area under the curve (AUC), sensitivities, specificities, F1 scores, and G-means. For the four reviews that assessed the risk of bias using the PROBAST tool, all models but one were found to be at high or unclear risk of bias.

CONCLUSIONS

Available tools do not meet current standards for the development or reporting of risk prediction models. The majority of tools have not been externally validated. Standardised and rigorous approaches to risk prediction model development and validation are needed.

TRIAL REGISTRATION

The protocol was registered on the Open Science Framework (https://osf.io/tepyk).

Robledo KP, Marschner IC, Grossmann M, Handelsman DJ, Yeap BB, Allan CA, Foote C, Inder WJ, Stuckey BGA, Jesudason D, Bracken K, Keech AC, Jenkins AJ, Gebski V, Jardine M, Wittert G. Predicting type 2 diabetes and testosterone effects in high-risk Australian men: development and external validation of a 2-year risk model. Eur J Endocrinol 2025;192:15-24. [PMID: 39720906 DOI: 10.1093/ejendo/lvae166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2024] [Revised: 11/13/2024] [Accepted: 12/21/2024] [Indexed: 12/26/2024]

Affiliation(s)

Kristy P Robledo NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown, NSW 1450, Australia
Ian C Marschner NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown, NSW 1450, Australia
Mathis Grossmann Department of Endocrinology, Austin Hospital, Heidelberg, VIC 3084, Australia; Department of Medicine, University of Melbourne, Parkville, VIC 3010, Australia
David J Handelsman Andrology Laboratory, ANZAC Research Institute, University of Sydney, Concord, NSW 2139, Australia; Andrology Department, Concord Hospital, Concord, NSW 2139, Australia
Bu B Yeap Medical School, University of Western Australia, Perth, WA 6009, Australia; Department of Endocrinology and Diabetes, Fiona Stanley Hospital, Murdoch, WA 6150, Australia
Carolyn A Allan Centre for Endocrinology and Metabolism, Hudson Institute of Medical Research, Clayton, VIC 3168, Australia; School of Clinical Sciences, Monash University, Clayton, VIC 3800, Australia
Celine Foote The George Institute for Global Health, University of New South Wales, Sydney, NSW 2052, Australia
Warrick J Inder Department of Diabetes and Endocrinology, Princess Alexandra Hospital, Woolloongabba, QLD 4102, Australia; Medical School, University of Queensland, Herston, QLD 4029, Australia
Bronwyn G A Stuckey Keogh Institute for Medical Research, Nedlands, WA 6009, Australia; Department of Endocrinology and Diabetes, Sir Charles Gairdner Hospital, Nedlands, WA 6009, Australia; Medical School, University of Western Australia, Nedlands, WA 6009, Australia
David Jesudason School of Medicine, The University of Adelaide, Adelaide, SA 5005, Australia; Endocrinology Unit, The Queen Elizabeth Hospital, Woodville South, SA 5011, Australia
Karen Bracken Faculty of Medicine and Health, University of Sydney, Camperdown, NSW 2006, Australia
Anthony C Keech NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown, NSW 1450, Australia
Alicia J Jenkins Diabetes and Vascular Medicine, Baker Heart and Diabetes Institute, Melbourne, VIC 3004, Australia
Val Gebski NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown, NSW 1450, Australia
Meg Jardine NHMRC Clinical Trials Centre, University of Sydney, Locked bag 77, Camperdown, NSW 1450, Australia
Gary Wittert Freemasons Centre for Male Health and Wellbeing, South Australian Health and Medical Research Institute, North Terrace, SA 5000, Australia; Medical School, University of Adelaide, North Terrace, Adelaide 5000, Australia

Rockenschaub P, Akay EM, Carlisle BG, Hilbert A, Wendland J, Meyer-Eschenbach F, Näher AF, Frey D, Madai VI. External validation of AI-based scoring systems in the ICU: a systematic review and meta-analysis. BMC Med Inform Decis Mak 2025;25:5. [PMID: 39762808 PMCID: PMC11702098 DOI: 10.1186/s12911-024-02830-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/21/2024] [Accepted: 12/17/2024] [Indexed: 01/11/2025] Open

Avelino-Silva TJ, Lee SJ, Covinsky KE, Walter LC, Deardorff WJ, Boscardin J, Campora F, Szlejf C, Suemoto CK, Smith AK. External Validation of the Walter Index for Posthospitalization Mortality Prediction in Older Adults. JAMA Netw Open 2025;8:e2455475. [PMID: 39841475 PMCID: PMC11755200 DOI: 10.1001/jamanetworkopen.2024.55475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2024] [Accepted: 11/14/2024] [Indexed: 01/23/2025] Open

Abstract

Importance

The Walter Index is a widely used prognostic tool for assessing 12-month mortality risk among hospitalized older adults. Developed in the US in 2001, its accuracy in contemporary non-US contexts is unclear.

Objective

To evaluate the external validity of the Walter Index in predicting posthospitalization mortality risk in Brazilian older adult inpatients.

Design, Setting, and Participants

This prognostic study used data from a cohort of adults aged 70 years or older admitted to the geriatric unit of a university hospital in Brazil from January 1, 2009, to February 28, 2020. Participants underwent comprehensive geriatric assessments at admission, were reevaluated at discharge, and were subsequently followed up for 48 months. Data were analyzed from March to July 2024.

Main Outcomes and Measures

The Walter Index, a score based on 6 risk factors (male sex, dependent activities of daily living at discharge, heart failure, cancer, high creatinine level, and low albumin level), was calculated to assess its predictive accuracy for 12-month mortality as well as 6-, 24-, and 48-month mortality. The study investigated whether incorporating delirium, frailty, or C-reactive protein level enhanced accuracy. Performance was assessed using discrimination, calibration, and clinical utility measures.

Results

In total, 2780 participants (mean [SD] age, 81 [7] years; 1795 [65%] female) were included, with 89 (3%) lost to follow-up. The 12-month posthospitalization mortality rate was 23% (646 participants). Mortality was 7% (47 of 634) in the lowest-risk group (0-1 point), 17% (111 of 668) for 2 to 3 points, 25% (198 of 803) for 4 to 6 points, and 43% (290 of 675) in the highest-risk group (≥7 points). The index demonstrated an area under the receiver operating characteristic curve (AUC) of 0.714 (95% CI, 0.691-0.736) for predicting 12-month posthospitalization mortality (AUCs were 0.75 and 0.80 in the original derivation and validation cohorts, respectively). Comparable results were observed for mortality at 6 months (AUC, 0.726; 95% CI, 0.700-0.752), 24 months (AUC, 0.711; 95% CI, 0.691-0.730), and 48 months (AUC, 0.719; 95% CI, 0.700-0.738). Adding delirium modestly increased the index's discrimination (AUC, 0.723; 95% CI, 0.702-0.749); additionally including frailty and C-reactive protein level did not improve discrimination further (AUC, 0.723; 95% CI, 0.701-0.744).
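As a quick arithmetic check on the risk-group figures above, the sketch below recomputes the 12-month mortality in each Walter Index stratum from the counts given in the abstract. The variable names are illustrative assumptions; the counts themselves are copied from the text.

```python
# (points range, deaths within 12 months, participants in stratum), from the abstract
strata = [
    ("0-1", 47, 634),
    ("2-3", 111, 668),
    ("4-6", 198, 803),
    (">=7", 290, 675),
]

# Observed 12-month mortality per stratum, as a percentage
mortality_pct = {label: 100 * deaths / n for label, deaths, n in strata}

total_deaths = sum(deaths for _, deaths, _ in strata)
total_n = sum(n for _, _, n in strata)
overall_pct = 100 * total_deaths / total_n

for label, pct in mortality_pct.items():
    print(f"{label:>4} points: {pct:.1f}% mortality")
print(f"overall: {total_deaths}/{total_n} = {overall_pct:.1f}%")
```

The stratum counts sum to the 646 deaths and 2780 participants reported, and the observed mortality rises steadily across the four risk groups.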

Conclusions and Relevance

In this prognostic study of hospitalized older adults in Brazil, the Walter Index showed similar discrimination in predicting postdischarge mortality as it did 2 decades ago in the US. These findings highlight the need for continuous validation and potential modification of established prognostic tools to improve their applicability across settings.

Affiliation(s)

Thiago J. Avelino-Silva Division of Geriatrics, School of Medicine, University of California San Francisco; Laboratorio de Investigacao Medica em Envelhecimento (LIM-66), Servico de Geriatria, Hospital das Clinicas (HCFMUSP), Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
Sei J. Lee Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California
Kenneth E. Covinsky Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California
Louise C. Walter Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California
W. James Deardorff Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California
John Boscardin Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California
Flavia Campora Laboratorio de Investigacao Medica em Envelhecimento (LIM-66), Servico de Geriatria, Hospital das Clinicas (HCFMUSP), Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
Claudia Szlejf Division of Geriatrics, School of Medicine, University of California San Francisco; Hospital Israelita Albert Einstein, São Paulo, Brazil
Claudia K. Suemoto Division of Geriatrics, School of Medicine, University of California San Francisco; Laboratorio de Investigacao Medica em Envelhecimento (LIM-66), Servico de Geriatria, Hospital das Clinicas (HCFMUSP), Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
Alexander K. Smith Division of Geriatrics, School of Medicine, University of California San Francisco; Geriatrics, Palliative and Extended Care Service Line, San Francisco Veterans Administration Health Care System, San Francisco, California

Luu HS. Laboratory Data as a Potential Source of Bias in Healthcare Artificial Intelligence and Machine Learning Models. Ann Lab Med 2025;45:12-21. [PMID: 39444135 PMCID: PMC11609702 DOI: 10.3343/alm.2024.0323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 09/10/2024] [Accepted: 10/18/2024] [Indexed: 10/25/2024] Open

Wernly B, Guidet B, Beil M. The role of artificial intelligence in life-sustaining treatment decisions: current state and future considerations. Intensive Care Med 2025;51:157-159. [PMID: 39661140 DOI: 10.1007/s00134-024-07738-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/27/2024] [Accepted: 11/19/2024] [Indexed: 12/12/2024]

Cabanillas Silva P, Sun H, Rezk M, Roccaro-Waldmeyer DM, Fliegenschmidt J, Hulde N, von Dossow V, Meesseman L, Depraetere K, Stieg J, Szymanowsky R, Dahlweid FM. Longitudinal Model Shifts of Machine Learning-Based Clinical Risk Prediction Models: Evaluation Study of Multiple Use Cases Across Different Hospitals. J Med Internet Res 2024;26:e51409. [PMID: 39671571 DOI: 10.2196/51409] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2023] [Revised: 01/30/2024] [Accepted: 10/16/2024] [Indexed: 12/15/2024] Open

Abstract

BACKGROUND

In recent years, machine learning (ML)-based models have been widely used in clinical domains to predict clinical risk events. However, in production, the performance of such models depends heavily on changes in the system and data. The dynamic nature of the system environment, characterized by continuous changes, has significant implications for prediction models, leading to performance degradation and reduced clinical efficacy. Thus, monitoring model shifts and evaluating their impact on prediction models are of utmost importance.

OBJECTIVE

This study aimed to assess the impact of a model shift on ML-based prediction models by evaluating 3 different use cases (delirium, sepsis, and acute kidney injury [AKI]) from 2 hospitals (M and H) with different patient populations and to investigate potential model deterioration during the COVID-19 pandemic period.

METHODS

We trained prediction models using retrospective data from earlier years and examined the presence of a model shift using data from more recent years. We used the area under the receiver operating characteristic curve (AUROC) to evaluate model performance and analyzed the calibration curves over time. We also assessed the influence on clinical decisions by evaluating the alert rate, the rates of over- and underdiagnosis, and the decision curve.

RESULTS

The 2 data sets used in this study contained 189,775 and 180,976 medical cases for hospitals M and H, respectively. Statistical analyses (Z test) revealed no significant difference (P>.05) between the AUROCs from the different years for all use cases and hospitals. For example, in hospital M, AKI did not show a significant difference between 2020 (AUROC=0.898) and 2021 (AUROC=0.907, Z=-1.171, P=.242). Similar results were observed in both hospitals and for all use cases (sepsis and delirium) when comparing all the different years. However, when evaluating the calibration curves at the 2 hospitals, model shifts were observed for the delirium and sepsis use cases but not for AKI. Additionally, to investigate the clinical utility of our models, we performed decision curve analysis (DCA) and compared the results across the different years. A pairwise nonparametric statistical comparison showed no differences in the net benefit at the probability thresholds of interest (P>.05). The comprehensive evaluations performed in this study ensured robust model performance of all the investigated models across the years. Moreover, neither performance deteriorations nor alert surges were observed during the COVID-19 pandemic period.

CONCLUSIONS

Clinical risk prediction models were affected by the dynamic and continuous evolution of clinical practices and workflows. The performance of the models evaluated in this study appeared stable when assessed using AUROCs, showing no significant variations over the years. Additional model shift investigations suggested that a calibration shift was present for certain use cases (delirium and sepsis). However, these changes did not have any impact on the clinical utility of the models based on DCA. Consequently, it is crucial to closely monitor data changes and detect possible model shifts, along with their potential influence on clinical decision-making.

Ke JXC, Jen TTH, Gao S, Ngo L, Wu L, Flexman AM, Schwarz SKW, Brown CJ, Görges M. Development and internal validation of time-to-event risk prediction models for major medical complications within 30 days after elective colectomy. PLoS One 2024;19:e0314526. [PMID: 39621640 PMCID: PMC11611139 DOI: 10.1371/journal.pone.0314526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2024] [Accepted: 11/12/2024] [Indexed: 12/12/2024] Open

Abstract

BACKGROUND

Patients undergoing colectomy are at risk of numerous major complications. However, existing binary risk stratification models do not predict when a patient may be at highest risk of each complication. Accurate prediction of the timing of complications facilitates targeted, resource-efficient monitoring. We sought to develop and internally validate Cox proportional hazards models to predict time-to-complication of major complications within 30 days after elective colectomy.

METHODS

We studied a retrospective cohort from the multicentered American College of Surgeons National Surgical Quality Improvement Program procedure-targeted colectomy dataset. Patients aged 18 years or above, who underwent elective colectomy between January 1, 2014 and December 31, 2019 were included. A priori candidate predictors were selected based on variable availability, literature review, and multidisciplinary team consensus. Outcomes were mortality, hospital readmission, myocardial infarction, cerebral vascular events, pneumonia, venous thromboembolism, acute renal failure, and sepsis or septic shock within 30 days after surgery.

RESULTS

The cohort consisted of 132145 patients (mean ± SD age, 61 ± 15 years; 52% females). Complication rates ranged from 0.3% (n = 383) for cardiac arrest and acute renal failure to 5.3% (n = 6986) for bleeding requiring transfusion, with a readmission rate of 8.6% (n = 11415). We observed distinct temporal patterns for each complication: the median [quartiles] postoperative day of complication diagnosis ranged from 1 [0, 2] days for bleeding requiring transfusion to 12 [6, 18] days for venous thromboembolism. Models for mortality, myocardial infarction, pneumonia, and renal failure showed good discrimination with a concordance > 0.8, while models for readmission, venous thromboembolism, and sepsis performed poorly with a concordance of 0.6 to 0.7. Models exhibited good calibration but ranges were limited to low probability areas.

CONCLUSIONS

We developed and internally validated time-to-event prediction models for complications after elective colectomy. Once further validated, the models can facilitate tailored monitoring of high risk patients during high risk periods.

TRIAL REGISTRATION

Clinicaltrials.gov (NCT05150548; Principal Investigator: Janny Xue Chen Ke, M.D., M.Sc., F.R.C.P.C.; initial posting: November 25, 2021).

Affiliation(s)

Janny X. C. Ke Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Anesthesia, St. Paul’s Hospital/Providence Health Care, Vancouver, British Columbia, Canada
Tim T. H. Jen Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Anesthesia, St. Paul’s Hospital/Providence Health Care, Vancouver, British Columbia, Canada
Sihaoyu Gao Department of Statistics, Faculty of Science, The University of British Columbia, Vancouver, British Columbia, Canada
Long Ngo Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts, United States of America; Division of General Medicine, Beth Israel Deaconess Medical Center, Harvard Medical School, Boston, Massachusetts, United States of America
Lang Wu Department of Statistics, Faculty of Science, The University of British Columbia, Vancouver, British Columbia, Canada
Alana M. Flexman Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Anesthesia, St. Paul’s Hospital/Providence Health Care, Vancouver, British Columbia, Canada
Stephan K. W. Schwarz Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Anesthesia, St. Paul’s Hospital/Providence Health Care, Vancouver, British Columbia, Canada
Carl J. Brown Department of Surgery, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; Department of Surgery, St. Paul’s Hospital/Providence Health Care, Vancouver, British Columbia, Canada
Matthias Görges Department of Anesthesiology, Pharmacology & Therapeutics, Faculty of Medicine, The University of British Columbia, Vancouver, British Columbia, Canada; BC Children’s Hospital Research Institute, Vancouver, British Columbia, Canada

Tangel VE, Hoeks SE, Stolker RJ, Brown S, Pryor KO, de Graaff JC. International multi-institutional external validation of preoperative risk scores for 30-day in-hospital mortality in paediatric patients. Br J Anaesth 2024;133:1222-1233. [PMID: 39477712 DOI: 10.1016/j.bja.2024.09.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 08/14/2024] [Accepted: 09/14/2024] [Indexed: 11/19/2024] Open

Abstract

BACKGROUND

Risk prediction scores are used to guide clinical decision-making. Our primary objective was to externally validate two patient-specific risk scores for 30-day in-hospital mortality using the Multicenter Perioperative Outcomes Group (MPOG) registry: the Pediatric Risk Assessment (PRAm) score and the intrinsic surgical risk score. The secondary objective was to recalibrate these scores.

METHODS

Data from 56 US and Dutch hospitals with paediatric caseloads were included. The primary outcome was 30-day mortality. To assess model discrimination, the area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUC-PR) were calculated. Model calibration was assessed by plotting the observed and predicted probabilities. Decision analytic curves were fit.

RESULTS

The 30-day mortality was 0.14% (822/606 488). The AUROC for the PRAm upon external validation was 0.856 (95% confidence interval 0.844-0.869), and the AUC-PR was 0.008. Upon recalibration, the AUROC was 0.873 (0.861-0.886), and the AUC-PR was 0.031. The AUROC for the external validation of the intrinsic surgical risk score was 0.925 (0.914-0.936) and AUC-PR was 0.085. Upon recalibration, the AUROC was 0.925 (0.915-0.936), and the AUC-PR was 0.094. Calibration metrics for both scores were favourable because of the large cluster of cases with low probabilities of mortality. Decision curve analyses showed limited benefit to using either score.
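The very small AUC-PR values above are easier to interpret against their baseline: for a non-informative classifier, the expected area under the precision-recall curve is approximately the outcome prevalence. The sketch below, using only numbers quoted in this abstract, expresses each reported AUC-PR as a multiple of that baseline. The dictionary labels and the "lift" framing are illustrative assumptions, not part of the original study.

```python
deaths, cases = 822, 606488
baseline_auc_pr = deaths / cases   # prevalence, about 0.14%

# AUC-PR values as reported in the abstract
reported_auc_pr = {
    "PRAm, external validation": 0.008,
    "PRAm, recalibrated": 0.031,
    "intrinsic surgical risk score, external validation": 0.085,
    "intrinsic surgical risk score, recalibrated": 0.094,
}

# Ratio of each AUC-PR to the no-skill baseline
lift = {name: value / baseline_auc_pr for name, value in reported_auc_pr.items()}

print(f"no-skill baseline AUC-PR: {baseline_auc_pr:.5f}")
for name, ratio in lift.items():
    print(f"{name}: about {ratio:.0f}x baseline")
```

Read this way, an AUROC above 0.85 can coexist with an AUC-PR below 0.01 when the outcome is this rare: the score ranks cases well, yet most flagged patients are still false positives, which is in line with the conclusion stated below.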

CONCLUSIONS

The intrinsic surgical risk score performed better than the PRAm, but both resulted in large numbers of false positives. Both scores exhibited decreased performance compared with the original studies. ASA physical status scores in sicker patients drove the superior performance of the intrinsic surgical risk score, suggesting the use of a risk score does not improve prediction.

Chavosh Nejad M, Vestergaard Matthiesen R, Dukovska-Popovska I, Jakobsen T, Johansen J. Machine learning for predicting duration of surgery and length of stay: A literature review on joint arthroplasty. Int J Med Inform 2024;192:105631. [PMID: 39293161 DOI: 10.1016/j.ijmedinf.2024.105631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2024] [Revised: 08/15/2024] [Accepted: 09/13/2024] [Indexed: 09/20/2024]

Abstract

INTRODUCTION

In recent years, factors such as population aging have driven escalating demand for hip and knee arthroplasty, straining already limited hospital resources. To address this challenge, the focus has been on improving medical and operational efficiency. This includes increased use of machine learning (ML) to predict duration of surgery (DOS) and length of stay (LOS) for total knee and total hip arthroplasty, which can be used to optimize resource allocation within medical and operational constraints. This paper explores the development and performance of ML models in predicting DOS and LOS.

METHODS

A systematic search of publications from 2010 to 2023 was conducted following PRISMA guidelines. After applying the inclusion and exclusion criteria, 28 of the 722 papers gathered from PubMed, Web of Science, and manual search were included in the study. Descriptive statistics were used to analyze the extracted data on data preprocessing, model development, and model performance assessment.

RESULTS

Most of the papers treat LOS as a binary variable. Patient age was the variable most frequently used and most often reported as important for predicting DOS and LOS. Within the 28 included papers, more than 71% of models reached good to perfect performance based on the area under the receiver operating characteristic curve (AUC), with artificial neural networks and ensemble learning models accounting for the largest share of the best-performing models.

CONCLUSION

The use of ML models is increasing in the literature. Current performance levels indicate that ML can potentially become a powerful tool for predicting DOS and LOS for different purposes. However, the literature is not yet mature in reporting real-life application. Future studies can focus on model specification and validation by considering empirical application.

Rockenschaub P, Madai VI, Frey D. The authors reply. Crit Care Med 2024;52:e638-e639. [PMID: 39637279 DOI: 10.1097/ccm.0000000000006441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/07/2024]

Gonzalez R, Saha A, Campbell CJ, Nejat P, Lokker C, Norgan AP. Seeing the random forest through the decision trees. Supporting learning health systems from histopathology with machine learning models: Challenges and opportunities. J Pathol Inform 2024;15:100347. [PMID: 38162950 PMCID: PMC10755052 DOI: 10.1016/j.jpi.2023.100347] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/21/2023] [Revised: 10/06/2023] [Accepted: 11/01/2023] [Indexed: 01/03/2024] Open

de Ruijter UW, Kaplan ZLR, Eijkenaar F, Maas CCHM, van der Heide A, Bax WA, Lingsma HF. Identifying persistent high-cost patients in the hospital for care management: development and validation of prediction models. BMC Health Serv Res 2024;24:1469. [PMID: 39593019 PMCID: PMC11590622 DOI: 10.1186/s12913-024-11936-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 11/13/2024] [Indexed: 11/28/2024] Open

Abstract

BACKGROUND

Healthcare use by High-Need High-Cost (HNHC) patients is believed to be modifiable through better coordination of care. To identify patients for care management, a hybrid approach is recommended that combines clinical assessment of need with model-based prediction of cost. Models that predict high healthcare costs persisting over time are relevant but scarce. We aimed to develop and validate two models predicting Persistent High-Cost (PHC) status upon hospital outpatient visit and hospital admission, respectively.

METHODS

We performed a retrospective cohort study using claims data from a national health insurer in the Netherlands, a regulated competitive health care system with universal coverage. We created two populations of adults based on their index event in 2016: a first hospital outpatient visit (i.e., outpatient population) or hospital admission (i.e., hospital admission population). Both were divided into a development (January-June) and validation (July-December) cohort. Our outcome of interest, PHC status, was defined as belonging to the top 10% of total annual healthcare costs for three consecutive years after the index event. Predictors were predefined based on an earlier systematic review and collected in the year prior to the index event. Predictor effects were quantified through multivariable logistic regression analysis. To increase usability, we also developed smaller models containing the lowest number of predictors while maintaining comparable performance. This was based on relative predictor importance (Wald χ²). Model performance was evaluated by means of discrimination (C-statistic) and calibration (plots).
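The outcome definition in the Methods above (top 10% of total annual costs in each of three consecutive years) can be sketched as a set intersection. This is a minimal illustration under stated assumptions, not the study's code: the data layout, function names, tie handling, and the use of one fixed patient population across years are all hypothetical.

```python
# Illustrative sketch only. Data layout and names are assumptions;
# the abstract does not specify percentile or tie-handling details.
from typing import Dict, List, Set


def top_decile_ids(costs: Dict[str, float]) -> Set[str]:
    """IDs of patients whose annual cost falls in the top 10% for one year."""
    n_top = max(1, len(costs) // 10)
    ranked = sorted(costs, key=costs.get, reverse=True)
    return set(ranked[:n_top])


def persistent_high_cost(yearly_costs: List[Dict[str, float]]) -> Set[str]:
    """Patients in the top 10% in every one of the supplied consecutive years."""
    per_year = [top_decile_ids(year) for year in yearly_costs]
    return set.intersection(*per_year)


# Toy data: 20 patients, so the top 10% is 2 patients per year.
base = {f"p{i}": 100.0 * i for i in range(20)}
year1 = dict(base)
year2 = dict(base)
year2["p18"] = 50.0  # p18 drops out of the top decile in year 2
year3 = dict(base)

phc = persistent_high_cost([year1, year2, year3])
print(sorted(phc))
```

In the toy data only `p19` stays in the top decile for all three years, which shows why persistent status is much rarer than the 10% high-cost rate in any single year.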

RESULTS

In the outpatient development cohort (n = 135,558), 2.2% of patients (n = 3,016) were PHC. In the hospital admission development cohort (n = 24,805), this was 5.8% (n = 1,451). Both full models included 27 predictors, while their smaller counterparts had 10 (outpatient model) and 11 predictors (hospital admission model). In the outpatient validation cohort (n = 84,009) and hospital admission validation cohort (n = 20,768), discrimination was good for full models (C-statistics 0.75; 0.74) and smaller models (C-statistics 0.70; 0.73), while calibration plots indicated that models were well-calibrated.

CONCLUSIONS

We developed and validated two models predicting PHC status that demonstrate good discrimination and calibration. Both models are suitable for integration into electronic health records to aid a hybrid case-finding strategy for HNHC care management.

Sajanti A, Hellström S, Bennett C, Srinath A, Jhaveri A, Cao Y, Takala R, Frantzén J, Koskimäki F, Falter J, Lyne SB, Rantamäki T, Posti JP, Roine S, Jänkälä M, Puolitaival J, Kolehmainen S, Girard R, Rahi M, Rinne J, Castrén E, Koskimäki J. Soluble Urokinase-Type Plasminogen Activator Receptor and Inflammatory Biomarker Response with Prognostic Significance after Acute Neuronal Injury - a Prospective Cohort Study. Inflammation 2024:10.1007/s10753-024-02185-1. [PMID: 39540961 DOI: 10.1007/s10753-024-02185-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2024] [Revised: 10/30/2024] [Accepted: 11/05/2024] [Indexed: 11/16/2024]

Abstract

Aneurysmal subarachnoid hemorrhage (aSAH), ischemic stroke (IS), and traumatic brain injury (TBI) are severe conditions impacting individuals and society. Identifying reliable prognostic biomarkers for predicting survival or recovery remains a challenge. Soluble urokinase type plasminogen activator receptor (suPAR) has gained attention as a potential prognostic biomarker in acute sepsis. This study evaluates suPAR and related neuroinflammatory biomarkers in serum for brain injury prognosis. This prospective study included 31 aSAH, 30 IS, and 13 TBI patients and three healthy controls (n = 77). Serum samples were collected on average 5.9 days post-injury and analyzed for suPAR, IL-1β, cyclophilin A, and TNFα levels using ELISA. Outcomes were assessed 90 days post-injury with the modified Rankin Scale (mRS), categorized as favorable (mRS 0-2) or unfavorable (mRS 3-6). Statistical analyses included 2-tailed t-tests, Pearson's correlations, and machine learning linear discriminant analysis (LDA) for biomarker combinations. Elevated suPAR levels were found in brain injury patients compared to controls (p = 0.017). Increased suPAR correlated with unfavorable outcomes (p = 0.0018) and showed prognostic value (AUC = 0.66, p = 0.03). IL-1β levels were higher in the unfavorable group (p = 0.0015). LDA combinatory analysis yielded fair prognostic accuracy with the canonical equation 0.775[suPAR] + 0.667[IL-1β] (AUC = 0.77, OR 0.296, sensitivity 93.1%, specificity 53.1%, p = 0.0007). No correlation was found between suPAR and CRP or infection status. Elevated suPAR levels in acute brain injury patients were associated with poorer outcomes, highlighting suPAR's potential as a prognostic biomarker across different brain injury types. Combining IL-1β with suPAR improved prognostic accuracy, supporting a multimodal biomarker approach for predicting outcomes.

Affiliation(s)

Antti Sajanti: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Santtu Hellström: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Carolyn Bennett: Neurovascular Surgery Program, Section of Neurosurgery, The University of Chicago Medicine and Biological Sciences, 5841 S. Maryland, Chicago, IL, 60637, USA
Abhinav Srinath: Neurovascular Surgery Program, Section of Neurosurgery, The University of Chicago Medicine and Biological Sciences, 5841 S. Maryland, Chicago, IL, 60637, USA
Aditya Jhaveri: Neurovascular Surgery Program, Section of Neurosurgery, The University of Chicago Medicine and Biological Sciences, 5841 S. Maryland, Chicago, IL, 60637, USA
Ying Cao: Department of Radiation Oncology, Kansas University Medical Center, Kansas City, KS, 66160, USA
Riikka Takala: Perioperative Services, Intensive Care and Pain Medicine, Turku University Hospital and University of Turku, POB 52, 20521, Turku, Finland
Janek Frantzén: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Fredrika Koskimäki: Neurocenter, Acute Stroke Unit, Turku University Hospital, P.O. Box 52, FI-20521, Turku, Finland
Johannes Falter: Department of Neurosurgery, University Medical Center of Regensburg, Regensburg, Germany
Seán B Lyne: Department of Neurosurgery, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, USA
Tomi Rantamäki: Laboratory of Neurotherapeutics, Molecular and Integrative Biosciences Research Programme, Faculty of Biological and Environmental Sciences and Drug Research Program, Division of Pharmacology and Pharmacotherapy, Faculty of Pharmacy, University of Helsinki, Helsinki, Finland
Jussi P Posti: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Susanna Roine: Neurocenter, Acute Stroke Unit, Turku University Hospital, P.O. Box 52, FI-20521, Turku, Finland
Miro Jänkälä: Department of Neurosurgery, Oulu University Hospital, Box 25, 90029 OYS, Oulu, Finland
Jukka Puolitaival: Department of Neurosurgery, Oulu University Hospital, Box 25, 90029 OYS, Oulu, Finland
Sulo Kolehmainen: Neuroscience Center, HiLIFE, University of Helsinki, Box 63, 00014, Helsinki, Finland
Romuald Girard: Neurovascular Surgery Program, Section of Neurosurgery, The University of Chicago Medicine and Biological Sciences, 5841 S. Maryland, Chicago, IL, 60637, USA
Melissa Rahi: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Jaakko Rinne: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland
Eero Castrén: Neuroscience Center, HiLIFE, University of Helsinki, Box 63, 00014, Helsinki, Finland
Janne Koskimäki: Neurocenter, Department of Neurosurgery, Turku University Hospital and University of Turku, P.O. Box 52, Hämeentie 11, FI-20521, Turku, Finland; Department of Neurosurgery, Oulu University Hospital, Box 25, 90029 OYS, Oulu, Finland; Neuroscience Center, HiLIFE, University of Helsinki, Box 63, 00014, Helsinki, Finland

Hong M, Kang RR, Yang JH, Rhee SJ, Lee H, Kim YG, Lee K, Kim H, Lee YS, Youn T, Kim SH, Ahn YM. Comprehensive Symptom Prediction in Inpatients With Acute Psychiatric Disorders Using Wearable-Based Deep Learning Models: Development and Validation Study. J Med Internet Res 2024;26:e65994. [PMID: 39536315 PMCID: PMC11602769 DOI: 10.2196/65994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2024] [Revised: 10/20/2024] [Accepted: 10/20/2024] [Indexed: 11/16/2024] Open

Abstract

BACKGROUND

Assessing the complex and multifaceted symptoms of patients with acute psychiatric disorders proves to be significantly challenging for clinicians. Moreover, the staff in acute psychiatric wards face high work intensity and risk of burnout, yet research on the introduction of digital technologies in this field remains limited. The combination of continuous and objective wearable sensor data acquired from patients with deep learning techniques holds the potential to overcome the limitations of traditional psychiatric assessments and support clinical decision-making.

OBJECTIVE

This study aimed to develop and validate wearable-based deep learning models to comprehensively predict patient symptoms across various acute psychiatric wards in South Korea.

METHODS

Participants diagnosed with schizophrenia and mood disorders were recruited from 4 wards across 3 hospitals and prospectively observed using wrist-worn wearable devices during their admission period. Trained raters conducted periodic clinical assessments using the Brief Psychiatric Rating Scale, Hamilton Anxiety Rating Scale, Montgomery-Asberg Depression Rating Scale, and Young Mania Rating Scale. Wearable devices collected patients' heart rate, accelerometer, and location data. Deep learning models were developed to predict psychiatric symptoms using 2 distinct approaches: single symptoms individually (Single) and multiple symptoms simultaneously via multitask learning (Multi). These models further addressed 2 problems: within-subject relative changes (Deterioration) and between-subject absolute severity (Score). Four configurations were consequently developed for each scale: Single-Deterioration, Single-Score, Multi-Deterioration, and Multi-Score. Data of participants recruited before May 1, 2024, underwent cross-validation, and the resulting fine-tuned models were then externally validated using data from the remaining participants.
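The split described at the end of the Methods above, where participants recruited before May 1, 2024 are used for cross-validation and later recruits for external validation, can be sketched as a simple date-based partition. This is an illustration under assumptions: the `Participant` class, its field names, and the function name are hypothetical, and only the cutoff date comes from the abstract.

```python
# Illustrative sketch only. The Participant class and function name are
# assumptions; the cutoff date is the one stated in the abstract.
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple

CUTOFF = date(2024, 5, 1)


@dataclass
class Participant:
    participant_id: str
    recruited_on: date


def split_by_recruitment(
    participants: List[Participant], cutoff: date = CUTOFF
) -> Tuple[List[Participant], List[Participant]]:
    """Return (development, external) sets split on recruitment date.

    Participants recruited strictly before the cutoff go to the development
    (cross-validation) set; all others go to the external validation set.
    """
    development = [p for p in participants if p.recruited_on < cutoff]
    external = [p for p in participants if p.recruited_on >= cutoff]
    return development, external


people = [
    Participant("a", date(2024, 4, 30)),
    Participant("b", date(2024, 5, 1)),
    Participant("c", date(2023, 12, 1)),
]
development, external = split_by_recruitment(people)
print([p.participant_id for p in development], [p.participant_id for p in external])
```

Splitting by participant and by time, rather than by individual observation, keeps all person-days from one participant on the same side of the split.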

RESULTS

Of the 244 enrolled participants, 191 (78.3%; 3954 person-days) were included in the final analysis after applying the exclusion criteria. The demographic and clinical characteristics of participants, as well as the distribution of sensor data, showed considerable variations across wards and hospitals. Data of 139 participants were used for cross-validation, while data of 52 participants were used for external validation. The Single-Deterioration and Multi-Deterioration models achieved similar overall accuracy values of 0.75 in cross-validation and 0.73 in external validation. The Single-Score and Multi-Score models attained overall R² values of 0.78 and 0.83 in cross-validation and 0.66 and 0.74 in external validation, respectively, with the Multi-Score model demonstrating superior performance.

CONCLUSIONS

Deep learning models based on wearable sensor data effectively classified symptom deterioration and predicted symptom severity in participants in acute psychiatric wards. Despite lower computational costs, Multi models demonstrated performance equivalent or superior to that of Single models, suggesting that multitask learning is a promising approach for comprehensive symptom prediction. However, significant variations were observed across wards, which presents a key challenge for developing clinical decision support systems in acute psychiatric wards. Future studies may benefit from recurring local validation or federated learning to address generalizability issues.

Affiliation(s)

Minseok Hong: Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea; Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea
Ri-Ra Kang: Department of IT Convergence Engineering, Gachon University, Seongnam-si, Republic of Korea
Jeong Hun Yang: Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea; Department of Psychiatry, Chungnam National University Sejong Hospital, Sejong, Republic of Korea
Sang Jin Rhee: Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea
Hyunju Lee: Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea
Yong-Gyom Kim: Department of IT Convergence Engineering, Gachon University, Seongnam-si, Republic of Korea
KangYoon Lee: Department of IT Convergence Engineering, Gachon University, Seongnam-si, Republic of Korea; Department of Computer Engineering, Gachon University, Seongnam-si, Republic of Korea
HongGi Kim: Healthconnect Co. Ltd., Seoul, Republic of Korea
Yu Sang Lee: Department of Psychiatry, Yong-In Mental Hospital, Yongin-si, Republic of Korea
Tak Youn: Department of Psychiatry and Electroconvulsive Therapy Center, Dongguk University International Hospital, Goyang-si, Republic of Korea; Institute of Buddhism and Medicine, Dongguk University, Seoul, Republic of Korea
Se Hyun Kim: Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea; Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea
Yong Min Ahn: Department of Neuropsychiatry, Seoul National University Hospital, Seoul, Republic of Korea; Department of Psychiatry, Seoul National University College of Medicine, Seoul, Republic of Korea; Institute of Human Behavioral Medicine, Seoul National University Medical Research Center, Seoul, Republic of Korea

Yoon SJ, Jutte PC, Soriano A, Sousa R, Zijlstra WP, Wouthuyzen-Bakker M. Predicting periprosthetic joint infection: external validation of preoperative prediction models. J Bone Jt Infect 2024;9:231-239. [PMID: 39539737 PMCID: PMC11554715 DOI: 10.5194/jbji-9-231-2024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2024] [Accepted: 08/29/2024] [Indexed: 11/16/2024] Open

Abstract

Introduction: Prediction models for periprosthetic joint infections (PJIs) are gaining interest due to their potential to improve clinical decision-making. However, their external validity across various settings remains uncertain. This study aimed to externally validate promising preoperative PJI prediction models in a recent multinational European cohort. Methods: Three preoperative PJI prediction models - by Tan et al. (2018), Del Toro et al. (2019), and Bülow et al. (2022) - that have previously demonstrated high levels of accuracy were selected for validation. A retrospective observational analysis of patients undergoing total hip arthroplasty (THA) and total knee arthroplasty (TKA) at centers in the Netherlands, Portugal, and Spain between January 2020 and December 2021 was conducted. Patient characteristics were compared between our cohort and those used to develop the models. Performance was assessed through discrimination and calibration. Results: The study included 2684 patients, 60 of whom developed a PJI (2.2%). Our cohort differed from the models' original cohorts with respect to demographic variables, procedural variables, and comorbidity prevalence. The overall accuracies of the models, measured with the c statistic, were 0.72, 0.69, and 0.72 for the Tan, Del Toro, and Bülow models, respectively. Calibration was reasonable, but the PJI risk estimates were most accurate for predicted infection risks below 3%-4%. The Tan model overestimated PJI risk above 4%, whereas the Del Toro model underestimated PJI risk above 3%. Conclusions: The Tan, Del Toro, and Bülow PJI prediction models were externally validated in this multinational cohort, demonstrating potential for clinical application in identifying high-risk patients and enhancing preoperative counseling and prevention strategies.
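For readers unfamiliar with the c statistic reported above: for a binary outcome it is the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted risk, and it equals the area under the ROC curve. The sketch below is a minimal pairwise implementation on made-up numbers, not the study's code or data; the function name and example values are hypothetical.

```python
# Illustrative sketch only: toy risks and outcomes, not data from the study.
from typing import Sequence


def c_statistic(risks: Sequence[float], outcomes: Sequence[int]) -> float:
    """Concordance over all (event, non-event) pairs; ties count as 0.5."""
    events = [r for r, y in zip(risks, outcomes) if y == 1]
    non_events = [r for r, y in zip(risks, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0
            elif event_risk == non_event_risk:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predicted infection risks and observed outcomes (1 = PJI).
toy_risks = [0.01, 0.02, 0.03, 0.05, 0.08]
toy_outcomes = [0, 0, 1, 0, 1]
print(round(c_statistic(toy_risks, toy_outcomes), 3))
```

The c statistic measures ranking only. A model can rank patients well and still over- or underestimate absolute risk, which is why the abstract reports calibration separately.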

Hosar R, Berntsen GR, Steinsbekk A. Validity of the Johns Hopkins Adjusted Clinical Groups system on the utilisation of healthcare services in Norway: a retrospective cross-sectional study. BMC Health Serv Res 2024;24:1279. [PMID: 39448990 PMCID: PMC11515438 DOI: 10.1186/s12913-024-11715-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2024] [Accepted: 10/07/2024] [Indexed: 10/26/2024] Open

Abstract

BACKGROUND

The Adjusted Clinical Groups (ACG) System is a validated electronic risk stratification system. However, there is a lack of studies on the association between different ACG risk scores and the utilisation of different healthcare services using different sources of input data. The aim of this study was therefore to assess the validity of the association between five different ACG risk scores and the utilisation of a range of different healthcare services using input data from either general practitioners (GPs) or hospitals.

METHODS

This was a registry-based study of all adult inhabitants in four Norwegian municipalities who received somatic healthcare in one year (N = 168 285). The ACG risk scores (resource utilisation band, unscaled ACG concurrent risk, unscaled concurrent risk, frailty flag and chronic condition count) were calculated using age, sex and diagnosis codes from GPs and a hospital, respectively. Healthcare utilisation covered GP, municipal and hospital services. Areas under the receiver operating characteristic curve (AUC) were calculated and compared to the AUC of a model using only age and sex.

RESULTS

Utilisation of all healthcare services increased with increasing scores in the "resource utilisation band" (RUB) and all other investigated ACG risk scores. The risk scores overall distinguished well between levels of utilisation of GP visits (AUC up to 0.84), hospitalisation (AUC up to 0.8) and specialist outpatient visits (AUC up to 0.72), but not out-of-hours GP visits (AUC up to 0.62). The score "unscaled ACG concurrent risk" overall performed best. Risk scores based on data from either GPs or hospitals performed better for the classification of healthcare services in their respective domains. The model based on age and sex performed better for distinguishing between levels of utilisation of municipal services (AUC 0.83-0.90 compared to 0.46-0.79).

CONCLUSIONS

Risk scores from the ACG system are valid for classifying GP visits, hospitalisation and specialist outpatient visits. They do not outperform simpler models in the classification of utilisation of municipal services such as nursing homes and home services and outpatient emergency care in primary healthcare. The ACG system can be applied in Norway using administrative data from either GPs or hospitals.
