GRADE concept paper 2: Concepts for judging certainty on the calibration of prognostic models in a body of validation studies.
J Clin Epidemiol 2021; 143:202-211. [PMID: 34800677 DOI: 10.1016/j.jclinepi.2021.11.024]
[Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/27/2021] [Revised: 10/16/2021] [Accepted: 11/10/2021] [Indexed: 12/23/2022]
Abstract
Prognostic models combine several prognostic factors to provide an estimate of the likelihood (or risk) of future events in individual patients, conditional on their prognostic factor values. A fundamental part of evaluating prognostic models is undertaking studies to determine whether their predictive performance, such as calibration and discrimination, is reproduced across settings. Systematic reviews and meta-analyses of studies evaluating prognostic models' performance are a necessary step for selecting models for clinical practice and for testing the underlying assumption that their use will improve outcomes, including patients' reassurance and optimal future planning. In this paper, we highlight key concepts in evaluating the certainty of evidence regarding the calibration of prognostic models. Four concepts are key to evaluating the certainty of evidence on prognostic models' performance regarding calibration. The first concept is that the inference regarding calibration may take one of two forms: deciding whether one is rating certainty that a model's performance is satisfactory or, instead, unsatisfactory, in either case defining the threshold for satisfactory (or unsatisfactory) model performance. Second, inconsistency is the critical GRADE domain in deciding whether we are rating certainty in the model performance being satisfactory or unsatisfactory. Third, depending on whether one is rating certainty in satisfactory or unsatisfactory performance, different patterns of inconsistency of results across studies will inform ratings of certainty of evidence. Fourth, exploring the distribution of point estimates of the observed-to-expected ratio across individual studies, and its determinants, will bear on the need for and direction of future research.
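As a rough illustration of the fourth concept, the distribution of observed-to-expected (O:E) point estimates across validation studies can be tabulated against a pre-specified range for satisfactory calibration. The sketch below is not taken from the paper: the `ValidationStudy` type, the `summarize` helper, the 0.8 to 1.2 range, and the three toy studies are all hypothetical and chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class ValidationStudy:
    name: str
    observed: int    # events observed in the validation cohort
    expected: float  # events predicted by the prognostic model


def oe_ratio(study: ValidationStudy) -> float:
    """Observed-to-expected ratio; 1.0 means observed and predicted events agree."""
    if study.expected <= 0:
        raise ValueError("expected events must be positive")
    return study.observed / study.expected


def summarize(studies, lower=0.8, upper=1.2):
    """Split studies by whether the O:E point estimate lies in [lower, upper].

    The default bounds are an assumed, illustrative threshold for
    satisfactory calibration, not a value given in the source.
    """
    ratios = {s.name: oe_ratio(s) for s in studies}
    inside = [name for name, r in ratios.items() if lower <= r <= upper]
    outside = [name for name in ratios if name not in inside]
    return ratios, inside, outside


# Hypothetical toy data, invented purely for illustration.
studies = [
    ValidationStudy("A", observed=50, expected=50.0),
    ValidationStudy("B", observed=45, expected=50.0),
    ValidationStudy("C", observed=80, expected=50.0),
]

ratios, inside, outside = summarize(studies)
for name, ratio in ratios.items():
    print(f"study {name}: O:E = {ratio:.2f}")
print("within assumed satisfactory range:", inside)
print("outside assumed satisfactory range:", outside)
```

A split like this only describes point estimates. Rating certainty would also need the confidence intervals of each study and an exploration of why studies fall on different sides of the threshold.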