Staibano P, Ham J, Chen J, Zhang H, Gupta MK. Inter-Rater Reliability of Thyroid Ultrasound Risk Criteria: A Systematic Review and Meta-Analysis. Laryngoscope 2023;133:485-493. [PMID: 36039947 DOI: 10.1002/lary.30347]
[Received: 03/02/2022] [Revised: 07/05/2022] [Accepted: 07/29/2022] [Indexed: 11/06/2022]
Abstract
OBJECTIVE
The most commonly used diagnostic criteria for evaluating thyroid nodules are the Thyroid Imaging Reporting and Data System (TI-RADS) and the American Thyroid Association (ATA) guidelines. The purpose of this systematic review and meta-analysis is to determine the inter-rater reliability of these thyroid ultrasound criteria.
METHODS
We searched MEDLINE (Ovid), EMBASE (Ovid), and Web of Science for full-text articles published from January 2005 to June 2022. We included full-text primary research articles that used TI-RADS and/or ATA guidelines to evaluate thyroid nodules in adults. Included studies had to report inter-rater reliability using a validated metric. The Quality Appraisal for Reliability Studies (QAREL) tool was used to assess study quality. We planned a random-effects meta-analysis, in addition to covariate and publication bias analyses. This study was performed in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and was registered before it began (International Prospective Register of Systematic Reviews, PROSPERO: CRD42021275072).
RESULTS
Of the 951 articles identified via the database search, 35 met eligibility criteria. All studies were observational. The most commonly used criteria were the American College of Radiology (ACR) TI-RADS and the ATA criteria, and most studies reported κ statistics. For ACR TI-RADS, the pooled κ was 0.51 (95% confidence interval [CI]: 0.42, 0.57; n = 7), while for ATA, the pooled κ was 0.52 (95% CI: 0.37, 0.67; n = 3). Because of the small number of studies, neither covariate nor publication bias analyses were performed.
CONCLUSION
Ultrasound criteria demonstrate moderate inter-rater reliability, but these findings are impacted by poor study quality and a lack of standardization.