Benedetti A, Levis B, Rücker G, Jones HE, Schumacher M, Ioannidis JPA, Thombs B. An empirical comparison of three methods for multiple cutoff diagnostic test meta-analysis of the Patient Health Questionnaire-9 (PHQ-9) depression screening tool using published data vs individual level data. Res Synth Methods 2020; 11:833-848. [PMID: 32896096 DOI: 10.1002/jrsm.1443]
[Received: 08/05/2019] [Revised: 07/10/2020] [Accepted: 08/07/2020]
Abstract
Selective cutoff reporting in primary diagnostic accuracy studies with continuous or ordinal data may result in biased estimates when meta-analyzing studies. Collecting individual participant data (IPD) and estimating accuracy across all relevant cutoffs for all studies can overcome such bias but is labour intensive. We meta-analyzed the diagnostic accuracy of the Patient Health Questionnaire-9 (PHQ-9) depression screening tool. We compared results for two statistical methods proposed by Steinhauser and by Jones to account for missing cutoffs, with results from a series of bivariate random effects models (BRM) estimated separately at each cutoff. We applied the methods to a dataset that contained information only on cutoffs that were reported in the primary publications and to the full IPD dataset that contained information for all cutoffs for every study. For each method, we estimated pooled sensitivity and specificity and associated 95% confidence intervals for each cutoff and area under the curve (AUC). The full IPD dataset comprised data from 45 studies, 15 020 subjects, and 1972 cases of major depression and included information on every possible cutoff. When using data available in publications, using statistical approaches outperformed the BRM applied to the same data. AUC was similar for all approaches when using the full IPD dataset, though pooled estimates were slightly different. Overall, using statistical methods to fill in missing cutoff data recovered the receiver operating characteristic (ROC) curve from the full IPD dataset well when using only the published subset. All methods performed similarly when applied to the full IPD dataset.