Abstract
BACKGROUND
To inform recommendations by the Canadian Task Force on Preventive Health Care, we reviewed evidence on the benefits, harms, and acceptability of screening and treatment, and on the accuracy of risk prediction tools for the primary prevention of fragility fractures among adults aged 40 years and older in primary care.
METHODS
For screening effectiveness, accuracy of risk prediction tools, and treatment benefits, our search methods involved integrating studies published up to 2016 from an existing systematic review. Then, to locate more recent studies and any evidence relating to acceptability and treatment harms, we searched online databases (2016 to April 4, 2022 [screening] or to June 1, 2021 [predictive accuracy]; 1995 to June 1, 2021, for acceptability; 2016 to March 2, 2020, for treatment benefits; 2015 to June 24, 2020, for treatment harms), trial registries and gray literature, and hand-searched reviews, guidelines, and the included studies. Two reviewers selected studies, extracted results, and appraised risk of bias, with disagreements resolved by consensus or a third reviewer. The overview of reviews on treatment harms relied on one reviewer, with verification of data by another reviewer to correct errors and omissions. When appropriate, study results were pooled using random effects meta-analysis; otherwise, findings were described narratively. Evidence certainty was rated according to the GRADE approach.
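To illustrate what pooling with a random effects model involves, the following is a minimal sketch using the DerSimonian-Laird estimator. The abstract does not state which estimator or software the review used, so the estimator choice, the function name `pool_random_effects`, and the input values are assumptions made purely for illustration; they are not study data.

```python
import math

def pool_random_effects(effects, variances):
    """Pool study effects with a DerSimonian-Laird random effects model.

    effects   -- per-study effect estimates (e.g., log risk ratios)
    variances -- per-study sampling variances of those estimates
    Returns (pooled_effect, standard_error, tau_squared).
    """
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q measures observed dispersion around the fixed estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1

    # Between-study variance (tau^2), truncated at zero
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau^2 to each study's variance
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2

# Hypothetical inputs, not taken from the review
pooled, se, tau2 = pool_random_effects([0.0, 1.0], [0.01, 0.01])
```

When the studies disagree more than their sampling variances would predict, `tau2` becomes positive and the study weights become more even, which widens the confidence interval around the pooled estimate.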
RESULTS
We included 4 randomized controlled trials (RCTs) and 1 controlled clinical trial (CCT) for the benefits and harms of screening, 1 RCT for comparative benefits and harms of different screening strategies, 32 validation cohort studies for the calibration of risk prediction tools (26 of these reporting on the Fracture Risk Assessment Tool without [i.e., clinical FRAX], or with the inclusion of bone mineral density (BMD) results [i.e., FRAX + BMD]), 27 RCTs for the benefits of treatment, 10 systematic reviews for the harms of treatment, and 12 studies for the acceptability of screening or initiating treatment. In females aged 65 years and older who are willing to independently complete a mailed fracture risk questionnaire (referred to as "selected population"), 2-step screening using a risk assessment tool with or without measurement of BMD probably (moderate certainty) reduces the risk of hip fractures (3 RCTs and 1 CCT, n = 43,736, absolute risk difference [ARD] = 6.2 fewer in 1000, 95% CI 9.0-2.8 fewer, number needed to screen [NNS] = 161) and clinical fragility fractures (3 RCTs, n = 42,009, ARD = 5.9 fewer in 1000, 95% CI 10.9-0.8 fewer, NNS = 169). It probably does not reduce all-cause mortality (2 RCTs and 1 CCT, n = 26,511, ARD = no difference in 1000, 95% CI 7.1 fewer to 5.3 more) and may (low certainty) not affect health-related quality of life. Benefits for fracture outcomes were not replicated in an offer-to-screen population where the rate of response to mailed screening questionnaires was low. For females aged 68-80 years, population screening may not reduce the risk of hip fractures (1 RCT, n = 34,229, ARD = 0.3 fewer in 1000, 95% CI 4.2 fewer to 3.9 more) or clinical fragility fractures (1 RCT, n = 34,229, ARD = 1.0 fewer in 1000, 95% CI 8.0 fewer to 6.0 more) over 5 years of follow-up. The evidence for serious adverse events among all patients and for all outcomes among males and younger females (<65 years) is very uncertain.
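The NNS values above follow from the ARD values reported per 1000 by taking a reciprocal. The sketch below shows that arithmetic; the function name `number_needed` is hypothetical, and the assumption that the published figures were derived from unrounded ARDs (which would explain small rounding differences) is ours rather than something the abstract states.

```python
def number_needed(ard_per_1000):
    """Convert an absolute risk difference per 1000 people into the number
    needed to screen, treat, or harm (the reciprocal of the risk difference)."""
    if ard_per_1000 == 0:
        raise ValueError("Number needed is undefined when the ARD is zero")
    return 1000.0 / abs(ard_per_1000)

# ARDs reported above for 2-step screening in the selected population
nns_hip = number_needed(6.2)        # hip fractures, reported NNS = 161
nns_fragility = number_needed(5.9)  # clinical fragility fractures, reported NNS = 169
```

The same reciprocal applies to the NNT and NNH values reported for treatment, with the ARD read as "fewer" events for benefits and "more" events for harms.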
We defined overdiagnosis as the identification of high risk in individuals who, if not screened, would never have known that they were at risk and would never have experienced a fragility fracture. This was not directly reported in any of the trials. Estimates using data available in the trials suggest that among "selected" females offered screening, 12% of those meeting age-specific treatment thresholds based on clinical FRAX 10-year hip fracture risk, and 19% of those meeting thresholds based on clinical FRAX 10-year major osteoporotic fracture risk, may be overdiagnosed as being at high risk of fracture. Of those identified as being at high clinical FRAX 10-year hip fracture risk and who were referred for BMD assessment, 24% may be overdiagnosed. One RCT (n = 9268) provided evidence comparing 1-step to 2-step screening among postmenopausal females, but the evidence from this trial was very uncertain. For the calibration of risk prediction tools, evidence from three Canadian studies (n = 67,611) without serious risk of bias concerns indicates that clinical FRAX-Canada may be well calibrated for the 10-year prediction of hip fractures (observed-to-expected fracture ratio [O:E] = 1.13, 95% CI 0.74-1.72, I2 = 89.2%), and is probably well calibrated for the 10-year prediction of clinical fragility fractures (O:E = 1.10, 95% CI 1.01-1.20, I2 = 50.4%), both leading to some underestimation of the observed risk. Data from these same studies (n = 61,156) showed that FRAX-Canada with BMD may perform poorly to estimate 10-year hip fracture risk (O:E = 1.31, 95% CI 0.91-2.13, I2 = 92.7%), but is probably well calibrated for the 10-year prediction of clinical fragility fractures, with some underestimation of the observed risk (O:E 1.16, 95% CI 1.12-1.20, I2 = 0%). 
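As a reading aid for the calibration results, the sketch below shows how an observed-to-expected ratio is formed and interpreted: a ratio above 1 means more fractures were observed than the tool predicted, so the tool underestimates risk. The function names and the simple rule based on whether the confidence interval excludes 1 are illustrative assumptions, not the review's criteria for judging a tool "well calibrated".

```python
def oe_ratio(observed_events, expected_events):
    """Observed-to-expected ratio: fractures that occurred divided by
    fractures the risk tool predicted for the same cohort."""
    if expected_events <= 0:
        raise ValueError("Expected events must be positive")
    return observed_events / expected_events

def calibration_direction(ratio, ci_low, ci_high):
    """Return (direction, ci_excludes_one) for a pooled O:E estimate."""
    if ratio > 1.0:
        direction = "underestimates observed risk"
    elif ratio < 1.0:
        direction = "overestimates observed risk"
    else:
        direction = "matches observed risk"
    ci_excludes_one = ci_low > 1.0 or ci_high < 1.0
    return direction, ci_excludes_one

# Pooled clinical FRAX-Canada estimates reported above
hip = calibration_direction(1.13, 0.74, 1.72)
fragility = calibration_direction(1.10, 1.01, 1.20)
```

Both pooled estimates sit above 1, consistent with the stated underestimation, but only the interval for clinical fragility fractures excludes 1.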
The Canadian Association of Radiologists and Osteoporosis Canada Risk Assessment (CAROC) tool may be well calibrated to predict a category of risk for 10-year clinical fractures (low, moderate, or high risk; 1 study, n = 34,060). The evidence for most other tools was limited, or in the case of FRAX tools calibrated for countries other than Canada, very uncertain due to serious risk of bias concerns and large inconsistency in findings across studies.
For postmenopausal females in a primary prevention population defined as <50% prevalence of prior fragility fracture (median 16.9%, range 0 to 48% when reported in the trials) and at risk of fragility fracture, treatment with bisphosphonates as a class (median 2 years, range 1-6 years) probably reduces the risk of clinical fragility fractures (19 RCTs, n = 22,482, ARD = 11.1 fewer in 1000, 95% CI 15.0-6.6 fewer, [number needed to treat for an additional beneficial outcome] NNT = 90), and may reduce the risk of hip fractures (14 RCTs, n = 21,038, ARD = 2.9 fewer in 1000, 95% CI 4.6-0.9 fewer, NNT = 345) and clinical vertebral fractures (11 RCTs, n = 8921, ARD = 10.0 fewer in 1000, 95% CI 14.0-3.9 fewer, NNT = 100); it may not reduce all-cause mortality. There is low certainty evidence of little-to-no reduction in hip fractures with any individual bisphosphonate, but all provided evidence of decreased risk of clinical fragility fractures (moderate certainty for alendronate [NNT=68] and zoledronic acid [NNT=50], low certainty for risedronate [NNT=128]) among postmenopausal females. Evidence for an impact on risk of clinical vertebral fractures is very uncertain for alendronate and risedronate; zoledronic acid may reduce the risk of this outcome (4 RCTs, n = 2367, ARD = 18.7 fewer in 1000, 95% CI 25.6-6.6 fewer, NNT = 54) for postmenopausal females.
Denosumab probably reduces the risk of clinical fragility fractures (6 RCTs, n = 9473, ARD = 9.1 fewer in 1000, 95% CI 12.1-5.6 fewer, NNT = 110) and clinical vertebral fractures (4 RCTs, n = 8639, ARD = 16.0 fewer in 1000, 95% CI 18.6-12.1 fewer, NNT=62), but may make little-to-no difference in the risk of hip fractures among postmenopausal females. Denosumab probably makes little-to-no difference in the risk of all-cause mortality or health-related quality of life among postmenopausal females. Evidence in males is limited to two trials (1 zoledronic acid, 1 denosumab); in this population, zoledronic acid may make little-to-no difference in the risk of hip or clinical fragility fractures, and evidence for all-cause mortality is very uncertain. The evidence for treatment with denosumab in males is very uncertain for all fracture outcomes (hip, clinical fragility, clinical vertebral) and all-cause mortality. There is moderate certainty evidence that treatment causes a small number of patients to experience a non-serious adverse event, notably non-serious gastrointestinal events (e.g., abdominal pain, reflux) with alendronate (50 RCTs, n = 22,549, ARD = 16.3 more in 1000, 95% CI 2.4-31.3 more, [number needed to treat for an additional harmful outcome] NNH = 61) but not with risedronate; influenza-like symptoms with zoledronic acid (5 RCTs, n = 10,695, ARD = 142.5 more in 1000, 95% CI 105.5-188.5 more, NNH = 7); and non-serious gastrointestinal adverse events (3 RCTs, n = 8454, ARD = 64.5 more in 1000, 95% CI 26.4-13.3 more, NNH = 16), dermatologic adverse events (3 RCTs, n = 8454, ARD = 15.6 more in 1000, 95% CI 7.6-27.0 more, NNH = 64), and infections (any severity; 4 RCTs, n = 8691, ARD = 1.8 more in 1000, 95% CI 0.1-4.0 more, NNH = 556) with denosumab. 
For serious adverse events overall and specific to stroke and myocardial infarction, treatment with bisphosphonates probably makes little-to-no difference; evidence for other specific serious harms was less certain or not available. There was low certainty evidence for an increased risk for the rare occurrence of atypical femoral fractures (0.06 to 0.08 more in 1000) and osteonecrosis of the jaw (0.22 more in 1000) with bisphosphonates (most evidence for alendronate). The evidence for these rare outcomes and for rebound fractures with denosumab was very uncertain. Younger (lower risk) females have high willingness to be screened. A minority of postmenopausal females at increased risk for fracture may accept treatment. Further, there is large heterogeneity in the level of risk at which patients may be accepting of initiating treatment, and treatment effects appear to be overestimated.
CONCLUSION
An offer of 2-step screening with risk assessment and BMD measurement to selected postmenopausal females with low prevalence of prior fracture probably results in a small reduction in the risk of clinical fragility fracture and hip fracture compared to no screening. These findings were most applicable to the use of clinical FRAX for risk assessment and were not replicated in the offer-to-screen population where the rate of response to mailed screening questionnaires was low. Limited direct evidence on the harms of screening was available; estimates derived from study data suggest that there may be a moderate degree of overdiagnosis of high risk for fracture to consider. The evidence for younger females and males is very limited. The benefits of screening and treatment need to be weighed against the potential for harm; patient views on the acceptability of treatment are highly variable.
SYSTEMATIC REVIEW REGISTRATION
International Prospective Register of Systematic Reviews (PROSPERO): CRD42019123767.