1
|
Naji L, Dennis B, Rodrigues M, Bawor M, Hillmer A, Chawar C, Deck E, Worster A, Paul J, Thabane L, Samaan Z. Assessing fragility of statistically significant findings from randomized controlled trials assessing pharmacological therapies for opioid use disorders: a systematic review. Trials 2024; 25:286. [PMID: 38678289 PMCID: PMC11055220 DOI: 10.1186/s13063-024-08104-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2022] [Accepted: 04/10/2024] [Indexed: 04/29/2024] Open
Abstract
BACKGROUND The fragility index is a statistical measure of the robustness or "stability" of a statistically significant result. It has been adapted to assess the robustness of statistically significant outcomes from randomized controlled trials. By hypothetically switching some non-responders to responders, for instance, this metric measures how many individuals would need to have responded for a statistically significant finding to become non-statistically significant. The purpose of this study is to assess the fragility index of randomized controlled trials evaluating opioid substitution and antagonist therapies for opioid use disorder. This will provide an indication as to the robustness of trials in the field and the confidence that should be placed in the trials' outcomes, potentially identifying ways to improve clinical research in the field. This is especially important as opioid use disorder has become a global epidemic, and the incidence of opioid-related fatalities has climbed 500% in the past two decades. METHODS Six databases were searched from inception to September 25, 2021, for randomized controlled trials evaluating opioid substitution and antagonist therapies for opioid use disorder, and meeting the necessary requirements for fragility index calculation. Specifically, we included all parallel arm or two-by-two factorial design RCTs that assessed the effectiveness of any opioid substitution and antagonist therapies using a binary primary outcome and reported a statistically significant result. The fragility index of each study was calculated using methods described by Walsh and colleagues. The risk of bias of included studies was assessed using the Revised Cochrane Risk of Bias tool for randomized trials. RESULTS Ten studies with a median sample size of 82.5 (interquartile range (IQR) 58, 179, range 52-226) were eligible for inclusion. 
Overall risk of bias was deemed to be low in seven studies, have some concerns in two studies, and be high in one study. The median fragility index was 7.5 (IQR 4, 12, range 1-26). CONCLUSIONS Our results suggest that approximately eight participants are needed to overturn the conclusions of the majority of trials in opioid use disorder. Future work should focus on maximizing transparency in reporting of study results, by reporting confidence intervals, fragility indexes, and emphasizing the clinical relevance of findings. TRIAL REGISTRATION PROSPERO CRD42013006507. Registered on November 25, 2013.
Affiliation(s)
- Leen Naji
- Department of Family Medicine, David Braley Health Sciences Centre, McMaster University, 100 Main St W, 3rd Floor, Hamilton, ON, L8P 1H6, Canada.
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada.
- Department of Medicine, Montefiore Medical Center, New York, NY, USA.
| | - Brittany Dennis
- Department of Medicine, McMaster University, Hamilton, ON, Canada
- Department of Medicine, University of British Columbia, Vancouver, Canada
| | - Myanca Rodrigues
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
| | - Monica Bawor
- Department of Medicine, Imperial College Healthcare NHS Trust, London, UK
| | - Alannah Hillmer
- Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, ON, Canada
| | - Caroul Chawar
- Physician Assistant Program, University of Toronto, Toronto, ON, Canada
| | - Eve Deck
- Department of Family Medicine, Western University, London, ON, Canada
| | - Andrew Worster
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - James Paul
- Department of Anesthesia, McMaster University, Hamilton, ON, Canada
| | - Lehana Thabane
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Biostatistics Unit, Research Institute at St Joseph's Healthcare, Hamilton, ON, Canada
| | - Zainab Samaan
- Department of Health Research Methods, Evidence, and Impact, McMaster University, Hamilton, ON, Canada
- Department of Psychiatry and Behavioral Neurosciences, McMaster University, Hamilton, ON, Canada
| |
|
2
|
Sudah SY, Moverman MA, Masood R, Mojica ES, Pagani NR, Puzzitiello RN, Menendez ME, Salzler MJ. The Majority of Sports Medicine and Arthroscopy-Related Randomized Controlled Trials Reporting Nonsignificant Results Are Statistically Fragile. Arthroscopy 2023; 39:2071-2083.e1. [PMID: 36868530 DOI: 10.1016/j.arthro.2023.02.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 07/19/2022] [Revised: 02/14/2023] [Accepted: 02/16/2023] [Indexed: 03/05/2023]
Abstract
PURPOSE To evaluate the robustness of sports medicine and arthroscopy-related randomized controlled trials (RCTs) reporting nonsignificant results by calculating the reverse fragility index (RFI) and reverse fragility quotient (RFQ). METHODS All sports medicine and arthroscopy-related RCTs from January 1, 2010, through August 3, 2021, were identified. Randomized controlled trials comparing dichotomous variables with a reported P value ≥ .05 were included. Study characteristics, such as publication year and sample size, as well as loss to follow-up and number of outcome events were recorded. The RFI at a threshold of P < .05 and respective RFQ were calculated for each study. Coefficients of determination were calculated to determine the relationships between RFI and the number of outcome events, sample size, and number of patients lost to follow-up. The number of RCTs in which the loss to follow-up was greater than the RFI was determined. RESULTS Fifty-four studies and 4,638 patients were included in this analysis. The mean sample size and loss to follow-up were 85.9 patients and 12.5 patients, respectively. The mean RFI was 3.7, signifying that a change of 3.7 events in one arm was needed to flip the results of the study from non-significant to significant (P < .05). Of the 54 studies investigated, 33 (61%) had a loss to follow-up greater than their calculated RFI. The mean RFQ was 0.05. Significant correlations were found between RFI and sample size (R² = 0.10, P = .02) and between RFI and the total number of observed events (R² = 0.13, P < .01). No significant correlation existed between RFI and loss to follow-up in the lesser arm (R² = 0.01, P = .41). CONCLUSIONS The RFI and RFQ are statistical tools that allow the fragility of studies reporting nonsignificant results to be appraised. Using this methodology, we found that the majority of sports medicine and arthroscopy-related RCTs reporting nonsignificant results are fragile. 
CLINICAL RELEVANCE RFI and RFQ serve as tools that can be used to assess the validity of RCT results and provide additional context for appropriate conclusions.
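As an illustration of the reverse fragility index described in this abstract, the following is a minimal Python sketch and not the authors' code. It assumes a two-sided Fisher exact test at P < .05, assumes that events are converted to non-events in the arm with fewer events until the result becomes significant, and takes the RFQ to be the RFI divided by the total sample size. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def reverse_fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Event -> non-event switches needed to make a nonsignificant result significant.

    Returns None if the result is already significant, or if significance
    cannot be reached by changing only the arm with fewer events.
    """
    # Assumption: only the arm with fewer events is modified.
    if events_a <= events_b:
        e_lo, n_lo, e_hi, n_hi = events_a, n_a, events_b, n_b
    else:
        e_lo, n_lo, e_hi, n_hi = events_b, n_b, events_a, n_a

    def p_value(e):
        return fisher_exact_two_sided(e, n_lo - e, e_hi, n_hi - e_hi)

    if p_value(e_lo) < alpha:
        return None
    for rfi in range(1, e_lo + 1):
        if p_value(e_lo - rfi) < alpha:
            return rfi
    return None


def reverse_fragility_quotient(rfi, total_sample_size):
    """RFI expressed as a fraction of the total sample size."""
    return rfi / total_sample_size


# Hypothetical trial: 10/50 events in one arm versus 12/50 in the other.
rfi = reverse_fragility_index(10, 50, 12, 50)
print(rfi, reverse_fragility_quotient(rfi, 100))
```

Comparing the returned RFI with the number of patients lost to follow-up mirrors the check reported in the abstract: when more patients are lost than the RFI, their unknown outcomes alone could have changed the conclusion.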
Affiliation(s)
- Suleiman Y Sudah
- Department of Orthopedics, Monmouth Medical Center, Long Branch, New Jersey
| | - Michael A Moverman
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts
| | - Raisa Masood
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts
| | - Edward S Mojica
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts
| | - Nicholas R Pagani
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts
| | - Richard N Puzzitiello
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts
| | - Mariano E Menendez
- Oregon Shoulder Institute at Southern Oregon Orthopedics, Medford, OR; Midwest Orthopaedics at Rush, Rush University Medical Center, Chicago, IL, U.S.A
| | - Matthew J Salzler
- Department of Orthopaedic Surgery, Tufts Medical Center, Tufts University School of Medicine, Boston, Massachusetts.
| |
|
3
|
Constant M, Trofa DP, Saltzman BM, Ahmad CS, Li X, Parisien RL. The Fragility of Statistical Significance in Patellofemoral Instability Research: A Systematic Review. Am J Sports Med 2022; 50:3714-3718. [PMID: 34633219 DOI: 10.1177/03635465211039202] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Indexed: 01/31/2023]
Abstract
BACKGROUND Fragility analysis is increasingly utilized to evaluate the robustness of results within the orthopaedic literature and has frequently revealed instability of reported outcomes. PURPOSE/HYPOTHESIS The purpose of this investigation was to utilize a fragility analysis to evaluate the stability of reported results in the patellofemoral instability (PFI) literature. We hypothesized that patellofemoral research would demonstrate significant fragility similar to that identified in other areas of the orthopaedic literature. STUDY DESIGN Systematic review; Level of evidence, 4. METHODS The PubMed database was queried from January 1, 2000, to October 10, 2020, for comparative trials in 10 prominent orthopaedic journals that reported dichotomous outcomes related to the management of PFI. The fragility index (FI) and the fragility quotient (FQ) were calculated for each individual outcome event, and the overall FI and FQ were determined for all included studies. RESULTS A total of 22 comparative studies comprising 11 randomized controlled trials and 11 nonrandomized trials were included for the analysis. A total of 75 outcome events underwent a fragility analysis and revealed a median FI and FQ of 3 (interquartile range [IQR], 1-5) and 0.043 (IQR, 0.018-0.081), respectively. In addition, 27% of included studies reported loss to follow-up greater than the overall FI, suggesting that complete follow-up could have reversed the significance of their findings. CONCLUSION The comprehensive fragility analysis demonstrated a lack of robustness in PFI research, with the alteration of only a few outcome events required to reverse statistical significance. We therefore recommend the triple reporting of the P value, the FI, and the FQ to aid in the interpretation of the statistical integrity of future comparative trials in the PFI literature.
Affiliation(s)
- Michael Constant
- Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
| | - David P Trofa
- Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
| | - Bryan M Saltzman
- OrthoCarolina Sports Medicine Center, Charlotte, North Carolina, USA
| | - Christopher S Ahmad
- Department of Orthopaedics, New York-Presbyterian Hospital, Columbia University Medical Center, New York, New York, USA
| | - Xinning Li
- Department of Orthopaedics, Boston University Medical Center, Boston, Massachusetts, USA
| | - Robert L Parisien
- Department of Orthopaedic Surgery & Sports Medicine, Mount Sinai, New York, New York, USA
| |
|
4
|
Morris SC, Gowd AK, Agarwalla A, Phipatanakul WP, Amin NH, Liu JN. Fragility of statistically significant findings from randomized clinical trials of surgical treatment of humeral shaft fractures: A systematic review. World J Orthop 2022; 13:825-836. [PMID: 36189338 PMCID: PMC9516622 DOI: 10.5312/wjo.v13.i9.825] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/26/2021] [Revised: 02/28/2022] [Accepted: 08/17/2022] [Indexed: 02/06/2023] Open
Abstract
BACKGROUND Despite recent meta-analyses of randomized controlled trials (RCTs), there remains no consensus regarding the preferred surgical treatment for humeral shaft fractures. The fragility index (FI) is an emerging tool used to evaluate the robustness of RCTs by quantifying the number of participants in a study group that would need to switch outcomes in order to reverse the study conclusions.
AIM To investigate the fragility index of randomized controlled trials assessing outcomes of operative fixation in humeral shaft fractures.
METHODS We completed a systematic review of RCTs evaluating the surgical treatment of humeral shaft fractures. Inclusion criteria included: articles published in English; patients randomized and allotted in 1:1 ratio to 2 parallel arms; and dichotomous outcome variables. The FI was calculated for total complications, each complication individually, and secondary surgeries using the Fisher exact test, as previously published.
RESULTS Fifteen RCTs were included in the analysis comparing open reduction plate osteosynthesis with dynamic compression plate or locking compression plate, intramedullary nail, and minimally invasive plate osteosynthesis. The median FI was 0 for all parameters analyzed. Regarding individual outcomes, the FI was 0 for 81/91 (89%) of outcomes. The FI exceeded the number lost to follow up in only 2/91 (2%) outcomes.
CONCLUSION The FI shows that data from RCTs regarding operative treatment of humeral shaft fractures are fragile and do not demonstrate superiority of any particular surgical technique.
Affiliation(s)
- Stephen Craig Morris
- Department of Orthopaedic Surgery, Loma Linda University, Loma Linda, CA 92354, United States
| | - Anirudh K Gowd
- Department of Orthopaedic Surgery, Wake Forest University Baptist Medical Center, Winston-Salem, NC 27157, United States
| | - Avinesh Agarwalla
- Department of Orthopaedic Surgery, Westchester Medical Center, Valhalla, NY 10595, United States
| | - Wesley P Phipatanakul
- Department of Orthopaedic Surgery, Loma Linda University, Loma Linda, CA 92354, United States
| | - Nirav H Amin
- Department of Orthopaedic Surgery, Premier Orthopaedic and Trauma Specialists, Pomona, CA 91767, United States
| | - Joseph N Liu
- Department of Orthopedic Surgery, USC Epstein Family Center for Sports Medicine, Los Angeles, CA 90089, United States
| |
|
5
|
Usman MS, Khan MS, Fonarow GC, Greene SJ, Friede T, Vaduganathan M, Filippatos G, Coats AJS, Anker SD, Butler J. Robustness of outcomes in trials evaluating sodium-glucose co-transporter 2 inhibitors for heart failure. ESC Heart Fail 2022; 9:885-893. [PMID: 35029056 PMCID: PMC8934993 DOI: 10.1002/ehf2.13785] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/28/2021] [Revised: 10/25/2021] [Accepted: 12/13/2021] [Indexed: 01/10/2023] Open
Abstract
AIMS Recent trials have evaluated sodium-glucose co-transporter 2 inhibitors in patients with heart failure (HF). We sought to assess the robustness of findings from these trials using the fragility index (FI). METHODS AND RESULTS Fragility index is defined as the minimum number of patients that must be moved from the 'non-event' to the 'event' group to turn a statistically significant result to non-significant. In addition to FI, fragility quotient [(FQ); FI divided by the sample size] was calculated to assess the proportion of events that must be moved to change the significance. For statistically non-significant outcomes, reverse fragility index (RFI) and reverse fragility quotient (RFQ) were calculated. Robustness of findings after pooling data from all three trials was also assessed. A robust reduction in first HF hospitalization or cardiovascular mortality was seen with dapagliflozin (FI = 62 and FQ = 0.013), empagliflozin (FI = 50 and FQ = 0.013), and sotagliflozin (FI = 60 and FQ = 0.049). Dapagliflozin nominally improved all-cause and cardiovascular mortality, with modest FI (n = 8 and 5) and FQ (0.002 and 0.001). Empagliflozin and sotagliflozin did not demonstrate statistically significant reductions in all-cause mortality, with modest RFI (empagliflozin: RFI = 26 and RFQ = 0.007; sotagliflozin: RFI = 6 and RFQ = 0.005). A similar trend was seen with cardiovascular mortality (empagliflozin: RFI = 24 and RFQ = 0.006; sotagliflozin: RFI = 7 and RFQ = 0.006). Upon meta-analysis, the result for first HF hospitalization or cardiovascular mortality was robust (FI = 95 and FQ = 0.010). The reductions in all-cause (FI = 12 and FQ = 0.001) and cardiovascular mortality (FI = 9 and FQ = 0.001), while statistically significant, were fragile. CONCLUSION Improvement in the composite outcome of first HF hospitalization or cardiovascular death was highly concordant and robust across sodium-glucose co-transporter 2 inhibitor trials. 
In contrast, secondary endpoints of all-cause and cardiovascular mortality were statistically fragile, underscoring the need to power trials for mortality to fully understand the benefit of therapies on fatal events.
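The definitions of FI and FQ given in this abstract can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' code: it assumes a two-sided Fisher exact test at a 0.05 threshold and assumes that patients are moved from "non-event" to "event" in the arm with fewer events. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Minimum non-event -> event switches that make a significant result non-significant.

    Returns None if the starting result is not significant, or if
    significance survives even when every patient in the modified arm
    has an event.
    """
    # Assumption: switches are made in the arm with fewer events.
    if events_a <= events_b:
        e_lo, n_lo, e_hi, n_hi = events_a, n_a, events_b, n_b
    else:
        e_lo, n_lo, e_hi, n_hi = events_b, n_b, events_a, n_a

    def p_value(e):
        return fisher_exact_two_sided(e, n_lo - e, e_hi, n_hi - e_hi)

    if p_value(e_lo) >= alpha:
        return None
    fi = 0
    while e_lo + fi < n_lo and p_value(e_lo + fi) < alpha:
        fi += 1
    return fi if p_value(e_lo + fi) >= alpha else None


def fragility_quotient(fi, total_sample_size):
    """FI divided by the total sample size."""
    return fi / total_sample_size


# Hypothetical trial: 1/50 events in one arm versus 10/50 in the other.
fi = fragility_index(1, 50, 10, 50)
print(fi, fragility_quotient(fi, 100))
```

Dividing by the sample size is what lets the abstract call an FI of 8 in a large trial "modest": the same absolute number of switched patients is a far smaller fraction of a large trial than of a small one.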
Affiliation(s)
| | | | - Gregg C Fonarow
- Division of Cardiology, Ronald Reagan UCLA Medical Center, Los Angeles, CA, USA
| | - Stephen J Greene
- Division of Cardiology, Duke University Medical Center, Durham, NC, USA
| | - Tim Friede
- Department of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany; DZHK (German Centre for Cardiovascular Research), partner site Göttingen, Göttingen, Germany
| | | | - Gerasimos Filippatos
- National and Kapodistrian University of Athens School of Medicine, Athens University Hospital Attikon, Athens, Greece
| | - Andrew J Stewart Coats
- Department of Cardiology, IRCCS San Raffaele Pisana, Rome, Italy; University of Warwick, Coventry, UK
| | - Stefan D Anker
- Department of Cardiology (CVK); and Berlin Institute of Health Center for Regenerative Therapies (BCRT); DZHK (German Centre for Cardiovascular Research), partner site Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Javed Butler
- Department of Medicine, University of Mississippi, Jackson, MS, USA
| |
|
6
|
Parisien RL, Constant M, Saltzman BM, Popkin CA, Ahmad CS, Li X, Trofa DP. The Fragility of Statistical Significance in Cartilage Restoration of the Knee: A Systematic Review of Randomized Controlled Trials. Cartilage 2021; 13:147S-155S. [PMID: 33969744 PMCID: PMC8808853 DOI: 10.1177/19476035211012458] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 11/16/2022] Open
Abstract
OBJECTIVE The purpose of this study was to utilize fragility analysis to assess the robustness of randomized controlled trials (RCTs) evaluating the management of articular cartilage defects of the knee. We hypothesize that the cartilage restorative literature will be fragile with the reversal of only a few outcome events required to change statistical significance. DESIGN RCTs from 11 orthopedic journals indexed on PubMed from 2000 to 2020 reporting dichotomous outcome measures relating to the management of articular cartilage defects of the knee were included. The Fragility Index (FI) for each outcome was calculated through the iterative reversal of a single outcome event until significance was reversed. The Fragility Quotient (FQ) was calculated by dividing each FI by study sample size. Additional statistical analysis was performed to provide median FI and FQ across subgroups. RESULTS Nineteen RCTs containing 60 dichotomous outcomes were included for analysis. The median FI and FQ of all outcomes were 4 (IQR 2-7) and 0.067 (IQR 0.034-0.096), respectively. The average number of patients lost to follow-up (LTF) was 3.9 patients, with 15.8% of the included studies reporting LTF greater than or equal to 4, the FI of all included outcomes. CONCLUSIONS The orthopedic literature evaluating articular cartilage defects of the knee is fragile as the reversal of relatively few outcome events may alter the significance of statistical findings. We therefore recommend comprehensive fragility analysis and triple reporting of the P value, FI, and FQ to aid in the interpretation and contextualization of clinical findings reported in the cartilage restoration literature.
Affiliation(s)
- Robert L. Parisien
- Department of Orthopaedics, Harvard Medical School & Boston Children’s Hospital, Boston, MA, USA
| | - Michael Constant
- Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
| | - Bryan M. Saltzman
- Ortho Carolina, Sports Medicine, Knee & Shoulder/Elbow, Charlotte, NC, USA
| | - Charles A. Popkin
- Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
| | - Christopher S. Ahmad
- Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
| | - Xinning Li
- Department of Orthopaedics, Boston University Medical Center, Boston, MA, USA
| | - David P. Trofa
- Department of Orthopaedics, Columbia University Irving Medical Center, New York, NY, USA
| |
|
7
|
Huang X, Chen B, Thabane L, Adachi JD, Li G. Fragility of results from randomized controlled trials supporting the guidelines for the treatment of osteoporosis: a retrospective analysis. Osteoporos Int 2021; 32:1713-1723. [PMID: 33595680 DOI: 10.1007/s00198-021-05865-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/13/2020] [Accepted: 01/29/2021] [Indexed: 12/11/2022]
Abstract
UNLABELLED This is the first report on the fragility of results from randomized controlled trials (RCTs) for the treatment of osteoporosis. The results of these RCTs appear to depend on a small number of events and are generally statistically fragile. INTRODUCTION Osteoporosis remains a health concern worldwide. Evidence-based guideline recommendations that are mainly based on results of clinical trials are important to clinical decision-making. The fragility index (FI) is a novel statistical metric to measure the fragility of results from an RCT. Our study aimed to analyze the fragility of the clinical trials referenced in the guidelines for the treatment of osteoporosis. METHODS Trials were included if they investigated primary osteoporosis, randomized patients to treatment or control in a 1:1 design, and reported fracture outcome as the primary endpoint. The FI and fragility quotient (FQ) were calculated for assessing the robustness of results from the eligible RCTs. The FI was defined as the minimum number of events in the intervention group that needs to change from a non-event to an event in order to render a significant result non-significant (or vice versa). The FQ was calculated by dividing the FI by the sample size of the trial. RESULTS Of the 372 RCTs identified from the guidelines, 42 were eligible for analyses. Their median FI was 10 (25th-75th percentile [Q1-Q3]: 4-18), with a median FQ of 0.007 (Q1-Q3: 0.0017-0.019). Approximately one third of the RCTs had an FI of less than or equal to 5. There were 17 (40.5%) trials where the number of patients lost to follow-up was greater than the FI. The FI was significantly associated with sample size, journal impact factor, and the percent of patients lost to follow-up. CONCLUSION Results from some RCTs supporting guideline recommendations for the treatment of osteoporosis depend on a small number of events. 
The FI and FQ may provide additional, intuitive metrics to help interpret the robustness of trial results.
Affiliation(s)
- X Huang
- Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China
| | - B Chen
- Department of Endocrinology, Guangdong Second Provincial General Hospital, Guangzhou, China
| | - L Thabane
- Department of Health Research Methods, Evidence, and Impact (HEI), McMaster University, 1280 Main St West, Hamilton, ON, L8S 4L8, Canada
| | - J D Adachi
- Department of Medicine, McMaster University, Hamilton, ON, Canada
| | - G Li
- Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou, China.
- Department of Health Research Methods, Evidence, and Impact (HEI), McMaster University, 1280 Main St West, Hamilton, ON, L8S 4L8, Canada.
| |
|
8
|
Wayant C, Tritz D, Horn J, Crow M, Vassar M. Evaluation of Risks of Bias in Addiction Medicine Randomized Controlled Trials. Alcohol Alcohol 2021; 56:284-290. [PMID: 32808009 DOI: 10.1093/alcalc/agaa074] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/20/2019] [Revised: 07/06/2020] [Accepted: 07/07/2020] [Indexed: 01/11/2023] Open
Abstract
AIMS Perhaps the most important step when designing and conducting randomized controlled trials (RCTs) in addiction is to put methodological safeguards in place to minimize the likelihood for bias to affect trial outcomes. In this study, we applied the revised Cochrane risk of bias tool (ROB 2) to RCTs of drug, alcohol or tobacco interventions. METHODS We searched for trials published in 15 addiction medicine journals over a 7-year period. Our primary endpoint is the risk of bias of included studies. We conducted a sensitivity analysis of publicly funded trials. RESULTS Overall, included RCTs were most often at high risk of bias per our judgments (244/487, 50.1%). However, significant proportions of included RCTs were at low risk of bias (123/487, 25.3%) or some concerns for bias (120/487, 24.6%). RCTs with behavioral modification interventions (19/44, 43.2%) and alcohol interventions (80/150, 53.3%) had the highest proportion of high-risk judgments. In a sensitivity analysis of publicly funded RCTs, 195/386 (50.5%) were at high risk of bias. CONCLUSIONS Approximately half of included drug, alcohol or tobacco RCTs in our sample were judged to be at high risk of bias with the most common reason being a lack of proper blinding or proper description of blinding. Key action items to reduce bias in future addiction RCTs include adequate randomization, blinding and inclusion of a trial registry number and protocol.
Affiliation(s)
- Cole Wayant
- Oklahoma State University Center for Health Sciences, Tulsa, OK 74104, USA
| | - Daniel Tritz
- Oklahoma State University Center for Health Sciences, Tulsa, OK 74104, USA
| | - Jarryd Horn
- Oklahoma State University Center for Health Sciences, Tulsa, OK 74104, USA
| | - Matt Crow
- Oklahoma State University Center for Health Sciences, Tulsa, OK 74104, USA
| | - Matt Vassar
- Oklahoma State University Center for Health Sciences, Tulsa, OK 74104, USA
| |
|
9
|
Zhang H, Li J, Zeng W. Frequent fragility of randomized controlled trials for HCC treatment. BMC Cancer 2021; 21:389. [PMID: 33836710 PMCID: PMC8034173 DOI: 10.1186/s12885-021-08133-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2019] [Accepted: 12/16/2020] [Indexed: 11/18/2022] Open
Abstract
Background The fragility index (FI) of trial results can provide a measure of confidence in the positive effects reported in randomized controlled trials (RCTs). The aim of this study was to calculate the FI of RCTs supporting HCC treatments. Methods A methodological systematic review of RCTs in HCC treatments was conducted. Two-arm studies with randomized and positive results for a time-to-event outcome were eligible for the FI calculation. Results A total of 6 trials were included in this analysis. The median FI was 0.5 (IQR 0–10). FI was ≤7 in 4 (66.7%) of 6 trials; in those trials the fragility quotient was ≤1%. Conclusion Many phase 3 RCTs supporting HCC treatments have a low FI, which challenges the confidence in concluding the superiority of these drugs over control treatments. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-021-08133-8.
Affiliation(s)
- Hao Zhang
- Department of Infectious Diseases, The Key Discipline of Guangdong Province, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou Medical University, #151 Yanjiang Road, Guangzhou, 510120, Guangdong Province, China
| | - Jingtao Li
- Department of liver diseases (I), The Hospital Affiliated to Shaanxi University of Chinese Medicine, Xianyang, 712000, Shaanxi Province, China.
| | - Wenting Zeng
- Department of Infectious Diseases, The Key Discipline of Guangdong Province, The First Affiliated Hospital of Guangzhou Medical University, Guangzhou Medical University, #151 Yanjiang Road, Guangzhou, 510120, Guangdong Province, China.
| |
|
10
|
Li G, Walter SD, Thabane L. Shifting the focus away from binary thinking of statistical significance and towards education for key stakeholders: revisiting the debate on whether it's time to de-emphasize or get rid of statistical significance. J Clin Epidemiol 2021; 137:104-112. [PMID: 33839240 DOI: 10.1016/j.jclinepi.2021.03.033] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/10/2020] [Revised: 03/03/2021] [Accepted: 03/10/2021] [Indexed: 01/01/2023]
Abstract
There has been a long-standing controversy among scientists regarding the appropriate use of P-values and statistical significance in clinical research. This debate has resurfaced through recent calls to modify the threshold of P-value required to declare significance, or to retire statistical significance entirely. In this article, we revisit the issue by discussing: i) the connection between statistical thinking and evidence-based practice; ii) some history of statistical significance and P-values; iii) some practical challenges with statistical significance or P-value thresholds in clinical research; iv) the on-going debate on what to do with statistical significance; v) suggestions to shift the focus away from binary thinking of statistical significance and towards education for key stakeholders on research essentials including statistical thinking, critical thinking, good reporting, basic clinical research concepts and methods, and more. We then conclude with remarks and illustrations of the potential deleterious public health consequences of poor methods including selective choice of analysis approach and misguided reliance on binary use of P-values to report and interpret scientific findings.
Affiliation(s)
- Guowei Li
- Center for Clinical Epidemiology and Methodology (CCEM), Guangdong Second Provincial General Hospital, Guangzhou City, Guangdong Province, China 510317; Department of Health Research Methods, Evidence, and Impact (HEI), McMaster University, Hamilton, Ontario, Canada
| | - Stephen D Walter
- Department of Health Research Methods, Evidence, and Impact (HEI), McMaster University, Hamilton, Ontario, Canada
| | - Lehana Thabane
- Department of Health Research Methods, Evidence, and Impact (HEI), McMaster University, Hamilton, Ontario, Canada; Population Health Research Institute, Hamilton Health Sciences and McMaster University, Hamilton, Ontario, Canada; Father Sean O'Sullivan Research Centre, St. Joseph's Healthcare Hamilton, Hamilton, Ontario, Canada.
| |
Collapse
|
11
|
Holek M, Bdair F, Khan M, Walsh M, Devereaux P, Walter SD, Thabane L, Mbuagbaw L. Fragility of clinical trials across research fields: A synthesis of methodological reviews. Contemp Clin Trials 2020; 97:106151. [DOI: 10.1016/j.cct.2020.106151] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/13/2020] [Revised: 09/10/2020] [Accepted: 09/10/2020] [Indexed: 12/29/2022]
|
12
|
Khan MS, Fonarow GC, Friede T, Lateef N, Khan SU, Anker SD, Harrell FE, Butler J. Application of the Reverse Fragility Index to Statistically Nonsignificant Randomized Clinical Trial Results. JAMA Netw Open 2020; 3:e2012469. [PMID: 32756927 PMCID: PMC7407075 DOI: 10.1001/jamanetworkopen.2020.12469] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Indexed: 12/15/2022]
Abstract
IMPORTANCE Interpreting randomized clinical trials (RCTs) and their clinical relevance is challenging when P values are either marginally above or below the P = .05 threshold. OBJECTIVE To use the concept of reverse fragility index (RFI) to provide a measure of confidence in the neutrality of RCT results when assessed from the clinical perspective. DESIGN, SETTING, AND PARTICIPANTS In this cross-sectional study, a MEDLINE search was conducted for RCTs published from January 1, 2013, to December 31, 2018, in JAMA, the New England Journal of Medicine (NEJM), and The Lancet. Eligible studies were phase 3 and 4 trials with 1:1 randomization and statistically nonsignificant binary primary end points. Data analysis was performed from August 1, 2019, to August 31, 2019. EXPOSURES Single vs multicenter enrollment, total number of events, private vs government funding, placebo vs active control, and time to event vs frequency data. MAIN OUTCOMES AND MEASURES The primary outcome was the median RFI with interquartile range (IQR) at the P = .05 threshold. Secondary outcomes were the number of RCTs in which the number of participants lost to follow-up was greater than the RFI; the median RFI with IQR at different P value thresholds; the median reverse fragility quotient with IQR; and the correlation between sample sizes, number of events, and P values of the RCT and RFI. RESULTS Of the 167 RCTs included, 76 (46%) were published in the NEJM, 50 (30%) in JAMA, and 41 (24%) in The Lancet. The median (IQR) sample size was 970 (470-3427) participants, and the median (IQR) number of events was 251 (105-570). The median (IQR) RFI at the P = .05 threshold was 8 (5-13). Fifty-seven RCTs (34%) had an RFI of 5 or lower, and in 68 RCTs (41%) the number of participants lost to follow-up was greater than the RFI. Trials with P values ranging from P = .06 to P = .10 had a median (IQR) RFI of 3 (2-4). 
When compared, median (IQR) RFIs did not differ significantly for single-center vs multicenter enrollment (5 [4-13] vs 8 [5-13]; P = .41), private vs government-funded studies (9 [5-13] vs 8 [5-13]; P = .34), and time-to-event primary end points vs frequency data (9 [5-14] vs 7 [4-13]; P = .43). The median (IQR) RFI at the P = .01 threshold was 12 (7-19) and at the P = .005 threshold was 14 (9-21). CONCLUSIONS AND RELEVANCE This cross-sectional study found that a relatively small number of events (median of 8) had to change to move the primary end point of an RCT from nonsignificant to statistically significant. These findings emphasize the nuance required when interpreting trial results that did not meet prespecified significance thresholds.
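The reverse fragility index described in this abstract can be illustrated with a minimal Python sketch. The abstract does not spell out the computation, so everything here is an assumption for illustration: the function names `fisher_exact_p` and `reverse_fragility_index` are hypothetical, significance is judged with a two-sided Fisher exact test, and outcomes are flipped one participant at a time from event to non-event in the arm with fewer events (which widens the difference between arms) until the P value drops below the threshold.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


def reverse_fragility_index(events_a, total_a, events_b, total_b, alpha=0.05):
    """Number of event-to-non-event flips needed to reach p < alpha.

    Returns None if the result is already significant, or if significance
    is never reached. Flipping in the arm with fewer events is an assumed
    convention for this sketch.
    """
    events = [events_a, events_b]
    totals = [total_a, total_b]

    def p_value():
        return fisher_exact_p(events[0], totals[0] - events[0],
                              events[1], totals[1] - events[1])

    if p_value() < alpha:
        return None  # already significant; the RFI does not apply
    arm = 0 if events[0] <= events[1] else 1
    flips = 0
    while events[arm] > 0:
        events[arm] -= 1  # one participant switches from event to non-event
        flips += 1
        if p_value() < alpha:
            return flips
    return None


# Hypothetical trial: 20/100 events vs 25/100 events (not significant).
rfi = reverse_fragility_index(20, 100, 25, 100)
print("reverse fragility index:", rfi)
```

Passing a different `alpha` (for example 0.01 or 0.005) mirrors the alternative thresholds reported in the abstract; the counts used in the example are invented, not taken from any included trial.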
Affiliation(s)
- Gregg C. Fonarow
- Division of Cardiology, Ronald Reagan–UCLA (University of California, Los Angeles) Medical Center, Los Angeles
- Tim Friede
- Department of Medical Statistics, University Medical Center Goettingen, Goettingen, Germany
- Noman Lateef
- Department of Medicine, Creighton University, Omaha, Nebraska
- Safi U. Khan
- Department of Medicine, West Virginia University, Morgantown
- Stefan D. Anker
- Department of Cardiology, and Berlin Institute of Health Center for Regenerative Therapies, German Centre for Cardiovascular Research partner site Berlin; Charité Universitätsmedizin Berlin, Berlin, Germany
- Frank E. Harrell
- Department of Biostatistics, Vanderbilt University School of Medicine, Nashville, Tennessee
- Javed Butler
- Department of Medicine, University of Mississippi Medical Center, Jackson
|
13
|
Rickard M, Keefe DT, Drysdale E, Erdman L, Hannick JH, Milford K, Santos JD, Mistry N, Koyle MA, Lorenzo AJ. Trends and relevance in the bladder and bowel dysfunction literature: PlumX metrics contrasted with fragility indicators. J Pediatr Urol 2020; 16:477.e1-477.e7. [PMID: 32684443 DOI: 10.1016/j.jpurol.2020.06.015] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/13/2020] [Revised: 05/26/2020] [Accepted: 06/12/2020] [Indexed: 12/31/2022]
Abstract
INTRODUCTION The concepts of fragility index (FI) and fragility quotient (FQ) have been previously described. PlumX metrics encompass online "footprints" of research in addition to traditional citations. Herein we explore PlumX metrics against the quality of BBD literature. OBJECTIVE To explore altmetrics against the quality of bladder and bowel dysfunction (BBD) literature. STUDY DESIGN A literature search was conducted using PubMed, Medline, and Embase for BBD and related terms. A total of 54,045 abstracts were screened, followed by 693 full text reviews and data extraction from 126. Studies were included if they reported on 2 groups being compared, had dichotomous outcomes, and had significant results. RESULTS The median FI score was 4 (0-500), and 20 studies had an FI of 0. The FQ had a median of 0.04 (0-0.32). PlumX usage was 263 ± 540, captures were 45 ± 60 and social media attention was 2 ± 2. Overall, 42% of papers were clinical trials (RCTs). When compared to other study designs, we noted a significant difference in PlumX captures (57 ± 72 RCT vs. 35 ± 47 other; p = 0.03). RCTs had higher usage, social media engagement, and citations; however, the differences were not significant. H-Index had a significant correlation with FI (p = 0.036); however, correlations for PlumX usage and captures, while modestly positive (0.04-0.10) for the FI and FQ, were not significant. A comparison of FI and FQ by topic can be reviewed in the Summary Table. DISCUSSION When considering the FI and FQ robustness indicators of the BBD literature, we found similarities when compared to other studies. It was reported that overall, the hydronephrosis literature was fragile with many studies requiring only a few events to nullify significance, regardless of the study design. Similarly, in a review of pediatric vesicoureteral reflux (VUR) clinical trials, results were also fragile. When comparing fragility measures to altmetric variables we noted that despite the growing popularity of altmetrics, citation counts and h-indices remain the traditional measures to monitor research consumption. There has been a reported correlation between manuscript citation counts, author h-index, and altmetrics measures in several specialties and across many domains of research including medical sciences, arts, and the humanities; however, in the present study only weak correlations were noted. CONCLUSION The body of BBD comparative studies is fragile, in keeping with other pediatric urology literature populations. Despite fragile results, RCTs generate slightly more attention as measured by select PlumX metrics. These results suggest the need for including fragility measures in our literature, aiming to focus attention towards more robust articles.
Affiliation(s)
- Mandy Rickard
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada.
- Daniel T Keefe
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada
- Erik Drysdale
- Center for Computational Medicine, Hospital for Sick Children, Toronto, Ontario, Canada
- Lauren Erdman
- Center for Computational Medicine, Hospital for Sick Children, Toronto, Ontario, Canada
- Jessica H Hannick
- Division of Pediatric Urology, UH Rainbow Babies and Children's Hospital, Cleveland, OH, USA
- Karen Milford
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada
- Joana Dos Santos
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada
- Niraj Mistry
- Department of Pediatrics, Hospital for Sick Children, Toronto, Ontario, Canada
- Martin A Koyle
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada
- Armando J Lorenzo
- Division of Urology, Hospital for Sick Children and Department of Surgery, University of Toronto, Ontario, Canada
|
14
|
Van Howe RS. The Fragility Index in HIV/AIDS Trials. J Gen Intern Med 2020; 35:2204. [PMID: 31768903 PMCID: PMC7351995 DOI: 10.1007/s11606-019-05554-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2019] [Accepted: 11/07/2019] [Indexed: 11/24/2022]
Affiliation(s)
- Robert S Van Howe
- Department of Pediatrics and Human Development, Michigan State University College of Human Medicine, 413 E. Ohio Street, Marquette, MI, 49855, USA.
|
15
|
Evaluation of the fragility of pivotal trials used to support US Food and Drug Administration approval for plaque psoriasis. J Am Acad Dermatol 2020; 84:354-360. [PMID: 32320767 DOI: 10.1016/j.jaad.2020.04.057] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/04/2020] [Revised: 03/30/2020] [Accepted: 04/07/2020] [Indexed: 11/23/2022]
Abstract
BACKGROUND Over the last 5 years, there has been a rapid growth in the number of clinical trials used to support a US Food and Drug Administration (FDA) approval for systemic therapies with labeled indications for plaque psoriasis. OBJECTIVE We aim to evaluate the fragility of clinical trial data used to support FDA approval of therapies for psoriasis. METHODS We reviewed the primary endpoints of the pivotal trials of all systemic medications with a labeled indication for plaque psoriasis available from Drugs@FDA. RESULTS Sixty-nine clinical trial primary endpoints met inclusion criteria and were assessed for robustness, yielding a median fragility index of 72 and a median fragility quotient of 0.19. LIMITATIONS Efficacy and statistical analysis data for several approved medications were not available on the product label or on Drugs@FDA. CONCLUSIONS When compared with randomized controlled trials for FDA approval across various diseases, pivotal trials in psoriasis appear quite robust to changes in outcomes.
|
16
|
Khan MS, Ochani RK, Shaikh A, Usman MS, Yamani N, Khan SU, Murad MH, Mandrola J, Doukky R, Krasuski RA. Fragility Index in Cardiovascular Randomized Controlled Trials. Circ Cardiovasc Qual Outcomes 2019; 12:e005755. [PMID: 31822121 DOI: 10.1161/circoutcomes.119.005755] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Indexed: 12/27/2022]
Abstract
BACKGROUND Efficacy of an intervention is commonly evaluated using P values, in addition to effect size measures such as absolute risk reduction, relative risk reduction, and numbers needed to treat. However, these measures are not always intuitive to clinicians. The fragility index (FI) is a more intuitive number that can facilitate interpretation but can only be used with binary outcomes. FI is the minimum number of patients who must be moved from the nonevent group to the event group to turn a significant result nonsignificant. In this retrospective analysis, we assessed the robustness of cardiovascular randomized controlled trials (RCTs), which report a positive (statistically significant) primary outcome by using the FI. METHODS AND RESULTS We searched Medline from 2007 to 2017 to identify cardiovascular RCTs published in 6 high impact journals (The Lancet, New England Journal of Medicine, Journal of the American Medical Association, Circulation, Journal of the American College of Cardiology and European Heart Journal). Only RCTs with sample sizes >500 and a 2-by-2 factorial design or dichotomous primary outcomes were selected. FI was calculated using a defined approach. Among the cohort of 123 RCTs that met inclusion criteria, median FI was 13 (interquartile range, 5-26). In 28 trials (22.8%), FI ranged between 1 and 4. In 37 trials (30.1%), number of patients lost to follow-up was higher than the FI. Pharmaceutical interventions had higher FI compared with other interventions, FI=19 (7-52; P=0.002). Median FI varied according to subspecialty (electrophysiology=2; heart failure=11; interventional cardiology=8; P=0.020) and multiregional RCTs had higher FI=22 (12-53.25; P=0.023). FI did not differ based on risk of bias indicators, funding, or publication year. 
CONCLUSIONS Considerable variations in FI were observed among cardiovascular trials, suggesting the need for careful interpretation of results, particularly when number of patients lost to follow-up exceeds FI.
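The definition of the fragility index given in this abstract (the minimum number of patients moved from non-event to event that turns a significant result nonsignificant) can be sketched in Python. This is an illustrative sketch under stated assumptions, not the authors' code: the function names `fisher_exact_p` and `fragility_index` are hypothetical, significance is judged with a two-sided Fisher exact test at 0.05, and the flips are applied to the arm with fewer events.

```python
from math import comb


def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))


def fragility_index(events_a, total_a, events_b, total_b, alpha=0.05):
    """Number of non-event-to-event flips needed to reach p >= alpha.

    Returns None when the starting result is not significant. Flipping in
    the arm with fewer events is an assumed convention for this sketch.
    """
    events = [events_a, events_b]
    totals = [total_a, total_b]

    def p_value():
        return fisher_exact_p(events[0], totals[0] - events[0],
                              events[1], totals[1] - events[1])

    if p_value() >= alpha:
        return None  # the index is only defined for significant results
    arm = 0 if events[0] <= events[1] else 1
    flips = 0
    while p_value() < alpha and events[arm] < totals[arm]:
        events[arm] += 1  # one participant switches from non-event to event
        flips += 1
    return flips


# Hypothetical trial: 10/100 events vs 25/100 events (significant).
fi = fragility_index(10, 100, 25, 100)
print("fragility index:", fi)
```

The abstract's caution can be read against this output: if the number of participants lost to follow-up exceeds the value returned, their unobserved outcomes alone could in principle overturn the result. The counts in the example are invented, not taken from any included trial.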
Affiliation(s)
- Muhammad Shahzeb Khan
- Department of Internal Medicine, John H. Stroger Jr. Hospital of Cook County, Chicago, IL (M.S.K., N.Y.)
- Rohan Kumar Ochani
- Department of Internal Medicine, Dow University of Health Sciences, Karachi, Pakistan (R.K.O., A.S., M.S.U.)
- Asim Shaikh
- Department of Internal Medicine, Dow University of Health Sciences, Karachi, Pakistan (R.K.O., A.S., M.S.U.)
- Muhammad Shariq Usman
- Department of Internal Medicine, Dow University of Health Sciences, Karachi, Pakistan (R.K.O., A.S., M.S.U.)
- Naser Yamani
- Department of Internal Medicine, John H. Stroger Jr. Hospital of Cook County, Chicago, IL (M.S.K., N.Y.)
- Safi U Khan
- Department of Internal Medicine, West Virginia University (S.U.K.)
- M Hassan Murad
- Evidence-based Practice Center, Mayo Clinic, Rochester, MN (M.H.M.)
- Rami Doukky
- Division of Cardiology, Rush University Medical Center, Chicago, IL (R.D.)
- Richard A Krasuski
- Department of Cardiovascular Medicine, Duke University Health System, Durham, NC (R.A.K.)
|