1
Sidebotham D, Barlow CJ, Martin J, Jones PM. Interpreting frequentist hypothesis tests: insights from Bayesian inference. Can J Anaesth 2023; 70:1560-1575. [PMID: 37794259 PMCID: PMC10600289 DOI: 10.1007/s12630-023-02557-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2023] [Revised: 03/25/2023] [Accepted: 03/27/2023] [Indexed: 10/06/2023] Open
Abstract
Randomized controlled trials are one of the best ways of quantifying the effectiveness of medical interventions. Therefore, when the authors of a randomized superiority trial report that differences in the primary outcome between the intervention group and the control group are "significant" (i.e., P ≤ 0.05), we might assume that the intervention has an effect on the outcome. Similarly, when differences between the groups are "not significant," we might assume that the intervention does not have an effect on the outcome. Nevertheless, both assumptions are frequently incorrect. In this article, we explore the relationship that exists between real treatment effects and declarations of statistical significance based on P values and confidence intervals. We explain why, in some circumstances, the chance an intervention is ineffective when P ≤ 0.05 exceeds 25% and the chance an intervention is effective when P > 0.05 exceeds 50%. Over the last decade, there has been increasing interest in Bayesian methods as an alternative to frequentist hypothesis testing. We provide a robust but nontechnical introduction to Bayesian inference and explain why a Bayesian posterior distribution overcomes many of the problems associated with frequentist hypothesis testing. Notwithstanding the current interest in Bayesian methods, frequentist hypothesis testing remains the default method for statistical inference in medical research. Therefore, we propose an interim solution to the "significance problem" based on simplified Bayesian metrics (e.g., Bayes factor, false positive risk) that can be reported along with traditional P values and confidence intervals. We calculate these metrics for four well-known multicentre trials. We provide links to online calculators so readers can easily estimate these metrics for published trials.
In this way, we hope decisions on incorporating the results of randomized trials into clinical practice can be enhanced, minimizing the chance that useful treatments are discarded or that ineffective treatments are adopted.
Affiliation(s)
- David Sidebotham
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
  - Faculty of Medical and Health Sciences, University of Auckland, Auckland, New Zealand
  - Cardiothoracic and Vascular Intensive Care Unit (Ward 48), Building 32, Auckland City Hospital, 2 Park Road, Grafton, Auckland, 1023, New Zealand
- C Jake Barlow
  - Department of Anaesthesia and the Cardiothoracic and Vascular Intensive Care Unit, Auckland City Hospital, Auckland, New Zealand
- Janet Martin
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
- Philip M Jones
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
2
Hayes J, Zuercher M, Gai N, Chowdhury AR, Aoyama K. The Fragility Index of randomized controlled trials in pediatric anesthesiology. Can J Anaesth 2023; 70:1449-1460. [PMID: 37286747 DOI: 10.1007/s12630-023-02513-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/29/2022] [Revised: 01/16/2023] [Accepted: 01/23/2023] [Indexed: 06/09/2023] Open
Abstract
PURPOSE The P value is a widely used measure of statistical importance but has many drawbacks and limitations, one being that it does not reflect the robustness of the results of a clinical trial. The Fragility Index (FI) was developed as a measure of how many outcome events would need to change to nonevents to render a significant P value nonsignificant (P ≥ 0.05). The FI of trials from other medical specialties is typically < 5. We aimed to determine the FI of pediatric anesthesiology randomized controlled trials (RCT) and to test for association with various characteristics of the included trials. METHODS We conducted a comprehensive systematic search of high-impact anesthesia, surgical, and medical journals from the last 25 years for trials comparing an intervention between two groups with a statistically significant P value (< 0.05) for a dichotomous outcome. We also compared FI values for variables that reflect the quality and importance of a trial. RESULTS The median [interquartile range] FI was 3 [1-7] and correlated positively with the number of participants (rS = 0.41; P < 0.001) and events (rS = 0.42; P < 0.001), and negatively with the P value (rPB = -0.36; P < 0.001). Other measures of trial quality and impact or importance were not strongly associated with the FI. CONCLUSIONS The FI of published trials in pediatric anesthesiology is low, similar to that in other medical specialties. Larger trials with more events and P values ≤ 0.01 were associated with a higher FI.
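The Fragility Index described in this abstract can be illustrated with a short sketch. This is not the authors' code: it assumes a two-sided Fisher exact test and one commonly described convention, in which the outcome status of one patient at a time is changed in the group with the lower event rate while group sizes stay fixed. The function names and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


def fragility_index(events_a, n_a, events_b, n_b, alpha=0.05):
    """Number of single-patient status changes needed before P >= alpha.

    Assumption: changes are made in the group with the lower event rate.
    Returns 0 if the result is not significant to begin with.
    """
    # Cross-multiply to compare event rates without division
    if events_a * n_b > events_b * n_a:
        events_a, n_a, events_b, n_b = events_b, n_b, events_a, n_a
    flips = 0
    while events_a < n_a and fisher_exact_two_sided(
        events_a, n_a - events_a, events_b, n_b - events_b
    ) < alpha:
        events_a += 1  # one nonevent becomes an event
        flips += 1
    return flips


# Hypothetical trial: 1/100 events versus 15/100 events
print(fragility_index(1, 100, 15, 100))
```

A small returned value means that only a few patients with a different outcome would have moved the result across the 0.05 threshold, which is the sense in which the abstract calls such results fragile.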
Affiliation(s)
- Jason Hayes
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada
- Mael Zuercher
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Nan Gai
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada
- Apala Roy Chowdhury
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
- Kazuyoshi Aoyama
  - Department of Anesthesia and Pain Medicine, The Hospital for Sick Children (SickKids), 555 University Avenue, Toronto, ON, M5G 1X8, Canada
  - Department of Anesthesiology and Pain Medicine, University of Toronto, Toronto, ON, Canada
  - Program in Child Health Evaluative Sciences, SickKids Research Institute, Toronto, ON, Canada
3
Shuming J, Hua L, Yusha T, Lei C. The efficacy and safety of patent Foramen Ovale Closure for Refractory Epilepsy (PFOC-RE): a prospectively randomized control trial of an innovative surgical therapy for refractory epilepsy patients with PFO of high-grade right-to-left shunt. BMC Neurol 2023; 23:282. [PMID: 37501155 PMCID: PMC10373383 DOI: 10.1186/s12883-023-03317-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 07/04/2023] [Indexed: 07/29/2023] Open
Abstract
BACKGROUND A significant proportion of patients with epilepsy have an unknown etiology and lack effective targeted therapeutic drugs. Patent Foramen Ovale (PFO) induces hypoxia and microembolism, leading to cerebral neurological dysfunction and increased epilepsy risk. This study aims to assess the efficacy and safety of PFO closure for relieving epileptic seizures in patients with refractory epilepsy associated with PFO. METHODS/DESIGN Recruitment takes place at the West China Hospital of Sichuan University, China, for an open-label, randomized controlled clinical trial. The trial will include 110 patients with refractory epilepsy and PFO. Disease diagnoses will conform to the diagnostic criteria of the International League Against Epilepsy (ILAE) for refractory epilepsy and the American Society of Echocardiography (ASE) for PFO. Refractory epilepsy and high-grade right-to-left shunt (RLS) of the PFO will be further diagnosed using 24-hour video electroencephalogram and transthoracic echocardiography with contrast injection, respectively. Eligible participants require a secondary or higher volume of RLS. TRIAL REGISTRATION Chinese Clinical Trial Registry (ChiCTR2200065681). Registered on November 11, 2022.
Affiliation(s)
- Ji Shuming
  - Department of Clinical Research Management, West China Hospital, Sichuan University, Chengdu, 610044, China
- Li Hua
  - Department of Neurology, West China Hospital, Joint Research Institution of Altitude Health, Sichuan University, Chengdu, 610044, China
- Tang Yusha
  - Department of Neurology, West China Hospital, Joint Research Institution of Altitude Health, Sichuan University, Chengdu, 610044, China
- Chen Lei
  - Department of Neurology, West China Hospital, Joint Research Institution of Altitude Health, Sichuan University, Chengdu, 610044, China
4
Chuang Z, Martin J, Shapiro J, Nguyen D, Neocleous P, Jones PM. Minimum false-positive risk of primary outcomes and impact of reducing nominal P-value threshold from 0.05 to 0.005 in anaesthesiology randomised clinical trials: a cross-sectional study. Br J Anaesth 2023; 130:412-420. [PMID: 36503825 DOI: 10.1016/j.bja.2022.11.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Revised: 11/01/2022] [Accepted: 11/03/2022] [Indexed: 12/14/2022] Open
Abstract
BACKGROUND Reproducibility of research is poor; this may be because many articles report statistically significant findings that are false positives. Two potential solutions are to lower the P-value for statistical significance testing from 0.05 to 0.005 and to report the minimum false-positive risk (minFPR). This study determined these metrics for randomised controlled trials (RCTs) in general anaesthesiology journals. METHODS We identified superiority RCTs published between January 1, 2019 and March 15, 2021 from seven leading anaesthesia journals. P-values for primary outcomes were collected, and minFPRs for these outcomes were calculated using a formula assuming a 50% prior probability of an intervention being effective (minFPR50). The primary outcomes were the percentage of RCTs maintaining statistical significance at P<0.005 and minFPR50. RESULTS We included 318 RCTs. P-values below 0.05 were reported in 205/318 (64%) of RCTs. Of these 205 RCTs, 119/205 (58%) maintained statistical significance at the P<0.005 threshold. The mean (standard deviation) minFPR50 was 22% (20). At P=0.005, the minFPR50 was approximately 5%. CONCLUSIONS These proposed metrics aimed at mitigating reproducibility concerns would call a significant portion of the anaesthesiology literature into question. We found a minFPR of 22% and determined that 42% of primary outcomes would not maintain statistical significance if the P-value threshold changed from 0.05 to 0.005. These findings could partially explain the lack of reproducibility of research findings.
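The abstract refers to a formula for the minimum false-positive risk under a 50% prior but does not state it. As an illustration only, the sketch below uses one widely cited calibration, the Sellke-Bayarri-Berger lower bound on the Bayes factor in favour of the null, -e·p·ln(p), which is valid for p < 1/e. Treating this as the formula is an assumption, so its output need not match the values reported in the abstract; the function names are hypothetical.

```python
from math import e, log


def min_bayes_factor(p):
    """Lower bound on the Bayes factor favouring the null hypothesis.

    Assumption: the -e * p * ln(p) calibration, defined for 0 < p < 1/e.
    """
    if not 0 < p < 1 / e:
        raise ValueError("bound is only defined for 0 < p < 1/e")
    return -e * p * log(p)


def min_false_positive_risk(p, prior_effective=0.5):
    """Smallest posterior probability that the intervention is ineffective."""
    prior_odds_null = (1 - prior_effective) / prior_effective
    posterior_odds_null = prior_odds_null * min_bayes_factor(p)
    # Convert posterior odds back to a probability
    return posterior_odds_null / (1 + posterior_odds_null)


for p_value in (0.05, 0.005):
    print(p_value, round(min_false_positive_risk(p_value), 3))
```

Lowering `prior_effective` raises the returned risk for the same P value, which is why a stated prior (here 50%) has to accompany any such metric.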
Affiliation(s)
- Zachary Chuang
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Janet Martin
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
- Jordan Shapiro
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Derek Nguyen
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Penelope Neocleous
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
- Philip M Jones
  - Schulich School of Medicine & Dentistry, University of Western Ontario, London, ON, Canada
  - Department of Anesthesia & Perioperative Medicine, University of Western Ontario, London, ON, Canada
  - Department of Epidemiology & Biostatistics, University of Western Ontario, London, ON, Canada
5
Meza J, Babajide R, Saoud R, Sweis J, Abelleira J, Helenowski I, Jovanovic B, Eggener S, Miller FH, Horowitz JM, Casalino DD, Murphy AB. Assessing the accuracy of multiparametric MRI to predict clinically significant prostate cancer in biopsy naïve men across racial/ethnic groups. BMC Urol 2022; 22:107. [PMID: 35850677 PMCID: PMC9295380 DOI: 10.1186/s12894-022-01066-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/01/2022] [Accepted: 07/05/2022] [Indexed: 11/16/2022] Open
Abstract
Introduction The Prostate Imaging Reporting and Data System (PIRADS) has shown promise in improving the detection of Gleason grade group (GG) 2–5 prostate cancer (PCa) and reducing the detection of indolent GG1 PCa. However, data on the performance of PIRADS in Black and Hispanic men is sparse. We evaluated the accuracy of PIRADS scores in detecting GG2-5 PCa in White, Black, and Hispanic men. Methods We performed a multicenter retrospective review of biopsy-naïve Black (n = 108), White (n = 108), and Hispanic (n = 64) men who underwent prostate biopsy (PB) following multiparametric MRI. Sensitivity and specificity of PIRADS for GG2-5 PCa were calculated. Race-stratified binary logistic regression models for GG2-5 PCa using standard clinical variables and PIRADS were used to calculate area under the receiver operating characteristics curves (AUC). Results Rates of GG2-5 PCa were statistically similar between Blacks, Whites, and Hispanics (52.8% vs 42.6% vs 37.5% respectively, p = 0.12). Sensitivity was lower in Hispanic men compared to White men (87.5% vs 97.8% respectively, p = 0.01). Specificity was similar in Black versus White men (21.6% vs 27.4%, p = 0.32) and White versus Hispanic men (27.4% vs 17.5%, p = 0.14). The AUCs of the PIRADS added to standard clinical data (age, PSA and suspicious prostate exam) were similar when comparing Black versus White men (0.75 vs 0.73, p = 0.79) and White versus Hispanic men (0.73 vs 0.59, p = 0.11). The AUCs for the Base model and PIRADS model alone were statistically similar when comparing Black versus White men and White versus Hispanic men. Conclusions The accuracy of the PIRADS and clinical data for detecting GG2-5 PCa seems statistically similar across race. However, there is concern that PIRADS 2.0 has lower sensitivity in Hispanic men compared to White men. Prospective validation studies are needed. Supplementary Information The online version contains supplementary material available at 10.1186/s12894-022-01066-9.
Affiliation(s)
- Julio Meza
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
- Ragheed Saoud
  - Arthur Smith Institute of Urology at Riverhead, Northwell Health, Riverhead, NY, USA
- Jamila Sweis
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
- Josephine Abelleira
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
- Irene Helenowski
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
- Borko Jovanovic
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
- Scott Eggener
  - University of Chicago Division of Urology, Chicago, IL, USA
- Frank H Miller
  - Northwestern University Department of Radiology, Chicago, IL, USA
- David D Casalino
  - Northwestern University Department of Radiology, Chicago, IL, USA
- Adam B Murphy
  - Department of Urology, Northwestern University, 710 N. Fairbanks Court Olson Pavilion 8-250, Chicago, IL, 60611, USA
6
Herrmann C, Kieser M, Rauch G, Pilz M. Optimization of adaptive designs with respect to a performance score. Biom J 2022; 64:989-1006. [PMID: 35426460 DOI: 10.1002/bimj.202100166] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/27/2021] [Revised: 02/09/2022] [Accepted: 02/12/2022] [Indexed: 11/08/2022]
Abstract
Adaptive designs are an increasingly popular method for the adaptation of design aspects in clinical trials, such as the sample size. Scoring different adaptive designs helps to make an appropriate choice among the numerous existing adaptive design methods. Several scores have been proposed to evaluate adaptive designs. Moreover, it is possible to determine optimal two-stage adaptive designs with respect to a customized objective score by solving a constrained optimization problem. In this paper, we use the conditional performance score by Herrmann et al. (2020) as the optimization criterion to derive optimal adaptive two-stage designs. We investigate variations of the original performance score, for example, by assigning different weights to the score components and by incorporating prior assumptions on the effect size. We further investigate a setting where the optimization framework is extended by a global power constraint, and additional optimization of the critical value function next to the stage-two sample size is performed. Those evaluations with respect to the sample size curves and the resulting design's performance can contribute to facilitate the score's usage in practice.
Affiliation(s)
- Carolin Herrmann
  - Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Meinhard Kieser
  - Institute of Medical Biometry, University Hospital Heidelberg, Heidelberg, Germany
- Geraldine Rauch
  - Institute of Biometry and Clinical Epidemiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
- Maximilian Pilz
  - Institute of Medical Biometry, University Hospital Heidelberg, Heidelberg, Germany
7
Hanson NA, Lavallee MB, Thiele RH. Apophenia and anesthesia: how we sometimes change our practice prematurely. Can J Anaesth 2021; 68:1185-1196. [PMID: 33963519 PMCID: PMC8104920 DOI: 10.1007/s12630-021-02005-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 11/09/2020] [Revised: 02/08/2021] [Accepted: 02/16/2021] [Indexed: 12/21/2022] Open
Abstract
Human beings are predisposed to identifying false patterns in statistical noise, a likely survival advantage during our evolutionary development. Moreover, humans seem to prefer "positive" results over "negative" ones. These two cognitive features lay a framework for premature adoption of falsely positive studies. Added to this predisposition is the tendency of journals to "overbid" for exciting or newsworthy manuscripts, incentives in both the academic and publishing industries that value change over truth and scientific rigour, and a growing dependence on complex statistical techniques that some reviewers do not understand. The purpose of this article is to describe the underlying causes of premature adoption and provide recommendations that may improve the quality of published science.
Affiliation(s)
- Neil A Hanson
  - Department of Anesthesiology, University of Virginia Health System, PO Box 800710, Charlottesville, VA, 22908-0710, USA
- Matthew B Lavallee
  - Department of Anesthesiology, University of Virginia Health System, PO Box 800710, Charlottesville, VA, 22908-0710, USA
- Robert H Thiele
  - Department of Anesthesiology, University of Virginia Health System, PO Box 800710, Charlottesville, VA, 22908-0710, USA
8
Tulka S, Knippschild S, Funck S, Goetjes I, Uluk Y, Baulig C. Reporting of statistical sample size calculations in publications of trials on age-related macular degeneration, glaucoma and cataract. PLoS One 2021; 16:e0252640. [PMID: 34086796 PMCID: PMC8177464 DOI: 10.1371/journal.pone.0252640] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/24/2020] [Accepted: 05/20/2021] [Indexed: 11/18/2022] Open
Abstract
Background Transparent and complete publications of randomised controlled trials (RCT) ought to comply with the guidelines of the CONSORT Statement, which stipulates sample size calculation as an important aspect of trial planning. The objective of this study was to analyse and compare the reporting of statistical sample size calculations in RCT papers on the treatment of age-related macular degeneration (AMD), glaucoma and cataract published in 2018. Material and methods This study comprises a total of 113 RCT papers (RCT-P) published in 2018 (AMD: 14, glaucoma: 28, cataract: 71), in English or German, and identified through an internet-based literature search in PubMed and EMBASE. The primary outcome measure of the study was the number of trials providing a complete description of the underlying sample size calculation on the basis of the variables required (significance level, expected outcomes, power, and resulting sample size). Results Of the RCTs reviewed, 64% (AMD), 61% (glaucoma) and 31% (cataract) provided a justification of the number of patients included. A complete description of the sample size calculation, including all the necessary values (the primary outcome measure of this study), was provided by 21% of the AMD, 29% of the cataract and 18% of the glaucoma RCT publications (in total: 24 of 113 [21%]; 95% confidence interval: 13% to 29%). Conclusion All three treatment areas analysed lacked reporting quality regarding the justification of the number of patients included in a clinical trial based on a sample size calculation required for ethical reasons. More than half of all RCT publications reviewed did not provide all of the required information on statistical sample size calculation, and thus lacked transparency and completeness. Methodologists therefore urgently need to be involved in a study’s planning and publishing processes to ensure that methodology descriptions are transparent and of high quality.
Affiliation(s)
- Sabrina Tulka
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
- Stephanie Knippschild
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
- Sina Funck
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
- Isabelle Goetjes
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
- Yasmin Uluk
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
- Christine Baulig
  - Chair for Medical Biometry and Epidemiology (IMBE), Faculty of Health, Witten/Herdecke University, Witten, Germany
9
Seehra J, Stonehouse-Smith D, Cobourne MT, Tsagris M, Pandis N. Are treatment effect assumptions in orthodontic studies overoptimistic? Eur J Orthod 2021; 43:583-587. [PMID: 33991101 PMCID: PMC8488969 DOI: 10.1093/ejo/cjab018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Background At the clinical trial design stage, assumptions regarding the treatment effects to be detected should be appropriate so that the required sample size can be calculated. There is evidence in the medical literature that sample size assumptions can be overoptimistic. The aim of this study was to compare the distribution of the assumed effects versus that of the observed effects as a proxy for overoptimistic treatment effect assumptions at the study design stage. Materials and method Systematic reviews (SRs) published between 1 January 2010 and 31 December 2019 containing at least one meta-analysis on continuous outcomes were identified electronically. SR and primary study level characteristics were extracted from the SRs and the individual trials. Details on the sample size calculation process and assumptions and the observed treatment effects were extracted. Results Eighty-five SRs with meta-analysis containing 347 primary trials were included. The median number of SR authors was 5 (interquartile range: 4–7). At the primary study level, the majority were single centre (78.1%), utilized a parallel design (52%), and were rated as having an unclear/moderate risk of bias (34.3%). A sample size was described in only 31.7% (110/347) of studies. Of these 110 studies, only 37 reported the assumed clinical difference that the study was designed to detect (37/110). The assumed treatment effect was recalculated for the remaining 73 studies (73/110). The one-sided exact signed rank test showed a significant difference between the assumed and observed treatment effects (P < 0.001), suggesting greater values for the assumed effect sizes. Conclusions Careful consideration of the assumptions at the design stage of orthodontic studies is necessary in order to reduce the unreliability of clinical study results and research waste.
Affiliation(s)
- Jadbinder Seehra
  - Department of Orthodontics, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, Floor 25, Guy's Hospital, Guy's and St Thomas NHS Foundation Trust, London, UK
- Daniel Stonehouse-Smith
  - Department of Orthodontics, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, Floor 25, Guy's Hospital, Guy's and St Thomas NHS Foundation Trust, London, UK
- Martyn T Cobourne
  - Department of Orthodontics, Faculty of Dentistry, Oral & Craniofacial Sciences, King's College London, Floor 25, Guy's Hospital, Guy's and St Thomas NHS Foundation Trust, London, UK
- Michail Tsagris
  - Department of Economics, University of Crete, Rethimnon, Greece
- Nikolaos Pandis
  - Department of Orthodontics and Dentofacial Orthopedics, Dental School/Medical Faculty, University of Bern, Bern, Switzerland
10
Increasing the reproducibility of research will reduce the problem of apophenia (and more). Can J Anaesth 2021; 68:1120-1134. [PMID: 33963518 DOI: 10.1007/s12630-021-02006-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2021] [Revised: 04/07/2021] [Accepted: 04/07/2021] [Indexed: 10/21/2022] Open
11
[Influence of impact factor on reporting sample size calculations in publications on studies exemplified by AMD treatment: Cross-sectional investigation on the presence of sample size calculations in publications of RCTs on AMD treatment in journals with low and high impact factors]. Ophthalmologe 2020; 117:125-131. [PMID: 31201561 DOI: 10.1007/s00347-019-0924-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
Abstract
BACKGROUND For scientific and ethical reasons randomized controlled clinical trials (RCTs) should be based on a sample size calculation. The CONSORT statement, an established publication guideline for transparent study reporting, requires a sample size calculation in every study publication. OBJECTIVE The availability of sample size calculations in RCT publications on treatment of age-related macular degeneration (AMD) was investigated. The primary hypothesis of this investigation compared the prevalence of reported sample size calculations between journals with higher (≥5) versus lower (<5) impact factors (IF). MATERIAL AND METHODS It was examined whether information on sample size calculation was available in a series of 97 publications of RCTs on AMD treatment published between 2004 and 2014. RESULTS Only 46 out of 97 (47%) study publications provided information on the reason for the number of patients enrolled. The comparison of publications from journals with an IF ≥ 5 (63%, n = 30) and from journals with an IF < 5 (40%, n = 67) showed a statistically significant difference of 23% in the frequencies of available sample size calculations (95% confidence interval, CI: 2% to 44%). Of the publications published before 2010, 43% reported a sample size calculation versus 51% of the publications afterwards. CONCLUSION Publications in journals with higher IF more frequently reported a sample size calculation. More than 50% of the publications did not report any sample size calculation. Authors and reviewers of publications should pay more attention to the explicit reporting of sample size calculations.
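The reported difference of 23% and its 95% confidence interval can be checked with a short calculation. The sketch below is an illustration under two assumptions that the abstract does not state: the interval is a simple Wald interval for a difference in proportions, and the underlying counts are 19 of 30 and 27 of 67, inferred from the reported 63%, 40% and the total of 46 of 97. The function name is hypothetical.

```python
from math import sqrt


def wald_ci_difference(x1, n1, x2, n2, z=1.96):
    """Difference in two proportions with a Wald confidence interval."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    # Standard error of the difference between independent proportions
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se


# Assumed counts: 19/30 (about 63%) and 27/67 (about 40%); 19 + 27 = 46
diff, lower, upper = wald_ci_difference(19, 30, 27, 67)
print(f"{diff:.0%} ({lower:.0%} to {upper:.0%})")
```

Under these assumptions the result rounds to the figures given in the abstract, with a lower bound just above zero, which is consistent with the difference being described as statistically significant.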
12
Tulka S, Geis B, Baulig C, Knippschild S, Krummenauer F. Validity of sample sizes in publications of randomised controlled trials on the treatment of age-related macular degeneration: cross-sectional evaluation. BMJ Open 2019; 9:e030312. [PMID: 31601589 PMCID: PMC6797239 DOI: 10.1136/bmjopen-2019-030312] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/02/2022] Open
Abstract
OBJECTIVE The aim of this cross-sectional study was to examine the completeness and accuracy of the reporting of sample size calculations in randomised controlled trial (RCT) publications on the treatment of age-related macular degeneration (AMD). METHODS A sample of 97 RCTs published between 2004 and 2014 was reviewed for the calculation of their sample size. It was examined whether a (complete) description of the sample size calculation was presented. Furthermore, the sample size was recalculated, whenever possible based on the published details, in order to verify the reported number of patients. PRIMARY OUTCOME MEASURE The primary endpoint of this cross-sectional investigation was a described sample size calculation that was reproducible, complete and correct (maximum tolerated deviation between reported and replicated sample size ±2 participants per trial arm). RESULTS A total of 50 publications (52%) did not provide any information on the justification of the number of patients included. Only 17 publications (18%) provided all the necessary parameters for recalculation; 8 of 97 (8%, 95%-CI: 4% to 16%) publications achieved the primary endpoint. The median relative deviation between reported and recalculated sample sizes was 1%, with a range from -43% to +66%. CONCLUSION Although a transparent sample size legitimation is a crucial determinant of an RCT's methodological validity, more than half of the RCT publications considered failed to report them. Furthermore, reported sample size legitimations were often incomplete or incorrect. In summary, clinical authors should pay more attention to the transparent reporting of sample size calculation, and clinical journal reviewers may opt to reproduce reported sample size calculations. SYNOPSIS More than half of the analysed RCT publications on the treatment of AMD did not report a transparent sample size calculation. Only 8% reported a complete and correct sample size calculation.
Affiliation(s)
- Sabrina Tulka
  - Institute for Medical Biometry and Epidemiology, University Witten Herdecke Faculty of Health, Witten, Germany
- Berit Geis
  - Institute for Medical Biometry and Epidemiology, University Witten Herdecke Faculty of Health, Witten, Germany
- Christine Baulig
  - Institute for Medical Biometry and Epidemiology, University Witten Herdecke Faculty of Health, Witten, Germany
- Stephanie Knippschild
  - Institute for Medical Biometry and Epidemiology, University Witten Herdecke Faculty of Health, Witten, Germany
- Frank Krummenauer
  - Institute for Medical Biometry and Epidemiology, University Witten Herdecke Faculty of Health, Witten, Germany
13
Bayman EO, Dexter F. Relative importance of strategies for improving the sample size selection and reporting of small randomized clinical trials in anesthesiology. Can J Anaesth 2018; 65:607-610. [PMID: 29556936 DOI: 10.1007/s12630-018-1110-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2018] [Accepted: 02/05/2018] [Indexed: 10/17/2022] Open
Affiliation(s)
- Emine Ozgur Bayman
  - Department of Anesthesia, University of Iowa, 200 Hawkins Drive, 6439 JCP, Iowa City, IA, 52242, USA
  - Department of Biostatistics, University of Iowa, Iowa City, IA, USA
- Franklin Dexter
  - Department of Anesthesia, University of Iowa, 200 Hawkins Drive, 6439 JCP, Iowa City, IA, 52242, USA
  - Division of Management Consulting, University of Iowa, Iowa City, IA, USA