Srimaneekarn N, Leelachaikul P, Thiradilok S, Manopatanakul S. Agreement test of P value versus Bayes factor for sample means comparison: analysis of articles from the Angle Orthodontist journal.
BMC Med Res Methodol 2023; 23:43. [PMID: 36797687 PMCID: PMC9933385 DOI: 10.1186/s12874-023-01858-z]
[Received: 01/01/2022] [Accepted: 02/02/2023] [Indexed: 02/18/2023]
Abstract
BACKGROUND
Researchers are cautioned against misinterpreting the conventional P value, especially when applying the popular t test. This study therefore evaluated the agreement between P value and Bayes factor (BF01) results obtained from comparisons of sample means in published orthodontic articles.
METHODS
Data pooling was undertaken using the modified PRISMA flow diagram. Following the inclusion criteria, all articles published in The Angle Orthodontist over a two-year period (November 2016 to September 2018) that used the t test for statistical analysis were selected. Agreement between the P value and the Bayes factor was evaluated with cutoffs set at 0.05 and 1, respectively. The percentage of agreement and the Kappa coefficient were calculated. Plots of effect size against the P value and BF01 were also analysed.
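The agreement calculation described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each t test is classified as favouring the alternative hypothesis when P < 0.05 or when BF01 < 1, and it computes the percentage of agreement and Cohen's Kappa from the resulting 2 x 2 table. The function name `agreement_stats` and the sample values are hypothetical.

```python
def agreement_stats(pairs, alpha=0.05, bf_cutoff=1.0):
    """Return (percent_agreement, kappa) for paired (p_value, bf01) results.

    Assumption: a test favours the alternative when p_value < alpha
    (P value rule) or when bf01 < bf_cutoff (Bayes factor rule).
    """
    both_alt = both_null = p_only = bf_only = 0
    for p_value, bf01 in pairs:
        p_alt = p_value < alpha
        bf_alt = bf01 < bf_cutoff
        if p_alt and bf_alt:
            both_alt += 1
        elif not p_alt and not bf_alt:
            both_null += 1
        elif p_alt:
            p_only += 1
        else:
            bf_only += 1

    n = both_alt + both_null + p_only + bf_only
    if n == 0:
        raise ValueError("need at least one (p_value, bf01) pair")

    # Observed agreement: both rules reach the same decision.
    observed = (both_alt + both_null) / n
    # Chance agreement from the marginal rates of each rule.
    p_alt_rate = (both_alt + p_only) / n
    bf_alt_rate = (both_alt + bf_only) / n
    expected = p_alt_rate * bf_alt_rate + (1 - p_alt_rate) * (1 - bf_alt_rate)
    if expected == 1.0:
        # Kappa is undefined when chance agreement is already complete.
        return 100.0 * observed, float("nan")
    kappa = (observed - expected) / (1 - expected)
    return 100.0 * observed, kappa


# Hypothetical (p_value, bf01) pairs, invented purely for illustration.
hypothetical_pairs = [
    (0.001, 0.02),
    (0.030, 0.60),
    (0.040, 1.40),
    (0.200, 2.50),
    (0.600, 4.00),
    (0.049, 0.90),
]
percent, kappa = agreement_stats(hypothetical_pairs)
print(f"agreement = {percent:.2f}%, kappa = {kappa:.2f}")
```

Kappa corrects the raw percentage for the agreement expected by chance, which is why both figures are usually reported together.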
RESULTS
Of 265 articles, 82 used the t test, and only 37 of these met the inclusion criteria. The study identified 793 justifiable t tests (438 independent-sample and 355 dependent-sample t tests), for which the agreement percentage and Kappa coefficient were 93.57% and 0.87, respectively. However, when anecdotal evidence (1/3 < BF01 < 3) was considered, almost half of the studies missed statistical significance. Two-thirds of the P values reported as significant (0.01 < P < 0.05; 30 independent-sample and 20 dependent-sample t tests) showed only anecdotal evidence (1/3 < BF01 < 1). In addition, BF01 indicated moderate evidence (BF01 > 3) for approximately one-third of the total studies with nonsignificant P values (P > 0.05). The effect sizes accompanying the P values, especially for studies with independent-sample t tests, were very high, with a strong potential to show substantive significance. Although it is best to extend the statistical analysis when a P value is doubtful (just below 0.05), especially for orthodontic innovation, orthodontists may reach a balanced decision by relying on cephalometric measurements.
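The evidence bands quoted in these results can be expressed as a small classifier. The sketch below uses only the thresholds given in the abstract (1/3, 1 and 3). How to treat values exactly on a boundary is not stated there, so assigning them to the stronger band is an assumption, as are the function names and category labels.

```python
def bf01_category(bf01):
    """Map a BF01 value to an evidence band using the 1/3, 1 and 3 thresholds.

    BF01 > 1 favours the null hypothesis; BF01 < 1 favours the alternative.
    Assumption: values exactly at 1/3 or 3 go to the stronger band.
    """
    if bf01 <= 0:
        raise ValueError("a Bayes factor must be positive")
    if bf01 <= 1 / 3:
        return "moderate_or_stronger_for_H1"
    if bf01 < 1:
        return "anecdotal_for_H1"
    if bf01 == 1:
        return "no_evidence_either_way"
    if bf01 < 3:
        return "anecdotal_for_H0"
    return "moderate_or_stronger_for_H0"


def is_doubtful_significance(p_value, bf01, alpha=0.05):
    """True when P is significant but BF01 gives only anecdotal evidence."""
    return p_value < alpha and 1 / 3 < bf01 < 3


# Hypothetical values, invented purely for illustration.
for p_value, bf01 in [(0.001, 0.02), (0.03, 0.6), (0.2, 2.5), (0.6, 4.0)]:
    print(p_value, bf01, bf01_category(bf01),
          is_doubtful_significance(p_value, bf01))
```

A P value just below 0.05 paired with an anecdotal BF01 is the case the abstract flags as deserving further statistical calculation.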
CONCLUSIONS
The Kappa coefficient (0.87) indicated almost perfect agreement between the two methods. However, BF01 restricted this judgement to approximately half of the studies, with two-thirds of these showing nonsignificant P values. Simple extensions of statistical calculations, especially effect size and BF01, can be useful and should be considered when finalising statistical analyses, especially for orthodontic studies without cephalometric analysis.
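As an illustration of how simple such an extension can be, the sketch below computes an effect size from the summary statistics that articles typically report (mean, standard deviation and sample size per group). The abstract does not say which effect size measure was used; Cohen's d with a pooled standard deviation for an independent-sample comparison is assumed here, and the function name and input values are hypothetical.

```python
import math


def cohens_d_independent(mean1, sd1, n1, mean2, sd2, n2):
    """Return (d, t) for two independent groups from summary statistics.

    d is Cohen's d using the pooled standard deviation; t is the matching
    equal-variance (Student) t statistic with n1 + n2 - 2 degrees of freedom.
    """
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    pooled_sd = math.sqrt(pooled_var)
    if pooled_sd == 0:
        raise ValueError("pooled standard deviation is zero")
    d = (mean1 - mean2) / pooled_sd
    # The t statistic rescales d by the sample sizes.
    t = d / math.sqrt(1 / n1 + 1 / n2)
    return d, t


# Hypothetical group summaries, invented purely for illustration.
d, t = cohens_d_independent(10.0, 2.0, 20, 8.0, 2.0, 20)
print(f"Cohen's d = {d:.2f}, t = {t:.2f}")
```

Unlike the P value, d does not shrink or grow with sample size alone, which is why it is a useful companion when judging substantive significance.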