1
Younas A. Beyond 'statistical significance': A nontechnical primer of Bayesian statistics and Bayes factors for health researchers. J Eval Clin Pract 2024; 30:1218-1226. [PMID: 38825756 DOI: 10.1111/jep.14032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2024] [Revised: 05/14/2024] [Accepted: 05/16/2024] [Indexed: 06/04/2024]
Abstract
RATIONALE Hypothesis testing is integral to health research and is commonly completed through frequentist statistics focused on computing p values. p Values have long been criticized for offering limited information about the relationship between variables and the strength of evidence concerning the plausibility, presence and certainty of associations among variables. Bayesian statistics is a potential alternative for inference-making. Despite emerging discussion of Bayesian statistics across various disciplines, its uptake in health research is still limited. AIM To offer a primer on Bayesian statistics and Bayes factors so that health researchers can gain preliminary knowledge of their use, application and interpretation in health research. METHODS Theoretical and empirical literature on Bayesian statistics and methods was used to develop this methodological primer. CONCLUSIONS Using Bayesian statistics in health research without a careful and complete understanding of its underlying philosophy and its differences from frequentist testing, estimation and interpretation methods can result in the same ritualistic use seen with p values. IMPLICATIONS Health researchers should supplement frequentist statistics with Bayesian statistics when analysing research data. Overreliance on p values for clinical decision-making should be avoided. Bayes factors offer a more intuitive measure of the strength of evidence for the null and alternative hypotheses.
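To make the idea of a Bayes factor concrete, here is a minimal sketch for a single binomial rate. It is not taken from the primer: the function name `bayes_factor_10`, the uniform prior under the alternative, and the example counts are all assumptions chosen for illustration. Under a Uniform(0, 1) prior the marginal likelihood of k successes in n trials is exactly 1/(n + 1), so the ratio can be computed without numerical integration.

```python
from math import comb


def bayes_factor_10(successes: int, trials: int, null_rate: float = 0.5) -> float:
    """Bayes factor for H1 (rate ~ Uniform(0, 1)) against H0 (rate == null_rate).

    Hypothetical helper for illustration; the prior is an assumption.
    """
    # Marginal likelihood under H1: the binomial likelihood integrated over
    # a uniform prior on the rate equals 1 / (trials + 1).
    marginal_h1 = 1.0 / (trials + 1)
    # Likelihood of the same data under the point null hypothesis.
    likelihood_h0 = (
        comb(trials, successes)
        * null_rate ** successes
        * (1.0 - null_rate) ** (trials - successes)
    )
    return marginal_h1 / likelihood_h0


# Illustrative data only: 15 responders out of 20 patients.
bf = bayes_factor_10(successes=15, trials=20)
print(f"BF10 = {bf:.2f}")
```

A value above 1 favours the alternative and a value below 1 favours the null, so the same quantity can express evidence in either direction, which a p value cannot do.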
Affiliation(s)
- Ahtisham Younas
- Memorial University of Newfoundland, St. John's, Newfoundland, Canada
2
Altman N, Krzywinski M. Understanding p-values and significance. Lab Anim 2024:236772241247106. [PMID: 39315628 DOI: 10.1177/00236772241247106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/25/2024]
Abstract
P-values combined with estimates of effect size are used to assess the importance of experimental results. However, their interpretation can be invalidated by selection bias when testing multiple hypotheses, fitting multiple models or even informally selecting results that seem interesting after observing the data. We offer an introduction to principled uses of p-values (targeted at the non-specialist) and identify questionable practices to be avoided.
Affiliation(s)
- Naomi Altman
- Department of Statistics, The Pennsylvania State University, State College, PA, USA
- Martin Krzywinski
- Canada's Michael Smith Genome Sciences Centre, Vancouver, British Columbia, Canada
3
La Rosa GRM. Rethinking dental research: the importance of patient-reported outcomes and minimally clinically important difference. Evid Based Dent 2024; 25:117-118. [PMID: 38961312 DOI: 10.1038/s41432-024-01034-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/05/2024]
Affiliation(s)
- Giusy Rita Maria La Rosa
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy.
- Department of General Surgery and Surgical-Medical Specialties, University of Catania, Catania, Italy.
4
Clarke L, Lockwood P. Student radiographers' knowledge and experience of lateral hip X-ray positioning: A survey. Radiography (Lond) 2024:S1078-8174(24)00208-6. [PMID: 39214786 DOI: 10.1016/j.radi.2024.08.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2024] [Revised: 08/12/2024] [Accepted: 08/12/2024] [Indexed: 09/04/2024]
Abstract
INTRODUCTION The horizontal beam lateral (HBL) position technique for X-ray imaging has been used for nearly a century; however, this can be challenging for the patient and the practitioner, as it potentially compromises patient dignity. This study explores student radiographers' knowledge and experience of lateral hip positions and their impact on diagnostic quality and patient dignity. METHOD A cross-sectional mixed-method online survey of undergraduate diagnostic radiography students was completed. Likert scale assessments, rank ordering questions, and free-text qualitative responses were utilised for questions on knowledge and experience of different positioning, ease to obtain, patient dignity, diagnostic quality, and need for repeats. Data analysis included descriptive statistics and cross-tabulation non-parametric analysis against variables of age, gender and year of study. RESULTS Responses were received from n = 42/158 students, a response rate of 27%. The HBL position was the most commonly repeated image (76.6%); the qualitative themes included HBL image quality issues and difficulty in the HBL positioning for elderly or frail patients, often in discomfort and pain. Analysis of student responses to perceived patient dignity in positioning identified that 73.8% found the HBL undignified, and 85.7% agreed the Clements-Nakayama (CN) position would be more dignified for patients. The diagnostic image quality of the HBL position (64.2%) was compared to the CN alternative axiolateral (66.6%). Ease of obtaining the correct position was rated higher for HBL (47.6%) than for the CN position (28.6%); this could be due to the lack of experience, n = 3/42 (7.1%), of this position. CONCLUSION Overall, student radiographers' experience and knowledge of various lateral hip positions observed in clinical practice was good. The CN position scored high for diagnostic image (66.6%) and dignity for the patient (85.7%), over the often repeated HBL position (76.6%), which scored lower for image quality (64.2%) and dignity (76.6%). IMPLICATIONS FOR PRACTICE Radiographers should advocate for professional autonomy and explore alternative positioning techniques. Further investigation into the CN position's utilisation, image quality and radiation dose in England is recommended.
Affiliation(s)
- L Clarke
- Radiology Department, The Princess Alexandra Hospital NHS Trust, Harlow, Essex, United Kingdom
- P Lockwood
- Department of Radiography, School of Allied Health Professions, Faculty of Medicine, Health and Social Care, Canterbury Christ Church University, Kent, United Kingdom.
5
Hagen K. Misinterpretation of statistical nonsignificance as a sign of potential bias: Hydroxychloroquine as a case study. Account Res 2024; 31:600-619. [PMID: 36469591 DOI: 10.1080/08989621.2022.2155517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2022] [Accepted: 12/02/2022] [Indexed: 12/08/2022]
Abstract
The term "statistical significance," ubiquitous in the medical literature, is often misinterpreted, as is the "p-value" from which it stems. This article explores the implications of results that are numerically positive (e.g., those in the treatment arm do better on average) but not statistically significant. This lack of statistical significance is sometimes interpreted as strong, even decisive, evidence against an effect without due consideration of other factors. Three influential articles on hydroxychloroquine (HCQ) as a treatment for COVID-19 are illustrative. They all involve numerically positive results that were not statistically significant and that were misinterpreted as strong evidence against HCQ's efficacy. These and related considerations raise concerns regarding the reliability of academic/medical reasoning around COVID-19 treatments, as well as more generally, and regarding the potential for bias stemming from conflicts of interest.
Affiliation(s)
- Kurtis Hagen
- Independent Scholar, Former Associate Professor of Philosophy at SUNY Plattsburgh, Wesley Chapel, Florida, USA
6
Free N, Stemple JC, Smith JA, Phyland DJ. The Impact of a Vocal Loading Task on Voice Characteristics of Female Speakers With Benign Vocal Fold Lesions. J Voice 2024; 38:964.e1-964.e16. [PMID: 34955368 DOI: 10.1016/j.jvoice.2021.11.009] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 08/17/2021] [Revised: 10/29/2021] [Accepted: 11/09/2021] [Indexed: 12/12/2022]
Abstract
OBJECTIVES To examine the effect of a vocal loading task on measures of vocal structure and function in females with benign vocal fold lesions (BVFLs) and determine if change is observed in voice and lesion characteristics. STUDY DESIGN Prospective cohort study. METHODS Twenty-eight (n = 28) female subjects with phonotraumatic BVFLs completed a vocal loading task of 30 minutes of reading aloud at 75-85 dBA. Multidimensional voice evaluation was completed pre- and post-load, including audio and videostroboscopy recordings and images for expert perceptual ratings and acoustic and aerodynamic evaluation. Subjects also scored themselves using a 10 cm visual analogue scale for Perceived Phonatory Effort, and completed the Evaluation of Ability to Voice Easily, a 12 item self-report scale of current perceived speaking voice function. An exploratory rather than confirmatory approach to data analysis was adopted. The direction and magnitude of the change scores (pre- to post-load) for each individual, across a wide variety of instrumental and self-report measures, were assessed against a Minimal Clinically Important Difference criterion. RESULTS Observations of change and the direction of change in vocal response of individuals with BVFLs to 30 minutes of loud vocal load were variable. Minimal to no change was noted for participants pre- to post-load as rated perceptually, for auditory and videostroboscopy samples. For most instrumental measures, change was shown for many participants including an overall improvement in aerodynamic and acoustic measures of function and efficiency post-load for 20 participants (77%) and decline in function for 4 participants (15%). Self-reported effort and vocal function post-load were multidirectional, with similar numbers of participants reporting no change, improved function or a decline. CONCLUSION Subjects with BVFLs demonstrate change in vocal function following 30 minutes of vocal load. While this change can be variable and multidirectional, overall improvement was observed in instrumental measures of function and efficiency for most participants. Some participants perceived this change to be an increase in effort, some a reduction in effort and some perceived no change. Improved vocal function despite relative lesion stability can seemingly occur after loading in some pathological voices.
Affiliation(s)
- Nicole Free
- Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia.
- Joseph C Stemple
- Department of Communication Sciences and Disorders, and Rehabilitation Sciences PhD Program, University of Kentucky, Lexington, Kentucky
- Julian A Smith
- Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
- Debra J Phyland
- Department of Surgery, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, Victoria, Australia
7
Choo M, Park D, Cho M, Bae S, Kim J, Han DH. Exploring a multimodal approach for utilizing digital biomarkers for childhood mental health screening. Front Psychiatry 2024; 15:1348319. [PMID: 38666089 PMCID: PMC11043569 DOI: 10.3389/fpsyt.2024.1348319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Accepted: 03/25/2024] [Indexed: 04/28/2024] Open
Abstract
Background Depression and anxiety are prevalent mental health concerns among children and adolescents. The application of conventional assessment methods, such as survey questionnaires to children, may lead to self-reporting issues. Digital biomarkers provide extensive data, reducing bias in mental health self-reporting, and significantly influence patient screening. Our primary objectives were to accurately assess children's mental health and to investigate the feasibility of using various digital biomarkers. Methods This study included a total of 54 boys and girls aged 7 to 11 years. Each participant's mental state was assessed using the Depression, Anxiety, and Stress Scale. Subsequently, the subjects participated in digital biomarker collection tasks. Heart rate variability (HRV) data were collected using a camera sensor. Eye-tracking data were collected through tasks displaying emotion-face stimuli. Voice data were obtained by recording the participants' voices while they engaged in free speech and description tasks. Results Depressive symptoms were positively correlated with low frequency (LF, 0.04-0.15 Hz of HRV) in HRV and negatively associated with eye-tracking variables. Anxiety symptoms had a negative correlation with high frequency (HF, 0.15-0.40 Hz of HRV) in HRV and a positive association with LF/HF. Regarding stress, eye-tracking variables indicated a positive correlation, while pNN50, which represents the proportion of NN50 (the number of pairs of successive R-R intervals differing by more than 50 milliseconds) divided by the total number of NN (R-R) intervals, exhibited a negative association. Variables identified for childhood depression included LF and the total time spent looking at a sad face. Those variables recognized for anxiety were LF/HF, heart rate (HR), and pNN50. For childhood stress, HF, LF, and Jitter showed different correlation patterns between the two grade groups. Discussion We examined the potential of multimodal biomarkers in children, identifying features linked to childhood depression, particularly LF and the Sad.TF:time. Anxiety was most effectively explained by HRV features. To explore reasons for non-replication of previous studies, we categorized participants by elementary school grades into lower grades (1st, 2nd, 3rd) and upper grades (4th, 5th, 6th). Conclusion This study confirmed the potential use of multimodal digital biomarkers for children's mental health screening, serving as foundational research.
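The pNN50 definition quoted in the abstract can be written out in a few lines. This is a minimal sketch, not the study's pipeline: the function name `pnn50` and the sample intervals are hypothetical, and the denominator follows the abstract's wording (total number of NN intervals), whereas some software divides by the number of successive differences instead.

```python
def pnn50(rr_intervals_ms):
    """Proportion of successive R-R interval pairs differing by more than 50 ms.

    Hypothetical helper: the denominator is the total number of intervals,
    as worded in the abstract; other tools may use len(intervals) - 1.
    """
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two R-R intervals")
    # Absolute differences between each interval and the next one.
    diffs = [abs(b - a) for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    # NN50: number of successive pairs that differ by more than 50 ms.
    nn50 = sum(d > 50 for d in diffs)
    return nn50 / len(rr_intervals_ms)


# Made-up R-R intervals in milliseconds, for illustration only.
sample = [800, 860, 850, 910, 905, 820]
print(pnn50(sample))
```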
Affiliation(s)
- Doeun Park
- HCI Lab, Yonsei University, Seoul, Republic of Korea
- Minseo Cho
- HCI Lab, Yonsei University, Seoul, Republic of Korea
- Sujin Bae
- Department of Psychiatry, College of Medicine, Chung-Ang University, Seoul, Republic of Korea
- Jinwoo Kim
- HCI Lab, Yonsei University, Seoul, Republic of Korea
- Doug Hyun Han
- Department of Psychiatry, College of Medicine, Chung-Ang University, Seoul, Republic of Korea
8
Suresh NV, Go BC, Fritz CG, Harris J, Ahluwalia V, Xu K, Lu J, Rajasekaran K. The fragility index: how robust are the outcomes of head and neck cancer randomised, controlled trials? J Laryngol Otol 2024; 138:451-456. [PMID: 37795709 PMCID: PMC10950446 DOI: 10.1017/s0022215123001755] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/01/2022] [Revised: 08/12/2023] [Accepted: 08/29/2023] [Indexed: 10/06/2023]
Abstract
BACKGROUND The fragility index represents the minimum number of patients required to convert an outcome from statistically significant to insignificant. This report assesses the fragility index of head and neck cancer randomised, controlled trials. METHODS Studies were extracted from PubMed/Medline, Scopus, Embase and Cochrane databases. RESULTS Overall, 123 randomised, controlled trials were included. The sample size and fragility index medians (interquartile ranges) were 103 (56-213) and 2 (0-5), respectively. The fragility index exceeded the number of patients lost to follow up in 42.3 per cent (n = 52) of studies. A higher fragility index correlated with higher sample size (r = 0.514, p < 0.001), number of events (r = 0.449, p < 0.001) and statistical significance via p-value (r = -0.367, p < 0.001). CONCLUSION Head and neck cancer randomised, controlled trials demonstrated low fragility index values, in which statistically significant results could be nullified by altering the outcomes of just two patients, on average. Future head and neck oncology randomised, controlled trials should report the fragility index in order to provide insight into statistical robustness.
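The fragility index described above can be sketched in code. This is an illustrative implementation under stated assumptions, not the authors' procedure: it uses a two-sided Fisher exact test, flips non-events to events one patient at a time in the arm with fewer events (a common convention), and the function names and trial counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x events in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(1.0, total)


def fragility_index(events_1, n_1, events_2, n_2, alpha=0.05):
    """Number of event flips needed to push the p-value to alpha or above."""
    def p_value(e1, e2):
        return fisher_exact_two_sided(e1, n_1 - e1, e2, n_2 - e2)

    if p_value(events_1, events_2) >= alpha:
        return 0  # the result is not significant to begin with
    flip_arm_1 = events_1 <= events_2  # assumption: modify the arm with fewer events
    e1, e2, flips = events_1, events_2, 0
    while p_value(e1, e2) < alpha:
        if flip_arm_1:
            e1 += 1
        else:
            e2 += 1
        if e1 > n_1 or e2 > n_2:
            raise ValueError("ran out of patients to flip in the chosen arm")
        flips += 1
    return flips


# Made-up trial: 10/100 events in one arm versus 25/100 in the other.
print(fragility_index(10, 100, 25, 100))
```

A small return value means the significant result depends on the outcomes of only a handful of patients, which is the sense in which the abstract calls such trials fragile.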
Affiliation(s)
- Neeraj V Suresh
- Department of Otorhinolaryngology – Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
- Department of Otolaryngology – Head and Neck Surgery, Yale University, New Haven, CT, USA
- Beatrice C Go
- Department of Otorhinolaryngology – Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
- Christian G Fritz
- Department of Otorhinolaryngology – Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
- Jacob Harris
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Vinayak Ahluwalia
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Katherine Xu
- Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
- Joseph Lu
- Sidney Kimmel Medical College at Thomas Jefferson University, Philadelphia, PA, USA
- Karthik Rajasekaran
- Department of Otorhinolaryngology – Head and Neck Surgery, University of Pennsylvania, Philadelphia, PA, USA
- Leonard Davis Institute of Health Economics, University of Pennsylvania, Philadelphia, PA, USA
9
Manolov R, Onghena P. Testing delayed, gradual, and temporary treatment effects in randomized single-case experiments: A general response function framework. Behav Res Methods 2024; 56:3915-3936. [PMID: 37749426 PMCID: PMC11133040 DOI: 10.3758/s13428-023-02230-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 08/29/2023] [Indexed: 09/27/2023]
Abstract
Randomization tests represent a class of significance tests to assess the statistical significance of treatment effects in randomized single-case experiments. Most applications of single-case randomization tests concern simple treatment effects: immediate, abrupt, and permanent changes in the level of the outcome variable. However, researchers are confronted with delayed, gradual, and temporary treatment effects; in general, with "response functions" that are markedly different from single-step functions. We here introduce a general framework that allows specifying a test statistic for a randomization test based on predicted response functions that is sensitive to a wide variety of data patterns beyond immediate and sustained changes in level: different latencies (degrees of delay) of effect, abrupt versus gradual effects, and different durations of the effect (permanent or temporary). There may be reasonable expectations regarding the kind of effect (abrupt or gradual), entailing a different focal data feature (e.g., level or slope). However, the exact amount of latency and the exact duration of a temporary effect may not be known a priori, justifying an exploratory approach studying the effect of specifying different latencies or delayed effects and different durations for temporary effects. We provide illustrations of the proposal with real data, and we present a user-friendly freely available web application implementing it.
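As a rough illustration of a single-case randomization test, the sketch below handles the simplest setting only: an AB design whose intervention start point was drawn at random from a set of admissible points. It is not the authors' response-function framework or their web application. The function name, the `delay` parameter (a crude stand-in for a fixed latency, skipping the first post-intervention observations) and the data are all assumptions.

```python
from statistics import mean


def ab_randomization_test(scores, actual_start, min_phase_len=3, delay=0):
    """One-sided randomization test for an increase in level in an AB design.

    scores: outcome per measurement occasion; actual_start: index of the
    first intervention-phase observation. Hypothetical helper for illustration.
    """
    n = len(scores)
    # Admissible start points: each phase keeps at least min_phase_len points.
    candidate_starts = range(min_phase_len, n - min_phase_len + 1)
    if actual_start not in candidate_starts:
        raise ValueError("actual_start is not an admissible start point")
    if not 0 <= delay < min_phase_len:
        raise ValueError("delay must be smaller than min_phase_len")

    def level_change(start):
        # Mean of the (possibly delayed) B phase minus mean of the A phase.
        return mean(scores[start + delay:]) - mean(scores[:start])

    observed = level_change(actual_start)
    reference = [level_change(s) for s in candidate_starts]
    # p-value: share of admissible assignments at least as extreme as observed.
    p_value = sum(stat >= observed for stat in reference) / len(reference)
    return observed, p_value


# Made-up series with the intervention starting at index 6.
data = [3, 4, 3, 4, 3, 4, 8, 9, 8, 9, 8, 9]
print(ab_randomization_test(data, actual_start=6))
```

With only a few admissible start points the smallest attainable p-value is one over their number, so the design itself limits how much evidence a single series can provide.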
Affiliation(s)
- Rumen Manolov
- Department of Social Psychology and Quantitative Psychology, Faculty of Psychology, University of Barcelona, Passeig de la Vall d'Hebron 171, 08035, Barcelona, Spain.
- Patrick Onghena
- Faculty of Psychology and Educational Sciences, Methodology of Educational Sciences Research Group, KU Leuven, Tiensestraat 102, 3000, Leuven, Belgium
10
Shatz I. Assumption-checking rather than (just) testing: The importance of visualization and effect size in statistical diagnostics. Behav Res Methods 2024; 56:826-845. [PMID: 36869217 PMCID: PMC10830673 DOI: 10.3758/s13428-023-02072-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/17/2023] [Indexed: 03/05/2023]
Abstract
Statistical methods generally have assumptions (e.g., normality in linear regression models). Violations of these assumptions can cause various issues, like statistical errors and biased estimates, whose impact can range from inconsequential to critical. Accordingly, it is important to check these assumptions, but this is often done in a flawed way. Here, I first present a prevalent but problematic approach to diagnostics: testing assumptions using null hypothesis significance tests (e.g., the Shapiro-Wilk test of normality). Then, I consolidate and illustrate the issues with this approach, primarily using simulations. These issues include statistical errors (i.e., false positives, especially with large samples, and false negatives, especially with small samples), false binarity, limited descriptiveness, misinterpretation (e.g., of p-value as an effect size), and potential testing failure due to unmet test assumptions. Finally, I synthesize the implications of these issues for statistical diagnostics, and provide practical recommendations for improving such diagnostics. Key recommendations include maintaining awareness of the issues with assumption tests (while recognizing they can be useful), using appropriate combinations of diagnostic methods (including visualization and effect sizes) while recognizing their limitations, and distinguishing between testing and checking assumptions. Additional recommendations include judging assumption violations as a complex spectrum (rather than a simplistic binary), using programmatic tools that increase replicability and decrease researcher degrees of freedom, and sharing the material and rationale involved in the diagnostics.
11
García-Pérez MA. Use and misuse of corrections for multiple testing. METHODS IN PSYCHOLOGY 2023. [DOI: 10.1016/j.metip.2023.100120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023] Open
12
Ekkekakis P, Swinton P, Tiller NB. Extraordinary Claims in the Literature on High-Intensity Interval Training (HIIT): I. Bonafide Scientific Revolution or a Looming Crisis of Replication and Credibility? Sports Med 2023; 53:1865-1890. [PMID: 37561389 DOI: 10.1007/s40279-023-01880-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Accepted: 06/15/2023] [Indexed: 08/11/2023]
Abstract
The literature on high-intensity interval training (HIIT) contains claims that, if true, could revolutionize the science and practice of exercise. This critical analysis examines two varieties of claims: (i) HIIT is effective in improving various indices of fitness and health, and (ii) HIIT is as effective as more time-consuming moderate-intensity continuous exercise. Using data from two recent systematic reviews as working examples, we show that studies in both categories exhibit considerable weaknesses when judged through the prism of fundamental statistical principles. Predominantly, small-to-medium effects are investigated in severely underpowered studies, thus greatly increasing the risk of both type I and type II errors of statistical inference. Studies in the first category combine the volatility of estimates associated with small samples with numerous dependent variables analyzed without consideration of the inflation of the type I error rate. Studies in the second category inappropriately use the p > 0.05 criterion from small studies to support claims of 'similar' or 'comparable' effects. It is concluded that the situation in the HIIT literature is reminiscent of the research climate that led to the replication crisis in psychology. As in psychology, this could be an opportunity to reform statistical practices in exercise science.
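The inflation of the type I error rate mentioned above has a simple closed form when the tests are independent: with m tests each run at level alpha, the chance of at least one false positive is 1 - (1 - alpha)^m. The sketch below assumes independence, which real sets of dependent variables rarely satisfy exactly; the function names are hypothetical, and the Bonferroni threshold is shown only as one familiar correction, not as the article's recommendation.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across independent tests."""
    # Each test avoids a false positive with probability (1 - alpha).
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test threshold that keeps the familywise rate at or below alpha."""
    return alpha / num_tests


# Illustrative numbers of dependent variables analysed in one small study.
for m in (1, 5, 10, 20):
    print(m, round(familywise_error_rate(m), 3), bonferroni_alpha(m))
```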
Affiliation(s)
- Panteleimon Ekkekakis
- Department of Kinesiology, Michigan State University, 308 W Circle Dr #134, East Lansing, MI, 48824, USA.
- Paul Swinton
- School of Health Sciences, Robert Gordon University, Aberdeen, Scotland, UK
- Nicholas B Tiller
- The Lundquist Institute for Biomedical Innovation at Harbor-UCLA Medical Center, Torrance, CA, USA
13
Das D, Das T. The "P"-Value: The Primary Alphabet of Research Revisited. Int J Prev Med 2023; 14:41. [PMID: 37351025 PMCID: PMC10284198 DOI: 10.4103/ijpvm.ijpvm_200_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2022] [Accepted: 01/09/2023] [Indexed: 06/24/2023] Open
Abstract
All research revolves around the P value. A value less than 0.05 is considered to be statistically significant. Very few researchers are aware of the history, real-world significance, statistical insight, and in-depth criticism of this monumental alphabet of research. This article will provide detailed insight into the most common molecule of research, which will be rewarding for young students and researchers in the primary world of research. It is not a simple value; it is the longest and broadest description of research squeezed into a number, for everyone from the ground-level worker to the principal investigator. The present review will provide a detailed and unique insight into the P value, which would be rewarding for primary care physicians in translating research into their clinical practice.
Affiliation(s)
- Debasish Das
- Department of Cardiology, All India Institute of Medical Sciences (AIIMS), Bhubaneswar, Odisha, India
- Tutan Das
- Department of Cardiology, All India Institute of Medical Sciences (AIIMS), Bhubaneswar, Odisha, India
14
Uygun Tunç D, Tunç MN, Lakens D. The epistemic and pragmatic function of dichotomous claims based on statistical hypothesis tests. THEORY & PSYCHOLOGY 2023. [DOI: 10.1177/09593543231160112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/18/2023]
Abstract
Researchers commonly make dichotomous claims based on continuous test statistics. Many have branded the practice as a misuse of statistics and criticize scientists for the widespread application of hypothesis tests to tentatively reject a hypothesis (or not) depending on whether a p-value is below or above an alpha level. Although dichotomous claims are rarely explicitly defended, we argue they play an important epistemological and pragmatic role in science. The epistemological function of dichotomous claims consists in transforming data into quasibasic statements, which are tentatively accepted singular facts that can corroborate or falsify theoretical claims. This transformation requires a prespecified methodological decision procedure such as Neyman-Pearson hypothesis tests. From the perspective of methodological falsificationism these decision procedures are necessary, as probabilistic statements (e.g., continuous test statistics) cannot function as falsifiers of substantive hypotheses. The pragmatic function of dichotomous claims is to facilitate scrutiny and criticism among peers by generating contestable claims, a process referred to by Popper as “conjectures and refutations.” We speculate about how the surprisingly widespread use of a 5% alpha level might have facilitated this pragmatic function. Abandoning dichotomous claims, for example because researchers commonly misuse p-values, would sacrifice their crucial epistemic and pragmatic functions.
Affiliation(s)
- Duygu Uygun Tunç
- Eindhoven University of Technology
- Middle East Technical University
15
Franco NH, Fry DJ. Case-based teaching of experimental design - contributions for meaningful learning. Lab Anim 2023; 57:192-203. [PMID: 36739493 DOI: 10.1177/00236772221150299] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023]
Abstract
This article argues the need for education and training of researchers carrying out animal studies on the fundamentals of experimental design (ED), as a key means of improving the reliability and reproducibility of preclinical results. The current landscape in ED education in Europe is presented, and we make the case for dedicated tutor-guided teaching of ED. With less than a day dedicated to it in many courses, effective techniques for communicating key issues are needed. We have developed two approaches that transfer to experimental design teaching the case-study, problem-solving techniques known to be effective in other fields. They use realistic research scenarios to provoke discussion and engage learning. In one, the scenario is for group discussion or informal or formal assessment with subsequent tutor-led discussion of key points. For this, each scenario needs a clear statement of the purpose of the research study, simplified text outlining the comparisons and procedures, and a statement of the outcome measure. In the other approach, the scenario is used with freely available software with a good graphical output to explore the sizing of experiments and the use of both sexes. Trainee feedback and informal assessment show that these approaches can make for interesting and memorable sessions and offer a useful contribution to improvement in experimental design teaching so that it produces meaningful learning that can translate into better practice.
Affiliation(s)
- Nuno H Franco
- i3S - Instituto de Investigação e Inovação em Saúde, Universidade do Porto, Portugal
- Derek J Fry
- School of Biological Sciences, University of Manchester, UK
16
Rognli EW, Zahl‐Olsen R, Rekdal SS, Hoffart A, Bertelsen TB. Editorial perspective: Bayesian statistical methods are useful for researchers in child and adolescent mental health. J Child Psychol Psychiatry 2023; 64:339-342. [PMID: 35818323 PMCID: PMC10084248 DOI: 10.1111/jcpp.13662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/02/2022] [Indexed: 01/17/2023]
Abstract
Bayesian statistical approaches offer nuanced, detailed, and intuitive analyses, even with small sample sizes. Although these qualities are highly relevant for researchers in child and adolescent mental health, Bayesian methods are still quite rarely employed. This editorial perspective will briefly describe what is different about Bayesian statistical methods, discuss some of the ways they may benefit research in our field, and provide an introduction to how Bayesian statistics are employed in practical research.
Affiliation(s)
- Erling W. Rognli
  - Department of Child and Adolescent Mental Health Services, Akershus University Hospital, Lørenskog, Norway
- Rune Zahl‐Olsen
  - Department of Child and Adolescent Mental Health, Sørlandet Hospital, Kristiansand, Norway
- Sondre Sverd Rekdal
  - Department of Child and Adolescent Mental Health, Sørlandet Hospital, Kristiansand, Norway
- Asle Hoffart
  - Research Institute of Modum Bad Psychiatric Hospital, Vikersund, Norway
  - Department of Psychology, University of Oslo, Oslo, Norway
- Thomas Bjerregaard Bertelsen
  - Department of Child and Adolescent Mental Health, Sørlandet Hospital, Kristiansand, Norway
  - Department of Clinical Child and Adolescent Psychology, University of Bergen, Bergen, Norway
|
17
|
Statistical Analysis in the Presence of Spatial Autocorrelation: Selected Sampling Strategy Effects. Stats 2022. [DOI: 10.3390/stats5040081] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2022] [Open Access]
Abstract
Fundamental to most classical data collection sampling theory development is the random drawings assumption requiring that each targeted population member has a known sample selection (i.e., inclusion) probability. Frequently, however, unrestricted random sampling of spatially autocorrelated data is impractical and/or inefficient. Instead, randomly choosing a population subset accounts for its exhibited spatial pattern by utilizing a grid, which often provides improved parameter estimates, such as the geographic landscape mean, at least via its precision. Unfortunately, spatial autocorrelation latent in these data can produce a questionable mean and/or standard error estimate because each sampled population member contains information about its nearby members, a data feature explicitly acknowledged in model-based inference, but ignored in design-based inference. This autocorrelation effect prompted the development of formulae for calculating an effective sample size (i.e., the equivalent number of sample selections from a geographically randomly distributed population that would yield the same sampling error) estimate. Some researchers recently challenged this and other aspects of spatial statistics as being incorrect/invalid/misleading. This paper seeks to address this category of misconceptions, demonstrating that the effective geographic sample size is a valid and useful concept regardless of the inferential basis invoked. Its spatial statistical methodology builds upon the preceding ingredients.
|
18
|
Mesquida C, Murphy J, Lakens D, Warne J. Replication concerns in sports and exercise science: a narrative review of selected methodological issues in the field. Royal Society Open Science 2022; 9:220946. [PMID: 36533197 PMCID: PMC9748505 DOI: 10.1098/rsos.220946] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 07/20/2022] [Accepted: 11/07/2022] [Indexed: 06/17/2023]
Abstract
Methodological issues such as publication bias, questionable research practices and underpowered study designs are known to decrease the replicability of study findings. The presence of such issues has been widely established across different research fields, especially in psychology. Their presence raised the first concerns that the replicability of study findings could be low and led researchers to conduct large replication projects. These replication projects revealed that a significant portion of original study findings could not be replicated, giving rise to the conceptualization of the replication crisis. Although previous research in the field of sports and exercise science has identified the first warning signs, such as an overwhelming proportion of significant findings, small sample sizes and lack of data availability, their possible consequences for the replicability of our field have been overlooked. We discuss the consequences of the above issues for the replicability of our field and offer potential solutions to improve replicability.
Affiliation(s)
- Cristian Mesquida
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
- Jennifer Murphy
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
- Daniël Lakens
  - Human-Technology Interaction Group, Eindhoven University of Technology, Eindhoven, The Netherlands
- Joe Warne
  - Centre of Applied Science for Health, Technological University Dublin, Tallaght, Dublin, Ireland
|
19
|
Calin-Jageman RJ. Better Inference in Neuroscience: Test Less, Estimate More. J Neurosci 2022; 42:8427-8431. [PMID: 36351833 PMCID: PMC9665913 DOI: 10.1523/jneurosci.1133-22.2022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2022] [Revised: 08/28/2022] [Accepted: 08/29/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Null-hypothesis significance testing (NHST) has become the main tool of inference in neuroscience, and yet evidence suggests we do not use this tool well: tests are often planned poorly, conducted unfairly, and interpreted invalidly. This editorial makes the case that in addition to reforms to increase rigor we should test less, reserving NHST for clearly confirmatory contexts in which the researcher has derived a quantitative prediction, can provide the inputs needed to plan a quality test, and can specify the criteria not only for confirming their hypothesis but also for rejecting it. A reduction in testing would be accompanied by an expansion of the use of estimation [effect sizes and confidence intervals (CIs)]. Estimation is more suitable for exploratory research, provides the inputs needed to plan strong tests, and provides important contexts for properly interpreting tests.
|
20
|
More value from less food? Effects of epicurean labeling on moderate eating in the United States and in France. Appetite 2022; 178:106262. [PMID: 35926807 DOI: 10.1016/j.appet.2022.106262] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/27/2022] [Revised: 06/30/2022] [Accepted: 07/28/2022] [Indexed: 11/22/2022]
Abstract
Emerging research has shown that sensory-based interventions (e.g., inviting people to mindfully focus on the multisensory aspects of eating) can be a viable alternative to nutrition-based interventions (e.g., nutrition labeling) to encourage moderate eating. We contribute to this literature in two ways. First, we propose a novel and simple sensory-based intervention to increase the appeal of moderate food portions in commercial settings, epicurean labeling, which consists in emphasizing the aesthetic, multisensory properties of the food when describing it on menus or packages. Second, we show theory-relevant cross-cultural differences in the effectiveness of this intervention between the United States and France, two food cultures at the opposite ends of the hedonic-utilitarian food attitude spectrum. We report the results of a multi-day field experiment at a French cafeteria showing that epicurean labeling, unlike nutrition labeling, reduces intake while increasing the perceived monetary value of the meal thanks to higher savoring. We then show in a matched cross-national online experiment that epicurean labeling is more effective in France than in the United States. We provide additional evidence of this cross-cultural variation in a study of 9154 food products sold in supermarkets in both countries. We find that epicurean labeling is more prevalent, but also more likely to be associated with smaller portions in France than in the United States. While sensory-based interventions are a promising alternative to nutrition-based interventions, it is necessary to develop business-friendly interventions that can be implemented in everyday life, as well as to consider cultural factors that can modulate their effectiveness.
|
21
|
Sawada T, Huang L, Koryakov OY. Some misunderstandings in psychology about confidence intervals. Front Psychol 2022; 13:948423. [PMID: 35936264 PMCID: PMC9355556 DOI: 10.3389/fpsyg.2022.948423] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Accepted: 06/30/2022] [Indexed: 11/13/2022] [Open Access]
Affiliation(s)
- Tadamasa Sawada
  - School of Psychology, National Research University Higher School of Economics, Moscow, Russia
  - Akian College of Science and Engineering, American University of Armenia, Yerevan, Armenia
  - Department of Psychology, Russian-Armenian (Slavonic) University, Yerevan, Armenia
  - *Correspondence: Tadamasa Sawada
- Lorick Huang
  - Institut Mathématiques de Toulouse, Toulouse, France
- Oleg Y. Koryakov
  - School of Psychology, National Research University Higher School of Economics, Moscow, Russia
|
22
|
Lakens D. Correspondence: Reward, but do not yet require, interval hypothesis tests. J Physiother 2022; 68:213-214. [PMID: 35760725 DOI: 10.1016/j.jphys.2022.06.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/23/2022] [Accepted: 06/06/2022] [Indexed: 11/17/2022] [Open Access]
Affiliation(s)
- Daniël Lakens
  - Human-Technology Interaction Group, Eindhoven University of Technology, Eindhoven, The Netherlands
|
23
|
Ellis RJ. Questionable Research Practices, Low Statistical Power, and Other Obstacles to Replicability: Why Preclinical Neuroscience Research Would Benefit from Registered Reports. eNeuro 2022; 9:ENEURO.0017-22.2022. [PMID: 35922130 PMCID: PMC9351632 DOI: 10.1523/eneuro.0017-22.2022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/13/2022] [Revised: 05/22/2022] [Accepted: 05/31/2022] [Indexed: 02/03/2023] [Open Access]
Abstract
Replicability, the degree to which a previous scientific finding can be repeated in a distinct set of data, has been considered an integral component of institutionalized scientific practice since its inception several hundred years ago. In the past decade, large-scale replication studies have demonstrated that replicability is far from favorable across multiple scientific fields. Here, I evaluate this literature and describe contributing factors including the prevalence of questionable research practices (QRPs), misunderstanding of p-values, and low statistical power. I subsequently discuss how these issues manifest specifically in preclinical neuroscience research. I conclude that these problems are multifaceted and difficult to solve, relying on the actions of early and late career researchers, funding sources, academic publishers, and others. I assert that any viable solution to the problem of substandard replicability must include changing academic incentives, with adoption of registered reports being the most immediately impactful and pragmatic strategy. For animal research in particular, comprehensive reporting guidelines that document potential sources of sensitivity for experimental outcomes are an essential addition.
Affiliation(s)
- Randall J Ellis
  - Friedman Brain Institute, Department of Neuroscience, Addiction Institute of Mount Sinai, Icahn School of Medicine at Mount Sinai, New York, NY 10029
|
24
|
Kudrna L, Kushlev K. Money Does Not Always Buy Happiness, but Are Richer People Less Happy in Their Daily Lives? It Depends on How You Analyze Income. Front Psychol 2022; 13:883137. [PMID: 35719460 PMCID: PMC9199446 DOI: 10.3389/fpsyg.2022.883137] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2022] [Accepted: 04/08/2022] [Indexed: 11/18/2022] [Open Access]
Abstract
Do people who have more money feel happier during their daily activities? Some prior research has found no relationship between income and daily happiness when treating income as a continuous variable in OLS regressions, although results differ between studies. We re-analyzed existing data from the United States and Germany, treating household income as a categorical variable and using lowess and spline regressions to explore nonlinearities. Our analyses reveal that these methodological decisions change the results and conclusions about the relationship between income and happiness. In American and German diary data from 2010 to 2015, results for the continuous treatment of income showed a null relationship with happiness, whereas the categorization of income showed that some of those with higher incomes reported feeling less happy than some of those with lower incomes. Lowess and spline regressions suggested null results overall, and there was no evidence of a relationship between income and happiness in Experience Sampling Methodology (ESM) data. Not all analytic approaches generate the same results, which may help to explain discrepant results in existing studies about the correlates of happiness. Future research should be explicit about its approaches to measuring and analyzing income when studying the relationship between income and subjective well-being, ideally testing different approaches and drawing conclusions from the pattern of results across approaches.
Affiliation(s)
- Laura Kudrna
  - Institute of Applied Health Research, University of Birmingham, Birmingham, United Kingdom
  - *Correspondence: Laura Kudrna
- Kostadin Kushlev
  - Department of Psychology, Georgetown University, Washington, DC, United States
|
25
|
AbdusSalam SS, Agocs FJ, Allanach BC, Athron P, Balázs C, Bagnaschi E, Bechtle P, Buchmueller O, Beniwal A, Bhom J, Bloor S, Bringmann T, Buckley A, Butter A, Camargo-Molina JE, Chrzaszcz M, Conrad J, Cornell JM, Danninger M, de Blas J, De Roeck A, Desch K, Dolan M, Dreiner H, Eberhardt O, Ellis J, Farmer B, Fedele M, Flächer H, Fowlie A, Gonzalo TE, Grace P, Hamer M, Handley W, Harz J, Heinemeyer S, Hoof S, Hotinli S, Jackson P, Kahlhoefer F, Kowalska K, Krämer M, Kvellestad A, Martinez ML, Mahmoudi F, Santos DM, Martinez GD, Mishima S, Olive K, Paul A, Prim MT, Porod W, Raklev A, Renk JJ, Rogan C, Roszkowski L, Ruiz de Austri R, Sakurai K, Scaffidi A, Scott P, Sessolo EM, Stefaniak T, Stöcker P, Su W, Trojanowski S, Trotta R, Sming Tsai YL, Van den Abeele J, Valli M, Vincent AC, Weiglein G, White M, Wienemann P, Wu L, Zhang Y. Simple and statistically sound recommendations for analysing physical theories. Reports on Progress in Physics. Physical Society (Great Britain) 2022; 85:052201. [PMID: 35522172 DOI: 10.1088/1361-6633/ac60ac] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/27/2021] [Accepted: 03/24/2022] [Indexed: 06/14/2023]
Abstract
Physical theories that depend on many parameters or are tested against data from many different experiments pose unique challenges to statistical inference. Many models in particle physics, astrophysics and cosmology fall into one or both of these categories. These issues are often sidestepped with statistically unsound ad hoc methods, involving intersection of parameter intervals estimated by multiple experiments, and random or grid sampling of model parameters. Whilst these methods are easy to apply, they exhibit pathologies even in low-dimensional parameter spaces, and quickly become problematic to use and interpret in higher dimensions. In this article we give clear guidance for going beyond these procedures, suggesting where possible simple methods for performing statistically sound inference, and recommendations of readily-available software tools and standards that can assist in doing so. Our aim is to provide any physicists lacking comprehensive statistical training with recommendations for reaching correct scientific conclusions, with only a modest increase in analysis burden. Our examples can be reproduced with the code publicly available at Zenodo.
Affiliation(s)
- Fruzsina J Agocs
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
  - Kavli Institute for Cosmology, University of Cambridge, Madingley Road, Cambridge, CB3 0HA, United Kingdom
- Peter Athron
  - Department of Physics and Institute of Theoretical Physics, Nanjing Normal University, Nanjing, Jiangsu 210023, People's Republic of China
  - School of Physics and Astronomy, Monash University, Melbourne, VIC 3800, Australia
- Csaba Balázs
  - School of Physics and Astronomy, Monash University, Melbourne, VIC 3800, Australia
- Philip Bechtle
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Oliver Buchmueller
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
- Ankit Beniwal
  - Centre for Cosmology, Particle Physics and Phenomenology (CP3), Université catholique de Louvain, B-1348 Louvain-la-Neuve, Belgium
- Jihyun Bhom
  - Institute of Nuclear Physics, Polish Academy of Sciences, Krakow, Poland
- Sanjay Bloor
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
  - School of Mathematics and Physics, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Torsten Bringmann
  - Department of Physics, University of Oslo, Box 1048, Blindern, N-0316 Oslo, Norway
- Andy Buckley
  - School of Physics and Astronomy, University of Glasgow, University Place, Glasgow, G12 8QQ, United Kingdom
- Anja Butter
  - Institut für Theoretische Physik, Universität Heidelberg, Germany
- Marcin Chrzaszcz
  - Institute of Nuclear Physics, Polish Academy of Sciences, Krakow, Poland
- Jan Conrad
  - Oskar Klein Centre for Cosmoparticle Physics, AlbaNova University Centre, SE-10691 Stockholm, Sweden
- Jonathan M Cornell
  - Department of Physics, Weber State University, 1415 Edvalson St., Dept. 2508, Ogden, UT 84408, United States of America
- Matthias Danninger
  - Department of Physics, Simon Fraser University, 8888 University Drive, Burnaby B.C., Canada
- Jorge de Blas
  - Institute of Particle Physics Phenomenology, Durham University, Durham DH1 3LE, United Kingdom
- Albert De Roeck
  - Experimental Physics Department, CERN, CH-1211 Geneva 23, Switzerland
- Klaus Desch
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Matthew Dolan
  - ARC Centre of Excellence for Dark Matter Particle Physics, School of Physics, The University of Melbourne, Victoria 3010, Australia
- Herbert Dreiner
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Otto Eberhardt
  - Instituto de Física Corpuscular, IFIC-UV/CSIC, Apt. Correus 22085, E-46071, Valencia, Spain
- John Ellis
  - Theoretical Particle Physics and Cosmology Group, Department of Physics, King's College London, London WC2R 2LS, United Kingdom
- Ben Farmer
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
  - Bureau of Meteorology, Melbourne, VIC 3001, Australia
- Marco Fedele
  - Institut für Theoretische Teilchenphysik, Karlsruhe Institute of Technology, D-76131 Karlsruhe, Germany
- Henning Flächer
  - HH Wills Physics Laboratory, University of Bristol, Tyndall Avenue, Bristol BS8 1TL, United Kingdom
- Andrew Fowlie
  - Department of Physics, Shahid Beheshti University, Tehran, Iran
  - Department of Physics and Institute of Theoretical Physics, Nanjing Normal University, Nanjing, Jiangsu 210023, People's Republic of China
- Tomás E Gonzalo
  - Department of Physics and Institute of Theoretical Physics, Nanjing Normal University, Nanjing, Jiangsu 210023, People's Republic of China
- Philip Grace
  - ARC Centre for Dark Matter Particle Physics, Department of Physics, University of Adelaide, Adelaide, SA 5005, Australia
- Matthias Hamer
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Will Handley
  - Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
  - Kavli Institute for Cosmology, University of Cambridge, Madingley Road, Cambridge, CB3 0HA, United Kingdom
- Julia Harz
  - Physik Department T70, James-Franck-Straße, Technische Universität München, D-85748 Garching, Germany
- Sven Heinemeyer
  - Instituto de Física Teórica UAM-CSIC, Cantoblanco, 28049, Madrid, Spain
- Sebastian Hoof
  - Institut für Astrophysik und Geophysik, Georg-August-Universität Göttingen, Friedrich-Hund-Platz 1, D-37077 Göttingen, Germany
- Selim Hotinli
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
- Paul Jackson
  - ARC Centre for Dark Matter Particle Physics, Department of Physics, University of Adelaide, Adelaide, SA 5005, Australia
- Felix Kahlhoefer
  - Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Sommerfeldstraße 14, D-52056 Aachen, Germany
- Kamila Kowalska
  - National Centre for Nuclear Research, ul. Pasteura 7, PL-02-093 Warsaw, Poland
- Michael Krämer
  - Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Sommerfeldstraße 14, D-52056 Aachen, Germany
- Anders Kvellestad
  - Department of Physics, University of Oslo, Box 1048, Blindern, N-0316 Oslo, Norway
- Farvah Mahmoudi
  - Université de Lyon, Université Claude Bernard Lyon 1, CNRS/IN2P3, Institut de Physique des 2 Infinis de Lyon, UMR 5822, F-69622, Villeurbanne, France
  - Theoretical Physics Department, CERN, CH-1211 Geneva 23, Switzerland
- Diego Martinez Santos
  - Instituto Galego de Física de Altas Enerxías, Universidade de Santiago de Compostela, Spain
- Gregory D Martinez
  - Physics and Astronomy Department, University of California, Los Angeles, CA 90095, United States of America
- Keith Olive
  - William I. Fine Theoretical Physics Institute, School of Physics and Astronomy, University of Minnesota, Minneapolis, MN 55455, United States of America
- Ayan Paul
  - Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany
  - Institut für Physik, Humboldt-Universität zu Berlin, D-12489 Berlin, Germany
- Markus Tobias Prim
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Werner Porod
  - University of Würzburg, Emil-Hilb-Weg 22, D-97074 Würzburg, Germany
- Are Raklev
  - Department of Physics, University of Oslo, Box 1048, Blindern, N-0316 Oslo, Norway
- Janina J Renk
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
  - School of Mathematics and Physics, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
  - Oskar Klein Centre for Cosmoparticle Physics, AlbaNova University Centre, SE-10691 Stockholm, Sweden
- Christopher Rogan
  - Department of Physics and Astronomy, University of Kansas, Lawrence, KS 66045, United States of America
- Leszek Roszkowski
  - National Centre for Nuclear Research, ul. Pasteura 7, PL-02-093 Warsaw, Poland
  - Astrocent, Nicolaus Copernicus Astronomical Center Polish Academy of Sciences, Bartycka 18, PL-00-716 Warsaw, Poland
- Kazuki Sakurai
  - Institute of Theoretical Physics, Faculty of Physics, University of Warsaw, ul. Pasteura 5, PL-02-093 Warsaw, Poland
- Andre Scaffidi
  - Istituto Nazionale di Fisica Nucleare, Sezione di Torino, via P. Giuria 1, I-10125 Torino, Italy
- Pat Scott
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
  - School of Mathematics and Physics, The University of Queensland, St. Lucia, Brisbane, QLD 4072, Australia
- Tim Stefaniak
  - Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany
- Patrick Stöcker
  - Institute for Theoretical Particle Physics and Cosmology (TTK), RWTH Aachen University, Sommerfeldstraße 14, D-52056 Aachen, Germany
- Wei Su
  - ARC Centre for Dark Matter Particle Physics, Department of Physics, University of Adelaide, Adelaide, SA 5005, Australia
  - Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
- Sebastian Trojanowski
  - National Centre for Nuclear Research, ul. Pasteura 7, PL-02-093 Warsaw, Poland
  - Astrocent, Nicolaus Copernicus Astronomical Center Polish Academy of Sciences, Bartycka 18, PL-00-716 Warsaw, Poland
- Roberto Trotta
  - Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, United Kingdom
  - SISSA International School for Advanced Studies, Via Bonomea 265, 34136, Trieste, Italy
- Yue-Lin Sming Tsai
  - Key Laboratory of Dark Matter and Space Astronomy, Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210033, People's Republic of China
- Mauro Valli
  - Department of Physics and Astronomy, University of California, Irvine, California 92697, United States of America
- Aaron C Vincent
  - Department of Physics, Engineering Physics and Astronomy, Queen's University, Kingston ON K7L 3N6, Canada
  - Arthur B McDonald Canadian Astroparticle Physics Research Institute, Kingston ON K7L 3N6, Canada
  - Perimeter Institute for Theoretical Physics, Waterloo ON N2L 2Y5, Canada
- Georg Weiglein
  - Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany
  - Institut für Theoretische Physik, Universität Hamburg, Luruper Chaussee 149, 22761 Hamburg, Germany
- Martin White
  - ARC Centre for Dark Matter Particle Physics, Department of Physics, University of Adelaide, Adelaide, SA 5005, Australia
- Peter Wienemann
  - University of Bonn, Physikalisches Institut, Nussallee 12, D-53115 Bonn, Germany
- Lei Wu
  - Department of Physics and Institute of Theoretical Physics, Nanjing Normal University, Nanjing, Jiangsu 210023, People's Republic of China
- Yang Zhang
  - Department of Physics and Institute of Theoretical Physics, Nanjing Normal University, Nanjing, Jiangsu 210023, People's Republic of China
  - School of Physics, Zhengzhou University, ZhengZhou 450001, People's Republic of China
|
26
|
Berrar D. Using p-values for the comparison of classifiers: pitfalls and alternatives. Data Min Knowl Discov 2022. [DOI: 10.1007/s10618-022-00828-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/27/2022]
|
27
|
Temp AGM, Naumann M, Hermann A, Glaß H. Applied Bayesian Approaches for Research in Motor Neuron Disease. Front Neurol 2022; 13:796777. [PMID: 35401404 PMCID: PMC8987707 DOI: 10.3389/fneur.2022.796777] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 02/23/2022] [Indexed: 11/13/2022] [Open Access]
Abstract
Statistical evaluation of empirical data is the basis of the modern scientific method. Available tools include various hypothesis tests for specific data structures, as well as methods that are used to quantify the uncertainty of an obtained result. Statistics are pivotal, but many misconceptions arise due to their complexity and difficult-to-acquire mathematical background. Even though most studies rely on a frequentist interpretation of statistical readouts, the application of Bayesian statistics has increased due to the availability of easy-to-use software suites and increased outreach favouring this topic in the scientific community. Bayesian statistics combine our prior knowledge with the obtained data to express a degree of belief about how likely a certain event is. Bayes factor hypothesis testing (BFHT) provides a straightforward method to evaluate multiple hypotheses at the same time and provides evidence that favors either the null hypothesis or the alternative hypothesis. In the present perspective, we show the merits of BFHT for three different use cases: a clinical trial, basic research, and a single case study. We show that Bayesian statistics is a viable addition to a scientist's statistical toolset, which can help to interpret data.
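As a rough illustration of how a Bayes factor weighs a null against an alternative hypothesis, the sketch below works through the simplest textbook case: k successes in n binary trials, a point null theta = theta0, and an alternative that places a uniform prior on theta. This is not taken from the article; the function name `bf10_binomial` and the choice of a uniform prior are assumptions made here for illustration only.

```python
from math import comb


def bf10_binomial(k: int, n: int, theta0: float = 0.5) -> float:
    """Bayes factor BF10 for k successes in n trials.

    H0: theta = theta0 (point null).
    H1: theta ~ Uniform(0, 1) (an assumed, deliberately simple prior).
    """
    # Marginal likelihood under H1: the binomial likelihood integrated
    # over a uniform prior equals 1 / (n + 1) for every k.
    marginal_h1 = 1.0 / (n + 1)
    # Likelihood of the data under the point null.
    likelihood_h0 = comb(n, k) * theta0**k * (1 - theta0) ** (n - k)
    return marginal_h1 / likelihood_h0


# 10 successes out of 10 favour H1 (BF10 = 1024/11, about 93.1).
print(bf10_binomial(10, 10))
# 5 successes out of 10 favour H0 (BF10 about 0.37, i.e. BF01 about 2.7).
print(bf10_binomial(5, 10))
```

A value above 1 counts as evidence for the alternative and a value below 1 as evidence for the null, which is the property the abstract highlights. In practice the result depends on the prior chosen for H1.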
Affiliation(s)
- Anna G. M. Temp
  - Translational Neurodegeneration Section “Albrecht Kossel,” Department of Neurology, University Medical Centre, Rostock, Germany
  - Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Rostock, Germany
  - Neurozentrum, Berufsgenossenschaftliches Klinikum Hamburg, Hamburg, Germany
  - *Correspondence: Anna G. M. Temp; orcid.org/0000-0003-0671-121X
- Marcel Naumann
  - Translational Neurodegeneration Section “Albrecht Kossel,” Department of Neurology, University Medical Centre, Rostock, Germany
- Andreas Hermann
  - Translational Neurodegeneration Section “Albrecht Kossel,” Department of Neurology, University Medical Centre, Rostock, Germany
  - Deutsches Zentrum für Neurodegenerative Erkrankungen (DZNE), Rostock, Germany
  - Center for Transdisciplinary Neurosciences Rostock, University Medical Centre, Rostock, Germany
- Hannes Glaß
  - Translational Neurodegeneration Section “Albrecht Kossel,” Department of Neurology, University Medical Centre, Rostock, Germany
|
28
|
Schneider MC, Schütz GJ. Don’t Be Fooled by Randomness: Valid p-Values for Single Molecule Microscopy. Frontiers in Bioinformatics 2022; 2:811053. [PMID: 36304307 PMCID: PMC9580918 DOI: 10.3389/fbinf.2022.811053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Accepted: 01/12/2022] [Indexed: 12/04/2022] [Open Access]
Abstract
The human mind shows extraordinary capability at recognizing patterns, while at the same time tending to underestimate the natural scope of random processes. Taken together, this easily misleads researchers in judging whether the observed characteristics of their data are of significance or just the outcome of random effects. One of the best tools to assess whether observed features fall into the scope of pure randomness is statistical significance testing, which quantifies the probability of falsely rejecting a chosen null hypothesis. The central parameter in this context is the p-value, which can be calculated from the recorded data sets. If the p-value is smaller than the level of significance, the null hypothesis is rejected; otherwise it is not. While significance testing has found widespread application in many sciences including the life sciences, it is hardly used in (bio-)physics. We propose here that significance testing provides an important and valid addendum to the toolbox of quantitative (single molecule) biology. It allows one to support a quantitative judgement (the hypothesis) about the data set with a probabilistic assessment. In this manuscript we describe ways for obtaining valid p-values in two selected applications of single molecule microscopy: (i) Nanoclustering in single molecule localization microscopy. Previously, we developed a method termed 2-CLASTA, which allows the calculation of a valid p-value for the null hypothesis of an underlying random distribution of molecules of interest while circumventing overcounting issues. Here, we present an extension to this approach, yielding a single overall p-value for data pooled from multiple cells or experiments. (ii) Single molecule trajectories. Data from a single molecule trajectory are inherently correlated, thus prohibiting a direct analysis via conventional statistical tools. Here, we introduce a block permutation test, which yields a valid p-value for the analysis and comparison of single molecule trajectory data. We exemplify the approach based on FRET trajectories.
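The general idea of a block permutation test for correlated trajectory data can be sketched as follows. This is a generic illustration and not the authors' procedure: the function name `block_permutation_test`, the fixed block length `block_len`, the difference-in-means statistic, and the number of permutations `n_perm` are all assumptions made for the example. The sketch cuts each trajectory into contiguous blocks, shuffles whole blocks between the two trajectories, and so keeps the correlation inside each block intact.

```python
import numpy as np


def block_permutation_test(x, y, block_len, n_perm=2000, seed=0):
    """Two-sided block permutation test for a difference in means.

    Hypothetical helper: x and y are 1-D time series; block_len is an
    assumed block size that should exceed the correlation time of the data.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    def to_blocks(series):
        # Cut the series into contiguous, equally long blocks and
        # drop the incomplete remainder at the end.
        n_blocks = len(series) // block_len
        if n_blocks == 0:
            raise ValueError("series is shorter than block_len")
        return series[: n_blocks * block_len].reshape(n_blocks, block_len)

    blocks_x, blocks_y = to_blocks(x), to_blocks(y)
    pooled = np.vstack([blocks_x, blocks_y])
    n_x = len(blocks_x)

    observed = abs(blocks_x.mean() - blocks_y.mean())
    exceed = 0
    for _ in range(n_perm):
        # Reassign whole blocks to the two groups at random.
        idx = rng.permutation(len(pooled))
        perm_x = pooled[idx[:n_x]]
        perm_y = pooled[idx[n_x:]]
        if abs(perm_x.mean() - perm_y.mean()) >= observed:
            exceed += 1

    # The +1 terms count the observed labelling itself, so p is never 0.
    return (exceed + 1) / (n_perm + 1)
```

With simulated input, a call such as `block_permutation_test(traj_a, traj_b, block_len=10)` returns a p-value that can be compared with a chosen significance level. The choice of block length matters: blocks that are too short break the correlation structure the test is meant to respect.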
|
29
|
Quintana DS. Towards better hypothesis tests in oxytocin research: Evaluating the validity of auxiliary assumptions. Psychoneuroendocrinology 2022; 137:105642. [PMID: 34991063 DOI: 10.1016/j.psyneuen.2021.105642] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 08/12/2021] [Revised: 12/16/2021] [Accepted: 12/17/2021] [Indexed: 10/19/2022]
Abstract
The inconsistent reproducibility of human oxytocin research in the cognitive and behavioral sciences has been attributed to various factors. These factors include small sample sizes, a lack of pre-registered studies, and the absence of overarching theoretical frameworks that can account for oxytocin's effects over a broad range of contexts. While there have been efforts to remedy these issues, there has been very little systematic scrutiny of the role of auxiliary assumptions, which are claims that are not central for testing a hypothesis but nonetheless critical for testing theories. For instance, the hypothesis that oxytocin increases the salience of social cues is predicated on the assumption that intranasally administered oxytocin increases oxytocin levels in the brain. Without robust auxiliary assumptions, it is unclear whether a hypothesis testing failure is due to an incorrect hypothesis or poorly supported auxiliary assumptions. Consequently, poorly supported auxiliary assumptions can be blamed for hypothesis failure, thereby safeguarding theories from falsification. In this article, I will evaluate the body of evidence for key auxiliary assumptions in human behavioral oxytocin research in terms of theory, experimental design, and statistical inference, and highlight assumptions that require stronger evidence. Strong auxiliary assumptions will leave hypotheses vulnerable to falsification, which will improve hypothesis testing and consequently advance our understanding of oxytocin's role in cognition and behavior.
Affiliation(s)
- Daniel S Quintana
- Department of Psychology, University of Oslo, Oslo, Norway; NevSom, Department of Rare Disorders, Oslo University Hospital, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), University of Oslo, Oslo, Norway; KG Jebsen Centre for Neurodevelopmental Disorders, University of Oslo, Oslo, Norway.
|
30
|
Schuengel C. Learning to love the null. J Child Psychol Psychiatry 2022; 63:249-251. [PMID: 35165898 DOI: 10.1111/jcpp.13577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/19/2022] [Indexed: 11/29/2022]
Abstract
Children's behaviour and mental health have the power to surprise us, readers and authors of the Journal of Child Psychology and Psychiatry and laymen alike, if not for the endless variation among people, then for the ever-changing context in which they develop. The hypothetico-deductive method in combination with null-hypothesis significance testing has turned surprise into scientific knowledge. Null effects may in themselves also be surprising and informative, but appear less well represented in the literature. This editorial highlights emerging methodological practices for studying null effects in the most informative way.
Affiliation(s)
- Carlo Schuengel
- Educational and Family Studies, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands; Amsterdam Public Health Research Institute, Amsterdam UMC and Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.
|
31
|
Passarelli DA, Amd M, de Oliveira MA, de Rose JC. Augmenting salivation, but not evaluations, through subliminal conditioning of eating-related words. Behav Processes 2021; 194:104541. [PMID: 34813914 DOI: 10.1016/j.beproc.2021.104541] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Revised: 10/22/2021] [Accepted: 11/08/2021] [Indexed: 12/11/2022]
Abstract
Correlating eating-related words (CS) with positively valenced words (US+) may augment eating-associated motivational responses (e.g., preingestive salivation) with minimal CS knowledge. We tested this claim using a subliminal conditioning procedure, where CS and US were presented under subliminal and supraliminal visual conditions. Three groups of Brazilian undergraduates (N = 69) viewed eating-related words (CS) or their scrambled counterparts (non-CS) followed by positive (US+) or neutral (US-) words. A free-selection visibility check confirmed that subliminally presented CS and non-CS had not been detected by any group. Participants exposed to CS/US+ pairings produced significantly more saliva relative to participants exposed to CS/US- and non-CS/US+ pairings. Reliable induction of salivation, coupled with null outcomes across evaluation measures, suggests that affective information related to eating can subliminally augment preingestive salivation with minimal deliberation.
Affiliation(s)
- Micah Amd
- Federal University of Sao Carlos, Brazil; University of the South Pacific, Fiji
|
32
|
Cherubini JM, MacDonald MJ. Statistical Inferences Using Effect Sizes in Human Endothelial Function Research. Artery Res 2021; 27:176-185. [PMID: 34966462 PMCID: PMC8654719 DOI: 10.1007/s44200-021-00006-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/16/2021] [Accepted: 10/07/2021] [Indexed: 11/28/2022] Open
Abstract
INTRODUCTION Magnitudes of change in endothelial function research can be articulated using effect size statistics. Effect sizes are commonly used in reference to Cohen's seminal guidelines of small (d = 0.2), medium (d = 0.5), and large (d = 0.8). Quantitative analyses of effect size distributions across various research disciplines have revealed values differing from Cohen's original recommendations. Here we examine effect size distributions in human endothelial function research, and the magnitude of small, medium, and large effects for macro and microvascular endothelial function. METHODS Effect sizes reported as standardized mean differences were extracted from meta research available for endothelial function. A frequency distribution was constructed to sort effect sizes. The 25th, 50th, and 75th percentiles were used to derive small, medium, and large effects. Group sample sizes and publication year from primary studies were also extracted to observe any potential trends, related to these factors, in effect size reporting in endothelial function research. RESULTS Seven hundred fifty-two effect sizes were extracted from eligible meta-analyses. We determined small (d = 0.28), medium (d = 0.69), and large (d = 1.21) effects for endothelial function that corresponded to the 25th, 50th, and 75th percentile of the data distribution. CONCLUSION Our data indicate that direct application of Cohen's guidelines would underestimate the magnitude of effects in human endothelial function research. This investigation facilitates future a priori power analyses, provides a practical guiding benchmark for the contextualization of an effect when no other information is available, and further encourages the reporting of effect sizes in endothelial function research.
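The percentile approach described in this abstract (25th, 50th, and 75th percentiles of an effect size distribution taken as small, medium, and large) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: the function names `effect_size_benchmarks` and `classify_effect` are hypothetical, taking absolute values of the standardized mean differences is an assumption the abstract does not confirm, and the input values in the usage note are made up.

```python
import numpy as np


def effect_size_benchmarks(effect_sizes):
    """Derive small/medium/large benchmarks from the quartiles of |d|.

    Assumption: the sign of each standardized mean difference is ignored.
    """
    d = np.abs(np.asarray(effect_sizes, dtype=float))
    small, medium, large = np.percentile(d, [25, 50, 75])
    return {"small": float(small), "medium": float(medium), "large": float(large)}


def classify_effect(d, benchmarks):
    """Label one effect size against the derived benchmarks."""
    magnitude = abs(d)
    if magnitude < benchmarks["small"]:
        return "below small"
    if magnitude < benchmarks["medium"]:
        return "small"
    if magnitude < benchmarks["large"]:
        return "medium"
    return "large"
```

For example, `effect_size_benchmarks([0.1, -0.2, 0.3, 0.4, 0.5])` uses illustrative values only; with real data the input would be the effect sizes extracted from the eligible meta-analyses, and the resulting benchmarks could feed an a priori power analysis.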
Affiliation(s)
- Joshua M. Cherubini
- Department of Kinesiology, Vascular Dynamics Lab, McMaster University, Ivor Wynne Centre, Room E210, 1280 Main Street West, Hamilton, ON L8S 4K1 Canada
- Maureen J. MacDonald
- Department of Kinesiology, Vascular Dynamics Lab, McMaster University, Ivor Wynne Centre, Room E210, 1280 Main Street West, Hamilton, ON L8S 4K1 Canada
|