1. Drouin JR, Flores S. Effects of training length on adaptation to noise-vocoded speech. The Journal of the Acoustical Society of America 2024; 155:2114-2127. [PMID: 38488452] [DOI: 10.1121/10.0025273]
Abstract
Listeners show rapid perceptual learning of acoustically degraded speech, though the amount of exposure required to maximize speech adaptation is unspecified. The current work used a single-session design to examine the effect of auditory training length on perceptual learning for normal-hearing listeners exposed to eight-channel noise-vocoded speech. Participants completed short, medium, or long training using a two-alternative forced choice sentence identification task with feedback. To assess learning and generalization, a 40-trial pre-test and post-test transcription task was administered using trained and novel sentences. Training results showed that all groups performed near ceiling, with no reliable differences. For test data, we evaluated changes in transcription accuracy using separate linear mixed models for trained and novel sentences. In both models, we observed a significant improvement in transcription at post-test relative to pre-test. Critically, the three training groups did not differ in the magnitude of improvement following training. A subsequent Bayes factor analysis evaluating the test-by-group interaction provided strong evidence in support of the null hypothesis. For these stimuli and this procedure, results suggest increased training does not necessarily maximize learning outcomes; both passive and trained experience likely supported adaptation. Findings may contribute to rehabilitation recommendations for listeners adapting to degraded speech signals.
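The Bayes factor evidence "for the null" reported in this abstract can be illustrated with a simpler stand-in for the authors' mixed-model analysis: the BIC approximation BF01 = exp((BIC_alt - BIC_null) / 2) (Wagenmakers, 2007). The sketch below is a hypothetical reconstruction with invented data and model formulas, not the paper's analysis or stimuli.

```python
# Illustrative sketch only: data, formulas, and effect sizes are invented;
# the paper used linear mixed models on its own stimuli. Here data are
# generated under the null for the test-by-group interaction (same gain
# in every training group), so BF01 should favour the simpler model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
rows = []
for group in ("short", "medium", "long"):
    for test in ("pre", "post"):
        gain = 0.15 if test == "post" else 0.0   # identical gain across groups
        acc = rng.normal(0.55 + gain, 0.10, 40)  # 40 simulated listeners/cell
        rows += [{"group": group, "test": test, "accuracy": a} for a in acc]
df = pd.DataFrame(rows)

null_fit = smf.ols("accuracy ~ group + test", data=df).fit()
alt_fit = smf.ols("accuracy ~ group * test", data=df).fit()

# BF01 > 1 favours the model *without* the test-by-group interaction.
bf01 = np.exp((alt_fit.bic - null_fit.bic) / 2)
print(f"BF01 for dropping the interaction: {bf01:.2f}")
```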
Affiliation(s)
- Julia R Drouin
- Division of Speech and Hearing Sciences, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, USA
- Stephany Flores
- Department of Communication Sciences and Disorders, California State University Fullerton, Fullerton, California 92831, USA
2. Drown L, Philip B, Francis AL, Theodore RM. Revisiting the left ear advantage for phonetic cues to talker identification. The Journal of the Acoustical Society of America 2022; 152:3107. [PMID: 36456295] [PMCID: PMC9715276] [DOI: 10.1121/10.0015093]
Abstract
Previous research suggests that learning to use a phonetic property [e.g., voice onset time (VOT)] for talker identity supports a left ear processing advantage. Specifically, listeners trained to identify two "talkers" who differed only in characteristic VOTs showed faster talker identification for stimuli presented to the left ear than for stimuli presented to the right ear, which was interpreted as evidence of hemispheric lateralization consistent with task demands. Experiment 1 (n = 97) aimed to replicate this finding and identify predictors of performance; experiment 2 (n = 79) aimed to replicate this finding under conditions that better facilitate observation of laterality effects. Listeners completed a talker identification task during pretest, training, and posttest phases. Inhibition, category identification, and auditory acuity were also assessed in experiment 1. Listeners learned to use VOT for talker identity, and learning was positively associated with auditory acuity. Talker identification was not influenced by ear of presentation, and Bayes factors indicated strong support for the null. These results suggest that talker-specific phonetic variation is not sufficient to induce a left ear advantage for talker identification; together with the extant literature, this instead suggests that hemispheric lateralization for talker-specific phonetic variation requires phonetic variation to be conditioned on talker differences in source characteristics.
Affiliation(s)
- Lee Drown
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
- Betsy Philip
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
- Alexander L Francis
- Department of Speech, Language, and Hearing Sciences, Purdue University, West Lafayette, Indiana 47907-2122, USA
- Rachel M Theodore
- Department of Speech, Language, and Hearing Sciences, University of Connecticut, Storrs, Connecticut 06269-1085, USA
3. Quintana DS. Towards better hypothesis tests in oxytocin research: Evaluating the validity of auxiliary assumptions. Psychoneuroendocrinology 2022; 137:105642. [PMID: 34991063] [DOI: 10.1016/j.psyneuen.2021.105642]
Abstract
Various factors have been attributed to the inconsistent reproducibility of human oxytocin research in the cognitive and behavioral sciences. These factors include small sample sizes, a lack of pre-registered studies, and the absence of overarching theoretical frameworks that can account for oxytocin's effects over a broad range of contexts. While there have been efforts to remedy these issues, there has been very little systematic scrutiny of the role of auxiliary assumptions, which are claims that are not themselves the hypothesis under test but are nonetheless critical for testing theories. For instance, the hypothesis that oxytocin increases the salience of social cues is predicated on the assumption that intranasally administered oxytocin increases oxytocin levels in the brain. Without robust auxiliary assumptions, it is unclear whether a hypothesis testing failure is due to an incorrect hypothesis or poorly supported auxiliary assumptions. Consequently, poorly supported auxiliary assumptions can be blamed for hypothesis failure, thereby safeguarding theories from falsification. In this article, I will evaluate the body of evidence for key auxiliary assumptions in human behavioral oxytocin research in terms of theory, experimental design, and statistical inference, and highlight assumptions that require stronger evidence. Strong auxiliary assumptions will leave hypotheses vulnerable to falsification, which will improve hypothesis testing and consequently advance our understanding of oxytocin's role in cognition and behavior.
Affiliation(s)
- Daniel S Quintana
- Department of Psychology, University of Oslo, Oslo, Norway; NevSom, Department of Rare Disorders, Oslo University Hospital, Oslo, Norway; Norwegian Centre for Mental Disorders Research (NORMENT), University of Oslo, Oslo, Norway; KG Jebsen Centre for Neurodevelopmental Disorders, University of Oslo, Oslo, Norway.
4. Beyond psychology: prevalence of p value and confidence interval misinterpretation across different fields. Journal of Pacific Rim Psychology 2021. [DOI: 10.1017/prp.2019.28]
Abstract
P values and confidence intervals (CIs) are the most widely used statistical indices in the scientific literature. Several surveys have revealed that these two indices are generally misunderstood. However, existing surveys on this subject come from psychology and biomedical research, and data from other disciplines are rare. Moreover, the confidence with which researchers make these judgments remains unclear. To fill this research gap, we surveyed 1,479 researchers and students from different fields in China. Results reveal that for significant (i.e., p < .05, CI does not include zero) and non-significant (i.e., p > .05, CI includes zero) conditions, most respondents, regardless of academic degree, research field, or career stage, could not interpret p values and CIs accurately. Moreover, the majority were confident about their (inaccurate) judgments (see osf.io/mcu9q/ for raw data, materials, and supplementary analyses). Therefore, as misinterpretations of p values and CIs prevail in the whole scientific community, there is a need for better statistical training in science.
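The survey's "significant" and "non-significant" conditions rest on a standard duality between p-values and CIs. A minimal sketch with simulated data (not the survey's materials) shows that, for a two-sided one-sample t-test, p < .05 holds exactly when the 95% CI for the mean excludes zero:

```python
# Minimal demonstration: the p < .05 criterion and the "95% CI excludes
# zero" criterion always agree for this test at matched levels.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
sample = rng.normal(loc=0.4, scale=1.0, size=30)  # invented data

t_stat, p_value = stats.ttest_1samp(sample, popmean=0.0)
ci_low, ci_high = stats.t.interval(
    0.95, df=len(sample) - 1, loc=sample.mean(), scale=stats.sem(sample)
)

print(f"p = {p_value:.4f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
print("p < .05:", p_value < 0.05, "| CI excludes zero:", not ci_low <= 0 <= ci_high)
```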
5. Carpenter TP, Law KC. Optimizing the scientific study of suicide with open and transparent research practices. Suicide Life Threat Behav 2021; 51:36-46. [PMID: 33624871] [DOI: 10.1111/sltb.12665]
Abstract
Suicide research is vitally important, yet, like psychology research more broadly, it faces methodological challenges. In recent years, researchers have raised concerns about standard practices in psychological research, concerns that apply to suicide research and raise questions about its robustness and validity. In the present paper, we review these concerns and the corresponding solutions put forth by the "open science" community. These include using open science platforms, pre-registering studies, ensuring reproducible analyses, using high-powered studies, ensuring open access to research materials and products, and conducting replication studies. We build upon existing guides, address specific obstacles faced by suicide researchers, and offer a clear set of recommended practices for suicide researchers. In particular, we consider challenges that suicide researchers may face in seeking to adopt "open science" practices (e.g., prioritizing large samples) and suggest possible strategies that the field may use in order to ensure robust and transparent research, despite these challenges.
Affiliation(s)
- Keyne C Law
- Seattle Pacific University, Seattle, Washington, USA
6. Świątkowski W, Carrier A. There is Nothing Magical about Bayesian Statistics: An Introduction to Epistemic Probabilities in Data Analysis for Psychology Starters. Basic and Applied Social Psychology 2020. [DOI: 10.1080/01973533.2020.1792297]
7.
Abstract
In (educational) psychology, replication studies have so far been extremely rare exceptions. This article sets out that, and why, replication studies are indispensable. It then asks why, despite their enormous added value, almost no replications are published, and why many "findings" of psychological research are not replicable. That these statements are not mere conjecture is documented by the available studies. The causes lie at several, partly interdependent, levels of the scientific system: the widespread but mistaken view that "statistical significance" also indicates the probability of being able to replicate a finding; the confusion of "statistically significant" with relevant; the bad habit of formulating the tested hypotheses only after the fact (ex post), that is, with knowledge of a study's results, while presenting them in the publication as a theoretically derived starting point (i.e., formulated a priori); alpha-error inflation through multiple statistical significance tests; the exclusive reporting of results that support the research hypotheses, combined with the suppression of deviating findings; insufficient construct validity of the measurement instruments used; lying and fraud in science; and the low esteem in which journal editors, reviewers, and funding agencies hold replications. All of this means that almost exclusively "statistically significant" and "new" results are published and that false theories persist. Countermeasures include, for example: generous financial support for replication projects and their publication; strong endorsement by reviewers of publishing methodologically adequate replication studies; the willingness of journals to provide enough space for them; and recognition of the great scientific value of replication studies, including in appointment procedures. It follows that the possibilities and demands outlined here for establishing and promoting replication studies must address several audiences in parallel. Lasting change, however, can only be achieved if the individual actors (researchers, reviewers, journal editors, hiring committees, funding agencies) acknowledge their individual responsibility and act on it.
Affiliation(s)
- Detlef H. Rost
- Southwest University Chongqing, Faculty of Psychology, Chongqing, P. R. China
- Philipps-Universität Marburg, Department of Psychology, Marburg, Germany
- Marc Bienefeld
- Universität Bielefeld, Faculty of Educational Science, Bielefeld, Germany
8. Griffiths P, Needleman J. Statistical significance testing and p-values: Defending the indefensible? A discussion paper and position statement. Int J Nurs Stud 2019; 99:103384. [PMID: 31442781] [DOI: 10.1016/j.ijnurstu.2019.07.001]
Abstract
Much statistical teaching and many research reports focus on the 'null hypothesis significance test'. Yet the correct meaning and interpretation of statistical significance tests is elusive. Misinterpretations are both common and persistent, leading many to question whether significance tests should be used at all. While most take aim at the arbitrary declaration of p < 0.05 as a threshold for determining 'significance', others extend the critique to suggest the 'p-value' should be dispensed with entirely. P-values and significance tests are still widely used as if they give a measure of the size and importance of relationships, even though this misunderstanding has been observed and discussed for many years. We argue that p-values and significance tests are intrinsically misleading. Point estimates of relationships and confidence intervals give direct information about the effect and the uncertainty of the estimate without recourse to interpreting how a particular p-value might have arisen or indeed referring to them at all. In this paper we briefly outline some of the problems with significance testing, offer a number of examples selected from a recent issue of the International Journal of Nursing Studies and discuss some proposed responses to these problems. We conclude by offering some guidance to authors reporting statistical tests in journals and present a position statement that has been adopted by the International Journal of Nursing Studies to guide its authors in reporting the results of statistical analyses. While stopping short of calling for an outright ban on reporting p-values and significance tests, we urge authors (and journals) to place more emphasis on measures of effect and estimates of precision/uncertainty and, following the position of the American Statistical Association, emphasise that authors (and readers) should avoid using 0.05 or any other cut-off for a p-value as the basis for a decision about the meaningfulness/importance of an effect. If point estimates and confidence intervals are used, then the p-value may be redundant and can be omitted from reports. When authors talk about 'significance' they need to be explicit when referring to statistical significance and we recommend authors adopt the language of 'importance' when talking about effect sizes to avoid any confusion.
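As a concrete illustration of the recommended reporting style, the invented example below (not taken from the journal) reports a point estimate and a 95% CI for a group difference without attaching a significance verdict; the outcome variable and all numbers are hypothetical, and the pooled degrees of freedom are a simplification.

```python
# Invented example: report the effect and its precision, not a verdict.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
control = rng.normal(52.0, 8.0, size=60)  # e.g., minutes of care per shift
treated = rng.normal(55.5, 8.0, size=60)  # (hypothetical nursing outcome)

diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / len(treated)
             + control.var(ddof=1) / len(control))
df = len(treated) + len(control) - 2      # pooled df; Welch df would be finer
t_crit = stats.t.ppf(0.975, df)

print(f"difference = {diff:.2f}, "
      f"95% CI [{diff - t_crit * se:.2f}, {diff + t_crit * se:.2f}]")
```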
Affiliation(s)
- Peter Griffiths
- University of Southampton, UK; Executive Editor, International Journal of Nursing Studies
- Jack Needleman
- Department of Health Policy and Management, University of California, Los Angeles School of Public Health, Los Angeles, USA
9. Badenes-Ribera L, Frias-Navarro D, Iotti NO, Bonilla-Campos A, Longobardi C. Perceived Statistical Knowledge Level and Self-Reported Statistical Practice Among Academic Psychologists. Front Psychol 2018; 9:996. [PMID: 29988476] [PMCID: PMC6024681] [DOI: 10.3389/fpsyg.2018.00996]
Abstract
Introduction: Publications arguing against the null hypothesis significance testing (NHST) procedure and in favor of good statistical practices have increased. The most frequently mentioned alternatives to NHST are effect size statistics (ES), confidence intervals (CIs), and meta-analyses. A recent survey conducted in Spain found that academic psychologists have poor knowledge of effect size statistics, confidence intervals, and graphic displays for meta-analyses, which might lead to misinterpretation of results. It also found that, although the use of ES is becoming generalized, the same is not true for CIs. Finally, academics with greater knowledge of ES statistics presented a profile closer to good statistical practice and research design. Our main purpose was to analyze whether these results extend to a different geographical area through a replication study. Methods: For this purpose, we elaborated an online survey that included the same items as the original research, and we asked academic psychologists to indicate their level of knowledge of ES, their CIs, and meta-analyses, and how they use them. The sample consisted of 159 Italian academic psychologists (54.09% women, mean age 47.65 years). The mean number of years in the position of professor was 12.90 (SD = 10.21). Results: As in the original research, the results showed that, although the use of effect size estimates is becoming generalized, an under-reporting of CIs for ES persists. The most frequently mentioned ES statistics were Cohen's d and R²/η², which are not robust when the data contain outliers, are non-normal, or otherwise violate statistical assumptions. In addition, academics showed poor knowledge of meta-analytic displays (e.g., forest plots and funnel plots) and quality checklists for studies. Finally, academics with higher-level knowledge of ES statistics seem to have a profile closer to good statistical practices. Conclusions: Changing statistical practice is not easy. This change requires statistical training programs for academics, at both the graduate and undergraduate levels.
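Because the under-reporting concerns CIs for effect sizes, the sketch below shows one way to pair Cohen's d with a bootstrap 95% CI. The data are simulated and the helper function is hypothetical; nothing here comes from the survey itself.

```python
# Minimal sketch: Cohen's d with a percentile-bootstrap 95% CI.
import numpy as np

def cohens_d(a, b):
    """Standardized mean difference with a pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * a.var(ddof=1)
                  + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(42)
group_a = rng.normal(100, 15, size=50)  # invented scores
group_b = rng.normal(92, 15, size=50)

point = cohens_d(group_a, group_b)
boots = np.array([
    cohens_d(rng.choice(group_a, len(group_a), replace=True),
             rng.choice(group_b, len(group_b), replace=True))
    for _ in range(5000)
])
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"d = {point:.2f}, 95% bootstrap CI [{lo:.2f}, {hi:.2f}]")
```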
Affiliation(s)
- Laura Badenes-Ribera
- Departament de Metodologia de les Ciències del Comportament, Universitat de València, Valencia, Spain
- Dolores Frias-Navarro
- Departament de Metodologia de les Ciències del Comportament, Universitat de València, Valencia, Spain
- Nathalie O Iotti
- Dipartimento di Psicologia, Università degli Studi di Torino, Turin, Italy
- Amparo Bonilla-Campos
- Departament de Metodologia de les Ciències del Comportament, Universitat de València, Valencia, Spain
- Claudio Longobardi
- Dipartimento di Psicologia, Università degli Studi di Torino, Turin, Italy
10. Gigerenzer G. Statistical Rituals: The Replication Delusion and How We Got There. Advances in Methods and Practices in Psychological Science 2018. [DOI: 10.1177/2515245918771329]
Abstract
The “replication crisis” has been attributed to misguided external incentives gamed by researchers (the strategic-game hypothesis). Here, I want to draw attention to a complementary internal factor, namely, researchers’ widespread faith in a statistical ritual and associated delusions (the statistical-ritual hypothesis). The “null ritual,” unknown in statistics proper, eliminates judgment precisely at points where statistical theories demand it. The crucial delusion is that the p value specifies the probability of a successful replication (i.e., 1 – p), which makes replication studies appear to be superfluous. A review of studies with 839 academic psychologists and 991 students shows that the replication delusion existed among 20% of the faculty teaching statistics in psychology, 39% of the professors and lecturers, and 66% of the students. Two further beliefs, the illusion of certainty (e.g., that statistical significance proves that an effect exists) and Bayesian wishful thinking (e.g., that the probability of the alternative hypothesis being true is 1 – p), also make successful replication appear to be certain or almost certain, respectively. In every study reviewed, the majority of researchers (56%–97%) exhibited one or more of these delusions. Psychology departments need to begin teaching statistical thinking, not rituals, and journal editors should no longer accept manuscripts that report results as “significant” or “not significant.”
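The replication delusion is easy to probe by simulation. The sketch below (an illustration constructed for this listing, not Gigerenzer's analysis) conditions on "original" studies with p near .04 and shows that exact replications come out significant at roughly the rate set by statistical power, far below the ~96% the 1 - p intuition predicts; effect size and sample size are invented.

```python
# Simulate a true effect (d = 0.4, n = 30, one-sample design), keep
# "original" studies with p in [.03, .05), then run exact replications.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
effect, n, sims = 0.4, 30, 20_000

def one_study():
    x = rng.normal(effect, 1.0, n)
    return stats.ttest_1samp(x, 0.0).pvalue

originals = [one_study() for _ in range(sims)]
selected = [p for p in originals if 0.03 <= p < 0.05]
replications = [one_study() for _ in selected]
rep_rate = np.mean([p < 0.05 for p in replications])

print("the 1 - p intuition predicts ~96% replication for an original p of ~.04")
print(f"actual significant-replication rate: {rep_rate:.0%}")  # ~ the power
```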
Affiliation(s)
- Gerd Gigerenzer
- Harding Center for Risk Literacy, Max Planck Institute for Human Development, Berlin, Germany
11. Lyu Z, Peng K, Hu CP. P-Value, Confidence Intervals, and Statistical Inference: A New Dataset of Misinterpretation. Front Psychol 2018; 9:868. [PMID: 29937743] [PMCID: PMC6002511] [DOI: 10.3389/fpsyg.2018.00868]
Affiliation(s)
- Ziyang Lyu
- Department of Psychology, School of Social Science, Tsinghua University, Beijing, China
- Kaiping Peng
- Department of Psychology, School of Social Science, Tsinghua University, Beijing, China
- Chuan-Peng Hu
- Neuroimaging Center (NIC), Focus Program Translational Neuroscience (FTN), Johannes Gutenberg University, Mainz, Germany
- Deutsches Resilienz Zentrum (DRZ), University Medical Center of the Johannes Gutenberg University, Mainz, Germany
12. Badenes-Ribera L, Frias-Navarro D. Falacias sobre el valor p compartidas por profesores y estudiantes universitarios [Fallacies about the p value shared by professors and university students]. Universitas Psychologica 2017. [DOI: 10.11144/javeriana.upsy16-3.fvcp]
Abstract
The "Evidence Based Practice" requires professionals to critically assess the results of psychological research. However, incorrect interpretations of p values of probability are abundant and repetitive. These misconceptions affect professional decisions and compromise the quality of interventions and the accumulation of a valid scientific knowledge. Identifying the types of fallacies that underlying statistical decisions is fundamental for approaching and planning statistical education strategies designed to intervene in incorrect interpretations. Therefore, the aim of this study is to analyze the interpretation of p value among college students of psychology and academic psychologist. The sample was composed of 161 participants (43 academic and 118 students). The mean number of years as academic was 16.7 (SD = 10.07). The mean age of college students was 21.59 years (SD = 1.3). The findings suggest that college students and academic do not know the correct interpretation of p values. The fallacy of the inverse probability presents major problems of comprehension. In addition, statistical significance and practical significance or clinical are confused. There is a need for statistical education and statistical re-education.
13. Amrhein V, Korner-Nievergelt F, Roth T. The earth is flat (p > 0.05): significance thresholds and the crisis of unreplicable research. PeerJ 2017; 5:e3544. [PMID: 28698825] [PMCID: PMC5502092] [DOI: 10.7717/peerj.3544]
Abstract
The widespread use of 'statistical significance' as a license for making a claim of a scientific finding leads to considerable distortion of the scientific process (according to the American Statistical Association). We review why degrading p-values into 'significant' and 'nonsignificant' contributes to making studies irreproducible, or to making them seem irreproducible. A major problem is that we tend to take small p-values at face value, but mistrust results with larger p-values. In either case, p-values tell little about reliability of research, because they are hardly replicable even if an alternative hypothesis is true. Significance (p ≤ 0.05) is itself hardly replicable: at a good statistical power of 80%, two studies will be 'conflicting', meaning that one is significant and the other is not, in one third of the cases if there is a true effect. A replication can therefore not be interpreted as having failed only because it is nonsignificant. Many apparent replication failures may thus reflect faulty judgment based on significance thresholds rather than a crisis of unreplicable research. Reliable conclusions on replicability and practical importance of a finding can only be drawn using cumulative evidence from multiple independent studies. However, applying significance thresholds makes cumulative knowledge unreliable. One reason is that with anything but ideal statistical power, significant effect sizes will be biased upwards. Interpreting inflated significant results while ignoring nonsignificant results will thus lead to wrong conclusions. But current incentives to hunt for significance lead to selective reporting and to publication bias against nonsignificant findings. Data dredging, p-hacking, and publication bias should be addressed by removing fixed significance thresholds. Consistent with the recommendations of the late Ronald Fisher, p-values should be interpreted as graded measures of the strength of evidence against the null hypothesis. Larger p-values also offer some evidence against the null hypothesis, and they cannot be interpreted as supporting it; concluding that 'there is no effect' from a nonsignificant result is a false conclusion. Information on possible true effect sizes that are compatible with the data must be obtained from the point estimate, e.g., from a sample average, and from the interval estimate, such as a confidence interval. We review how confusion about interpretation of larger p-values can be traced back to historical disputes among the founders of modern statistics. We further discuss potential arguments against removing significance thresholds, for example that decision rules should rather be more stringent, that sample sizes could decrease, or that p-values should better be completely abandoned. We conclude that whatever method of statistical inference we use, dichotomous threshold thinking must give way to non-automated informed judgment.
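The "one third conflicting" figure follows from 2 × 0.8 × 0.2 = 0.32, and a short simulation reproduces it. The sketch below is a hedged check of that arithmetic, not the authors' code; the per-group sample size is an assumption taken from standard power tables (n = 64 per group gives roughly 80% power for d = 0.5 in a two-sample t-test).

```python
# Simulate pairs of independent studies of the same true effect at ~80%
# power and count how often exactly one study is significant.
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)
d, alpha, sims = 0.5, 0.05, 20_000
n = 64  # per group; ~80% power for d = 0.5 (assumed from power tables)

def significant():
    a = rng.normal(0.0, 1.0, n)
    b = rng.normal(d, 1.0, n)
    return stats.ttest_ind(a, b).pvalue < alpha

conflicting = np.mean([significant() != significant() for _ in range(sims)])
print(f"proportion of 'conflicting' study pairs: {conflicting:.2f}")  # ~0.32
```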
Affiliation(s)
- Valentin Amrhein
- Zoological Institute, University of Basel, Basel, Switzerland
- Research Station Petite Camargue Alsacienne, Saint-Louis, France
- Swiss Ornithological Institute, Sempach, Switzerland
- Tobias Roth
- Zoological Institute, University of Basel, Basel, Switzerland
- Research Station Petite Camargue Alsacienne, Saint-Louis, France