1. Lippert S, Dreber A, Johannesson M, Tierney W, Cyrus-Lai W, Uhlmann EL, Pfeiffer T. Can large language models help predict results from a complex behavioural science study? Royal Society Open Science 2024; 11:240682. PMID: 39323554; PMCID: PMC11421891; DOI: 10.1098/rsos.240682.
Abstract
We tested whether large language models (LLMs) can help predict results from a complex behavioural science experiment. In study 1, we investigated the performance of the widely used LLMs GPT-3.5 and GPT-4 in forecasting the empirical findings of a large-scale experimental study of emotions, gender, and social perceptions. We found that GPT-4, but not GPT-3.5, matched the performance of a cohort of 119 human experts, with correlations of 0.89 (GPT-4), 0.07 (GPT-3.5) and 0.87 (human experts) between aggregated forecasts and realized effect sizes. In study 2, giving participants from a university subject pool the opportunity to query a GPT-4 powered chatbot significantly increased the accuracy of their forecasts. Results indicate promise for artificial intelligence (AI) to help anticipate, at scale and at minimal cost, which claims about human behaviour will find empirical support and which will not. Our discussion focuses on avenues for human-AI collaboration in science.
Affiliation(s)
- Steffen Lippert
- Department of Economics, University of Auckland, Auckland, New Zealand
- Anna Dreber
- Department of Economics, Stockholm School of Economics, Stockholm, Sweden
- Department of Economics, University of Innsbruck, Innsbruck, Austria
- Magnus Johannesson
- Department of Economics, Stockholm School of Economics, Stockholm, Sweden
- Warren Tierney
- Organisational Behaviour Area/Marketing Area, INSEAD, Singapore
- Thomas Pfeiffer
- New Zealand Institute for Advanced Study, Massey University, Auckland, New Zealand
2. Mougenot D, Matheson H. Theoretical strategies for an embodied cognitive neuroscience: Mechanistic explanations of brain-body-environment systems. Cogn Neurosci 2024:1-13. PMID: 38736314; DOI: 10.1080/17588928.2024.2349546.
Abstract
Cognitive neuroscience seeks to explain mind, brain, and behavior. But how do we generate explanations? In this integrative theoretical paper, we review the commitments of the 'New Mechanist' movement within the philosophy of science, focusing specifically on the role of mechanistic models in scientific explanation. We highlight how this approach differs from other explanatory approaches within the field, showing its unique contributions to scientific explanation. We then argue that the commitments of the Embodied Cognition framework converge with those of the New Mechanist movement in a way that provides a necessary explanatory strategy for cognitive neuroscience. We go on to discuss a number of consequences of this convergence, including issues related to the inadequacy of statistical prediction, neuroscientific reduction, the autonomy of psychology from neuroscience, and psychological and neuroscientific ontology. We hope that our integrative thesis provides researchers with a theoretical strategy for an embodied cognitive neuroscience.
Affiliation(s)
- Davy Mougenot
- Department of Psychology, Memorial University of Newfoundland, St. John's, Canada
- Heath Matheson
- Department of Psychology, Memorial University of Newfoundland, St. John's, Canada
3. Grossmann I, Varnum MEW, Hutcherson CA, Mandel DR. When expert predictions fail. Trends Cogn Sci 2024; 28:113-123. PMID: 37949791; DOI: 10.1016/j.tics.2023.10.005.
Abstract
We examine the opportunities and challenges of expert judgment in the social sciences, scrutinizing the way social scientists make predictions. While social scientists show above-chance accuracy in predicting laboratory-based phenomena, they often struggle to predict real-world societal changes. We argue that most causal models used in social sciences are oversimplified, confuse levels of analysis to which a model applies, misalign the nature of the model with the nature of the phenomena, and fail to consider factors beyond the scientist's pet theory. Taking cues from physical sciences and meteorology, we advocate an approach that integrates broad foundational models with context-specific time series data. We call for a shift in the social sciences towards more precise, daring predictions and greater intellectual humility.
Affiliation(s)
- Igor Grossmann
- Department of Psychology, University of Waterloo, Waterloo, N2L 3G1, ON, Canada.
- Michael E W Varnum
- Department of Psychology, Arizona State University, Tempe, AZ 85287, USA
- Cendri A Hutcherson
- Department of Psychology, University of Toronto Scarborough, Toronto, M1C 1A4, ON, Canada
- David R Mandel
- Defence Research and Development Canada, Toronto, M3K 2C9, ON, Canada
4. Sarafoglou A, Aust F, Marsman M, Bartoš F, Wagenmakers EJ, Haaf JM. Multibridge: an R package to evaluate informed hypotheses in binomial and multinomial models. Behav Res Methods 2023; 55:4343-4368. PMID: 37277644; PMCID: PMC10700431; DOI: 10.3758/s13428-022-02020-1.
Abstract
The multibridge R package allows a Bayesian evaluation of informed hypotheses applied to frequency data from an independent binomial or multinomial distribution. multibridge uses bridge sampling to efficiently compute Bayes factors for the following hypotheses concerning the latent category proportions 𝜃: (a) hypotheses that postulate equality constraints (e.g., 𝜃1 = 𝜃2 = 𝜃3); (b) hypotheses that postulate inequality constraints (e.g., 𝜃1 < 𝜃2 < 𝜃3 or 𝜃1 > 𝜃2 > 𝜃3); (c) hypotheses that postulate combinations of inequality constraints and equality constraints (e.g., 𝜃1 < 𝜃2 = 𝜃3); and (d) hypotheses that postulate combinations of (a)-(c) (e.g., 𝜃1 < (𝜃2 = 𝜃3), 𝜃4). Any informed hypothesis may be compared against the encompassing hypothesis that all category proportions vary freely, or against the null hypothesis that all category proportions are equal. multibridge facilitates the fast and accurate comparison of large models with many constraints and of models for which relatively little posterior mass falls in the restricted parameter space. This paper describes the underlying methodology and illustrates the use of multibridge through fully reproducible examples.
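As an editorial illustration of the kind of Bayes factor described above, the encompassing-prior idea for an inequality constraint can be sketched with a brute-force Monte Carlo approximation in Python. The counts here are hypothetical, and this is not the multibridge implementation, which uses bridge sampling precisely because naive sampling becomes inaccurate when little posterior mass falls in the restricted space:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical multinomial counts for three categories.
counts = np.array([10, 20, 30])
prior_alpha = np.ones(3)  # uniform Dirichlet prior over the proportions

def prop_ordered(samples):
    # Fraction of draws satisfying the constraint theta1 < theta2 < theta3.
    return np.mean((samples[:, 0] < samples[:, 1]) & (samples[:, 1] < samples[:, 2]))

n = 200_000
prior_draws = rng.dirichlet(prior_alpha, n)           # draws from the prior
post_draws = rng.dirichlet(prior_alpha + counts, n)   # draws from the conjugate posterior

# Encompassing-prior Bayes factor for the restricted vs. encompassing hypothesis:
# BF = P(constraint holds | posterior) / P(constraint holds | prior).
bf_re = prop_ordered(post_draws) / prop_ordered(prior_draws)
print(round(bf_re, 2))
```

With a symmetric prior, the prior probability of one specific ordering of three proportions is 1/6, so a Bayes factor above 1 indicates the data support the ordering.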
Affiliation(s)
- Alexandra Sarafoglou
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands.
- Frederik Aust
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
- Maarten Marsman
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
- František Bartoš
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
- Eric-Jan Wagenmakers
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
- Julia M Haaf
- Department of Psychology, University of Amsterdam, PO Box 15906, 1001 NK Amsterdam, The Netherlands
5. Korbmacher M, Azevedo F, Pennington CR, Hartmann H, Pownall M, Schmidt K, Elsherif M, Breznau N, Robertson O, Kalandadze T, Yu S, Baker BJ, O'Mahony A, Olsnes JØS, Shaw JJ, Gjoneska B, Yamada Y, Röer JP, Murphy J, Alzahawi S, Grinschgl S, Oliveira CM, Wingen T, Yeung SK, Liu M, König LM, Albayrak-Aydemir N, Lecuona O, Micheli L, Evans T. The replication crisis has led to positive structural, procedural, and community changes. Communications Psychology 2023; 1:3. PMID: 39242883; PMCID: PMC11290608; DOI: 10.1038/s44271-023-00003-2.
Abstract
The emergence of large-scale replication projects yielding success rates substantially lower than expected caused the behavioural, cognitive, and social sciences to experience a so-called 'replication crisis'. In this Perspective, we first reframe this 'crisis' through the lens of a credibility revolution, focusing on positive structural, procedural, and community-driven changes. Second, we outline a path to expand ongoing advances and improvements. The credibility revolution has been an impetus for several substantive changes that will have a positive, long-term impact on our research environment.
Affiliation(s)
- Max Korbmacher
- Department of Health and Functioning, Western Norway University of Applied Sciences, Bergen, Norway
- NORMENT Centre for Psychosis Research, University of Oslo and Oslo University Hospital, Oslo, Norway
- Mohn Medical Imaging and Visualisation Center, Bergen, Norway
- Flavio Azevedo
- Department of Psychology, University of Cambridge, Cambridge, UK
- Department of Social Psychology, University of Groningen, Groningen, The Netherlands
- Helena Hartmann
- Department of Neurology, University of Essen, Essen, Germany
- Nate Breznau
- SOCIUM Research Center on Inequality and Social Policy, University of Bremen, Bremen, Germany
- Olly Robertson
- Department of Psychiatry, University of Oxford, Oxford, UK
- Tamara Kalandadze
- Department of Education, ICT and Learning, Ostfold University College, Halden, Norway
- Shijun Yu
- School of Psychology, University of Birmingham, Birmingham, UK
- Bradley J Baker
- Department of Sport and Recreation Management, Temple University, Philadelphia, USA
- Jørgen Ø-S Olsnes
- Kavli Institute for Systems Neuroscience, Norwegian University of Science and Technology, Trondheim, Norway
- John J Shaw
- Division of Psychology, De Montfort University, Leicester, UK
- Biljana Gjoneska
- Macedonian Academy of Sciences and Arts, Skopje, North Macedonia
- Yuki Yamada
- Faculty of Arts and Science, Kyushu University, Fukuoka, Japan
- Jan P Röer
- Department of Psychology and Psychotherapy, Witten/Herdecke University, Witten, Germany
- Jennifer Murphy
- Department of Applied Science, Technological University Dublin, Dublin, Ireland
- Shilaan Alzahawi
- Graduate School of Business, Stanford University, Stanford, USA
- Tobias Wingen
- Institute of General Practice and Family Medicine, University of Bonn, Bonn, Germany
- Siu Kit Yeung
- Department of Psychology, Chinese University of Hong Kong, Hong Kong, China
- Meng Liu
- Faculty of Education, University of Cambridge, Cambridge, UK
- Laura M König
- Faculty of Life Sciences: Food, Nutrition and Health, University of Bayreuth, Bayreuth, Germany
- Nihan Albayrak-Aydemir
- Open Psychology Research Centre, Open University, Milton Keynes, UK
- Department of Psychological and Behavioural Science, London School of Economics and Political Science, London, UK
- Oscar Lecuona
- Department of Psychology, Universidad Rey Juan Carlos, Madrid, Spain
- Faculty of Psychology, Universidad Autónoma de Madrid, Madrid, Spain
- Leticia Micheli
- Institute of Psychology, Leiden University, Leiden, The Netherlands
- Thomas Evans
- School of Human Sciences, University of Greenwich, Greenwich, UK
- Institute for Lifecourse Development, University of Greenwich, Greenwich, UK
6. Wintle BC, Smith ET, Bush M, Mody F, Wilkinson DP, Hanea AM, Marcoci A, Fraser H, Hemming V, Thorn FS, McBride MF, Gould E, Head A, Hamilton DG, Kambouris S, Rumpff L, Hoekstra R, Burgman MA, Fidler F. Predicting and reasoning about replicability using structured groups. Royal Society Open Science 2023; 10:221553. PMID: 37293358; PMCID: PMC10245209; DOI: 10.1098/rsos.221553.
Abstract
This paper explores judgements about the replicability of social and behavioural sciences research and what drives those judgements. Using a mixed methods approach, it draws on qualitative and quantitative data elicited from groups using a structured approach called the IDEA protocol ('investigate', 'discuss', 'estimate' and 'aggregate'). Five groups of five people with relevant domain expertise evaluated 25 research claims that were subject to at least one replication study. Participants assessed the probability that each of the 25 research claims would replicate (i.e. that a replication study would find a statistically significant result in the same direction as the original study) and described the reasoning behind those judgements. We quantitatively analysed possible correlates of predictive accuracy, including self-rated expertise and updating of judgements after feedback and discussion. We qualitatively analysed the reasoning data to explore the cues, heuristics and patterns of reasoning used by participants. Participants achieved 84% classification accuracy in predicting replicability. Those who engaged in a greater breadth of reasoning provided more accurate replicability judgements. Some reasons were more commonly invoked by more accurate participants, such as 'effect size' and 'reputation' (e.g. of the field of research). There was also some evidence of a relationship between statistical literacy and accuracy.
Affiliation(s)
- Bonnie C. Wintle
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Eden T. Smith
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
- Martin Bush
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
- Fallon Mody
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
- David P. Wilkinson
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Anca M. Hanea
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Centre of Excellence for Biosecurity Risk Analysis, School of BioSciences, University of Melbourne, Parkville 3010, Australia
- Alexandru Marcoci
- Centre for the Study of Existential Risk, University of Cambridge, Cambridge, UK
- Hannah Fraser
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Victoria Hemming
- Martin Conservation Decisions Lab, Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada
- Felix Singleton Thorn
- School of Psychological Sciences, University of Melbourne, Parkville 3010, Australia
- Marissa F. McBride
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Centre for Environmental Policy, Imperial College London, London, UK
- Elliot Gould
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Andrew Head
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
- Daniel G. Hamilton
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
- Steven Kambouris
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Libby Rumpff
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- Rink Hoekstra
- Department of Pedagogical and Educational Sciences, University of Groningen, Groningen, The Netherlands
- Mark A. Burgman
- Centre for Environmental Policy, Imperial College London, London, UK
- Fiona Fidler
- MetaMelb Research Initiative, School of Ecosystem and Forest Sciences, University of Melbourne, Parkville 3010, Australia
- MetaMelb Research Initiative, School of Historical and Philosophical Studies, University of Melbourne, Parkville 3010, Australia
7. Fraser H, Bush M, Wintle BC, Mody F, Smith ET, Hanea AM, Gould E, Hemming V, Hamilton DG, Rumpff L, Wilkinson DP, Pearson R, Singleton Thorn F, Ashton R, Willcox A, Gray CT, Head A, Ross M, Groenewegen R, Marcoci A, Vercammen A, Parker TH, Hoekstra R, Nakagawa S, Mandel DR, van Ravenzwaaij D, McBride M, Sinnott RO, Vesk P, Burgman M, Fidler F. Predicting reliability through structured expert elicitation with the repliCATS (Collaborative Assessments for Trustworthy Science) process. PLoS One 2023; 18:e0274429. PMID: 36701303; PMCID: PMC9879480; DOI: 10.1371/journal.pone.0274429.
Abstract
As replications of individual studies are resource intensive, techniques for predicting replicability are required. We introduce the repliCATS (Collaborative Assessments for Trustworthy Science) process, a new method for eliciting expert predictions about the replicability of research. This process is a structured expert elicitation approach based on a modified Delphi technique applied to the evaluation of research claims in the social and behavioural sciences. The utility of such processes lies in their capacity to assess scientific claims without the cost of full replication. Experimental data support the validity of this process: a validation study produced a classification accuracy of 84% and an area under the curve of 0.94, meeting or exceeding the accuracy of other techniques used to predict replicability. The repliCATS process provides other benefits. It is highly scalable, suitable both for rapid assessment of small numbers of claims and for assessment of high volumes of claims over an extended period through an online elicitation platform, having been used to assess 3000 research claims over an 18-month period. It can be implemented in a range of ways, and we describe one such implementation. An important advantage of the repliCATS process is that it collects qualitative data that can provide insight into the limits of generalizability of scientific claims. The primary limitation of the repliCATS process is its reliance on human-derived predictions, with consequent costs in terms of participant fatigue, although careful design can minimise these costs. The repliCATS process has potential applications in alternative peer review and in the allocation of effort for replication studies.
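The two validation metrics reported above can be made concrete with a short Python sketch; the outcomes and elicited replication probabilities below are hypothetical, chosen only to show how accuracy and the area under the ROC curve are computed from such data:

```python
def auc(labels, scores):
    # Probability that a randomly chosen replicating claim receives a higher
    # score than a randomly chosen non-replicating one (ties count 0.5);
    # this rank-based form equals the area under the ROC curve.
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical replication outcomes (1 = replicated) and elicited probabilities.
outcomes = [1, 1, 0, 1, 0, 0, 1, 0]
forecasts = [0.9, 0.45, 0.6, 0.7, 0.2, 0.3, 0.8, 0.1]

# Classification accuracy with a 0.5 decision threshold.
accuracy = sum((f >= 0.5) == bool(y) for y, f in zip(outcomes, forecasts)) / len(outcomes)
print(accuracy, auc(outcomes, forecasts))  # prints 0.75 0.9375
```

Note that accuracy depends on the 0.5 threshold, whereas AUC summarizes discrimination across all thresholds, which is why validation studies typically report both.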
Affiliation(s)
- Hannah Fraser
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Martin Bush
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Bonnie C. Wintle
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- Fallon Mody
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Eden T. Smith
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Anca M. Hanea
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Centre of Excellence for Biosecurity Risk Analysis, University of Melbourne, Melbourne, Victoria, Australia
- Elliot Gould
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- Centre of Excellence for Biosecurity Risk Analysis, University of Melbourne, Melbourne, Victoria, Australia
- Victoria Hemming
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Martin Conservation Decisions Lab, Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada
- Libby Rumpff
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- David P. Wilkinson
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- Ross Pearson
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Raquel Ashton
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Aaron Willcox
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Charles T. Gray
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, United Kingdom
- Andrew Head
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Melissa Ross
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Rebecca Groenewegen
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- Alexandru Marcoci
- Centre for Argument Technology, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Ans Vercammen
- Centre for Environmental Policy, Imperial College London, London, United Kingdom
- School of Communication and Arts, Faculty of Humanities and Social Sciences, The University of Queensland, Brisbane, Australia
- Timothy H. Parker
- Department of Biology, Whitman College, Walla Walla, Washington, United States of America
- Rink Hoekstra
- Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands
- Shinichi Nakagawa
- School of Biological, Earth and Environmental Sciences, University of New South Wales, Sydney, New South Wales, Australia
- David R. Mandel
- Department of Psychology, York University, Toronto, Ontario, Canada
- Don van Ravenzwaaij
- Faculty of Behavioural and Social Sciences, University of Groningen, Groningen, The Netherlands
- Marissa McBride
- Centre for Environmental Policy, Imperial College London, London, United Kingdom
- Richard O. Sinnott
- Melbourne eResearch Group, University of Melbourne, Melbourne, Victoria, Australia
- Peter Vesk
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
- Quantitative & Applied Ecology Group, University of Melbourne, Melbourne, Victoria, Australia
- Mark Burgman
- Centre for Environmental Policy, Imperial College London, London, United Kingdom
- Fiona Fidler
- MetaMelb Lab, University of Melbourne, Melbourne, Victoria, Australia
8. Bartoš F, Maier M, Wagenmakers EJ, Doucouliagos H, Stanley TD. Robust Bayesian meta-analysis: Model-averaging across complementary publication bias adjustment methods. Res Synth Methods 2023; 14:99-116. PMID: 35869696; PMCID: PMC10087723; DOI: 10.1002/jrsm.1594.
Abstract
Publication bias is a ubiquitous threat to the validity of meta-analysis and the accumulation of scientific evidence. In order to estimate and counteract the impact of publication bias, multiple methods have been developed; however, recent simulation studies have shown the methods' performance to depend on the true data generating process, and no method consistently outperforms the others across a wide range of conditions. Unfortunately, when different methods lead to contradicting conclusions, researchers can choose those methods that lead to a desired outcome. To avoid the condition-dependent, all-or-none choice between competing methods and conflicting results, we extend robust Bayesian meta-analysis and model-average across two prominent approaches of adjusting for publication bias: (1) selection models of p-values and (2) models adjusting for small-study effects. The resulting model ensemble weights the estimates and the evidence for the absence/presence of the effect from the competing approaches with the support they receive from the data. Applications, simulations, and comparisons to preregistered, multi-lab replications demonstrate the benefits of Bayesian model-averaging of complementary publication bias adjustment methods.
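The model-averaging step described above can be sketched in a few lines of Python. The log marginal likelihoods and per-model effect estimates below are assumed stand-ins for fitted selection and small-study-effect models; the actual robust Bayesian meta-analysis machinery estimates these quantities from the data via bridge sampling:

```python
import numpy as np

# Assumed per-model results: log marginal likelihoods and mean effect estimates
# from, e.g., a p-value selection model, a small-study-effect model, and a null model.
log_marglik = np.array([-12.3, -11.1, -13.0])
estimates = np.array([0.20, 0.05, 0.12])

# Posterior model probabilities under equal prior model probabilities,
# computed stably by subtracting the maximum log marginal likelihood.
w = np.exp(log_marglik - log_marglik.max())
post_prob = w / w.sum()

# Model-averaged effect: per-model estimates weighted by posterior model probability.
avg_effect = float(post_prob @ estimates)
print(post_prob.round(3), round(avg_effect, 3))
```

The averaged estimate always lies between the most extreme per-model estimates, with each model's influence proportional to the support it receives from the data, which is exactly the "weighting by support" described in the abstract.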
Affiliation(s)
- František Bartoš
- Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic
- Maximilian Maier
- Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Department of Experimental Psychology, University College London, London, UK
- Eric-Jan Wagenmakers
- Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Hristos Doucouliagos
- Deakin Laboratory for the Meta-Analysis of Research (DeLMAR), Deakin University, Melbourne, Australia
- Department of Economics, Deakin University, Melbourne, Australia
- T D Stanley
- Deakin Laboratory for the Meta-Analysis of Research (DeLMAR), Deakin University, Melbourne, Australia
- Department of Economics, Deakin University, Melbourne, Australia
9. Bosse NI, Abbott S, Bracher J, Hain H, Quilty BJ, Jit M, van Leeuwen E, Cori A, Funk S. Comparing human and model-based forecasts of COVID-19 in Germany and Poland. PLoS Comput Biol 2022; 18:e1010405. PMID: 36121848; PMCID: PMC9534421; DOI: 10.1371/journal.pcbi.1010405.
Abstract
Forecasts based on epidemiological modelling have played an important role in shaping public policy throughout the COVID-19 pandemic. This modelling combines knowledge about infectious disease dynamics with the subjective opinion of the researcher who develops and refines the model and often also adjusts model outputs. Developing a forecast model is difficult, resource- and time-consuming. It is therefore worth asking what modelling is able to add beyond the subjective opinion of the researcher alone. To investigate this, we analysed different real-time forecasts of cases of and deaths from COVID-19 in Germany and Poland over a 1- to 4-week horizon submitted to the German and Polish Forecast Hub. We compared crowd forecasts elicited from researchers and volunteers against (a) forecasts from two semi-mechanistic models based on common epidemiological assumptions and (b) the ensemble of all other models submitted to the Forecast Hub. We found crowd forecasts, despite being overconfident, to outperform all other methods across all forecast horizons when forecasting cases (weighted interval score relative to the Hub ensemble 2 weeks ahead: 0.89). Forecasts based on computational models performed comparatively better when predicting deaths (relative WIS 1.26), suggesting that epidemiological modelling and human judgement can complement each other in important ways.
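The weighted interval score (WIS) used to evaluate these forecasts can be sketched in Python following the standard formulation (a weighted sum of interval scores plus an absolute-error term for the median); the forecast numbers below are hypothetical:

```python
def interval_score(lower, upper, y, alpha):
    # Interval score for the central (1 - alpha) prediction interval:
    # width plus a penalty of 2/alpha per unit the observation falls outside.
    score = upper - lower
    if y < lower:
        score += 2 / alpha * (lower - y)
    if y > upper:
        score += 2 / alpha * (y - upper)
    return score

def weighted_interval_score(median, intervals, y):
    # intervals maps alpha -> (lower, upper) for the central (1 - alpha) interval.
    # WIS = (0.5 * |y - median| + sum_k (alpha_k / 2) * IS_k) / (K + 0.5).
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)

# Hypothetical forecast of weekly case counts, with 50% and 90% intervals.
wis = weighted_interval_score(
    median=1000,
    intervals={0.5: (900, 1150), 0.1: (700, 1400)},
    y=1200,
)
print(round(wis, 1))  # prints 99.0
```

Lower WIS is better; a relative WIS below 1 against the Hub ensemble (such as the 0.89 for cases) means the forecast outperformed the ensemble, while a value above 1 (such as the 1.26 for deaths) means it underperformed.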
Affiliation(s)
- Nikos I. Bosse
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Centre for the Mathematical Modelling of Infectious Diseases (members of the CMMID COVID-19 working group are listed in S1 Acknowledgements), London, United Kingdom
- Sam Abbott
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Centre for the Mathematical Modelling of Infectious Diseases (members of the CMMID COVID-19 working group are listed in S1 Acknowledgements), London, United Kingdom
- Johannes Bracher
- Institute of Economic Theory and Statistics, Karlsruhe Institute of Technology, Karlsruhe, Germany
- Habakuk Hain
- Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Billy J. Quilty
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Centre for the Mathematical Modelling of Infectious Diseases (members of the CMMID COVID-19 working group are listed in S1 Acknowledgements), London, United Kingdom
- Mark Jit
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Centre for the Mathematical Modelling of Infectious Diseases (members of the CMMID COVID-19 working group are listed in S1 Acknowledgements), London, United Kingdom
- Edwin van Leeuwen
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- UK Health Security Agency, London, United Kingdom
- Anne Cori
- MRC Centre for Outbreak Analysis and Modelling, Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom
- Sebastian Funk
- Department of Infectious Disease Epidemiology, London School of Hygiene & Tropical Medicine, London, United Kingdom
- Centre for the Mathematical Modelling of Infectious Diseases (members of the CMMID COVID-19 working group are listed in S1 Acknowledgements), London, United Kingdom
10. Dougherty MR, Horne Z. Citation counts and journal impact factors do not capture some indicators of research quality in the behavioural and brain sciences. Royal Society Open Science 2022; 9:220334. PMID: 35991336; PMCID: PMC9382220; DOI: 10.1098/rsos.220334.
Abstract
Citation data and journal impact factors are important components of faculty dossiers and figure prominently in both promotion decisions and assessments of a researcher's broader societal impact. Although these metrics play a large role in high-stakes decisions, the evidence is mixed about whether they are strongly correlated with indicators of research quality. We use data from a large-scale dataset comprising 45,144 journal articles with 667,208 statistical tests and data from 190 replication attempts to assess whether citation counts and impact factors predict three indicators of research quality: (i) the accuracy of statistical reporting, (ii) the evidential value of the reported data and (iii) the replicability of a given experimental result. Both citation counts and impact factors were weak and inconsistent predictors of research quality, so defined, and sometimes negatively related to quality. Our findings raise the possibility that citation data and impact factors may be of limited utility in evaluating scientists and their research. We discuss the implications of these findings in light of current incentive structures and discuss alternative approaches to evaluating research.
Affiliation(s)
- Zachary Horne
- Department of Psychology, University of Edinburgh, Edinburgh, UK
11
Dimant E, Clemente EG, Pieper D, Dreber A, Gelfand M. Politicizing mask-wearing: predicting the success of behavioral interventions among Republicans and Democrats in the U.S. Sci Rep 2022; 12:7575. [PMID: 35534489 PMCID: PMC9082983 DOI: 10.1038/s41598-022-10524-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 09/03/2021] [Accepted: 04/07/2022] [Indexed: 11/09/2022]
Abstract
Scientists and policymakers seek to choose effective interventions that promote preventative health measures. We evaluated whether academics, behavioral science practitioners, and laypeople (N = 1034) were able to forecast the effectiveness of seven different messages compared to a baseline message for Republicans and Democrats separately. These messages were designed to nudge mask-wearing attitudes, intentions, and behaviors. When examining predictions across political parties, forecasters predicted larger effects than those observed for Democrats compared to Republicans and made more accurate predictions for Republicans compared to Democrats. These results are partly driven by a lack of nudge effects on Democrats, as reported in Gelfand et al. (J Exp Soc Psychol, 2021). Academics and practitioners made more accurate predictions compared to laypeople. Although forecasters' predictions were correlated with the observed effects of the nudge interventions, all groups overestimated the observed results. We discuss potential reasons why the forecasts did not perform better and how more accurate forecasts of behavioral intervention outcomes could potentially provide insight that can help save resources and increase the efficacy of interventions.
Affiliation(s)
- Eugen Dimant
- Center for Social Norms and Behavioral Dynamics, University of Pennsylvania, Philadelphia, USA.
- CESifo, Munich, Germany.
- Anna Dreber
- Stockholm School of Economics, Stockholm, Sweden
- University of Innsbruck, Innsbruck, Austria
12
Marcoci A, Vercammen A, Bush M, Hamilton DG, Hanea A, Hemming V, Wintle BC, Burgman M, Fidler F. Reimagining peer review as an expert elicitation process. BMC Res Notes 2022; 15:127. [PMID: 35382867 PMCID: PMC8981826 DOI: 10.1186/s13104-022-06016-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/18/2021] [Accepted: 03/24/2022] [Indexed: 11/10/2022]
Abstract
Journal peer review regulates the flow of ideas through an academic discipline and thus has the power to shape what a research community knows, actively investigates, and recommends to policymakers and the wider public. We might assume that editors can identify the 'best' experts and rely on them for peer review. But decades of research on both expert decision-making and peer review suggest they cannot. In the absence of a clear criterion for demarcating reliable, insightful, and accurate expert assessors of research quality, the best safeguard against unwanted biases and uneven power distributions is to introduce greater transparency and structure into the process. This paper argues that peer review would therefore benefit from applying a series of evidence-based recommendations from the empirical literature on structured expert elicitation. We highlight individual and group characteristics that contribute to higher quality judgements, and elements of elicitation protocols that reduce bias, promote constructive discussion, and enable opinions to be objectively and transparently aggregated.
Affiliation(s)
- Alexandru Marcoci
- Centre for Argument Technology, School of Science and Engineering (Computing), University of Dundee, Dundee, UK.
- Ans Vercammen
- School of Communication and Arts, The University of Queensland, Brisbane, QLD, Australia
- Centre for Environmental Policy, Imperial College London, London, UK
- Martin Bush
- MetaMelb Lab, University of Melbourne, Melbourne, VIC, Australia
- Anca Hanea
- MetaMelb Lab, University of Melbourne, Melbourne, VIC, Australia
- Centre of Excellence for Biosecurity Risk Analysis, University of Melbourne, Melbourne, VIC, Australia
- Victoria Hemming
- Martin Conservation Decisions Lab, Department of Forest and Conservation Sciences, University of British Columbia, Vancouver, Canada
- Bonnie C Wintle
- MetaMelb Lab, University of Melbourne, Melbourne, VIC, Australia
- Mark Burgman
- Centre for Environmental Policy, Imperial College London, London, UK
- Fiona Fidler
- MetaMelb Lab, University of Melbourne, Melbourne, VIC, Australia
13
Wingen T, Berkessel JB, Dohle S. Caution, Preprint! Brief Explanations Allow Nonscientists to Differentiate Between Preprints and Peer-Reviewed Journal Articles. ADVANCES IN METHODS AND PRACTICES IN PSYCHOLOGICAL SCIENCE 2022. [DOI: 10.1177/25152459211070559] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]
Abstract
A growing number of psychological research findings are initially published as preprints. Preprints are not peer reviewed and thus have not undergone the established scientific quality-control process. Many researchers hence worry that these preprints reach nonscientists, such as practitioners, journalists, and policymakers, who might be unable to differentiate them from the peer-reviewed literature. Across five studies in Germany and the United States, we investigated whether this concern is warranted and whether this problem can be solved by providing nonscientists with a brief explanation of preprints and the peer-review process. Studies 1 and 2 showed that without an explanation, nonscientists perceive research findings published as preprints as equally credible as findings published as peer-reviewed articles. However, an explanation of the peer-review process reduces the perceived credibility of preprints (Studies 3 and 4). In Study 5, we developed and tested a shortened version of this explanation, which we recommend adding to preprints. This explanation again allowed nonscientists to differentiate between preprints and the peer-reviewed literature. In sum, our research demonstrates that even a short explanation of the concept of preprints and their lack of peer review allows nonscientists who evaluate scientific findings to adjust their credibility perceptions accordingly. This would allow harvesting the benefits of preprints, such as faster and more accessible science communication, while reducing concerns about public overconfidence in the presented findings.
Affiliation(s)
- Tobias Wingen
- Department of Psychology, University of Cologne, Cologne, Germany
- Jana B. Berkessel
- Mannheim Centre for European Social Research, University of Mannheim, Mannheim, Germany
- Simone Dohle
- Department of Psychology, University of Cologne, Cologne, Germany
14
Kerwer M, Stoll M, Jonas M, Benz G, Chasiotis A. How to Put It Plainly? Findings From Two Randomized Controlled Studies on Writing Plain Language Summaries for Psychological Meta-Analyses. Front Psychol 2021; 12:771399. [PMID: 34975663 PMCID: PMC8717946 DOI: 10.3389/fpsyg.2021.771399] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/06/2021] [Accepted: 11/08/2021] [Indexed: 11/26/2022]
Abstract
Plain language summaries (PLS) aim to communicate research findings to laypersons in an easily understandable manner. Despite the societal relevance of making psychological research findings available to the public, our empirical knowledge on how to write PLS of psychology studies is still scarce. In this article, we present two experimental studies investigating six characteristics of PLS for psychological meta-analyses. We specifically focused on approaches for (1) handling technical terms, (2) communicating the quality of evidence by explaining the methodological approach of meta-analyses, (3) explaining how synthesized studies operationalized their research questions, (4) handling statistical terms, (5) structuring PLS, and (6) explaining complex meta-analytic designs. To develop empirically validated guidelines on writing PLS, two randomized controlled studies including large samples stratified for education status, age, and gender (N = 2,288 in Study 1 and N = 2,211 in Study 2) were conducted. Eight PLS of meta-analyses from different areas of psychology were investigated as study materials. Main outcome variables were user experience (i.e., perceived accessibility, perceived understanding, and perceived empowerment) and knowledge acquisition, as well as understanding and knowledge of the quality of evidence. Overall, our hypotheses were partially confirmed: our results underline, among other things, the importance of explaining or replacing content-related technical terms (i.e., theoretical concepts) and indicate that providing too many details on statistical concepts is detrimental to user experience. Drawing on these and further findings, we derive five empirically well-founded rules on the lay-friendly communication of meta-analytic research findings in psychology. Implications for PLS authors and future research on PLS are discussed.
Affiliation(s)
- Martin Kerwer
- Leibniz Institute for Psychology (ZPID), Trier, Germany
- Marlene Stoll
- Leibniz Institute for Psychology (ZPID), Trier, Germany
- Leibniz Institute for Resilience Research (LIR), Mainz, Germany
- Mark Jonas
- Leibniz Institute for Psychology (ZPID), Trier, Germany
- Gesa Benz
- Leibniz Institute for Psychology (ZPID), Trier, Germany
15
Nosek BA, Hardwicke TE, Moshontz H, Allard A, Corker KS, Dreber A, Fidler F, Hilgard J, Struhl MK, Nuijten MB, Rohrer JM, Romero F, Scheel AM, Scherer LD, Schönbrodt FD, Vazire S. Replicability, Robustness, and Reproducibility in Psychological Science. Annu Rev Psychol 2021; 73:719-748. [PMID: 34665669 DOI: 10.1146/annurev-psych-020821-114157] [Citation(s) in RCA: 130] [Impact Index Per Article: 43.3] [Indexed: 11/09/2022]
Abstract
Replication, an important, uncommon, and misunderstood practice, is gaining appreciation in psychology. Achieving replicability is important for making research progress. If findings are not replicable, then prediction and theory development are stifled. If findings are replicable, then interrogation of their meaning and validity can advance knowledge. Assessing replicability can be productive for generating and testing hypotheses by actively confronting current understandings to identify weaknesses and spur innovation. For psychology, the 2010s might be characterized as a decade of active confrontation. Systematic and multi-site replication projects assessed current understandings and observed surprising failures to replicate many published findings. Replication efforts highlighted sociocultural challenges such as disincentives to conduct replications and a tendency to frame replication as a personal attack rather than a healthy scientific practice, and they raised awareness that replication contributes to self-correction. Nevertheless, innovation in doing and understanding replication and its cousins, reproducibility and robustness, has positioned psychology to improve research practices and accelerate progress.
Affiliation(s)
- Brian A Nosek
- Department of Psychology, University of Virginia, Charlottesville, Virginia 22904, USA
- Center for Open Science, Charlottesville, Virginia 22903, USA
- Tom E Hardwicke
- Department of Psychology, University of Amsterdam, 1012 ZA Amsterdam, The Netherlands
- Hannah Moshontz
- Addiction Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, USA
- Aurélien Allard
- Department of Psychology, University of California, Davis, California 95616, USA
- Katherine S Corker
- Psychology Department, Grand Valley State University, Allendale, Michigan 49401, USA
- Anna Dreber
- Department of Economics, Stockholm School of Economics, 113 83 Stockholm, Sweden
- Fiona Fidler
- School of Biosciences, University of Melbourne, Parkville VIC 3010, Australia
- Joe Hilgard
- Department of Psychology, Illinois State University, Normal, Illinois 61790, USA
- Michèle B Nuijten
- Meta-Research Center, Tilburg University, 5037 AB Tilburg, The Netherlands
- Julia M Rohrer
- Department of Psychology, Leipzig University, 04109 Leipzig, Germany
- Felipe Romero
- Department of Theoretical Philosophy, University of Groningen, 9712 CP Groningen, The Netherlands
- Anne M Scheel
- Department of Industrial Engineering and Innovation Sciences, Eindhoven University of Technology, 5612 AZ Eindhoven, The Netherlands
- Laura D Scherer
- University of Colorado Anschutz Medical Campus, Aurora, Colorado 80045, USA
- Felix D Schönbrodt
- Department of Psychology, Ludwig Maximilian University of Munich, 80539 Munich, Germany
- Simine Vazire
- School of Psychological Sciences, University of Melbourne, Parkville VIC 3052, Australia
16
Brown MI, Grossenbacher MA, Martin-Raugh MP, Kochert J, Prewett MS. Can you crowdsource expertise? Comparing expert and crowd-based scoring keys for three situational judgment tests. INTERNATIONAL JOURNAL OF SELECTION AND ASSESSMENT 2021. [DOI: 10.1111/ijsa.12353] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/01/2022]
Affiliation(s)
- Jonathan Kochert
- U.S. Army Research Institute for the Behavioral and Social Sciences, Fort Belvoir, Virginia, USA