1
Nickson D, Singmann H, Meyer C, Toro C, Walasek L. Replicability and reproducibility of predictive models for diagnosis of depression among young adults using Electronic Health Records. Diagn Progn Res 2023; 7:25. [PMID: 38049919] [PMCID: PMC10696659] [DOI: 10.1186/s41512-023-00160-2]
Abstract
BACKGROUND: Recent advances in machine learning, combined with the growing availability of digitized health records, offer new opportunities for improving the early diagnosis of depression. An emerging body of research shows that Electronic Health Records (EHRs) can be used to accurately predict cases of depression on the basis of individuals' primary care records. The successes of these studies are undeniable, but there is growing concern that their results may not be replicable, which could cast doubt on their clinical usefulness.
METHODS: To address this issue, we set out to reproduce and replicate the work of Nichols et al. (2018), who trained predictive models of depression among young adults using Electronic Health Records. Our contribution consists of three parts. First, we replicate the methodology of the original authors, acquiring a more up-to-date set of primary health care records to the same specification and reproducing their data processing and analysis. Second, we test the models presented in the original paper on our own data, thus providing an out-of-sample test of the predictive models. Third, we extend past work by considering several novel machine-learning approaches in an attempt to improve on the predictive accuracy achieved in the original work.
RESULTS: Our results demonstrate that the work of Nichols et al. is largely reproducible and replicable. This was the case both for the replication of the original model and for the out-of-sample replication applying the NRCBM coefficients to our new EHR data. Although the alternative predictive models did not improve performance over standard logistic regression, our results indicate that stepwise variable selection is not stable, even for large data sets.
CONCLUSION: We discuss the challenges of research on mental health using Electronic Health Records, including the need to produce interpretable and robust models. We demonstrate some potential pitfalls of relying on EHRs, including changes in regulations and guidelines (such as the QOF guidelines in the UK) and the use of GP visits as a predictor of specific disorders.
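The selection instability reported here is easy to reproduce in miniature. The following sketch, on purely synthetic data with no connection to the paper's EHR variables, runs forward stepwise selection for a logistic regression on bootstrap resamples and counts how many distinct variable sets come back; with correlated predictors, the sets typically differ from one resample to the next.

```python
# Illustrative sketch: stepwise variable selection is unstable under
# resampling. Synthetic data only -- no relation to the paper's EHR variables.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, n_informative=5,
                           n_redundant=10, random_state=0)

selected_sets = []
for b in range(10):                            # 10 bootstrap resamples
    idx = rng.integers(0, len(y), len(y))
    sfs = SequentialFeatureSelector(LogisticRegression(max_iter=1000),
                                    n_features_to_select=5,
                                    direction="forward", cv=3)
    sfs.fit(X[idx], y[idx])
    selected_sets.append(frozenset(np.flatnonzero(sfs.get_support())))

# If selection were stable, every resample would return the same set.
print(f"{len(set(selected_sets))} distinct 5-variable sets in 10 resamples")
```

The redundant (correlated) predictors are what drive the effect: near-equivalent variables trade places across resamples, which is why a single stepwise run should not be read as identifying "the" predictive variables.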
Affiliation(s)
- Henrik Singmann
- Department of Experimental Psychology, University College London, London, UK
- Caroline Meyer
- Warwick Medical School, University of Warwick, Coventry, UK
- Carla Toro
- Warwick Medical School, University of Warwick, Coventry, UK
- Lukasz Walasek
- Department of Psychology, University of Warwick, Coventry, UK
2
Schimmack U, Bartoš F. Estimating the false discovery risk of (randomized) clinical trials in medical journals based on published p-values. PLoS One 2023; 18:e0290084. [PMID: 37647247] [PMCID: PMC10468063] [DOI: 10.1371/journal.pone.0290084]
Abstract
The influential claim that most published results are false raised concerns about the trustworthiness and integrity of science. Since then, there have been numerous attempts to examine the rate of false-positive results empirically, but these have failed to settle the question. Here we propose a new way to estimate the false positive risk and apply the method to the results of (randomized) clinical trials in top medical journals. Contrary to claims that most published results are false, we find that the traditional significance criterion of α = .05 produces a false positive risk of 13%. Adjusting α to .01 lowers the false positive risk to less than 5%. However, our method does provide clear evidence of publication bias that leads to inflated effect size estimates. These results provide a solid empirical foundation for evaluations of the trustworthiness of medical research.
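The arithmetic behind such a false positive risk figure follows from the standard identity FPR = π0·α / (π0·α + (1−π0)·power), where π0 is the share of tested hypotheses that are truly null. The sketch below uses assumed values of π0 and power, chosen only to land in the same range as the paper's figures; the paper itself estimates these quantities from the distribution of published p-values.

```python
# Illustrative sketch of the false positive risk (FPR) arithmetic.
# pi0 and power are ASSUMED values, not the paper's estimates, which
# are derived from the distribution of published p-values.
def false_positive_risk(alpha: float, pi0: float, power: float) -> float:
    """Share of significant results that are false positives."""
    false_pos = pi0 * alpha            # true nulls reaching significance
    true_pos = (1 - pi0) * power       # true effects reaching significance
    return false_pos / (false_pos + true_pos)

print(f"alpha=.05: {false_positive_risk(0.05, pi0=0.5, power=0.33):.0%}")
print(f"alpha=.01: {false_positive_risk(0.01, pi0=0.5, power=0.20):.0%}")
# With these assumptions the risk falls from roughly 13% to about 5%,
# the same direction of change the paper reports for the stricter alpha.
```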
Affiliation(s)
- Ulrich Schimmack
- Department of Psychology, University of Toronto Mississauga, Mississauga, Canada
- František Bartoš
- Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Institute of Computer Science, Czech Academy of Sciences, Prague, Czech Republic
3
Davis J, Redshaw J, Suddendorf T, Nielsen M, Kennedy-Costantini S, Oostenbroek J, Slaughter V. Does Neonatal Imitation Exist? Insights From a Meta-Analysis of 336 Effect Sizes. Perspectives on Psychological Science 2021; 16:1373-1397. [PMID: 33577426] [DOI: 10.1177/1745691620959834]
Abstract
Neonatal imitation is a cornerstone of many theoretical accounts of human development and social behavior, yet its existence has been debated for the past 40 years. To examine possible explanations for the inconsistent findings in this body of research, we conducted a multilevel meta-analysis synthesizing 336 effect sizes from 33 independent samples of human newborns, reported in 26 articles. The meta-analysis found significant evidence for neonatal imitation (d = 0.68, 95% CI = [0.39, 0.96], p < .001) but substantial heterogeneity between study estimates. This heterogeneity was not explained by any of 13 methodological moderators identified by previous reviews, but it was associated with researcher affiliation, QM(15) = 57.09, p < .001. There are at least two possible explanations for these results: (a) neonatal imitation exists and its detection varies as a function of uncaptured methodological factors common to a limited set of studies, or (b) neonatal imitation does not exist and the overall positive result is an artifact of high researcher degrees of freedom.
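For readers who want to see what the reported statistics measure, the sketch below computes an inverse-variance pooled effect, Cochran's heterogeneity Q, and I² from a handful of invented effect sizes; the numbers are illustrative only and are not the 336 effects analyzed in the paper.

```python
# Illustrative sketch: pooled effect, Cochran's Q, and I^2 for a small
# set of INVENTED effect sizes -- not the data from this meta-analysis.
import numpy as np
from scipy import stats

d = np.array([1.10, 0.05, 0.90, -0.10, 0.75])   # study effect sizes (Cohen's d)
se = np.array([0.30, 0.25, 0.35, 0.20, 0.30])   # their standard errors

w = 1 / se**2                                   # inverse-variance weights
d_pooled = np.sum(w * d) / np.sum(w)
Q = np.sum(w * (d - d_pooled) ** 2)             # Cochran's heterogeneity Q
df = len(d) - 1
I2 = max(0.0, (Q - df) / Q) * 100               # % variance beyond sampling error

print(f"pooled d = {d_pooled:.2f}")
print(f"Q({df}) = {Q:.2f}, p = {stats.chi2.sf(Q, df):.4f}, I^2 = {I2:.0f}%")
```

A significant Q (here p ≈ .002 with I² around 75%) is the kind of "substantial heterogeneity" the abstract describes: the studies disagree more than sampling error alone would predict, which is what motivates the moderator (QM) analysis.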
Affiliation(s)
- Jacqueline Davis
- Department of Psychology, University of Cambridge
- School of Psychology, University of Queensland
- Mark Nielsen
- School of Psychology, University of Queensland
- Faculty of Humanities, University of Johannesburg
4
Bennett EA. Open Science From a Qualitative, Feminist Perspective: Epistemological Dogmas and a Call for Critical Examination. Psychology of Women Quarterly 2021. [DOI: 10.1177/03616843211036460]
Abstract
Open science serves to address core issues that are unique to quantitative methods in psychology, though it is typically presented as an appropriate framework for psychological research in general. In the present article, I critically examine the context within which open science operates and bring that perspective into dialogue with the priorities and goals of research that is both qualitative and feminist. I orient this examination around the question: what does open science mean for research methodologies that have historically been a home for transgressive and radical question-asking? Questioning the purposes of key tenets like replication and statistical significance, value systems that mark central distinctions between quantitative and qualitative methods, raises bigger questions for feminist psychologists. What counts as science? What counts as a valid epistemology? How can we avoid further marginalizing epistemologies deemed less valid? I explore these questions and then offer a possible reimagining of our field's engagement with open science, presenting seven practical suggestions to guide this endeavor.
5
Brabeck MM. Open Science and Feminist Ethics: Promises and Challenges of Open Access. Psychology of Women Quarterly 2021. [DOI: 10.1177/03616843211030926]
Abstract
Open science advocates argue that making data sets, studies, methodologies, and other aspects of research free from publication fees and available to scholars will increase collaborations, access, and dissemination of knowledge. In this article, I argue that open access policies and practices raise both feminist and ethical issues. I reflect on the five themes of feminist ethics identified 20 years ago by a task force of the Society for the Psychology of Women. I update the themes with recent scholarship of feminist philosophers and ethicists, and I use the themes to raise questions about the promises and challenges of open access. Throughout, I offer suggestions for all who seek to make knowledge of human psychology more complete and more accessible to more people. I conclude by offering recommendations informed by feminist ethics to those building the policies and practices of open access. Online slides for instructors who want to use this article for teaching are available on PWQ's website at http://journals.sagepub.com/doi/suppl/10.1177/03616843211030926
Affiliation(s)
- Mary M. Brabeck
- Department of Applied Psychology, New York University, New York, NY, USA
6
Carpenter TP, Law KC. Optimizing the scientific study of suicide with open and transparent research practices. Suicide Life Threat Behav 2021; 51:36-46. [PMID: 33624871] [DOI: 10.1111/sltb.12665]
Abstract
Suicide research is vitally important, yet, like psychology research more broadly, it faces methodological challenges. In recent years, researchers have raised concerns about standard practices in psychological research, concerns that apply to suicide research and raise questions about its robustness and validity. In the present paper, we review these concerns and the corresponding solutions put forth by the "open science" community. These include using open science platforms, pre-registering studies, ensuring reproducible analyses, conducting high-powered studies, ensuring open access to research materials and products, and conducting replication studies. We build upon existing guides, address specific obstacles faced by suicide researchers, and offer a clear set of recommended practices. In particular, we consider challenges that suicide researchers may face in adopting "open science" practices (e.g., prioritizing large samples) and suggest strategies the field may use to ensure robust and transparent research despite these challenges.
Affiliation(s)
- Keyne C Law
- Seattle Pacific University, Seattle, Washington, USA
7
Margoni F, Shepperd M. Changing the logic of replication: A case from infant studies. Infant Behav Dev 2020; 61:101483. [PMID: 33011611] [DOI: 10.1016/j.infbeh.2020.101483]
Abstract
Among infant researchers there is growing concern regarding the widespread practice of undertaking studies with small sample sizes that employ tests with low statistical power (to detect a wide range of possible effects). For many researchers, issues of confidence may be partially resolved by relying on replications. Here, we bring further evidence that the classical logic of confirmation, according to which the result of a replication study confirms the original finding when it reaches statistical significance, could usefully be abandoned. With real examples taken from the infant literature and Monte Carlo simulations, we show that a very wide range of possible replication results would, in a formal statistical sense, constitute confirmation, as they can be explained simply by sampling error. Thus, often no useful conclusion can be drawn from a single replication study or a small number of them. We suggest that, in order to accumulate and generate new knowledge, the dichotomous view of replication as confirmatory/disconfirmatory be replaced by an approach that emphasizes the estimation of effect sizes via meta-analysis. Moreover, we discuss possible solutions for reducing the problems affecting the validity of conclusions drawn from meta-analyses in infant research.
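The Monte Carlo logic is simple to reproduce in outline. The sketch below, with assumed parameter values rather than those used by the authors, simulates many exact replications of a small two-group study and shows how widely the replication estimates, and their significance verdicts, scatter from sampling error alone.

```python
# Illustrative Monte Carlo sketch: how much replication results scatter
# under sampling error alone. Effect size and n are ASSUMED values,
# not parameters taken from the paper's simulations.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
true_d, n, n_sims = 0.4, 30, 10_000      # small-sample infant-study scenario

obs_d = np.empty(n_sims)
p_vals = np.empty(n_sims)
for i in range(n_sims):
    a = rng.normal(true_d, 1, n)         # "treatment" group, shifted by d
    b = rng.normal(0.0, 1, n)            # control group
    t, p = stats.ttest_ind(a, b)
    obs_d[i] = a.mean() - b.mean()       # observed d (unit-variance scale)
    p_vals[i] = p

lo, hi = np.percentile(obs_d, [2.5, 97.5])
print(f"95% of replication estimates fall in [{lo:.2f}, {hi:.2f}]")
print(f"share significant at p < .05: {(p_vals < .05).mean():.0%}")
# With n = 30 per group, a true d of 0.4 reaches significance only about
# a third of the time: many "failed" replications are just sampling error.
```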
Affiliation(s)
- Francesco Margoni
- Department of Psychology and Cognitive Sciences, University of Trento, Italy
- Martin Shepperd
- Department of Computer Science, Brunel University London, United Kingdom
8
Fife D. The Eight Steps of Data Analysis: A Graphical Framework to Promote Sound Statistical Analysis. Perspectives on Psychological Science 2020; 15:1054-1075. [PMID: 32502366] [DOI: 10.1177/1745691620917333]
Abstract
Data analysis is a risky endeavor, particularly among people who are unaware of its dangers. According to some researchers, threats to "statistical conclusion validity" loom over all research subjected to the dark arts of statistical magic. Although traditional statistics classes may advise against certain practices (e.g., multiple comparisons, small sample sizes, violating normality), they may fail to cover others (e.g., outlier detection and violating linearity). More common, perhaps, is that researchers simply fail to remember them. In this article, rather than rehashing old warnings and diatribes against this practice or that, I advocate a general statistical-analysis strategy. This graphic-based, eight-step strategy promises to resolve the majority of statistical traps researchers may fall into, without requiring them to remember long lists of problematic statistical practices. These steps will assist in preventing both false positives and false negatives and will yield critical insights about the data that would otherwise have been missed. I conclude with an applied example showing how the eight steps reveal interesting insights that would not be detected with standard statistical practices.
10
Changing institutional incentives to foster sound scientific practices: One department. Infant Behav Dev 2019; 55:69-76. [PMID: 30933839] [DOI: 10.1016/j.infbeh.2019.03.006]
Abstract
Replicable research and open science are of value to our field and to society at large, but most universities provide no incentives to adopt these practices. Instead, current incentive structures favor novel research, which has led to a situation in which few researchers take the time to do replications, share protocols, or share data. Obviously, several approaches to remedy this situation are possible. However, little progress can be made if becoming involved in such activities reduces a researcher's chances of rank and status advancement and other rewards. I describe in this article the way my department has modified our incentive structure to tackle this problem, including how the changes influence my research as a developmental psychologist. Finally, I offer suggestions for faculty who wish to initiate similar changes in their institutions.
11
Craske MG. Refining our research practices in clinical science: Challenges and steps towards solutions. Behav Res Ther 2019; 116:90-93. [PMID: 30870766] [DOI: 10.1016/j.brat.2019.03.006]
Affiliation(s)
- Michelle G Craske
- Department of Psychology, University of California, Los Angeles, United States
12
Hantula DA. Editorial: Replication and Reliability in Behavior Science and Behavior Analysis: A Call for a Conversation. Perspect Behav Sci 2019; 42:1-11. [PMID: 31976418] [PMCID: PMC6701722] [DOI: 10.1007/s40614-019-00194-2]
Affiliation(s)
- Donald A. Hantula
- Department of Psychology, Temple University, Philadelphia, PA 19122 USA
13
Amaral OB, Neves K, Wasilewska-Sampaio AP, Carneiro CFD. The Brazilian Reproducibility Initiative. eLife 2019; 8:e41602. [PMID: 30720433] [PMCID: PMC6374071] [DOI: 10.7554/elife.41602]
Abstract
Most efforts to estimate the reproducibility of published findings have focused on specific areas of research, even though science is usually assessed and funded on a regional or national basis. Here we describe a project to assess the reproducibility of findings in biomedical science published by researchers based in Brazil. The Brazilian Reproducibility Initiative is a systematic, multicenter effort to repeat between 60 and 100 experiments: the project will focus on a set of common methods, repeating each experiment in three different laboratories from a countrywide network. The results, due in 2021, will allow us to estimate the level of reproducibility of biomedical science in Brazil, and to investigate what aspects of the published literature might help to predict whether a finding is reproducible.
Affiliation(s)
- Olavo B Amaral
- Institute of Medical Biochemistry Leopoldo de Meis, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Kleber Neves
- Institute of Medical Biochemistry Leopoldo de Meis, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Ana P Wasilewska-Sampaio
- Institute of Medical Biochemistry Leopoldo de Meis, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Clarissa FD Carneiro
- Institute of Medical Biochemistry Leopoldo de Meis, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
14
Abstract
Zwaan et al. make a compelling case for the necessity of direct replication in psychological science. I build on their arguments by underscoring the necessity of direct replication for two domains of clinical psychological science: the evaluation of psychotherapy outcome and the construct validity of psychological measures.
15
Shrout PE, Rodgers JL. Psychology, Science, and Knowledge Construction: Broadening Perspectives from the Replication Crisis. Annu Rev Psychol 2018; 69:487-510. [DOI: 10.1146/annurev-psych-122216-011845]
Affiliation(s)
- Patrick E. Shrout
- Department of Psychology, New York University, New York, New York 10003
| | - Joseph L. Rodgers
- Department of Psychology and Human Development, Peabody College, Vanderbilt University, Nashville, Tennessee 37205
16
Zwaan RA, Etz A, Lucas RE, Donnellan MB. Making replication mainstream. Behav Brain Sci 2018; 41:e120.
Abstract
Many philosophers of science and methodologists have argued that the ability to repeat studies and obtain similar results is an essential component of science. A finding is elevated from single observation to scientific evidence when the procedures that were used to obtain it can be reproduced and the finding itself can be replicated. Recent replication attempts show that some high-profile results – most notably in psychology, but in many other disciplines as well – cannot be replicated consistently. These replication attempts have generated a considerable amount of controversy, and the issue of whether direct replications have value has proven particularly contentious. However, much of this discussion has occurred in published commentaries and social media outlets, resulting in a fragmented discourse. To address the need for an integrative summary, we review various types of replication studies and then discuss the most commonly voiced concerns about direct replication. We provide detailed responses to these concerns and consider different statistical ways to evaluate replications. We conclude that there are no theoretical or statistical obstacles to making direct replication a routine aspect of psychological science.
17
Lawton JS. Reproducibility and replicability of science and thoracic surgery. J Thorac Cardiovasc Surg 2016; 152:1489-1491. [PMID: 27692760] [DOI: 10.1016/j.jtcvs.2016.08.044]
Affiliation(s)
- Jennifer S Lawton
- Division of Cardiac Surgery, Department of Surgery, Johns Hopkins University School of Medicine, Baltimore, MD
18
Laws KR. Psychology, replication & beyond. BMC Psychol 2016; 4:26.
Abstract
Modern psychology is apparently in crisis, and the prevailing view is that this partly reflects an inability to replicate past findings. If a crisis does exist, then it is some kind of 'chronic' crisis, as psychologists have been censuring themselves over replicability for decades. While the debate in psychology is not new, the lack of progress across the decades is disappointing. Recently, though, we have seen a veritable surfeit of debate alongside multiple orchestrated and well-publicised replication initiatives. The spotlight is being shone on certain areas, and although not everyone agrees on how we should interpret the outcomes, the debate is happening and impassioned. The issue of reproducibility occupies a central place in our Whig history of psychology.