1. Yan X, Ebitz RB, Grissom N, Darrow DP, Herman AB. Individual differences in uncertainty evaluation explain opposing exploratory behaviors in anxiety and apathy. bioRxiv 2024:2024.06.04.597412. [PMID: 38895240 PMCID: PMC11185698 DOI: 10.1101/2024.06.04.597412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
Navigating uncertain environments is a fundamental challenge for adaptive behavior, and affective states such as anxiety and apathy can profoundly influence an individual's response to uncertainty. Uncertainty encompasses both volatility and stochasticity, where volatility refers to how rapidly the environment changes and stochasticity describes outcomes resulting from random chance. This study investigates how anxiety and apathy modulate perceptions of environmental volatility and stochasticity and how these perceptions impact exploratory behavior. In a large online sample (N = 1001), participants completed a restless three-armed bandit task, and their choices were analyzed using latent state models to quantify the computational processes. We found that anxious individuals attributed uncertainty more to environmental volatility than stochasticity, leading to increased exploration, particularly after reward omission. Conversely, apathetic individuals perceived uncertainty as more stochastic than volatile, resulting in decreased exploration. The ratio of perceived volatility to stochasticity mediated the relationship between anxiety and exploratory behavior following adverse outcomes. These findings reveal distinct computational mechanisms underlying anxiety and apathy in uncertain environments. Our results provide a novel framework for understanding the cognitive and affective processes driving adaptive and potentially maladaptive behaviors under uncertainty, with implications for the characterization and treatment of neuropsychiatric disorders.
Affiliation(s)
- Xinyuan Yan
  - Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, MN 55455, USA
- R. Becket Ebitz
  - Department of Neuroscience, Universite de Montreal, 2900 Edouard Montpetit Blvd, Montreal, Quebec H3T 1J4, Canada
- Nicola Grissom
  - Department of Psychology, University of Minnesota, 75 E River Rd, Minneapolis, MN 55455, USA
- David P. Darrow
  - Department of Neurosurgery, University of Minnesota, Minneapolis, MN 55455, USA
- Alexander B. Herman
  - Department of Psychiatry and Behavioral Sciences, University of Minnesota, Minneapolis, MN 55455, USA
2. Montaser-Kouhsari L, Nicholas J, Gerraty RT, Shohamy D. Two routes to value-based decisions in Parkinson's disease: differentiating incremental reinforcement learning from episodic memory. bioRxiv 2024:2024.05.03.592414. [PMID: 38746345 PMCID: PMC11092770 DOI: 10.1101/2024.05.03.592414] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Patients with Parkinson's disease are impaired at incremental reward-based learning. It is typically assumed that this impairment reflects a loss of striatal dopamine. However, many open questions remain about the nature of reward-based learning deficits in Parkinson's. Recent studies have found that a combination of different cognitive and computational strategies contribute even to simple reward-based learning tasks, suggesting a possible role for episodic memory. These findings raise critical questions about how incremental learning and episodic memory interact to support learning from past experience and what their relative contributions are to impaired decision-making in Parkinson's disease. Here we addressed these questions by asking patients with Parkinson's disease (n=26) both on and off their dopamine replacement medication and age- and education-matched healthy controls (n=26) to complete a task designed to isolate the contributions of incremental learning and episodic memory to reward-based learning and decision-making. We found that Parkinson's patients performed as well as healthy controls when using episodic memory, but were impaired at incremental reward-based learning. Dopamine replacement medication remediated this deficit while enhancing subsequent episodic memory for the value of motivationally relevant stimuli. These results demonstrate that Parkinson's patients are impaired at learning about reward from trial-and-error when episodic memory is properly controlled for, and that learning based on the value of single experiences remains intact in patients with Parkinson's disease.
3. Mochizuki Y, Harasawa N, Aggarwal M, Chen C, Fukuda H. Foraging in a non-foraging task: Fitness maximization explains human risk preference dynamics under changing environment. PLoS Comput Biol 2024; 20:e1012080. [PMID: 38739672 PMCID: PMC11115364 DOI: 10.1371/journal.pcbi.1012080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 05/23/2024] [Accepted: 04/16/2024] [Indexed: 05/16/2024] Open
Abstract
Changes in risk preference have been reported when making a series of independent risky choices or non-foraging economic decisions. Behavioral economics has put forward various explanations for specific changes in risk preference in non-foraging tasks, but a consensus regarding the general principle underlying these effects has not been reached. In contrast, recent studies have investigated human economic risky choices using tasks adapted from foraging theory, which require consideration of past choices and future opportunities to make optimal decisions. In these foraging tasks, human economic risky choices are explained by the ethological principle of fitness maximization, which naturally leads to dynamic risk preference. Here, we conducted two online experiments to investigate whether the principle of fitness maximization can explain risk preference dynamics in a non-foraging task. Participants were asked to make a series of independent risky economic decisions while the environmental richness changed. We found that participants' risk preferences were influenced by the current and past environments, making them more risk-averse during and after the rich environment compared to the poor environment. These changes in risk preference align with fitness maximization. Our findings suggest that the ethological principle of fitness maximization might serve as a generalizable principle for explaining dynamic preferences, including risk preference, in human economic decision-making.
Affiliation(s)
- Chong Chen
  - Division of Neuropsychiatry, Department of Neuroscience, Yamaguchi University Graduate School of Medicine, Ube, Yamaguchi, Japan
- Haruaki Fukuda
  - Graduate School of Business Administration, Hitotsubashi University, Kunitachi, Tokyo, Japan
4. Kang P, Tobler PN, Dayan P. Bayesian reinforcement learning: A basic overview. Neurobiol Learn Mem 2024; 211:107924. [PMID: 38579896 DOI: 10.1016/j.nlm.2024.107924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/07/2024]
Abstract
We and other animals learn because there is some aspect of the world about which we are uncertain. This uncertainty arises from initial ignorance, and from changes in the world that we do not perfectly know; the uncertainty often becomes evident when our predictions about the world are found to be erroneous. The Rescorla-Wagner learning rule, which specifies one way that prediction errors can occasion learning, has been hugely influential as a characterization of Pavlovian conditioning and, through its equivalence to the delta rule in engineering, in a much wider class of learning problems. Here, we review the embedding of the Rescorla-Wagner rule in a Bayesian context that is precise about the link between uncertainty and learning, and thereby discuss extensions to such suggestions as the Kalman filter, structure learning, and beyond, that collectively encompass a wider range of uncertainties and accommodate a wider assortment of phenomena in conditioning.
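As a rough illustration of the link this abstract draws between the Rescorla-Wagner rule and the Kalman filter, the sketch below contrasts a fixed-learning-rate delta rule with a scalar Kalman filter, whose gain behaves like a learning rate that depends on current uncertainty. The function names, the random-walk assumption for the reward mean, and all parameter values are illustrative assumptions and are not taken from the review.

```python
def rescorla_wagner_update(value, reward, learning_rate):
    """Delta rule: move the estimate toward the reward by a fixed
    fraction of the prediction error."""
    prediction_error = reward - value
    return value + learning_rate * prediction_error


def kalman_update(mean, variance, reward, process_var, noise_var):
    """One step of a scalar Kalman filter.

    Assumes (for illustration) that the true reward mean follows a random
    walk with variance `process_var` and is observed with Gaussian noise
    of variance `noise_var`.
    """
    # Uncertainty about the mean grows between observations.
    prior_variance = variance + process_var
    # The gain plays the role of an uncertainty-dependent learning rate.
    gain = prior_variance / (prior_variance + noise_var)
    # Same form as the delta rule, with the gain replacing the fixed rate.
    new_mean = mean + gain * (reward - mean)
    # Observing an outcome reduces uncertainty.
    new_variance = (1.0 - gain) * prior_variance
    return new_mean, new_variance, gain


if __name__ == "__main__":
    value = 0.0
    mean, variance = 0.0, 1.0
    for trial in range(5):
        value = rescorla_wagner_update(value, reward=1.0, learning_rate=0.3)
        mean, variance, gain = kalman_update(
            mean, variance, reward=1.0, process_var=0.0, noise_var=1.0
        )
        print(trial, round(value, 3), round(mean, 3), round(gain, 3))
```

With `process_var` set to zero the gain shrinks on every trial, so early outcomes move the estimate more than later ones; a positive `process_var` keeps the gain from decaying to zero, which is one way to express continued learning in a changing world.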
Affiliation(s)
- Pyungwon Kang
  - University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland
- Philippe N Tobler
  - University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany
  - University of Tübingen, Tübingen, Germany
5. Farkas BC, Baptista A, Speranza M, Wyart V, Jacquet PO. Specifying the timescale of early life unpredictability helps explain the development of internalising and externalising behaviours. Sci Rep 2024; 14:3563. [PMID: 38347055 PMCID: PMC10861493 DOI: 10.1038/s41598-024-54093-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 02/08/2024] [Indexed: 02/15/2024] Open
Abstract
Early life unpredictability is associated with both physical and mental health outcomes throughout the life course. Here, we classified adverse experiences based on the timescale on which they are likely to introduce variability in children's environments: variations unfolding over short time scales (e.g., hours, days, weeks) and labelled Stochasticity vs variations unfolding over longer time scales (e.g., months, years) and labelled Volatility and explored how they contribute to the development of problem behaviours. Results indicate that externalising behaviours at age 9 and 15 and internalising behaviours at age 15 were better accounted for by models that separated Stochasticity and Volatility measured at ages 3 to 5. Both externalising and internalising behaviours were specifically associated with Volatility, with larger effects for externalising behaviours. These findings are interpreted in light of evolutionary-developmental models of psychopathology and reinforcement learning models of learning under uncertainty.
Affiliation(s)
- Bence Csaba Farkas
  - Institut du Psychotraumatisme de l'Enfant et de l'Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine et Centre Hospitalier des Versailles, 78000, Versailles, France
  - UVSQ, Inserm, Centre de Recherche en Epidémiologie et Santé des Populations, Université Paris-Saclay, 78000, Versailles, France
  - LNC2, Département d'études Cognitives, École Normale Supérieure, INSERM, PSL Research University, 75005, Paris, France
- Axel Baptista
  - UVSQ, Inserm, Centre de Recherche en Epidémiologie et Santé des Populations, Université Paris-Saclay, 78000, Versailles, France
  - Centre Hospitalier de Versailles, Le Chesnay, France
- Mario Speranza
  - Institut du Psychotraumatisme de l'Enfant et de l'Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine et Centre Hospitalier des Versailles, 78000, Versailles, France
  - UVSQ, Inserm, Centre de Recherche en Epidémiologie et Santé des Populations, Université Paris-Saclay, 78000, Versailles, France
  - Centre Hospitalier de Versailles, Le Chesnay, France
- Valentin Wyart
  - Institut du Psychotraumatisme de l'Enfant et de l'Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine et Centre Hospitalier des Versailles, 78000, Versailles, France
  - LNC2, Département d'études Cognitives, École Normale Supérieure, INSERM, PSL Research University, 75005, Paris, France
- Pierre Olivier Jacquet
  - Institut du Psychotraumatisme de l'Enfant et de l'Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine et Centre Hospitalier des Versailles, 78000, Versailles, France
  - UVSQ, Inserm, Centre de Recherche en Epidémiologie et Santé des Populations, Université Paris-Saclay, 78000, Versailles, France
  - LNC2, Département d'études Cognitives, École Normale Supérieure, INSERM, PSL Research University, 75005, Paris, France
6. Heald JB, Wolpert DM, Lengyel M. The Computational and Neural Bases of Context-Dependent Learning. Annu Rev Neurosci 2023; 46:233-258. [PMID: 36972611 PMCID: PMC10348919 DOI: 10.1146/annurev-neuro-092322-100402] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
Abstract
Flexible behavior requires the creation, updating, and expression of memories to depend on context. While the neural underpinnings of each of these processes have been intensively studied, recent advances in computational modeling revealed a key challenge in context-dependent learning that had been largely ignored previously: Under naturalistic conditions, context is typically uncertain, necessitating contextual inference. We review a theoretical approach to formalizing context-dependent learning in the face of contextual uncertainty and the core computations it requires. We show how this approach begins to organize a large body of disparate experimental observations, from multiple levels of brain organization (including circuits, systems, and behavior) and multiple brain regions (most prominently the prefrontal cortex, the hippocampus, and motor cortices), into a coherent framework. We argue that contextual inference may also be key to understanding continual learning in the brain. This theory-driven perspective places contextual inference as a core component of learning.
Affiliation(s)
- James B Heald
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Daniel M Wolpert
  - Department of Neuroscience and Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
- Máté Lengyel
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, United Kingdom
  - Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
7. Sharp PB, Fradkin I, Eldar E. Hierarchical inference as a source of human biases. Cogn Affect Behav Neurosci 2023; 23:476-490. [PMID: 35725986 DOI: 10.3758/s13415-022-01020-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 06/06/2022] [Indexed: 06/15/2023]
Abstract
The finding that human decision-making is systematically biased continues to have an immense impact on both research and policymaking. Prevailing views ascribe biases to limited computational resources, which require humans to resort to less costly resource-rational heuristics. Here, we propose that many biases in fact arise due to a computationally costly way of coping with uncertainty, namely hierarchical inference, which by nature incorporates information that can seem irrelevant. We show how, in uncertain situations, Bayesian inference may avail of the environment's hierarchical structure to reduce uncertainty at the cost of introducing bias. We illustrate how this account can explain a range of familiar biases, focusing in detail on the halo effect and on the neglect of base rates. In each case, we show how a hierarchical-inference account takes the characterization of a bias beyond phenomenological description by revealing the computations and assumptions it might reflect. Furthermore, we highlight new predictions entailed by our account concerning factors that could mitigate or exacerbate bias, some of which have already garnered empirical support. We conclude that a hierarchical inference account may inform scientists and policy makers with a richer understanding of the adaptive and maladaptive aspects of human decision-making.
Affiliation(s)
- Paul B Sharp
  - Department of Psychology, Hebrew University of Jerusalem, 9190501, Jerusalem, Israel
  - Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, 9190501, Jerusalem, Israel
- Isaac Fradkin
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK
  - Wellcome Trust Centre for Neuroimaging, University College London, London, WC1N 3BG, UK
- Eran Eldar
  - Department of Psychology, Hebrew University of Jerusalem, 9190501, Jerusalem, Israel
  - Department of Cognitive and Brain Sciences, Hebrew University of Jerusalem, 9190501, Jerusalem, Israel
8. Lee JK, Rouault M, Wyart V. Adaptive tuning of human learning and choice variability to unexpected uncertainty. Sci Adv 2023; 9:eadd0501. [PMID: 36989365 PMCID: PMC10058239 DOI: 10.1126/sciadv.add0501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2022] [Accepted: 02/28/2023] [Indexed: 06/19/2023]
Abstract
Human value-based decisions are notably variable under uncertainty. This variability is known to arise from two distinct sources: variable choices aimed at exploring available options and imprecise learning of option values due to limited cognitive resources. However, whether these two sources of decision variability are tuned to their specific costs and benefits remains unclear. To address this question, we compared the effects of expected and unexpected uncertainty on decision-making in the same reinforcement learning task. Across two large behavioral datasets, we found that humans choose more variably between options but simultaneously learn less imprecisely their values in response to unexpected uncertainty. Using simulations of learning agents, we demonstrate that these opposite adjustments reflect adaptive tuning of exploration and learning precision to the structure of uncertainty. Together, these findings indicate that humans regulate not only how much they explore uncertain options but also how precisely they learn the values of these options.
Affiliation(s)
- Junseok K. Lee
  - Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et de la Recherche Médicale (Inserm), Paris, France
  - Département d’Études Cognitives, École Normale Supérieure, Université PSL, Paris, France
- Marion Rouault
  - Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et de la Recherche Médicale (Inserm), Paris, France
  - Département d’Études Cognitives, École Normale Supérieure, Université PSL, Paris, France
- Valentin Wyart
  - Laboratoire de Neurosciences Cognitives et Computationnelles, Institut National de la Santé et de la Recherche Médicale (Inserm), Paris, France
  - Département d’Études Cognitives, École Normale Supérieure, Université PSL, Paris, France
  - Institut du Psychotraumatisme de l’Enfant et de l’Adolescent, Conseil Départemental Yvelines et Hauts-de-Seine, Versailles, France
9. Wieland L, Ebrahimi C, Katthagen T, Panitz M, Luettgau L, Heinz A, Schlagenhauf F, Sjoerds Z. Acute stress alters probabilistic reversal learning in healthy male adults. Eur J Neurosci 2023; 57:824-839. [PMID: 36656136 DOI: 10.1111/ejn.15916] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 01/12/2023] [Accepted: 01/14/2023] [Indexed: 01/20/2023]
Abstract
Behavioural adaptation is a fundamental cognitive ability, ensuring survival by allowing for flexible adjustment to changing environments. In laboratory settings, behavioural adaptation can be measured with reversal learning paradigms requiring agents to adjust reward learning to stimulus-action-outcome contingency changes. Stress is found to alter flexibility of reward learning, but effect directionality is mixed across studies. Here, we used model-based functional MRI (fMRI) in a within-subjects design to investigate the effect of acute psychosocial stress on flexible behavioural adaptation. Healthy male volunteers (n = 28) did a reversal learning task during fMRI in two sessions, once after the Trier Social Stress Test (TSST), a validated psychosocial stress induction method, and once after a control condition. Stress effects on choice behaviour were investigated using multilevel generalized linear models and computational models describing different learning processes that potentially generated the data. Computational models were fitted using a hierarchical Bayesian approach, and model-derived reward prediction errors (RPE) were used as fMRI regressors. We found that acute psychosocial stress slightly increased correct response rates. Model comparison revealed that double-update learning with altered choice temperature under stress best explained the observed behaviour. In the brain, model-derived RPEs were correlated with BOLD signals in striatum and ventromedial prefrontal cortex (vmPFC). Striatal RPE signals for win trials were stronger during stress compared with the control condition. Our study suggests that acute psychosocial stress could enhance reversal learning and RPE brain responses in healthy male participants and provides a starting point to explore these effects further in a more diverse population.
Affiliation(s)
- Lara Wieland
  - Department of Psychiatry and Neurosciences, CCM, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Bernstein Center for Computational Neuroscience, Berlin, Germany
- Claudia Ebrahimi
  - Department of Psychiatry and Neurosciences, CCM, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Teresa Katthagen
  - Department of Psychiatry and Neurosciences, CCM, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Martin Panitz
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Lennart Luettgau
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, UK
- Andreas Heinz
  - Department of Psychiatry and Neurosciences, CCM, Charité-Universitätsmedizin Berlin, Berlin, Germany
- Florian Schlagenhauf
  - Department of Psychiatry and Neurosciences, CCM, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Einstein Center for Neurosciences Berlin, Charité-Universitätsmedizin Berlin, Berlin, Germany
  - Bernstein Center for Computational Neuroscience, Berlin, Germany
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
- Zsuzsika Sjoerds
  - Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, Germany
  - Cognitive Psychology Unit, Institute of Psychology & Leiden Institute for Brain and Cognition, Leiden University, Leiden, the Netherlands
10. Suzuki S, Zhang X, Dezfouli A, Braganza L, Fulcher BD, Parkes L, Fontenelle LF, Harrison BJ, Murawski C, Yücel M, Suo C. Individuals with problem gambling and obsessive-compulsive disorder learn through distinct reinforcement mechanisms. PLoS Biol 2023; 21:e3002031. [PMID: 36917567 PMCID: PMC10013903 DOI: 10.1371/journal.pbio.3002031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 02/08/2023] [Indexed: 03/16/2023] Open
Abstract
Obsessive-compulsive disorder (OCD) and pathological gambling (PG) are accompanied by deficits in behavioural flexibility. In reinforcement learning, this inflexibility can reflect asymmetric learning from outcomes above and below expectations. In alternative frameworks, it reflects perseveration independent of learning. Here, we examine evidence for asymmetric reward-learning in OCD and PG by leveraging model-based functional magnetic resonance imaging (fMRI). Compared with healthy controls (HC), OCD patients exhibited a lower learning rate for worse-than-expected outcomes, which was associated with the attenuated encoding of negative reward prediction errors in the dorsomedial prefrontal cortex and the dorsal striatum. PG patients showed higher and lower learning rates for better- and worse-than-expected outcomes, respectively, accompanied by higher encoding of positive reward prediction errors in the anterior insula than HC. Perseveration did not differ considerably between the patient groups and HC. These findings elucidate the neural computations of reward-learning that are altered in OCD and PG, providing a potential account of behavioural inflexibility in those mental disorders.
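The idea of asymmetric learning from better- and worse-than-expected outcomes mentioned in this abstract can be sketched with a delta rule that uses separate learning rates depending on the sign of the prediction error. This is a generic textbook-style illustration; the function name, the parameter values, and the simple reversal scenario are assumptions and do not reproduce the model fitted in the study.

```python
def asymmetric_update(value, reward, alpha_pos, alpha_neg):
    """Update a value estimate with a learning rate that depends on
    whether the outcome was better or worse than expected."""
    prediction_error = reward - value
    # Better-than-expected outcomes use alpha_pos, worse-than-expected use alpha_neg.
    alpha = alpha_pos if prediction_error >= 0 else alpha_neg
    return value + alpha * prediction_error


def value_after_reversal(alpha_pos, alpha_neg, n_rewarded=20, n_unrewarded=5):
    """Learn from a run of rewards, then from a short run of omissions
    (a hypothetical reversal), and return the final value estimate."""
    value = 0.0
    for _ in range(n_rewarded):
        value = asymmetric_update(value, 1.0, alpha_pos, alpha_neg)
    for _ in range(n_unrewarded):
        value = asymmetric_update(value, 0.0, alpha_pos, alpha_neg)
    return value


if __name__ == "__main__":
    # Hypothetical parameter settings, chosen only to contrast the two cases.
    symmetric = value_after_reversal(alpha_pos=0.3, alpha_neg=0.3)
    low_negative = value_after_reversal(alpha_pos=0.3, alpha_neg=0.05)
    print(round(symmetric, 3), round(low_negative, 3))
```

In this toy setting, a lower learning rate for negative prediction errors leaves the value estimate high after the reversal, which is one way such an asymmetry could show up as behavioural inflexibility.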
Affiliation(s)
- Shinsuke Suzuki
  - Centre for Brain, Mind and Markets, The University of Melbourne, Carlton, Australia
  - Center for the Promotion of Social Data Science Education and Research, Hitotsubashi University, Tokyo, Japan
- Xiaoliu Zhang
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Amir Dezfouli
  - Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, Australia
- Leah Braganza
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Ben D. Fulcher
  - School of Physics, The University of Sydney, Sydney, Australia
- Linden Parkes
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
  - Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Leonardo F. Fontenelle
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Ben J. Harrison
  - Melbourne Neuropsychiatry Centre, Department of Psychiatry, The University of Melbourne, Carlton, Australia
- Carsten Murawski
  - Centre for Brain, Mind and Markets, The University of Melbourne, Carlton, Australia
- Murat Yücel
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Chao Suo
  - BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
11. Heald JB, Lengyel M, Wolpert DM. Contextual inference in learning and memory. Trends Cogn Sci 2023; 27:43-64. [PMID: 36435674 PMCID: PMC9789331 DOI: 10.1016/j.tics.2022.10.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 05/24/2022] [Revised: 10/11/2022] [Accepted: 10/12/2022] [Indexed: 11/25/2022]
Abstract
Context is widely regarded as a major determinant of learning and memory across numerous domains, including classical and instrumental conditioning, episodic memory, economic decision-making, and motor learning. However, studies across these domains remain disconnected due to the lack of a unifying framework formalizing the concept of context and its role in learning. Here, we develop a unified vernacular allowing direct comparisons between different domains of contextual learning. This leads to a Bayesian model positing that context is unobserved and needs to be inferred. Contextual inference then controls the creation, expression, and updating of memories. This theoretical approach reveals two distinct components that underlie adaptation, proper and apparent learning, respectively referring to the creation and updating of memories versus time-varying adjustments in their expression. We review a number of extensions of the basic Bayesian model that allow it to account for increasingly complex forms of contextual learning.
Affiliation(s)
- James B Heald
  - Department of Neuroscience, Columbia University, New York, NY 10027, USA
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
- Máté Lengyel
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK
  - Center for Cognitive Computation, Department of Cognitive Science, Central European University, Budapest, Hungary
- Daniel M Wolpert
  - Department of Neuroscience, Columbia University, New York, NY 10027, USA
  - Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY 10027, USA
  - Computational and Biological Learning Lab, Department of Engineering, University of Cambridge, Cambridge, UK
12. Ma I, Westhoff B, van Duijvenvoorde ACK. Uncertainty about others' trustworthiness increases during adolescence and guides social information sampling. Sci Rep 2022; 12:7634. [PMID: 35538170 PMCID: PMC9091231 DOI: 10.1038/s41598-022-09477-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/25/2021] [Accepted: 03/15/2022] [Indexed: 01/11/2023] Open
Abstract
Adolescence is a key life phase for developing well-adjusted social behaviour. An essential component of well-adjusted social behaviour is the ability to update our beliefs about the trustworthiness of others based on gathered information. Here, we examined how adolescents (n = 157, 10-24 years) sequentially sampled information about the trustworthiness of peers and how they used this information to update their beliefs about others' trustworthiness. Our Bayesian computational modelling approach revealed an adolescence-emergent increase in uncertainty of prior beliefs about others' trustworthiness. As a consequence, early to mid-adolescents (ages 10-16) gradually relied less on their prior beliefs and more on the gathered evidence when deciding to sample more information, and when deciding to trust. We propose that these age-related differences could be adaptive to the rapidly changing social environment of early and mid-adolescents. Together, these findings contribute to the understanding of adolescent social development by revealing adolescent-emergent flexibility in prior beliefs about others that drives adolescents' information sampling and trust decisions.
Affiliation(s)
- I Ma
  - Department of Psychology, New York University, New York, USA
  - Institute of Psychology, Leiden University, Leiden, The Netherlands
  - Leiden Institute for Brain and Cognition, Leiden, The Netherlands
- B Westhoff
  - Institute of Psychology, Leiden University, Leiden, The Netherlands
  - Leiden Institute for Brain and Cognition, Leiden, The Netherlands
- A C K van Duijvenvoorde
  - Institute of Psychology, Leiden University, Leiden, The Netherlands
  - Leiden Institute for Brain and Cognition, Leiden, The Netherlands
13
|
Möller M, Manohar S, Bogacz R. Uncertainty-guided learning with scaled prediction errors in the basal ganglia. PLoS Comput Biol 2022; 18:e1009816. [PMID: 35622863 PMCID: PMC9182698 DOI: 10.1371/journal.pcbi.1009816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2022] [Revised: 06/09/2022] [Accepted: 05/05/2022] [Indexed: 11/19/2022]
Abstract
To accurately predict rewards associated with states or actions, the variability of observations has to be taken into account. In particular, when the observations are noisy, the individual rewards should have less influence on tracking of average reward, and the estimate of the mean reward should be updated to a smaller extent after each observation. However, it is not known how the magnitude of the observation noise might be tracked and used to control prediction updates in the brain reward system. Here, we introduce a new model that uses simple, tractable learning rules that track the mean and standard deviation of reward, and leverages prediction errors scaled by uncertainty as the central feedback signal. We show that the new model has an advantage over conventional reinforcement learning models in a value tracking task, and approaches a theoretic limit of performance provided by the Kalman filter. Further, we propose a possible biological implementation of the model in the basal ganglia circuit. In the proposed network, dopaminergic neurons encode reward prediction errors scaled by standard deviation of rewards. We show that such scaling may arise if the striatal neurons learn the standard deviation of rewards and modulate the activity of dopaminergic neurons. The model is consistent with experimental findings concerning dopamine prediction error scaling relative to reward magnitude, and with many features of striatal plasticity. Our results span across the levels of implementation, algorithm, and computation, and might have important implications for understanding the dopaminergic prediction error signal and its relation to adaptive and effective learning.
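The idea of scaling prediction errors by reward variability can be illustrated with a minimal sketch. It is not necessarily the exact learning rules of the paper: the function name `track_reward`, the learning rates `alpha_m` and `alpha_s`, and the specific update for the spread are assumptions chosen for illustration. The prediction error is divided by the current spread estimate, so noisier rewards move the mean estimate less, and the spread estimate settles where scaled errors have unit variance.

```python
import random


def track_reward(rewards, alpha_m=0.05, alpha_s=0.01, m0=0.0, s0=1.0):
    """Track the mean (m) and spread (s) of a reward stream.

    Hypothetical sketch: names and update rules are illustrative
    assumptions, not taken from the cited paper.
    """
    m, s = m0, s0
    for r in rewards:
        delta = (r - m) / s                # prediction error scaled by uncertainty
        m += alpha_m * delta               # smaller mean update when s is large
        s += alpha_s * (delta ** 2 - 1.0)  # s grows if scaled errors are too big, shrinks if too small
        s = max(s, 1e-3)                   # keep the spread estimate positive
    return m, s


# Simulated rewards with true mean 5 and true standard deviation 2.
rng = random.Random(0)
rewards = [rng.gauss(5.0, 2.0) for _ in range(20000)]
m, s = track_reward(rewards)
print(f"estimated mean = {m:.2f}, estimated spread = {s:.2f}")
```

With these settings the estimates should end up close to the true mean and standard deviation of the simulated stream; the residual wobble depends on the two learning rates.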
Affiliation(s)
- Moritz Möller
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
- Sanjay Manohar
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom
  - Department of Experimental Psychology, University of Oxford, Oxford, United Kingdom
- Rafal Bogacz
  - Nuffield Department of Clinical Neurosciences, University of Oxford, Oxford, United Kingdom

14
Williams B, Christakou A. Dissociable roles for the striatal cholinergic system in different flexibility contexts. IBRO Neurosci Rep 2022; 12:260-270. [PMID: 35481226 PMCID: PMC9035710 DOI: 10.1016/j.ibneur.2022.03.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2022] [Revised: 03/03/2022] [Accepted: 03/28/2022] [Indexed: 11/17/2022]
Abstract
The production of behavioural flexibility requires the coordination and integration of information from across the brain, by the dorsal striatum. In particular, the striatal cholinergic system is thought to be important for the modulation of striatal activity. Research from animal literature has shown that chemical inactivation of the dorsal striatum leads to impairments in reversal learning. Furthermore, proton magnetic resonance spectroscopy work has shown that the striatal cholinergic system is also important for reversal learning in humans. Here, we aim to assess whether the state of the dorsal striatal cholinergic system at rest is related to serial reversal learning in humans. We provide preliminary results showing that variability in choline in the dorsal striatum is significantly related to both the number of perseverative and regressive errors that participants make, and their rate of learning from positive and negative prediction errors. These findings, in line with previous work, suggest the resting state of dorsal striatal cholinergic system has important implications for producing flexible behaviour. However, these results also suggest the system may have heterogeneous functionality across different types of tasks measuring behavioural flexibility. These findings provide a starting point for further interrogation into understanding the functional role of the striatal cholinergic system in flexibility.
Highlights
- Striatal acetylcholine is important for behavioural flexibility in rodents & primates.
- Nascent evidence the striatal cholinergic system is important for human flexibility.
- 1H-MRS, reversal learning and reinforcement learning used to interrogate relationship.
- Striatal cholinergic system at rest is associated with direct and latent performance.
- Results specific to concentrations of striatal choline, and not other metabolites.
Affiliation(s)
- Brendan Williams
  - Centre for Integrative Neuroscience and Neurodynamics, University of Reading, UK
  - School of Psychology and Clinical Language Sciences, University of Reading, UK
  - Correspondence to: Centre for Integrative Neuroscience and Neurodynamics, Harry Pitt Building, University of Reading, Reading, Berkshire, UK.
- Anastasia Christakou
  - Centre for Integrative Neuroscience and Neurodynamics, University of Reading, UK
  - School of Psychology and Clinical Language Sciences, University of Reading, UK

15
Grzywacz NM, Aleem H. Does Amount of Information Support Aesthetic Values? Front Neurosci 2022; 16:805658. [PMID: 35392414 PMCID: PMC8982361 DOI: 10.3389/fnins.2022.805658] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/30/2021] [Accepted: 02/16/2022] [Indexed: 11/24/2022]
Abstract
Obtaining information from the world is important for survival. The brain, therefore, has special mechanisms to extract as much information as possible from sensory stimuli. Hence, given its importance, the amount of available information may underlie aesthetic values. Such information-based aesthetic values would be significant because they would compete with others to drive decision-making. In this article, we ask, "What is the evidence that amount of information supports aesthetic values?" An important concept in the measurement of informational volume is entropy. Research on aesthetic values has thus used Shannon entropy to evaluate the contribution of quantity of information. We review here the concepts of information and aesthetic values, and research on the visual and auditory systems to probe whether the brain uses entropy or other relevant measures, especially Fisher information, in aesthetic decisions. We conclude that information measures contribute to these decisions in two ways: first, the absolute quantity of information can modulate aesthetic preferences for certain sensory patterns. However, the preference for volume of information is highly individualized, with information measures competing with organizing principles, such as rhythm and symmetry. In addition, people tend to be resistant to too much entropy, but not necessarily to high amounts of Fisher information. We show that this resistance may stem in part from the distribution of amount of information in natural sensory stimuli. Second, the measurement of entropic-like quantities over time reveals that they can modulate aesthetic decisions by varying degrees of surprise given temporally integrated expectations. We propose that amount of information underpins complex aesthetic values, possibly informing the brain on the allocation of resources or the situational appropriateness of some cognitive models.
Affiliation(s)
- Norberto M. Grzywacz
  - Department of Psychology, Loyola University Chicago, Chicago, IL, United States
  - Department of Molecular Pharmacology and Neuroscience, Loyola University Chicago, Chicago, IL, United States
  - Interdisciplinary Program in Neuroscience, Georgetown University, Washington, DC, United States
- Hassan Aleem
  - Interdisciplinary Program in Neuroscience, Georgetown University, Washington, DC, United States

16
Katthagen T, Fromm S, Wieland L, Schlagenhauf F. Models of Dynamic Belief Updating in Psychosis-A Review Across Different Computational Approaches. Front Psychiatry 2022; 13:814111. [PMID: 35492702 PMCID: PMC9039658 DOI: 10.3389/fpsyt.2022.814111] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/12/2021] [Accepted: 02/18/2022] [Indexed: 11/20/2022]
Abstract
To understand the dysfunctional mechanisms underlying maladaptive reasoning of psychosis, computational models of decision making have widely been applied over the past decade. Thereby, a particular focus has been on the degree to which beliefs are updated based on new evidence, expressed by the learning rate in computational models. Higher order beliefs about the stability of the environment can determine the attribution of meaningfulness to events that deviate from existing beliefs by interpreting these either as noise or as true systematic changes (volatility). Both the inappropriate downplaying of important changes as noise (belief update too low) and the overly flexible adaptation to random events (belief update too high) were theoretically and empirically linked to symptoms of psychosis. Whereas models with fixed learning rates fail to adjust learning in reaction to dynamic changes, increasingly complex learning models have been adopted in samples with clinical and subclinical psychosis lately. These ranged from advanced reinforcement learning models, through fully Bayesian belief updating models, to approximations of fully Bayesian models with hierarchical learning or change point detection algorithms. It remains difficult to draw comparisons across findings of learning alterations in psychosis modeled by different approaches, e.g., the Hierarchical Gaussian Filter and change point detection. Therefore, this review aims to summarize and compare computational definitions and findings of dynamic belief updating without perceptual ambiguity in (sub)clinical psychosis across these different mathematical approaches. There was strong heterogeneity in tasks and samples. Overall, individuals with schizophrenia and delusion-proneness showed lower behavioral performance linked to failed differentiation between uninformative noise and environmental change. This was indicated by increased belief updating and an overestimation of volatility, which was associated with cognitive deficits. Correlational evidence for computational mechanisms and positive symptoms is still sparse and might diverge from the group finding of unstable beliefs. Based on the reviewed studies, we highlight some aspects to be considered to advance the field with regard to task design, modeling approach, and inclusion of participants across the psychosis spectrum. Taken together, our review shows that computational psychiatry offers powerful tools to advance our mechanistic insights into the cognitive anatomy of psychotic experiences.
Affiliation(s)
- Teresa Katthagen
  - Department of Psychiatry and Neurosciences, CCM, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
- Sophie Fromm
  - Department of Psychiatry and Neurosciences, CCM, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Einstein Center for Neurosciences, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Bernstein Center for Computational Neuroscience, Berlin, Germany
- Lara Wieland
  - Department of Psychiatry and Neurosciences, CCM, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Einstein Center for Neurosciences, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Bernstein Center for Computational Neuroscience, Berlin, Germany
- Florian Schlagenhauf
  - Department of Psychiatry and Neurosciences, CCM, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Einstein Center for Neurosciences, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany
  - Bernstein Center for Computational Neuroscience, Berlin, Germany
  - NeuroCure Clinical Research Center, Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin and Berlin Institute of Health, Berlin, Germany

17
Nicholas J, Daw ND, Shohamy D. Uncertainty alters the balance between incremental learning and episodic memory. eLife 2022; 11:81679. [PMID: 36458809 PMCID: PMC9810331 DOI: 10.7554/elife.81679] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/07/2022] [Accepted: 12/01/2022] [Indexed: 12/04/2022]
Abstract
A key question in decision-making is how humans arbitrate between competing learning and memory systems to maximize reward. We address this question by probing the balance between the effects, on choice, of incremental trial-and-error learning versus episodic memories of individual events. Although a rich literature has studied incremental learning in isolation, the role of episodic memory in decision-making has only recently drawn focus, and little research disentangles their separate contributions. We hypothesized that the brain arbitrates rationally between these two systems, relying on each in circumstances to which it is most suited, as indicated by uncertainty. We tested this hypothesis by directly contrasting contributions of episodic and incremental influence to decisions, while manipulating the relative uncertainty of incremental learning using a well-established manipulation of reward volatility. Across two large, independent samples of young adults, participants traded these influences off rationally, depending more on episodic information when incremental summaries were more uncertain. These results support the proposal that the brain optimizes the balance between different forms of learning and memory according to their relative uncertainties and elucidate the circumstances under which episodic memory informs decisions.
Affiliation(s)
- Jonathan Nicholas
  - Department of Psychology, Columbia University, New York, United States
  - Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, United States
- Nathaniel D Daw
  - Department of Psychology, Princeton University, Princeton, United States
  - Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Daphna Shohamy
  - Department of Psychology, Columbia University, New York, United States
  - Mortimer B. Zuckerman Mind, Brain, Behavior Institute, Columbia University, New York, United States
  - The Kavli Institute for Brain Science, Columbia University, New York, United States

18
Soltani A, Koechlin E. Computational models of adaptive behavior and prefrontal cortex. Neuropsychopharmacology 2022; 47:58-71. [PMID: 34389808 PMCID: PMC8617006 DOI: 10.1038/s41386-021-01123-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 06/10/2021] [Revised: 07/19/2021] [Accepted: 07/20/2021] [Indexed: 02/07/2023]
Abstract
The real world is uncertain, and while ever changing, it constantly presents itself in terms of new sets of behavioral options. To attain the flexibility required to tackle these challenges successfully, most mammalian brains are equipped with certain computational abilities that rely on the prefrontal cortex (PFC). By examining learning in terms of internal models associating stimuli, actions, and outcomes, we argue here that adaptive behavior relies on specific interactions between multiple systems including: (1) selective models learning stimulus-action associations through rewards; (2) predictive models learning stimulus- and/or action-outcome associations through statistical inferences anticipating behavioral outcomes; and (3) contextual models learning external cues associated with latent states of the environment. Critically, the PFC combines these internal models by forming task sets to drive behavior and, moreover, constantly evaluates the reliability of actor task sets in predicting external contingencies to switch between task sets or create new ones. We review different models of adaptive behavior to demonstrate how their components map onto this unifying framework and specific PFC regions. Finally, we discuss how our framework may help to better understand the neural computations and the cognitive architecture of PFC regions guiding adaptive behavior.
Affiliation(s)
- Alireza Soltani
  - Department of Psychological and Brain Sciences, Dartmouth College, Hanover, NH, USA.
- Etienne Koechlin
  - Institut National de la Sante et de la Recherche Medicale, Universite Pierre et Marie Curie, Ecole Normale Superieure, Paris, France.

19
Yoo AH, Collins AGE. How Working Memory and Reinforcement Learning Are Intertwined: A Cognitive, Neural, and Computational Perspective. J Cogn Neurosci 2021; 34:551-568. [PMID: 34942642 DOI: 10.1162/jocn_a_01808] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 11/04/2022]
Abstract
Reinforcement learning and working memory are two core processes of human cognition and are often considered cognitively, neuroscientifically, and algorithmically distinct. Here, we show that the brain networks that support them actually overlap significantly and that they are less distinct cognitive processes than often assumed. We review literature demonstrating the benefits of considering each process to explain properties of the other and highlight recent work investigating their more complex interactions. We discuss how future research in both computational and cognitive sciences can benefit from one another, suggesting that a key missing piece for artificial agents to learn to behave with more human-like efficiency is taking working memory's role in learning seriously. This review highlights the risks of neglecting the interplay between different processes when studying human behavior (in particular when considering individual differences). We emphasize the importance of investigating these dynamics to build a comprehensive understanding of human cognition.

20
Piray P, Daw ND. A model for learning based on the joint estimation of stochasticity and volatility. Nat Commun 2021; 12:6587. [PMID: 34782597 PMCID: PMC8592992 DOI: 10.1038/s41467-021-26731-9] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/23/2021] [Accepted: 10/08/2021] [Indexed: 02/08/2023]
Abstract
Previous research has stressed the importance of uncertainty for controlling the speed of learning, and how such control depends on the learner inferring the noise properties of the environment, especially volatility: the speed of change. However, learning rates are jointly determined by the comparison between volatility and a second factor, moment-to-moment stochasticity. Yet much previous research has focused on simplified cases corresponding to estimation of either factor alone. Here, we introduce a learning model, in which both factors are learned simultaneously from experience, and use the model to simulate human and animal data across many seemingly disparate neuroscientific and behavioral phenomena. By considering the full problem of joint estimation, we highlight a set of previously unappreciated issues, arising from the mutual interdependence of inference about volatility and stochasticity. This interdependence complicates and enriches the interpretation of previous results, such as pathological learning in individuals with anxiety and following amygdala damage.
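The claim that learning rates depend on the comparison between volatility and stochasticity can be illustrated with a standard Kalman filter for a random-walk hidden state. This is a minimal sketch under the assumption that both quantities are known and fixed, whereas the cited model learns them jointly from experience; the function name `steady_state_learning_rate` and its parameters are hypothetical.

```python
def steady_state_learning_rate(volatility, stochasticity, n_steps=500):
    """Iterate the Kalman filter variance recursion for a random-walk
    hidden state and return the learning rate (Kalman gain) it settles on.

    volatility:    variance of the hidden state's step-to-step diffusion
    stochasticity: variance of the observation noise
    Both are assumed known here, which is a simplification.
    """
    w = 1.0  # posterior variance of the estimate (arbitrary starting value)
    k = 0.0
    for _ in range(n_steps):
        prior_var = w + volatility                   # uncertainty grows as the state drifts
        k = prior_var / (prior_var + stochasticity)  # learning rate (Kalman gain)
        w = (1.0 - k) * prior_var                    # uncertainty shrinks after each observation
    return k


baseline = steady_state_learning_rate(volatility=1.0, stochasticity=1.0)
more_volatile = steady_state_learning_rate(volatility=2.0, stochasticity=1.0)
more_noisy = steady_state_learning_rate(volatility=1.0, stochasticity=2.0)
print(baseline, more_volatile, more_noisy)
```

Raising volatility pushes the learning rate up and raising stochasticity pushes it down. This is why a learner who misattributes noise to change, or change to noise, will update too much or too little.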
Affiliation(s)
- Payam Piray
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA.
- Nathaniel D Daw
  - Princeton Neuroscience Institute and Department of Psychology, Princeton University, Princeton, NJ, USA

21
Marković D, Stojić H, Schwöbel S, Kiebel SJ. An empirical evaluation of active inference in multi-armed bandits. Neural Netw 2021; 144:229-246. [PMID: 34507043 DOI: 10.1016/j.neunet.2021.08.018] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/27/2021] [Revised: 07/07/2021] [Accepted: 08/11/2021] [Indexed: 10/20/2022]
Abstract
A key feature of sequential decision making under uncertainty is a need to balance between exploiting (choosing the best action according to the current knowledge) and exploring (obtaining information about values of other actions). The multi-armed bandit problem, a classical task that captures this trade-off, served as a vehicle in machine learning for developing bandit algorithms that proved to be useful in numerous industrial applications. The active inference framework, an approach to sequential decision making recently developed in neuroscience for understanding human and animal behaviour, is distinguished by its sophisticated strategy for resolving the exploration-exploitation trade-off. This makes active inference an exciting alternative to already established bandit algorithms. Here we derive an efficient and scalable approximate active inference algorithm and compare it to two state-of-the-art bandit algorithms: Bayesian upper confidence bound and optimistic Thompson sampling. This comparison is done on two types of bandit problems: a stationary and a dynamic switching bandit. Our empirical evaluation shows that the active inference algorithm does not produce efficient long-term behaviour in stationary bandits. However, in the more challenging switching bandit problem active inference performs substantially better than the two state-of-the-art bandit algorithms. The results open exciting avenues for further research in theoretical and applied machine learning, as well as lend additional credibility to active inference as a general framework for studying human and animal behaviour.
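The exploration-exploitation trade-off in a stationary bandit can be sketched with plain Thompson sampling on Bernoulli arms. This is a simplified relative of the optimistic Thompson sampling baseline named in the abstract, not the paper's active inference algorithm; the function name `thompson_sampling`, the reward probabilities and the uniform Beta(1, 1) priors are assumptions for illustration.

```python
import random


def thompson_sampling(reward_probs, n_trials=2000, seed=0):
    """Run Beta-Bernoulli Thompson sampling on a stationary bandit.

    reward_probs: true (hidden) reward probability of each arm
    Returns the number of times each arm was chosen.
    """
    rng = random.Random(seed)
    n_arms = len(reward_probs)
    successes = [0] * n_arms
    failures = [0] * n_arms
    pulls = [0] * n_arms
    for _ in range(n_trials):
        # Draw one plausible reward rate per arm from its Beta posterior
        # (Beta(1, 1) prior); uncertain arms sometimes draw high, which
        # produces exploration without an explicit exploration bonus.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        arm = max(range(n_arms), key=lambda a: samples[a])
        rewarded = rng.random() < reward_probs[arm]
        if rewarded:
            successes[arm] += 1
        else:
            failures[arm] += 1
        pulls[arm] += 1
    return pulls


pulls = thompson_sampling([0.2, 0.5, 0.8])
print(pulls)
```

As the posteriors sharpen, the choices should concentrate on the arm with the highest reward probability. In a switching bandit the same code would adapt slowly, because old observations are never discounted.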
Affiliation(s)
- Dimitrije Marković
  - Faculty of Psychology, Technische Universität Dresden, 01062 Dresden, Germany; Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, 01062 Dresden, Germany.
- Hrvoje Stojić
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, 10-12 Russell Square, London, WC1B 5EH, United Kingdom; Secondmind, 72 Hills Rd, Cambridge, CB2 1LA, United Kingdom
- Sarah Schwöbel
  - Faculty of Psychology, Technische Universität Dresden, 01062 Dresden, Germany
- Stefan J Kiebel
  - Faculty of Psychology, Technische Universität Dresden, 01062 Dresden, Germany; Centre for Tactile Internet with Human-in-the-Loop (CeTI), Technische Universität Dresden, 01062 Dresden, Germany

22
Liu M, Dong W, Qin S, Verguts T, Chen Q. Electrophysiological Signatures of Hierarchical Learning. Cereb Cortex 2021; 32:626-639. [PMID: 34339505 DOI: 10.1093/cercor/bhab245] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/09/2021] [Revised: 06/26/2021] [Accepted: 06/27/2021] [Indexed: 11/13/2022]
Abstract
Human perception and learning is thought to rely on a hierarchical generative model that is continuously updated via precision-weighted prediction errors (pwPEs). However, the neural basis of such cognitive process and how it unfolds during decision-making remain poorly understood. To investigate this question, we combined a hierarchical Bayesian model (i.e., Hierarchical Gaussian Filter [HGF]) with electroencephalography (EEG), while participants performed a probabilistic reversal learning task in alternatingly stable and volatile environments. Behaviorally, the HGF fitted significantly better than two control, nonhierarchical, models. Neurally, low-level and high-level pwPEs were independently encoded by the P300 component. Low-level pwPEs were reflected in the theta (4-8 Hz) frequency band, but high-level pwPEs were not. Furthermore, the expressions of high-level pwPEs were stronger for participants with better HGF fit. These results indicate that the brain employs hierarchical learning and encodes both low- and high-level learning signals separately and adaptively.
Affiliation(s)
- Meng Liu
  - Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, 510631 Guangzhou, China
  - School of Psychology, South China Normal University, 510631 Guangzhou, China
  - Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China
  - Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China
- Wenshan Dong
  - Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, 510631 Guangzhou, China
  - School of Psychology, South China Normal University, 510631 Guangzhou, China
  - Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China
  - Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China
- Shaozheng Qin
  - State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, 100875 Beijing, China
- Tom Verguts
  - Department of Experimental Psychology, Ghent University, B-9000 Ghent, Belgium
- Qi Chen
  - Key Laboratory of Brain, Cognition and Education Sciences (South China Normal University), Ministry of Education, 510631 Guangzhou, China
  - School of Psychology, South China Normal University, 510631 Guangzhou, China
  - Center for Studies of Psychological Application, South China Normal University, 510631 Guangzhou, China
  - Guangdong Key Laboratory of Mental Health and Cognitive Science, South China Normal University, 510631 Guangzhou, China

23
Haarsma J, Harmer CJ, Tamm S. A continuum hypothesis of psychotomimetic rapid antidepressants. Brain Neurosci Adv 2021; 5:23982128211007772. [PMID: 34017922 PMCID: PMC8114748 DOI: 10.1177/23982128211007772] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/30/2020] [Accepted: 03/08/2021] [Indexed: 01/10/2023]
Abstract
Ketamine, classical psychedelics and sleep deprivation are associated with rapid effects on depression. Interestingly, these interventions also have common psychotomimetic actions, mirroring aspects of psychosis such as an altered sense of self, perceptual distortions and distorted thinking. This raises the question whether these interventions might be acute antidepressants through the same mechanisms that underlie some of their psychotomimetic effects. That is, perhaps some symptoms of depression can be understood as occupying the opposite end of a spectrum where elements of psychosis can be found on the other side. This review aims at reviewing the evidence underlying a proposed continuum hypothesis of psychotomimetic rapid antidepressants, suggesting that a range of psychotomimetic interventions are also acute antidepressants as well as trying to explain these common features in a hierarchical predictive coding framework, where we hypothesise that these interventions share a common mechanism by increasing the flexibility of prior expectations. Neurobiological mechanisms at play and the role of different neuromodulatory systems affected by these interventions and their role in controlling the precision of prior expectations and new sensory evidence will be reviewed. The proposed hypothesis will also be discussed in relation to other existing theories of antidepressants. We also suggest a number of novel experiments to test the hypothesis and highlight research areas that could provide further insights, in the hope to better understand the acute antidepressant properties of these interventions.
Affiliation(s)
- Joost Haarsma
  - Wellcome Centre for Human Neuroimaging, University College London, London, UK
- Catherine J Harmer
  - Department of Psychiatry and Oxford Health NHS Foundation Trust, Warneford Hospital, University of Oxford, Oxford, UK
- Sandra Tamm
  - Department of Psychiatry and Oxford Health NHS Foundation Trust, Warneford Hospital, University of Oxford, Oxford, UK
  - Stress Research Institute, Department of Psychology, Stockholm University, Stockholm, Sweden
  - Department of Clinical Neuroscience, Karolinska Institute, Stockholm, Sweden

24
Reed EJ, Uddenberg S, Suthaharan P, Mathys CD, Taylor JR, Groman SM, Corlett PR. Paranoia as a deficit in non-social belief updating. eLife 2020; 9:56345. [PMID: 32452769 PMCID: PMC7326495 DOI: 10.7554/elife.56345] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 02/24/2020] [Accepted: 05/22/2020] [Indexed: 12/14/2022]
Abstract
Paranoia is the belief that harm is intended by others. It may arise from selective pressures to infer and avoid social threats, particularly in ambiguous or changing circumstances. We propose that uncertainty may be sufficient to elicit learning differences in paranoid individuals, without social threat. We used reversal learning behavior and computational modeling to estimate belief updating across individuals with and without mental illness, online participants, and rats chronically exposed to methamphetamine, an elicitor of paranoia in humans. Paranoia is associated with a stronger prior on volatility, accompanied by elevated sensitivity to perceived changes in the task environment. Methamphetamine exposure in rats recapitulates this impaired uncertainty-driven belief updating and rigid anticipation of a volatile environment. Our work provides evidence of fundamental, domain-general learning differences in paranoid individuals. This paradigm enables further assessment of the interplay between uncertainty and belief-updating across individuals and species.

Everyone has had fleeting concerns that others might be against them at some point in their lives. Sometimes these concerns can escalate into paranoia and become debilitating. Paranoia is a common symptom in serious mental illnesses like schizophrenia. It can cause extreme distress and is linked with an increased risk of violence towards oneself or others. Understanding what happens in the brains of people experiencing paranoia might lead to better ways to treat or manage it. Some experts argue that paranoia is caused by errors in the way people assess social situations. An alternative idea is that paranoia stems from the way the brain forms and updates beliefs about the world. Now, Reed et al. show that both people with paranoia and rats exposed to a paranoia-inducing substance expect the world will change frequently, change their minds often, and have a harder time learning in response to changing circumstances. In the experiments, human volunteers with and without psychiatric disorders played a game where the best choices change. Then, the participants completed a survey to assess their level of paranoia. People with higher levels of paranoia predicted more changes would occur and made less predictable choices. In a second set of experiments, rats were put in a cage with three holes where they sometimes received sugar rewards. Some of the rats received methamphetamine, a drug that causes paranoia in humans. Rats given the drug also expected the location of the sugar reward would change often. The drugged animals had a harder time learning and adapting to changing circumstances. The experiments suggest that brain processes found in both rats, which are less social than humans, and humans contribute to paranoia. This suggests paranoia may make it harder to update beliefs. This may help scientists understand what causes paranoia and develop therapies or drugs that can reduce paranoia. This information may also help scientists understand why during societal crises like wars or natural disasters humans are prone to believing conspiracies. This is particularly important now as the world grapples with climate change and a global pandemic. Reed et al. note paranoia may impede the coordination of collaborative solutions to these challenging situations.
Affiliation(s)
- Erin J Reed
- Interdepartmental Neuroscience Program, Yale School of Medicine, New Haven, United States; Yale MD-PhD Program, Yale School of Medicine, New Haven, United States
- Stefan Uddenberg
- Princeton Neuroscience Institute, Princeton University, Princeton, United States
- Praveen Suthaharan
- Department of Psychiatry, Connecticut Mental Health Center, Yale University, New Haven, United States
- Christoph D Mathys
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy; Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Jane R Taylor
- Department of Psychiatry, Connecticut Mental Health Center, Yale University, New Haven, United States
- Stephanie Mary Groman
- Department of Psychiatry, Connecticut Mental Health Center, Yale University, New Haven, United States
- Philip R Corlett
- Department of Psychiatry, Connecticut Mental Health Center, Yale University, New Haven, United States