1
Weydmann G, Palmieri I, Simões RAG, Buchmann S, Schmidt E, Alves P, Bizarro L. Disentangling negative reinforcement, working memory, and deductive reasoning deficits in elevated BMI. Prog Neuropsychopharmacol Biol Psychiatry 2024; 136:111173. [PMID: 39401563 DOI: 10.1016/j.pnpbp.2024.111173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2024] [Revised: 10/11/2024] [Accepted: 10/11/2024] [Indexed: 10/19/2024]
Abstract
Neuropsychological data suggest that being overweight or obese is associated with a tendency to perseverate behavior despite negative feedback. This deficit might be observed due to other cognitive factors, such as working memory (WM) deficits or decreased ability to deduce model-based strategies when learning by trial-and-error. In the present study, a group of subjects with overweight or obesity (Ow/Ob, n = 30) was compared to normal-weight individuals (n = 42) in a modified Reinforcement Learning (RL) task. The task was designed to control WM effects on learning by manipulating cognitive load and to foster model-based learning via deductive reasoning. Computational modelling and analysis were conducted to isolate parameters related to RL mechanisms, WM use, and model-based learning (deduction parameter). Results showed that subjects with Ow/Ob had a higher number of perseverative errors and used a weaker deduction mechanism in their performance than control individuals, indicating impairments in negative reinforcement and model-based learning, whereas WM impairments were not responsible for deficits in RL. The present data suggests that obesity is associated with impairments in negative reinforcement and model-based learning.
Affiliation(s)
- Gibson Weydmann
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil; Universidade La Salle, Canoas, Brazil.
- Igor Palmieri
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
- Reinaldo A G Simões
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
- Samara Buchmann
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
- Eduardo Schmidt
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
- Paulina Alves
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
- Lisiane Bizarro
- Programa de Pós-Graduação em Psicologia, Universidade Federal do Rio Grande do Sul (UFRGS), Rua Ramiro Barcelos 2600, Porto Alegre, Brazil
2
Venditto SJC, Miller KJ, Brody CD, Daw ND. Dynamic reinforcement learning reveals time-dependent shifts in strategy during reward learning. bioRxiv: the preprint server for biology 2024:2024.02.28.582617. [PMID: 38464244 PMCID: PMC10925334 DOI: 10.1101/2024.02.28.582617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Different brain systems have been hypothesized to subserve multiple "experts" that compete to generate behavior. In reinforcement learning, two general processes, one model-free (MF) and one model-based (MB), are often modeled as a mixture of agents (MoA) and hypothesized to capture differences between automaticity vs. deliberation. However, shifts in strategy cannot be captured by a static MoA. To investigate such dynamics, we present the mixture-of-agents hidden Markov model (MoA-HMM), which simultaneously learns inferred action values from a set of agents and the temporal dynamics of underlying "hidden" states that capture shifts in agent contributions over time. Applying this model to a multi-step, reward-guided task in rats reveals a progression of within-session strategies: a shift from initial MB exploration to MB exploitation, and finally to reduced engagement. The inferred states predict changes in both response time and OFC neural encoding during the task, suggesting that these states are capturing real shifts in dynamics.
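As a rough illustration of the mixture-of-agents idea summarized in this abstract, the sketch below combines the action values of a model-based and a model-free agent through a weighted softmax, then shows how state-dependent weights (the HMM extension) would change the choice probabilities. It is not the authors' implementation: the function name, the example values, the weights, and the state labels are all assumptions made for illustration.

```python
import math


def moa_choice_probs(agent_values, weights):
    """Softmax over the weighted sum of each agent's action values.

    agent_values: one list of action values per agent (hypothetical inputs).
    weights: one mixing weight per agent (hypothetical parameters).
    """
    n_actions = len(agent_values[0])
    # Net value of each action is the weighted sum across agents.
    net = [
        sum(w * q[a] for w, q in zip(weights, agent_values))
        for a in range(n_actions)
    ]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(net)
    exps = [math.exp(v - peak) for v in net]
    total = sum(exps)
    return [e / total for e in exps]


# Illustrative action values for two actions (made-up numbers).
q_mb = [0.8, 0.2]  # model-based agent
q_mf = [0.4, 0.6]  # model-free agent

# Static mixture: a single weight vector for the whole session.
probs_static = moa_choice_probs([q_mb, q_mf], weights=[3.0, 1.0])

# HMM-style extension: each hidden state carries its own weight vector,
# so the same action values yield different choice probabilities.
state_weights = {"engaged": [3.0, 1.0], "disengaged": [0.2, 0.2]}
probs_by_state = {
    state: moa_choice_probs([q_mb, q_mf], w)
    for state, w in state_weights.items()
}
```

In a full hidden Markov model the state would be inferred from the choice sequence rather than supplied by hand; here the states are fixed only to show how the weights shape behavior.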
3
Moskovitz T, Miller KJ, Sahani M, Botvinick MM. Understanding dual process cognition via the minimum description length principle. PLoS Comput Biol 2024; 20:e1012383. [PMID: 39423224 PMCID: PMC11534269 DOI: 10.1371/journal.pcbi.1012383] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/04/2024] [Accepted: 08/01/2024] [Indexed: 10/21/2024] Open
Abstract
Dual-process theories play a central role in both psychology and neuroscience, figuring prominently in domains ranging from executive control to reward-based learning to judgment and decision making. In each of these domains, two mechanisms appear to operate concurrently, one relatively high in computational complexity, the other relatively simple. Why is neural information processing organized in this way? We propose an answer to this question based on the notion of compression. The key insight is that dual-process structure can enhance adaptive behavior by allowing an agent to minimize the description length of its own behavior. We apply a single model based on this observation to findings from research on executive control, reward-based learning, and judgment and decision making, showing that seemingly diverse dual-process phenomena can be understood as domain-specific consequences of a single underlying set of computational principles.
Affiliation(s)
- Ted Moskovitz
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom
- Kevin J. Miller
- Google DeepMind, London, United Kingdom
- Department of Ophthalmology, University College London, London, United Kingdom
- Maneesh Sahani
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Matthew M. Botvinick
- Gatsby Computational Neuroscience Unit, University College London, London, United Kingdom
- Google DeepMind, London, United Kingdom
4
Sankhe P, Haruno M. Model-free decision-making underlies motor errors in rapid sequential movements under threat. Communications Psychology 2024; 2:81. [PMID: 39242765 PMCID: PMC11347585 DOI: 10.1038/s44271-024-00123-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Accepted: 07/30/2024] [Indexed: 09/09/2024]
Abstract
Our movements, especially sequential ones, are usually goal-directed, i.e., coupled with task-level goals. Consequently, cognitive strategies for decision-making and motor performance are likely to influence each other. However, evidence linking decision-making strategies and motor performance remains elusive. Here, we designed a modified version of the two-step task, named the two-step sequential movement task, where participants had to conduct rapid sequential finger movements to obtain rewards (n = 40). In the shock session, participants received an electrical shock if they made an erroneous or slow movement, while in the no-shock session, they only received zero reward. We found that participants who prioritised model-free decision-making committed more motor errors in the presence of the shock stimulus (shock sessions) than those who prioritised model-based decision-making. Using a mediation analysis, we also revealed a strong link between the balance of the model-based and the model-free learning strategies and sequential movement performances. These results suggested that model-free decision-making produces more motor errors than model-based decision-making in rapid sequential movements under the threat of stressful stimuli.
Affiliation(s)
- Pranav Sankhe
- Center for Information and Neural Networks, NICT, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan.
- Institute of Cognitive Neuroscience, University College London, 17-19 Queen Square, London, WC1N 3AZ, UK.
- Masahiko Haruno
- Center for Information and Neural Networks, NICT, 1-4 Yamadaoka, Suita, Osaka, 565-0871, Japan.
- Graduate School of Frontier Biosciences, Osaka University, 1-3 Yamadaoka, Suita, Osaka, 565-0871, Japan.
5
Dubinsky JM, Hamid AA. The neuroscience of active learning and direct instruction. Neurosci Biobehav Rev 2024; 163:105737. [PMID: 38796122 DOI: 10.1016/j.neubiorev.2024.105737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Revised: 05/13/2024] [Accepted: 05/20/2024] [Indexed: 05/28/2024]
Abstract
Throughout the educational system, students experiencing active learning pedagogy perform better and fail less than those taught through direct instruction. Can this be ascribed to differences in learning from a neuroscientific perspective? This review examines mechanistic, neuroscientific evidence that might explain differences in cognitive engagement contributing to learning outcomes between these instructional approaches. In classrooms, direct instruction comprehensively describes academic content, while active learning provides structured opportunities for learners to explore, apply, and manipulate content. Synaptic plasticity and its modulation by arousal or novelty are central to all learning and both approaches. As a form of social learning, direct instruction relies upon working memory. The reinforcement learning circuit, associated agency, curiosity, and peer-to-peer social interactions combine to enhance motivation, improve retention, and build higher-order-thinking skills in active learning environments. When working memory becomes overwhelmed, additionally engaging the reinforcement learning circuit improves retention, providing an explanation for the benefits of active learning. This analysis provides a mechanistic examination of how emerging neuroscience principles might inform pedagogical choices at all educational levels.
Affiliation(s)
- Janet M Dubinsky
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA.
- Arif A Hamid
- Department of Neuroscience, University of Minnesota, Minneapolis, MN, USA
6
Kim T, Lee SW, Lho SK, Moon SY, Kim M, Kwon JS. Neurocomputational model of compulsivity: deviating from an uncertain goal-directed system. Brain 2024; 147:2230-2244. [PMID: 38584499 PMCID: PMC11146420 DOI: 10.1093/brain/awae102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Revised: 02/18/2024] [Accepted: 03/07/2024] [Indexed: 04/09/2024] Open
Abstract
Despite a theory that an imbalance in goal-directed versus habitual systems serves as a building block of compulsions, research has yet to delineate how this occurs during arbitration between the two systems in obsessive-compulsive disorder. Inspired by a brain model in which the inferior frontal cortex selectively gates the putamen to guide goal-directed or habitual actions, this study aimed to examine whether disruptions in the arbitration process via the fronto-striatal circuit would underlie imbalanced decision-making and compulsions in patients. Thirty patients with obsessive-compulsive disorder [mean (standard deviation) age = 26.93 (6.23) years, 12 females (40%)] and 30 healthy controls [mean (standard deviation) age = 24.97 (4.72) years, 17 females (57%)] underwent functional MRI scans while performing the two-step Markov decision task, which was designed to dissociate goal-directed behaviour from habitual behaviour. We employed a neurocomputational model to account for an uncertainty-based arbitration process, in which a prefrontal arbitrator (i.e. inferior frontal gyrus) allocates behavioural control to a more reliable strategy by selectively gating the putamen. We analysed group differences in the neural estimates of uncertainty of each strategy. We also compared the psychophysiological interaction effects of system preference (goal-directed versus habitual) on fronto-striatal coupling between groups. We examined the correlation between compulsivity score and the neural activity and connectivity involved in the arbitration process. The computational model captured the subjects' preferences between the strategies. Compared with healthy controls, patients had a stronger preference for the habitual system (t = -2.88, P = 0.006), which was attributed to a more uncertain goal-directed system (t = 2.72, P = 0.009). Before the allocation of controls, patients exhibited hypoactivity in the inferior frontal gyrus compared with healthy controls when this region tracked the inverse of uncertainty (i.e. reliability) of goal-directed behaviour (P = 0.001, family-wise error rate corrected). When reorienting behaviours to reach specific goals, patients exhibited weaker right ipsilateral ventrolateral prefronto-putamen coupling than healthy controls (P = 0.001, family-wise error rate corrected). This hypoconnectivity was correlated with more severe compulsivity (r = -0.57, P = 0.002). Our findings suggest that the attenuated top-down control of the putamen by the prefrontal arbitrator underlies compulsivity in obsessive-compulsive disorder. Enhancing fronto-striatal connectivity may be a potential neurotherapeutic approach for compulsivity and adaptive decision-making.
Affiliation(s)
- Taekwan Kim
- Department of Brain and Cognitive Sciences, Seoul National University College of Natural Sciences, Seoul 08826, Republic of Korea
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Center for Neuroscience-inspired Artificial Intelligence, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Sang Wan Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Center for Neuroscience-inspired Artificial Intelligence, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Kim Jaechul Graduate School of AI, Korea Advanced Institute of Science and Technology, Daejeon 34141, Republic of Korea
- Silvia Kyungjin Lho
- Department of Neuropsychiatry, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Sun-Young Moon
- Department of Neuropsychiatry, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Minah Kim
- Department of Neuropsychiatry, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Department of Psychiatry, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
- Jun Soo Kwon
- Department of Brain and Cognitive Sciences, Seoul National University College of Natural Sciences, Seoul 08826, Republic of Korea
- Department of Neuropsychiatry, Seoul National University Hospital, Seoul 03080, Republic of Korea
- Department of Psychiatry, Seoul National University College of Medicine, Seoul 03080, Republic of Korea
7
Hackel LM, Kalkstein DA, Mende-Siedlecki P. Simplifying social learning. Trends Cogn Sci 2024; 28:428-440. [PMID: 38331595 DOI: 10.1016/j.tics.2024.01.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 01/16/2024] [Accepted: 01/17/2024] [Indexed: 02/10/2024]
Abstract
Social learning is complex, but people often seem to navigate social environments with ease. This ability creates a puzzle for traditional accounts of reinforcement learning (RL) that assume people negotiate a tradeoff between easy-but-simple behavior (model-free learning) and complex-but-difficult behavior (e.g., model-based learning). We offer a theoretical framework for resolving this puzzle: although social environments are complex, people have social expertise that helps them behave flexibly with low cognitive cost. Specifically, by using familiar concepts instead of focusing on novel details, people can turn hard learning problems into simpler ones. This ability highlights social learning as a prototype for studying cognitive simplicity in the face of environmental complexity and identifies a role for conceptual knowledge in everyday reward learning.
Affiliation(s)
- Leor M Hackel
- University of Southern California, Los Angeles, CA 90089, USA.
8
Marzuki AA, Lim TV. Bridging minds and policies: supporting early career researchers in translating computational psychiatry research. Neuropsychopharmacology 2024; 49:903-904. [PMID: 38418567 PMCID: PMC11039629 DOI: 10.1038/s41386-024-01834-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2023] [Revised: 02/12/2024] [Accepted: 02/15/2024] [Indexed: 03/01/2024]
Affiliation(s)
- Aleya A Marzuki
- Department of Psychology, Sunway University, Petaling Jaya, Selangor, Malaysia.
- Department of Psychiatry and Psychotherapy, Medical School and University Hospital, Eberhard Karls University of Tübingen, Tübingen, Germany.
- German Center for Mental Health (DZPG), Tübingen, Germany.
- Tsen Vei Lim
- Department of Psychiatry, University of Cambridge, Cambridge, UK.
9
Kastner DB, Williams G, Holobetz C, Romano JP, Dayan P. The choice-wide behavioral association study: data-driven identification of interpretable behavioral components. bioRxiv: the preprint server for biology 2024:2024.02.26.582115. [PMID: 38464037 PMCID: PMC10925091 DOI: 10.1101/2024.02.26.582115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Behavior contains rich structure across many timescales, but there is a dearth of methods to identify relevant components, especially over the longer periods required for learning and decision-making. Inspired by the goals and techniques of genome-wide association studies, we present a data-driven method, the choice-wide behavioral association study (CBAS), that systematically identifies such behavioral features. CBAS uses a powerful, resampling-based method of multiple-comparisons correction to identify sequences of actions or choices that either differ significantly between groups or significantly correlate with a covariate of interest. We apply CBAS to different tasks and species (flies, rats, and humans) and find, in all instances, that it provides interpretable information about each behavioral task.
Affiliation(s)
- David B. Kastner
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, CA 94143, USA
- Lead Contact
- Greer Williams
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, CA 94143, USA
- Cristofer Holobetz
- Department of Psychiatry and Behavioral Sciences, University of California, San Francisco, CA 94143, USA
- Joseph P. Romano
- Department of Statistics, Stanford University, Stanford, CA 94305, USA
- Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen 72076, Germany
10
Nagel J, Morgan DP, Gürsoy NÇ, Sander S, Kern S, Feld GB. Memory for rewards guides retrieval. Communications Psychology 2024; 2:31. [PMID: 39242930 PMCID: PMC11332070 DOI: 10.1038/s44271-024-00074-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Accepted: 03/11/2024] [Indexed: 09/09/2024]
Abstract
Rewards paid out for successful retrieval motivate the formation of long-term memory. However, it has been argued that the Motivated Learning Task does not measure reward effects on memory strength but decision-making during retrieval. We report three large-scale online experiments in healthy participants (N = 200, N = 205, N = 187) that inform this debate. In experiment 1, we found that explicit stimulus-reward associations formed during encoding influence response strategies at retrieval. In experiment 2, reward affected memory strength and decision-making strategies. In experiment 3, reward affected decision-making strategies only. These data support a theoretical framework that assumes that promised rewards not only increase memory strength, but additionally lead to the formation of stimulus-reward associations that influence decisions at retrieval.
Affiliation(s)
- Juliane Nagel
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- David Philip Morgan
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Necati Çağatay Gürsoy
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Samuel Sander
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Simon Kern
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany
- Gordon Benedikt Feld
- Clinical Psychology, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- Addiction Behavior and Addiction Medicine, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- Psychiatry and Psychotherapy, Central Institute of Mental Health, Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany.
- Department of Psychology, University of Heidelberg, Heidelberg, Germany.
11
Karagoz AB, Moran EK, Barch DM, Kool W, Reagh ZM. Evidence for shallow cognitive maps in schizophrenia. bioRxiv: the preprint server for biology 2024:2024.02.26.582214. [PMID: 38464042 PMCID: PMC10925159 DOI: 10.1101/2024.02.26.582214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Individuals with schizophrenia can have marked deficits in goal-directed decision making. Prominent theories differ in whether schizophrenia (SZ) affects the ability to exert cognitive control, or the motivation to exert control. An alternative explanation is that schizophrenia negatively impacts the formation of cognitive maps, the internal representations of the way the world is structured, necessary for the formation of effective action plans. That is, deficits in decision-making could also arise when goal-directed control and motivation are intact, but used to plan over ill-formed maps. Here, we test the hypothesis that individuals with SZ are impaired in the construction of cognitive maps. We combine a behavioral representational similarity analysis technique with a sequential decision-making task. This enables us to examine how relationships between choice options change when individuals with SZ and healthy age-matched controls build a cognitive map of the task structure. Our results indicate that SZ affects how people represent the structure of the task, focusing more on simpler visual features and less on abstract, higher-order, planning-relevant features. At the same time, we find that SZ were able to display similar performance on this task compared to controls, emphasizing the need for a distinction between cognitive map formation and changes in goal-directed control in understanding cognitive deficits in schizophrenia.
Affiliation(s)
- Ata B Karagoz
- Department of Psychological & Brain Sciences, Washington University in St. Louis
- Erin K Moran
- Department of Psychological & Brain Sciences, Washington University in St. Louis
- Deanna M Barch
- Department of Psychological & Brain Sciences, Washington University in St. Louis
- Department of Psychiatry, Washington University School of Medicine
- Wouter Kool
- Department of Psychological & Brain Sciences, Washington University in St. Louis
- Zachariah M Reagh
- Department of Psychological & Brain Sciences, Washington University in St. Louis
12
Wise T, Emery K, Radulescu A. Naturalistic reinforcement learning. Trends Cogn Sci 2024; 28:144-158. [PMID: 37777463 PMCID: PMC10878983 DOI: 10.1016/j.tics.2023.08.016] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/16/2023] [Revised: 08/23/2023] [Accepted: 08/24/2023] [Indexed: 10/02/2023]
Abstract
Humans possess a remarkable ability to make decisions within real-world environments that are expansive, complex, and multidimensional. Human cognitive computational neuroscience has sought to exploit reinforcement learning (RL) as a framework within which to explain human decision-making, often focusing on constrained, artificial experimental tasks. In this article, we review recent efforts that use naturalistic approaches to determine how humans make decisions in complex environments that better approximate the real world, providing a clearer picture of how humans navigate the challenges posed by real-world decisions. These studies purposely embed elements of naturalistic complexity within experimental paradigms, rather than focusing on simplification, generating insights into the processes that likely underpin humans' ability to navigate complex, multidimensional real-world environments so successfully.
Affiliation(s)
- Toby Wise
- Department of Neuroimaging, King's College London, London, UK.
- Kara Emery
- Center for Data Science, New York University, New York, NY, USA
- Angela Radulescu
- Center for Computational Psychiatry, Icahn School of Medicine at Mt. Sinai, New York, NY, USA
13
Weydmann G, Miguel PM, Hakim N, Dubé L, Silveira PP, Bizarro L. How are overweight and obesity associated with reinforcement learning deficits? A systematic review. Appetite 2024; 193:107123. [PMID: 37992896 DOI: 10.1016/j.appet.2023.107123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2023] [Revised: 11/12/2023] [Accepted: 11/13/2023] [Indexed: 11/24/2023]
Abstract
Reinforcement learning (RL) refers to the ability to learn stimulus-response or response-outcome associations relevant to the acquisition of behavioral repertoire and adaptation to the environment. Research data from correlational and case-control studies have shown that obesity is associated with impairments in RL. The aim of the present study was to systematically review how obesity and overweight are associated with RL performance. More specifically, the relationship between high body mass index (BMI) and task performance was explored through the analysis of specific RL processes associated with different physiological, computational, and behavioral manifestations. Our systematic analyses indicate that obesity might be associated with impairments in the use of aversive outcomes to change ongoing behavior, as revealed by results involving instrumental negative reinforcement and extinction/reversal learning, but further research needs to be conducted to confirm this association. Hypotheses regarding how obesity might be associated with altered RL were discussed.
Affiliation(s)
- Gibson Weydmann
- Department of Psychology, Universidade Federal Do Rio Grande Do Sul (UFRGS), 2600 Ramiro Barcelos, Postal Code 90035-003, Porto Alegre, Brazil; Ludmer Centre for Neuroinformatics and Mental Health, Montreal Neurological Institute, 3801 University, Postal Code H3A 2B4, Montreal, Quebec, Canada.
- Patricia Maidana Miguel
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal Neurological Institute, 3801 University, Postal Code H3A 2B4, Montreal, Quebec, Canada; Department of Psychiatry, McGill University, 1033 Pine Ave W, Postal Code H3A 1A1, Montreal, Quebec, Canada
- Nour Hakim
- Department of Psychology, University of Toronto, 100 George Street, Postal Code M1C 1A4, Toronto, Ontario, Canada; Desautels Faculty of Management, McGill Center for the Convergence of Health and Economics, McGill University, 1001 Sherbrooke, Postal Code H3A 1G5, Montreal, Quebec, Canada
- Laurette Dubé
- Desautels Faculty of Management, McGill Center for the Convergence of Health and Economics, McGill University, 1001 Sherbrooke, Postal Code H3A 1G5, Montreal, Quebec, Canada
- Patricia Pelufo Silveira
- Ludmer Centre for Neuroinformatics and Mental Health, Montreal Neurological Institute, 3801 University, Postal Code H3A 2B4, Montreal, Quebec, Canada; Department of Psychiatry, McGill University, 1033 Pine Ave W, Postal Code H3A 1A1, Montreal, Quebec, Canada
- Lisiane Bizarro
- Department of Psychology, Universidade Federal Do Rio Grande Do Sul (UFRGS), 2600 Ramiro Barcelos, Postal Code 90035-003, Porto Alegre, Brazil
14
Cisler JM, Dunsmoor JE, Fonzo GA, Nemeroff CB. Latent-state and model-based learning in PTSD. Trends Neurosci 2024; 47:150-162. [PMID: 38212163 PMCID: PMC10923154 DOI: 10.1016/j.tins.2023.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/11/2023] [Revised: 12/18/2023] [Accepted: 12/18/2023] [Indexed: 01/13/2024]
Abstract
Post-traumatic stress disorder (PTSD) is characterized by altered emotional and behavioral responding following a traumatic event. In this article, we review the concepts of latent-state and model-based learning (i.e., learning and inferring abstract task representations) and discuss their relevance for clinical and neuroscience models of PTSD. Recent data demonstrate evidence for brain and behavioral biases in these learning processes in PTSD. These new data potentially recast excessive fear towards trauma cues as a problem in learning and updating abstract task representations, as opposed to traditional conceptualizations focused on stimulus-specific learning. Biases in latent-state and model-based learning may also be a common mechanism targeted in common therapies for PTSD. We highlight key knowledge gaps that need to be addressed to further elaborate how latent-state learning and its associated neurocircuitry mechanisms function in PTSD and how to optimize treatments to target these processes.
Affiliation(s)
- Josh M Cisler
- Department of Psychiatry and Behavioral Sciences, University of Texas at Austin, Austin, TX, USA; Institute for Early Life Adversity Research, University of Texas at Austin, Austin, TX, USA.
| | - Joseph E Dunsmoor
- Department of Psychiatry and Behavioral Sciences, University of Texas at Austin, Austin, TX, USA; Institute for Early Life Adversity Research, University of Texas at Austin, Austin, TX, USA
| | - Gregory A Fonzo
- Department of Psychiatry and Behavioral Sciences, University of Texas at Austin, Austin, TX, USA; Institute for Early Life Adversity Research, University of Texas at Austin, Austin, TX, USA
| | - Charles B Nemeroff
- Department of Psychiatry and Behavioral Sciences, University of Texas at Austin, Austin, TX, USA; Institute for Early Life Adversity Research, University of Texas at Austin, Austin, TX, USA
|
15
|
Yang L, Jin F, Yang L, Li J, Li Z, Li M, Shang Z. The Hippocampus in Pigeons Contributes to the Model-Based Valuation and the Relationship between Temporal Context States. Animals (Basel) 2024; 14:431. [PMID: 38338074 PMCID: PMC10854895 DOI: 10.3390/ani14030431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/25/2024] [Accepted: 01/25/2024] [Indexed: 02/12/2024] Open
Abstract
Model-based decision-making guides organism behavior by the representation of the relationships between different states. Previous studies have shown that the mammalian hippocampus (Hp) plays a key role in learning the structure of relationships among experiences. However, the hippocampal neural mechanisms of birds for model-based learning have rarely been reported. Here, we trained six pigeons to perform a two-step task and explore whether their Hp contributes to model-based learning. Behavioral performance and hippocampal multi-channel local field potentials (LFPs) were recorded during the task. We estimated the subjective values using a reinforcement learning model dynamically fitted to the pigeon's choice of behavior. The results show that the model-based learner can capture the behavioral choices of pigeons well throughout the learning process. Neural analysis indicated that high-frequency (12-100 Hz) power in Hp represented the temporal context states. Moreover, dynamic correlation and decoding results provided further support for the high-frequency dependence of model-based valuations. In addition, we observed a significant increase in hippocampal neural similarity at the low-frequency band (1-12 Hz) for common temporal context states after learning. Overall, our findings suggest that pigeons use model-based inferences to learn multi-step tasks, and multiple LFP frequency bands collaboratively contribute to model-based learning. Specifically, the high-frequency (12-100 Hz) oscillations represent model-based valuations, while the low-frequency (1-12 Hz) neural similarity is influenced by the relationship between temporal context states. These results contribute to our understanding of the neural mechanisms underlying model-based learning and broaden the scope of hippocampal contributions to avian behavior.
Affiliation(s)
- Lifang Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Fuli Jin
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Long Yang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Jiajia Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Zhihui Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
| | - Mengmeng Li
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
| | - Zhigang Shang
- School of Electrical and Information Engineering, Zhengzhou University, Zhengzhou 450001, China; (L.Y.); (F.J.); (L.Y.); (J.L.); (Z.L.)
- Henan Key Laboratory of Brain Science and Brain-Computer Interface Technology, Zhengzhou 450001, China
- Institute of Medical Engineering Technology and Data Mining, Zhengzhou University, Zhengzhou 450001, China
|
16
|
Chan HK, Toyoizumi T. A multi-stage anticipated surprise model with dynamic expectation for economic decision-making. Sci Rep 2024; 14:657. [PMID: 38182692 PMCID: PMC10770108 DOI: 10.1038/s41598-023-50529-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 12/20/2023] [Indexed: 01/07/2024] Open
Abstract
Many modeling studies aim to explain behaviors that violate classical economic theories. However, these models often do not take into full account the multi-stage nature of real-life problems and people's tendency to solve complicated problems sequentially. In this work, we propose a descriptive decision-making model for multi-stage problems with perceived post-decision information. In the model, decisions are chosen based on a quantity which we call the 'anticipated surprise'. The reference point is determined by the expected value of the possible outcomes, which we assume to be dynamically changing during the mental simulation of a sequence of events. We illustrate how our formalism can help us understand prominent economic paradoxes and gambling behaviors that involve multi-stage or sequential planning. We also discuss how neuroscience findings, like prediction error signals and introspective neuronal replay, as well as psychological theories like affective forecasting, are related to the features in our model. This provides hints for future experiments to investigate the role of these entities in decision-making.
Affiliation(s)
- Ho Ka Chan
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
| | - Taro Toyoizumi
- Laboratory for Neural Computation and Adaptation, RIKEN Center for Brain Science, Wako, Japan.
- Department of Mathematical Informatics, Graduate School of Information Science and Technology, The University of Tokyo, Tokyo, Japan.
|
17
|
Mah A, Schiereck SS, Bossio V, Constantinople CM. Distinct value computations support rapid sequential decisions. Nat Commun 2023; 14:7573. [PMID: 37989741 PMCID: PMC10663503 DOI: 10.1038/s41467-023-43250-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/03/2023] [Accepted: 11/03/2023] [Indexed: 11/23/2023] Open
Abstract
The value of the environment determines animals' motivational states and sets expectations for error-based learning [1-3]. How are values computed? Reinforcement learning systems can store or cache values of states or actions that are learned from experience, or they can compute values using a model of the environment to simulate possible futures [3]. These value computations have distinct trade-offs, and a central question is how neural systems decide which computations to use or whether/how to combine them [4-8]. Here we show that rats use distinct value computations for sequential decisions within single trials. We used high-throughput training to collect statistically powerful datasets from 291 rats performing a temporal wagering task with hidden reward states. Rats adjusted how quickly they initiated trials and how long they waited for rewards across states, balancing effort and time costs against expected rewards. Statistical modeling revealed that animals computed the value of the environment differently when initiating trials versus when deciding how long to wait for rewards, even though these decisions were only seconds apart. Moreover, value estimates interacted via a dynamic learning rate. Our results reveal how distinct value computations interact on rapid timescales, and demonstrate the power of using high-throughput training to understand rich, cognitive behaviors.
Affiliation(s)
- Andrew Mah
- Center for Neural Science, New York University, New York, NY, 10003, USA
| | | | - Veronica Bossio
- Center for Neural Science, New York University, New York, NY, 10003, USA
- Zuckerman Institute, Columbia University, New York, NY, 10027, USA
|
18
|
Luna R, Vadillo MA, Luque D. Model-free decision making resists improved instructions and is enhanced by stimulus-response associations. Cortex 2023; 168:102-113. [PMID: 37690266 DOI: 10.1016/j.cortex.2023.06.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/20/2022] [Revised: 05/16/2023] [Accepted: 06/20/2023] [Indexed: 09/12/2023]
Abstract
Human behaviour may be thought of as supported by two different computational-learning mechanisms, model-free and model-based. In model-free strategies, stimulus-response associations are strengthened when actions are followed by a reward and weakened otherwise. In model-based learning, before an action is selected, the current values of the different possible actions are computed based on a detailed model of the environment. Previous research with the two-stage task suggests that participants' behaviour usually shows a mixture of both strategies. Interestingly, however, a recent study by da Silva and Hare (2020) found that participants primarily deploy model-based behaviour when they are given detailed instructions about the structure of the task. In the present study, we reproduce this essential experiment. Our results confirm that improved instructions give rise to a stronger model-based component. Crucially, we also found a significant effect of reward that became stronger under conditions that favoured the development of strong stimulus-response associations. This suggests that the effect of reward, often taken as an indicator of a model-free component, is related to stimulus-response learning.
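As a reading aid for the abstract above, the following minimal sketch contrasts the two mechanisms it describes. It is an illustration under stated assumptions, not the model fitted in the cited study: the transition probabilities, the learning rate, and the names `TRANSITIONS`, `model_free_update`, and `model_based_values` are all hypothetical.

```python
# Hypothetical two-stage task with 2 first-stage actions and 2 second-stage states.
# TRANSITIONS[a][s] is an assumed probability that action a leads to state s.
TRANSITIONS = [[0.7, 0.3],
               [0.3, 0.7]]


def model_free_update(q_values, action, reward, learning_rate=0.5):
    """Model-free: move the cached value of the chosen action toward the reward.

    A rewarded action is strengthened; an unrewarded one (reward 0) is weakened.
    """
    updated = list(q_values)
    updated[action] += learning_rate * (reward - updated[action])
    return updated


def model_based_values(transitions, second_stage_values):
    """Model-based: compute each first-stage value from the transition model.

    Each action's value is the expected second-stage value under its transitions.
    """
    return [
        sum(p * v for p, v in zip(row, second_stage_values))
        for row in transitions
    ]


# Model-free: action 0 is rewarded once, then goes unrewarded once.
q_mf = model_free_update([0.0, 0.0], action=0, reward=1.0)
q_mf = model_free_update(q_mf, action=0, reward=0.0)

# Model-based: values are recomputed from the (assumed) model before choosing.
q_mb = model_based_values(TRANSITIONS, second_stage_values=[0.2, 0.8])
```

The model-free learner only changes the value of the action it actually took, whereas the model-based computation revalues every first-stage action whenever the second-stage values change.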
Affiliation(s)
- Raúl Luna
- Institute of Optics, Spanish National Research Council (CSIC), Spain.
| | - Miguel A Vadillo
- Department of Basic Psychology, Faculty of Psychology, Universidad Autónoma de Madrid, Spain
| | - David Luque
- Department of Basic Psychology and Speech Therapy, Faculty of Psychology, Universidad de Málaga, Spain.
|
19
|
Donegan KR, Brown VM, Price RB, Gallagher E, Pringle A, Hanlon AK, Gillan CM. Using smartphones to optimise and scale-up the assessment of model-based planning. Communications Psychology 2023; 1:31. [PMID: 39242869 PMCID: PMC11332031 DOI: 10.1038/s44271-023-00031-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/06/2023] [Accepted: 10/05/2023] [Indexed: 09/09/2024]
Abstract
Model-based planning is thought to protect against over-reliance on habits. It is reduced in individuals high in compulsivity, but effect sizes are small and may depend on subtle features of the tasks used to assess it. We developed a diamond-shooting smartphone game that measures model-based planning in an at-home setting, and varied the game's structure within and across participants to assess how it affects measurement reliability and validity with respect to previously established correlates of model-based planning, with a focus on compulsivity. Increasing the number of trials used to estimate model-based planning did remarkably little to affect the association with compulsivity, because the greatest signal was in earlier trials. Associations with compulsivity were higher when transition ratios were less deterministic and depending on the reward drift utilised. These findings suggest that model-based planning can be measured at home via an app, can be estimated in relatively few trials using certain design features, and can be optimised for sensitivity to compulsive symptoms in the general population.
Affiliation(s)
- Kelly R Donegan
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
| | - Vanessa M Brown
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, USA
| | - Rebecca B Price
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, USA
| | - Eoghan Gallagher
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
| | - Andrew Pringle
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
| | - Anna K Hanlon
- School of Psychology, Trinity College Dublin, Dublin, Ireland
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland
| | - Claire M Gillan
- School of Psychology, Trinity College Dublin, Dublin, Ireland.
- Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin, Ireland.
- Global Brain Health Institute, Trinity College Dublin, Dublin, Ireland.
|
20
|
Abstract
Humans often generalize rewarding experiences across abstract social roles. Theories of reward learning suggest that people generalize through model-based learning, but such learning is cognitively costly. Why do people seem to generalize across social roles with ease? Humans are social experts who easily recognize social roles that reflect familiar semantic concepts (e.g., "helper" or "teacher"). People may associate these roles with model-free reward (e.g., learning that helpers are rewarding), allowing them to generalize easily (e.g., interacting with novel individuals identified as helpers). In four online experiments with U.S. adults (N = 577), we found evidence that social concepts ease complex learning (people generalize more and at faster speed) and that people attach reward directly to abstract roles (they generalize even when roles are unrelated to task structure). These results demonstrate how familiar concepts allow complex behavior to emerge from simple strategies, highlighting social interaction as a prototype for studying cognitive ease in the face of environmental complexity.
Affiliation(s)
- Leor M Hackel
- Department of Psychology, University of Southern California
|
21
|
Crombie KM, Azar A, Botsford C, Heilicher M, Hiser J, Moughrabi N, Gruichich TS, Schomaker CM, Cisler JM. The influence of aerobic exercise on model-based decision making in women with posttraumatic stress disorder. Journal of Mood and Anxiety Disorders 2023; 2:100015. [PMID: 37593142 PMCID: PMC10433398 DOI: 10.1016/j.xjmad.2023.100015] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/19/2023]
Abstract
Individuals with PTSD often exhibit deficits in executive functioning. An unexplored aspect of neurocognitive functions associated with PTSD is the type of learning system engaged during decision-making. A model-free (MF) system is habitual in nature and involves trial-and-error learning that is often updated based on the most recent experience (e.g., repeat action if rewarded). A model-based (MB) system is goal-directed in nature and involves the development of an abstract representation of the environment to facilitate decisions (e.g., choose sequence of actions according to current contextual state and predicted outcomes). The existing neurocognitive literature on PTSD suggests the hypothesis that individuals with PTSD rely more heavily on MF than on MB learning strategies when navigating their environment. While MF systems may be more cognitively efficient, they do not afford flexibility when making prospective predictions about likely outcomes of different decision-tree branches. Emerging research suggests that an acute bout of aerobic exercise improves certain aspects of neurocognition, and thereby could promote the utilization of MB over MF systems during decision making, although prior research has not yet tested this hypothesis. Accordingly, the current study administered a lab-based two-stage Markov decision-making task capable of discriminating MF vs MB decision making, in order to determine if moderate-intensity aerobic exercise (either shortly after or 30 minutes after the exercise bout has ended) promotes greater engagement in MB behavioral strategies compared to light-intensity aerobic exercise in adult women with and without PTSD (N = 61). Results revealed that control women generally displayed higher levels of MB behavior, which was further increased following immediate exercise, particularly moderate-intensity exercise. By contrast, the PTSD group generally displayed lower levels of MB behavior, and exhibited greater MB behavior when completing the task following moderate-intensity aerobic exercise compared to light-intensity aerobic exercise regardless of whether there was a short or long delay between exercise and the task. Additionally, women with PTSD demonstrated less impairment in MB decision-making compared to controls following moderate-intensity aerobic exercise. These results suggest that an acute bout of moderate-intensity aerobic exercise boosts MB behavior in women with PTSD, and that aerobic exercise may play an important role in enhancing cognitive outcomes for PTSD.
Affiliation(s)
- Kevin M. Crombie
- The University of Texas at Austin, Department of Psychiatry and Behavioral Sciences, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
- The University of Alabama, Department of Kinesiology, 1003 Wade Hall, Tuscaloosa, Alabama, United States of America 35487
| | - Ameera Azar
- The University of Texas at Austin, Department of Psychiatry and Behavioral Sciences, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
| | - Chloe Botsford
- University of Wisconsin – Madison, Department of Psychiatry, 6001 Research Park Boulevard, Madison, Wisconsin, United States of America 53719
| | - Mickela Heilicher
- University of Wisconsin – Madison, Department of Psychiatry, 6001 Research Park Boulevard, Madison, Wisconsin, United States of America 53719
| | - Jaryd Hiser
- University of Wisconsin – Madison, Department of Psychiatry, 6001 Research Park Boulevard, Madison, Wisconsin, United States of America 53719
- The Ohio State University, Department of Psychiatry and Behavioral Health, 1670 Upham Drive, Suite 130, Columbus, Ohio, United States of America 43210
| | - Nicole Moughrabi
- The University of Texas at Austin, Department of Psychiatry and Behavioral Sciences, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
| | - Tijana Sagorac Gruichich
- University of Wisconsin – Madison, Department of Psychiatry, 6001 Research Park Boulevard, Madison, Wisconsin, United States of America 53719
| | - Chloe M. Schomaker
- The University of Texas at Austin, Department of Psychiatry and Behavioral Sciences, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
| | - Josh M. Cisler
- The University of Texas at Austin, Department of Psychiatry and Behavioral Sciences, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
- Institute for Early Life Adversity Research, The University of Texas at Austin Dell Medical School, 1601 Trinity Street, Building B, Austin, Texas, United States of America 78712
|
22
|
Rischall I, Hunter L, Jensen G, Gottlieb J. Inefficient prioritization of task-relevant attributes during instrumental information demand. Nat Commun 2023; 14:3174. [PMID: 37264004 DOI: 10.1038/s41467-023-38821-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2022] [Accepted: 05/17/2023] [Indexed: 06/03/2023] Open
Abstract
In natural settings, people evaluate complex multi-attribute situations and decide which attribute to request information about. Little is known about how people make this selection and specifically, how they identify individual observations that best predict the value of a multi-attribute situation. Here we show that, in a simple task of information demand, participants inefficiently query attributes that have high individual value but are relatively uninformative about a total payoff. This inefficiency is robust in two instrumental conditions in which gathering less informative observations leads to significantly lower rewards. Across individuals, variations in the sensitivity to informativeness are associated with personality metrics, showing negative associations with extraversion and thrill seeking and positive associations with stress tolerance and need for cognition. Thus, people select informative queries using sub-optimal strategies that are associated with personality traits and influence consequential choices.
Affiliation(s)
- Isabella Rischall
- Department of Neuroscience, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
| | - Laura Hunter
- Department of Neuroscience, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
| | - Greg Jensen
- Department of Neuroscience, Columbia University, New York, NY, USA
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Department of Psychology, Reed College, Portland, OR, USA
| | - Jacqueline Gottlieb
- Department of Neuroscience, Columbia University, New York, NY, USA.
- Mortimer B. Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA.
- Kavli Institute for Brain Science, Columbia University, New York, NY, USA.
|
23
|
Ruan Z, Seger CA, Yang Q, Kim D, Lee SW, Chen Q, Peng Z. Impairment of arbitration between model-based and model-free reinforcement learning in obsessive-compulsive disorder. Front Psychiatry 2023; 14:1162800. [PMID: 37304449 PMCID: PMC10250695 DOI: 10.3389/fpsyt.2023.1162800] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2023] [Accepted: 05/05/2023] [Indexed: 06/13/2023] Open
Abstract
Introduction: Obsessive-compulsive disorder (OCD) is characterized by an imbalance between goal-directed and habitual learning systems in behavioral control, but it is unclear whether these impairments are due to a single system abnormality of the goal-directed system or due to an impairment in a separate arbitration mechanism that selects which system controls behavior at each point in time. Methods: A total of 30 OCD patients and 120 healthy controls performed a 2-choice, 3-stage Markov decision-making paradigm. Reinforcement learning models were used to estimate goal-directed learning (as model-based reinforcement learning) and habitual learning (as model-free reinforcement learning). In general, 29 high Obsessive-Compulsive Inventory-Revised (OCI-R) score controls, 31 low OCI-R score controls, and all 30 OCD patients were selected for the analysis. Results: OCD patients showed less appropriate strategy choices than controls regardless of whether the OCI-R scores in the control subjects were high (p = 0.012) or low (p < 0.001), specifically showing a greater model-free strategy use in task conditions where the model-based strategy was optimal. Furthermore, OCD patients (p = 0.001) and control subjects with high OCI-R scores (H-OCI-R; p = 0.009) both showed greater system switching rather than consistent strategy use in task conditions where model-free use was optimal. Conclusion: These findings indicated an impaired arbitration mechanism for flexible adaptation to environmental demands in both OCD patients and healthy individuals reporting high OCI-R scores.
Affiliation(s)
- Zhongqiang Ruan
- Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, China
| | - Carol A. Seger
- Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, China
- Department of Psychology, Colorado State University, Fort Collins, CO, United States
| | - Qiong Yang
- Affective Disorder Center, Affiliated Brain Hospital of Guangzhou Medical University (Guangzhou Huiai Hospital), Guangzhou, China
| | - Dongjae Kim
- Department of AI-based Convergence, College of Engineering, Dankook University, Yongin, Republic of Korea
| | - Sang Wan Lee
- Department of Bio and Brain Engineering, Program of Brain and Cognitive Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Republic of Korea
| | - Qi Chen
- School of Psychology, Shenzhen University, Shenzhen, China
| | - Ziwen Peng
- Guangdong Key Laboratory of Mental Health and Cognitive Science, School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou, China
- Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, Guangzhou, China
- Department of Child Psychiatry, Shenzhen Kangning Hospital, Shenzhen University School of Medicine, Shenzhen, China
|
24
|
Steffen J, Marković D, Glöckner F, Neukam PT, Kiebel SJ, Li SC, Smolka MN. Shorter planning depth and higher response noise during sequential decision-making in old age. Sci Rep 2023; 13:7692. [PMID: 37169942 PMCID: PMC10175280 DOI: 10.1038/s41598-023-33274-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Accepted: 04/11/2023] [Indexed: 05/13/2023] Open
Abstract
Forward planning is crucial to maximize outcome in complex sequential decision-making scenarios. In this cross-sectional study, we were particularly interested in age-related differences in forward planning. We presumed that especially older individuals would show a shorter planning depth to keep the costs of model-based decision-making within limits. To test this hypothesis, we developed a sequential decision-making task to assess forward planning in younger (age < 40 years; n = 25) and older (age > 60 years; n = 27) adults. By using reinforcement learning modelling, we inferred planning depths from participants' choices. Our results showed significantly shorter planning depths and higher response noise for older adults. Age differences in planning depth were only partially explained by well-known cognitive covariates such as working memory and processing speed. Consistent with previous findings, this indicates age-related shifts away from model-based behaviour in older adults. In addition to a shorter planning depth, our findings suggest that older adults also apply a variety of low-cost heuristic strategies.
Affiliation(s)
- Johannes Steffen
- Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
| | - Dimitrije Marković
- Department of Psychology, Technische Universität Dresden, Dresden, Germany
| | - Franka Glöckner
- Department of Psychology, Technische Universität Dresden, Dresden, Germany
| | - Philipp T Neukam
- Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Stefan J Kiebel
- Department of Psychology, Technische Universität Dresden, Dresden, Germany
| | - Shu-Chen Li
- Department of Psychology, Technische Universität Dresden, Dresden, Germany
| | - Michael N Smolka
- Department of Psychiatry and Psychotherapy, Technische Universität Dresden, Dresden, Germany.
|
25
|
Feher da Silva C, Lombardi G, Edelson M, Hare TA. Rethinking model-based and model-free influences on mental effort and striatal prediction errors. Nat Hum Behav 2023:10.1038/s41562-023-01573-1. [PMID: 37012365 DOI: 10.1038/s41562-023-01573-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Accepted: 02/27/2023] [Indexed: 04/05/2023]
Abstract
A standard assumption in neuroscience is that low-effort model-free learning is automatic and continuously used, whereas more complex model-based strategies are only used when the rewards they generate are worth the additional effort. We present evidence refuting this assumption. First, we demonstrate flaws in previous reports of combined model-free and model-based reward prediction errors in the ventral striatum that probably led to spurious results. More appropriate analyses yield no evidence of model-free prediction errors in this region. Second, we find that task instructions generating more correct model-based behaviour reduce rather than increase mental effort. This is inconsistent with cost-benefit arbitration between model-based and model-free strategies. Together, our data indicate that model-free learning may not be automatic. Instead, humans can reduce mental effort by using a model-based strategy alone rather than arbitrating between multiple strategies. Our results call for re-evaluation of the assumptions in influential theories of learning and decision-making.
Affiliation(s)
- Gaia Lombardi
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Micah Edelson
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Todd A Hare
- Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland.
26
Nissan N, Hertz U, Shahar N, Gabay Y. Distinct reinforcement learning profiles distinguish between language and attentional neurodevelopmental disorders. Behavioral and Brain Functions 2023; 19:6. [PMID: 36941632 PMCID: PMC10029183 DOI: 10.1186/s12993-023-00207-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/31/2022] [Accepted: 01/26/2023] [Indexed: 03/23/2023]
Abstract
BACKGROUND: Theoretical models posit abnormalities in cortico-striatal pathways in two of the most common neurodevelopmental disorders (developmental dyslexia, DD, and attention deficit hyperactivity disorder, ADHD), but it is still unclear what distinct cortico-striatal dysfunction might distinguish language disorders from others that exhibit very different symptomatology. Although impairments in tasks that depend on the cortico-striatal network, including reinforcement learning (RL), have been implicated in both disorders, there has been little attempt to dissociate between different types of RL or to compare learning processes in these two types of disorders. The present study builds upon prior research indicating the existence of two learning manifestations of RL and evaluates whether these processes can be differentiated in language and attention deficit disorders. We used a two-step RL task shown to dissociate model-based from model-free learning in human learners. RESULTS: Our results show that, relative to neurotypicals, DD individuals showed an impairment in model-free but not in model-based learning, whereas in ADHD the ability to use both model-free and model-based learning strategies was significantly compromised. CONCLUSIONS: Thus, learning impairments in DD may be linked to a selective deficit in the ability to form action-outcome associations based on previous history, whereas in ADHD some learning deficits may be related to an incapacity to pursue rewards based on the task's structure. Our results indicate how different patterns of learning deficits may underlie different disorders, and how computation-minded experimental approaches can differentiate between them.
Affiliation(s)
- Noyli Nissan
- Department of Special Education, University of Haifa, Haifa, Israel
- Edmond J. Safra Brain Research Center for the Study of Learning Disabilities, University of Haifa, 199 Abba Khoushy Ave, Haifa, Israel
- Uri Hertz
- Department of Cognitive Sciences, University of Haifa, Haifa, Israel
- Nitzan Shahar
- The School of Psychological Sciences, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- Yafit Gabay
- Department of Special Education, University of Haifa, Haifa, Israel.
- Edmond J. Safra Brain Research Center for the Study of Learning Disabilities, University of Haifa, 199 Abba Khoushy Ave, Haifa, Israel.
27
Brandl F, Knolle F, Avram M, Leucht C, Yakushev I, Priller J, Leucht S, Ziegler S, Wunderlich K, Sorg C. Negative symptoms, striatal dopamine and model-free reward decision-making in schizophrenia. Brain 2023; 146:767-777. [PMID: 35875972 DOI: 10.1093/brain/awac268] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/24/2022] [Revised: 06/13/2022] [Accepted: 07/04/2022] [Indexed: 11/13/2022] Open
Abstract
Negative symptoms, such as lack of motivation or social withdrawal, are highly prevalent and debilitating in patients with schizophrenia. Underlying mechanisms of negative symptoms are incompletely understood, thereby preventing the development of targeted treatments. We hypothesized that in patients with schizophrenia during psychotic remission, impaired influences of both model-based and model-free reward predictions on decision-making ('reward prediction influence', RPI) underlie negative symptoms. We focused on psychotic remission, because psychotic symptoms might confound reward-based decision-making. Moreover, we hypothesized that impaired model-based/model-free RPIs depend on alterations of both associative striatum dopamine synthesis and storage (DSS) and executive functioning. Both factors influence RPI in healthy subjects and are typically impaired in schizophrenia. Twenty-five patients with schizophrenia with pronounced negative symptoms during psychotic remission and 24 healthy controls were included in the study. Negative symptom severity was measured by the Positive and Negative Syndrome Scale negative subscale, model-based/model-free RPI by the two-stage decision task, associative striatum DSS by 18F-DOPA positron emission tomography and executive functioning by the symbol coding task. Model-free RPI was selectively reduced in patients and associated with negative symptom severity as well as with reduced associative striatum DSS (in patients only) and executive functions (both in patients and controls). In contrast, model-based RPI was not altered in patients. Results provide evidence for impaired model-free reward prediction influence as a mechanism for negative symptoms in schizophrenia as well as for reduced associative striatum dopamine and executive dysfunction as relevant factors. Data suggest potential treatment targets for patients with schizophrenia and pronounced negative symptoms.
Affiliation(s)
- Felix Brandl
- Department of Psychiatry and Psychotherapy, School of Medicine, Technical University of Munich, Munich, 81675, Germany; Department of Neuroradiology, School of Medicine, Technical University of Munich, Munich, 81675, Germany; TUM-NIC Neuroimaging Center, School of Medicine, Technical University of Munich, Munich, 81675, Germany
- Franziska Knolle
- Department of Neuroradiology, School of Medicine, Technical University of Munich, Munich, 81675, Germany; TUM-NIC Neuroimaging Center, School of Medicine, Technical University of Munich, Munich, 81675, Germany; Department of Psychiatry, University of Cambridge, Cambridge CB20SZ, UK
- Mihai Avram
- Translational Psychiatry, Department of Psychiatry and Psychotherapy, University of Lübeck, Lübeck, 23538, Germany
- Claudia Leucht
- Department of Psychiatry and Psychotherapy, School of Medicine, Technical University of Munich, Munich, 81675, Germany
- Igor Yakushev
- Department of Nuclear Medicine, School of Medicine, Technical University of Munich, Munich, 81675, Germany
- Josef Priller
- Department of Psychiatry and Psychotherapy, School of Medicine, Technical University of Munich, Munich, 81675, Germany; Neuropsychiatry, Charité-Universitätsmedizin Berlin, and DZNE, Berlin, 10117, Germany; UK DRI at University of Edinburgh, Edinburgh EH16 4SB, UK; IoPPN, King's College London, London SE5 8AF, UK
- Stefan Leucht
- Department of Psychiatry and Psychotherapy, School of Medicine, Technical University of Munich, Munich, 81675, Germany; Department of Psychosis studies, King's College London, London, UK
- Sibylle Ziegler
- Department of Nuclear Medicine, Ludwig-Maximilians University Munich, Munich, 81377, Germany
- Klaus Wunderlich
- Department of Psychology, Ludwig-Maximilians University Munich, Munich, 81377, Germany
- Christian Sorg
- Department of Psychiatry and Psychotherapy, School of Medicine, Technical University of Munich, Munich, 81675, Germany; Department of Neuroradiology, School of Medicine, Technical University of Munich, Munich, 81675, Germany; TUM-NIC Neuroimaging Center, School of Medicine, Technical University of Munich, Munich, 81675, Germany
28
Berner LA, Fiore VG, Chen JY, Krueger A, Kaye WH, Viranda T, de Wit S. Impaired belief updating and devaluation in adult women with bulimia nervosa. Transl Psychiatry 2023; 13:2. [PMID: 36604416 PMCID: PMC9816187 DOI: 10.1038/s41398-022-02257-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Revised: 11/07/2022] [Accepted: 11/10/2022] [Indexed: 01/07/2023] Open
Abstract
Recent models of bulimia nervosa (BN) propose that binge-purge episodes ultimately become automatic in response to cues and insensitive to negative outcomes. Here, we examined whether women with BN show alterations in instrumental learning and devaluation sensitivity using traditional and computational modeling analyses of behavioral data. Adult women with BN (n = 30) and group-matched healthy controls (n = 31) completed a task in which they first learned stimulus-response-outcome associations. Then, participants were required to repeatedly adjust their responses in a "baseline test", when different sets of stimuli were explicitly devalued, and in a "slips-of-action test", when outcomes instead of stimuli were devalued. The BN group showed intact behavioral sensitivity to outcome devaluation during the slips-of-action test, but showed difficulty overriding previously learned stimulus-response associations on the baseline test. Results from a Bayesian learner model indicated that this impaired performance could be accounted for by a slower pace of belief updating when a new set of previously learned responses had to be inhibited (p = 0.036). Worse performance and a slower belief update in the baseline test were each associated with more frequent binge eating (p = 0.012) and purging (p = 0.002). Our findings suggest that BN diagnosis and severity are associated with deficits in flexibly updating beliefs to withhold previously learned responses to cues. Additional research is needed to determine whether this impaired ability to adjust behavior is responsible for maintaining automatic and persistent binge eating and purging in response to internal and environmental cues.
Affiliation(s)
- Laura A Berner
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
- Vincenzo G Fiore
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Joanna Y Chen
- Department of Psychology, Drexel University, Philadelphia, PA, USA
- Angeline Krueger
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Walter H Kaye
- Department of Psychiatry, University of California San Diego, San Diego, CA, USA
- Thalia Viranda
- Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Sanne de Wit
- Department of Clinical Psychology, University of Amsterdam, Amsterdam, Netherlands
29
Barbosa J, Stein H, Zorowitz S, Niv Y, Summerfield C, Soto-Faraco S, Hyafil A. A practical guide for studying human behavior in the lab. Behav Res Methods 2023; 55:58-76. [PMID: 35262897 DOI: 10.3758/s13428-022-01793-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/04/2022] [Indexed: 11/08/2022]
Abstract
In the last few decades, the field of neuroscience has witnessed major technological advances that have allowed researchers to measure and control neural activity in great detail. Yet, behavioral experiments in humans remain an essential approach to investigate the mysteries of the mind. Their relatively modest technological and economic requisites make behavioral research an attractive and accessible experimental avenue for neuroscientists with very diverse backgrounds. However, like any experimental enterprise, it has its own inherent challenges that may pose practical hurdles, especially to less experienced behavioral researchers. Here, we aim to provide a practical guide for a steady walk through the workflow of a typical behavioral experiment with human subjects. This primer concerns the design of an experimental protocol, research ethics, and subject care, as well as best practices for data collection, analysis, and sharing. The goal is to provide clear instructions for both beginners and experienced researchers from diverse backgrounds in planning behavioral experiments.
Affiliation(s)
- Joao Barbosa
- Brain Circuits & Behavior lab, IDIBAPS, Barcelona, Spain.
- Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Supérieure - PSL Research University, 75005, Paris, France.
- Heike Stein
- Brain Circuits & Behavior lab, IDIBAPS, Barcelona, Spain
- Laboratoire de Neurosciences Cognitives et Computationnelles, INSERM U960, Ecole Normale Supérieure - PSL Research University, 75005, Paris, France
- Sam Zorowitz
- Princeton Neuroscience Institute, Princeton University, Princeton, USA
- Yael Niv
- Princeton Neuroscience Institute, Princeton University, Princeton, USA
- Department of Psychology, Princeton University, Princeton, USA
- Salvador Soto-Faraco
- Multisensory Research Group, Center for Brain and Cognition, Universitat Pompeu Fabra Barcelona, Spain, and Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Spain
30
Spontaneous mind wandering impairs model-based decision making. PLoS One 2023; 18:e0279532. [PMID: 36701316 PMCID: PMC9879536 DOI: 10.1371/journal.pone.0279532] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/09/2022] [Accepted: 12/08/2022] [Indexed: 01/27/2023] Open
Abstract
BACKGROUND: If our attention wanders to other thoughts while making a decision, then the decision might not be directed towards future goals, reflecting a lack of model-based decision making, but may instead be driven by habits, reflecting model-free decision making. Here we aimed to investigate if and how model-based versus model-free decision making is reduced by trait spontaneous mind wandering. METHODS AND FINDINGS: We used a sequential two-step Markov decision task and a self-report questionnaire assessing trait spontaneous and deliberate mind wandering propensity, to investigate how trait mind wandering relates to model-free as well as model-based decisions. We estimated parameters of a computational neurocognitive dual-control model of decision making. Analyzing estimated model parameters, we found that trait spontaneous mind wandering was related to impaired model-based decisions, while model-free choice stayed unaffected. CONCLUSIONS: Our findings suggest trait spontaneous mind wandering is associated with impaired model-based decision making, and it may reflect model-based offline replay for other tasks (e.g., real-life goals) outside the current lab situation.
31
Mikus N, Korb S, Massaccesi C, Gausterer C, Graf I, Willeit M, Eisenegger C, Lamm C, Silani G, Mathys C. Effects of dopamine D2/3 and opioid receptor antagonism on the trade-off between model-based and model-free behaviour in healthy volunteers. eLife 2022; 11:e79661. [PMID: 36468832 PMCID: PMC9721617 DOI: 10.7554/elife.79661] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2022] [Accepted: 11/22/2022] [Indexed: 12/11/2022] Open
Abstract
Human behaviour requires flexible arbitration between actions we do out of habit and actions that are directed towards a specific goal. Drugs that target opioid and dopamine receptors are notorious for inducing maladaptive habitual drug consumption; yet, how the opioidergic and dopaminergic neurotransmitter systems contribute to the arbitration between habitual and goal-directed behaviour is poorly understood. By combining pharmacological challenges with a well-established decision-making task and a novel computational model, we show that the administration of the dopamine D2/3 receptor antagonist amisulpride led to an increase in goal-directed or 'model-based' relative to habitual or 'model-free' behaviour, whereas the non-selective opioid receptor antagonist naltrexone had no appreciable effect. The effect of amisulpride on model-based/model-free behaviour did not scale with drug serum levels in the blood. Furthermore, participants with higher amisulpride serum levels showed higher explorative behaviour. These findings highlight the distinct functional contributions of dopamine and opioid receptors to goal-directed and habitual behaviour and support the notion that even small doses of amisulpride promote flexible application of cognitive control.
Affiliation(s)
- Nace Mikus
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Interacting Minds Centre, Aarhus University, Aarhus, Denmark
- Sebastian Korb
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Department of Psychology, University of Essex, Colchester, United Kingdom
- Claudia Massaccesi
- Department of Clinical and Health Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Christian Gausterer
- FDZ‐Forensisches DNA Zentrallabor GmbH, Medical University of Vienna, Vienna, Austria
- Irene Graf
- Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria
- Matthäus Willeit
- Department of Psychiatry and Psychotherapy, Medical University of Vienna, Vienna, Austria
- Christoph Eisenegger
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Claus Lamm
- Department of Cognition, Emotion, and Methods in Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Giorgia Silani
- Department of Clinical and Health Psychology, Faculty of Psychology, University of Vienna, Vienna, Austria
- Christoph Mathys
- Interacting Minds Centre, Aarhus University, Aarhus, Denmark
- Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), Trieste, Italy
32
Colas JT, Dundon NM, Gerraty RT, Saragosa‐Harris NM, Szymula KP, Tanwisuth K, Tyszka JM, van Geen C, Ju H, Toga AW, Gold JI, Bassett DS, Hartley CA, Shohamy D, Grafton ST, O'Doherty JP. Reinforcement learning with associative or discriminative generalization across states and actions: fMRI at 3 T and 7 T. Hum Brain Mapp 2022; 43:4750-4790. [PMID: 35860954 PMCID: PMC9491297 DOI: 10.1002/hbm.25988] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/19/2022] [Revised: 05/20/2022] [Accepted: 06/10/2022] [Indexed: 11/12/2022] Open
Abstract
The model-free algorithms of "reinforcement learning" (RL) have gained clout across disciplines, but so too have model-based alternatives. The present study emphasizes other dimensions of this model space in consideration of associative or discriminative generalization across states and actions. This "generalized reinforcement learning" (GRL) model, a frugal extension of RL, parsimoniously retains the single reward-prediction error (RPE), but the scope of learning goes beyond the experienced state and action. Instead, the generalized RPE is efficiently relayed for bidirectional counterfactual updating of value estimates for other representations. Aided by structural information but as an implicit rather than explicit cognitive map, GRL provided the most precise account of human behavior and individual differences in a reversal-learning task with hierarchical structure that encouraged inverse generalization across both states and actions. Reflecting inference that could be true, false (i.e., overgeneralization), or absent (i.e., undergeneralization), state generalization distinguished those who learned well more so than action generalization. With high-resolution high-field fMRI targeting the dopaminergic midbrain, the GRL model's RPE signals (alongside value and decision signals) were localized within not only the striatum but also the substantia nigra and the ventral tegmental area, including specific effects of generalization that also extend to the hippocampus. Factoring in generalization as a multidimensional process in value-based learning, these findings shed light on complexities that, while challenging classic RL, can still be resolved within the bounds of its core computations.
Affiliation(s)
- Jaron T. Colas
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
- Neil M. Dundon
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- Department of Child and Adolescent Psychiatry, Psychotherapy, and Psychosomatics, University of Freiburg, Freiburg im Breisgau, Germany
- Raphael T. Gerraty
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Center for Science and Society, Columbia University, New York, New York, USA
- Natalie M. Saragosa‐Harris
- Department of Psychology, New York University, New York, New York, USA
- Department of Psychology, University of California, Los Angeles, California, USA
- Karol P. Szymula
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Koranis Tanwisuth
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Department of Psychology, University of California, Berkeley, California, USA
- J. Michael Tyszka
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Camilla van Geen
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Harang Ju
- Neuroscience Graduate Group, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Arthur W. Toga
- Laboratory of Neuro Imaging, USC Stevens Neuroimaging and Informatics Institute, Keck School of Medicine of USC, University of Southern California, Los Angeles, California, USA
- Joshua I. Gold
- Department of Neuroscience, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Dani S. Bassett
- Department of Bioengineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Electrical and Systems Engineering, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Neurology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Psychiatry, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Santa Fe Institute, Santa Fe, New Mexico, USA
- Catherine A. Hartley
- Department of Psychology, New York University, New York, New York, USA
- Center for Neural Science, New York University, New York, New York, USA
- Daphna Shohamy
- Department of Psychology, Columbia University, New York, New York, USA
- Zuckerman Mind Brain Behavior Institute, Columbia University, New York, New York, USA
- Kavli Institute for Brain Science, Columbia University, New York, New York, USA
- Scott T. Grafton
- Department of Psychological and Brain Sciences, University of California, Santa Barbara, California, USA
- John P. O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, California, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, California, USA
33
Nour MM, Liu Y, Dolan RJ. Functional neuroimaging in psychiatry and the case for failing better. Neuron 2022; 110:2524-2544. [PMID: 35981525 DOI: 10.1016/j.neuron.2022.07.005] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 05/03/2022] [Revised: 06/06/2022] [Accepted: 07/08/2022] [Indexed: 12/27/2022]
Abstract
Psychiatric disorders encompass complex aberrations of cognition and affect and are among the most debilitating and poorly understood of any medical condition. Current treatments rely primarily on interventions that target brain function (drugs) or learning processes (psychotherapy). A mechanistic understanding of how these interventions mediate their therapeutic effects remains elusive. From the early 1990s, non-invasive functional neuroimaging, coupled with parallel developments in the cognitive neurosciences, seemed to signal a new era of neurobiologically grounded diagnosis and treatment in psychiatry. Yet, despite three decades of intense neuroimaging research, we still lack a neurobiological account for any psychiatric condition. Likewise, functional neuroimaging plays no role in clinical decision making. Here, we offer a critical commentary on this impasse and suggest how the field might fare better and deliver impactful neurobiological insights.
Affiliation(s)
- Matthew M Nour
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK; Wellcome Trust Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK; Department of Psychiatry, University of Oxford, Oxford OX3 7JX, UK.
- Yunzhe Liu
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK; State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China; Chinese Institute for Brain Research, Beijing 102206, China
- Raymond J Dolan
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK; Wellcome Trust Centre for Human Neuroimaging, University College London, London WC1N 3AR, UK; State Key Laboratory of Cognitive Neuroscience and Learning, IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing 100875, China.
34
Wagner B, Mathar D, Peters J. Gambling Environment Exposure Increases Temporal Discounting but Improves Model-Based Control in Regular Slot-Machine Gamblers. Computational Psychiatry 2022; 6:142-165. [PMID: 38774777 PMCID: PMC11104401 DOI: 10.5334/cpsy.84] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/14/2021] [Accepted: 05/26/2022] [Indexed: 11/20/2022]
Abstract
Gambling disorder is a behavioral addiction that negatively impacts personal finances, work, relationships and mental health. In this pre-registered study (https://osf.io/5ptz9/) we investigated the impact of real-life gambling environments on two computational markers of addiction, temporal discounting and model-based reinforcement learning. Gambling disorder is associated with increased temporal discounting and reduced model-based learning. Regular gamblers (n = 30, DSM-5 score range 3-9) performed both tasks in a neutral (café) and a gambling-related environment (slot-machine venue) in counterbalanced order. Data were modeled using drift diffusion models for temporal discounting and reinforcement learning via hierarchical Bayesian estimation. Replicating previous findings, gamblers discounted rewards more steeply in the gambling-related context. This effect was positively correlated with gambling related cognitive distortions (pre-registered analysis). In contrast to our pre-registered hypothesis, model-based reinforcement learning was improved in the gambling context. Here we show that temporal discounting and model-based reinforcement learning are modulated in opposite ways by real-life gambling cue exposure. Results challenge aspects of habit theories of addiction, and reveal that laboratory-based computational markers of psychopathology are under substantial contextual control.
Affiliation(s)
- Ben Wagner
- Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
- Faculty of Psychology, Chair of Neuroimaging, Technical University of Dresden, Dresden, Germany
- David Mathar
- Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
- Jan Peters
- Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
35
Török B, Nagy DG, Kiss M, Janacsek K, Németh D, Orbán G. Tracking the contribution of inductive bias to individualised internal models. PLoS Comput Biol 2022; 18:e1010182. [PMID: 35731822 PMCID: PMC9255757 DOI: 10.1371/journal.pcbi.1010182] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 07/05/2022] [Accepted: 05/08/2022] [Indexed: 11/20/2022] Open
Abstract
Internal models capture the regularities of the environment and are central to understanding how humans adapt to environmental statistics. In general, the correct internal model is unknown to observers; instead, they rely on an approximate model that is continually adapted throughout learning. However, experimenters assume an ideal observer model, which captures stimulus structure but ignores the diverging hypotheses that humans form during learning. We combine non-parametric Bayesian methods and probabilistic programming to infer rich and dynamic individualised internal models from response times. We demonstrate that the approach is capable of characterizing the discrepancy between the internal model maintained by individuals and the ideal observer model, and of tracking the evolution of the contribution of the ideal observer model to the internal model throughout training. In particular, in an implicit visuomotor sequence learning task, the identified discrepancy revealed an inductive bias that was consistent across individuals but varied in strength and persistence.
Affiliation(s)
- Balázs Török
- Department of Computational Sciences, Wigner Research Centre for Physics, Budapest, Hungary
- Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
- Brain, Memory and Language Research Group, Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
- David G. Nagy
- Department of Computational Sciences, Wigner Research Centre for Physics, Budapest, Hungary
- Institute of Physics, Eötvös Loránd University, Budapest, Hungary
- Mariann Kiss
- Department of Cognitive Science, Faculty of Natural Sciences, Budapest University of Technology and Economics, Műegyetem rkp. 3., H-1111 Budapest, Hungary
- Brain, Memory and Language Research Group, Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
- Karolina Janacsek
- Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary
- Centre for Thinking and Learning, Institute for Lifecourse Development, School of Human Sciences, Faculty of Education, Health and Human Sciences, University of Greenwich, London, United Kingdom
- Dezső Németh
- Brain, Memory and Language Research Group, Institute of Cognitive Neuroscience and Psychology, Research Centre for Natural Sciences, Budapest, Hungary
- Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary
- Lyon Neuroscience Research Center (CRNL), Université Claude Bernard Lyon 1, Lyon, France
- Gergő Orbán
- Department of Computational Sciences, Wigner Research Centre for Physics, Budapest, Hungary
36
|
Castro-Rodrigues P, Akam T, Snorasson I, Camacho M, Paixão V, Maia A, Barahona-Corrêa JB, Dayan P, Simpson HB, Costa RM, Oliveira-Maia AJ. Explicit knowledge of task structure is a primary determinant of human model-based action. Nat Hum Behav 2022; 6:1126-1141. [PMID: 35589826 DOI: 10.1038/s41562-022-01346-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/11/2020] [Revised: 03/19/2022] [Accepted: 03/31/2022] [Indexed: 11/09/2022]
Abstract
Explicit information obtained through instruction profoundly shapes human choice behaviour. However, this has been studied in computationally simple tasks, and it is unknown how model-based and model-free systems, respectively generating goal-directed and habitual actions, are affected by the absence or presence of instructions. We assessed behaviour in a variant of a computationally more complex decision-making task, before and after providing information about task structure, both in healthy volunteers and in individuals suffering from obsessive-compulsive or other disorders. Initial behaviour was model-free, with rewards directly reinforcing preceding actions. Model-based control, employing predictions of states resulting from each action, emerged with experience in a minority of participants, and less in those with obsessive-compulsive disorder. Providing task structure information strongly increased model-based control, similarly across all groups. Thus, in humans, explicit task structural knowledge is a primary determinant of model-based reinforcement learning and is most readily acquired from instruction rather than experience.
Affiliation(s)
- Pedro Castro-Rodrigues
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Centro Hospitalar Psiquiátrico de Lisboa, Lisbon, Portugal
- Thomas Akam
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; Department of Experimental Psychology, University of Oxford, Oxford, UK
- Ivar Snorasson
  - Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA
- Marta Camacho
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; John Van Geest Center for Brain Repair, University of Cambridge, Cambridge, UK
- Vitor Paixão
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal
- Ana Maia
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Department of Psychiatry and Mental Health, Centro Hospitalar de Lisboa Ocidental, Lisbon, Portugal
- J Bernardo Barahona-Corrêa
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal
- Peter Dayan
  - Max Planck Institute for Biological Cybernetics, Tübingen, Germany; The University of Tübingen, Tübingen, Germany
- H Blair Simpson
  - Center for Obsessive-Compulsive & Related Disorders, New York State Psychiatric Institute, New York, NY, USA; Department of Psychiatry, Columbia University, New York, NY, USA
- Rui M Costa
  - Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal; Zuckerman Mind Brain Behavior Institute, Columbia University, New York, NY, USA
- Albino J Oliveira-Maia
  - Champalimaud Clinical Centre, Champalimaud Foundation, Lisbon, Portugal; Champalimaud Research, Champalimaud Foundation, Lisbon, Portugal; NOVA Medical School, NMS, Universidade Nova de Lisboa, Lisbon, Portugal.

37
Dennison JB, Sazhin D, Smith DV. Decision neuroscience and neuroeconomics: Recent progress and ongoing challenges. Wiley Interdisciplinary Reviews. Cognitive Science 2022; 13:e1589. [PMID: 35137549 PMCID: PMC9124684 DOI: 10.1002/wcs.1589] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/05/2020] [Revised: 11/28/2021] [Accepted: 12/21/2021] [Indexed: 01/10/2023]
Abstract
In the past decade, decision neuroscience and neuroeconomics have developed many new insights in the study of decision making. This review provides an overarching update on how the field has advanced in this time period. Although our initial review a decade ago outlined several theoretical, conceptual, methodological, empirical, and practical challenges, there has only been limited progress in resolving these challenges. We summarize significant trends in decision neuroscience through the lens of the challenges outlined for the field and review examples where the field has had significant, direct, and applicable impacts across economics and psychology. First, we review progress on topics including reward learning, explore-exploit decisions, risk and ambiguity, intertemporal choice, and valuation. Next, we assess the impacts of emotion, social rewards, and social context on decision making. Then, we follow up with how individual differences impact choices and new exciting developments in the prediction and neuroforecasting of future decisions. Finally, we consider how trends in decision-neuroscience research reflect progress toward resolving past challenges, discuss new and exciting applications of recent research, and identify new challenges for the field. This article is categorized under: Psychology > Reasoning and Decision Making; Psychology > Emotion and Motivation.
Affiliation(s)
- Jeffrey B Dennison
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- Daniel Sazhin
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA
- David V Smith
  - Department of Psychology, Temple University, Philadelphia, Pennsylvania, USA

38
Rmus M, Ritz H, Hunter LE, Bornstein AM, Shenhav A. Humans can navigate complex graph structures acquired during latent learning. Cognition 2022; 225:105103. [PMID: 35364400 PMCID: PMC9201735 DOI: 10.1016/j.cognition.2022.105103] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/08/2021] [Revised: 03/09/2022] [Accepted: 03/20/2022] [Indexed: 11/03/2022]
Abstract
Humans appear to represent many forms of knowledge in associative networks whose nodes are multiply connected, including sensory, spatial, and semantic. Recent work has shown that explicitly augmenting artificial agents with such graph-structured representations endows them with more human-like capabilities of compositionality and transfer learning. An open question is how humans acquire these representations. Previously, it has been shown that humans can learn to navigate graph-structured conceptual spaces on the basis of direct experience with trajectories that intentionally draw the network contours (Schapiro, Kustner, & Turk-Browne, 2012; Schapiro, Turk-Browne, Botvinick, & Norman, 2016), or through direct experience with rewards that covary with the underlying associative distance (Wu, Schulz, Speekenbrink, Nelson, & Meder, 2018). Here, we provide initial evidence that this capability is more general, extending to learning to reason about shortest-path distances across a graph structure acquired across disjoint experiences with randomized edges of the graph - a form of latent learning. In other words, we show that humans can infer graph structures, assembling them from disordered experiences. We further show that the degree to which individuals learn to reason correctly and with reference to the structure of the graph corresponds to their propensity, in a separate task, to use model-based reinforcement learning to achieve rewards. This connection suggests that the correct acquisition of graph-structured relationships is a central ability underlying forward planning and reasoning, and may be a core computation across the many domains in which graph-based reasoning is advantageous.
Affiliation(s)
- Milena Rmus
  - Department of Psychology, University of California, Berkeley, USA.
- Harrison Ritz
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA
- Aaron M Bornstein
  - Department of Cognitive Sciences, University of California, Irvine, USA; Center for the Neurobiology of Learning and Memory, University of California, Irvine, USA
- Amitai Shenhav
  - Department of Cognitive, Linguistic, and Psychological Sciences, Brown University, USA; Carney Institute for Brain Science, Brown University, USA

39
Grujic N, Brus J, Burdakov D, Polania R. Rational inattention in mice. Science Advances 2022; 8:eabj8935. [PMID: 35245128 PMCID: PMC8896787 DOI: 10.1126/sciadv.abj8935] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 05/08/2023]
Abstract
Behavior exhibited by humans and other organisms is generally inconsistent and biased and, thus, is often labeled irrational. However, the origins of this seemingly suboptimal behavior remain elusive. We developed a behavioral task and normative framework to reveal how organisms should allocate their limited processing resources such that sensory precision and its related metabolic investment are balanced to guarantee maximal utility. We found that mice act as rational inattentive agents by adaptively allocating their sensory resources in a way that maximizes reward consumption in previously unexperienced stimulus-reward association environments. Unexpectedly, perception of commonly occurring stimuli was relatively imprecise; however, this apparent statistical fallacy implies "awareness" and efficient adaptation to their neurocognitive limitations. Arousal systems carry reward distribution information of sensory signals, and distributional reinforcement learning mechanisms regulate sensory precision via top-down normalization. These findings reveal how organisms efficiently perceive and adapt to previously unexperienced environmental contexts within the constraints imposed by neurobiology.
Affiliation(s)
- Nikola Grujic
  - Institute for Neuroscience, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
  - Neuroscience Center Zürich, Zurich, Switzerland
- Jeroen Brus
  - Neuroscience Center Zürich, Zurich, Switzerland
  - Decision Neuroscience Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
- Denis Burdakov
  - Institute for Neuroscience, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
  - Neuroscience Center Zürich, Zurich, Switzerland
  - Corresponding author. (R.P.); (D.B.)
- Rafael Polania
  - Neuroscience Center Zürich, Zurich, Switzerland
  - Decision Neuroscience Lab, Department of Health Sciences and Technology, ETH Zurich, Zurich, Switzerland
  - Corresponding author. (R.P.); (D.B.)

40
Hartogsveld B, Quaedflieg CWEM, van Ruitenbeek P, Smeets T. Decreased putamen activation in balancing goal-directed and habitual behavior in binge eating disorder. Psychoneuroendocrinology 2022; 136:105596. [PMID: 34839081 DOI: 10.1016/j.psyneuen.2021.105596] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/22/2021] [Revised: 10/15/2021] [Accepted: 11/11/2021] [Indexed: 11/24/2022]
Abstract
Acute stress is associated with a shift from goal-directed to habitual behavior. This stress-induced preference for habitual behavior has been suggested as a potential mechanism by which binge eating disorder (BED) patients succumb to eating large amounts of high-caloric foods in an uncontrolled manner (i.e., binge episodes). While in healthy subjects the balance between goal-directed and habitual behavior is subserved by the anterior cingulate cortex (ACC), insular cortex, orbitofrontal cortex (OFC), anterior caudate nucleus, and posterior putamen, the brain mechanism that underlies this (possibly amplified) stress-induced behavioral shift in BED patients is currently unknown. In the current study, 76 participants (38 BED, 38 healthy controls (HCs)) learned six stimulus-response-outcome associations in a well-established instrumental learning task. Subsequently, three outcomes were selectively devalued, after which participants underwent either a stress induction procedure (Maastricht Acute Stress Test; MAST) or a no-stress control procedure. Next, the balance between goal-directed and habitual behavior was assessed during functional magnetic resonance imaging. Findings show that the balance between goal-directed and habitual behavior was associated with activity in the ACC, insula, and OFC in no-stress HCs. Although stress and BED did not modulate the balance between goal-directed and habitual behavior, BED participants displayed a smaller difference in putamen activation between trials probing goal-directed and habitual behavior compared with HCs when using a ROI approach. We conclude that putamen activity differences between BED and HC could reflect changes in monitoring of response accuracy or reward value, albeit perhaps not sufficiently to induce a measurable shift from goal-directed to habitual behavior. Future research could clarify potential boundary conditions of stress-induced shifts in instrumental behavior in BED patients.
Affiliation(s)
- B Hartogsveld
  - Department of Clinical Psychological Science, Faculty of Psychology and Neuroscience, Maastricht University, The Netherlands.
- C W E M Quaedflieg
  - Department of Neuropsychology & Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, The Netherlands
- P van Ruitenbeek
  - Department of Neuropsychology & Psychopharmacology, Faculty of Psychology and Neuroscience, Maastricht University, The Netherlands
- T Smeets
  - Department of Clinical Psychological Science, Faculty of Psychology and Neuroscience, Maastricht University, The Netherlands; CoRPS - Center of Research on Psychological and Somatic disorders, Department of Medical and Clinical Psychology, Tilburg School of Social and Behavioral Sciences, Tilburg University, The Netherlands

41
Averbeck B, O'Doherty JP. Reinforcement-learning in fronto-striatal circuits. Neuropsychopharmacology 2022; 47:147-162. [PMID: 34354249 PMCID: PMC8616931 DOI: 10.1038/s41386-021-01108-0] [Citation(s) in RCA: 43] [Impact Index Per Article: 21.5] [Received: 03/04/2021] [Revised: 07/06/2021] [Accepted: 07/09/2021] [Indexed: 01/03/2023]
Abstract
We review the current state of knowledge on the computational and neural mechanisms of reinforcement-learning with a particular focus on fronto-striatal circuits. We divide the literature in this area into five broad research themes: the target of the learning, whether it be learning about the value of stimuli or about the value of actions; the nature and complexity of the algorithm used to drive the learning and inference process; how learned values get converted into choices and associated actions; the nature of state representations, and of other cognitive machinery that support the implementation of various reinforcement-learning operations. An emerging fifth area focuses on how the brain allocates or arbitrates control over different reinforcement-learning sub-systems or "experts". We will outline what is known about the role of the prefrontal cortex and striatum in implementing each of these functions. We then conclude by arguing that it will be necessary to build bridges from algorithmic level descriptions of computational reinforcement-learning to implementational level models to better understand how reinforcement-learning emerges from multiple distributed neural networks in the brain.
Affiliation(s)
- John P O'Doherty
  - Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.

42
van Swieten MMH, Bogacz R, Manohar SG. Hunger improves reinforcement-driven but not planned action. Cognitive, Affective & Behavioral Neuroscience 2021; 21:1196-1206. [PMID: 34652602 PMCID: PMC8563670 DOI: 10.3758/s13415-021-00921-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 05/17/2021] [Indexed: 11/08/2022]
Abstract
Human decisions can be reflexive or planned, being governed respectively by model-free and model-based learning systems. These two systems might differ in their responsiveness to our needs. Hunger drives us to specifically seek food rewards, but here we ask whether it might have more general effects on these two decision systems. On one hand, the model-based system is often considered flexible and context-sensitive, and might therefore be modulated by metabolic needs. On the other hand, the model-free system's primitive reinforcement mechanisms may have closer ties to biological drives. Here, we tested participants on a well-established two-stage sequential decision-making task that dissociates the contribution of model-based and model-free control. Hunger enhanced overall performance by increasing model-free control, without affecting model-based control. These results demonstrate a generalized effect of hunger on decision-making that enhances reliance on primitive reinforcement learning, which in some situations translates into adaptive benefits.
Affiliation(s)
- Rafal Bogacz
  - Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, UK
- Sanjay G Manohar
  - Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford, UK.

43
Shahar N, Hauser TU, Moran R, Moutoussis M, Bullmore ET, Dolan RJ. Assigning the right credit to the wrong action: compulsivity in the general population is associated with augmented outcome-irrelevant value-based learning. Transl Psychiatry 2021; 11:564. [PMID: 34741013 PMCID: PMC8571313 DOI: 10.1038/s41398-021-01642-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2021] [Revised: 09/01/2021] [Accepted: 09/21/2021] [Indexed: 11/08/2022] Open
Abstract
Compulsive behavior is enacted under a belief that a specific act controls the likelihood of an undesired future event. Compulsive behaviors are widespread in the general population despite having no causal relationship with events they aspire to influence. In the current study, we tested whether there is an increased tendency to assign value to aspects of a task that do not predict an outcome (i.e., outcome-irrelevant learning) among individuals with compulsive tendencies. We studied 514 healthy individuals who completed self-report compulsivity, anxiety, depression, and schizotypal measurements, and a well-established reinforcement-learning task (i.e., the two-step task). As expected, we found a positive relationship between compulsivity and outcome-irrelevant learning. Specifically, individuals who reported having stronger compulsive tendencies (e.g., washing, checking, grooming) also tended to assign value to response keys and stimuli locations that did not predict an outcome. Controlling for overall goal-directed abilities and the co-occurrence of anxious, depressive, or schizotypal tendencies did not impact these associations. These findings indicate that outcome-irrelevant learning processes may contribute to the expression of compulsivity in a general population setting. We highlight the need for future research on the formation of non-veridical action-outcome associations as a factor related to the occurrence and maintenance of compulsive behavior.
Affiliation(s)
- Nitzan Shahar
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK.
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, UK.
  - Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel.
  - Psychology Department, Tel Aviv University, Tel Aviv, Israel.
- Tobias U Hauser
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, UK
- Rani Moran
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, UK
- Michael Moutoussis
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, UK
- Raymond J Dolan
  - Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London, WC1B 5EH, UK
  - Wellcome Centre for Human Neuroimaging, University College London, London, WC1N 3BG, UK

44
Bolenz F, Eppinger B. Valence bias in metacontrol of decision making in adolescents and young adults. Child Dev 2021; 93:e103-e116. [PMID: 34655226 DOI: 10.1111/cdev.13693] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
The development of metacontrol of decision making and its susceptibility to framing effects were investigated in a sample of 201 adolescents and adults in Germany (12-25 years, 111 female, ethnicity not recorded). In a task that dissociates model-free and model-based decision making, outcome magnitude and outcome valence were manipulated. Both adolescents and adults showed metacontrol and metacontrol tended to increase across adolescence. Furthermore, model-based decision making was more pronounced for loss compared to gain frames but there was no evidence that this framing effect differed with age. Thus, the strategic adaptation of decision making continues to develop into young adulthood and for both adolescents and adults, losses increase the motivation to invest cognitive resources into an effortful decision-making strategy.
Affiliation(s)
- Florian Bolenz
  - Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Center for Adaptive Rationality, Max Planck Institute for Human Development, Berlin, Germany; Cluster of Excellence "Science of Intelligence", Technische Universität Berlin, Berlin, Germany
- Ben Eppinger
  - Faculty of Psychology, Technische Universität Dresden, Dresden, Germany; Department of Psychology, Concordia University, Montreal, Quebec, Canada; PERFORM centre, Concordia University, Montreal, Quebec, Canada

45
Seow TXF, Benoit E, Dempsey C, Jennings M, Maxwell A, O'Connell R, Gillan CM. Model-Based Planning Deficits in Compulsivity Are Linked to Faulty Neural Representations of Task Structure. J Neurosci 2021; 41:6539-6550. [PMID: 34131033 PMCID: PMC8318073 DOI: 10.1523/jneurosci.0031-21.2021] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/06/2021] [Revised: 04/29/2021] [Accepted: 05/04/2021] [Indexed: 11/21/2022] Open
Abstract
Compulsive individuals have deficits in model-based planning, but the mechanisms that drive this have not been established. We examined two candidates: that compulsivity is linked to (1) an impaired model of the task environment and/or (2) an inability to engage cognitive control when making choices. To test this, 192 participants performed a two-step reinforcement learning task with concurrent EEG recordings, and we related the neural and behavioral data to their scores on a self-reported transdiagnostic dimension of compulsivity. To examine subjects' internal model of the task, we used established behavioral and neural responses to unexpected events [reaction time (RT) slowing, P300 wave, and parietal-occipital alpha band power] measured when an unexpected transition occurred. To assess cognitive control, we probed theta power at the time of initial choice. As expected, model-based planning was linked to greater behavioral (RT) and neural (alpha power, but not P300) sensitivity to rare transitions. Critically, the sensitivities of both RT and alpha to task structure were weaker in those high in compulsivity. This RT-compulsivity effect was tested and replicated in an independent pre-existing dataset (N = 1413). We also found that mid-frontal theta power at the time of choice was reduced in highly compulsive individuals though its relation to model-based planning was less pronounced. These data suggest that model-based planning deficits in compulsive individuals may arise, at least in part, from having an impaired representation of the environment, specifically how actions lead to future states.
SIGNIFICANCE STATEMENT: Compulsivity is linked to poorer performance on tasks that require model-based planning, but it is unclear what precise mechanisms underlie this deficit. Do compulsive individuals fail to engage cognitive control at the time of choice? Or do they have difficulty in building and maintaining an accurate representation of their environment, the foundation needed to behave in a goal-directed manner? With reaction time and EEG measures in 192 individuals who performed a two-step decision-making task, we found that compulsive individuals are less sensitive to surprising action-state transitions, where they slow down less and show less alpha band suppression following a rare transition. These findings implicate failures in maintaining an accurate model of the world in model-based planning deficits in compulsivity.
Affiliation(s)
- Tricia X F Seow
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Edith Benoit
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
- Caoimhe Dempsey
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
- Maeve Jennings
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
- Redmond O'Connell
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
- Claire M Gillan
  - Department of Psychology, Trinity College Dublin, Dublin 2, Ireland
  - Trinity College Institute of Neuroscience, Trinity College Dublin, Dublin 2, Ireland
  - Global Brain Health Institute, Trinity College Dublin, Dublin 2, Ireland

46
Ciria LF, Watson P, Vadillo MA, Luque D. Is the habit system altered in individuals with obesity? A systematic review. Neurosci Biobehav Rev 2021; 128:621-632. [PMID: 34252472 DOI: 10.1016/j.neubiorev.2021.07.006] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/15/2021] [Revised: 05/27/2021] [Accepted: 07/06/2021] [Indexed: 12/15/2022]
Abstract
Habit-like eating behavior is repeatedly pointed to as a key cognitive mechanism contributing to the emergence and maintenance of obesity. Here, we conducted a systematic review of the literature to assess the existent behavioral evidence for the Habit Hypothesis for Overeating (HHO) which states that obesity is the consequence of an imbalance between the habit and goal-directed reward learning systems, leading to overconsumption of food. We found a total of 19 studies implementing a variety of experimental protocols (i.e., free operant paradigm, slips-of-action test, two-step task, Pavlovian-to-Instrumental paradigm, probabilistic learning task) and manipulations. Taken together, the studies on clinical (binge eating disorder) and non-clinical individuals with overweight or obesity do not support the HHO conclusively. While the scientific literature on HHO is still in its infancy, the heterogeneity of the extant studies makes it difficult to evaluate the degree of convergence of these findings. Uncovering the role of reward learning systems in eating behaviors might have a transformative impact on public health.
Affiliation(s)
- Luis F Ciria
  - Departamento de Psicología Básica, Universidad Autónoma de Madrid, Spain; Departamento de Psicología Básica, Universidad de Málaga, Spain.
- Poppy Watson
  - School of Psychology, University of New South Wales, Sydney, Australia
- Miguel A Vadillo
  - Departamento de Psicología Básica, Universidad Autónoma de Madrid, Spain
- David Luque
  - Departamento de Psicología Básica, Universidad Autónoma de Madrid, Spain; Departamento de Psicología Básica, Universidad de Málaga, Spain.

47
Ito KL, Cao L, Reinberg R, Keller B, Monterosso J, Schweighofer N, Liew SL. Validating Habitual and Goal-Directed Decision-Making Performance Online in Healthy Older Adults. Front Aging Neurosci 2021; 13:702810. [PMID: 34267650 PMCID: PMC8276057 DOI: 10.3389/fnagi.2021.702810] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/29/2021] [Accepted: 06/04/2021] [Indexed: 12/02/2022] Open
Abstract
Everyday decision-making is supported by a dual-system of control comprised of parallel goal-directed and habitual systems. Over the past decade, the two-stage Markov decision task has become popularized for its ability to dissociate between goal-directed and habitual decision-making. While a handful of studies have implemented decision-making tasks online, only one study has validated the task by comparing in-person and web-based performance on the two-stage task in children and young adults. To date, no study has validated the dissociation of goal-directed and habitual behaviors in older adults online. Here, we implemented and validated a web-based version of the two-stage Markov task using parameter simulation and recovery and compared behavioral results from online and in-person participation on the two-stage task in both young and healthy older adults. We found no differences in estimated free parameters between online and in-person participation on the two-stage task. Further, we replicate previous findings that young adults are more goal-directed than older adults both in-person and online. Overall, this work demonstrates that the implementation and use of the two-stage Markov decision task for remote participation is feasible in the older adult demographic, which would allow for the study of decision-making with larger and more diverse samples.
Affiliation(s)
- Kaori L Ito
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
- Laura Cao
- Computational Neuro-Rehabilitation Laboratory, Department of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, United States
- Renee Reinberg
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
- Brenton Keller
- Department of Gerontology, University of Southern California, Los Angeles, CA, United States
- John Monterosso
- Department of Psychology, University of Southern California, Los Angeles, CA, United States
- Nicolas Schweighofer
- Computational Neuro-Rehabilitation Laboratory, Department of Biokinesiology and Physical Therapy, University of Southern California, Los Angeles, CA, United States
- Sook-Lei Liew
- Neural Plasticity and Neurorehabilitation Laboratory, Department of Occupational Science and Occupational Therapy, University of Southern California, Los Angeles, CA, United States
|
48
|
Moutoussis M, Garzón B, Neufeld S, Bach DR, Rigoli F, Goodyer I, Bullmore E, Guitart-Masip M, Dolan RJ. Decision-making ability, psychopathology, and brain connectivity. Neuron 2021; 109:2025-2040.e7. [PMID: 34019810 PMCID: PMC8221811 DOI: 10.1016/j.neuron.2021.04.019] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/12/2020] [Revised: 02/16/2021] [Accepted: 04/19/2021] [Indexed: 12/11/2022]
Abstract
Decision-making is a cognitive process of central importance for the quality of our lives. Here, we ask whether a common factor underpins our diverse decision-making abilities. We obtained 32 decision-making measures from 830 young people and identified a common factor that we call "decision acuity," which was distinct from IQ and reflected a generic decision-making ability. Decision acuity was decreased in those with aberrant thinking and low general social functioning. Crucially, decision acuity and IQ had dissociable brain signatures, in terms of their associated neural networks of resting-state functional connectivity. Decision acuity was reliably measured, and its relationship with functional connectivity was also stable when measured in the same individuals 18 months later. Thus, our behavioral and brain data identify a new cognitive construct that underpins decision-making ability across multiple domains. This construct may be important for understanding mental health, particularly regarding poor social function and aberrant thought patterns.
Affiliation(s)
- Michael Moutoussis
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK.
- Benjamín Garzón
- Aging Research Centre, Karolinska Institute, Stockholm, Sweden
- Sharon Neufeld
- Department of Psychiatry, University of Cambridge, Cambridge CB2 0SZ, UK
- Dominik R Bach
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK; Computational Psychiatry Research, Department of Psychiatry, Psychotherapy, and Psychosomatics, Psychiatric Hospital, University of Zurich, 8032 Zurich, Switzerland
- Ian Goodyer
- Department of Psychiatry, University of Cambridge, Cambridge CB2 0SZ, UK
- Edward Bullmore
- Department of Psychiatry, University of Cambridge, Cambridge CB2 0SZ, UK
- Marc Guitart-Masip
- Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK; Aging Research Centre, Karolinska Institute, Stockholm, Sweden
- Raymond J Dolan
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK; Max Planck University College London Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
|
49
|
Xu HA, Modirshanechi A, Lehmann MP, Gerstner W, Herzog MH. Novelty is not surprise: Human exploratory and adaptive behavior in sequential decision-making. PLoS Comput Biol 2021; 17:e1009070. [PMID: 34081705 PMCID: PMC8205159 DOI: 10.1371/journal.pcbi.1009070] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 01/27/2021] [Revised: 06/15/2021] [Accepted: 05/12/2021] [Indexed: 11/19/2022] Open
Abstract
Classic reinforcement learning (RL) theories cannot explain human behavior in the absence of external reward or when the environment changes. Here, we employ a deep sequential decision-making paradigm with sparse reward and abrupt environmental changes. To explain the behavior of human participants in these environments, we show that RL theories need to include surprise and novelty, each with a distinct role. While novelty drives exploration before the first encounter of a reward, surprise increases the rate of learning of a world-model as well as of model-free action-values. Even though the world-model is available for model-based RL, we find that human decisions are dominated by model-free action choices. The world-model is only marginally used for planning, but it is important to detect surprising events. Our theory predicts human action choices with high probability and allows us to dissociate surprise, novelty, and reward in EEG signals.
Affiliation(s)
- He A. Xu
- Laboratory of Psychophysics, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Alireza Modirshanechi
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Marco P. Lehmann
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Wulfram Gerstner
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- School of Computer and Communication Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Michael H. Herzog
- Laboratory of Psychophysics, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
- Brain-Mind Institute, School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
|
50
|
Chen H, Mojtahedzadeh N, Belanger MJ, Nebe S, Kuitunen-Paul S, Sebold M, Garbusow M, Huys QJM, Heinz A, Rapp MA, Smolka MN. Model-Based and Model-Free Control Predicts Alcohol Consumption Developmental Trajectory in Young Adults: A 3-Year Prospective Study. Biol Psychiatry 2021; 89:980-989. [PMID: 33771349 DOI: 10.1016/j.biopsych.2021.01.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 07/20/2020] [Revised: 12/21/2020] [Accepted: 01/17/2021] [Indexed: 11/15/2022]
Abstract
BACKGROUND: A shift from goal-directed toward habitual control has been associated with alcohol dependence. Whether such a shift predisposes to risky drinking is not yet clear. We investigated how goal-directed and habitual control at age 18 predict alcohol use trajectories over the course of 3 years.
METHODS: Goal-directed and habitual control, as informed by model-based (MB) and model-free (MF) learning, were assessed with a two-step sequential decision-making task during functional magnetic resonance imaging in 146 healthy 18-year-old men. Three-year alcohol use developmental trajectories were based on either a consumption score from the self-reported Alcohol Use Disorders Identification Test (assessed every 6 months) or an interview-based binge drinking score (grams of alcohol/occasion; assessed every year). We applied a latent growth curve model to examine how MB and MF control predicted the drinking trajectory.
RESULTS: Drinking behavior was best characterized by a linear trajectory. MB behavioral control was negatively associated with the development of the binge drinking score; MF reward prediction error blood oxygen level-dependent signals in the ventromedial prefrontal cortex and the ventral striatum predicted a higher starting point and steeper increase of the Alcohol Use Disorders Identification Test consumption score over time, respectively.
CONCLUSIONS: We found that MB behavioral control was associated with the binge drinking trajectory, while the MF reward prediction error signal was closely linked to the consumption score development. These findings support the idea that unbalanced MB and MF control might be an important individual vulnerability in predisposing to risky drinking behavior.
Affiliation(s)
- Hao Chen
- Department of Psychiatry, Technische Universität Dresden, Dresden, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, Germany
- Negin Mojtahedzadeh
- Department of Psychiatry, Technische Universität Dresden, Dresden, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, Germany
- Matthew J Belanger
- Department of Psychiatry, Technische Universität Dresden, Dresden, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, Germany
- Stephan Nebe
- Department of Psychiatry, Technische Universität Dresden, Dresden, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, Germany; Zurich Center for Neuroeconomics, Department of Economics, University of Zurich, Zurich, Switzerland
- Sören Kuitunen-Paul
- Institute of Clinical Psychology and Psychotherapy, Technische Universität Dresden, Dresden, Germany; Department of Child and Adolescent Psychiatry, Technische Universität Dresden, Dresden, Germany
- Miriam Sebold
- Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Maria Garbusow
- Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Quentin J M Huys
- Division of Psychiatry, University College London, London, United Kingdom; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, United Kingdom
- Andreas Heinz
- Department of Psychiatry and Psychotherapy, Campus Charité Mitte, Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Berlin, Germany
- Michael A Rapp
- Area of Excellence Cognitive Sciences, University of Potsdam, Potsdam, Germany
- Michael N Smolka
- Department of Psychiatry, Technische Universität Dresden, Dresden, Germany; Neuroimaging Center, Technische Universität Dresden, Dresden, Germany.
|