1
Chalmers E, Luczak A. A bio-inspired reinforcement learning model that accounts for fast adaptation after punishment. Neurobiol Learn Mem 2024; 215:107974. [PMID: 39209018 DOI: 10.1016/j.nlm.2024.107974] [Received: 12/22/2023] [Revised: 08/14/2024] [Accepted: 08/26/2024] [Indexed: 09/04/2024]
Abstract
Humans and animals can quickly learn a new strategy when a previously-rewarding strategy is punished. It is difficult to model this with reinforcement learning methods, because they tend to perseverate on previously-learned strategies - a hallmark of impaired response to punishment. Past work has addressed this by augmenting conventional reinforcement learning equations with ad hoc parameters or parallel learning systems. This produces reinforcement learning models that account for reversal learning, but are more abstract, complex, and somewhat detached from neural substrates. Here we use a different approach: we generalize a recently-discovered neuron-level learning rule, on the assumption that it captures a basic principle of learning that may occur at the whole-brain-level. Surprisingly, this gives a new reinforcement learning rule that accounts for adaptation and lose-shift behavior, and uses only the same parameters as conventional reinforcement learning equations. In the new rule, the normal reward prediction errors that drive reinforcement learning are scaled by the likelihood the agent assigns to the action that triggered a reward or punishment. The new rule demonstrates quick adaptation in card sorting and variable Iowa gambling tasks, and also exhibits a human-like paradox-of-choice effect. It will be useful for experimental researchers modeling learning and behavior.
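One possible reading of the rule described in this abstract can be sketched in Python. The abstract gives no equations, so everything below is an assumption for illustration: a tabular bandit agent, a softmax policy, and the hypothetical names `softmax`, `conventional_update`, and `likelihood_scaled_update`. The only idea taken from the abstract is that the reward prediction error is multiplied by the likelihood the agent assigned to the chosen action.

```python
import math


def softmax(values, temperature=1.0):
    """Turn action values into choice probabilities (assumed policy)."""
    highest = max(values)  # subtract the max for numerical stability
    exps = [math.exp((v - highest) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def conventional_update(q, action, reward, alpha=0.1):
    """Standard delta-rule update: Q[a] += alpha * (r - Q[a])."""
    delta = reward - q[action]  # reward prediction error
    updated = list(q)           # copy so the caller's list is not mutated
    updated[action] += alpha * delta
    return updated


def likelihood_scaled_update(q, action, reward, alpha=0.1, temperature=1.0):
    """Assumed form of the new rule: the prediction error is scaled by the
    probability the agent assigned to the action it took."""
    delta = reward - q[action]
    p_action = softmax(q, temperature)[action]
    updated = list(q)
    updated[action] += alpha * p_action * delta
    return updated
```

Under this sketch, a punished action that the agent strongly favoured receives a proportionally larger correction than a punished action it rarely chose. Whether this matches the published rule in detail would need to be checked against the paper itself.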
Affiliation(s)
- Eric Chalmers
- Department of Mathematics and Computing, Mount Royal University, 4825 Mt Royal Gate SW, Calgary, AB T3E 6K6, Canada.
- Artur Luczak
- Canadian Center for Behavioral Neuroscience, University of Lethbridge, 4401 University Dr W, Lethbridge, AB T1K 3M4, Canada.
2
Ngetich R, Villalba-García C, Soborun Y, Vékony T, Czakó A, Demetrovics Z, Németh D. Learning and memory processes in behavioural addiction: A systematic review. Neurosci Biobehav Rev 2024; 163:105747. [PMID: 38870547 DOI: 10.1016/j.neubiorev.2024.105747] [Received: 02/02/2024] [Revised: 05/28/2024] [Accepted: 06/01/2024] [Indexed: 06/15/2024]
Abstract
Similar to addictive substances, addictive behaviours such as gambling and gaming are associated with maladaptive modulation of key brain areas and functional networks implicated in learning and memory. Therefore, this review sought to understand how different learning and memory processes relate to behavioural addictions and to unravel their underlying neural mechanisms. Adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines, we systematically searched four databases (PsycINFO, PubMed, Scopus, and Web of Science) using the agreed-upon search string. Findings suggest altered executive function-dependent learning processes and enhanced habit learning in behavioural addiction. In contrast, the relationship between working memory and behavioural addiction is influenced by addiction type, working memory aspect, and task nature. Additionally, long-term memory is incoherent in individuals with addictive behaviours. Consistently, neurophysiological evidence indicates alterations in brain areas and networks implicated in learning and memory processes in behavioural addictions. Overall, the present review argues that, as in substance use disorders, alteration in learning and memory processes may underlie the development and maintenance of behavioural addictions.
Affiliation(s)
- Ronald Ngetich
- Centre of Excellence in Responsible Gaming, University of Gibraltar, Gibraltar, Gibraltar
- Yanisha Soborun
- Centre of Excellence in Responsible Gaming, University of Gibraltar, Gibraltar, Gibraltar
- Teodóra Vékony
- Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, INSERM, CNRS, Université Claude Bernard Lyon 1, Bron, France; Department of Education and Psychology, Faculty of Social Sciences, University of Atlántico Medio, Las Palmas de Gran Canaria, Spain
- Andrea Czakó
- Centre of Excellence in Responsible Gaming, University of Gibraltar, Gibraltar, Gibraltar; Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary
- Zsolt Demetrovics
- Centre of Excellence in Responsible Gaming, University of Gibraltar, Gibraltar, Gibraltar; Institute of Psychology, ELTE Eötvös Loránd University, Budapest, Hungary; College of Education, Psychology and Social Work, Flinders University, Adelaide, Australia.
- Dezső Németh
- Centre de Recherche en Neurosciences de Lyon CRNL U1028 UMR5292, INSERM, CNRS, Université Claude Bernard Lyon 1, Bron, France; Department of Education and Psychology, Faculty of Social Sciences, University of Atlántico Medio, Las Palmas de Gran Canaria, Spain; BML-NAP Research Group, Institute of Psychology, Eötvös Loránd University & Institute of Cognitive Neuroscience and Psychology, HUN-REN Research Centre for Natural Sciences, Budapest, Hungary
3
Zuo L, Ai K, Liu W, Qiu B, Tang R, Fu J, Yang P, Kong Z, Song H, Zhu X, Zhang X. Navigating Exploitative Traps: Unveiling the Uncontrollable Reward Seeking of Individuals With Internet Gaming Disorder. Biol Psychiatry Cogn Neurosci Neuroimaging 2024:S2451-9022(24)00138-1. [PMID: 38839035 DOI: 10.1016/j.bpsc.2024.05.005] [Received: 02/16/2024] [Revised: 04/17/2024] [Accepted: 05/19/2024] [Indexed: 06/07/2024]
Abstract
BACKGROUND Internet gaming disorder (IGD) involves an imbalance in the brain's dual system, characterized by heightened reward seeking and diminished cognitive control, which lead to decision-making challenges. The exploration-exploitation strategy is key to decision making, but how IGD affects this process is unclear. METHODS To investigate the impact of IGD on decision making, a modified version of the 2-armed bandit task was employed. Participants included 41 individuals with IGD and 44 healthy control individuals. The study assessed the strategies used by participants in the task, particularly focusing on the exploitation-exploration strategy. Additionally, functional magnetic resonance imaging was used to examine brain activation patterns during decision-making and estimation phases. RESULTS The study found that individuals with IGD demonstrated greater reliance on exploitative strategies in decision making due to their elevated value-seeking tendencies and decreased cognitive control. Individuals with IGD also displayed heightened activation in the presupplementary motor area and the ventral striatum compared with the healthy control group in both decision-making and estimation phases. Meanwhile, the prefrontal cortex showed more inhibition in individuals with IGD than in the healthy control group during exploitative strategies. This inhibition decreased as cognitive control diminished. CONCLUSIONS The imbalance in the development of the dual system in individuals with IGD may lead to an overreliance on exploitative strategies. This imbalance, marked by increased reward seeking and reduced cognitive control, contributes to difficulties in decision making and value-related behavioral processes in individuals with IGD.
Affiliation(s)
- Lin Zuo
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Kedan Ai
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Weili Liu
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Bensheng Qiu
- Centers for Biomedical Engineering, USTC, Anhui, China
- Rui Tang
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Jiaxin Fu
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Ping Yang
- Department of Psychology, School of Humanities & Social Science, USTC, Anhui, China
- Zhuo Kong
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China
- Hongwen Song
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China; Key Laboratory of Philosophy and Social Science of Anhui Province on Adolescent Mental Health and Crisis Intelligence Intervention, Anhui, China.
- Xiaoyu Zhu
- Department of Hematology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, USTC, Anhui, China.
- Xiaochu Zhang
- Hefei National Research Center for Physical Sciences at the Microscale, and Department of Hematology, the First Affiliated Hospital of USTC, Division of Life Science and Medicine, University of Science and Technology of China, Anhui, China; Department of Psychology, School of Humanities & Social Science, USTC, Anhui, China; Business School, Guizhou Education University, Guiyang, China; Institute of Health and Medicine, Hefei Comprehensive Science Center, Anhui, China.
4
Kang P, Tobler PN, Dayan P. Bayesian reinforcement learning: A basic overview. Neurobiol Learn Mem 2024; 211:107924. [PMID: 38579896 DOI: 10.1016/j.nlm.2024.107924] [Received: 11/20/2023] [Revised: 03/21/2024] [Accepted: 04/02/2024] [Indexed: 04/07/2024]
Abstract
We and other animals learn because there is some aspect of the world about which we are uncertain. This uncertainty arises from initial ignorance, and from changes in the world that we do not perfectly know; the uncertainty often becomes evident when our predictions about the world are found to be erroneous. The Rescorla-Wagner learning rule, which specifies one way that prediction errors can occasion learning, has been hugely influential as a characterization of Pavlovian conditioning and, through its equivalence to the delta rule in engineering, in a much wider class of learning problems. Here, we review the embedding of the Rescorla-Wagner rule in a Bayesian context that is precise about the link between uncertainty and learning, and thereby discuss extensions to such suggestions as the Kalman filter, structure learning, and beyond, that collectively encompass a wider range of uncertainties and accommodate a wider assortment of phenomena in conditioning.
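For orientation, the Rescorla-Wagner rule named in this abstract can be written as a short Python sketch. This is the textbook delta-rule form and not code from the review; the function name `rescorla_wagner_update`, the single shared `learning_rate`, and the list-based stimulus encoding are assumptions made for illustration.

```python
def rescorla_wagner_update(weights, stimuli, outcome, learning_rate=0.1):
    """One trial of the Rescorla-Wagner (delta) rule.

    weights       -- current associative strength of each stimulus
    stimuli       -- 1.0 if the stimulus is present on this trial, else 0.0
    outcome       -- the reinforcement actually received on this trial
    learning_rate -- a single shared rate (an assumption of this sketch)
    """
    # The prediction is the summed strength of all stimuli that are present.
    prediction = sum(w * x for w, x in zip(weights, stimuli))
    # Prediction error: what happened minus what was expected.
    error = outcome - prediction
    # Only stimuli that were present (x != 0) change their strength.
    return [w + learning_rate * error * x for w, x in zip(weights, stimuli)]
```

Repeatedly pairing one stimulus with a fixed outcome, for example calling the function in a loop with `stimuli=[1.0, 0.0]` and `outcome=1.0`, moves that stimulus's weight toward the outcome while the absent stimulus stays unchanged. The Bayesian treatments the review discusses, such as the Kalman filter, replace the fixed learning rate with one that depends on uncertainty; that extension is not shown here.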
Affiliation(s)
- Pyungwon Kang
- University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland.
- Philippe N Tobler
- University of Zurich, Department of Economics, Laboratory for Social and Neural Systems Research, Zurich, Switzerland.
- Peter Dayan
- Max Planck Institute for Biological Cybernetics, Tübingen, Germany; University of Tübingen, Tübingen, Germany.
5
Gilmour W, Mackenzie G, Feile M, Tayler-Grint L, Suveges S, Macfarlane JA, Macleod AD, Marshall V, Grunwald IQ, Steele JD, Gilbertson T. Impaired value-based decision-making in Parkinson's disease apathy. Brain 2024; 147:1362-1376. [PMID: 38305691 PMCID: PMC10994558 DOI: 10.1093/brain/awae025] [Received: 07/28/2023] [Revised: 12/07/2023] [Accepted: 01/13/2024] [Indexed: 02/03/2024]
Abstract
Apathy is a common and disabling complication of Parkinson's disease characterized by reduced goal-directed behaviour. Several studies have reported dysfunction within prefrontal cortical regions and projections from brainstem nuclei whose neuromodulators include dopamine, serotonin and noradrenaline. Work in animal and human neuroscience has confirmed contributions of these neuromodulators to aspects of motivated decision-making. Specifically, these neuromodulators have overlapping contributions to encoding the value of decisions, and influence whether to explore alternative courses of action or persist in an existing strategy to achieve a rewarding goal. Building upon this work, we hypothesized that apathy in Parkinson's disease should be associated with an impairment in value-based learning. Using a four-armed restless bandit reinforcement learning task, we studied decision-making in 75 volunteers: 53 patients with Parkinson's disease, with and without clinical apathy, and 22 age-matched healthy control subjects. Patients with apathy exhibited impaired ability to choose the highest value bandit. Task performance predicted an individual patient's apathy severity measured using the Lille Apathy Rating Scale (R = -0.46, P < 0.001). Computational modelling of the patient's choices confirmed the apathy group made decisions that were indifferent to the learnt value of the options, consistent with previous reports of reward insensitivity. Further analysis demonstrated a shift away from exploiting the highest value option and a reduction in perseveration, which also correlated with apathy scores (R = -0.5, P < 0.001). We went on to acquire functional MRI in 59 volunteers: a group of 19 patients with and 20 without apathy and 20 age-matched controls performing the Restless Bandit Task.
Analysis of the functional MRI signal at the point of reward feedback confirmed diminished signal within ventromedial prefrontal cortex in Parkinson's disease, which was more marked in apathy, but not predictive of their individual apathy severity. Using a model-based categorization of choice type, decisions to explore lower value bandits in the apathy group activated prefrontal cortex to a similar degree to the age-matched controls. In contrast, Parkinson's patients without apathy demonstrated significantly increased activation across a distributed thalamo-cortical network. Enhanced activity in the thalamus predicted individual apathy severity across both patient groups and exhibited functional connectivity with dorsal anterior cingulate cortex and anterior insula. Given that task performance in patients without apathy was no different to the age-matched control subjects, we interpret the recruitment of this network as a possible compensatory mechanism, which compensates against symptomatic manifestation of apathy in Parkinson's disease.
Affiliation(s)
- William Gilmour
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- Graeme Mackenzie
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- Mathias Feile
- Rehabilitation Psychiatry, Murray Royal Hospital, Perth PH2 7BH, UK
- Szabolcs Suveges
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Jennifer A Macfarlane
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Medical Physics, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
- SINAPSE, University of Glasgow, Imaging Centre of Excellence, Level 2, Queen Elizabeth University Hospital, Glasgow G51 4TF, Scotland, UK
- Angus D Macleod
- Institute of Applied Health Sciences, School of Medicine, University of Aberdeen, Foresterhill, Aberdeen AB24 2ZD, UK
- Department of Neurology, Aberdeen Royal Infirmary, Foresterhill, Aberdeen AB24 2ZD, UK
- Vicky Marshall
- Institute of Neurological Sciences, Queen Elizabeth University Hospital, Glasgow G51 4TF, UK
- Iris Q Grunwald
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- J Douglas Steele
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Tom Gilbertson
- Division of Imaging Science and Technology, Ninewells Hospital and Medical School, University of Dundee, Dundee DD1 9SY, UK
- Department of Neurology, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
6
Paunov A, L'Hôtellier M, Guo D, He Z, Yu A, Meyniel F. Multiple and subject-specific roles of uncertainty in reward-guided decision-making. bioRxiv 2024:2024.03.27.587016. [PMID: 38585958 PMCID: PMC10996615 DOI: 10.1101/2024.03.27.587016] [Indexed: 04/09/2024]
Abstract
Decision-making in noisy, changing, and partially observable environments entails a basic tradeoff between immediate reward and longer-term information gain, known as the exploration-exploitation dilemma. Computationally, an effective way to balance this tradeoff is by leveraging uncertainty to guide exploration. Yet, in humans, empirical findings are mixed, from suggesting uncertainty-seeking to indifference and avoidance. In a novel bandit task that better captures uncertainty-driven behavior, we find multiple roles for uncertainty in human choices. First, stable and psychologically meaningful individual differences in uncertainty preferences actually range from seeking to avoidance, which can manifest as null group-level effects. Second, uncertainty modulates the use of basic decision heuristics that imperfectly exploit immediate rewards: a repetition bias and win-stay-lose-shift heuristic. These heuristics interact with uncertainty, favoring heuristic choices under higher uncertainty. These results, highlighting the rich and varied structure of reward-based choice, are a step to understanding its functional basis and dysfunction in psychopathology.
Affiliation(s)
- Alexander Paunov
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Institut de Neuromodulation, GHU Paris, Psychiatrie et Neurosciences, Centre Hospitalier Sainte-Anne, Pôle Hospitalo-universitaire 15, Université Paris Cité, Paris, France
- Maëva L'Hôtellier
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Dalin Guo
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Zoe He
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Angela Yu
- Department of Cognitive Science, University of California San Diego, San Diego, CA, USA
- Centre for Cognitive Science & Hessian AI Center, Technical University of Darmstadt, Germany
- Florent Meyniel
- INSERM-CEA Cognitive Neuroimaging Unit (UNICOG), NeuroSpin Center, CEA Paris-Saclay, Gif-sur-Yvette, France; Université de Paris, Paris, France
- Institut de Neuromodulation, GHU Paris, Psychiatrie et Neurosciences, Centre Hospitalier Sainte-Anne, Pôle Hospitalo-universitaire 15, Université Paris Cité, Paris, France
7
Cheng Y, Magnard R, Langdon AJ, Lee D, Janak PH. Chronic Ethanol Exposure Produces Persistent Impairment in Cognitive Flexibility and Decision Signals in the Striatum. bioRxiv 2024:2024.03.10.584332. [PMID: 38585868 PMCID: PMC10996555 DOI: 10.1101/2024.03.10.584332] [Indexed: 04/09/2024]
Abstract
Lack of cognitive flexibility is a hallmark of substance use disorders and has been associated with drug-induced synaptic plasticity in the dorsomedial striatum (DMS). Yet the possible impact of altered plasticity on real-time striatal neural dynamics during decision-making is unclear. Here, we identified persistent impairments induced by chronic ethanol (EtOH) exposure on cognitive flexibility and striatal decision signals. After a substantial withdrawal period from prior EtOH vapor exposure, male, but not female, rats exhibited reduced adaptability and exploratory behavior during a dynamic decision-making task. Reinforcement learning models showed that prior EtOH exposure enhanced learning from rewards over omissions. Notably, neural signals in the DMS related to the decision outcome were enhanced, while those related to choice and choice-outcome conjunction were reduced, in EtOH-treated rats compared to the controls. These findings highlight the profound impact of chronic EtOH exposure on adaptive decision-making, pinpointing specific changes in striatal representations of actions and outcomes as underlying mechanisms for cognitive deficits.
Affiliation(s)
- Yifeng Cheng
- Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD
- Robin Magnard
- Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD
- Angela J. Langdon
- Intramural Research Program, National Institute of Mental Health, National Institutes of Health, Bethesda, MD
- Daeyeol Lee
- Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD
- Zanvyl Krieger Mind/Brain Institute, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD
- Patricia H. Janak
- Department of Psychological and Brain Sciences, Krieger School of Arts and Sciences, Johns Hopkins University, Baltimore, MD
- Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD
- Kavli Neuroscience Discovery Institute, Johns Hopkins University, Baltimore, MD
8
Hoven M, Luigjes J, van Holst RJ. Learning and metacognition under volatility in GD: Lower learning rates and distorted coupling between action and confidence. J Behav Addict 2024; 13:226-235. [PMID: 38340145 PMCID: PMC10988407 DOI: 10.1556/2006.2023.00082] [Received: 10/06/2023] [Revised: 12/22/2023] [Accepted: 12/22/2023] [Indexed: 02/12/2024]
Abstract
Background and aims Decisions and learning processes are under metacognitive control, where confidence in one's actions guides future behaviour. Indeed, studies have shown that being more confident results in less action updating and learning, and vice versa. This coupling between action and confidence can be disrupted, as has been found in individuals with high compulsivity symptoms. Patients with Gambling Disorder (GD) have been shown to exhibit both higher confidence and deficits in learning. Methods In this study, we tested the hypotheses that patients with GD display increased confidence, reduced action updating and lower learning rates. Additionally, we investigated whether the action-confidence coupling was distorted in patients with GD. To address this, 27 patients with GD and 30 control participants performed a predictive inference task designed to assess action and confidence dynamics during learning under volatility. Action-updating, confidence and their coupling were assessed and computational modeling estimated parameters for learning rates, error sensitivity, and sensitivity to environmental changes. Results Contrary to our expectations, results revealed no significant group differences in action updating or confidence levels. Nevertheless, GD patients exhibited a weakened coupling between confidence and action, as well as lower learning rates. Discussion and conclusions This suggests that patients with GD may underutilize confidence when steering future behavioral choices. Ultimately, these findings point to a disruption of metacognitive control in GD, without a general overconfidence bias in neutral, non-incentivized volatile learning contexts.
Affiliation(s)
- Monja Hoven
- Department of Psychiatry, Amsterdam UMC – University of Amsterdam, Amsterdam, The Netherlands
- Judy Luigjes
- Department of Psychiatry, Amsterdam UMC – University of Amsterdam, Amsterdam, The Netherlands
- Ruth J. van Holst
- Department of Psychiatry, Amsterdam UMC – University of Amsterdam, Amsterdam, The Netherlands
- Centre for Urban Mental Health, University of Amsterdam, Amsterdam, The Netherlands
9
Wiehler A, Peters J. Decomposition of Reinforcement Learning Deficits in Disordered Gambling via Drift Diffusion Modeling and Functional Magnetic Resonance Imaging. Comput Psychiatr 2024; 8:23-45. [PMID: 38774428 PMCID: PMC11104325 DOI: 10.5334/cpsy.104] [Received: 10/07/2023] [Accepted: 03/07/2024] [Indexed: 05/24/2024]
Abstract
Gambling disorder is associated with deficits in reward-based learning, but the underlying computational mechanisms are still poorly understood. Here, we examined this issue using a stationary reinforcement learning task in combination with computational modeling and functional magnetic resonance imaging (fMRI) in individuals who regularly participate in gambling (n = 23, seven fulfilled one to three DSM 5 criteria for gambling disorder, sixteen fulfilled four or more) and matched controls (n = 23). As predicted, the gambling group exhibited substantially reduced accuracy, whereas overall response times (RTs) were not reliably different between groups. We then used comprehensive modeling using reinforcement learning drift diffusion models (RLDDMs) in combination with hierarchical Bayesian parameter estimation to shed light on the computational underpinnings of this performance deficit. In both groups, an RLDDM in which both non-decision time and decision threshold (boundary separation) changed over the course of the experiment accounted for the data best. The model showed good parameter and model recovery, and posterior predictive checks revealed that, in both groups, the model accurately reproduced the evolution of accuracies and RTs over time. Modeling revealed that, compared to controls, the learning impairment in the gambling group was linked to a more rapid reduction in decision thresholds over time, and a reduced impact of value-differences on the drift rate. The gambling group also showed shorter non-decision times. fMRI analyses replicated effects of prediction error coding in the ventral striatum and value coding in the ventro-medial prefrontal cortex, but there was no credible evidence for group differences in these effects. Taken together, our findings show that reinforcement learning impairments in disordered gambling are linked to both maladaptive decision threshold adjustments and a reduced consideration of option values in the choice process.
Affiliation(s)
- Antonius Wiehler
- Department of Systems Neuroscience, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Institut du Cerveau et de la Moelle épinière (ICM), INSERM U 1127, CNRS UMR 7225, Sorbonne Universités Paris, France
- Jan Peters
- Department of Systems Neuroscience, University Medical Centre Hamburg-Eppendorf, Hamburg, Germany
- Department of Psychology, Biological Psychology, University of Cologne, Cologne, Germany
10
Sazhin D, Dachs A, Smith DV. Meta-Analysis Reveals That Explore-Exploit Decisions are Dissociable by Activation in the Dorsal Lateral Prefrontal Cortex and the Anterior Cingulate Cortex. bioRxiv 2024:2023.10.21.563317. [PMID: 37961286 PMCID: PMC10634720 DOI: 10.1101/2023.10.21.563317] [Indexed: 11/15/2023]
Abstract
Explore-exploit research has challenges in generalizability due to a limited theoretical basis of exploration and exploitation. Neuroimaging can help identify whether explore-exploit decisions use an opponent processing system to address this issue. Thus, we conducted a coordinate-based meta-analysis (N=23 studies) where we found activation in the dorsal lateral prefrontal cortex and anterior cingulate cortex during exploration versus exploitation, providing some evidence for opponent processing. However, the conjunction of explore-exploit decisions was associated with activation in the dorsal anterior cingulate cortex, dorsal medial prefrontal cortex, and anterior insula, suggesting that these brain regions do not engage in opponent processing. Further, exploratory analyses revealed heterogeneity in brain responses between task types during exploration and exploitation, respectively. Coupled with results suggesting that activation in exploration and exploitation decisions is generally more similar than it is different, this heterogeneity suggests that significant challenges remain in characterizing explore-exploit decision making. Nonetheless, dlPFC and ACC activation differentiate explore and exploit decisions, and identifying these responses can help in targeted interventions aimed at manipulating these decisions.
11
Mizell JM, Wang S, Frisvold A, Alvarado L, Farrell-Skupny A, Keung W, Phelps CE, Sundman MH, Franchetti MK, Chou YH, Alexander GE, Wilson RC. Differential impacts of healthy cognitive aging on directed and random exploration. Psychol Aging 2024; 39:88-101. [PMID: 38358695 PMCID: PMC10871551 DOI: 10.1037/pag0000791] [Indexed: 02/16/2024]
Abstract
Deciding whether to explore unknown opportunities or exploit well-known options is a ubiquitous part of our everyday lives. Extensive work in college students suggests that young people make explore-exploit decisions using a mixture of information seeking and random behavioral variability. Whether, and to what extent, older adults use the same strategies is unknown. To address this question, 51 older adults (ages 65-74) and 32 younger adults (ages 18-25) completed the Horizon Task, a gambling task that quantifies information seeking and behavioral variability as well as how these strategies are controlled for the purposes of exploration. Qualitatively, we found that older adults performed similarly to younger adults on this task, increasing both their information seeking and behavioral variability when it was adaptive to explore. Quantitatively, however, there were substantial differences between the age groups, with older adults showing less information seeking overall and less reliance on variability as a means to explore. In addition, we found a subset of approximately 26% of older adults whose information seeking was close to zero, avoiding informative options even when they were clearly the better choice. Unsurprisingly, these "information avoiders" performed worse on the task. In contrast, task performance in the remaining "information seeking" older adults was comparable to that of younger adults, suggesting that age-related differences in explore-exploit decision making may be adaptive except when they are taken to extremes. (PsycInfo Database Record (c) 2024 APA, all rights reserved).
Affiliation(s)
- Siyu Wang
- University of Arizona, Department of Psychology
12
Shen Q, Fu S, Jiang X, Huang X, Lin D, Xiao Q, Khadijah S, Yan Y, Xiong X, Jin J, Ebstein RP, Xu T, Wang Y, Feng J. Factual and counterfactual learning in major adolescent depressive disorder, evidence from an instrumental learning study. Psychol Med 2024; 54:256-266. [PMID: 37161677 DOI: 10.1017/s0033291723001307] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
BACKGROUND The incidence of adolescent depressive disorder has risen sharply worldwide in recent decades, yet the causes and the decision deficits that depression incurs have yet to be well examined. With an instrumental learning task, the aim of the current study is to investigate the extent to which learning behavior deviates from that observed in healthy adolescent controls and to track the underlying mechanistic channel for such a deviation. METHODS We recruited a group of adolescents with major depression and age-matched healthy control subjects to carry out the learning task with either gain or loss outcomes and applied a reinforcement learning model that dissociates valence (positive v. negative) of reward prediction error and selection (chosen v. unchosen). RESULTS The results demonstrated that adolescent depressive patients performed significantly worse than the control group. Learning rates suggested that the optimistic bias that overall characterizes healthy adolescent subjects was absent for the depressive adolescent patients. Moreover, depressed adolescents exhibited an increased pessimistic bias for the counterfactual outcome. Lastly, individual difference analysis suggested that these observed biases, which significantly deviated from those observed in normal controls, were linked with the severity of depressive symptoms as measured by HAMD scores. CONCLUSIONS By leveraging an incentivized instrumental learning task with computational modeling within a reinforcement learning framework, the current study reveals a mechanistic decision-making deficit in adolescent depressive disorder. These findings, which have implications for the identification of behavioral markers in depression, could support the clinical evaluation, including both diagnosis and prognosis, of this disorder.
Affiliation(s)
- Qiang Shen
- Shanghai Key Laboratory of Brain-Machine Intelligence for Information Behavior (Ministry of Education), 201620, Shanghai, China
- School of Business and Management, Shanghai International Studies University, 201620, Shanghai, China
- Joint Lab of Finance and Business Intelligence, Guangdong Institute of Intelligence Science and Technology, 519031, Zhuhai, China
- Shiguang Fu
- Shanghai Key Laboratory of Brain-Machine Intelligence for Information Behavior (Ministry of Education), 201620, Shanghai, China
- School of Business and Management, Shanghai International Studies University, 201620, Shanghai, China
- Joint Lab of Finance and Business Intelligence, Guangdong Institute of Intelligence Science and Technology, 519031, Zhuhai, China
- Xiaoying Jiang
- Hangzhou Mental Health Center of Children and Adolescents, Hangzhou Seventh People's Hospital, 310006, Hangzhou, China
- Xiaoyu Huang
- Hangzhou Mental Health Center of Children and Adolescents, Hangzhou Seventh People's Hospital, 310006, Hangzhou, China
- Doudou Lin
- School of Management, Zhejiang University of Technology, 310023, Hangzhou, China
- Qingyan Xiao
- Shanghai Key Laboratory of Brain-Machine Intelligence for Information Behavior (Ministry of Education), 201620, Shanghai, China
- School of Business and Management, Shanghai International Studies University, 201620, Shanghai, China
- Joint Lab of Finance and Business Intelligence, Guangdong Institute of Intelligence Science and Technology, 519031, Zhuhai, China
- Sitti Khadijah
- School of Management, Zhejiang University of Technology, 310023, Hangzhou, China
- Yaping Yan
- Department of Neurology, The Second Affiliated Hospital of Zhejiang University, 310009, Hangzhou, China
- Xiaoxing Xiong
- Department of Neurosurgery, Renmin Hospital of Wuhan University, 430060, Wuhan, China
- Jia Jin
- Shanghai Key Laboratory of Brain-Machine Intelligence for Information Behavior (Ministry of Education), 201620, Shanghai, China
- School of Business and Management, Shanghai International Studies University, 201620, Shanghai, China
- Joint Lab of Finance and Business Intelligence, Guangdong Institute of Intelligence Science and Technology, 519031, Zhuhai, China
- Richard P Ebstein
- China Center for Behavioral Economics and Finance, Southwestern University of Finance & Economics, 611130, Chengdu, China
- Ting Xu
- School of Business, University of Ningbo, 315210, Ningbo, China
- Yiquan Wang
- Hangzhou Mental Health Center of Children and Adolescents, Hangzhou Seventh People's Hospital, 310006, Hangzhou, China
- Jun Feng
- School of Economics, Hefei University of Technology, 230601, Hefei, China
13
Soussi C, Berthoz S, Chirokoff V, Chanraud S. Interindividual Brain and Behavior Differences in Adaptation to Unexpected Uncertainty. Biology 2023; 12:1323. [PMID: 37887033 PMCID: PMC10604029 DOI: 10.3390/biology12101323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2023] [Revised: 09/25/2023] [Accepted: 10/03/2023] [Indexed: 10/28/2023]
Abstract
To adapt to a new environment, individuals must alternate between exploiting previously learned "action-consequence" combinations and exploring new actions for which the consequences are unknown: they face an exploration/exploitation trade-off. The neural substrates of these behaviors and the factors that may relate to the interindividual variability in their expression remain overlooked, in particular when considering neural connectivity patterns. Here, to trigger environmental uncertainty, false feedback was introduced in the second phase of an associative learning task. Indices reflecting exploitation and cost of uncertainty were computed. Changes in the intrinsic connectivity were determined using resting-state functional connectivity (rFC) analyses before and after performing the "cheated" phase of the task in the MRI. We explored their links with behavioral and psychological factors. Dispersion in the participants' cost of uncertainty was used to categorize two groups. These groups showed different patterns of rFC changes. Moreover, in the overall sample, exploitation was correlated with rFC changes between (1) the anterior cingulate cortex and the cerebellum region 3, and (2) the left frontal inferior gyrus (orbital part) and the right frontal inferior gyrus (triangular part). Anxiety and doubt about action propensity were weakly correlated with some rFC changes. These results demonstrate that the exploration/exploitation trade-off involves the modulation of cortico-cerebellar intrinsic connectivity.
Affiliation(s)
- Célia Soussi
- INCIA CNRS 5287, University of Bordeaux, 33076 Bordeaux, France
- UNICAEN, INSERM, U1237, PhIND “Physiopathology and Imaging of Neurological Disorders”, NeuroPresage Team, Cyceron, Normandy University, 14000 Caen, France
- Sylvie Berthoz
- INCIA CNRS 5287, University of Bordeaux, 33076 Bordeaux, France
- Department of Psychiatry for Adolescents and Young Adults, Institut Mutualiste Montsouris, 75014 Paris, France
- Valentine Chirokoff
- INCIA CNRS 5287, University of Bordeaux, 33076 Bordeaux, France
- Ecole Pratique des Hautes Etudes, Section of Life and Earth Sciences, PSL Research University, 75014 Paris, France
- Sandra Chanraud
- INCIA CNRS 5287, University of Bordeaux, 33076 Bordeaux, France
- Ecole Pratique des Hautes Etudes, Section of Life and Earth Sciences, PSL Research University, 75014 Paris, France
14
Sinclair AH, Wang YC, Adcock RA. Instructed motivational states bias reinforcement learning and memory formation. Proc Natl Acad Sci U S A 2023; 120:e2304881120. [PMID: 37490530 PMCID: PMC10401012 DOI: 10.1073/pnas.2304881120] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/06/2023] [Accepted: 06/19/2023] [Indexed: 07/27/2023] Open
Abstract
Motivation influences goals, decisions, and memory formation. Imperative motivation links urgent goals to actions, narrowing the focus of attention and memory. Conversely, interrogative motivation integrates goals over time and space, supporting rich memory encoding for flexible future use. We manipulated motivational states via cover stories for a reinforcement learning task: The imperative group imagined executing a museum heist, whereas the interrogative group imagined planning a future heist. Participants repeatedly chose among four doors, representing different museum rooms, to sample trial-unique paintings with variable rewards (later converted to bonus payments). The next day, participants performed a surprise memory test. Crucially, only the cover stories differed between the imperative and interrogative groups; the reinforcement learning task was identical, and all participants had the same expectations about how and when bonus payments would be awarded. In an initial sample and a preregistered replication, we demonstrated that imperative motivation increased exploitation during reinforcement learning. Conversely, interrogative motivation increased directed (but not random) exploration, despite the cost to participants' earnings. At test, the interrogative group was more accurate at recognizing paintings and recalling associated values. In the interrogative group, higher value paintings were more likely to be remembered; imperative motivation disrupted this effect of reward modulating memory. Overall, we demonstrate that a prelearning motivational manipulation can bias learning and memory, bearing implications for education, behavior change, clinical interventions, and communication.
Affiliation(s)
- Alyssa H. Sinclair
- Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
- Yuxi C. Wang
- Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
- R. Alison Adcock
- Department of Psychology & Neuroscience, Duke University, Durham, NC 27710
- Department of Psychiatry & Behavioral Sciences, Duke University, Durham, NC 27710
15
Kato A, Shimomura K, Ognibene D, Parvaz MA, Berner LA, Morita K, Fiore VG. Computational models of behavioral addictions: State of the art and future directions. Addict Behav 2023; 140:107595. [PMID: 36621045 DOI: 10.1016/j.addbeh.2022.107595] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/29/2022] [Revised: 11/23/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
Non-pharmacological behavioral addictions, such as pathological gambling, videogaming, social networking, or internet use, are becoming major public health concerns. It is not yet clear how behavioral addictions could share many major neurobiological and behavioral characteristics with substance use disorders, despite the absence of direct pharmacological influences. A deeper understanding of the neurocognitive mechanisms of addictive behavior is needed, and computational modeling could be one promising approach to explain intricately entwined cognitive and neural dynamics. This review describes computational models of addiction based on reinforcement learning algorithms, Bayesian inference, and biophysical neural simulations. We discuss whether computational frameworks originally conceived to explain maladaptive behavior in substance use disorders can be effectively extended to non-substance-related behavioral addictions. Moreover, we introduce recent studies on behavioral addictions that exemplify the possibility of such extension and propose future directions.
Affiliation(s)
- Ayaka Kato
- RIKEN Center for Brain Science, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan; Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan
- Kanji Shimomura
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan
- Dimitri Ognibene
- Department of Psychology, Università degli Studi Milano-Bicocca, Milan, Italy; School of Computer Science and Electronic Engineering, University of Essex, Colchester, UK
- Muhammad A Parvaz
- Departments of Psychiatry and Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Laura A Berner
- Center of Excellence in Eating and Weight Disorders, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Center for Computational Psychiatry, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
- Kenji Morita
- Physical and Health Education, Graduate School of Education, The University of Tokyo, Tokyo 113-0033, Japan; International Research Center for Neurointelligence (WPI-IRCN), The University of Tokyo, Tokyo 113-0033, Japan
- Vincenzo G Fiore
- Center for Computational Psychiatry, Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
16
Hales CA, Clark L, Winstanley CA. Computational approaches to modeling gambling behaviour: Opportunities for understanding disordered gambling. Neurosci Biobehav Rev 2023; 147:105083. [PMID: 36758827 DOI: 10.1016/j.neubiorev.2023.105083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 01/05/2023] [Accepted: 02/06/2023] [Indexed: 02/10/2023]
Abstract
Computational modeling has become an important tool in neuroscience and psychiatry research to provide insight into the cognitive processes underlying normal and pathological behavior. There are two modeling frameworks, reinforcement learning (RL) and drift diffusion modeling (DDM), that are well-developed in cognitive science, and have begun to be applied to Gambling Disorder. RL models focus on explaining how an agent uses reward to learn about the environment and make decisions based on outcomes. The DDM is a binary choice framework that breaks down decision making into psychologically meaningful components based on choice reaction time analyses. Both approaches have begun to yield insight into aspects of cognition that are important for, but not unique to, gambling, and thus relevant to the development of Gambling Disorder. However, these approaches also oversimplify or neglect various aspects of decision making seen in real-world gambling behavior. Gambling Disorder presents an opportunity for 'bespoke' modeling approaches to consider these neglected components. In this review, we discuss studies that have used RL and DDM frameworks to investigate some of the key cognitive components in gambling and Gambling Disorder. We also include an overview of Bayesian models, a methodology that could be useful for more tailored modeling approaches. We highlight areas in which computational modeling could enable progression in the investigation of the cognitive mechanisms relevant to gambling.
Affiliation(s)
- C A Hales
- Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada; Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada.
- L Clark
- Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada; Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada
- C A Winstanley
- Djavad Mowafaghian Centre for Brain Health, University of British Columbia, Vancouver, British Columbia, Canada; Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada
17
Speers LJ, Bilkey DK. Maladaptive explore/exploit trade-offs in schizophrenia. Trends Neurosci 2023; 46:341-354. [PMID: 36878821 DOI: 10.1016/j.tins.2023.02.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 11/09/2022] [Revised: 01/30/2023] [Accepted: 02/08/2023] [Indexed: 03/07/2023]
Abstract
Schizophrenia is a complex disorder that remains poorly understood, particularly at the systems level. In this opinion article we argue that the explore/exploit trade-off concept provides a holistic and ecologically valid framework to resolve some of the apparent paradoxes that have emerged within schizophrenia research. We review recent evidence suggesting that fundamental explore/exploit behaviors may be maladaptive in schizophrenia during physical, visual, and cognitive foraging. We also describe how theories from the broader optimal foraging literature, such as the marginal value theorem (MVT), could provide valuable insight into how aberrant processing of reward, context, and cost/effort evaluations interact to produce maladaptive responses.
Affiliation(s)
- Lucinda J Speers
- Department of Psychology, University of Otago, Dunedin 9016, New Zealand
- David K Bilkey
- Department of Psychology, University of Otago, Dunedin 9016, New Zealand.
18
Suzuki S, Zhang X, Dezfouli A, Braganza L, Fulcher BD, Parkes L, Fontenelle LF, Harrison BJ, Murawski C, Yücel M, Suo C. Individuals with problem gambling and obsessive-compulsive disorder learn through distinct reinforcement mechanisms. PLoS Biol 2023; 21:e3002031. [PMID: 36917567 PMCID: PMC10013903 DOI: 10.1371/journal.pbio.3002031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 02/08/2023] [Indexed: 03/16/2023] Open
Abstract
Obsessive-compulsive disorder (OCD) and pathological gambling (PG) are accompanied by deficits in behavioural flexibility. In reinforcement learning, this inflexibility can reflect asymmetric learning from outcomes above and below expectations. In alternative frameworks, it reflects perseveration independent of learning. Here, we examine evidence for asymmetric reward-learning in OCD and PG by leveraging model-based functional magnetic resonance imaging (fMRI). Compared with healthy controls (HC), OCD patients exhibited a lower learning rate for worse-than-expected outcomes, which was associated with the attenuated encoding of negative reward prediction errors in the dorsomedial prefrontal cortex and the dorsal striatum. PG patients showed higher and lower learning rates for better- and worse-than-expected outcomes, respectively, accompanied by higher encoding of positive reward prediction errors in the anterior insula than HC. Perseveration did not differ considerably between the patient groups and HC. These findings elucidate the neural computations of reward-learning that are altered in OCD and PG, providing a potential account of behavioural inflexibility in those mental disorders.
Affiliation(s)
- Shinsuke Suzuki
- Centre for Brain, Mind and Markets, The University of Melbourne, Carlton, Australia
- Center for the Promotion of Social Data Science Education and Research, Hitotsubashi University, Tokyo, Japan
- Xiaoliu Zhang
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Amir Dezfouli
- Data61, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Sydney, Australia
- Leah Braganza
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Ben D. Fulcher
- School of Physics, The University of Sydney, Sydney, Australia
- Linden Parkes
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Department of Bioengineering, School of Engineering & Applied Science, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Leonardo F. Fontenelle
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Ben J. Harrison
- Melbourne Neuropsychiatry Centre, Department of Psychiatry, The University of Melbourne, Carlton, Australia
- Carsten Murawski
- Centre for Brain, Mind and Markets, The University of Melbourne, Carlton, Australia
- Murat Yücel
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
- Chao Suo
- BrainPark, Turner Institute for Brain and Mental Health, School of Psychological Sciences, and Monash Biomedical Imaging Facility, Monash University, Clayton, Australia
19
Disentangling the roles of dopamine and noradrenaline in the exploration-exploitation tradeoff during human decision-making. Neuropsychopharmacology 2022; 48:1078-1086. [PMID: 36522404 PMCID: PMC10209107 DOI: 10.1038/s41386-022-01517-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/28/2022] [Revised: 11/29/2022] [Accepted: 11/30/2022] [Indexed: 12/23/2022]
Abstract
Balancing the exploration of new options and the exploitation of known options is a fundamental challenge in decision-making, yet the mechanisms involved in this balance are not fully understood. Here, we aimed to elucidate the distinct roles of dopamine and noradrenaline in the exploration-exploitation tradeoff during human choice. To this end, we used a double-blind, placebo-controlled design in which participants received either a placebo, 400 mg of the D2/D3 receptor antagonist amisulpride, or 40 mg of the β-adrenergic receptor antagonist propranolol before they completed a virtual patch-foraging task probing exploration and exploitation. We systematically varied the rewards associated with choice options, the rate by which rewards decreased over time, and the opportunity costs it took to switch to the next option to disentangle the contributions of dopamine and noradrenaline to specific choice aspects. Our data show that amisulpride increased the sensitivity to all of these three critical choice features, whereas propranolol was associated with a reduced tendency to use value information. Our findings provide novel insights into the specific roles of dopamine and noradrenaline in the regulation of human choice behavior, suggesting a critical involvement of dopamine in directed exploration and a role of noradrenaline in more random exploration.
20
Dezza IC, Noel X, Cleeremans A, Yu AJ. Distinct motivations to seek out information in healthy individuals and problem gamblers. Transl Psychiatry 2021; 11:408. [PMID: 34312367 PMCID: PMC8313706 DOI: 10.1038/s41398-021-01523-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/04/2021] [Revised: 06/04/2021] [Accepted: 06/28/2021] [Indexed: 02/07/2023] Open
Abstract
As massive amounts of information are becoming available to people, understanding the mechanisms underlying information-seeking is more pertinent today than ever. In this study, we investigate the underlying motivations to seek out information in healthy and addicted individuals. We developed a novel decision-making task and a novel computational model which allow dissociating the relative contributions of two motivating factors to seek out information: a desire for novelty and a general desire for knowledge. To investigate whether and how the motivations to seek out information vary between healthy and addicted individuals, in addition to healthy controls we included a sample of individuals with gambling disorder, a form of addiction without the confound of substance consumption and characterized by compulsive gambling. Our results indicate that healthy subjects and problem gamblers adopt distinct information-seeking "modes". Healthy information-seeking behavior was mostly motivated by a desire for novelty. Problem gamblers, on the contrary, displayed reduced novelty-seeking and an increased desire for accumulating knowledge compared to healthy controls. Our findings not only shed new light on the motivations driving healthy and addicted individuals to seek out information, but they also have important implications for the treatment and diagnosis of behavioral addiction.
Affiliation(s)
- Irene Cogliati Dezza
- Centre for Research in Cognition and Neurosciences, ULB Neuroscience Institute, Université Libre de Bruxelles, Bruxelles, Belgium
- Department of Experimental Psychology, Faculty of Brain Sciences, University College London, London, UK
- The Max Planck UCL Centre for Computational Psychiatry and Ageing Research, University College London, London, UK
- Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Xavier Noel
- Faculty of Medicine, Université Libre de Bruxelles, Bruxelles, Belgium
- Axel Cleeremans
- Centre for Research in Cognition and Neurosciences, ULB Neuroscience Institute, Université Libre de Bruxelles, Bruxelles, Belgium
- Angela J. Yu
- Department of Cognitive Science, University of California San Diego, San Diego, USA