1. Collomb-Clerc A, Gueguen MCM, Minotti L, Kahane P, Navarro V, Bartolomei F, Carron R, Regis J, Chabardès S, Palminteri S, Bastin J. Human thalamic low-frequency oscillations correlate with expected value and outcomes during reinforcement learning. Nat Commun 2023; 14:6534. [PMID: 37848435 PMCID: PMC10582006 DOI: 10.1038/s41467-023-42380-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/06/2022] [Accepted: 10/09/2023] [Indexed: 10/19/2023]
Abstract
Reinforcement-based adaptive decision-making is believed to recruit fronto-striatal circuits. A critical node of the fronto-striatal circuit is the thalamus. However, direct evidence of its involvement in human reinforcement learning is lacking. We address this gap by analyzing intra-thalamic electrophysiological recordings from eight participants while they performed a reinforcement learning task. We found that in both the anterior thalamus (ATN) and dorsomedial thalamus (DMTN), low frequency oscillations (LFO, 4-12 Hz) correlated positively with expected value estimated from computational modeling during reward-based learning (after outcome delivery) or punishment-based learning (during the choice process). Furthermore, LFO recorded from ATN/DMTN were also negatively correlated with outcomes so that both components of reward prediction errors were signaled in the human thalamus. The observed differences in the prediction signals between rewarding and punishing conditions shed light on the neural mechanisms underlying action inhibition in punishment avoidance learning. Our results provide insight into the role of thalamus in reinforcement-based decision-making in humans.
Affiliation(s)
- Antoine Collomb-Clerc
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France
- Maëlle C M Gueguen
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France
  - Department of Psychiatry, Brain Health Institute and University Behavioral Health Care, Rutgers University-New Brunswick, Piscataway, NJ, USA
- Lorella Minotti
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France
  - Neurology Department, University Hospital of Grenoble, Grenoble, France
- Philippe Kahane
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France
  - Neurology Department, University Hospital of Grenoble, Grenoble, France
- Vincent Navarro
  - Sorbonne Université, Paris Brain Institute - Institut du Cerveau, ICM, INSERM, CNRS, AP-HP, Pitié-Salpêtrière Hospital, Paris, France
- Fabrice Bartolomei
  - Timone University Hospital, Sleep Unit, Epileptology and Cerebral Rhythmology, University Hospital of Marseille, Marseille, France
  - Aix Marseille University, Inserm, Institut de Neurosciences des Systèmes, Marseille, France
- Romain Carron
  - Aix Marseille University, Inserm, Institut de Neurosciences des Systèmes, Marseille, France
  - Timone University Hospital, Department of functional and stereotactic neurosurgery, University Hospital of Marseille, Marseille, France
- Jean Regis
  - Neurosurgery Department, University Hospital of Marseille, Marseille, France
- Stephan Chabardès
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France
  - Neurosurgery Department, University Hospital of Grenoble, Grenoble, France
- Stefano Palminteri
  - Laboratoire de Neurosciences Cognitives Computationnelles, Département d'Etudes Cognitives, ENS, PSL, INSERM, Paris, France
- Julien Bastin
  - Univ. Grenoble Alpes, Inserm, U1216, CHU Grenoble Alpes, Grenoble Institut Neurosciences, 38000, Grenoble, France.

2. Skvortsova V, Palminteri S, Buot A, Karachi C, Welter ML, Grabli D, Pessiglione M. A Causal Role for the Pedunculopontine Nucleus in Human Instrumental Learning. Curr Biol 2021; 31:943-954.e5. [PMID: 33352119 DOI: 10.1016/j.cub.2020.11.042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/02/2020] [Revised: 09/23/2020] [Accepted: 11/17/2020] [Indexed: 01/06/2023]
Abstract
A critical mechanism for maximizing reward is instrumental learning. In standard instrumental learning models, action values are updated on the basis of reward prediction errors (RPEs), defined as the discrepancy between expectations and outcomes. A wealth of evidence across species and experimental techniques has established that RPEs are signaled by midbrain dopamine neurons. However, the way dopamine neurons receive information about reward outcomes remains poorly understood. Recent animal studies suggest that the pedunculopontine nucleus (PPN), a small brainstem structure considered as a locomotor center, is sensitive to reward and sends excitatory projection to dopaminergic nuclei. Here, we examined the hypothesis that the PPN could contribute to reward learning in humans. To this aim, we leveraged a clinical protocol that assessed the therapeutic impact of PPN deep-brain stimulation (DBS) in three patients with Parkinson disease. PPN local field potentials (LFPs), recorded while patients performed an instrumental learning task, showed a specific response to reward outcomes in a low-frequency (alpha-beta) band. Moreover, PPN DBS selectively improved learning from rewards but not from punishments, a pattern that is typically observed following dopaminergic treatment. Computational analyses indicated that the effect of PPN DBS on instrumental learning was best captured by an increase in subjective reward sensitivity. Taken together, these results support a causal role for PPN-mediated reward signals in human instrumental learning.
Affiliation(s)
- Vasilisa Skvortsova
  - Motivation, Brain and Behavior (MBB) laboratory, Paris Brain Institute (ICM), Groupe Hospitalier Pitié-Salpêtrière, Paris 75013, France; INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Laboratoire de Neurosciences Cognitives et Computationnelles, Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris 75005, France; INSERM Unit 960, Université de Paris Sciences et Lettres (UP), 75005 Paris, France; Max Planck UCL Center for Computational Psychiatry and Aging, London WC1B 5EH, UK.
- Stefano Palminteri
  - Motivation, Brain and Behavior (MBB) laboratory, Paris Brain Institute (ICM), Groupe Hospitalier Pitié-Salpêtrière, Paris 75013, France; INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Laboratoire de Neurosciences Cognitives et Computationnelles, Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris 75005, France; INSERM Unit 960, Université de Paris Sciences et Lettres (UP), 75005 Paris, France
- Anne Buot
  - INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Laboratoire de Neurosciences Cognitives et Computationnelles, Département d'Etudes Cognitives, Ecole Normale Supérieure, Paris 75005, France; INSERM Unit 960, Université de Paris Sciences et Lettres (UP), 75005 Paris, France
- Carine Karachi
  - INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Neurology and Neurosurgery department, Groupe Hospitalier Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, 75013 Paris, France
- Marie-Laure Welter
  - INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Neurophysiology Department, Hôpital Universitaire de Rouen, 76000 Rouen, France
- David Grabli
  - INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France; Neurology and Neurosurgery department, Groupe Hospitalier Pitié-Salpêtrière, Assistance Publique-Hôpitaux de Paris, 75013 Paris, France
- Mathias Pessiglione
  - Motivation, Brain and Behavior (MBB) laboratory, Paris Brain Institute (ICM), Groupe Hospitalier Pitié-Salpêtrière, Paris 75013, France; INSERM Unit 1127, CNRS Unit 7225, Sorbonne Universités (SU), Paris 75005, France.

3. Pessiglione M, Vinckier F, Bouret S, Daunizeau J, Le Bouc R. Why not try harder? Computational approach to motivation deficits in neuro-psychiatric diseases. Brain 2018; 141:629-650. [PMID: 29194534 DOI: 10.1093/brain/awx278] [Citation(s) in RCA: 103] [Impact Index Per Article: 20.6] [Received: 02/10/2017] [Accepted: 08/30/2017] [Indexed: 12/19/2022]
Abstract
Motivation deficits, such as apathy, are pervasive in both neurological and psychiatric diseases. Even when they are not the core symptom, they reduce quality of life, compromise functional outcome and increase the burden for caregivers. They are currently assessed with clinical scales that do not give any mechanistic insight susceptible to guide therapeutic intervention. Here, we present another approach that consists of phenotyping the behaviour of patients in motivation tests, using computational models. These formal models impose a precise and operational definition of motivation that is embedded in decision theory. Motivation can be defined as the function that orients and activates the behaviour according to two attributes: a content (the goal) and a quantity (the goal value). Decision theory offers a way to quantify motivation, as the cost that patients would accept to endure in order to get the benefit of achieving their goal. We then review basic and clinical studies that have investigated the trade-off between the expected cost entailed by potential actions and the expected benefit associated with potential rewards. These studies have shown that the trade-off between effort and reward involves specific cortical, subcortical and neuromodulatory systems, such that it may be shifted in particular clinical conditions, and reinstated by appropriate treatments. Finally, we emphasize the promises of computational phenotyping for clinical purposes. Ideally, there would be a one-to-one mapping between specific neural components and distinct computational variables and processes of the decision model. Thus, fitting computational models to patients' behaviour would allow inferring of the dysfunctional mechanism in both cognitive terms (e.g. hyposensitivity to reward) and neural terms (e.g. lack of dopamine). This computational approach may therefore not only give insight into the motivation deficit but also help personalize treatment.
Affiliation(s)
- Mathias Pessiglione
  - Motivation, Brain and Behaviour (MBB) Lab, Institut du Cerveau et de la Moelle (ICM), Hôpital de la Pitié-Salpêtrière, Paris, France; Inserm U1127, CNRS U9225, Université Pierre et Marie Curie (UPMC - Paris 6), France
- Fabien Vinckier
  - Motivation, Brain and Behaviour (MBB) Lab, Institut du Cerveau et de la Moelle (ICM), Hôpital de la Pitié-Salpêtrière, Paris, France; Inserm U1127, CNRS U9225, Université Pierre et Marie Curie (UPMC - Paris 6), France; Service de Psychiatrie, Centre Hospitalier Sainte-Anne, Université Paris Descartes, Paris, France
- Sébastien Bouret
  - Motivation, Brain and Behaviour (MBB) Lab, Institut du Cerveau et de la Moelle (ICM), Hôpital de la Pitié-Salpêtrière, Paris, France; Inserm U1127, CNRS U9225, Université Pierre et Marie Curie (UPMC - Paris 6), France
- Jean Daunizeau
  - Motivation, Brain and Behaviour (MBB) Lab, Institut du Cerveau et de la Moelle (ICM), Hôpital de la Pitié-Salpêtrière, Paris, France; Inserm U1127, CNRS U9225, Université Pierre et Marie Curie (UPMC - Paris 6), France
- Raphaël Le Bouc
  - Motivation, Brain and Behaviour (MBB) Lab, Institut du Cerveau et de la Moelle (ICM), Hôpital de la Pitié-Salpêtrière, Paris, France; Inserm U1127, CNRS U9225, Université Pierre et Marie Curie (UPMC - Paris 6), France; Urgences cérébro-vasculaires, Hôpital de la Pitié-Salpêtrière, Université Pierre et Marie Curie, Paris, France

4. Houvenaghel JF, Duprez J, Argaud S, Naudet F, Dondaine T, Robert GH, Drapier S, Haegelen C, Jannin P, Drapier D, Vérin M, Sauleau P. Influence of subthalamic deep-brain stimulation on cognitive action control in incentive context. Neuropsychologia 2016; 91:519-530. [PMID: 27664297 DOI: 10.1016/j.neuropsychologia.2016.09.015] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/20/2016] [Revised: 08/23/2016] [Accepted: 09/20/2016] [Indexed: 01/24/2023]
Abstract
Subthalamic nucleus deep-brain stimulation (STN-DBS) is an effective treatment in Parkinson's disease (PD), but can have cognitive side effects, such as increasing the difficulty of producing appropriate responses when habitual but inappropriate responses represent strong alternatives. STN-DBS also appears to modulate representations of incentives such as monetary rewards. Furthermore, conflict resolution can be modulated by incentive context. We therefore used a rewarded Simon Task to assess the influence of promised rewards on cognitive action control in 50 patients with PD, half of whom were being treated with STN-DBS. Results were analyzed according to the activation-suppression model. We showed that STN-DBS (i) favored the expression of motor impulsivity, as measured with the Barratt Impulsiveness Scale, (ii) facilitated the expression of incentive actions, as observed with a greater increase in speed according to promised reward in patients with versus without DBS, and (iii) may increase impulsive action selection in an incentive context. In addition, analysis of subgroups of implanted patients suggested that those who exhibited the most impulsive action selection had the least severe disease. This may indicate that patients with less marked disease are more at risk of developing impulsivity postoperatively. Finally, in these patients, incentive context increased the difficulty of resolving conflict situations. As a whole, the current study revealed that in patients with PD, STN-DBS affects the cognitive processes involved in conflict resolution, reward processing and the influence of promised rewards on conflict resolution.
Affiliation(s)
- Jean-François Houvenaghel
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Neurology, Rennes University Hospital, Rennes, France.
- Joan Duprez
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France
- Soizic Argaud
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; "Neuroscience of Emotion and Affective Dynamics" Laboratory, Department of Psychology and Educational Sciences/Swiss Center for Affective Sciences, Campus Biotech, University of Geneva, Geneva, Switzerland
- Florian Naudet
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Clinical Investigation Center (INSERM 0203), Department of Pharmacology, Rennes University Hospital, Rennes, France; Department of Psychiatry, Rennes University Hospital, Rennes, France
- Thibaut Dondaine
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France
- Gabriel Hadrien Robert
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Psychiatry, Rennes University Hospital, Rennes, France
- Sophie Drapier
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Neurology, Rennes University Hospital, Rennes, France
- Claire Haegelen
  - Department of Neurosurgery, Rennes University Hospital, Rennes, France; "MediCIS" laboratory (UMR 1099 LTSI), INSERM/University of Rennes, Rennes, France
- Pierre Jannin
  - "MediCIS" laboratory (UMR 1099 LTSI), INSERM/University of Rennes, Rennes, France
- Dominique Drapier
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Psychiatry, Rennes University Hospital, Rennes, France
- Marc Vérin
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Neurology, Rennes University Hospital, Rennes, France
- Paul Sauleau
  - "Behavior and Basal Ganglia" Research Unit (EA 4712), University of Rennes 1, Rennes, France; Department of Neurophysiology, Rennes University Hospital, Rennes, France

5. Aberg KC, Doell KC, Schwartz S. The “Creative Right Brain” Revisited: Individual Creativity and Associative Priming in the Right Hemisphere Relate to Hemispheric Asymmetries in Reward Brain Function. Cereb Cortex 2016; 27:4946-4959. [DOI: 10.1093/cercor/bhw288] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/24/2016] [Accepted: 08/23/2016] [Indexed: 12/21/2022]

6. Palminteri S, Kilford EJ, Coricelli G, Blakemore SJ. The Computational Development of Reinforcement Learning during Adolescence. PLoS Comput Biol 2016; 12:e1004953. [PMID: 27322574 PMCID: PMC4920542 DOI: 10.1371/journal.pcbi.1004953] [Citation(s) in RCA: 70] [Impact Index Per Article: 8.8] [Received: 10/19/2015] [Accepted: 04/29/2016] [Indexed: 11/19/2022]
Abstract
Adolescence is a period of life characterised by changes in learning and decision-making. Learning and decision-making do not rely on a unitary system, but instead require the coordination of different cognitive processes that can be mathematically formalised as dissociable computational modules. Here, we aimed to trace the developmental time-course of the computational modules responsible for learning from reward or punishment, and learning from counterfactual feedback. Adolescents and adults carried out a novel reinforcement learning paradigm in which participants learned the association between cues and probabilistic outcomes, where the outcomes differed in valence (reward versus punishment) and feedback was either partial or complete (either the outcome of the chosen option only, or the outcomes of both the chosen and unchosen option, were displayed). Computational strategies changed during development: whereas adolescents’ behaviour was better explained by a basic reinforcement learning algorithm, adults’ behaviour integrated increasingly complex computational features, namely a counterfactual learning module (enabling enhanced performance in the presence of complete feedback) and a value contextualisation module (enabling symmetrical reward and punishment learning). Unlike adults, adolescent performance did not benefit from counterfactual (complete) feedback. In addition, while adults learned symmetrically from both reward and punishment, adolescents learned from reward but were less likely to learn from punishment. This tendency to rely on rewards and not to consider alternative consequences of actions might contribute to our understanding of decision-making in adolescence.
We employed a novel learning task to investigate how adolescents and adults learn from reward versus punishment, and from counterfactual feedback about decisions. Computational analyses revealed that adults and adolescents did not implement the same algorithm to solve the learning task. In contrast to adults, adolescents’ performance did not take into account counterfactual information; adolescents also learned preferentially to seek rewards rather than to avoid punishments, whereas adults learned to seek and avoid both equally. Increasing our understanding of computational changes in reinforcement learning during adolescence may provide insights into adolescent value-based decision-making. Our results might also have implications for education, since they suggest that adolescents benefit more from positive feedback than from negative feedback in learning tasks.
Affiliation(s)
- Stefano Palminteri
  - Institute of Cognitive Neuroscience, University College London, London, United Kingdom
  - Laboratoire de Neurosciences Cognitive, École Normale Supérieure, Paris, France
- Emma J. Kilford
  - Institute of Cognitive Neuroscience, University College London, London, United Kingdom
- Giorgio Coricelli
  - Interdepartmental Centre for Mind/Brain Sciences, Università degli Studi di Trento, Trento, Italy
  - Department of Economics, University of Southern California, Los Angeles, California, United States of America
- Sarah-Jayne Blakemore
  - Institute of Cognitive Neuroscience, University College London, London, United Kingdom

7. The left hemisphere learns what is right: Hemispatial reward learning depends on reinforcement learning processes in the contralateral hemisphere. Neuropsychologia 2016; 89:1-13. [PMID: 27221149 DOI: 10.1016/j.neuropsychologia.2016.05.023] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 01/14/2016] [Revised: 04/19/2016] [Accepted: 05/21/2016] [Indexed: 11/22/2022]
Abstract
Orienting biases refer to consistent, trait-like direction of attention or locomotion toward one side of space. Recent studies suggest that such hemispatial biases may determine how well people memorize information presented in the left or right hemifield. Moreover, lesion studies indicate that learning rewarded stimuli in one hemispace depends on the integrity of the contralateral striatum. However, the exact neural and computational mechanisms underlying the influence of individual orienting biases on reward learning remain unclear. Because reward-based behavioural adaptation depends on the dopaminergic system and prediction error (PE) encoding in the ventral striatum, we hypothesized that hemispheric asymmetries in dopamine (DA) function may determine individual spatial biases in reward learning. To test this prediction, we acquired fMRI in 33 healthy human participants while they performed a lateralized reward task. Learning differences between hemispaces were assessed by presenting stimuli, assigned to different reward probabilities, to the left or right of central fixation, i.e. presented in the left or right visual hemifield. Hemispheric differences in DA function were estimated through differential fMRI responses to positive vs. negative feedback in the left vs. right ventral striatum, and a computational approach was used to identify the neural correlates of PEs. Our results show that spatial biases favoring reward learning in the right (vs. left) hemifield were associated with increased reward responses in the left hemisphere and relatively better neural encoding of PEs for stimuli presented in the right (vs. left) hemifield. These findings demonstrate that trait-like spatial biases implicate hemisphere-specific learning mechanisms, with individual differences between hemispheres contributing to reinforcing spatial biases.

8. Palminteri S, Pessiglione M. Reinforcement learning and Tourette syndrome. International Review of Neurobiology 2013; 112:131-53. [PMID: 24295620 DOI: 10.1016/b978-0-12-411546-0.00005-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Indexed: 12/02/2022]
Abstract
In this chapter, we report the first experimental explorations of reinforcement learning in Tourette syndrome, carried out by our team in the last few years. This report is preceded by an introduction intended to give the reader the state of knowledge about the neural bases of reinforcement learning at the time of these studies, and the scientific rationale behind them. In short, reinforcement learning is learning by trial and error to maximize rewards and minimize punishments. This decision-making and learning process implicates the dopaminergic system projecting to the frontal cortex-basal ganglia circuits. A large body of evidence suggests that dysfunction of the same neural systems is implicated in the pathophysiology of Tourette syndrome. Our results show that the Tourette condition, as well as the most common pharmacological treatments (dopamine antagonists), affects reinforcement learning performance in these patients. Specifically, the results suggest a deficit in negative reinforcement learning, possibly underpinned by a functional hyperdopaminergia, which could explain the persistence of tics despite their evident maladaptive (negative) value. This idea, together with the implications of these results for Tourette therapy and future perspectives, is discussed in Section 4 of this chapter.
Affiliation(s)
- Stefano Palminteri
  - Laboratoire des Neurosciences Cognitives (LNC), Ecole Normale Supérieure (ENS), Paris, France.