1. Pool ER, Pauli WM, Cross L, O'Doherty JP. Neural substrates of parallel devaluation-sensitive and devaluation-insensitive Pavlovian learning in humans. Nat Commun 2023; 14:8057. [PMID: 38052792 PMCID: PMC10697955 DOI: 10.1038/s41467-023-43747-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Accepted: 11/17/2023] [Indexed: 12/07/2023] Open
Abstract
We aim to differentiate the brain regions involved in the learning and encoding of Pavlovian associations sensitive to changes in outcome value from those that are not sensitive to such changes by combining a learning task with outcome devaluation, eye-tracking, and functional magnetic resonance imaging in humans. Contrary to theoretical expectation, voxels correlating with reward prediction errors in the ventral striatum and subgenual cingulate appear to be sensitive to devaluation. Moreover, regions encoding state prediction errors appear to be devaluation insensitive. We can also distinguish regions encoding predictions about outcome taste identity from predictions about expected spatial location. Regions encoding predictions about taste identity seem devaluation sensitive while those encoding predictions about an outcome's spatial location seem devaluation insensitive. These findings suggest the existence of multiple and distinct associative mechanisms in the brain and help identify putative neural correlates for the parallel expression of both devaluation sensitive and insensitive conditioned behaviors.
Affiliation(s)
- Eva R Pool
  - Swiss Center for Affective Sciences, University of Geneva, Geneva, Switzerland.
  - Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
- Wolfgang M Pauli
  - Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
- Logan Cross
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Department of Computer Science, Stanford University, Palo Alto, CA, USA
- John P O'Doherty
  - Division of Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
  - Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
2. McNally GP, Jean-Richard-Dit-Bressel P, Millan EZ, Lawrence AJ. Pathways to the persistence of drug use despite its adverse consequences. Mol Psychiatry 2023; 28:2228-2237. [PMID: 36997610 PMCID: PMC10611585 DOI: 10.1038/s41380-023-02040-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 11/20/2022] [Revised: 03/10/2023] [Accepted: 03/15/2023] [Indexed: 04/01/2023]
Abstract
The persistence of drug taking despite its adverse consequences plays a central role in the presentation, diagnosis, and impacts of addiction. Eventual recognition and appraisal of these adverse consequences is central to decisions to reduce or cease use. However, the most appropriate ways of conceptualizing persistence in the face of adverse consequences remain unclear. Here we review evidence that there are at least three pathways to persistent use despite the negative consequences of that use: a cognitive pathway for recognition of adverse consequences, a motivational pathway for valuation of these consequences, and a behavioral pathway for responding to these adverse consequences. These pathways are dynamic, not linear, with multiple possible trajectories between them, and each is sufficient to produce persistence. We describe these pathways, their characteristics, and their brain cellular and circuit substrates, and we highlight their relevance to different pathways to self- and treatment-guided behavior change.
Affiliation(s)
- Gavan P McNally
  - School of Psychology, UNSW Sydney, Sydney, NSW, 2052, Australia.
- E Zayra Millan
  - School of Psychology, UNSW Sydney, Sydney, NSW, 2052, Australia
- Andrew J Lawrence
  - Florey Institute of Neuroscience and Mental Health, Parkville, VIC, 3010, Australia
  - Florey Department of Neuroscience and Mental Health, University of Melbourne, Melbourne, VIC, 3010, Australia
3. Pool ER, Pauli WM, Cross L, O'Doherty JP. Neural substrates of parallel devaluation-sensitive and devaluation-insensitive Pavlovian learning in humans. bioRxiv 2023:2023.01.26.525637. [PMID: 36747799 PMCID: PMC9901183 DOI: 10.1101/2023.01.26.525637] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 01/30/2023]
Abstract
Pavlovian learning depends on multiple and parallel associations leading to distinct classes of conditioned responses that vary in their flexibility following changes in the value of an associated outcome. Here, we aimed to differentiate brain areas involved in learning and encoding associations that are sensitive to changes in the value of an outcome from those that are not sensitive to such changes. To address this question, we combined a Pavlovian learning task with outcome devaluation, eye-tracking and functional magnetic resonance imaging. We used computational modeling to identify brain regions involved in learning stimulus-reward associations and stimulus-stimulus associations, by testing for brain areas correlating with reward-prediction errors and state-prediction errors, respectively. We found that, contrary to theoretical predictions about reward prediction errors being exclusively model-free, voxels correlating with reward prediction errors in the ventral striatum and subgenual anterior cingulate cortex were sensitive to devaluation. On the other hand, brain areas correlating with state prediction errors were found to be devaluation insensitive. In a supplementary analysis, we distinguished brain regions encoding predictions about outcome taste identity from those involved in encoding predictions about its expected spatial location. A subset of regions involved in taste identity predictions were devaluation sensitive while those involved in encoding predictions about spatial location were devaluation insensitive. These findings provide insights into the role of multiple associative mechanisms in the brain in mediating Pavlovian conditioned behavior - illustrating how distinct neural pathways can in parallel produce both devaluation sensitive and devaluation insensitive behaviors.
4. Seitz BM, Hoang IB, DiFazio LE, Blaisdell AP, Sharpe MJ. Dopamine errors drive excitatory and inhibitory components of backward conditioning in an outcome-specific manner. Curr Biol 2022; 32:3210-3218.e3. [PMID: 35752165 DOI: 10.1016/j.cub.2022.06.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 01/26/2022] [Revised: 04/29/2022] [Accepted: 06/13/2022] [Indexed: 01/06/2023]
Abstract
For over two decades, phasic activity in midbrain dopamine neurons was considered synonymous with the prediction error in temporal-difference reinforcement learning [1-4]. Central to this proposal is the notion that reward-predictive stimuli become endowed with the scalar value of predicted rewards. When these cues are subsequently encountered, their predictive value is compared to the value of the actual reward received, allowing for the calculation of prediction errors [5,6]. Phasic firing of dopamine neurons was proposed to reflect this computation [1,2], facilitating the backpropagation of value from the predicted reward to the reward-predictive stimulus, thus reducing future prediction errors. There are two critical assumptions of this proposal: (1) that dopamine errors can only facilitate learning about scalar value and not more complex features of predicted rewards, and (2) that the dopamine signal can only be involved in anticipatory cue-reward learning in which cues or actions precede rewards. Recent work [7-15] has challenged the first assumption, demonstrating that phasic dopamine signals across species are involved in learning about more complex features of the predicted outcomes, in a manner that transcends this value computation. Here, we tested the validity of the second assumption. Specifically, we examined whether phasic midbrain dopamine activity would be necessary for backward conditioning, in which a neutral cue reliably follows a rewarding outcome [16-20]. Using a specific Pavlovian-to-instrumental transfer (PIT) procedure [21-23], we show rats learn both excitatory and inhibitory components of a backward association, and that this association entails knowledge of the specific identity of the reward and cue. We demonstrate that brief optogenetic inhibition of VTA DA neurons timed to the transition between the reward and cue reduces both of these components of backward conditioning. These findings suggest VTA DA neurons are capable of facilitating associations between contiguously occurring events, regardless of the content of those events. We conclude that these data may be in line with suggestions that the VTA DA error acts as a universal teaching signal. This may provide insight into why dopamine function has been implicated in myriad psychological disorders that are characterized by very distinct reinforcement-learning deficits.
Affiliation(s)
- Benjamin M Seitz
  - Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
- Ivy B Hoang
  - Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
- Lauren E DiFazio
  - Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
- Aaron P Blaisdell
  - Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA
- Melissa J Sharpe
  - Department of Psychology, University of California, Los Angeles, Portola Plaza, Los Angeles, CA 91602, USA.
5. Rybicki AJ, Sowden SL, Schuster B, Cook JL. Dopaminergic challenge dissociates learning from primary versus secondary sources of information. eLife 2022; 11:74893. [PMID: 35289748 PMCID: PMC9023054 DOI: 10.7554/elife.74893] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/20/2021] [Accepted: 03/14/2022] [Indexed: 11/13/2022] Open
Abstract
Some theories of human cultural evolution posit that humans have social-specific learning mechanisms that are adaptive specialisations moulded by natural selection to cope with the pressures of group living. However, the existence of neurochemical pathways that are specialised for learning from social information and individual experience is widely debated. Cognitive neuroscientific studies present mixed evidence for social-specific learning mechanisms: some studies find dissociable neural correlates for social and individual learning, whereas others find the same brain areas and dopamine-mediated computations involved in both. Here, we demonstrate that, like individual learning, social learning is modulated by the dopamine D2 receptor antagonist haloperidol when social information is the primary learning source, but not when it comprises a secondary, additional element. Two groups (total N = 43) completed a decision-making task which required primary learning, from own experience, and secondary learning from an additional source. For one group, the primary source was social and the secondary source was individual; for the other group this was reversed. Haloperidol affected primary learning irrespective of social/individual nature, with no effect on learning from the secondary source. Thus, we illustrate that dopaminergic mechanisms underpinning learning can be dissociated along a primary-secondary but not a social-individual axis. These results resolve conflict in the literature and support an expanding field showing that, rather than being specialised for particular inputs, neurochemical pathways in the human brain can process both social and non-social cues and arbitrate between the two depending upon which cue is primarily relevant for the task at hand.
6. Millard SJ, Bearden CE, Karlsgodt KH, Sharpe MJ. The prediction-error hypothesis of schizophrenia: new data point to circuit-specific changes in dopamine activity. Neuropsychopharmacology 2022; 47:628-640. [PMID: 34588607 PMCID: PMC8782867 DOI: 10.1038/s41386-021-01188-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 12.0] [Received: 05/04/2021] [Revised: 08/23/2021] [Accepted: 09/07/2021] [Indexed: 02/07/2023]
Abstract
Schizophrenia is a severe psychiatric disorder affecting 21 million people worldwide. People with schizophrenia suffer from symptoms including psychosis and delusions, apathy, anhedonia, and cognitive deficits. Strikingly, schizophrenia is characterised by a learning paradox involving difficulties learning from rewarding events, whilst simultaneously 'overlearning' about irrelevant or neutral information. While dysfunction in dopaminergic signalling has long been linked to the pathophysiology of schizophrenia, a cohesive framework that accounts for this learning paradox remains elusive. Recently, there has been an explosion of new research investigating how dopamine contributes to reinforcement learning, which illustrates that midbrain dopamine contributes to reinforcement learning in complex ways not previously envisioned. These new data bring new possibilities for how dopamine signalling contributes to the symptomatology of schizophrenia. Building on recent work, we present a new neural framework for how we might envision specific dopamine circuits contributing to this learning paradox in schizophrenia in the context of models of reinforcement learning. Further, we discuss avenues of preclinical research with the use of cutting-edge neuroscience techniques where aspects of this model may be tested. Ultimately, it is hoped that this review will spur more research utilising specific reinforcement learning paradigms in preclinical models of schizophrenia, to reconcile seemingly disparate symptomatology and develop more efficient therapeutics.
Affiliation(s)
- Samuel J. Millard
  - Department of Psychology, University of California, Los Angeles, CA 90095 USA
- Carrie E. Bearden
  - Department of Psychology, University of California, Los Angeles, CA 90095 USA
  - Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA 90095 USA
- Katherine H. Karlsgodt
  - Department of Psychology, University of California, Los Angeles, CA 90095 USA
  - Department of Psychiatry and Biobehavioral Sciences, University of California, Los Angeles, CA 90095 USA
- Melissa J. Sharpe
  - Department of Psychology, University of California, Los Angeles, CA 90095 USA
7. Prével A, Krebs RM. Higher-Order Conditioning With Simultaneous and Backward Conditioned Stimulus: Implications for Models of Pavlovian Conditioning. Front Behav Neurosci 2021; 15:749517. [PMID: 34858147 PMCID: PMC8632485 DOI: 10.3389/fnbeh.2021.749517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2021] [Accepted: 10/18/2021] [Indexed: 11/23/2022] Open
Abstract
In a new environment, humans and animals can detect and learn that cues predict meaningful outcomes, and use this information to adapt their responses. This process is termed Pavlovian conditioning. Pavlovian conditioning is also observed for stimuli that predict outcome-associated cues; this second type of conditioning is termed higher-order Pavlovian conditioning. In this review, we will focus on higher-order conditioning studies with simultaneous and backward conditioned stimuli. We will examine how the results from these experiments pose a challenge to models of Pavlovian conditioning like the Temporal Difference (TD) models, in which learning is mainly driven by reward prediction errors. Contrasting with this view, the results suggest that humans and animals can form complex representations of the (temporal) structure of the task, and use this information to guide behavior, which seems consistent with model-based reinforcement learning. Future investigations involving these procedures could result in important new insights on the mechanisms that underlie Pavlovian conditioning.
Affiliation(s)
- Arthur Prével
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
- Ruth M Krebs
  - Department of Experimental Psychology, Ghent University, Ghent, Belgium
8. Oleson EB, Hamilton LR, Gomez DM. Cannabinoid Modulation of Dopamine Release During Motivation, Periodic Reinforcement, Exploratory Behavior, Habit Formation, and Attention. Front Synaptic Neurosci 2021; 13:660218. [PMID: 34177546 PMCID: PMC8222827 DOI: 10.3389/fnsyn.2021.660218] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/29/2021] [Accepted: 05/05/2021] [Indexed: 12/12/2022] Open
Abstract
Motivational and attentional processes energize action sequences to facilitate evolutionary competition and promote behavioral fitness. Decades of neuropharmacology, electrophysiology and electrochemistry research indicate that the mesocorticolimbic DA pathway modulates both motivation and attention. More recently, it was realized that mesocorticolimbic DA function is tightly regulated by the brain's endocannabinoid system and greatly influenced by exogenous cannabinoids, which have been harnessed by humanity for medicinal, ritualistic, and recreational uses for 12,000 years. Exogenous cannabinoids, like the primary psychoactive component of cannabis, delta-9-tetrahydrocannabinol, produce their effects by acting at binding sites for naturally occurring endocannabinoids. The brain's endocannabinoid system consists of two G-protein coupled receptors, endogenous lipid ligands for these receptor targets, and several synthetic and metabolic enzymes involved in their production and degradation. Emerging evidence indicates that the endocannabinoid 2-arachidonoylglycerol is necessary to observe concurrent increases in DA release and motivated behavior, and the historical pharmacology literature indicates a role for cannabinoid signaling in both motivational and attentional processes. While both types of behaviors have been scrutinized under manipulation by either DA or cannabinoid agents, there is considerably less insight into prospective interactions between these two important signaling systems. This review attempts to summarize the relevance of cannabinoid modulation of DA release during operant tasks designed to investigate either motivational or attentional control of behavior. We first describe how cannabinoids influence DA release and goal-directed action under a variety of reinforcement contingencies. Then we consider the role that endocannabinoids might play in switching an animal's motivation from a goal-directed action to the search for an alternative outcome, in addition to the formation of long-term habits. Finally, dissociable features of attentional behavior using both the 5-choice serial reaction time task and the attentional set-shifting task are discussed along with their distinct influences by DA and cannabinoids. We end by discussing potential targets for further research regarding DA-cannabinoid interactions within key substrates involved in motivation and attention.
Affiliation(s)
- Erik B. Oleson
  - Department of Psychology, University of Colorado Denver, Denver, CO, United States
- Lindsey R. Hamilton
  - Department of Psychology, University of Colorado Denver, Denver, CO, United States
- Devan M. Gomez
  - Department of Biomedical Sciences, Marquette University, Milwaukee, WI, United States
9. Kahnt T, Schoenbaum G. Cross-species studies on orbitofrontal control of inference-based behavior. Behav Neurosci 2021; 135:109-119. [PMID: 34060869 PMCID: PMC9338401 DOI: 10.1037/bne0000401] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
Abstract
Many decisions are guided by expectations about their outcomes. These expectations can arise from two fundamentally different sources: from direct experience with outcomes and the events and actions that precede them or from mental simulations and inferences when direct experience is missing. Here we discuss four elegant tasks from animal learning theory (devaluation, sensory preconditioning, Pavlovian-to-instrumental transfer, and Pavlovian overexpectation) and how they can be used to isolate behavior that is based on such mental simulations from behavior that can be based solely on experience. We then review findings from studies in rodents, nonhuman primates, and humans that use these tasks in combination with neural recording and loss-of-function experiments to understand the role of the orbitofrontal cortex (OFC) in outcome inference. The results of these studies show that activity in the OFC is correlated with inferred outcome expectations and that an intact OFC is necessary for inference-based behavior and learning. In summary, these findings provide converging cross-species support for the idea that the OFC is critical for behavior that is based on inferred outcomes, whereas it is not required when expectations can be based on direct experience alone. This conclusion may have important implications for our understanding of the role of OFC in psychiatric disorders and how we may be able to treat them. (PsycInfo Database Record (c) 2021 APA, all rights reserved).
10. Gomez DM, Everett TJ, Hamilton LR, Ranganath A, Cheer JF, Oleson EB. Chronic cannabinoid exposure produces tolerance to the dopamine releasing effects of WIN 55,212-2 and heroin in adult male rats. Neuropharmacology 2021; 182:108374. [PMID: 33115642 PMCID: PMC7836093 DOI: 10.1016/j.neuropharm.2020.108374] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/09/2020] [Revised: 10/16/2020] [Accepted: 10/24/2020] [Indexed: 02/06/2023]
Abstract
Synthetic cannabinoids were introduced into recreational drug culture in 2008 and quickly became one of the most commonly abused drugs in the United States. The neurobiological consequences of repeated synthetic cannabinoid exposure remain poorly understood. It is possible that a blunted dopamine (DA) response may lead drug users to consume larger quantities to compensate for this form of neurochemical tolerance. Because the endogenous cannabinoid and opioid systems exhibit considerable cross-talk and cross-tolerance frequently develops following repeated exposure to either opioids or cannabinoids, there is interest in investigating whether a history of synthetic cannabinoid exposure influences the ability of heroin to increase DA release. To test the effects of chronic cannabinoid exposure on cannabinoid- and heroin-evoked DA release, male adult rats were treated with either vehicle or a synthetic cannabinoid (WIN 55,212-2; WIN) using an intravenous (IV) dose escalation regimen (0.2-0.8 mg/kg IV over 9 treatments). As predicted, WIN-treated rats showed a rightward shift in the dose-response relationship across all behavioral/physiological measures when compared to vehicle-treated controls. Then, using fast-scan cyclic voltammetry to measure changes in the frequency of transient DA events in the nucleus accumbens shell of awake and freely-moving rats, it was observed that the DA releasing effects of both WIN and heroin were significantly reduced in male rats with a pharmacological history of cannabinoid exposure. These results demonstrate that repeated exposure to the synthetic cannabinoid WIN can produce tolerance to its DA releasing effects and cross-tolerance to the DA releasing effects of heroin.
Affiliation(s)
- Devan M Gomez
  - Psychology Department, University of Colorado Denver, USA; Current: Department of Biomedical Sciences, Marquette University, USA
- Ajit Ranganath
  - Department of Neurobiology and Anatomy, University of Maryland Baltimore, USA
- Joseph F Cheer
  - Department of Neurobiology and Anatomy, University of Maryland Baltimore, USA
- Erik B Oleson
  - Psychology Department, University of Colorado Denver, USA; Biology Department, University of Colorado Denver, USA.
11. Emberly E, Seamans JK. Abrupt, Asynchronous Changes in Action Representations by Anterior Cingulate Cortex Neurons during Trial and Error Learning. Cereb Cortex 2020; 30:4336-4345. [PMID: 32239139 DOI: 10.1093/cercor/bhaa019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/08/2019] [Revised: 01/09/2020] [Accepted: 01/12/2020] [Indexed: 11/13/2022] Open
Abstract
The ability to act on knowledge about the value of stimuli or actions factors into simple foraging behaviors as well as complex forms of decision-making. In striatal regions, action representations are thought to acquire value through a gradual (reinforcement-learning based) process. It is unclear whether this is also true for anterior cingulate cortex (ACC) where neuronal representations tend to change abruptly. We recorded from ensembles of ACC neurons as rats deduced which of 3 levers was rewarded each day. The rats' lever preferences changed gradually throughout the sessions as they eventually came to focus on the rewarded lever. Most individual neurons changed their responses to both rewarded and nonrewarded lever presses abruptly (<2 trials). These transitions occurred asynchronously across the population but peaked near the point where the rats began to focus on the rewarded lever. Because the individual transitions were asynchronous, the overall change at the population level appeared gradual. Abrupt transitions in action representations of ACC neurons may be part of a mechanism that alters choice strategies as new information is acquired.
Affiliation(s)
- Eldon Emberly
  - Department of Physics, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Jeremy K Seamans
  - Department of Psychiatry, Centre for Brain Health, University of British Columbia, Vancouver, BC V6T 1Z4, Canada
12. Adams RA, Moutoussis M, Nour MM, Dahoun T, Lewis D, Illingworth B, Veronese M, Mathys C, de Boer L, Guitart-Masip M, Friston KJ, Howes OD, Roiser JP. Variability in Action Selection Relates to Striatal Dopamine 2/3 Receptor Availability in Humans: A PET Neuroimaging Study Using Reinforcement Learning and Active Inference Models. Cereb Cortex 2020; 30:3573-3589. [PMID: 32083297 PMCID: PMC7233027 DOI: 10.1093/cercor/bhz327] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/12/2019] [Revised: 11/18/2019] [Accepted: 12/05/2019] [Indexed: 12/17/2022] Open
Abstract
Choosing actions that result in advantageous outcomes is a fundamental function of nervous systems. All computational decision-making models contain a mechanism that controls the variability of (or confidence in) action selection, but its neural implementation is unclear, especially in humans. We investigated this mechanism using two influential decision-making frameworks: active inference (AI) and reinforcement learning (RL). In AI, the precision (inverse variance) of beliefs about policies controls action selection variability (similar to decision 'noise' parameters in RL) and is thought to be encoded by striatal dopamine signaling. We tested this hypothesis by administering a 'go/no-go' task to 75 healthy participants, and measuring striatal dopamine 2/3 receptor (D2/3R) availability in a subset (n = 25) using [11C]-(+)-PHNO positron emission tomography. In behavioral model comparison, RL performed best across the whole group but AI performed best in participants performing above chance levels. Limbic striatal D2/3R availability had linear relationships with AI policy precision (P = 0.029) as well as with RL irreducible decision 'noise' (P = 0.020), and this relationship with D2/3R availability was confirmed with a 'decision stochasticity' factor that aggregated across both models (P = 0.0006). These findings are consistent with occupancy of inhibitory striatal D2/3Rs decreasing the variability of action selection in humans.
Affiliation(s)
- Rick A Adams
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK
- Division of Psychiatry, University College London, London W1T 7NF, UK
- Psychiatric Imaging Group, Robert Steiner MRI Unit, MRC London Institute of Medical Sciences, Hammersmith Hospital, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital, London W12 0NN, UK
| | - Michael Moutoussis
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
- Max Planck-UCL Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
| | - Matthew M Nour
- Psychiatric Imaging Group, Robert Steiner MRI Unit, MRC London Institute of Medical Sciences, Hammersmith Hospital, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital, London W12 0NN, UK
- Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience (IoPPN), King’s College London, London SE5 8AF, UK
| | - Tarik Dahoun
- Psychiatric Imaging Group, Robert Steiner MRI Unit, MRC London Institute of Medical Sciences, Hammersmith Hospital, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital, London W12 0NN, UK
- Department of Psychiatry, University of Oxford, Warneford Hospital, Oxford OX3 7JX, UK
| | - Declan Lewis
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK
| | - Benjamin Illingworth
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK
| | - Mattia Veronese
- Centre for Neuroimaging Sciences, Institute of Psychiatry, Psychology & Neuroscience (IoPPN), King's College London, London SE5 8AF, UK
| | - Christoph Mathys
- Max Planck-UCL Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
- Scuola Internazionale Superiore di Studi Avanzati (SISSA), 34136 Trieste, Italy
- Translational Neuromodeling Unit (TNU), Institute for Biomedical Engineering, University of Zurich and ETH Zurich, 8032 Zurich, Switzerland
| | - Lieke de Boer
- Aging Research Center, Karolinska Institute, 171 65 Stockholm, Sweden
| | - Marc Guitart-Masip
- Max Planck-UCL Centre for Computational Psychiatry and Ageing Research, London WC1B 5EH, UK
- Aging Research Center, Karolinska Institute, 171 65 Stockholm, Sweden
| | - Karl J Friston
- Wellcome Centre for Human Neuroimaging, University College London, London WC1N 3BG, UK
| | - Oliver D Howes
- Psychiatric Imaging Group, Robert Steiner MRI Unit, MRC London Institute of Medical Sciences, Hammersmith Hospital, London W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Hammersmith Hospital, London W12 0NN, UK
- Department of Psychosis Studies, Institute of Psychiatry, Psychology & Neuroscience (IoPPN), King’s College London, London SE5 8AF, UK
| | - Jonathan P Roiser
- Institute of Cognitive Neuroscience, University College London, London WC1N 3AZ, UK
| |
|
13
|
Cook JL, Swart JC, Froböse MI, Diaconescu AO, Geurts DEM, den Ouden HEM, Cools R. Catecholaminergic modulation of meta-learning. eLife 2019; 8:e51439. [PMID: 31850844 PMCID: PMC6974360 DOI: 10.7554/elife.51439] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 08/28/2019] [Accepted: 12/18/2019] [Indexed: 01/03/2023] Open
Abstract
The remarkable expedience of human learning is thought to be underpinned by meta-learning, whereby slow accumulative learning processes are rapidly adjusted to the current learning environment. To date, the neurobiological implementation of meta-learning remains unclear. A burgeoning literature argues for an important role for the catecholamines dopamine and noradrenaline in meta-learning. Here, we tested the hypothesis that enhancing catecholamine function modulates the ability to optimise a meta-learning parameter (learning rate) as a function of environmental volatility. 102 participants completed a task which required learning in stable phases, where the probability of reinforcement was constant, and volatile phases, where probabilities changed every 10-30 trials. The catecholamine transporter blocker methylphenidate enhanced participants' ability to adapt learning rate: under methylphenidate, compared with placebo, participants exhibited higher learning rates in volatile relative to stable phases. Furthermore, this effect was significant only with respect to direct learning based on the participants' own experience; there was no significant effect on inferred-value learning, where stimulus values had to be inferred. These data demonstrate a causal link between catecholaminergic modulation and the adjustment of the meta-learning parameter learning rate.
Affiliation(s)
- Jennifer L Cook
- School of Psychology, University of Birmingham, Birmingham, United Kingdom
| | - Jennifer C Swart
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| | - Monja I Froböse
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| | - Andreea O Diaconescu
- Translational Neuromodeling Unit, Institute for Biomedical Engineering, University of Zurich and ETH Zurich, Zurich, Switzerland
- Department of Psychiatry, University of Basel, Basel, Switzerland
- Krembil Centre for Neuroinformatics, CAMH, University of Toronto, Toronto, Canada
| | - Dirk EM Geurts
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Department of Psychiatry, Radboud University Medical Centre, Nijmegen, Netherlands
| | - Hanneke EM den Ouden
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
| | - Roshan Cools
- Donders Institute for Brain, Cognition and Behaviour, Centre for Cognitive Neuroimaging, Radboud University, Nijmegen, Netherlands
- Department of Psychiatry, Radboud University Medical Centre, Nijmegen, Netherlands
| |
|
14
|
Suarez JA, Howard JD, Schoenbaum G, Kahnt T. Sensory prediction errors in the human midbrain signal identity violations independent of perceptual distance. eLife 2019; 8:e43962. [PMID: 30950792 PMCID: PMC6450666 DOI: 10.7554/elife.43962] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 11/28/2018] [Accepted: 03/28/2019] [Indexed: 01/15/2023] Open
Abstract
The firing of dopaminergic midbrain neurons is thought to reflect prediction errors (PE) that depend on the difference between the value of expected and received rewards. However, recent work has demonstrated that unexpected changes in value-neutral outcome features, such as identity, can evoke similar responses. It remains unclear whether the magnitude of these identity PEs scales with the perceptual dissimilarity of expected and received rewards, or whether they are independent of perceptual similarity. We used a Pavlovian transreinforcer reversal task to elicit identity PEs for value-matched food odor rewards, drawn from two perceptual categories (sweet, savory). Replicating previous findings, identity PEs were correlated with fMRI activity in midbrain, OFC, piriform cortex, and amygdala. However, the magnitude of identity PE responses was independent of the perceptual distance between expected and received outcomes, suggesting that identity comparisons underlying sensory PEs may occur in an abstract state space independent of straightforward sensory percepts.
Affiliation(s)
- Javier A Suarez
- Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, United States
| | - James D Howard
- Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, United States
| | - Geoffrey Schoenbaum
- Intramural Research Program of the National Institute on Drug Abuse, National Institutes of Health, Baltimore, United States
| | - Thorsten Kahnt
- Department of Neurology, Feinberg School of Medicine, Northwestern University, Chicago, United States
- Department of Psychiatry and Behavioral Sciences, Feinberg School of Medicine, Northwestern University, Chicago, United States
- Department of Psychology, Weinberg College of Arts and Sciences, Northwestern University, Evanston, United States
| |
|
15
|
Pool ER, Pauli WM, Kress CS, O'Doherty JP. Behavioural evidence for parallel outcome-sensitive and outcome-insensitive Pavlovian learning systems in humans. Nat Hum Behav 2019; 3:284-296. [PMID: 30882043 PMCID: PMC6416744 DOI: 10.1038/s41562-018-0527-9] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 03/20/2018] [Accepted: 12/21/2018] [Indexed: 02/07/2023]
Abstract
There is a dichotomy in instrumental conditioning between goal-directed actions and habits that are distinguishable on the basis of their relative sensitivity to changes in outcome value. It is less clear whether a similar distinction applies in Pavlovian conditioning, where responses have been found to be predominantly outcome sensitive. To test for both devaluation-insensitive and devaluation-sensitive Pavlovian conditioning in humans, we conducted four experiments combining Pavlovian conditioning and outcome devaluation procedures while measuring multiple conditioned responses. Our results suggest that Pavlovian conditioning involves two distinct types of learning: one that learns the current value of the outcome, which is sensitive to devaluation, and one that learns about the spatial localisation of the outcome, which is insensitive to devaluation. Our findings have implications for the mechanistic understanding of Pavlovian conditioning and provide a more nuanced understanding of Pavlovian mechanisms that might contribute to a number of psychiatric disorders.
Affiliation(s)
- Eva R Pool
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA.
| | - Wolfgang M Pauli
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
| | - Carolina S Kress
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
| | - John P O'Doherty
- Division of the Humanities and Social Sciences, California Institute of Technology, Pasadena, CA, USA
- Computation and Neural Systems Program, California Institute of Technology, Pasadena, CA, USA
| |
|